Temperature-Aware Task Scheduling in MPSoCs:
Project Website
In deep submicron circuits, thermal hot spots and temperature variations have
brought new challenges in reliability, performance, cooling costs and leakage power.
Conventional thermal management sacrifices performance to control the thermal behavior
by slowing down or stalling the processors when a critical temperature threshold
is exceeded. Moreover, such techniques do not aim to minimize the temporal and spatial
variations in temperature, which adversely affect system reliability.
In our work, we explore temperature-aware task scheduling for multiprocessor
systems-on-a-chip (MPSoCs). We design and evaluate OS-level dynamic scheduling policies
with negligible performance overhead. We show that simple-to-implement scheduling
policies that make decisions based on temperature measurements can dramatically decrease
the frequency of high-magnitude thermal cycles and spatial gradients in comparison to
state-of-the-art schedulers. OS-level temperature-aware scheduling can also be combined with
reactive methods such as dynamic thread migration to further reduce hot spots and temperature
variations at low performance cost.
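As a minimal illustration of a policy that decides based on temperature measurements, the Python sketch below assigns an incoming task to the coolest idle core. It is not the set of policies evaluated in this work; the function name `pick_coolest_core`, the per-core temperature list, and the idle flags are all hypothetical and chosen only for the example.

```python
from typing import Optional, Sequence


def pick_coolest_core(core_temps_c: Sequence[float],
                      idle: Sequence[bool]) -> Optional[int]:
    """Return the index of the coolest idle core, or None if every core is busy.

    core_temps_c: latest temperature reading per core, in degrees Celsius
                  (hypothetical input; a real system would read thermal sensors).
    idle:         True for each core that can accept a new task.
    """
    if len(core_temps_c) != len(idle):
        raise ValueError("core_temps_c and idle must have the same length")

    # Only idle cores are candidates for the new task.
    candidates = [i for i, is_idle in enumerate(idle) if is_idle]
    if not candidates:
        return None

    # Choose the candidate with the lowest current temperature.
    return min(candidates, key=lambda i: core_temps_c[i])


# Example: cores 0 and 1 are idle; core 1 is cooler, so it is selected.
chosen = pick_coolest_core([71.5, 64.0, 68.2, 59.9], [True, True, False, False])
```

A greedy choice like this needs only one sensor reading per core at each scheduling decision, which is the kind of low-overhead decision the paragraph above refers to.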
Analysis of Temperature-Induced Reliability Problems in MPSoCs:
The combination of increasing integration levels and rising power consumption leads to higher
power densities in deep submicron and nanoscale SoCs. This trend results in large temperature offsets
and hot spots on chip, constituting a significant design challenge for system reliability. Conventional
power management policies reduce the system-level power consumption and the overall on-chip temperature,
and thus are expected to contribute to improved reliability. However, high temperature differentials
caused by power management may adversely affect system reliability and create conflicting demands for system design.
The goal of my research is to introduce a simulation methodology to analyze reliability on SoCs,
in order to accurately evaluate the effects of power management policies as well as workload scheduling,
system topology and thermal packaging on multi-core SoC failure rates.
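The temperature differentials discussed above can be quantified from simulated temperature data. The Python sketch below shows two simple metrics under stated assumptions: the spatial gradient is taken as the hottest-minus-coolest core temperature at one instant, and thermal cycles are approximated by counting swings between successive turning points of one core's temperature trace. This is a simplification for illustration, not the simulation methodology of this research; the function names and the threshold parameter are hypothetical.

```python
from typing import List, Sequence


def max_spatial_gradient(core_temps_c: Sequence[float]) -> float:
    """Largest temperature difference between any two cores at one instant."""
    return max(core_temps_c) - min(core_temps_c)


def turning_points(trace_c: Sequence[float]) -> List[float]:
    """Return the first sample, every local extremum, and the last sample."""
    if not trace_c:
        return []
    # Collapse plateaus so that a flat peak or valley is seen as one sample.
    deduped = [trace_c[0]] + [b for a, b in zip(trace_c, trace_c[1:]) if b != a]
    if len(deduped) < 2:
        return deduped

    points = [deduped[0]]
    for prev, cur, nxt in zip(deduped, deduped[1:], deduped[2:]):
        # A sign change in the slope marks a local maximum or minimum.
        if (cur - prev) * (nxt - cur) < 0:
            points.append(cur)
    points.append(deduped[-1])
    return points


def count_large_swings(trace_c: Sequence[float], threshold_c: float) -> int:
    """Count swings between adjacent turning points of at least threshold_c."""
    pts = turning_points(trace_c)
    return sum(1 for a, b in zip(pts, pts[1:]) if abs(b - a) >= threshold_c)


# Example trace for one core (degrees Celsius), with a 10-degree threshold.
swings = count_large_swings([50, 60, 60, 50, 52, 51, 70], threshold_c=10)
```

Comparing these two numbers across power management or scheduling policies gives a rough view of how each policy trades lower average temperature against larger temperature variations.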
Fault-Tolerant Architectures:
I am also interested in fault-tolerant computer architectures. I have worked on developing an
architecture for superscalar processor pipelines that provides high transient-fault coverage while incurring minimal performance
and hardware overhead.

E-mail: acoskun (at) cs.ucsd.edu
Fax: (858) 534-7029
Address: 9500 Gilman Drive, CSE Department, La Jolla, CA 92093-0404