Aviral Shrivastava: Research
We are part of an industry that is riding on Moore's law. It is
extremely difficult to conceive of the impact of exponential
growth. If the automobile industry had grown similarly, today we
could buy a new car for less than a cent, travel to any place on earth
in a fraction of a second, and keep the car in our pockets! A large
portion of this speedup has been achieved by the continuous downsizing
of the fundamental computational unit, the transistor. However,
along with all the goodies come significant challenges.
- Power Challenge
- The power density of computer chips has gone through the
roof. Power consumption is an extremely important concern for
battery-operated handheld embedded systems, since it has the most significant
impact on the size, shape, volume, recharging frequency, and ultimately the
usability of the embedded system. Power reduction is difficult since
it can be achieved only by increasing the efficiency of operation. A
lot of our research is on developing compiler and microarchitectural
techniques for reducing the power consumption of processor-based
embedded systems.
- Temperature Challenge
- Air cooling of chips works only up to about 30 W. To extract more
power than that, sophisticated cooling mechanisms, e.g., water
cooling, oil cooling, or liquid nitrogen cooling, are required; not
only are they much more expensive, they are also unsafe and
inconvenient. To save cost, current processors come with packages that
are unable to extract the worst-case power dissipated by the
processor. Consequently, the temperature of the processor can
rise. Beyond a certain point (around 125°C),
high temperature can cause permanent physical damage to the chip. To
prevent this, the system is stopped or slowed down as the temperature
nears this upper limit, a technique called Dynamic Thermal Management.
This causes, and will increasingly cause, severe performance degradation
in high-end processors. A significant focus of our efforts is to reduce
the performance degradation caused by these thermal issues.
- Multi-core Challenge
- Part of the solution to the power and temperature problems is
multi-core architectures. It is now clear that the only way ahead to
improve performance without too much increase in power consumption
is the use of multi-core architectures. However, there are significant
challenges in successful use of multi-cores, i.e., achieving linear
speedup on a large set of to-be-written and already-written
applications. Challenges exist in how to express all the parallelism
in the application, how to divide the parallelism onto the cores,
considering the communication and synchronization overheads, and how
to efficiently execute a thread on a core. The right multi-core
architecture is also an open question. One of our observations from
futuristic multi-cores like the IBM Cell and the Intel 80-core processor is that
caches will not be able to scale when we have 100s and 1000s of
cores. Cores will have to operate on limited local memories. This
constraint on the size of local memories is not going away, and
parallelization and mapping schemes need to accommodate this constraint
in order to optimize for multi-cores.
- Variations Challenge
- 40 years of technology scaling have brought us to a point where
we cannot manufacture transistors exactly as we design them. The
inaccuracies in geometrical shapes and dopant concentrations
have become extremely significant. For example, even if Intel
designs its Pentium 4 processor to work at 3 GHz, after manufacturing
some come out working at 2 GHz, some others at 4 GHz, and most
others in between. We are all aware of the pricing structure
that Intel has built on these manufacturing variations: faster
processors are much more costly than the ones that run at lower
frequency.
Our research focuses on how we can reduce the
impact of process variations at the application level.
- Reliability Challenge
- With transistors now only a few tens of atoms thick, they have
become extremely susceptible to even slight noise in voltage and
power levels, signal interference, and even cosmic particle
strikes. All these effects, called transient faults or soft errors, can
switch the logic value of a transistor, and can ultimately lead to
system failure. While manufacturing solutions exist, with
exponentially increasing soft error rates they will not be enough, and
more cost-efficient solutions will be needed at higher levels of
design abstraction to mitigate the impact of soft errors.
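As a rough illustration of the Dynamic Thermal Management behavior described under the Temperature Challenge, the sketch below slows a processor down when its temperature nears the limit and speeds it back up once it has cooled. Only the 125°C limit comes from the text above. The 5°C safety margin, the frequency range and step, the toy thermal model, and the names `throttle_step` and `simulate` are all assumptions made for illustration, not a description of any real processor or of our techniques.

```python
def throttle_step(temp_c, freq_ghz, limit_c=125.0, margin_c=5.0,
                  min_freq_ghz=0.5, max_freq_ghz=3.0, step_ghz=0.5):
    """Return the next clock frequency for the current temperature.

    Hypothetical policy: slow down by one step when the temperature is
    within `margin_c` of the limit, otherwise speed up by one step.
    """
    if temp_c >= limit_c - margin_c:
        return max(min_freq_ghz, freq_ghz - step_ghz)
    return min(max_freq_ghz, freq_ghz + step_ghz)


def simulate(steps=50, ambient_c=45.0, heat_per_ghz=2.0, cooling=0.05):
    """Run the policy against a toy thermal model (an assumption).

    Each step the chip heats up in proportion to its frequency and
    cools in proportion to how far it is above ambient temperature.
    Returns a list of (temperature, frequency) pairs, one per step.
    """
    temp_c, freq_ghz = ambient_c, 3.0
    history = []
    for _ in range(steps):
        temp_c += heat_per_ghz * freq_ghz - cooling * (temp_c - ambient_c)
        freq_ghz = throttle_step(temp_c, freq_ghz)
        history.append((round(temp_c, 1), freq_ghz))
    return history


if __name__ == "__main__":
    for temp_c, freq_ghz in simulate():
        print(f"{temp_c:6.1f} C  {freq_ghz:.1f} GHz")
```

In this toy model the package cannot sustain full speed indefinitely, so the frequency drops once the temperature approaches the limit. Those stretches of reduced frequency are the performance degradation that the text refers to.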
More detailed information about my research is available on the lab page.