Aviral Shrivastava: Research


We are part of an industry that is driven by Moore's law. It is extremely difficult to conceive of the impact of exponential growth. If the automobile industry had grown similarly, today we could buy a new car for less than a cent, travel to any place on earth in a fraction of a second, and keep the car in our pockets! A large portion of this growth has been achieved by the continuous downsizing of the fundamental computational unit, the transistor. However, along with all the goodies come significant challenges.
Power Challenge
The power density of computer chips has gone through the roof. Power consumption is an extremely important concern for battery-operated handheld embedded systems, since it has the most significant impact on the size, shape, volume, recharging frequency, and ultimately the usability of the embedded system. Power reduction is difficult since it can be achieved only by increasing the efficiency of operation. Much of our research is on developing compiler and microarchitectural techniques for reducing the power consumption of processor-based embedded systems.
Temperature Challenge
Air cooling of chips only works up to about 30W. To remove more heat than that, sophisticated cooling mechanisms, e.g., water cooling, oil cooling, or liquid nitrogen cooling, are required; but not only are they much more expensive, they are also unsafe and inconvenient. To save cost, current processors come with packages that are unable to dissipate the worst-case power of the processor. Consequently, the temperature of the processor can rise. Beyond a certain point (around 125°C), high temperature can cause permanent physical damage to the chip. To prevent this, the system is stopped or slowed down as the temperature nears this upper limit. This technique is called Dynamic Thermal Management. It already causes performance degradation in high-end processors, and the degradation will become increasingly severe. A significant focus of our efforts is to reduce the performance degradation caused by these thermal issues.
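The throttling behavior described above can be sketched as a small control loop. This is only an illustration, not the mechanism used in any particular processor: the names `ThermalPolicy`, `dtm_step`, and `frequency_scale_trace` are hypothetical, and the margins and the 0.5 slowdown factor are assumed values. Only the limit of about 125°C comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThermalPolicy:
    """Hypothetical policy parameters; only limit_c comes from the text."""
    limit_c: float = 125.0            # approximate damage threshold
    throttle_margin_c: float = 10.0   # assumed: start throttling this far below the limit
    resume_margin_c: float = 25.0     # assumed: resume full speed this far below the limit
    throttled_scale: float = 0.5      # assumed: frequency multiplier while throttled


def dtm_step(temp_c: float, throttled: bool,
             policy: ThermalPolicy = ThermalPolicy()) -> bool:
    """Return True if the processor should be throttled after this reading."""
    # Near the upper limit: always slow down.
    if temp_c >= policy.limit_c - policy.throttle_margin_c:
        return True
    # Already throttled: stay slow until the chip has cooled past the
    # resume threshold (hysteresis avoids rapid on/off switching).
    if throttled and temp_c > policy.limit_c - policy.resume_margin_c:
        return True
    return False


def frequency_scale_trace(readings_c, policy: ThermalPolicy = ThermalPolicy()):
    """Map a sequence of temperature readings to frequency multipliers."""
    throttled = False
    scales = []
    for temp_c in readings_c:
        throttled = dtm_step(temp_c, throttled, policy)
        scales.append(policy.throttled_scale if throttled else 1.0)
    return scales
```

Every reading that lands in the throttled state runs at reduced speed, which is the performance degradation this research aims to reduce.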
Multi-core Challenge
Multi-core architectures are part of the solution to the power and temperature problems. It is now clear that the only way ahead to improve performance without too large an increase in power consumption is the use of multi-core architectures. However, there are significant challenges in the successful use of multi-cores, i.e., in achieving linear speedup on a large set of to-be-written and already-written applications. Challenges exist in how to express all the parallelism in the application, how to divide the parallelism among the cores while considering the communication and synchronization overheads, and how to efficiently execute a thread on a core. The right multi-core architecture is also an open question. One of our observations from forward-looking multi-core designs such as the IBM Cell and the Intel 80-core processor is that caches will not be able to scale when we have hundreds or thousands of cores. Cores will have to operate on limited local memories. This constraint on the size of local memories is not going away, and parallelization and mapping schemes need to accommodate this constraint in order to optimize for multi-cores.
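To make the local-memory constraint concrete, here is a minimal sketch of a mapping scheme that respects it. It is an illustration under simplifying assumptions, not a scheme from our work: the function name `map_tiles_to_cores` is hypothetical, capacity is counted in data items rather than bytes, tiles are assigned round-robin, and communication and synchronization overheads are ignored.

```python
def map_tiles_to_cores(num_items: int, num_cores: int, local_capacity: int):
    """Split a range of items into tiles that each fit in one core's
    local memory, and assign the tiles to cores round-robin.

    Returns one list per core, each holding (start, end) half-open ranges.
    """
    if num_cores <= 0 or local_capacity <= 0:
        raise ValueError("num_cores and local_capacity must be positive")

    assignment = [[] for _ in range(num_cores)]
    for tile_index, start in enumerate(range(0, num_items, local_capacity)):
        # No tile may exceed the local memory capacity of a core.
        end = min(start + local_capacity, num_items)
        assignment[tile_index % num_cores].append((start, end))
    return assignment
```

A core with several tiles would process them one after another, transferring each tile into its local memory before computing on it. A realistic scheme would also weigh the cost of those transfers, which this sketch leaves out.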
Variations Challenge
Forty years of technology scaling have brought us to a point where we cannot manufacture transistors exactly as we design them. The inaccuracies in geometrical shapes and dopant concentrations have become extremely significant. For example, even if Intel designs its Pentium 4 processor to work at 3 GHz, after manufacturing some chips come out working at 2 GHz, some others at 4 GHz, and most others in between. We are all aware of the pricing structure that Intel has built on these manufacturing variations: faster processors are much more costly than the ones that run at lower frequencies. Our research focuses on how we can reduce the impact of process variations at the application level.
Reliability Challenge
With transistors now only a few tens of atoms thick, they have become extremely susceptible to even slight noise in voltage and power levels, signal interference, and even cosmic particle strikes. All these effects, called transient faults or soft errors, can switch the logic value of a transistor and can ultimately lead to system failure. While manufacturing solutions exist, with an exponentially increasing soft error rate they will not be enough, and more cost-efficient solutions will be needed at higher levels of design abstraction to mitigate the impact of soft errors.

More detailed information about my research is available on the lab page.