Compiler-assisted hardware-based data prefetching for next-generation processors
Prefetching has emerged as one of the most successful techniques to bridge the gap between modern processors and memory systems. At the same time, as we move into the deep sub-micron era, power consumption has become one of the most important design constraints besides performance. Intensive research effort has been devoted to data prefetching with a focus on performance improvement; however, as far as we know, the energy aspects of prefetching have not been fully investigated.
This dissertation investigates data prefetching techniques for next-generation processors, targeting both energy efficiency and performance speedup. We first evaluate a number of state-of-the-art data prefetching techniques from an energy perspective and identify the main energy-consuming components due to prefetching. We then propose a set of compiler-assisted energy-aware techniques to make hardware-based data prefetching more energy-efficient.
From our evaluation of a number of data prefetching techniques, we have found that if leakage is optimized with recently proposed circuit-level techniques, most of the energy overhead of hardware data prefetching comes from prefetch-hardware-related costs and from unnecessary L1 data cache lookups caused by prefetches that hit in the L1 cache. This energy overhead on the memory system can be as much as 30%.
We propose a set of power-aware prefetch filtering techniques to reduce the energy overhead of hardware data prefetching. The proposed techniques include three compiler-based filtering approaches that make the prefetch predictor more energy-efficient. We also propose a hardware-based filtering technique to further reduce the energy overhead due to unnecessary prefetching in the L1 data cache. Combined, the energy-aware filtering techniques reduce the energy overhead introduced by aggressive prefetching by up to 40% with almost no performance degradation.
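To make the idea of hardware prefetch filtering concrete, the following is a minimal sketch and not the dissertation's actual design. It assumes a small FIFO buffer of cache-line addresses whose prefetches recently hit in the L1 cache; a prefetch to a line in that buffer is dropped before it costs another L1 lookup. The names `PrefetchFilter` and `simulate`, the buffer capacity, and the address trace are all hypothetical.

```python
from collections import OrderedDict


class PrefetchFilter:
    """Hypothetical filter: remembers lines whose prefetches recently hit in L1."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.recent = OrderedDict()  # insertion order gives FIFO replacement

    def should_issue(self, line_addr):
        # A prefetch to a line known to be in L1 would only waste a lookup.
        return line_addr not in self.recent

    def record_l1_hit(self, line_addr):
        self.recent[line_addr] = True
        self.recent.move_to_end(line_addr)
        if len(self.recent) > self.capacity:
            self.recent.popitem(last=False)  # drop the oldest entry


def simulate(prefetch_lines, l1_contents, filt):
    """Return (L1 lookups performed, prefetches filtered out).

    Simplification: lines are never evicted from L1 here. A real design
    would have to invalidate filter entries when a line leaves the cache.
    """
    lookups = 0
    filtered = 0
    for line in prefetch_lines:
        if not filt.should_issue(line):
            filtered += 1
            continue
        lookups += 1
        if line in l1_contents:
            filt.record_l1_hit(line)  # useless prefetch: remember it
        else:
            l1_contents.add(line)     # useful prefetch: fill the line
    return lookups, filtered


if __name__ == "__main__":
    # Hypothetical trace of prefetch targets (cache-line addresses).
    print(simulate([1, 2, 1, 1, 3, 2], {1}, PrefetchFilter()))
```

In this toy trace, two of the six prefetch requests are dropped before reaching the cache. The energy saved per dropped lookup, and the cost of the filter itself, would have to come from a circuit-level model that this sketch does not attempt.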
We also develop a location-set driven data prefetching technique to further reduce the energy consumption of prefetching hardware. In this scheme, we use a power-aware prefetch engine with a novel design of an indexed hardware history table. With the help of compiler-based location-set analysis, we show that the proposed prefetching scheme reduces the energy consumed by the prefetch history table by 7-11X with very small impact on performance.
Our experiments show that the proposed techniques could overcome the prefetching-related energy overhead in most applications, improving the energy-delay product by 33% on average. For many applications studied, our work has transformed data prefetching into not only a performance improvement mechanism, but an energy saving technique as well.
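For readers unfamiliar with the metric, the energy-delay product (EDP) is the product of the energy consumed and the execution time, so it falls whenever energy, delay, or both are reduced. The short sketch below shows how an improvement figure of this kind can be computed. The function names and the input values are hypothetical and are not measurements from this work.

```python
def energy_delay_product(energy, delay):
    """EDP = energy * delay; lower is better."""
    return energy * delay


def edp_improvement(baseline_edp, new_edp):
    """Fractional reduction in EDP relative to the baseline."""
    return (baseline_edp - new_edp) / baseline_edp


if __name__ == "__main__":
    # Hypothetical normalized values: baseline without prefetching is 1.0.
    baseline = energy_delay_product(1.00, 1.00)
    # Hypothetical: prefetching cuts run time by 20% and total energy by 10%.
    prefetched = energy_delay_product(0.90, 0.80)
    print(f"EDP improvement: {edp_improvement(baseline, prefetched):.0%}")  # 28%
```

The example illustrates why a prefetcher whose own energy overhead is kept small can lower total energy: a shorter run time reduces the energy spent elsewhere in the system, and the EDP rewards both effects at once.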
0984: Computer science