Hardware prefetching strategies for second-level caches
One way to improve the performance of a memory hierarchy is to prefetch data and instructions before the CPU needs them. Second-level caches are better candidates for prefetching than first-level caches because they are not as busy servicing CPU requests; more complex prefetching algorithms may therefore be employed, and more of the data stream's spatial locality exploited.
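To make the idea concrete, the sketch below shows one classic hardware scheme, sequential (next-line) prefetch on miss, applied at the second-level cache. This is an illustration only, not the dissertation's scheme; the helpers l2_lookup(), l2_fetch_line(), and prefetch_enqueue() are hypothetical simulator hooks, and the line size is assumed.

/* Minimal sketch of sequential prefetch-on-miss at the L2 cache.
 * All helper functions below are hypothetical simulator hooks. */
#include <stdbool.h>
#include <stdint.h>

#define LINE_SIZE 64  /* bytes per L2 line (assumed) */

bool l2_lookup(uint64_t line_addr);        /* true on L2 hit            */
void l2_fetch_line(uint64_t line_addr);    /* demand fetch from memory  */
void prefetch_enqueue(uint64_t line_addr); /* issued when the bus idles */

/* Called for every reference that misses the first-level cache. */
void l2_access(uint64_t addr)
{
    uint64_t line = addr & ~(uint64_t)(LINE_SIZE - 1); /* align to line */

    if (!l2_lookup(line)) {
        l2_fetch_line(line);                /* service the demand miss   */
        prefetch_enqueue(line + LINE_SIZE); /* prefetch the next line,
                                               hoping for spatial reuse  */
    }
}

Because the prefetch is queued rather than issued immediately, it competes for memory bandwidth only when the bus would otherwise be idle, which is why the scheme fits the less busy second-level cache.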
This dissertation examines the gains in system performance that result from hardware prefetching schemes implemented in the second level of a two-level cache memory hierarchy. These gains are calculated with a detailed trace-driven simulation model that includes cycle-by-cycle timing. Because system execution time is measured rather than hit rates alone, the effects of memory bandwidth limitations and other resource contention on prefetching performance can be captured accurately.
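A trace-driven model of this kind replays a recorded address trace through the hierarchy and charges cycles for every stall, so contention between demand misses and prefetches shows up directly in execution time. The loop below is a minimal sketch of that idea under assumed latencies; the trace format, the latency constants, and the l1/l2 hooks are illustrative assumptions, not the dissertation's model.

/* Hypothetical trace-replay loop: charges cycles for every stall so
 * that bandwidth contention is reflected in total execution time.   */
#include <stddef.h>
#include <stdint.h>

#define L2_HIT_CYCLES   10    /* assumed L2 hit latency */
#define MEM_CYCLES     100    /* assumed memory latency */

extern int l1_access(uint64_t addr);     /* hypothetical: nonzero on hit */
extern int l2_access_hit(uint64_t addr); /* hypothetical: nonzero on hit */

uint64_t simulate(const uint64_t *trace, size_t n)
{
    uint64_t now = 0, bus_free_at = 0;

    for (size_t i = 0; i < n; i++) {
        now += 1;                           /* one cycle per reference   */
        if (l1_access(trace[i]))
            continue;                       /* first-level hit: no stall */
        if (l2_access_hit(trace[i])) {
            now += L2_HIT_CYCLES;           /* second-level hit          */
        } else {
            if (now < bus_free_at)
                now = bus_free_at;          /* stall: bus still busy
                                               with an earlier prefetch  */
            now += MEM_CYCLES;              /* demand fetch from memory  */
            bus_free_at = now + MEM_CYCLES; /* next-line prefetch keeps
                                               the bus busy afterwards   */
        }
    }
    return now;   /* execution time, not just a hit rate */
}

A hit-rate-only model would miss the bus_free_at stall entirely, which is exactly the effect that measuring execution time makes visible.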
Simulation results show that the penalty per instruction due to cache misses may be reduced by 20 to 30% using second-level cache prefetching when the line size is optimized for the cache size, and by up to 50% when the line size is smaller than optimal. These results were consistent across a variety of second-level cache sizes up to 4 MB, and similar penalty reductions were obtained with traces of different applications.
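The metric behind these figures can be read as miss stall cycles charged per instruction. In notation of my own choosing (not necessarily the dissertation's), with an illustrative baseline of 2.0 stall cycles per instruction, a 25% reduction works out as:

\[
\text{MCPI} \;=\; \frac{\text{stall cycles due to cache misses}}{\text{instructions executed}},
\qquad
\text{MCPI}_{\text{prefetch}} \;=\; (1 - 0.25) \times 2.0 \;=\; 1.5 .
\]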