Coordinating prefetching and STT-RAM based last-level cache management for multicore systems

Abstract

Data prefetching is a common mechanism to mitigate the bottleneck of off-chip memory bandwidth in modern computing systems. Unfortunately, the side effects of prefetching are an additional burden on off-chip communication and increased cache write operations. With the proposal of spin-transfer torque random access memory (STT-RAM) based last-level caches (LLCs) for their high density and low power consumption, the increase of write pressure to the cache from prefetching coupled with the characteristically long write access compared with traditional SRAM caches exacerbates the performance cost of prefetching schemes. In this work, we propose two orthogonal techniques to reduce the negative performance impact induced by aggressive prefetching on multicore systems employing STT-RAM based LLC. First, basic priority assignment prioritizes the different types of access requests of LLC by their criticality and responds to them based on priority. Second, priority boosting differentiates requests by application and prioritizes the relatively few requests from applications with non-intensive accesses to the LLC, which usually creates the most severe performance degradation in multi-core systems. Combining these two prioritization policies can alleviate the negative effect induced by aggressive prefetching. Our results show that these techniques can achieve an 8.3 average application speedup compared to a baseline, prefetch only design without prioritization. © 2013 ACM.

DOI

10.1145/2483028.2483060

Year

2013