Compiler orchestrated prefetching via speculation and predication

Please use this identifier to cite or link to this item: https://doi.org/10.1145/1037949.1024416

DC Field	Value
dc.title	Compiler orchestrated prefetching via speculation and predication
dc.contributor.author	Rabbah, R.M.
dc.contributor.author	Sandanagobalane, H.
dc.contributor.author	Ekpanyapong, M.
dc.contributor.author	Wong, W.-F.
dc.date.accessioned	2013-07-04T08:45:33Z
dc.date.available	2013-07-04T08:45:33Z
dc.date.issued	2004
dc.identifier.citation	Rabbah, R.M., Sandanagobalane, H., Ekpanyapong, M., Wong, W.-F. (2004). Compiler orchestrated prefetching via speculation and predication. Operating Systems Review (ACM) 38 (5): 189-198. ScholarBank@NUS Repository. https://doi.org/10.1145/1037949.1024416
dc.identifier.issn	01635980
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/42187
dc.description.abstract	This paper introduces a compiler-orchestrated prefetching system as a unified framework geared toward ameliorating the gap between processing speeds and memory access latencies. We focus the scope of the optimization on specific subsets of the program dependence graph that succinctly characterize the memory access pattern of both regular array-based applications and irregular pointer-intensive programs. We illustrate how program embedded precomputation via speculative execution can accurately predict and effectively prefetch future memory references with negligible overhead. The proposed techniques reduce the total running time of seven SPEC benchmarks and two OLDEN benchmarks by 27% on an Itanium 2 processor. The improvements are in addition to several state-of-the-art optimizations including software pipelining and data prefetching. In addition, we use cycle-accurate simulations to identify important and lightweight architectural innovations that further mitigate the memory system bottleneck. In particular, we focus on the notoriously challenging class of pointer-chasing applications, and demonstrate how they may benefit from a novel scheme of sentineled prefetching. Our results for twelve SPEC benchmarks demonstrate that 45% of the processor stalls that are caused by the memory system are avoidable. The techniques in this paper can effectively mask long memory latencies with little instruction overhead, and can readily contribute to the performance of processors today. Copyright 2004 ACM.
dc.description.uri	http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1145/1037949.1024416
dc.source	Scopus
dc.subject	Precomputation
dc.subject	Predicated execution
dc.subject	Prefetching
dc.subject	Speculation
dc.type	Conference Paper
dc.contributor.department	COMPUTER SCIENCE
dc.description.doi	10.1145/1037949.1024416
dc.description.sourcetitle	Operating Systems Review (ACM)
dc.description.volume	38
dc.description.issue	5
dc.description.page	189-198
dc.description.coden	OSRED
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Staff Publications

Files in This Item:

There are no files associated with this item.
