Please use this identifier to cite or link to this item: https://doi.org/10.1145/1037949.1024416
DC Field: Value
dc.title: Compiler orchestrated prefetching via speculation and predication
dc.contributor.author: Rabbah, R.M.
dc.contributor.author: Sandanagobalane, H.
dc.contributor.author: Ekpanyapong, M.
dc.contributor.author: Wong, W.-F.
dc.date.accessioned: 2013-07-04T08:45:33Z
dc.date.available: 2013-07-04T08:45:33Z
dc.date.issued: 2004
dc.identifier.citation: Rabbah, R.M., Sandanagobalane, H., Ekpanyapong, M., Wong, W.-F. (2004). Compiler orchestrated prefetching via speculation and predication. Operating Systems Review (ACM) 38 (5): 189-198. ScholarBank@NUS Repository. https://doi.org/10.1145/1037949.1024416
dc.identifier.issn: 01635980
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/42187
dc.description.abstract: This paper introduces a compiler-orchestrated prefetching system as a unified framework geared toward ameliorating the gap between processing speeds and memory access latencies. We focus the scope of the optimization on specific subsets of the program dependence graph that succinctly characterize the memory access pattern of both regular array-based applications and irregular pointer-intensive programs. We illustrate how program embedded precomputation via speculative execution can accurately predict and effectively prefetch future memory references with negligible overhead. The proposed techniques reduce the total running time of seven SPEC benchmarks and two OLDEN benchmarks by 27% on an Itanium 2 processor. The improvements are in addition to several state-of-the-art optimizations including software pipelining and data prefetching. In addition, we use cycle-accurate simulations to identify important and lightweight architectural innovations that further mitigate the memory system bottleneck. In particular, we focus on the notoriously challenging class of pointer-chasing applications, and demonstrate how they may benefit from a novel scheme of sentineled prefetching. Our results for twelve SPEC benchmarks demonstrate that 45% of the processor stalls that are caused by the memory system are avoidable. The techniques in this paper can effectively mask long memory latencies with little instruction overhead, and can readily contribute to the performance of processors today. Copyright 2004 ACM.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1145/1037949.1024416
dc.source: Scopus
dc.subject: Precomputation
dc.subject: Predicated execution
dc.subject: Prefetching
dc.subject: Speculation
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.description.doi: 10.1145/1037949.1024416
dc.description.sourcetitle: Operating Systems Review (ACM)
dc.description.volume: 38
dc.description.issue: 5
dc.description.page: 189-198
dc.description.coden: OSRED
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.