Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/14193
DC Field: Value
dc.title: Compiler driver memory system optimization using speculative execution
dc.contributor.author: HARIHARAN SANDANAGOBALANE
dc.date.accessioned: 2010-04-08T10:40:45Z
dc.date.available: 2010-04-08T10:40:45Z
dc.date.issued: 2004-08-30
dc.identifier.citation: HARIHARAN SANDANAGOBALANE (2004-08-30). Compiler driver memory system optimization using speculative execution. ScholarBank@NUS Repository.
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/14193
dc.description.abstract: Wide-issue microprocessors are capable of remarkable execution rates, but they generally achieve only a fraction of their peak instruction throughput on real programs. This discrepancy is due to performance-degrading events, largely branch mispredictions and cache misses. In this work we address the performance degradation due to the latter through the use of Program Embedded Precomputation using Speculative Execution (PEPSE). To this end, we introduce the Load Dependence Graph (LDG), which is the sub-graph of the traditional Program Dependence Graph (PDG) that computes the address of a load instruction. In the context of data prefetching, we illustrate how PEPSE can accurately predict and effectively prefetch future memory references with negligible overhead for both regular array-based applications and irregular pointer-based applications. We use profiling to identify delinquent loads, and LDGs are created only for those loads. Subsequently, speculative versions of the LDG operations are statically scheduled along with a prefetch instruction for the computed address, such that these instructions execute and prefetch the value before the actual load is encountered, resulting in either an elimination or a reduction of the processor stall cycles due to the load instruction. Our prototype implementation of the optimizations within the Open Research Compiler (ORC) delivered encouraging results. On a 900 MHz Itanium 2 server, we achieved speedups ranging from 1.05 to 2.14 for several benchmarks from the SPEC and OLDEN suites.
dc.language.iso: en
dc.subject: Microprocessors, Cache misses, Program Dependence Graph, Prefetching, Scheduling, Optimizations
dc.type: Thesis
dc.contributor.department: COMPUTER SCIENCE
dc.contributor.supervisor: WONG WENG FAI
dc.description.degree: Master's
dc.description.degreeconferred: MASTER OF SCIENCE
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Master's Theses (Open)
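
Illustrative note on the abstract: the idea of scheduling a copy of a load's address computation ahead of the load, together with a prefetch, can be sketched in C for an indirect array access. This is only a hand-written approximation under stated assumptions, not the thesis's implementation. The function name `sum_indirect`, the constant `PREFETCH_DISTANCE`, and the choice of `a[idx[i]]` as the delinquent load are all hypothetical, and `__builtin_prefetch` is the GCC/Clang builtin standing in for the prefetch instruction a compiler would emit. The explicit bounds check stands in for hardware speculative loads, which can run ahead without faulting.

```c
#include <stddef.h>

/* Hypothetical prefetch distance in loop iterations; a real compiler
 * would choose it from profile data and memory latency. */
#define PREFETCH_DISTANCE 8

/* Sums a[idx[i]] for i in [0, n). The load of a[idx[i]] is assumed to be
 * the delinquent load; its address depends on the earlier load of idx[i]. */
long sum_indirect(const long *a, const int *idx, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Copy of the address computation, run for a future iteration.
         * The bounds check keeps the early read of idx[] in range. */
        if (i + PREFETCH_DISTANCE < n) {
            const long *future = &a[idx[i + PREFETCH_DISTANCE]];
            /* Read prefetch with low temporal locality; this is only a
             * hint and does not change the program's result. */
            __builtin_prefetch(future, 0, 1);
        }
        /* The actual load, which ideally now hits in the cache. */
        total += a[idx[i]];
    }
    return total;
}
```

The prefetch only affects timing, so the function returns the same sum with or without it. Any benefit depends on the access pattern and on the distance chosen.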

Files in This Item:
File: thesis.pdf
Size: 307.87 kB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.