Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/181937
Title: STUDY ON INDEX REGISTER PREFETCHING AND PREFETCH SCHEDULING
Authors: CHAN WAI WAI FLETCH
Issue Date: 1997
Citation: CHAN WAI WAI FLETCH (1997). STUDY ON INDEX REGISTER PREFETCHING AND PREFETCH SCHEDULING. ScholarBank@NUS Repository.
Abstract: Processor chips are generally optimized first for speed, while memory chips are optimized mainly for density. This divergence results in an ever-widening speed gap between processor and memory. As a result, modern processors spend more cycles waiting for data from memory than their ancestors did, and memory latency has become a performance bottleneck in modern computer systems. Data cache prefetching helps to tolerate memory latency by overlapping it with instruction execution. It consists of two major functional components: address anticipation and prefetch scheduling. Depending on the enabling technology, a data cache prefetching scheme can be classified as a hardware scheme, a software scheme or a hybrid scheme. Our study concentrates on hardware schemes, as they incur no instruction execution overhead and no software compatibility problems. In designing hardware schemes, simplicity is a very important consideration: an overly complicated scheme would require more transistors than are available and have an unacceptably long path length. In the scope of address anticipation, we propose, in this thesis, a hardware scheme known as Index Register Prefetching (IRP). The scheme targets constant-stride accesses. Instead of monitoring memory-accessing instructions and detecting patterns in effective addresses, IRP monitors updates of registers and detects patterns in index registers. Our study reveals that, while generally achieving about 90% of the performance of the Reference Prediction Table (RPT), a leading hardware scheme, IRP requires less than 5% of the hardware cost of RPT. We submit that IRP is more cost-effective than RPT. We further studied the reasons for the performance difference between IRP and RPT. Two scenarios contribute most of the difference: index registers are reused within the innermost iterations, and some memory-accessing instructions use two index registers.
The former can be amended by using default prefetching, which was shown to reclaim about 50% of the performance loss, though with significant side effects in occasional situations. The latter can be solved by a simple loop-invariant register detector; nearly all performance loss can then be reclaimed without observable side effects. Under the umbrella of prefetch scheduling, we propose to abort an ongoing prefetch when a demand fetch arises in the course of it. Our simulations showed that a performance gain of up to 70% was generally observed. The benefit comes mainly from reducing the delay in handling demand fetches, while the potential increase in cache misses is prevented by make-up prefetches. Our simulations also showed that selective abortion, based either on the maturity of prefetches or on the abortion-to-prefetch ratio, was not very appealing: the observed performance improvement was too small to justify the extra hardware investment. Restarting aborted prefetches was found to do more harm than good and should be avoided. We also studied multiple-iteration look-ahead. The look-ahead distance is intentionally restricted to a power of 2 so that only rewiring, and no extra hardware, is needed. Improvements of around 10% were often observed, because free memory cycles are filled with prefetches that would otherwise be generated in subsequent iterations. Last but not least, two generally accepted beliefs about cache performance are found questionable in our study.
URI: https://scholarbank.nus.edu.sg/handle/10635/181937
Appears in Collections:Master's Theses (Restricted)

Files in This Item:
B20838700.PDF (3.29 MB, Adobe PDF), Access: RESTRICTED



