Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/49389
Title: PARALLELISM-ENERGY PERFORMANCE ANALYSIS OF MULTICORE SYSTEMS
Authors: TUDOR BOGDAN MARIUS
Keywords: performance analysis, parallelism, multicore, energy, performance
Issue Date: 7-Nov-2013
Source: TUDOR BOGDAN MARIUS (2013-11-07). PARALLELISM-ENERGY PERFORMANCE ANALYSIS OF MULTICORE SYSTEMS. ScholarBank@NUS Repository.
Abstract: Modern multicore systems consist of multiple on-chip cores supported by off-chip shared resources such as memory and I/O devices. Scaling performance in the multicore era requires programs to expose sufficient parallelism so that their execution overlaps activities on both on-chip and off-chip resources. But too much overlap might trigger contention for the shared resources, extending the response time of the program. At the same time, the performance of many multicore systems is increasingly constrained by either a power or an energy budget. Thus, in the multicore era, analyzing the performance of an application requires understanding how the application parallelism is mapped to hardware parallelism and how this mapping affects execution time and energy usage. This thesis proposes a hybrid measurement-analytical modeling approach for analyzing the performance of shared-memory applications on multicore systems. For a given application, we predict the impact of the number of cores and the core clock frequency on parallelism and energy performance on traditional x64 and emerging low-power ARM multicore systems. The proposed parallelism model captures the overlap among the response times of cores, memory and I/O devices to predict both the amount of parallelism exploited and the parallelism lost due to data dependency, memory contention and network I/O overhead. Based on the parallelism model and a static power characterization of a multicore system, our proposed energy model predicts the power and energy usage of a program. In contrast to previous approaches that rely on instrumentation of the program source or binary code, our model uses non-intrusive inputs such as the size of the OS run-queue, hardware event counters and external power measurements.
Validation against direct measurements of applications covering HPC, financial analysis, multimedia and datacenter computing on four UMA and NUMA multicore systems shows an average relative model error of 6-13%. A number of key insights are drawn using our approach. First, for memory- or I/O-bound problems, allocating a large number of cores increases energy usage and, counter to intuition, may also increase execution time due to resource contention among cores. Second, balancing the core and memory resources by selecting an appropriate number of cores and clock frequency can reduce energy usage by up to 27%, even on an ARM Cortex-A9 system. Third, we show that more energy can be saved on datacenter workloads such as memcached if the demands on core, memory and I/O resources are balanced by improving bottlenecked resources, rather than by turning off under-utilized resources. In summary, we show that balancing system resources is the key to reducing the energy usage of an application, and that this is achieved by improving hardware performance rather than by lowering power usage.
URI: http://scholarbank.nus.edu.sg/handle/10635/49389
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: TUDORBogdanMarius_PhDThesis_Dept_ComputerScience_SoC_NUS_2013.pdf
Size: 1.35 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.