Please use this identifier to cite or link to this item: https://doi.org/10.1016/j.jpdc.2010.05.002
Title: Reliability-aware scheduling strategy for heterogeneous distributed computing systems
Authors: Tang, X.
Li, K.
Li, R.
Veeravalli, B. 
Keywords: Duplication
Heterogeneous distributed systems
Precedence constrained tasks
Reliability
Scheduling algorithm
Issue Date: Sep-2010
Citation: Tang, X., Li, K., Li, R., Veeravalli, B. (2010-09). Reliability-aware scheduling strategy for heterogeneous distributed computing systems. Journal of Parallel and Distributed Computing 70 (9) : 941-952. ScholarBank@NUS Repository. https://doi.org/10.1016/j.jpdc.2010.05.002
Abstract: Heterogeneous computing systems are promising computing platforms, since single parallel architecture based systems may not be sufficient to exploit the available parallelism with the running applications. In some cases, heterogeneous distributed computing (HDC) systems can achieve higher performance with lower cost than single-machine supersystems. However, in HDC systems, processors and networks are not failure free and any kind of failure may be critical to the running applications. One way of dealing with such failures is to employ a reliable scheduling algorithm. Unfortunately, most existing scheduling algorithms for precedence constrained tasks in HDC systems do not adequately consider reliability requirements of inter-dependent tasks. In this paper, we design a reliability-driven scheduling architecture that can effectively measure system reliability, based on an optimal reliability communication path search algorithm, and then we introduce reliability priority rank (RRank) to estimate the task's priority by considering reliability overheads. Furthermore, based on directed acyclic graph (DAG) we propose a reliability-aware scheduling algorithm for precedence constrained tasks, which can achieve high quality of reliability for applications. The comparison studies, based on both randomly generated graphs and the graphs of some real applications, show that our scheduling algorithm outperforms the existing scheduling algorithms in terms of makespan, scheduling length ratio, and reliability. At the same time, the improvement gained by our algorithm increases as the data communication among tasks increases. © 2010 Elsevier Inc. All rights reserved.
Source Title: Journal of Parallel and Distributed Computing
URI: http://scholarbank.nus.edu.sg/handle/10635/57247
ISSN: 0743-7315
DOI: 10.1016/j.jpdc.2010.05.002
Appears in Collections: Staff Publications


SCOPUS™ Citations: 43 (checked on Sep 17, 2018)
WEB OF SCIENCE™ Citations: 36 (checked on Aug 29, 2018)
Page view(s): 61 (checked on Aug 17, 2018)
