Please use this identifier to cite or link to this item: https://doi.org/10.1109/TC.2008.172
DC FieldValue
dc.titleOn the design of fault-tolerant scheduling strategies using primary-backup approach for computational grids with low replication costs
dc.contributor.authorZheng, Q.
dc.contributor.authorVeeravalli, B.
dc.contributor.authorTham, C.-K.
dc.date.accessioned2014-06-17T02:59:50Z
dc.date.available2014-06-17T02:59:50Z
dc.date.issued2009
dc.identifier.citationZheng, Q., Veeravalli, B., Tham, C.-K. (2009). On the design of fault-tolerant scheduling strategies using primary-backup approach for computational grids with low replication costs. IEEE Transactions on Computers 58 (3) : 380-393. ScholarBank@NUS Repository. https://doi.org/10.1109/TC.2008.172
dc.identifier.issn00189340
dc.identifier.urihttp://scholarbank.nus.edu.sg/handle/10635/56890
dc.description.abstractFault-tolerant scheduling is an imperative step for large-scale computational Grid systems, as often geographically distributed nodes cooperate to execute a task. By and large, primary-backup approach is a common methodology used for fault tolerance wherein each task has a primary copy and a backup copy on two different processors. For independent tasks, the backup copy can overload with other backup copies on the same processor, as long as their corresponding primary copies are scheduled on different processors. However, for dependent tasks, precedence constraint among tasks must be considered when scheduling backup copies and overloading backups. In this paper, we first identify two cases that may happen when scheduling dependent tasks with primary-backup approach. For one of the cases, we derive two important constraints that must be satisfied. Further, we show that these two constraints play a crucial role in limiting the schedulability and overloading efficiency of backups of dependent tasks. We then propose two strategies to improve schedulability and overloading efficiency, respectively. We propose two algorithms, called the Minimum Replication Cost with Early Completion Time (MRC-ECT) algorithm and the Minimum Completion Time with Less Replication Cost (MCT-LRC) algorithm, to schedule backups of independent jobs and dependent jobs, respectively. Algorithm MRC-ECT is shown to guarantee an optimal backup schedule in terms of replication cost for an independent task, while MCT-LRC can schedule a backup of a dependent task with minimum completion time and less replication cost. We conduct extensive simulation experiments to quantify the performance of the proposed algorithms and strategies. © 2009 IEEE.
dc.description.urihttp://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/TC.2008.172
dc.sourceScopus
dc.subjectDirected acyclic graphs
dc.subjectFault-tolerance
dc.subjectGrid computing
dc.subjectIndependent tasks
dc.subjectPrimary-backup
dc.typeArticle
dc.contributor.departmentELECTRICAL & COMPUTER ENGINEERING
dc.description.doi10.1109/TC.2008.172
dc.description.sourcetitleIEEE Transactions on Computers
dc.description.volume58
dc.description.issue3
dc.description.page380-393
dc.description.codenITCOB
dc.identifier.isiut000262561900008
Appears in Collections:Staff Publications

Show simple item record
Files in This Item:
There are no files associated with this item.

Google ScholarTM

Check

Altmetric


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.