Handling biological sequence alignments on networked computing systems: A divide-and-conquer approach

Please use this identifier to cite or link to this item: https://doi.org/10.1016/j.jpdc.2009.04.014

DC Field	Value
dc.title	Handling biological sequence alignments on networked computing systems: A divide-and-conquer approach
dc.contributor.author	Bharadwaj, V.
dc.contributor.author	Wong, H.M.
dc.date.accessioned	2014-06-17T02:51:34Z
dc.date.available	2014-06-17T02:51:34Z
dc.date.issued	2009-10
dc.identifier.citation	Bharadwaj, V., Wong, H.M. (2009-10). Handling biological sequence alignments on networked computing systems: A divide-and-conquer approach. Journal of Parallel and Distributed Computing 69 (10) : 854-865. ScholarBank@NUS Repository. https://doi.org/10.1016/j.jpdc.2009.04.014
dc.identifier.issn	07437315
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/56174
dc.description.abstract	In this paper, we address the biological sequence alignment problem, which is one of the most commonly used steps in several bioinformatics applications. We employ the Divisible Load Theory (DLT) paradigm that is suitable for handling large-scale processing on network-based systems to achieve a high degree of parallelism. Using the DLT paradigm, we propose a strategy in which we carefully partition the computation work load among the processors in the system so as to minimize the overall computation time of determining the maximum similarity between the DNA/protein sequences. We consider handling such a computational problem on networked computing platforms connected as a linear daisy chain. We derive the individual load quantum to be assigned to the processors according to computation and communication link speeds along the chain. We consider two cases of sequence alignment where post-processes, i.e., trace-back processes that are required to determine an optimal alignment, may or may not be done at individual processors in the system. We derive some critical conditions to determine if our strategies are able to yield an optimal processing time. We apply three different heuristic strategies proposed in the literature to generate sub-optimal solutions for processing time when the above conditions cannot be satisfied. To testify the proposed schemes, we use real-life DNA samples of house mouse mitochondrion and the DNA of human mitochondrion obtained from the public database GenBank [GenBank, http://www.ncbi.nlm.nih.gov] in our simulation experiments. By this study, we conclusively demonstrate the applicability and potential of the DLT paradigm to such biological sequence related computational problems. © 2009 Elsevier Inc. All rights reserved.
dc.description.uri	http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1016/j.jpdc.2009.04.014
dc.source	Scopus
dc.subject	Biological sequences
dc.subject	Communication delay
dc.subject	Divisible load theory
dc.subject	Linear networks
dc.subject	Sequence alignment
dc.subject	Smith-Waterman algorithm
dc.type	Article
dc.contributor.department	ELECTRICAL & COMPUTER ENGINEERING
dc.description.doi	10.1016/j.jpdc.2009.04.014
dc.description.sourcetitle	Journal of Parallel and Distributed Computing
dc.description.volume	69
dc.description.issue	10
dc.description.page	854-865
dc.description.coden	JPDCE
dc.identifier.isiut	000273937500005
Appears in Collections:	Staff Publications

Show simple item record

Files in This Item:

There are no files associated with this item.

Google Scholar^TM

Check

Files in This Item:

Google ScholarTM

Altmetric

Google Scholar^TM