Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/41262
Title: Practical aspects of compressed suffix arrays and FM-index in searching DNA sequences
Authors: Hon, W.-K.; Lam, T.-W.; Sung, W.-K.; Tse, W.-L.; Wong, C.-K.; Yiu, S.-M.
Issue Date: 2004
Source: Hon, W.-K., Lam, T.-W., Sung, W.-K., Tse, W.-L., Wong, C.-K., Yiu, S.-M. (2004). Practical aspects of compressed suffix arrays and FM-index in searching DNA sequences. Proceedings of the Sixth Workshop on Algorithm Engineering and Experiments and the First Workshop on Analytic Algorithms and Combinatorics: 31-38. ScholarBank@NUS Repository.
Abstract: Searching patterns in the DNA sequence is an important step in biological research. To speed up the search process, one can index the DNA sequence. However, classical indexing data structures like suffix trees and suffix arrays are not feasible for indexing DNA sequences due to main memory requirement, as DNA sequences can be very long. In this paper, we evaluate the performance of two compressed data structures, Compressed Suffix Array (CSA) and FM-index, in the context of searching and indexing DNA sequences. Our results show that CSA is better than FM-index for searching long patterns. We also investigate other practical aspects of the data structures such as the memory requirement for building the indexes.
Source Title: Proceedings of the Sixth Workshop on Algorithm Engineering and Experiments and the First Workshop on Analytic Algorithms and Combinatorics
URI: http://scholarbank.nus.edu.sg/handle/10635/41262
ISBN: 0898715644
Appears in Collections: Staff Publications
