Title: CPS-tree: A compact partitioned suffix tree for disk-based indexing on large genome sequences
Authors: Wong, S.-S. 
Sung, W.-K. 
Wong, L. 
Issue Date: 2007
Citation: Wong, S.-S., Sung, W.-K., Wong, L. (2007). CPS-tree: A compact partitioned suffix tree for disk-based indexing on large genome sequences. Proceedings - International Conference on Data Engineering: 1350-1354. ScholarBank@NUS Repository.
Abstract: Suffix tree is an important data structure for indexing a long sequence (like a genome sequence) or a concatenation of sequences. It finds many applications in practice, especially in the domain of bioinformatics. Suffix tree allows for efficient pattern search with time independent of the sequence length. However, the performance of disk-based suffix tree is a concern as it is slowed down significantly by poor localized access resulting in high IO disk access. The focus of this paper is to design an IO-efficient and Compact Partitioned Suffix tree representation (CPS-tree) on disk. We show that representing suffix tree using CPS-tree has several advantages. First, our representation allows us to visit any node in the suffix tree by accessing at most log n pages of the tree, where n is the length of the sequence. Second, our storage scheme improves the access pattern and reduces the number of page faults, resulting in efficient search retrieval and efficient tree traversal operations. Third, by bit packing, our index is compact. Experimental results show that CPS-tree outperforms other indexes on disk. When fully loaded into the main memory, CPS-tree is still efficient. Hence, we expect CPS-tree to be a good disk-based representation of suffix tree, with potential use in practical applications. © 2007 IEEE.
Source Title: Proceedings - International Conference on Data Engineering
ISBN: 1424408032
ISSN: 10844627
DOI: 10.1109/ICDE.2007.369009
Appears in Collections: Staff Publications
