Please use this identifier to cite or link to this item: https://doi.org/10.1137/070685373
Title: Breaking a time-and-space barrier in constructing full-text indices
Authors: Hon, W.-K.
Sadakane, K.
Sung, W.-K. 
Keywords: Preprocessing
Suffix arrays
Suffix trees
Text indexing
Issue Date: 2009
Citation: Hon, W.-K., Sadakane, K., Sung, W.-K. (2009). Breaking a time-and-space barrier in constructing full-text indices. SIAM Journal on Computing 38 (6) : 2162-2178. ScholarBank@NUS Repository. https://doi.org/10.1137/070685373
Abstract: Suffix trees and suffix arrays are the most prominent full-text indices, and their construction algorithms are well studied. In the literature, the fastest algorithm runs in O (n) time, while it requires O (n log n)-bit working space, where n denotes the length of the text. On the other hand, the most space-efficient algorithm requires O (n)-bit working space while it runs in O (n log n) time. It was open whether these indices can be constructed in both o (n log n) time and o (n log n)-bit working space. This paper breaks the above time-and-space barrier under the unit-cost word RAM. We give an algorithm for constructing the suffix array, which takes O (n) time and O (n)-bit working space, for texts with constant-size alphabets. Note that both the time and the space bounds are optimal. For constructing the suffix tree, our algorithm requires O (n log∈ n) time and O (n)-bit working space for any 0 < ∈ < 1. Apart from that, our algorithm can also be adopted to build other existing full-text indices, such as compressed suflix tree, compressed suffix arrays, and FM-index. We also study the general case where the size of the alphabet S is not constant. Our algorithm can construct a suffix array and a suffix tree using optimal O (n log |Σ|)-bit working space while running in O (n log log |Σ|) time and O (n (log∈ n + log|Σ|)) time, respectively. These are the first algorithms that achieve o (n log n) time with optimal working space. Moreover, for the special case where log |Σ| = O ((log log n)1-∈), we can speed up our suffix array construction algorithm to the optimal O (n). © 2009 Society for Industrial and Applied Mathematics.
Source Title: SIAM Journal on Computing
URI: http://scholarbank.nus.edu.sg/handle/10635/39472
ISSN: 00975397
DOI: 10.1137/070685373
Appears in Collections:Staff Publications

Show full item record
Files in This Item:
There are no files associated with this item.

Google ScholarTM

Check

Altmetric


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.