Please use this identifier to cite or link to this item: https://doi.org/10.1137/070685373
Title: Breaking a time-and-space barrier in constructing full-text indices
Authors: Hon, W.-K.
Sadakane, K.
Sung, W.-K. 
Keywords: Preprocessing
Suffix arrays
Suffix trees
Text indexing
Issue Date: 2009
Source: Hon, W.-K.,Sadakane, K.,Sung, W.-K. (2009). Breaking a time-and-space barrier in constructing full-text indices. SIAM Journal on Computing 38 (6) : 2162-2178. ScholarBank@NUS Repository. https://doi.org/10.1137/070685373
Abstract: Suffix trees and suffix arrays are the most prominent full-text indices, and their construction algorithms are well studied. In the literature, the fastest algorithm runs in O (n) time, while it requires O (n log n)-bit working space, where n denotes the length of the text. On the other hand, the most space-efficient algorithm requires O (n)-bit working space while it runs in O (n log n) time. It was open whether these indices can be constructed in both o (n log n) time and o (n log n)-bit working space. This paper breaks the above time-and-space barrier under the unit-cost word RAM. We give an algorithm for constructing the suffix array, which takes O (n) time and O (n)-bit working space, for texts with constant-size alphabets. Note that both the time and the space bounds are optimal. For constructing the suffix tree, our algorithm requires O (n log∈ n) time and O (n)-bit working space for any 0 < ∈ < 1. Apart from that, our algorithm can also be adopted to build other existing full-text indices, such as compressed suflix tree, compressed suffix arrays, and FM-index. We also study the general case where the size of the alphabet S is not constant. Our algorithm can construct a suffix array and a suffix tree using optimal O (n log |Σ|)-bit working space while running in O (n log log |Σ|) time and O (n (log∈ n + log|Σ|)) time, respectively. These are the first algorithms that achieve o (n log n) time with optimal working space. Moreover, for the special case where log |Σ| = O ((log log n)1-∈), we can speed up our suffix array construction algorithm to the optimal O (n). © 2009 Society for Industrial and Applied Mathematics.
Source Title: SIAM Journal on Computing
URI: http://scholarbank.nus.edu.sg/handle/10635/39472
ISSN: 00975397
DOI: 10.1137/070685373
Appears in Collections:Staff Publications

Show full item record
Files in This Item:
There are no files associated with this item.

SCOPUSTM   
Citations

31
checked on Dec 13, 2017

WEB OF SCIENCETM
Citations

22
checked on Nov 1, 2017

Page view(s)

66
checked on Dec 16, 2017

Google ScholarTM

Check

Altmetric


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.