Please use this identifier to cite or link to this item: https://doi.org/10.1137/070685373
DC FieldValue
dc.titleBreaking a time-and-space barrier in constructing full-text indices
dc.contributor.authorHon, W.-K.
dc.contributor.authorSadakane, K.
dc.contributor.authorSung, W.-K.
dc.date.accessioned2013-07-04T07:42:22Z
dc.date.available2013-07-04T07:42:22Z
dc.date.issued2009
dc.identifier.citationHon, W.-K., Sadakane, K., Sung, W.-K. (2009). Breaking a time-and-space barrier in constructing full-text indices. SIAM Journal on Computing 38 (6) : 2162-2178. ScholarBank@NUS Repository. https://doi.org/10.1137/070685373
dc.identifier.issn00975397
dc.identifier.urihttp://scholarbank.nus.edu.sg/handle/10635/39472
dc.description.abstractSuffix trees and suffix arrays are the most prominent full-text indices, and their construction algorithms are well studied. In the literature, the fastest algorithm runs in O (n) time, while it requires O (n log n)-bit working space, where n denotes the length of the text. On the other hand, the most space-efficient algorithm requires O (n)-bit working space while it runs in O (n log n) time. It was open whether these indices can be constructed in both o (n log n) time and o (n log n)-bit working space. This paper breaks the above time-and-space barrier under the unit-cost word RAM. We give an algorithm for constructing the suffix array, which takes O (n) time and O (n)-bit working space, for texts with constant-size alphabets. Note that both the time and the space bounds are optimal. For constructing the suffix tree, our algorithm requires O (n log∈ n) time and O (n)-bit working space for any 0 < ∈ < 1. Apart from that, our algorithm can also be adopted to build other existing full-text indices, such as compressed suflix tree, compressed suffix arrays, and FM-index. We also study the general case where the size of the alphabet S is not constant. Our algorithm can construct a suffix array and a suffix tree using optimal O (n log |Σ|)-bit working space while running in O (n log log |Σ|) time and O (n (log∈ n + log|Σ|)) time, respectively. These are the first algorithms that achieve o (n log n) time with optimal working space. Moreover, for the special case where log |Σ| = O ((log log n)1-∈), we can speed up our suffix array construction algorithm to the optimal O (n). © 2009 Society for Industrial and Applied Mathematics.
dc.description.urihttp://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1137/070685373
dc.sourceScopus
dc.subjectPreprocessing
dc.subjectSuffix arrays
dc.subjectSuffix trees
dc.subjectText indexing
dc.typeArticle
dc.contributor.departmentCOMPUTER SCIENCE
dc.description.doi10.1137/070685373
dc.description.sourcetitleSIAM Journal on Computing
dc.description.volume38
dc.description.issue6
dc.description.page2162-2178
dc.description.codenSMJCA
dc.identifier.isiut000266018400003
Appears in Collections:Staff Publications

Show simple item record
Files in This Item:
There are no files associated with this item.

Google ScholarTM

Check

Altmetric


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.