Publication

Compressed index for dynamic text

Hon, W.-K.
Lam, T.-W.
Sadakane, K.
Sung, W.-K.
Yiu, S.-M.
Citations
Altmetric:
Alternative Title
Abstract
This paper investigates how to index a text which is subject to updates. The best solution in the literature is based on suffix tree using O(n log n) bits of storage, where n is the length of the text. It supports finding all occurrences of a pattern P in O(|P| + occ) time, where occ is the number of occurrences. Each text update consists of inserting or deleting a substring of length y and can be supported in O(y + √n) time. In this paper, we initiate the study of compressed index using only O(n log |Σ|) bits of space, where Σ denotes the alphabet. Our solution supports finding all occurrences of a pattern P in O(|P|log2 n(logε n + log |ΣE|) + occ log1+ε n) time, while insertion or deletion of a substring of length y can be done in O((y + √n) log 2+εn) amortized time, where 0 < ε ≤ 1. The core part of our data structure is based on the recent work on Compressed Suffix Trees (CST) and Compressed Suffix Arrays (CSA).
Keywords
Source Title
Data Compression Conference Proceedings
Publisher
Series/Report No.
Organizational Units
Organizational Unit
COMPUTER SCIENCE
dept
Rights
Date
2004
DOI
Type
Conference Paper
Additional Links
Related Datasets
Related Publications