Please use this identifier to cite or link to this item:
Title: Efficient and effective KNN sequence search with approximate ngrams
Authors: Wang, X.
Ding, X.
Tung, A.K.H. 
Zhang, Z.
Issue Date: 2013
Citation: Wang, X.,Ding, X.,Tung, A.K.H.,Zhang, Z. (2013). Efficient and effective KNN sequence search with approximate ngrams. Proceedings of the VLDB Endowment 7 (1) : 1-12. ScholarBank@NUS Repository.
Abstract: In this paper, we address the problem of finding k-nearest neighbors (KNN) in sequence databases using the edit distance. Unlike most existing works using short and exact ngram matchings together with a filter-and-refine framework for KNN sequence search, our new approach allows us to use longer but approximate n-gram matchings as a basis of KNN candidates pruning. Based on this new idea, we devise a pipeline framework over a two-level index for searching KNN in the sequence database. By coupling this framework together with several efficient filtering strategies, i.e. the frequency queue and the well-known Combined Algorithm (CA), our proposal brings various enticing advantages over existing works, including 1) huge reduction on false positive candidates to avoid large overheads on candidate verifications; 2) progressive result update and early termination; and 3) good extensibility to parallel computation. We conduct extensive experiments on three real datasets to verify the superiority of the proposed framework. © 2013 VLDB Endowment 21508097/13/09... $ 10.00.
Source Title: Proceedings of the VLDB Endowment
ISSN: 21508097
Appears in Collections:Staff Publications

Show full item record
Files in This Item:
There are no files associated with this item.

Page view(s)

checked on Oct 20, 2018

Google ScholarTM


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.