Scalable kNN search on vertically stored time series

Please use this identifier to cite or link to this item: https://doi.org/10.1145/2020408.2020607

DC Field	Value
dc.title	Scalable kNN search on vertically stored time series
dc.contributor.author	Kashyap, S.
dc.contributor.author	Karras, P.
dc.date.accessioned	2021-09-10T02:00:58Z
dc.date.available	2021-09-10T02:00:58Z
dc.date.issued	2011-08-21
dc.identifier.citation	Kashyap, S., Karras, P. (2011-08-21). Scalable kNN search on vertically stored time series. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining : 1334 - 1342. ScholarBank@NUS Repository. https://doi.org/10.1145/2020408.2020607
dc.identifier.isbn	9781450308137
dc.identifier.uri	https://scholarbank.nus.edu.sg/handle/10635/200474
dc.description.abstract	Nearest-neighbor search over time series has received vast research attention as a basic data mining task. Still, none of the methods proposed so far scales well with increasing time-series length, because all of them provide only a one-off pruning capacity. In particular, traditional methods utilize an index to search in a reduced-dimensionality feature space; however, for long time series, search with such an index yields many false hits that need to be eliminated by accessing the full records. An attempt to reduce false hits by indexing more features exacerbates the curse of dimensionality, and vice versa. A recently proposed alternative, iSAX, uses symbolic approximate representations accessed by a simple file-system directory as an index. Still, iSAX also encounters false hits, which are again eliminated by accessing records in full: once a false hit is generated by the index, there is no second chance to prune it; thus, the pruning capacity iSAX provides is also one-off. This paper proposes an alternative approach to time-series kNN search, following a nontraditional pruning style. Instead of navigating through candidate records via an index, we access their features, obtained by a multi-resolution transform, in a stepwise sequential-scan manner, one level of resolution at a time, over a vertical representation. Most candidates are progressively eliminated after a few of their terms are accessed, using pre-computed information and an unprecedentedly tight double-bounding scheme, involving not only lower, but also upper distance bounds. Our experimental study with large, high-length time-series data confirms the advantage of our approach over both the current state-of-the-art method, iSAX, and classical index-based methods. Copyright 2011 ACM.
dc.description.uri	https://dl.acm.org/doi/10.1145/2020408.2020607
dc.language.iso	en
dc.source	Scopus
dc.subject	Algorithms
dc.subject	Experimentation
dc.subject	Performance
dc.subject	Theory
dc.type	Conference Paper
dc.contributor.department	COMPUTATIONAL SCIENCE
dc.description.doi	10.1145/2020408.2020607
dc.description.sourcetitle	Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
dc.description.page	1334 - 1342
dc.published.state	Published
Appears in Collections:	Staff Publications
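
Illustrative sketch (not taken from the paper): the abstract describes a level-by-level scan over vertically stored transform coefficients, pruning with both lower and upper distance bounds. The Python below is one possible reading of that idea under stated assumptions, and is not the authors' algorithm. It assumes an orthonormal Haar transform as the multi-resolution transform (the abstract does not name one), series whose length is a power of two, squared Euclidean distance, and bounds derived from the triangle inequality on the coefficients not yet read. All names (haar, level_bounds, tail_norms, knn) are hypothetical.

```python
import math

def haar(x):
    """Orthonormal Haar transform, coarsest coefficient first.
    Assumes len(x) is a power of two."""
    approx, details = list(x), []
    while len(approx) > 1:
        half = len(approx) // 2
        avg = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        dif = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details = dif + details          # finer details go after coarser ones
        approx = avg
    return approx + details

def level_bounds(n):
    """Column ranges per resolution level: [0,1), [1,2), [2,4), [4,8), ..."""
    bounds, start, width = [(0, 1)], 1, 1
    while start < n:
        bounds.append((start, start + width))
        start += width
        width *= 2
    return bounds

def tail_norms(coeffs, bounds):
    """Norm of the coefficients still unread after each level.
    This plays the role of pre-computed information in this sketch."""
    return [math.sqrt(sum(c * c for c in coeffs[end:])) for _, end in bounds]

def knn(query, series, k):
    """Return indices of the k nearest series, nearest first."""
    n = len(query)
    bounds = level_bounds(n)
    coeffs = [haar(s) for s in series]
    # Vertical layout: one column per coefficient position, across all series.
    columns = [[c[j] for c in coeffs] for j in range(n)]
    tails = [tail_norms(c, bounds) for c in coeffs]
    q = haar(query)
    q_tails = tail_norms(q, bounds)

    alive = list(range(len(series)))
    partial = [0.0] * len(series)        # squared distance over columns read so far
    for lvl, (lo, hi) in enumerate(bounds):
        for j in range(lo, hi):          # sequentially scan this level's columns
            col = columns[j]
            for i in alive:
                d = q[j] - col[i]
                partial[i] += d * d
        # Bounds on the full squared distance from the unread tails.
        lower = {i: partial[i] + (q_tails[lvl] - tails[i][lvl]) ** 2 for i in alive}
        upper = {i: partial[i] + (q_tails[lvl] + tails[i][lvl]) ** 2 for i in alive}
        if len(alive) > k:
            kth_upper = sorted(upper.values())[k - 1]
            # A candidate whose lower bound exceeds the k-th smallest upper
            # bound cannot be among the k nearest neighbors.
            alive = [i for i in alive if lower[i] <= kth_upper]
    # After the last level the tails are empty, so partial[i] is exact.
    return sorted(alive, key=lambda i: (partial[i], i))[:k]
```

Because the transform is orthonormal, distances computed on coefficients equal distances on the raw series, so the survivors of the last level are ranked by their exact distance. A production version would need to guard against floating-point rounding when comparing bounds; this sketch does not.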
