Please use this identifier to cite or link to this item:
Title: Similarity evaluation on tree-structured data
Authors: Yang, R.
Kalnis, P. 
Tung, A.K.H. 
Issue Date: 2005
Source: Yang, R.,Kalnis, P.,Tung, A.K.H. (2005). Similarity evaluation on tree-structured data. Proceedings of the ACM SIGMOD International Conference on Management of Data : 754-765. ScholarBank@NUS Repository.
Abstract: Tree-structured data are becoming ubiquitous nowadays and manipulating them based on similarity is essential for many applications. The generally accepted similarity measure for trees is the edit distance. Although similarity search has been extensively studied, searching for similar trees is still an open problem due to the high complexity of computing the tree edit distance. In this paper, we propose to transform tree-structured data into an approximate numerical multidimensional vector which encodes the original structure information. We prove that the L 1 distance of the corresponding vectors, whose computational complexity is 0(|T 1| + |T 2|), forms a lower bound for the edit distance between trees. Based on the theoretical analysis, we describe a novel algorithm which embeds the proposed distance into a filter-and-refine framework to process similarity search on tree-structured data. The experimental results show that our algorithm reduces dramatically the distance computation cost. Our method is especially suitable for accelerating similarity query processing on large trees in massive datasets. Copyright 2005 ACM.
Source Title: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN: 07308078
Appears in Collections:Staff Publications

Show full item record
Files in This Item:
There are no files associated with this item.

Page view(s)

checked on Dec 9, 2017

Google ScholarTM


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.