Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/39485
Title: Efficient and effective similarity search over probabilistic data based on Earth Mover's Distance
Authors: Xu, J.
Zhang, Z. 
Tung, A.K.H. 
Yu, G.
Issue Date: 2010
Citation: Xu, J.,Zhang, Z.,Tung, A.K.H.,Yu, G. (2010). Efficient and effective similarity search over probabilistic data based on Earth Mover's Distance. Proceedings of the VLDB Endowment 3 (1) : 758-769. ScholarBank@NUS Repository.
Abstract: Probabilistic data is coming as a new deluge along with the technical advances on geographical tracking, multimedia processing, sensor network and RFID. While similarity search is an important functionality supporting the manipulation of probabilistic data, it raises new challenges to traditional relational database. The problem stems from the limited effectiveness of the distance metric supported by the existing database system. On the other hand, some complicated distance operators have proven their values for better distinguishing ability in the probabilistic domain. In this paper, we discuss the similarity search problem with the Earth Mover's Distance, which is the most successful distance metric on probabilistic histograms and an expensive operator with cubic complexity. We present a new database approach to answer range queries and k-nearest neighbor queries on probabilistic data, on the basis of Earth Mover's Distance. Our solution utilizes the primal-dual theory in linear programming and deploys B + tree index structures for effective candidate pruning. Extensive experiments show that our proposal dramatically improves the scalability of probabilistic databases. © 2010 VLDB Endowment.
Source Title: Proceedings of the VLDB Endowment
URI: http://scholarbank.nus.edu.sg/handle/10635/39485
ISSN: 21508097
Appears in Collections:Staff Publications

Show full item record
Files in This Item:
There are no files associated with this item.

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.