Please use this identifier to cite or link to this item: https://doi.org/10.1016/j.ins.2012.02.004
Title: Efficient processing of probabilistic set-containment queries on uncertain set-valued data
Authors: Zhang, X.
Chen, K.
Shou, L.
Chen, G.
Gao, Y.
Tan, K.-L. 
Keywords: Expected Jaccard containment
Probabilistic set-containment query/join
Set containment query
Uncertain query
Uncertain set-valued attributes
Uncertain set-valued data
Issue Date: 2012
Citation: Zhang, X., Chen, K., Shou, L., Chen, G., Gao, Y., Tan, K.-L. (2012). Efficient processing of probabilistic set-containment queries on uncertain set-valued data. Information Sciences 196: 97-117. ScholarBank@NUS Repository. https://doi.org/10.1016/j.ins.2012.02.004
Abstract: Set-valued data is a natural and concise representation for modeling complex objects. As an important operation in object-oriented or object-relational databases, set containment query processing over set-valued data has been extensively studied in previous work. Recently, there has been a growing realization that uncertain information is a first-class citizen in modern database management. As such, there is a strong demand for the study of set containment queries over uncertain set-valued data. This paper investigates how set-containment queries over uncertain set-valued data can be efficiently processed. Based on the popular possible world semantics, we first present a practical model in which the uncertainty in set-valued data is represented by existential probabilities, and propose the probabilistic set containment semantics and its generalization, the expected Jaccard containment. Second, to avoid the expensive computation of enumerating all possible worlds, we develop efficient schemes for computing these two probabilistic semantics. Third, we introduce two important queries, namely the probability threshold containment query (PTCQ) and the probability threshold containment join (PTCJ), and propose novel techniques to process them efficiently. Finally, we conduct extensive experiments to study the efficiency of the proposed methods. The experimental results indicate that the proposed methods are efficient in processing uncertain set containment queries. © 2012 Published by Elsevier Inc.
Source Title: Information Sciences
URI: http://scholarbank.nus.edu.sg/handle/10635/77849
ISSN: 0020-0255
DOI: 10.1016/j.ins.2012.02.004
Appears in Collections: Staff Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.