Please use this identifier to cite or link to this item: https://doi.org/10.1109/WI-IAT.2010.14
Title: Hierarchical cost-sensitive web resource acquisition for record matching
Authors: Tan, Y.F.; Kan, M.-Y.
Issue Date: 2010
Citation: Tan, Y.F., Kan, M.-Y. (2010). Hierarchical cost-sensitive web resource acquisition for record matching. Proceedings - 2010 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2010 1: 382-389. ScholarBank@NUS Repository. https://doi.org/10.1109/WI-IAT.2010.14
Abstract: Web information is increasingly used as evidence in solving various problems, including record matching. However, acquiring web-based resources is slow and can incur other access costs. As such, solutions often acquire only a subset of the resources to achieve a balance between acquisition cost and benefit. Unfortunately, existing work has largely ignored the issue of which resources to acquire. It also fails to emphasize the hierarchical nature of resource acquisitions, e.g., the search engine results for two queries must be obtained before their TF-IDF cosine similarity can be computed. In this paper, we propose a framework for performing cost-sensitive acquisition of resources with hierarchical dependencies, and apply it to the web resource context. Our framework is versatile, and we show that a large variety of problems can be formulated using resource dependency graphs. We solve the resource acquisition problem by casting it as a combinatorial search problem. Finally, we demonstrate the effectiveness of our acquisition framework on record matching problems of different domains. © 2010 IEEE.
Source Title: Proceedings - 2010 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2010
URI: http://scholarbank.nus.edu.sg/handle/10635/40678
ISBN: 9780769541914
DOI: 10.1109/WI-IAT.2010.14
Appears in Collections: Staff Publications

