Please use this identifier to cite or link to this item: https://doi.org/10.1016/j.eswa.2009.08.023
DC Field | Value
dc.title | Automatic accuracy assessment via hashing in multiple-source environment
dc.contributor.author | Han, J.
dc.contributor.author | Jiang, D.
dc.contributor.author | Li, L.
dc.date.accessioned | 2013-07-04T07:32:02Z
dc.date.available | 2013-07-04T07:32:02Z
dc.date.issued | 2010
dc.identifier.citation | Han, J., Jiang, D., Li, L. (2010). Automatic accuracy assessment via hashing in multiple-source environment. Expert Systems with Applications 37 (3) : 2609-2620. ScholarBank@NUS Repository. https://doi.org/10.1016/j.eswa.2009.08.023
dc.identifier.issn | 09574174
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/39015
dc.description.abstract | Accuracy is one of the most important data quality dimensions, and its assessment is a key issue in data management. Most current studies focus on qualitative analysis of the accuracy dimension, and that analysis depends heavily on expert knowledge. Little work has addressed how to quantify the accuracy dimension automatically. Based on the Jensen-Shannon divergence (JSD) measure, we propose that the accuracy of data can be quantified automatically by comparing the data with the closest approximation of its entity in the available context. To identify the closest approximation quickly in large-scale data sources, locality-sensitive hashing (LSH) is employed to extract it at multiple levels, namely the column, record and field levels. Our approach not only gives each data source an objective accuracy score very quickly, as long as a context member is available, but also avoids laborious human interaction. As an automatic accuracy assessment solution in a multiple-source environment, our approach is distinctive, especially for large-scale data sources. Theory and experiments show that our approach performs well in obtaining metadata on the accuracy dimension. © 2009 Elsevier Ltd. All rights reserved.
dc.description.uri | http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1016/j.eswa.2009.08.023
dc.source | Scopus
dc.subject | Accuracy
dc.subject | Automatic assessment
dc.subject | Context
dc.subject | Data quality
dc.subject | Jensen-Shannon divergence (JSD)
dc.subject | Locality-sensitive hashing (LSH)
dc.type | Article
dc.contributor.department | COMPUTER SCIENCE
dc.description.doi | 10.1016/j.eswa.2009.08.023
dc.description.sourcetitle | Expert Systems with Applications
dc.description.volume | 37
dc.description.issue | 3
dc.description.page | 2609-2620
dc.description.coden | ESAPE
dc.identifier.isiut | 000272846500089
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.