Please use this identifier to cite or link to this item: https://doi.org/10.1109/ICDM.2006.132
DC Field: Value
dc.title: Rapid identification of column heterogeneity
dc.contributor.author: Dai, B.T.
dc.contributor.author: Koudas, N.
dc.contributor.author: Ooi, B.C.
dc.contributor.author: Srivastava, D.
dc.contributor.author: Venkatasubramanian, S.
dc.date.accessioned: 2013-07-04T08:26:35Z
dc.date.available: 2013-07-04T08:26:35Z
dc.date.issued: 2007
dc.identifier.citation: Dai, B.T., Koudas, N., Ooi, B.C., Srivastava, D., Venkatasubramanian, S. (2007). Rapid identification of column heterogeneity. Proceedings - IEEE International Conference on Data Mining, ICDM: 159-170. ScholarBank@NUS Repository. https://doi.org/10.1109/ICDM.2006.132
dc.identifier.isbn: 0769527019
dc.identifier.issn: 15504786
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/41397
dc.description.abstract: Data quality is a serious concern in every data management application, and a variety of quality measures have been proposed, e.g., accuracy, freshness and completeness, to capture common sources of data quality degradation. We identify and focus attention on a novel measure, column heterogeneity, that seeks to quantify the data quality problems that can arise when merging data from different sources. We identify desiderata that a column heterogeneity measure should intuitively satisfy, and describe our technique to quantify database column heterogeneity based on using a novel combination of cluster entropy and soft clustering. Finally, we present detailed experimental results, using diverse data sets of different types, to demonstrate that our approach provides a robust mechanism for identifying and quantifying database column heterogeneity. © 2006 IEEE.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/ICDM.2006.132
dc.source: Scopus
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.description.doi: 10.1109/ICDM.2006.132
dc.description.sourcetitle: Proceedings - IEEE International Conference on Data Mining, ICDM
dc.description.page: 159-170
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.