Please use this identifier to cite or link to this item:
https://doi.org/10.1109/ICDM.2006.132
DC Field | Value | |
---|---|---|
dc.title | Rapid identification of column heterogeneity | |
dc.contributor.author | Dai, B.T. | |
dc.contributor.author | Koudas, N. | |
dc.contributor.author | Ooi, B.C. | |
dc.contributor.author | Srivastava, D. | |
dc.contributor.author | Venkatasubramanian, S. | |
dc.date.accessioned | 2013-07-04T08:26:35Z | |
dc.date.available | 2013-07-04T08:26:35Z | |
dc.date.issued | 2007 | |
dc.identifier.citation | Dai, B.T., Koudas, N., Ooi, B.C., Srivastava, D., Venkatasubramanian, S. (2007). Rapid identification of column heterogeneity. Proceedings - IEEE International Conference on Data Mining, ICDM: 159-170. ScholarBank@NUS Repository. https://doi.org/10.1109/ICDM.2006.132 | |
dc.identifier.isbn | 0769527019 | |
dc.identifier.issn | 15504786 | |
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/41397 | |
dc.description.abstract | Data quality is a serious concern in every data management application, and a variety of quality measures have been proposed, e.g., accuracy, freshness and completeness, to capture common sources of data quality degradation. We identify and focus attention on a novel measure, column heterogeneity, that seeks to quantify the data quality problems that can arise when merging data from different sources. We identify desiderata that a column heterogeneity measure should intuitively satisfy, and describe our technique to quantify database column heterogeneity based on using a novel combination of cluster entropy and soft clustering. Finally, we present detailed experimental results, using diverse data sets of different types, to demonstrate that our approach provides a robust mechanism for identifying and quantifying database column heterogeneity. © 2006 IEEE. | |
dc.description.uri | http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/ICDM.2006.132 | |
dc.source | Scopus | |
dc.type | Conference Paper | |
dc.contributor.department | COMPUTER SCIENCE | |
dc.description.doi | 10.1109/ICDM.2006.132 | |
dc.description.sourcetitle | Proceedings - IEEE International Conference on Data Mining, ICDM | |
dc.description.page | 159-170 | |
dc.identifier.isiut | NOT_IN_WOS | |
Appears in Collections: | Staff Publications |