Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/78084
DC Field: Value
dc.title: DeepDetect: An eXtensible system for detecting attribute outliers & duplicates in XML
dc.contributor.author: Lau, Q.P.
dc.contributor.author: Hsu, W.
dc.contributor.author: Koh, J.L.Y.
dc.contributor.author: Lee, M.L.
dc.date.accessioned: 2014-07-04T03:12:15Z
dc.date.available: 2014-07-04T03:12:15Z
dc.date.issued: 2009
dc.identifier.citation: Lau, Q.P., Hsu, W., Koh, J.L.Y., Lee, M.L. (2009). DeepDetect: An eXtensible system for detecting attribute outliers & duplicates in XML. Data Quality and High-Dimensional Data Analysis - Proceedings of the DASFAA 2008 Workshops: 6-20. ScholarBank@NUS Repository.
dc.identifier.isbn: 9814273481
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/78084
dc.description.abstract: XML, the eXtensible Markup Language, is fast evolving into the new standard for data representation and exchange on the WWW. This has resulted in a growing number of data cleaning techniques to locate "dirty" data (artifacts). In this paper, we present DeepDetect - an extensible system that detects attribute outliers and duplicates in XML documents. Attribute outlier detection finds objects that contain deviating values with respect to a relevant group of objects. This entails utilizing the correlation among element values in a given XML document. Duplicate detection in XML requires the identification of sub-trees that correspond to real world objects. Our system architecture enables sharing of common operations that prepare XML data for the various artifact detection techniques. DeepDetect also provides an intuitive visual interface for the user to specify various parameters for preprocessing and detection, as well as to view results.
dc.source: Scopus
dc.subject: Attribute outlier detection
dc.subject: Data cleaning
dc.subject: Data quality
dc.subject: Duplicate detection
dc.subject: XML
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.description.sourcetitle: Data Quality and High-Dimensional Data Analysis - Proceedings of the DASFAA 2008 Workshops
dc.description.page: 6-20
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.