Please use this identifier to cite or link to this item: https://doi.org/10.1007/978-3-642-00887-0_54
DC Field: Value
dc.title: Detecting aggregate incongruities in XML
dc.contributor.author: Hsu, W.
dc.contributor.author: Lau, Q.P.
dc.contributor.author: Lee, M.L.
dc.date.accessioned: 2013-07-04T08:14:10Z
dc.date.available: 2013-07-04T08:14:10Z
dc.date.issued: 2009
dc.identifier.citation: Hsu, W., Lau, Q.P., Lee, M.L. (2009). Detecting aggregate incongruities in XML. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 5463 : 601-615. ScholarBank@NUS Repository. https://doi.org/10.1007/978-3-642-00887-0_54
dc.identifier.isbn: 9783642008863
dc.identifier.issn: 03029743
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/40866
dc.description.abstract: The problem of identifying deviating patterns in XML repositories has important applications in data cleaning, fraud detection, and stock market analysis. Current methods determine data discrepancies by assessing whether the data conforms to the expected distribution of its immediate neighborhood. This approach may miss interesting deviations involving aggregated information. For example, the average number of transactions of a particular bank account may be exceptionally high compared to other accounts with similar profiles. Such an incongruity can only be revealed by aggregating the appropriate data and analyzing the aggregated results in the associated neighborhood. This neighborhood is implicitly encapsulated in the XML structure. In addition, the hierarchical nature of the XML structure reflects the different levels of abstraction in the real world. This work presents a framework that detects incongruities in aggregate information. It utilizes the inherent characteristics of the XML structure to systematically aggregate leaf-level data and propagate the aggregated information up the hierarchy. The aggregated information is analyzed using a novel method that first clusters similar data, then assumes a statistical distribution and identifies aggregate incongruities within the clusters. Experimental results indicate that the proposed approach is effective in detecting interesting discrepancies in a real-world bank data set.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1007/978-3-642-00887-0_54
dc.source: Scopus
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.description.doi: 10.1007/978-3-642-00887-0_54
dc.description.sourcetitle: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.description.volume: 5463
dc.description.page: 601-615
dc.identifier.isiut: NOT_IN_WOS
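
Illustrative sketch (not part of the repository record): the abstract describes aggregating leaf-level data per XML node, grouping similar nodes, and flagging aggregates that deviate within a group. The Python below is a minimal sketch of that general idea and is not the authors' framework. Everything specific in it is an assumption: the element and attribute names (bank, account, txn, type), grouping by the account type attribute in place of the paper's clustering step, a normal-distribution z-score as the assumed statistical test, and the threshold of 2.0.

```python
import statistics
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_sample():
    """Build a hypothetical bank document; all names and counts are invented."""
    root = ET.Element("bank")
    txn_counts = {
        "savings": [2, 2, 3, 2, 3, 2, 20],   # last savings account is unusually busy
        "current": [19, 20, 21, 20, 22, 20],  # ~20 transactions is normal here
    }
    for profile, counts in txn_counts.items():
        for i, n in enumerate(counts, start=1):
            acc = ET.SubElement(root, "account", id=f"{profile[0].upper()}{i}", type=profile)
            for _ in range(n):
                ET.SubElement(acc, "txn", amount="10")
    return root


def aggregate_counts(root):
    """Aggregate leaf-level <txn> elements into one count per <account>."""
    return {
        acc.get("id"): (acc.get("type"), len(acc.findall("txn")))
        for acc in root.iter("account")
    }


def find_incongruities(aggregates, threshold=2.0, min_members=3):
    """Flag accounts whose aggregate deviates strongly from peers in the same group.

    Grouping by profile stands in for a real clustering step (an assumption),
    and the z-score assumes roughly normal aggregates within each group.
    """
    groups = defaultdict(list)
    for acc_id, (profile, count) in aggregates.items():
        groups[profile].append((acc_id, count))

    flagged = []
    for members in groups.values():
        counts = [c for _, c in members]
        if len(counts) < min_members:
            continue  # too few peers to estimate a distribution
        mean = statistics.mean(counts)
        stdev = statistics.pstdev(counts)
        if stdev == 0:
            continue  # all peers identical, nothing deviates
        for acc_id, count in members:
            if abs(count - mean) / stdev > threshold:
                flagged.append(acc_id)
    return flagged


if __name__ == "__main__":
    print(find_incongruities(aggregate_counts(build_sample())))
```

In this toy data, only the savings account with 20 transactions is flagged. The same count is unremarkable among the current accounts, which is the point the abstract makes about judging an aggregate against its own neighborhood rather than against the whole repository.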
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.