|Title:||Using interval association rules to identify dubious data values|
|Source:||Lu, R., Lee, M.L., Hsu, W. (2004). Using interval association rules to identify dubious data values. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3129: 528-538. ScholarBank@NUS Repository.|
|Abstract:||A hard-to-catch erroneous data value is one that looks perfectly legitimate. Yet, if we examine this value in conjunction with other attribute values, the value appears questionable. Detecting such dubious values is a major problem in data cleaning. This paper presents a framework to automatically detect dubious data values in datasets. Data is first pre-processed by data smoothing and mapping. Next, interval association rules are generated, which involves partitioning the data via clustering before the rules are generated using an Apriori algorithm. Finally, these rules are used to identify data values that fall outside the expected intervals. Experimental results show that the proposed framework is able to accurately and efficiently identify dubious values in large datasets. © Springer-Verlag 2004.|
|Source Title:||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Appears in Collections:||Staff Publications|
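The pipeline outlined in the abstract (partition a numeric attribute into intervals, derive rules that tie other attribute values to those intervals, then flag values outside the expected intervals) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the paper's algorithms, so the gap-based 1-D clustering, the single-antecedent rule mining (a simplification, not full Apriori), the function names, the thresholds, and the toy data are all assumptions made for illustration. The smoothing and mapping steps are omitted.

```python
from collections import defaultdict


def cluster_1d(values, max_gap):
    """Assumed stand-in for the clustering step: split sorted values into
    intervals, starting a new interval when the gap exceeds max_gap."""
    ordered = sorted(values)
    intervals = []
    lo = hi = ordered[0]
    for v in ordered[1:]:
        if v - hi > max_gap:
            intervals.append((lo, hi))
            lo = v
        hi = v
    intervals.append((lo, hi))
    return intervals


def mine_interval_rules(records, max_gap, min_support, min_confidence):
    """Derive rules of the form 'category -> value in [lo, hi]'.
    Only single-antecedent rules are considered (simplified, not Apriori)."""
    by_category = defaultdict(list)
    for category, value in records:
        by_category[category].append(value)

    total = len(records)
    rules = {}
    for category, values in by_category.items():
        kept = []
        for lo, hi in cluster_1d(values, max_gap):
            count = sum(lo <= v <= hi for v in values)
            support = count / total            # share of all records
            confidence = count / len(values)   # share within this category
            if support >= min_support and confidence >= min_confidence:
                kept.append((lo, hi))
        if kept:
            rules[category] = kept
    return rules


def find_dubious(records, rules):
    """Return indices of records whose value lies outside every interval
    expected for their category."""
    dubious = []
    for i, (category, value) in enumerate(records):
        intervals = rules.get(category)
        if intervals is None:
            continue  # no rule for this category, so nothing to check
        if not any(lo <= value <= hi for lo, hi in intervals):
            dubious.append(i)
    return dubious


# Hypothetical toy data: (age group, weight in kg).
records = [
    ("infant", 3.1), ("infant", 3.4), ("infant", 3.0), ("infant", 3.6),
    ("infant", 70.0),  # a plausible weight in general, but not for an infant
    ("adult", 68.0), ("adult", 72.5), ("adult", 70.2), ("adult", 75.0),
    ("adult", 66.0),
]
rules = mine_interval_rules(records, max_gap=5.0,
                            min_support=0.2, min_confidence=0.6)
dubious = find_dubious(records, rules)
```

In this toy data the value 70.0 is unremarkable on its own and only becomes questionable in conjunction with the "infant" attribute, which is the kind of case the abstract describes.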