Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/13427
DC Field | Value | |
---|---|---|
dc.title | Correlation-based methods for data cleaning, with application to biological databases | |
dc.contributor.author | KOH LIE YONG | |
dc.date.accessioned | 2010-04-08T10:32:54Z | |
dc.date.available | 2010-04-08T10:32:54Z | |
dc.date.issued | 2007-09-25 | |
dc.identifier.citation | KOH LIE YONG (2007-09-25). Correlation-based methods for data cleaning, with application to biological databases. ScholarBank@NUS Repository. | |
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/13427 | |
dc.description.abstract | Data cleaning aims to improve data quality by detecting and eliminating data artifacts that hamper the efficacy of analysis or data mining. Despite its importance, data cleaning remains neglected in certain knowledge-driven domains such as bioinformatics. An in-depth study of real-world biological databases indicates that the biological data quality problem is multi-factorial and requires a number of different data cleaning approaches. Current data cleaning approaches that derive observations of data artifacts from the attribute values are inadequate. This thesis exploits the correlation patterns between attributes to provide additional information about the relationships embedded within a data set for data cleaning. We propose three novel correlation-based data cleaning methods to detect outliers and duplicates, and apply them to biological databases as proofs of concept. Experimental results show the effectiveness of these correlation-based data cleaning methods in detecting data artifacts that existing approaches fail to address. | |
dc.language.iso | en | |
dc.subject | data cleaning, correlation mining, biological data, data artifacts, duplicate detection, outlier detection | |
dc.type | Thesis | |
dc.contributor.department | COMPUTER SCIENCE | |
dc.contributor.supervisor | LEE MONG LI, JANICE | |
dc.description.degree | Ph.D | |
dc.description.degreeconferred | DOCTOR OF PHILOSOPHY | |
dc.identifier.isiut | NOT_IN_WOS | |
Appears in Collections: | Ph.D Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
KOHJLY.pdf | | 1.92 MB | Adobe PDF | OPEN | None | View/Download |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.