Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/13427
DC Field: Value
dc.title: Correlation-based methods for data cleaning, with application to biological databases
dc.contributor.author: KOH LIE YONG
dc.date.accessioned: 2010-04-08T10:32:54Z
dc.date.available: 2010-04-08T10:32:54Z
dc.date.issued: 2007-09-25
dc.identifier.citation: KOH LIE YONG (2007-09-25). Correlation-based methods for data cleaning, with application to biological databases. ScholarBank@NUS Repository.
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/13427
dc.description.abstract: Data cleaning aims to improve data quality by detecting and eliminating data artifacts that hamper the efficacy of analysis or data mining. Despite its importance, data cleaning remains neglected in certain knowledge-driven domains such as bioinformatics. An in-depth study of real-world biological databases indicates that the biological data quality problem is multi-factorial and requires a number of different data cleaning approaches. Current data cleaning approaches that derive observations of data artifacts from the attribute values are inadequate. This thesis exploits the correlation patterns between attributes to provide additional information about the relationships embedded within a data set for data cleaning. We propose three novel correlation-based data cleaning methods to detect outliers and duplicates, and apply them to biological databases as proofs of concept. Experimental results show the effectiveness of these correlation-based data cleaning methods in detecting data artifacts that existing approaches fall short of addressing.
dc.language.iso: en
dc.subject: data cleaning, correlation mining, biological data, data artifacts, duplicate detection, outlier detection
dc.type: Thesis
dc.contributor.department: COMPUTER SCIENCE
dc.contributor.supervisor: LEE MONG LI, JANICE
dc.description.degree: Ph.D
dc.description.degreeconferred: DOCTOR OF PHILOSOPHY
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: KOHJLY.pdf
Size: 1.92 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.