Authors: LU YING
Issue Date: 1999
Citation: LU YING (1999). ON MULTIVARIATE DISCRETIZATION. ScholarBank@NUS Repository.
Abstract: In the current information age, organizations have gathered huge amounts of data but extracted little knowledge from it. This hunger for knowledge amid abundant data has given rise to a new research field: knowledge discovery in databases (KDD). Knowledge discovery in databases is a nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. The KDD process consists of four major steps: data integration and selection, data preprocessing, data mining, and pattern evaluation and application. In this thesis, we concentrate on the study of discretization, a preprocessing technique. A taxonomy of the existing discretization algorithms is first presented, followed by descriptions of the existing methods under each category. Most existing discretization algorithms take a univariate approach to discretizing attribute values, but the class labels of real-world data usually depend on the values of more than one attribute; this motivated us to explore the possibility of multivariate discretization. The multivariate algorithm proposed in this thesis is a conflict-free multivariate discretization algorithm that uses a conflict check to decide whether to merge two adjacent intervals of an attribute. Experiments were performed to test the algorithm's effect on classification accuracy and data reduction. The results show that the proposed algorithm improves classification accuracy and reduces data size for some data sets.
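The conflict check mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm, whose details are not given here. It assumes that a "conflict" means two records that fall into the same interval on every attribute yet carry different class labels, and that adjacent intervals are merged greedily, one attribute at a time. The function names `discretize_row`, `has_conflict`, and `conflict_free_discretize` are hypothetical.

```python
from bisect import bisect_right


def discretize_row(row, cuts):
    """Map each value to the index of its interval, given per-attribute cut points."""
    return tuple(bisect_right(c, v) for v, c in zip(row, cuts))


def has_conflict(rows, labels, cuts):
    """Assumed definition: two rows agree on every interval but differ in class."""
    seen = {}
    for row, label in zip(rows, labels):
        key = discretize_row(row, cuts)
        if seen.setdefault(key, label) != label:
            return True
    return False


def conflict_free_discretize(rows, labels):
    """Greedily merge adjacent intervals whenever the merge creates no conflict."""
    n_attrs = len(rows[0])
    # Start with one cut point midway between each pair of adjacent distinct values.
    cuts = []
    for a in range(n_attrs):
        vals = sorted({r[a] for r in rows})
        cuts.append([(lo + hi) / 2 for lo, hi in zip(vals, vals[1:])])
    for a in range(n_attrs):
        i = 0
        while i < len(cuts[a]):
            removed = cuts[a].pop(i)  # tentatively merge the intervals around this cut
            if has_conflict(rows, labels, cuts):
                cuts[a].insert(i, removed)  # merge rejected: keep the cut
                i += 1
    return cuts
```

Because the check looks at all attributes at once, a cut on one attribute can be dropped when another attribute already separates the classes. In this sketch the result depends on the order in which attributes are processed, and data that already contains conflicting records is left unmerged.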
Appears in Collections: Master's Theses (Restricted)

Files in This Item:
b22107782.pdf (2.05 MB, Adobe PDF)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.