Please use this identifier to cite or link to this item: https://doi.org/10.1023/A:1016304305535
DC Field: Value
dc.title: Discretization: An enabling technique
dc.contributor.author: Liu, H.
dc.contributor.author: Hussain, F.
dc.contributor.author: Tan, C.L.
dc.contributor.author: Dash, M.
dc.date.accessioned: 2013-07-04T07:41:00Z
dc.date.available: 2013-07-04T07:41:00Z
dc.date.issued: 2002
dc.identifier.citation: Liu, H., Hussain, F., Tan, C.L., Dash, M. (2002). Discretization: An enabling technique. Data Mining and Knowledge Discovery 6 (4) : 393-423. ScholarBank@NUS Repository. https://doi.org/10.1023/A:1016304305535
dc.identifier.issn: 13845810
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/39410
dc.description.abstract: Discrete values have important roles in data mining and knowledge discovery. They are about intervals of numbers which are more concise to represent and specify, easier to use and comprehend as they are closer to a knowledge-level representation than continuous values. Many studies show induction tasks can benefit from discretization: rules with discrete values are normally shorter and more understandable and discretization can lead to improved predictive accuracy. Furthermore, many induction algorithms found in the literature require discrete features. All these prompt researchers and practitioners to discretize continuous features before or during a machine learning or data mining task. There are numerous discretization methods available in the literature. It is time for us to examine these seemingly different methods for discretization and find out how different they really are, what are the key components of a discretization process, how we can improve the current level of research for new development as well as the use of existing methods. This paper aims at a systematic study of discretization methods with their history of development, effect on classification, and trade-off between speed and accuracy. Contributions of this paper are an abstract description summarizing existing discretization methods, a hierarchical framework to categorize the existing methods and pave the way for further development, concise discussions of representative discretization methods, extensive experiments and their analysis, and some guidelines as to how to choose a discretization method under various circumstances. We also identify some issues yet to solve and future research for discretization.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1023/A:1016304305535
dc.source: Scopus
dc.subject: Classification
dc.subject: Continuous feature
dc.subject: Data mining
dc.subject: Discretization
dc.type: Article
dc.contributor.department: COMPUTER SCIENCE
dc.description.doi: 10.1023/A:1016304305535
dc.description.sourcetitle: Data Mining and Knowledge Discovery
dc.description.volume: 6
dc.description.issue: 4
dc.description.page: 393-423
dc.identifier.isiut: 000176865200003
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.

Scopus citations: 672 (checked on Dec 6, 2021)
Web of Science citations: 522 (checked on Dec 6, 2021)
Page views: 224 (checked on Dec 2, 2021)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.