A hybrid evolutionary algorithm for attribute selection in data mining | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://doi.org/10.1016/j.eswa.2008.10.013

Title:	A hybrid evolutionary algorithm for attribute selection in data mining
Authors:	Tan, K.C.; Teoh, E.J.; Yu, Q.; Goh, K.C.
Keywords:	Attribute selection; Data mining; Evolutionary algorithms; Pattern classification; Support vector machines
Issue Date:	May-2009
Citation:	Tan, K.C., Teoh, E.J., Yu, Q., Goh, K.C. (2009-05). A hybrid evolutionary algorithm for attribute selection in data mining. Expert Systems with Applications 36 (4) : 8616-8630. ScholarBank@NUS Repository. https://doi.org/10.1016/j.eswa.2008.10.013
Abstract:	Real-life data sets are often interspersed with noise, making the subsequent data mining process difficult. The task of the classifier can be simplified by eliminating attributes that are deemed redundant for classification, since retaining only pertinent attributes reduces the size of the data set and allows more comprehensible analysis of the extracted patterns or rules. In this article, a new hybrid approach comprising two conventional machine learning algorithms is proposed to carry out attribute selection. Genetic algorithms (GAs) and support vector machines (SVMs) are integrated using a wrapper approach. Specifically, the GA component searches for the best attribute set by applying the principles of an evolutionary process. The SVM then classifies the patterns in the reduced data sets, which correspond to the attribute subsets represented by the GA chromosomes. The proposed GA-SVM hybrid is validated using data sets obtained from the UCI machine learning repository. Simulation results demonstrate that the GA-SVM hybrid produces good classification accuracy and a higher level of consistency, comparable to that of other established algorithms. In addition, the hybrid is improved by using a correlation measure between attributes as a fitness measure to replace the weaker members of the population with newly formed chromosomes. This injects greater diversity and increases the overall fitness of the population. The improved mechanism is validated on the same data sets used in the first stage. The results justify the improvements in classification accuracy and demonstrate the hybrid's potential to be a good classifier for future data mining purposes. © 2008.
Source Title:	Expert Systems with Applications
URI:	http://scholarbank.nus.edu.sg/handle/10635/54264
ISSN:	0957-4174
DOI:	10.1016/j.eswa.2008.10.013
Appears in Collections:	Staff Publications
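
The wrapper approach summarized in the abstract (a GA proposes attribute subsets as chromosomes, and a classifier scores each subset) can be illustrated with a minimal sketch. Nothing below comes from the paper itself: the function names, GA parameters, toy data, and the small penalty per selected attribute are all assumptions made for illustration. To stay free of external dependencies, a nearest-centroid classifier stands in for the SVM; in practice an SVM's validation accuracy would take its place inside the fitness function.

```python
import random

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Stand-in evaluator (NOT an SVM): test accuracy of a nearest-centroid
    classifier that sees only the attributes whose mask bit is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty attribute subset cannot classify anything
    sums, counts = {}, {}
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for k, j in enumerate(cols):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    correct = 0
    for row, label in zip(X_test, y_test):
        pred = min(centroids, key=lambda c: sum(
            (row[j] - centroids[c][k]) ** 2 for k, j in enumerate(cols)))
        correct += pred == label
    return correct / len(y_test)

def ga_select(n_attrs, fitness, pop_size=20, generations=30,
              p_cross=0.8, p_mut=0.05, seed=0):
    """Simple generational GA over bit-string chromosomes (1 = keep attribute).
    Returns the best chromosome seen and its fitness."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_attrs)] for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for gen in range(generations + 1):
        scores = [fitness(ch) for ch in pop]
        for ch, s in zip(pop, scores):
            if s > best_fit:
                best, best_fit = ch[:], s
        if gen == generations:
            break  # final population evaluated; no further breeding

        def tournament():
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        new_pop = [best[:]]  # elitism: carry the best chromosome forward
        while len(new_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < p_cross:
                cut = rng.randrange(1, n_attrs)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # bit-flip mutation
            child = [1 - g if rng.random() < p_mut else g for g in child]
            new_pop.append(child)
        pop = new_pop
    return best, best_fit

def make_toy_data(n=120, n_attrs=8, seed=1):
    """Hypothetical data: attributes 0 and 1 carry the class signal,
    the remaining attributes are pure Gaussian noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_attrs)]
        row[0] += 3.0 * label
        row[1] -= 3.0 * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
X_tr, y_tr, X_te, y_te = X[:80], y[:80], X[80:], y[80:]

def fitness(mask):
    # Assumed fitness: held-out accuracy minus a small cost per kept attribute.
    return nearest_centroid_accuracy(X_tr, y_tr, X_te, y_te, mask) - 0.01 * sum(mask)

best, best_fit = ga_select(8, fitness)
print("selected attributes:", [j for j, bit in enumerate(best) if bit])
print("fitness:", round(best_fit, 3))
```

Because the evaluator is passed in as a plain function, swapping the stand-in for an SVM (or adding a correlation-based replacement step like the one the abstract mentions) would only change `fitness` and the breeding loop, not the overall wrapper structure.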
