Title: Some issues on scalable feature selection
Authors: Liu, H. 
Setiono, R. 
Keywords: Features
Large databases
Pattern classification
Probabilistic selection
Issue Date: Oct-1998
Citation: Liu, H., Setiono, R. (1998-10). Some issues on scalable feature selection. Expert Systems with Applications 15 (3-4): 333-339. ScholarBank@NUS Repository.
Abstract: Feature selection determines relevant features in the data. It is often applied in pattern classification, data mining, as well as machine learning. A special concern for feature selection nowadays is that the size of a database is normally very large, both vertically and horizontally. In addition, feature sets may grow as the data collection process continues. Effective solutions are needed to accommodate the practical demands. This paper concentrates on three issues: large number of features, large data size, and expanding feature set. For the first issue, we suggest a probabilistic algorithm to select features. For the second issue, we present a scalable probabilistic algorithm that expedites feature selection further and can scale up without sacrificing the quality of selected features. For the third issue, we propose an incremental algorithm that adapts to the newly extended feature set and captures 'concept drifts' by removing features from previously selected and newly added ones. We expect that research on scalable feature selection will be extended to distributed and parallel computing and have impact on applications of data mining and machine learning. © 1998 Elsevier Science Ltd. All rights reserved.
Source Title: Expert Systems with Applications
ISSN: 0957-4174
Appears in Collections: Staff Publications
