Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/19064
Title: Independent component analysis for naive bayes classification
Authors: FAN LIWEI
Keywords: Independent component analysis, Bayesian classification, Feature extraction, Naïve Bayes, Multi-label classification, Microarray data analysis
Issue Date: 31-Mar-2010
Source: FAN LIWEI (2010-03-31). Independent component analysis for naive bayes classification. ScholarBank@NUS Repository.
Abstract: Independent component analysis (ICA) has received increasing attention as a feature extraction technique for pattern classification. Recent studies have shown that ICA and its variant, class-conditional ICA (CC-ICA), appear well suited to Bayesian classifiers, especially the naive Bayes classifier. Nevertheless, some limitations may restrict the use of ICA/CC-ICA as a feature extraction method for the naive Bayes classifier in practice. This thesis focuses on several methodological and application issues in applying ICA to naive Bayes classification for both single-label and multi-label problems. We first carry out a comparative study of principal component analysis (PCA), ICA and CC-ICA for the naive Bayes classifier. We find that CC-ICA often improves the performance of the naive Bayes classifier more than PCA or ICA does. However, CC-ICA often requires more training data, since each class must have enough training samples. When the sample size is smaller than the number of features, e.g. in microarray data analysis, the direct application of CC-ICA may become infeasible. To address this limitation, we propose a sequential feature extraction approach for naive Bayes classification of microarray data. This offers researchers and data analysts a novel method for classifying datasets with a small sample size but an extremely large number of attributes. Despite the usefulness of the sequential feature extraction approach, some classes in microarray data may have only a few samples. As a result, CC-ICA cannot be used for these classes even after feature selection has been applied to the data. Therefore, we extend CC-ICA and present partition-conditional independent component analysis (PC-ICA) for naive Bayes classification of microarray data. As a feature extraction method, PC-ICA essentially represents a compromise between ICA and CC-ICA, and it is particularly suitable for datasets with only a few examples per class. The work described above deals only with single-label naive Bayes classification. Since multi-label classification has received much attention in different application domains, we finally investigate the usefulness of ICA for multi-label naive Bayes (MLNB) classification and present the ICA-MLNB scheme for solving multi-label classification problems. This research not only demonstrates the usefulness of ICA in improving MLNB but also broadens the application scope of the ICA feature extraction method.
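The basic idea of using ICA as a feature extraction step ahead of a naive Bayes classifier can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the scikit-learn library, the Iris dataset, the Gaussian naive Bayes model, and the helper name ica_naive_bayes_scores. The sketch covers plain ICA only; it does not implement CC-ICA, PC-ICA, the sequential feature extraction approach, or ICA-MLNB.

```python
# Illustrative sketch only: plain ICA followed by Gaussian naive Bayes.
# This is NOT the thesis's CC-ICA or PC-ICA; the dataset and model are assumptions.
from sklearn.datasets import load_iris
from sklearn.decomposition import FastICA
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def ica_naive_bayes_scores(n_components=3, folds=5):
    """Return cross-validated accuracies of naive Bayes on ICA-extracted features."""
    # Iris is a stand-in dataset chosen only because it ships with scikit-learn.
    X, y = load_iris(return_X_y=True)
    model = make_pipeline(
        StandardScaler(),  # centre and scale the raw features
        FastICA(n_components=n_components, random_state=0, max_iter=1000),
        GaussianNB(),  # treats the extracted components as conditionally independent
    )
    # The pipeline is refit inside each fold, so ICA never sees the held-out samples.
    return cross_val_score(model, X, y, cv=folds)


if __name__ == "__main__":
    scores = ica_naive_bayes_scores()
    print("mean accuracy: %.3f" % scores.mean())
```

Placing the ICA step inside the pipeline keeps the unmixing matrix estimated on training folds only, which matters when the sample size is small.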
URI: http://scholarbank.nus.edu.sg/handle/10635/19064
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: FanLW.pdf
Size: 755.61 kB
Format: Adobe PDF
Access Settings: OPEN
Version: None



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.