Title: On machine learning methods for Chinese document categorization
Authors: He, J. 
Tan, A.-H.
Tan, C.-L. 
Keywords: Comparative experiments
Machine learning
Text categorization
Issue Date: 2003
Citation: He, J., Tan, A.-H., Tan, C.-L. (2003). On machine learning methods for Chinese document categorization. Applied Intelligence 18 (3) : 311-322. ScholarBank@NUS Repository.
Abstract: This paper reports our comparative evaluation of three machine learning methods, namely k Nearest Neighbor (kNN), Support Vector Machines (SVM), and Adaptive Resonance Associative Map (ARAM), for Chinese document categorization. Based on two Chinese corpora, a series of controlled experiments evaluated their learning capabilities and efficiency in mining text classification knowledge. Benchmark experiments showed that their predictive performance was roughly comparable, especially on clean and well-organized data sets. While kNN and ARAM yielded better performance than SVM on small and clean data sets, SVM and ARAM significantly outperformed kNN on noisy data. In terms of efficiency, kNN was notably more costly in time and memory than the other two methods. SVM is highly efficient in learning from well-organized samples of moderate size, although on relatively large and noisy data the efficiencies of SVM and ARAM are comparable.
Source Title: Applied Intelligence
ISSN: 0924-669X
DOI: 10.1023/A:1023202221875
Appears in Collections: Staff Publications
