Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/40120
Title: A MFoM learning approach to robust multiclass multi-label text categorization
Authors: Gao, S.; Wu, W.; Lee, C.-H.; Chua, T.-S.
Issue Date: 2004
Citation: Gao, S., Wu, W., Lee, C.-H., Chua, T.-S. (2004). A MFoM learning approach to robust multiclass multi-label text categorization. Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004: 329-336. ScholarBank@NUS Repository.
Abstract: We propose a multiclass (MC) classification approach to text categorization (TC). To fully take advantage of both positive and negative training examples, a maximal figure-of-merit (MFoM) learning algorithm is introduced to train high performance MC classifiers. In contrast to conventional binary classification, the proposed MC scheme assigns a uniform score function to each category for each given test sample, and thus the classical Bayes decision rules can now be applied. Since all the MC MFoM classifiers are simultaneously trained, we expect them to be more robust and work better than the binary MFoM classifiers, which are trained separately and are known to give the best TC performance. Experimental results on the Reuters-21578 TC task indicate that the MC MFoM classifiers achieve a micro-averaging F1 value of 0.377, which is significantly better than 0.138, obtained with the binary MFoM classifiers, for the categories with less than 4 training samples. Furthermore, for all 90 categories, most with large training sizes, the MC MFoM classifiers give a micro-averaging F1 value of 0.888, better than 0.884, obtained with the binary MFoM classifiers.
Source Title: Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004
URI: http://scholarbank.nus.edu.sg/handle/10635/40120
ISBN: 1581138385
Appears in Collections: Staff Publications
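The abstract reports its results as micro-averaged F1 values. As a reading aid, the minimal sketch below shows how that metric is commonly computed: true positives, false positives, and false negatives are pooled over all categories before a single F1 is taken. The function name `micro_f1`, the category names, and the counts are hypothetical and purely illustrative; they are not taken from the paper or from Reuters-21578.

```python
from typing import Dict, Tuple


def micro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-category (tp, fp, fn) counts.

    Counts are summed across categories first, so categories with many
    samples contribute more to the final score than rare ones.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined here as 0.0 when nothing is counted.
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy counts (assumption, for illustration only).
toy_counts = {
    "cat_a": (8, 2, 1),  # (true positives, false positives, false negatives)
    "cat_b": (3, 1, 2),
    "cat_c": (0, 0, 2),
}
score = micro_f1(toy_counts)
```

Because the counts are pooled, a rare category such as the hypothetical `cat_c` has little effect on the overall score, which is one reason results are often reported separately for categories with few training samples.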
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.