Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/41554
Title: Proposing a new term weighting scheme for text categorization
Authors: Lan, M.
Tan, C.-L. 
Low, H.-B.
Issue Date: 2006
Source: Lan, M., Tan, C.-L., Low, H.-B. (2006). Proposing a new term weighting scheme for text categorization. Proceedings of the National Conference on Artificial Intelligence 1: 763-768. ScholarBank@NUS Repository.
Abstract: In text categorization, term weighting methods assign appropriate weights to the terms to improve the classification performance. In this study, we propose an effective term weighting scheme, i.e. tf.rf, and investigate several widely-used unsupervised and supervised term weighting methods on two popular data collections in combination with SVM and kNN algorithms. From our controlled experimental results, not all supervised term weighting methods have a consistent superiority over unsupervised term weighting methods. Specifically, the three supervised methods based on the information theory, i.e. tf.χ², tf.ig and tf.or, perform rather poorly in all experiments. On the other hand, our proposed tf.rf achieves the best performance consistently and outperforms other methods substantially and significantly. The popularly-used tf.idf method has not shown a uniformly good performance with respect to different data corpora. Copyright © 2006, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
Source Title: Proceedings of the National Conference on Artificial Intelligence
URI: http://scholarbank.nus.edu.sg/handle/10635/41554
ISBN: 1577352815
Appears in Collections: Staff Publications



