Please use this identifier to cite or link to this item: https://doi.org/10.1109/IJCNN.2005.1555890
Title: A comparative study on term weighting schemes for text categorization
Authors: Lan, M. 
Sung, S.-Y. 
Low, H.-B.
Tan, C.-L. 
Issue Date: 2005
Citation: Lan, M., Sung, S.-Y., Low, H.-B., Tan, C.-L. (2005). A comparative study on term weighting schemes for text categorization. Proceedings of the International Joint Conference on Neural Networks 1: 546-551. ScholarBank@NUS Repository. https://doi.org/10.1109/IJCNN.2005.1555890
Abstract: The term weighting scheme, which is used to convert documents into vectors in the term space, is a vital step in automatic text categorization. Previous studies showed that term weighting schemes, rather than the kernel functions of SVMs, dominate performance on the text categorization task. In this paper, we conducted experiments to compare various term weighting schemes with SVM on two widely used benchmark data sets. We also presented a new term weighting scheme, tf.rf, for text categorization. The cross-scheme comparison was performed using McNemar's tests. The controlled experimental results showed that the newly proposed tf.rf scheme is significantly better than the other term weighting schemes. Compared with schemes based on the tf factor alone, the idf factor does not improve, and may even decrease, a term's discriminating power for text categorization. The binary and tf.chi representations significantly underperform the other term weighting schemes. © 2005 IEEE.
Source Title: Proceedings of the International Joint Conference on Neural Networks
URI: http://scholarbank.nus.edu.sg/handle/10635/40449
ISBN: 0780390482
DOI: 10.1109/IJCNN.2005.1555890
Appears in Collections: Staff Publications
