Please use this identifier to cite or link to this item: https://doi.org/10.1109/IJCNN.2005.1555890
Title: A comparative study on term weighting schemes for text categorization
Authors: Lan, M. 
Sung, S.-Y. 
Low, H.-B.
Tan, C.-L. 
Issue Date: 2005
Source: Lan, M., Sung, S.-Y., Low, H.-B., Tan, C.-L. (2005). A comparative study on term weighting schemes for text categorization. Proceedings of the International Joint Conference on Neural Networks 1: 546-551. ScholarBank@NUS Repository. https://doi.org/10.1109/IJCNN.2005.1555890
Abstract: The term weighting scheme, which is used to convert documents into vectors in the term space, is a vital step in automatic text categorization. Previous studies showed that term weighting schemes, rather than the kernel functions of SVMs, dominate performance on the text categorization task. In this paper, we conducted experiments to compare various term weighting schemes with SVM on two widely used benchmark data sets. We also presented a new term weighting scheme, tf.rf, for text categorization. The cross-scheme comparison was performed using McNemar's tests. The controlled experimental results showed that the newly proposed tf.rf scheme is significantly better than the other term weighting schemes. Compared with schemes based on the tf factor alone, the idf factor does not improve, and may even decrease, a term's discriminating power for text categorization. The binary and tf.chi representations significantly underperform the other term weighting schemes. © 2005 IEEE.
Source Title: Proceedings of the International Joint Conference on Neural Networks
URI: http://scholarbank.nus.edu.sg/handle/10635/40449
ISBN: 0780390482
DOI: 10.1109/IJCNN.2005.1555890
Appears in Collections: Staff Publications


Scopus citations: 40 (checked on Dec 5, 2017)
Page views: 57 (checked on Dec 9, 2017)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.