Title: A comparative study on term weighting schemes for text categorization
Authors: Lan, M. 
Sung, S.-Y. 
Low, H.-B.
Tan, C.-L. 
Issue Date: 2005
Citation: Lan, M., Sung, S.-Y., Low, H.-B., Tan, C.-L. (2005). A comparative study on term weighting schemes for text categorization. Proceedings of the International Joint Conference on Neural Networks 1: 546-551. ScholarBank@NUS Repository.
Abstract: The term weighting scheme, which is used to convert documents into vectors in the term space, is a vital step in automatic text categorization. Previous studies showed that term weighting schemes, rather than the kernel functions of SVMs, dominate performance on the text categorization task. In this paper, we conducted experiments to compare various term weighting schemes with SVM on two widely used benchmark data sets. We also presented a new term weighting scheme, tf.rf, for text categorization. The cross-scheme comparison was performed using McNemar's tests. The controlled experimental results showed that the newly proposed tf.rf scheme is significantly better than the other term weighting schemes. Compared with schemes based on the tf factor alone, the idf factor does not improve, and may even decrease, the term's discriminating power for text categorization. The binary and tf.chi representations significantly underperform the other term weighting schemes. © 2005 IEEE.
Source Title: Proceedings of the International Joint Conference on Neural Networks
ISBN: 0780390482
DOI: 10.1109/IJCNN.2005.1555890
Appears in Collections: Staff Publications
