|Title:||A comparative study on term weighting schemes for text categorization|
|Authors:||Lan, M., Sung, S.-Y., Low, H.-B., Tan, C.-L.|
|Source:||Lan, M., Sung, S.-Y., Low, H.-B., Tan, C.-L. (2005). A comparative study on term weighting schemes for text categorization. Proceedings of the International Joint Conference on Neural Networks 1: 546-551. ScholarBank@NUS Repository. https://doi.org/10.1109/IJCNN.2005.1555890|
|Abstract:||The term weighting scheme, which is used to convert documents into vectors in the term space, is a vital step in automatic text categorization. Previous studies showed that term weighting schemes, rather than the kernel functions of SVMs, dominate performance on the text categorization task. In this paper, we conducted experiments to compare various term weighting schemes with SVM on two widely used benchmark data sets. We also presented a new term weighting scheme, tf.rf, for text categorization. The cross-scheme comparison was performed using McNemar's tests. The controlled experimental results showed that the newly proposed tf.rf scheme is significantly better than the other term weighting schemes. Compared with schemes based on the tf factor alone, the idf factor does not improve, and may even decrease, the term's discriminating power for text categorization. The binary and tf.chi representations significantly underperform the other term weighting schemes. © 2005 IEEE.|
|Source Title:||Proceedings of the International Joint Conference on Neural Networks|
|Appears in Collections:||Staff Publications|
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.