Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/73093
Title: A simple probability based term weighting scheme for automated text classification
Authors: Liu, Y.
Loh, H.T. 
Issue Date: 2007
Source: Liu, Y.,Loh, H.T. (2007). A simple probability based term weighting scheme for automated text classification. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 4570 LNAI : 33-43. ScholarBank@NUS Repository.
Abstract: In the automated text classification, tfidf is often considered as the default term weighting scheme and has been widely reported in literature. However, tfidf does not directly reflect terms' category membership. Inspired by the analysis of various feature selection methods, we propose a simple probability based term weighting scheme which directly utilizes two critical information ratios, i.e. relevance indicators. These relevance indicators are nicely supported by probability estimates which embody the category membership. Our experimental study based on two data sets, including Reuters-21578, demonstrates that the proposed probability based term weighting scheme outperforms tfidf significantly using Bayesian classifier and Support Vector Machines (SVM). © Springer-Verlag Berlin Heidelberg 2007.
Source Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
URI: http://scholarbank.nus.edu.sg/handle/10635/73093
ISBN: 9783540733225
ISSN: 03029743
Appears in Collections:Staff Publications

Show full item record
Files in This Item:
There are no files associated with this item.

Page view(s)

19
checked on Dec 9, 2017

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.