Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/43330
DC Field: Value
dc.title: Text classification by labeling words
dc.contributor.author: Liu, B.
dc.contributor.author: Li, X.
dc.contributor.author: Lee, W.S.
dc.contributor.author: Yu, P.S.
dc.date.accessioned: 2013-07-23T09:31:09Z
dc.date.available: 2013-07-23T09:31:09Z
dc.date.issued: 2004
dc.identifier.citation: Liu, B., Li, X., Lee, W.S., Yu, P.S. (2004). Text classification by labeling words. Proceedings of the National Conference on Artificial Intelligence : 425-430. ScholarBank@NUS Repository.
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/43330
dc.description.abstract: Traditionally, text classifiers are built from labeled training examples. Labeling is usually done manually by human experts (or the users), which is a labor intensive and time consuming process. In the past few years, researchers investigated various forms of semi-supervised learning to reduce the burden of manual labeling. In this paper, we propose a different approach. Instead of labeling a set of documents, the proposed method labels a set of representative words for each class. It then uses these words to extract a set of documents for each class from a set of unlabeled documents to form the initial training set. The EM algorithm is then applied to build the classifier. The key issue of the approach is how to obtain a set of representative words for each class. One way is to ask the user to provide them, which is difficult because the user usually can only give a few words (which are insufficient for accurate learning). We propose a method to solve the problem. It combines clustering and feature selection. The technique can effectively rank the words in the unlabeled set according to their importance. The user then selects/labels some words from the ranked list for each class. This process requires less effort than providing words with no help or manual labeling of documents. Our results show that the new method is highly effective and promising.
dc.source: Scopus
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.contributor.department: SINGAPORE-MIT ALLIANCE
dc.description.sourcetitle: Proceedings of the National Conference on Artificial Intelligence
dc.description.page: 425-430
dc.description.coden: PNAIE
dc.identifier.isiut: NOT_IN_WOS
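
The pipeline outlined in the abstract (label representative words, extract seed documents, then apply EM) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. It assumes a multinomial naive Bayes base classifier with Laplace smoothing, a simple "most representative-word hits" rule for picking seed documents, and seed labels that stay fixed during EM; none of these details are stated in the record. All function names and the toy data are hypothetical, and the clustering and feature-selection step that ranks candidate words is not shown.

```python
import math
from collections import Counter

def seed_labels(docs, class_words):
    """Label a document with the class whose representative words it
    contains most often; None if no class wins (assumed rule)."""
    labels = []
    for doc in docs:
        hits = {c: sum(doc.count(w) for w in ws) for c, ws in class_words.items()}
        best = max(hits, key=hits.get)
        unique = sum(1 for v in hits.values() if v == hits[best]) == 1
        labels.append(best if hits[best] > 0 and unique else None)
    return labels

def train_nb(docs, weights, classes, vocab):
    """M-step: multinomial naive Bayes from soft class weights."""
    total = sum(w[c] for w in weights for c in classes)
    prior, cond = {}, {}
    for c in classes:
        prior[c] = (sum(w[c] for w in weights) + 1.0) / (total + len(classes))
        counts = Counter()
        for doc, w in zip(docs, weights):
            for tok in doc:
                counts[tok] += w[c]
        denom = sum(counts.values()) + len(vocab)
        cond[c] = {t: (counts[t] + 1.0) / denom for t in vocab}
    return prior, cond

def posterior(doc, prior, cond, classes):
    """E-step for one document: P(class | doc), computed in log space."""
    logp = {c: math.log(prior[c]) + sum(math.log(cond[c][t]) for t in doc)
            for c in classes}
    top = max(logp.values())
    unnorm = {c: math.exp(v - top) for c, v in logp.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

def classify_by_labeled_words(docs, class_words, iterations=10):
    """Seed a training set from labeled words, then run EM over all docs."""
    classes = sorted(class_words)
    vocab = sorted({t for d in docs for t in d})
    seeds = seed_labels(docs, class_words)
    # Seed documents start one-hot; unlabeled documents start with zero weight.
    weights = [{c: 1.0 if s == c else 0.0 for c in classes} for s in seeds]
    for _ in range(iterations):
        prior, cond = train_nb(docs, weights, classes, vocab)
        weights = [
            {c: 1.0 if s == c else 0.0 for c in classes} if s is not None
            else posterior(d, prior, cond, classes)
            for d, s in zip(docs, seeds)
        ]
    prior, cond = train_nb(docs, weights, classes, vocab)
    result = []
    for d in docs:
        post = posterior(d, prior, cond, classes)
        result.append(max(post, key=post.get))
    return result

# Toy usage with made-up, pre-tokenized documents.
docs = [
    ["goal", "match", "team"],
    ["election", "vote", "party"],
    ["team", "match", "coach"],
    ["party", "vote", "senate"],
]
class_words = {"sports": ["goal"], "politics": ["election"]}
print(classify_by_labeled_words(docs, class_words))
```

In this toy run only the first two documents contain a labeled word; the other two receive their classes through the EM iterations, because they share vocabulary with the seed documents.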
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.