Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/77981
Title: A semi-supervised Bayesian network model for microblog topic classification
Authors: Chen, Y.; Li, Z.; Nie, L.; Hu, X.; Wang, X.; Chua, T.-S.; Zhang, X.
Keywords: Microblog classification; Probabilistic graph model; Semi-supervised algorithm
Issue Date: 2012
Citation: Chen, Y., Li, Z., Nie, L., Hu, X., Wang, X., Chua, T.-S., Zhang, X. (2012). A semi-supervised Bayesian network model for microblog topic classification. 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers: 561-576. ScholarBank@NUS Repository.
Abstract: Microblogging services have brought users to a new era of knowledge dissemination and information seeking. However, the large volume and multi-aspect of messages hinder the ability of users to conveniently locate the specific messages that they are interested in. While many researchers wish to employ traditional text classification approaches to effectively understand messages on microblogging services, the limited length of the messages prevents these approaches from being employed to their full potential. To tackle this problem, we propose a novel semi-supervised learning scheme to seamlessly integrate the external web resources to compensate for the limited message length. Our approach first trains a classifier based on the available labeled data as well as some auxiliary cues mined from the web, and probabilistically predicts the categories for all unlabeled data. It then trains a new classifier using the labels for all messages and the auxiliary cues, and iterates the process to convergence. Our approach not only greatly reduces the time-consuming and labor-intensive labeling process, but also deeply exploits the hidden information from unlabeled data and related text resources. We conducted extensive experiments on two real-world microblogging datasets. The results demonstrate the effectiveness of the proposed approaches which produce promising performance as compared to state-of-the-art methods. © 2012 The COLING.
Source Title: 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers
URI: http://scholarbank.nus.edu.sg/handle/10635/77981
Appears in Collections: Staff Publications
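Illustrative note: the iterative scheme summarized in the abstract (train on labeled data, predict categories probabilistically for the unlabeled messages, retrain on all messages, repeat until convergence) can be shown with a minimal sketch. The code below is not the authors' Bayesian network model and does not use auxiliary cues mined from the web. It assumes a plain multinomial naive Bayes classifier trained with soft labels, and every function name, parameter, and toy message in it is hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, soft_labels, classes, vocab, alpha=1.0):
    """Fit multinomial naive Bayes from (possibly fractional) class weights."""
    prior = {c: alpha for c in classes}
    counts = {c: Counter() for c in classes}
    for tokens, weights in zip(docs, soft_labels):
        for c in classes:
            w = weights.get(c, 0.0)
            prior[c] += w
            for t in tokens:
                counts[c][t] += w
    total = sum(prior.values())
    log_prior = {c: math.log(prior[c] / total) for c in classes}
    log_like = {}
    for c in classes:
        denom = sum(counts[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((counts[c][t] + alpha) / denom) for t in vocab}
    return log_prior, log_like


def predict_proba(model, tokens, classes):
    """Return a class -> probability dict; tokens outside the vocabulary are ignored."""
    log_prior, log_like = model
    scores = {
        c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in log_like[c])
        for c in classes
    }
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def semi_supervised_nb(labeled, unlabeled, classes, max_iter=20, tol=1e-4):
    """Train on labeled data, then alternate soft prediction and retraining."""
    docs_l = [tokens for tokens, _ in labeled]
    hard = [{label: 1.0} for _, label in labeled]
    vocab = {t for d in docs_l + unlabeled for t in d}
    model = train_nb(docs_l, hard, classes, vocab)  # initial classifier
    prev = None
    for _ in range(max_iter):
        # Probabilistically label every unlabeled message.
        soft = [predict_proba(model, d, classes) for d in unlabeled]
        # Retrain on labeled and soft-labeled messages together.
        model = train_nb(docs_l + unlabeled, hard + soft, classes, vocab)
        if prev is not None:
            change = max(
                (abs(p[c] - q[c]) for p, q in zip(soft, prev) for c in classes),
                default=0.0,
            )
            if change < tol:  # predictions stopped moving: treat as converged
                break
        prev = soft
    return model


# Toy data (hypothetical): two labeled and four unlabeled tokenized messages.
classes = ["sports", "politics"]
labeled = [
    (["goal", "match", "team"], "sports"),
    (["election", "vote", "party"], "politics"),
]
unlabeled = [
    ["team", "coach", "goal"],
    ["vote", "senate", "party"],
    ["coach", "match"],
    ["senate", "election"],
]
model = semi_supervised_nb(labeled, unlabeled, classes)
coach = predict_proba(model, ["coach"], classes)
print(coach)
```

In this toy run the word "coach" never appears in a labeled message, so any class preference the final model assigns to it comes only from the unlabeled messages, which is the effect the abstract attributes to exploiting unlabeled data.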
Files in This Item:
There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.