Using redundancy reduction in summarization to improve text classification by SVMs | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/73996

Title:	Using redundancy reduction in summarization to improve text classification by SVMs
Authors:	Zhan, J.; Loh, H.-T.
Keywords:	Text classification; Text mining; Text summarization; Support vector machines; Maximal marginal relevance
Issue Date:	Mar-2009
Citation:	Zhan, J., Loh, H.-T. (2009-03). Using redundancy reduction in summarization to improve text classification by SVMs. Journal of Information Science and Engineering 25 (2): 591-601. ScholarBank@NUS Repository.
Abstract:	In this paper, we investigate the use of summarization techniques to improve text classification. Because summarization inherently assigns more weight to the more important sentences in an article, it may improve the accuracy of classifying that article. Redundancy in summaries was reduced to different levels, and its effect on classification performance was investigated. The classification algorithm used was Support Vector Machines (SVMs), which have proven to be very effective and robust for text classification problems. Experimental results showed that summaries with the lowest redundancy could improve classification performance on the Reuters corpus, with an increase of more than 6% in the average F1 measure. To explain why summarization can improve performance while feature selection makes no sense for SVMs, a further experiment was conducted to demonstrate the difference between summarization and traditional feature selection techniques.
Source Title:	Journal of Information Science and Engineering
URI:	http://scholarbank.nus.edu.sg/handle/10635/73996
ISSN:	1016-2364
Appears in Collections:	Staff Publications
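
The abstract and keywords refer to reducing redundancy in summaries to different levels using maximal marginal relevance (MMR). The record does not give the authors' formulation, so the following is only a minimal sketch of generic greedy MMR sentence selection. The function names, the bag-of-words cosine similarity, and the use of the whole document as the relevance target are all assumptions made for illustration, not details taken from the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased term counts (an assumed, deliberately simple representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_summary(sentences, k, lam=0.7):
    """Greedily pick up to k sentences by MMR.

    lam is a hypothetical trade-off parameter: 1.0 ranks by relevance
    only; smaller values penalize similarity to sentences already
    selected, giving a summary with lower redundancy.
    """
    vectors = [bag_of_words(s) for s in sentences]
    doc_vector = sum(vectors, Counter())  # whole document as relevance target
    selected = []
    candidates = list(range(len(sentences)))

    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vectors[i], doc_vector)
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(selected)]
```

In a pipeline like the one the abstract describes, summaries produced at different values of `lam` would then replace or supplement the full articles as input to an SVM classifier; that step is not shown here.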

Files in This Item:

There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.