Title: Using redundancy reduction in summarization to improve text classification by SVMs
Authors: Zhan, J.
Loh, H.-T. 
Keywords: Text classification
Text mining
Text summarization
Support vector machines
Maximal marginal relevance
Issue Date: Mar-2009
Source: Zhan, J., Loh, H.-T. (2009-03). Using redundancy reduction in summarization to improve text classification by SVMs. Journal of Information Science and Engineering 25 (2): 591-601. ScholarBank@NUS Repository.
Abstract: In this paper, we investigate the use of summarization techniques to improve text classification. Because summarization inherently assigns more weight to the more important sentences in an article, it may improve the accuracy with which the article is classified. Redundancy in summaries was reduced to different levels, and its effect on classification performance was investigated. The classification algorithm used was Support Vector Machines (SVMs), which have proven to be very effective and robust for text classification problems. Experimental results showed that summaries with the lowest redundancy could improve classification performance on the Reuters corpus, with an increase of more than 6% in average F1 measure. To explain why summarization can improve performance while feature selection makes no sense for SVMs, a further experiment was conducted to demonstrate the difference between summarization and traditional feature selection techniques.
Source Title: Journal of Information Science and Engineering
ISSN: 1016-2364
Appears in Collections: Staff Publications
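
The abstract and keywords refer to reducing redundancy in summaries with maximal marginal relevance (MMR). As a rough illustration only, and not the authors' implementation, the sketch below greedily selects sentences by trading relevance to the whole document against similarity to sentences already chosen. The function names `cosine` and `mmr_summary`, the whitespace tokenizer, the raw term-count vectors, and the parameter `lam` are all assumptions made for this example; the paper's actual sentence weighting and redundancy levels are not given in this record.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_summary(sentences, k, lam=0.5):
    """Greedy MMR selection of up to k sentences (hypothetical helper).

    lam = 1.0 ignores redundancy entirely; smaller values penalize
    sentences that resemble ones already selected.
    """
    # Assumption: bag-of-words vectors from lowercase whitespace tokens.
    vecs = [Counter(s.lower().split()) for s in sentences]
    doc = Counter()
    for v in vecs:
        doc.update(v)

    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], doc)
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(selected)]


sents = [
    "oil prices rose sharply today",
    "oil prices rose sharply today",
    "the central bank cut interest rates",
]
low_redundancy = mmr_summary(sents, 2, lam=0.3)
no_penalty = mmr_summary(sents, 2, lam=1.0)
```

In this toy input, a low `lam` skips the duplicated sentence in favor of the one about interest rates, while `lam=1.0` keeps both copies. A summary produced this way could then be fed to a classifier in place of the full article, which is the general setup the abstract describes.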
