Title: Exploiting category-specific information for multi-document summarization
Authors: Ng, J.P.
Bysani, P.
Lin, Z.
Kan, M.Y.
Tan, C.-L.
Keywords: CSI
Guided summarization
Text summarization
Issue Date: 2012
Citation: Ng, J.P., Bysani, P., Lin, Z., Kan, M.Y., Tan, C.-L. (2012). Exploiting category-specific information for multi-document summarization. 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers: 2093-2108. ScholarBank@NUS Repository.
Abstract: We show that by making use of information common to document sets belonging to the same category, we can improve the quality of automatically extracted content in multi-document summaries. This simple property is widely applicable in multi-document summarization tasks, and can be encapsulated by the concept of category-specific importance (CSI). Our experiments show that CSI is a valuable metric to aid sentence selection in extractive summarization tasks. We operationalize the computation of CSI for sentences by introducing two new features that can be computed without any external knowledge. We also generalize this approach, showing that when manually curated document-to-category mappings are unavailable, automatically categorizing document sets also improves summarization performance. We have incorporated these features into a simple, freely available, open-source extractive summarization system called SWING. In the recent TAC-2011 guided summarization task, SWING outperformed all other participating summarization systems as measured by automated ROUGE measures. © 2012 The COLING.
Source Title: 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers
Appears in Collections: Staff Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.