Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/41313
Title: Unsupervised learning of soft patterns for generating definitions from online news
Authors: Cui, H.
Kan, M.-Y. 
Chua, T.-S. 
Keywords: Definition generation
Definitional question answering
Pseudo-relevance feedback
Soft patterns
Unsupervised learning
Issue Date: 2004
Source: Cui, H., Kan, M.-Y., Chua, T.-S. (2004). Unsupervised learning of soft patterns for generating definitions from online news. Thirteenth International World Wide Web Conference Proceedings, WWW2004: 90-98. ScholarBank@NUS Repository.
Abstract: Breaking news often contains timely definitions and descriptions of current terms, organizations and personalities. We utilize such web sources to construct definitions for these terms. Previous work has identified definitions using hand-crafted rules or supervised learning that constructs rigid, hard text patterns. In contrast, we demonstrate a new approach that uses flexible, soft matching patterns to characterize definition sentences. Our soft patterns effectively accommodate the diversity of definition sentence structure exhibited in news. We use pseudo-relevance feedback to automatically label sentences for use in soft pattern generation. Applying our unsupervised method significantly improves on baseline systems, by 27% on the standardized TREC corpus and by 30% on crawled online news articles, in terms of F measure. When applied to a state-of-the-art definition generation system recently fielded in the TREC 2003 definitional question answering task, it improves performance by 14%.
Source Title: Thirteenth International World Wide Web Conference Proceedings, WWW2004
URI: http://scholarbank.nus.edu.sg/handle/10635/41313
ISBN: 158113844X
Appears in Collections: Staff Publications
