Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/41232
Title: Discovering relations between named entities from a large raw corpus using tree similarity-based clustering
Authors: Zhang, M.; Su, J.; Wang, D.; Zhou, G.; Tan, C.L.
Issue Date: 2005
Citation: Zhang, M., Su, J., Wang, D., Zhou, G., Tan, C.L. (2005). Discovering relations between named entities from a large raw corpus using tree similarity-based clustering. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3651 LNAI: 378-389. ScholarBank@NUS Repository.
Abstract: We propose a tree-similarity-based unsupervised learning method to extract relations between Named Entities from a large raw corpus. Our method regards relation extraction as a clustering problem on shallow parse trees. First, we modify previous tree kernels for relation extraction to estimate the similarity between parse trees more efficiently. Then, the similarity between parse trees is used in a hierarchical clustering algorithm to group entity pairs into different clusters. Finally, each cluster is labeled by an indicative word and unreliable clusters are pruned out. Evaluation on the New York Times (1995) corpus shows that our method outperforms the only previous work by 5 in F-measure. It also shows that our method performs well on both high-frequency and low-frequency entity pairs. To the best of our knowledge, this is the first work to use a tree similarity metric in relation clustering. © Springer-Verlag Berlin Heidelberg 2005.
Source Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
URI: http://scholarbank.nus.edu.sg/handle/10635/41232
ISBN: 3540291725
ISSN: 03029743
Appears in Collections: Staff Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.