Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/41133
Title: Keyphrase extraction in scientific publications
Authors: Nguyen, T.D. 
Kan, M.-Y. 
Issue Date: 2007
Citation: Nguyen, T.D., Kan, M.-Y. (2007). Keyphrase extraction in scientific publications. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 4822 LNCS: 317-326. ScholarBank@NUS Repository.
Abstract: We present a keyphrase extraction algorithm for scientific publications. Unlike previous work, we introduce features that capture the positions of phrases in a document with respect to the logical sections found in scientific discourse. We also introduce features that capture salient morphological phenomena found in scientific keyphrases, such as whether a candidate keyphrase is an acronym or uses specific terminologically productive suffixes. We have implemented these features on top of a baseline feature set used by Kea [1]. In our evaluation using a corpus of 120 scientific publications multiply annotated for keyphrases, our system significantly outperformed Kea at the p < .05 level. As we know of no other existing multiply annotated keyphrase document collections, we have also made our evaluation corpus publicly available. We hope that this contribution will spur future comparative research. © Springer-Verlag Berlin Heidelberg 2007.
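Illustrative sketch (not part of the record): the abstract names two feature families, section-aware position and morphology (acronyms, productive suffixes). The Python below is only a rough sketch of how such features might be computed for a candidate phrase. The function names, the acronym pattern, the suffix list, and the section names are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import re

# Hypothetical suffix list for illustration only; the abstract does not
# enumerate which suffixes the authors treat as terminologically productive.
ASSUMED_SUFFIXES = ("ion", "ics", "ment", "ity")


def is_acronym(candidate):
    """Assumed definition: a single token of two or more uppercase letters."""
    return re.fullmatch(r"[A-Z]{2,}", candidate) is not None


def has_productive_suffix(candidate, suffixes=ASSUMED_SUFFIXES):
    """True if the last word of the candidate ends with an assumed suffix."""
    words = candidate.lower().split()
    return bool(words) and words[-1].endswith(tuple(suffixes))


def section_occurrence_vector(candidate, sections):
    """Map each logical section name to 1 if the candidate occurs in it, else 0.

    `sections` is a dict of section name -> section text; the section names
    themselves are supplied by the caller and are hypothetical here.
    """
    needle = candidate.lower()
    return {name: int(needle in text.lower()) for name, text in sections.items()}


if __name__ == "__main__":
    doc = {
        "abstract": "We present a keyphrase extraction algorithm.",
        "conclusion": "Future work remains.",
    }
    print(is_acronym("SVM"))                              # True
    print(has_productive_suffix("keyphrase extraction"))  # True
    print(section_occurrence_vector("keyphrase", doc))    # {'abstract': 1, 'conclusion': 0}
```

In a Kea-style setup, values like these would presumably be appended to the baseline feature vector for each candidate before training the classifier; how the authors actually encode and weight them is not stated in the abstract.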
Source Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
URI: http://scholarbank.nus.edu.sg/handle/10635/41133
ISBN: 9783540770930
ISSN: 03029743
Appears in Collections: Staff Publications
