Please use this identifier to cite or link to this item: https://doi.org/10.1145/2467696.2467703
Title: Extracting and matching authors and affiliations in scholarly documents
Authors: Do, H.H.N.
Chandrasekaran, M.K.
Cho, P.S.
Kan, M.-Y. 
Keywords: Conditional random fields
Logical structure discovery
Metadata extraction
Rich document features
Support vector machine
Issue Date: 2013
Source: Do, H.H.N., Chandrasekaran, M.K., Cho, P.S., Kan, M.-Y. (2013). Extracting and matching authors and affiliations in scholarly documents. Proceedings of the ACM/IEEE Joint Conference on Digital Libraries: 219-228. ScholarBank@NUS Repository. https://doi.org/10.1145/2467696.2467703
Abstract: We introduce Enlil, an information extraction system that discovers the institutional affiliations of authors in scholarly papers. Enlil consists of two steps: the first identifies authors and affiliations using a conditional random field, and the second uses a support vector machine to connect authors to their affiliations. We benchmark Enlil in three separate experiments drawn from three different sources: the ACL Anthology, the ACM Digital Library, and a set of cross-disciplinary scientific journal articles acquired by querying Google Scholar. Against a state-of-the-art production baseline, Enlil reports a statistically significant improvement in F1 of nearly 10% (p ≪ 0.01). In the case of multidisciplinary articles from Google Scholar, Enlil is benchmarked over both clean input (F1 > 90%) and automatically acquired input (F1 > 80%). We have deployed Enlil in a case study of Asian genomics research publication patterns to understand how government-sponsored collaborative links evolve. Enlil has enabled our team to construct and validate new metrics that quantify the facilitation of research as opposed to direct publication. Copyright © 2013 by the Association for Computing Machinery, Inc. (ACM).
Source Title: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
URI: http://scholarbank.nus.edu.sg/handle/10635/78140
ISBN: 9781450320764
ISSN: 1552-5996
DOI: 10.1145/2467696.2467703
Appears in Collections: Staff Publications
