Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/15166
Title: Protein-protein interaction: A supervised learning approach
Authors: XIAO JUAN
Keywords: Protein-Protein Interaction, Maximum Entropy Model, Feature-based
Issue Date: 23-Feb-2006
Source: XIAO JUAN (2006-02-23). Protein-protein interaction: A supervised learning approach. ScholarBank@NUS Repository.
Abstract: In this thesis, we explore an effective solution for Protein-Protein Interaction (PPI) extraction, a specific relation extraction (RE) task in bio-literature, through a systematic study using a Maximum Entropy model. We explore a rich set of features, including lexical, syntactic and semantic features. Finally, we propose a method that integrates all features via a Maximum Entropy model for PPI extraction. Evaluation on the IEPA corpus shows that our system achieves 93.9% recall and 88.0% precision. Given the unique problems of PPI extraction in contrast to existing RE tasks and the lack of in-depth studies in this area, our work offers new insights into PPI extraction. For instance, we explore some features (keyword, protein pair and protein abbreviation features) not previously attempted in other PPI research. Our study also gives further insight into RE in general, which is still a research area far from mature. For instance, we examine the abbreviation feature, which has not been attempted in other feature-based approaches in the news domain. Furthermore, compared with other RE findings, we find that protein pairs, surrounding words and chunk features contribute a large portion of the performance improvement.
URI: http://scholarbank.nus.edu.sg/handle/10635/15166
Appears in Collections: Master's Theses (Open)

Files in This Item:
File: thesis.pdf
Size: 311.45 kB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.