Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/15166
Title: | Protein-protein interaction: A supervised learning approach |
Authors: | XIAO JUAN |
Keywords: | Protein-Protein Interaction, Maximum Entropy Model, Feature-based |
Issue Date: | 23-Feb-2006 |
Citation: | XIAO JUAN (2006-02-23). Protein-protein interaction: A supervised learning approach. ScholarBank@NUS Repository. |
Abstract: | In this thesis, we explore an effective solution for Protein-Protein Interaction (PPI) extraction, a specific relation extraction (RE) task in bio-literature, through a systematic study using a Maximum Entropy model. We explore a rich set of features, including lexical, syntactic and semantic features, and propose a method that integrates all of them via a Maximum Entropy model for PPI. Evaluation on the IEPA corpus shows that our system achieves 93.9% recall and 88.0% precision. Given the unique problems of PPI extraction in contrast to existing RE tasks, and the lack of in-depth studies in this area, our work offers new insights into PPI extraction. For instance, we explore some features (keyword, protein pair and protein abbreviation features) hitherto not attempted in other PPI research. Our study also gives further insight into RE in general, which is still a research area far from mature. For instance, we explore the abbreviation feature, which has not been attempted in other feature-based approaches in the news domain. Furthermore, compared with other RE findings, we find that protein pairs, surrounding words and chunk features contribute a large portion of the performance improvement. |
URI: | http://scholarbank.nus.edu.sg/handle/10635/15166 |
Appears in Collections: | Master's Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version |
---|---|---|---|---|---|
thesis.pdf | | 311.45 kB | Adobe PDF | OPEN | None |