EXPLOITING TAGGED AND UNTAGGED CORPORA FOR WORD SENSE DISAMBIGUATION

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/15693

Title:	EXPLOITING TAGGED AND UNTAGGED CORPORA FOR WORD SENSE DISAMBIGUATION
Authors:	NIU ZHENGYU
Keywords:	word sense disambiguation, word sense discrimination, word sense detection, semi-supervised classification, partially supervised classification.
Issue Date:	24-Feb-2007
Citation:	NIU ZHENGYU (2007-02-24). EXPLOITING TAGGED AND UNTAGGED CORPORA FOR WORD SENSE DISAMBIGUATION. ScholarBank@NUS Repository.
Abstract:	Traditional supervised methods for sense disambiguation require many sense-tagged examples, which are often difficult, expensive, or time-consuming to obtain. Moreover, if the sense-tagged corpus contains no tagged examples for a sense (e.g., a domain-specific sense), then sense taggers built on this corpus with traditional learning techniques will mis-tag instances of the missing sense. We investigate a series of novel machine learning approaches on benchmark corpora for sense disambiguation and empirically compare them with related state-of-the-art sense disambiguation methods. These approaches address the following questions: How can the number of senses (the sense number, or model order) of an ambiguous word be estimated automatically from an untagged corpus? (Minimum Description Length criterion); How can untagged corpora be used to build a better sense tagger? (label propagation); How can sense disambiguation be performed with an incomplete sense-tagged corpus? (partially supervised learning).
URI:	http://scholarbank.nus.edu.sg/handle/10635/15693
Appears in Collections:	Ph.D Theses (Open)

File	Description	Size	Format	Access Settings	Version
NiuZY.pdf		492.36 kB	Adobe PDF	OPEN	None
