|Title:||Domain adaptive bootstrapping for named entity recognition|
|Authors:||Wu, D.; Lee, W.S.; Ye, N.; Chieu, H.L.|
|Issue Date:||2009|
|Citation:||Wu, D., Lee, W.S., Ye, N., Chieu, H.L. (2009). Domain adaptive bootstrapping for named entity recognition. EMNLP 2009 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: A Meeting of SIGDAT, a Special Interest Group of ACL, Held in Conjunction with ACL-IJCNLP 2009: 1523-1532. ScholarBank@NUS Repository.|
|Abstract:||Bootstrapping is the process of improving the performance of a trained classifier by iteratively adding data that is labeled by the classifier itself to the training set, and retraining the classifier. It is often used in situations where labeled training data is scarce but unlabeled data is abundant. In this paper, we consider the problem of domain adaptation: the situation where training data may not be scarce, but belongs to a different domain from the target application domain. As the distribution of unlabeled data is different from the training data, standard bootstrapping often has difficulty selecting informative data to add to the training set. We propose an effective domain adaptive bootstrapping algorithm that selects unlabeled target domain data that are informative about the target domain and easy to automatically label correctly. We call these instances bridges, as they are used to bridge the source domain to the target domain. We show that the method outperforms supervised, transductive and bootstrapping algorithms on the named entity recognition task. © 2009 ACL and AFNLP.|
|Source Title:||EMNLP 2009 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: A Meeting of SIGDAT, a Special Interest Group of ACL, Held in Conjunction with ACL-IJCNLP 2009|
|URI:||http://scholarbank.nus.edu.sg/handle/10635/40416|
|Appears in Collections:||Staff Publications|
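
The abstract describes standard bootstrapping as repeatedly labeling unlabeled data with the current classifier, adding those labels to the training set, and retraining. A minimal sketch of that generic loop follows. It is not the paper's domain adaptive algorithm and does not implement bridge selection. The nearest-centroid classifier, the margin-based confidence score, the `threshold` parameter, and the toy one-dimensional data are all assumptions made for illustration.

```python
from statistics import mean


def train(labeled):
    """Fit a toy nearest-centroid classifier: one mean value per label."""
    by_label = {}
    for x, y in labeled:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict(model, x):
    """Return (label, confidence).

    Confidence is the gap between the distances to the two nearest
    centroids (a hypothetical score, chosen only for this sketch).
    """
    ranked = sorted(model.items(), key=lambda kv: abs(x - kv[1]))
    best_label, best_centroid = ranked[0]
    if len(ranked) > 1:
        margin = abs(x - ranked[1][1]) - abs(x - best_centroid)
    else:
        margin = 0.0
    return best_label, margin


def bootstrap(labeled, unlabeled, threshold=1.0, max_rounds=10):
    """Generic self-training: add confident self-labeled points, then retrain."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train(labeled)
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            label, conf = predict(model, x)
            (confident if conf >= threshold else rest).append((x, label))
        if not confident:
            break  # nothing new was labeled confidently, so stop
        labeled.extend(confident)
        pool = [x for x, _ in rest]
        model = train(labeled)
    return model, labeled, pool


# Toy data: two seed examples and five unlabeled points.
seed = [(0.0, "O"), (10.0, "ENT")]
unlabeled = [1.0, 2.0, 9.0, 8.0, 5.2]
model, final_labeled, leftover = bootstrap(seed, unlabeled)
```

In this toy run the point 5.2 stays unlabeled because it lies almost midway between the two centroids. That loosely mirrors the difficulty the abstract raises: a plain confidence threshold tends to pick examples that resemble the existing training data, which is the gap the paper's bridge instances are meant to address.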
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.