Improved combinatory categorial grammar induction with boundary words and Bayesian inference

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/78183

Title:	Improved combinatory categorial grammar induction with boundary words and Bayesian inference
Authors:	Huang, Y.; Zhang, M.; Tan, C.L.
Keywords:	Bayesian model; Boundary words; Combinatory categorial grammar; Grammar induction
Issue Date:	2012
Citation:	Huang, Y., Zhang, M., Tan, C.L. (2012). Improved combinatory categorial grammar induction with boundary words and Bayesian inference. 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers: 1257-1274. ScholarBank@NUS Repository.
Abstract:	Combinatory Categorial Grammar (CCG) is an expressive grammar formalism that can capture long-range dependencies. However, building large, wide-coverage treebanks for CCG is expensive and time-consuming. In this paper, we focus on the problem of unsupervised CCG induction from plain text. Building on the baseline model of Bisk and Hockenmaier (2012), we propose the following two improvements: (1) we use boundary part-of-speech (POS) tags to capture lexical information; (2) we perform nonparametric Bayesian inference based on the Pitman-Yor process to learn compact grammars. Experiments on the English Penn Treebank demonstrate the effectiveness of our boundary model and Bayesian learning. © 2012 The COLING.
Source Title:	24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers
URI:	http://scholarbank.nus.edu.sg/handle/10635/78183
Appears in Collections:	Staff Publications

There are no files associated with this item.
