|Title:||Improved Constituent Context model with features|
|Citation:||Huang, Y., Zhang, M., Tan, C.L. (2012). Improved Constituent Context model with features. Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012: 564-573. ScholarBank@NUS Repository.|
|Abstract:||The Constituent-Context Model (CCM) achieves promising results for unsupervised grammar induction. However, its performance drops for longer sentences. In this paper, we describe a general feature-based model for CCM, in which linguistic knowledge can be easily integrated as features. Features take the log-linear form with local normalization, so the Expectation-Maximization (EM) algorithm is still applicable for estimating model parameters. The l1-norm is used to control model complexity, leading to a sparse and compact grammar. We also propose using a separate development set to perform model selection and an additional test set to evaluate performance. Under this framework, suitable model parameters can be chosen automatically rather than set empirically. Experiments on the English treebank demonstrate that the feature-based model achieves comparable performance on short sentences and significant improvement on longer sentences. © 2012 The PACLIC.|
|Source Title:||Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012|
|Appears in Collections:||Staff Publications|
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.