Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/78109
Title: Dynamic conditional random fields for joint sentence boundary and punctuation prediction
Authors: Wang, X.; Ng, H.T.; Sim, K.C.
Keywords: Dynamic conditional random fields; Punctuation; Sentence boundary detection
Issue Date: 2012
Citation: Wang, X., Ng, H.T., Sim, K.C. (2012). Dynamic conditional random fields for joint sentence boundary and punctuation prediction. 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 2: 1382-1385. ScholarBank@NUS Repository.
Abstract: The use of dynamic conditional random fields (DCRF) has been shown to outperform linear-chain conditional random fields (LCRF) for punctuation prediction on conversational speech texts [1]. In this paper, we combine lexical, prosodic, and modified n-gram score features into the DCRF framework for a joint sentence boundary and punctuation prediction task on TDT3 English broadcast news. We show that the joint prediction method outperforms the conventional two-stage method using LCRF or maximum entropy model (MaxEnt). We show the importance of various features using DCRF, LCRF, MaxEnt, and hidden-event n-gram model (HEN) respectively. In addition, we address the practical issue of feature explosion by introducing lexical pruning, which reduces model size and improves the F1-measure. We adopt incremental local training to overcome memory size limitation without incurring significant performance penalty. Our results show that adding prosodic and n-gram score features gives about 20% relative error reduction in all cases. Overall, DCRF gives the best accuracy, followed by LCRF, MaxEnt, and HEN.
Source Title: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
URI: http://scholarbank.nus.edu.sg/handle/10635/78109
ISBN: 9781622767595
Appears in Collections: Staff Publications