Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/78109
Title: Dynamic conditional random fields for joint sentence boundary and punctuation prediction
Authors: Wang, X.
Ng, H.T. 
Sim, K.C. 
Keywords: Dynamic conditional random fields
Punctuation
Sentence boundary detection
Issue Date: 2012
Citation: Wang, X., Ng, H.T., Sim, K.C. (2012). Dynamic conditional random fields for joint sentence boundary and punctuation prediction. 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 2: 1382-1385. ScholarBank@NUS Repository.
Abstract: Dynamic conditional random fields (DCRF) have been shown to outperform linear-chain conditional random fields (LCRF) for punctuation prediction on conversational speech texts [1]. In this paper, we combine lexical, prosodic, and modified n-gram score features in the DCRF framework for a joint sentence boundary and punctuation prediction task on TDT3 English broadcast news. We show that the joint prediction method outperforms the conventional two-stage method using LCRF or a maximum entropy model (MaxEnt). We show the importance of various features using DCRF, LCRF, MaxEnt, and a hidden-event n-gram model (HEN). In addition, we address the practical issue of feature explosion by introducing lexical pruning, which reduces model size and improves the F1-measure. We adopt incremental local training to overcome memory size limitations without incurring a significant performance penalty. Our results show that adding prosodic and n-gram score features gives about 20% relative error reduction in all cases. Overall, DCRF gives the best accuracy, followed by LCRF, MaxEnt, and HEN.
Source Title: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
URI: http://scholarbank.nus.edu.sg/handle/10635/78109
ISBN: 9781622767595
Appears in Collections: Staff Publications
