Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/78109
DC Field: Value
dc.title: Dynamic conditional random fields for joint sentence boundary and punctuation prediction
dc.contributor.author: Wang, X.
dc.contributor.author: Ng, H.T.
dc.contributor.author: Sim, K.C.
dc.date.accessioned: 2014-07-04T03:12:32Z
dc.date.available: 2014-07-04T03:12:32Z
dc.date.issued: 2012
dc.identifier.citation: Wang, X., Ng, H.T., Sim, K.C. (2012). Dynamic conditional random fields for joint sentence boundary and punctuation prediction. 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 2: 1382-1385. ScholarBank@NUS Repository.
dc.identifier.isbn: 9781622767595
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/78109
dc.description.abstract: The use of dynamic conditional random fields (DCRF) has been shown to outperform linear-chain conditional random fields (LCRF) for punctuation prediction on conversational speech texts [1]. In this paper, we combine lexical, prosodic, and modified n-gram score features into the DCRF framework for a joint sentence boundary and punctuation prediction task on TDT3 English broadcast news. We show that the joint prediction method outperforms the conventional two-stage method using LCRF or maximum entropy model (MaxEnt). We show the importance of various features using DCRF, LCRF, MaxEnt, and hidden-event n-gram model (HEN) respectively. In addition, we address the practical issue of feature explosion by introducing lexical pruning, which reduces model size and improves the F1-measure. We adopt incremental local training to overcome memory size limitation without incurring significant performance penalty. Our results show that adding prosodic and n-gram score features gives about 20% relative error reduction in all cases. Overall, DCRF gives the best accuracy, followed by LCRF, MaxEnt, and HEN.
dc.source: Scopus
dc.subject: Dynamic conditional random fields
dc.subject: Punctuation
dc.subject: Sentence boundary detection
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.description.sourcetitle: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
dc.description.volume: 2
dc.description.page: 1382-1385
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.