Please use this identifier to cite or link to this item: https://doi.org/10.21437/Interspeech.2017-391
DC Field: Value
dc.title: Exploiting untranscribed broadcast data for improved code-switching detection
dc.contributor.author: Yilmaz E.
dc.contributor.author: Henk van den Heuvel
dc.contributor.author: David van Leeuwen
dc.date.accessioned: 2018-08-02T04:58:55Z
dc.date.available: 2018-08-02T04:58:55Z
dc.date.issued: 2017-01-01
dc.identifier.citation: Yilmaz E., Henk van den Heuvel, David van Leeuwen (2017-01-01). Exploiting untranscribed broadcast data for improved code-switching detection. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017-August : 42-46. ScholarBank@NUS Repository. https://doi.org/10.21437/Interspeech.2017-391
dc.identifier.issn: 2308457X
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/145521
dc.description.abstract: We have recently presented an automatic speech recognition (ASR) system operating on Frisian-Dutch code-switched speech. This type of speech requires careful handling of unexpected language switches that may occur in a single utterance. In this paper, we extend this work by using some raw broadcast data to improve multilingually trained deep neural networks (DNN) that have been trained on 11.5 hours of manually annotated bilingual speech. For this purpose, we apply the initial ASR to the untranscribed broadcast data and automatically create transcriptions based on the recognizer output using different language models for rescoring. Then, we train new acoustic models on the combined data, i.e., the manually and automatically transcribed bilingual broadcast data, and investigate the automatic transcription quality based on the recognition accuracies on a separate set of development and test data. Finally, we report code-switching detection performance elaborating on the correlation between the ASR and the code-switching detection performance.
dc.language.iso: en
dc.publisher: ISCA
dc.subject: Bilingual ASR, Code-switching, Frisian language, Under-resourced languages
dc.type: Conference Paper
dc.contributor.department: ELECTRICAL & COMPUTER ENGINEERING
dc.description.doi: 10.21437/Interspeech.2017-391
dc.description.sourcetitle: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
dc.description.volume: 2017-August
dc.description.page: 42-46
dc.published.state: Published
dc.grant.id: NWO Project 314-99-119 (Frisian Audio Mining Enterprise)
dc.grant.fundingagency: Nederlandse Organisatie voor Wetenschappelijk Onderzoek
Appears in Collections: Elements
Staff Publications

Files in This Item:
File: Interspeech2017_1.pdf
Size: 283.18 kB
Format: Adobe PDF
Access Settings: OPEN
Version: Post-print

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.