Multi-stage DNN training for automatic recognition of dysarthric speech

Please use this identifier to cite or link to this item: https://doi.org/10.21437/Interspeech.2017-303

DC Field	Value
dc.title	Multi-stage DNN training for automatic recognition of dysarthric speech
dc.contributor.author	Yilmaz E.
dc.contributor.author	Mario Ganzeboom
dc.contributor.author	Catia Cucchiarini
dc.contributor.author	Helmer Strik
dc.date.accessioned	2018-08-02T05:10:34Z
dc.date.available	2018-08-02T05:10:34Z
dc.date.issued	2017-08-01
dc.identifier.citation	Yilmaz E., Mario Ganzeboom, Catia Cucchiarini, Helmer Strik (2017-08-01). Multi-stage DNN training for automatic recognition of dysarthric speech. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017-August : 2685-2689. ScholarBank@NUS Repository. https://doi.org/10.21437/Interspeech.2017-303
dc.identifier.issn	2308-457X
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/145522
dc.description.abstract	Incorporating automatic speech recognition (ASR) in individualized speech training applications is becoming more viable thanks to the improved generalization capabilities of neural network-based acoustic models. The main problem in developing applications for dysarthric speech is the relative in-domain data scarcity. Collecting representative amounts of dysarthric speech data is difficult due to rigorous ethical and medical permission requirements, problems in accessing patients who are generally vulnerable and often subject to altering health conditions and, last but not least, the high variability in speech resulting from different pathological conditions. Developing such applications is even more challenging for languages which in general have fewer resources, fewer speakers and, consequently, also fewer patients than English, as in the case of a mid-sized language like Dutch. In this paper, we investigate a multi-stage deep neural network (DNN) training scheme aimed at obtaining better modeling of dysarthric speech by using only a small amount of in-domain training data. The results show that the system employing the proposed training scheme considerably improves the recognition of Dutch dysarthric speech compared to a baseline system with single-stage training only on a large amount of normal speech or a small amount of in-domain data.
dc.language.iso	en
dc.publisher	International Speech Communication Association
dc.subject	Automatic speech recognition, Deep neural networks, Dysarthria, Pathological speech
dc.type	Conference Paper
dc.contributor.department	ELECTRICAL & COMPUTER ENGINEERING
dc.description.doi	10.21437/Interspeech.2017-303
dc.description.sourcetitle	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
dc.description.volume	2017-August
dc.description.page	2685-2689
dc.published.state	Published
dc.grant.id	NWO 314-99-101 (CHASING)
dc.grant.fundingagency	Nederlandse Organisatie voor Wetenschappelijk Onderzoek
Appears in Collections:	Elements Staff Publications

Files in This Item:

File	Description	Size	Format	Access Settings	Version
Interspeech2017_2.pdf	Preprint version	96.46 kB	Adobe PDF	OPEN	Pre-print
