Please use this identifier to cite or link to this item:
https://doi.org/10.1109/ASRU.2013.6707743
DC Field | Value | |
---|---|---|
dc.title | Improving robustness of deep neural networks via spectral masking for automatic speech recognition | |
dc.contributor.author | Li, B. | |
dc.contributor.author | Sim, K.C. | |
dc.date.accessioned | 2014-07-04T03:13:24Z | |
dc.date.available | 2014-07-04T03:13:24Z | |
dc.date.issued | 2013 | |
dc.identifier.citation | Li, B., Sim, K.C. (2013). Improving robustness of deep neural networks via spectral masking for automatic speech recognition. 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings: 279-284. ScholarBank@NUS Repository. https://doi.org/10.1109/ASRU.2013.6707743 | |
dc.identifier.isbn | 9781479927562 | |
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/78187 | |
dc.description.abstract | The performance of human listeners degrades rather slowly compared to that of machines in noisy environments. This has been attributed to the ability to perform auditory scene analysis, which separates the speech prior to recognition. In this work, we investigate two mask estimation approaches, namely the state-dependent and the deep neural network (DNN) based estimations, to separate speech from noise and thereby improve the noise robustness of DNN acoustic models. The second approach has been experimentally shown to outperform the first. Because of the stereo-data-based training and the ill-defined masks for speech with channel distortions, neither method generalizes well to unseen conditions, and both fail to beat the performance of the multi-style trained baseline system. However, the model trained on masked features is strongly complementary to the baseline model. The simple average of the two systems' posteriors yields word error rates of 4.4% on Aurora2 and 12.3% on Aurora4. © 2013 IEEE. | |
dc.description.uri | http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/ASRU.2013.6707743 | |
dc.source | Scopus | |
dc.subject | Deep Neural Network | |
dc.subject | Noise Robustness | |
dc.subject | Spectral Masking | |
dc.type | Conference Paper | |
dc.contributor.department | COMPUTER SCIENCE | |
dc.description.doi | 10.1109/ASRU.2013.6707743 | |
dc.description.sourcetitle | 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings | |
dc.description.page | 279-284 | |
dc.identifier.isiut | NOT_IN_WOS | |
Appears in Collections: | Staff Publications |