Please use this identifier to cite or link to this item:
https://doi.org/10.1017/ATSIP.2020.9
DC Field | Value | |
---|---|---|
dc.title | An analysis of speaker dependent models in replay detection | |
dc.contributor.author | Suthokumar, G. | |
dc.contributor.author | Sriskandaraja, K. | |
dc.contributor.author | Sethu, V. | |
dc.contributor.author | Ambikairajah, E. | |
dc.contributor.author | Li, H. | |
dc.date.accessioned | 2021-08-13T02:54:11Z | |
dc.date.available | 2021-08-13T02:54:11Z | |
dc.date.issued | 2020 | |
dc.identifier.citation | Suthokumar, G., Sriskandaraja, K., Sethu, V., Ambikairajah, E., Li, H. (2020). An analysis of speaker dependent models in replay detection. APSIPA Transactions on Signal and Information Processing 9 : e14. ScholarBank@NUS Repository. https://doi.org/10.1017/ATSIP.2020.9 | |
dc.identifier.issn | 20487703 | |
dc.identifier.uri | https://scholarbank.nus.edu.sg/handle/10635/196802 | |
dc.description.abstract | Most research on replay detection has focused on developing a stand-alone countermeasure that runs independently of a speaker verification system by training a single spoofed model and a single genuine model for all speakers. In this paper, we explore the potential benefits of adapting the back-end of a spoofing detection system towards the claimed target speaker. Specifically, we characterize and quantify speaker variability by comparing speaker-dependent and speaker-independent (SI) models of feature distributions for both genuine and spoofed speech. Following this, we develop an approach for implementing speaker-dependent spoofing detection using a Gaussian mixture model (GMM) back-end, where both the genuine and spoofed models are adapted to the claimed speaker. Finally, we also develop and evaluate a speaker-specific neural network-based spoofing detection system in addition to the GMM-based back-end. Evaluations of the proposed approaches on the replay corpora BTAS2016 and ASVspoof2017 v2.0 reveal that the proposed speaker-dependent spoofing detection outperforms equivalent SI replay detection baselines on both datasets. Our experimental results show that the use of speaker-specific genuine models leads to a significant improvement (around 4% in terms of equal error rate (EER)), as previously shown, and that the addition of speaker-specific spoofed models adds a small improvement on top (less than 1% in terms of EER). © 2020 The Author(s). Published by Cambridge University Press in association with Asia Pacific Signal and Information Processing Association. | |
dc.publisher | Cambridge University Press | |
dc.rights | Attribution 4.0 International | |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
dc.source | Scopus OA2020 | |
dc.subject | Replay Attack | |
dc.subject | Speaker Adapted Neural Networks | |
dc.subject | Speaker Dependent Models | |
dc.subject | Speaker Verification | |
dc.subject | Spoofing Detection | |
dc.type | Article | |
dc.contributor.department | ELECTRICAL AND COMPUTER ENGINEERING | |
dc.description.doi | 10.1017/ATSIP.2020.9 | |
dc.description.sourcetitle | APSIPA Transactions on Signal and Information Processing | |
dc.description.volume | 9 | |
dc.description.page | e14 | |
Appears in Collections: | Staff Publications Elements |
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
10_1017_ATSIP_2020_9.pdf | | 2.86 MB | Adobe PDF | OPEN | None | View/Download |
This item is licensed under a Creative Commons License