Please use this identifier to cite or link to this item: https://doi.org/10.1017/ATSIP.2020.9
Title: An analysis of speaker dependent models in replay detection
Authors: Suthokumar, G.
Sriskandaraja, K.
Sethu, V.
Ambikairajah, E.
Li, H. 
Keywords: Replay Attack
Speaker Adapted Neural Networks
Speaker Dependent Models
Speaker Verification
Spoofing Detection
Issue Date: 2020
Publisher: Cambridge University Press
Citation: Suthokumar, G., Sriskandaraja, K., Sethu, V., Ambikairajah, E., Li, H. (2020). An analysis of speaker dependent models in replay detection. APSIPA Transactions on Signal and Information Processing 9 : e14. ScholarBank@NUS Repository. https://doi.org/10.1017/ATSIP.2020.9
Rights: Attribution 4.0 International
Abstract: Most research on replay detection has focused on developing a stand-alone countermeasure that runs independently of a speaker verification system by training a single spoofed model and a single genuine model for all speakers. In this paper, we explore the potential benefits of adapting the back-end of a spoofing detection system towards the claimed target speaker. Specifically, we characterize and quantify speaker variability by comparing speaker-dependent and speaker-independent (SI) models of feature distributions for both genuine and spoofed speech. Following this, we develop an approach for implementing speaker-dependent spoofing detection using a Gaussian mixture model (GMM) back-end, where both the genuine and spoofed models are adapted to the claimed speaker. Finally, we also develop and evaluate a speaker-specific neural network-based spoofing detection system in addition to the GMM-based back-end. Evaluations of the proposed approaches on the replay corpora BTAS2016 and ASVspoof2017 v2.0 reveal that the proposed speaker-dependent spoofing detection outperforms equivalent SI replay detection baselines on both datasets. Our experimental results show that the use of speaker-specific genuine models leads to a significant improvement (around 4% in terms of equal error rate (EER)), as previously shown, and that the addition of speaker-specific spoofed models adds a small improvement on top (less than 1% in terms of EER). © 2020 The Author(s). Published by Cambridge University Press in association with Asia Pacific Signal and Information Processing Association.
Source Title: APSIPA Transactions on Signal and Information Processing
URI: https://scholarbank.nus.edu.sg/handle/10635/196802
ISSN: 2048-7703
DOI: 10.1017/ATSIP.2020.9
Appears in Collections: Staff Publications
Files in This Item:
File: 10_1017_ATSIP_2020_9.pdf (2.86 MB, Adobe PDF, open access)