An analysis of speaker dependent models in replay detection

Please use this identifier to cite or link to this item: https://doi.org/10.1017/ATSIP.2020.9

DC Field	Value
dc.title	An analysis of speaker dependent models in replay detection
dc.contributor.author	Suthokumar, G.
dc.contributor.author	Sriskandaraja, K.
dc.contributor.author	Sethu, V.
dc.contributor.author	Ambikairajah, E.
dc.contributor.author	Li, H.
dc.date.accessioned	2021-08-13T02:54:11Z
dc.date.available	2021-08-13T02:54:11Z
dc.date.issued	2020
dc.identifier.citation	Suthokumar, G., Sriskandaraja, K., Sethu, V., Ambikairajah, E., Li, H. (2020). An analysis of speaker dependent models in replay detection. APSIPA Transactions on Signal and Information Processing 9 : e14. ScholarBank@NUS Repository. https://doi.org/10.1017/ATSIP.2020.9
dc.identifier.issn	2048-7703
dc.identifier.uri	https://scholarbank.nus.edu.sg/handle/10635/196802
dc.description.abstract	Most research on replay detection has focused on developing a stand-alone countermeasure that runs independently of a speaker verification system by training a single spoofed model and a single genuine model for all speakers. In this paper, we explore the potential benefits of adapting the back-end of a spoofing detection system towards the claimed target speaker. Specifically, we characterize and quantify speaker variability by comparing speaker-dependent and speaker-independent (SI) models of feature distributions for both genuine and spoofed speech. Following this, we develop an approach for implementing speaker-dependent spoofing detection using a Gaussian mixture model (GMM) back-end, where both the genuine and spoofed models are adapted to the claimed speaker. Finally, we also develop and evaluate a speaker-specific neural network-based spoofing detection system in addition to the GMM-based back-end. Evaluations of the proposed approaches on replay corpora BTAS2016 and ASVspoof2017 v2.0 reveal that the proposed speaker-dependent spoofing detection outperforms equivalent SI replay detection baselines on both datasets. Our experimental results show that the use of speaker-specific genuine models leads to a significant improvement (around 4% in terms of equal error rate (EER)) as previously shown and the addition of speaker-specific spoofed models adds a small improvement on top (less than 1% in terms of EER). © 2020 The Author(s). Published by Cambridge University Press in association with Asia Pacific Signal and Information Processing Association.
dc.publisher	Cambridge University Press
dc.rights	Attribution 4.0 International
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.source	Scopus OA2020
dc.subject	Replay Attack
dc.subject	Speaker Adapted Neural Networks
dc.subject	Speaker Dependent Models
dc.subject	Speaker Verification
dc.subject	Spoofing Detection
dc.type	Article
dc.contributor.department	ELECTRICAL AND COMPUTER ENGINEERING
dc.description.doi	10.1017/ATSIP.2020.9
dc.description.sourcetitle	APSIPA Transactions on Signal and Information Processing
dc.description.volume	9
dc.description.page	e14
Appears in Collections:	Staff Publications Elements

Files in This Item:

File	Description	Size	Format	Access Settings	Version
10_1017_ATSIP_2020_9.pdf		2.86 MB	Adobe PDF	OPEN	None

This item is licensed under a Creative Commons License
