Title: Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation
Authors: Zhou, Yi
Tian, Xiaohai
Li, Haizhou 
Keywords: cross-lingual
Language agnostic
personalized speech generation
speaker embedding
Issue Date: 1-Jan-2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Citation: Zhou, Yi, Tian, Xiaohai, Li, Haizhou (2021-01-01). Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation. IEEE/ACM Transactions on Audio Speech and Language Processing 29 : 3427-3439. ScholarBank@NUS Repository.
Rights: Attribution 4.0 International
Abstract: Cross-lingual personalized speech generation seeks to synthesize a target speaker's voice from only a few training samples that are in a different language. One popular technique is to condition a speech synthesizer on a speaker embedding that characterizes the target speaker. Unfortunately, such a speaker embedding is usually affected by the language being spoken, which compromises the speaker similarity in cross-lingual personalized speech generation. In this paper, we propose a novel speaker encoding mechanism that learns a language agnostic speaker embedding to characterize speaker individuality. Specifically, we adopt an encoder-decoder architecture to disentangle the language information from speaker embeddings via multi-task learning. We conduct experiments on both voice conversion and text-to-speech synthesis between English and Mandarin that involve cross-lingual speech generation. All objective and subjective evaluations consistently confirm that the proposed speaker embedding is language agnostic, thus improving cross-lingual personalized speech generation in terms of speaker similarity. © 2014 IEEE.
Source Title: IEEE/ACM Transactions on Audio Speech and Language Processing
ISSN: 2329-9290
DOI: 10.1109/taslp.2021.3125142
Appears in Collections: Staff Publications

Files in This Item:
File: 10_1109_taslp_2021_3125142.pdf
Size: 3.19 MB
Format: Adobe PDF




This item is licensed under a Creative Commons License.