Please use this identifier to cite or link to this item:
https://doi.org/10.1109/taslp.2021.3125142
Field | Value |
---|---|
Title | Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation |
Authors | Zhou, Yi; Tian, Xiaohai; Li, Haizhou |
Keywords | cross-lingual; language agnostic; personalized speech generation; speaker embedding |
Issue Date | 1-Jan-2021 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Citation | Zhou, Yi, Tian, Xiaohai, Li, Haizhou (2021-01-01). Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation. IEEE/ACM Transactions on Audio Speech and Language Processing 29 : 3427-3439. ScholarBank@NUS Repository. https://doi.org/10.1109/taslp.2021.3125142 |
Rights | Attribution 4.0 International |
Abstract | Cross-lingual personalized speech generation seeks to synthesize a target speaker's voice from only a few training samples that are in a different language. One popular technique is to condition a speech synthesizer on a speaker embedding that characterizes the target speaker. Unfortunately, such a speaker embedding is usually affected by the language being spoken, which compromises the speaker similarity in cross-lingual personalized speech generation. In this paper, we propose a novel speaker encoding mechanism that learns a language agnostic speaker embedding to characterize speaker individuality. Specifically, we adopt an encoder-decoder architecture to disentangle the language information from speaker embeddings via multi-task learning. We conduct experiments on both voice conversion and text-to-speech synthesis between English and Mandarin that involve cross-lingual speech generation. All objective and subjective evaluations consistently confirm that the proposed speaker embedding is language agnostic, thus improving cross-lingual personalized speech generation in terms of speaker similarity. © 2014 IEEE. |
Source Title | IEEE/ACM Transactions on Audio Speech and Language Processing |
URI | https://scholarbank.nus.edu.sg/handle/10635/233329 |
ISSN | 2329-9290 |
DOI | 10.1109/taslp.2021.3125142 |
Appears in Collections | Staff Publications; Elements |
Files in This Item:
File | Description | Size | Format | Access Settings | Version |
---|---|---|---|---|---|
10_1109_taslp_2021_3125142.pdf | | 3.19 MB | Adobe PDF | OPEN | None |
This item is licensed under a Creative Commons License