Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/164838
Title: MACHINE LEARNING FOR LIMITED DATA VOICE CONVERSION
Authors: BERRAK SISMAN
ORCID iD: orcid.org/0000-0001-8078-3305
Keywords: voice conversion, machine learning, deep learning, limited data, speech synthesis, AI
Issue Date: 25-Oct-2019
Citation: BERRAK SISMAN (2019-10-25). MACHINE LEARNING FOR LIMITED DATA VOICE CONVERSION. ScholarBank@NUS Repository.
Abstract: Voice conversion aims to convert one’s voice to sound like that of another. This thesis focuses on developing advanced machine learning algorithms and frameworks for voice conversion under the constraint of limited training data. Firstly, a new voice conversion approach is proposed to address the problem of limited training data, with and without parallel data, in which phonetic information is introduced into the exemplar-based voice conversion framework. Secondly, we propose a voice conversion framework that addresses research problems in both spectral feature transformation and waveform generation. Thirdly, we study different training strategies for the WaveNet vocoder in GAN-based voice conversion and propose to use WaveNet as a vocoder as well as a residual compensator. Lastly, we propose to use GANs as a solution to cross-lingual voice conversion with limited data. To the best of our knowledge, this work is the first to study GANs in cross-lingual voice conversion.
URI: https://scholarbank.nus.edu.sg/handle/10635/164838
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: BERRAKSISMAN-PHD-THESIS.pdf
Size: 2.12 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.