Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/243782
Title: EMOTION MODELLING FOR SPEECH GENERATION
Authors: ZHOU KUN
Keywords: emotional speech generation, speech synthesis, voice conversion, speech emotion, emotional intelligence, emotion modelling
Issue Date: 20-Dec-2022
Citation: ZHOU KUN (2022-12-20). EMOTION MODELLING FOR SPEECH GENERATION. ScholarBank@NUS Repository.
Abstract: Speech generation aims to synthesize human-like voices from text or speech input. Current speech generation techniques can produce natural-sounding speech but do not convey the emotional context found in human-human interaction. This thesis focuses on emotion modelling for speech generation to facilitate more natural human-computer interaction. Despite significant research efforts, several open issues remain, such as the limited generalizability and controllability of generated emotions, which restrict the scope of applications. This thesis aims to overcome the limitations of existing approaches and bring speech generation one step closer to emotional intelligence by: 1) improving the generalizability of emotion modelling for seen and unseen speakers and emotions; 2) studying sequence-to-sequence emotion modelling to enable spectrum and duration manipulation; 3) explicitly modelling and controlling emotion intensity; and 4) synthesizing and controlling the rendering of mixed emotions.
URI: https://scholarbank.nus.edu.sg/handle/10635/243782
Appears in Collections: Ph.D Theses (Open)
Files in This Item:
File | Description | Size | Format | Access Settings | Version
---|---|---|---|---|---
ZhouKUN.pdf | | 8.68 MB | Adobe PDF | OPEN | None
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.