Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/248162
Title: CODE-SWITCHING DETECTION TECHNIQUES AND LANGUAGE MODELING STRATEGIES FOR AUTOMATIC SPEECH RECOGNITION
Authors: WANG QINYI
ORCID iD: orcid.org/0000-0001-9858-1170
Keywords: speech recognition, language model, code-switching, language identification, transformer, semi-supervised learning
Issue Date: 5-Aug-2023
Citation: WANG QINYI (2023-08-05). CODE-SWITCHING DETECTION TECHNIQUES AND LANGUAGE MODELING STRATEGIES FOR AUTOMATIC SPEECH RECOGNITION. ScholarBank@NUS Repository.
Abstract: Automatic speech recognition (ASR) systems have become an essential part of our daily lives. The advent of deep learning has brought about end-to-end ASR models, revolutionizing the field and achieving impressive performance. However, developing robust end-to-end ASR systems faces a significant hurdle: the scarcity of paired speech-text data. Additionally, the practice of code-switching (CS) adds further complexity to speech recognition. This thesis aims to address these challenges by proposing code-switching detection and language modeling strategies that enhance the performance of code-switching and end-to-end ASR systems. We first introduce a novel code-switching detection method to mitigate the over-switching issue in traditional CS ASR systems. We then introduce a speech-and-text Transformer that leverages unpaired text data and addresses the catastrophic forgetting and model capacity gap problems prevalent in pre-training methods. Lastly, we address language confusion in end-to-end CS ASR systems by leveraging text-derived language identities.
URI: https://scholarbank.nus.edu.sg/handle/10635/248162
Appears in Collections: Ph.D Theses (Open)
Files in This Item:
File: Qinyi_Thesis.pdf
Size: 4.05 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.