Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/247650
Title: SELF-SUPERVISED MODELING FOR MULTI-MODAL UNDERSTANDING
Authors: YUE XIANGHU
ORCID iD: orcid.org/0000-0003-3527-6034
Keywords: Self-supervised learning; multimodal; unsupervised learning; pre-training
Issue Date: 29-Sep-2023
Citation: YUE XIANGHU (2023-09-29). SELF-SUPERVISED MODELING FOR MULTI-MODAL UNDERSTANDING. ScholarBank@NUS Repository.
Abstract: Humans perceive information from the surrounding environment through multiple mediums and use it to understand and interact with the world. These multimodal cues offer different but complementary information. Self-supervised learning has emerged as a promising approach for learning meaningful representations from individual modalities, including text, speech, and vision. In this thesis, we aim to leverage self-supervised pre-training techniques for multimodal processing, and we approach this goal step by step through several works. Starting from a traditional unimodal understanding task, speech recognition, the first work focuses on remedying the code-switching problem. Learning purely from labeled examples does not resemble language acquisition in humans, so the second work focuses on learning speech representations from unlabeled speech data. The third work takes the universality of self-supervised pre-training one step further by unifying speech and text pre-training within a single model. Finally, the fourth work attempts to build a unified audio-visual-text model to enable various multimodal understanding tasks.
URI: https://scholarbank.nus.edu.sg/handle/10635/247650
Appears in Collections: Ph.D Theses (Open)
Files in This Item:
File | Description | Size | Format | Access Settings | Version
---|---|---|---|---|---
YueXianghu.pdf | | 3.33 MB | Adobe PDF | OPEN | None
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.