Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/230998
Title: LINGUISTICALLY-INCLUSIVE NATURAL LANGUAGE PROCESSING
Authors: TAN MIN RONG SAMSON
ORCID iD: orcid.org/0000-0003-1019-8228
Keywords: adversarial, robustness, natural language processing, machine learning, sociolinguistic variation, reliability
Issue Date: 3-Mar-2022
Citation: TAN MIN RONG SAMSON (2022-03-03). LINGUISTICALLY-INCLUSIVE NATURAL LANGUAGE PROCESSING. ScholarBank@NUS Repository.
Abstract: Language is a largely social construct, shaped by each community's lived experiences, culture, and language repertoire. However, current natural language processing (NLP) systems fail to account for sociolinguistic variation: common NLP practices implicitly assume that all speakers of a language speak a single, "standard" version. This is damaging to minority language varieties, perpetuating the perception that they are "ungrammatical" and "incorrect". Failing to address this gap predisposes NLP systems to discriminate against minority language communities. This discrimination can take the form of disproportionately poor performance or the encoding of harmful stereotypes. Hence, this thesis focuses on the issues surrounding sociolinguistic generalization, defined as an NLP system's ability to generalize beyond the language variety it was trained on. In some situations, this can be viewed as the ability to be robust to sociolinguistic variation. Using adversarial attacks, we reveal the linguistic biases of existing NLP models and design methods to mitigate them. We conclude by generalizing the prior adversarial attacks into a framework for testing NLP system reliability in the presence of language variation. Language technology is often hailed as an avenue for improving technological accessibility. This thesis strives for a world in which NLP works not only for the privileged but for everyone.
URI: https://scholarbank.nus.edu.sg/handle/10635/230998
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: samsontmr.pdf
Size: 15.58 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.