Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/242654
Title: SELF-SUPERVISED MODELING FOR OPEN-DOMAIN DIALOGUE EVALUATION
Authors: ZHANG CHEN
ORCID iD: orcid.org/0000-0002-2406-8734
Keywords: Automatic Dialogue Evaluation, Open-Domain Dialogue, Self-Supervised Learning, Dialogue Systems, Interactive Evaluation, Multidimensional Evaluation
Issue Date: 4-Mar-2023
Citation: ZHANG CHEN (2023-03-04). SELF-SUPERVISED MODELING FOR OPEN-DOMAIN DIALOGUE EVALUATION. ScholarBank@NUS Repository.
Abstract: Mainstream open-domain dialogue (ODD) systems, designed for natural conversation on a wide array of topics, employ deep generative models. Evaluating these models' output, however, is an extremely complex task. The current gold standard is human evaluation, but its high cost, time intensiveness, lack of scalability, and irreproducibility often necessitate automatic evaluation as an alternative. In this thesis, we tackle three important challenges in automatic ODD evaluation: (1) multi-dimensional evaluation, (2) understanding and modeling multi-turn interaction, and (3) domain generalization, which collectively contribute to the ultimate goal of reliable, adaptable, and holistic automatic dialogue evaluation metrics. Our proposed metrics establish new state-of-the-art correlations with human evaluation at both the turn and dialogue levels, representing a significant step forward in our ability to evaluate dialogue systems and guide their development.
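
As context for the correlation-based evaluation the abstract refers to, below is a minimal sketch of how agreement between an automatic metric and human judgments is commonly quantified in this literature. The score lists are hypothetical placeholders, not data from the thesis.

```python
# Illustrative only: measuring metric-human agreement with Pearson and
# Spearman correlation, the standard meta-evaluation protocol for
# automatic dialogue metrics. Scores below are hypothetical.
from scipy.stats import pearsonr, spearmanr

# Hypothetical turn-level quality scores for the same set of responses.
metric_scores = [0.81, 0.42, 0.67, 0.15, 0.93]  # from an automatic metric
human_scores = [4.5, 2.0, 3.5, 1.0, 5.0]        # mean human ratings

# Pearson captures linear agreement; Spearman captures rank agreement.
r, _ = pearsonr(metric_scores, human_scores)
rho, _ = spearmanr(metric_scores, human_scores)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

The same computation applies at the dialogue level by aggregating turn scores (or scoring whole dialogues directly) before correlating with dialogue-level human ratings.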
URI: https://scholarbank.nus.edu.sg/handle/10635/242654
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: ZhangC.pdf | Size: 2.43 MB | Format: Adobe PDF | Access Settings: OPEN | Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.