Title: SELF-SUPERVISED MODELING FOR OPEN-DOMAIN DIALOGUE EVALUATION
Authors: ZHANG CHEN
ORCID iD: orcid.org/0000-0002-2406-8734
Keywords: Automatic Dialogue Evaluation; Open-Domain Dialogue; Self-Supervised Learning; Dialogue Systems; Interactive Evaluation; Multidimensional Evaluation
Issue Date: 4-Mar-2023
Citation: ZHANG CHEN (2023-03-04). SELF-SUPERVISED MODELING FOR OPEN-DOMAIN DIALOGUE EVALUATION. ScholarBank@NUS Repository.
Abstract: Mainstream open-domain dialogue (ODD) systems, designed for natural conversations on a wide array of topics, employ deep generative models. Evaluating these models' output, however, is an extremely complex task. The current gold standard is human evaluation, but its high cost, time-intensiveness, lack of scalability, and irreproducibility often necessitate the use of automatic evaluation as an alternative. In this thesis, we tackle three important challenges in the field of automatic ODD evaluation: (1) multi-dimensional evaluation, (2) understanding and modeling multi-turn interaction, and (3) domain generalization, which collectively contribute to the ultimate goal of creating reliable, adaptable, and holistic automatic dialogue evaluation metrics. By establishing new state-of-the-art correlations with human evaluation at both the turn and dialogue levels, our proposed automatic metrics represent a significant step forward in our ability to evaluate dialogue systems and guide their development.
URI: https://scholarbank.nus.edu.sg/handle/10635/242654
Appears in Collections: Ph.D Theses (Open)
Files in This Item: ZhangC.pdf (2.43 MB, Adobe PDF)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.