A statistical language modeling approach to lattice-based spoken document retrieval | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/41490

Title:	A statistical language modeling approach to lattice-based spoken document retrieval
Authors:	Chia, T.K.; Li, H.; Ng, H.T.
Issue Date:	2007
Citation:	Chia, T.K., Li, H., Ng, H.T. (2007). A statistical language modeling approach to lattice-based spoken document retrieval. EMNLP-CoNLL 2007 - Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning: 810-818. ScholarBank@NUS Repository.
Abstract:	Speech recognition transcripts are far from perfect; they are not of sufficient quality to be useful on their own for spoken document retrieval. This is especially the case for conversational speech. Recent efforts have tried to overcome this issue by using statistics from speech lattices instead of only the 1-best transcripts; however, these efforts have invariably used the classical vector space retrieval model. This paper presents a novel approach to lattice-based spoken document retrieval using statistical language models: a statistical model is estimated for each document, and probabilities derived from the document models are directly used to measure relevance. Experimental results show that the lattice-based language modeling method outperforms both the language modeling retrieval method using only the 1-best transcripts and a recently proposed lattice-based vector space retrieval method. © 2007 Association for Computational Linguistics.
Source Title:	EMNLP-CoNLL 2007 - Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
URI:	http://scholarbank.nus.edu.sg/handle/10635/41490
Appears in Collections:	Staff Publications
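
The idea summarized in the abstract (estimate a statistical model per document, then score relevance with probabilities from that model) can be illustrated with a minimal query-likelihood sketch. This is not the authors' implementation: the record does not give their estimation or smoothing details, so the linear-interpolation smoothing, the weight `lam`, the function names, and the toy expected counts below are all assumptions made for illustration. The only lattice-specific element is that word counts are fractional expected counts rather than integer counts from a 1-best transcript.

```python
import math


def collection_model(docs):
    """Pool the expected counts of all documents into a background unigram model."""
    totals = {}
    for counts in docs.values():
        for word, c in counts.items():
            totals[word] = totals.get(word, 0.0) + c
    grand_total = sum(totals.values())
    return {word: c / grand_total for word, c in totals.items()}


def doc_model(expected_counts, background, lam=0.5):
    """Return P(word | document), smoothed by interpolation (assumed scheme)."""
    total = sum(expected_counts.values())

    def prob(word):
        p_doc = expected_counts.get(word, 0.0) / total if total > 0 else 0.0
        return lam * p_doc + (1.0 - lam) * background.get(word, 0.0)

    return prob


def query_log_likelihood(query, prob):
    """Sum of log P(word | document) over the query words."""
    score = 0.0
    for word in query:
        p = prob(word)
        if p <= 0.0:
            return float("-inf")  # word unseen in the whole collection
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5):
    """Return (doc_id, score) pairs sorted from most to least relevant."""
    background = collection_model(docs)
    scored = [
        (doc_id, query_log_likelihood(query, doc_model(counts, background, lam)))
        for doc_id, counts in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical expected word counts, as might be derived from lattice posteriors.
docs = {
    "d1": {"budget": 0.9, "meeting": 1.0},
    "d2": {"budget": 0.2, "weather": 1.5},
}
ranking = rank(["budget"], docs)
```

With these toy counts, `d1` ranks above `d2` for the query `budget`, because more of its expected count mass falls on that word. A 1-best baseline would use the same scoring with integer counts taken from the single best transcript.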

