Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/15849
Title: Lattice-based statistical spoken document retrieval
Authors: CHIA TEE KIAH @ XIE ZHIJIA
Keywords: information retrieval, language modeling, lattice-based spoken document retrieval, probabilistic IR, conversational telephone speech, query by example
Issue Date: 19-May-2009
Source: CHIA TEE KIAH @ XIE ZHIJIA (2009-05-19). Lattice-based statistical spoken document retrieval. ScholarBank@NUS Repository.
Abstract: Recent research efforts on spoken document retrieval (SDR) have tried to overcome the low quality of 1-best automatic speech recognition transcripts -- especially for conversational speech -- by using statistics derived from speech lattices containing multiple transcription hypotheses as output by a speech recognizer. However, these efforts have invariably used the classical vector space retrieval model. In this thesis, I present a lattice-based SDR method based on a statistical approach to information retrieval. I formulate a way to estimate statistical models for documents from expected word counts derived from lattices; query-document relevance is computed as a log probability under such models. Experiments show that my method outperforms statistical retrieval using 1-best transcripts, a recent lattice-based vector space method, and BM25 using lattice statistics. I also extend my proposed SDR method to the task of query-by-example SDR -- retrieving documents from a speech corpus, where the queries are themselves full-fledged spoken documents (query exemplars).
URI: http://scholarbank.nus.edu.sg/handle/10635/15849
Appears in Collections: Ph.D Theses (Open)
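
The abstract describes estimating a statistical model for each document from expected word counts derived from lattices, and scoring query-document relevance as a log probability under that model. The sketch below illustrates that idea in a minimal form. It is not the thesis's implementation: the unigram model, the Dirichlet-style smoothing with parameter `mu`, the `floor` probability, and the toy expected counts are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def collection_model(docs):
    """Pool expected word counts over all documents into a background unigram model."""
    totals = defaultdict(float)
    for counts in docs.values():
        for word, count in counts.items():
            totals[word] += count
    mass = sum(totals.values())
    return {word: count / mass for word, count in totals.items()}


def query_log_prob(query, doc_counts, background, mu=10.0, floor=1e-9):
    """Log probability of the query under a smoothed unigram document model.

    doc_counts holds expected (fractional) word counts, as one might obtain by
    summing word posteriors over a lattice. The smoothing scheme is an assumption.
    """
    doc_len = sum(doc_counts.values())
    score = 0.0
    for word in query:
        p_bg = background.get(word, floor)  # floor covers words unseen anywhere
        p = (doc_counts.get(word, 0.0) + mu * p_bg) / (doc_len + mu)
        score += math.log(p)
    return score


# Toy expected counts (made up): fractional values reflect competing hypotheses.
docs = {
    "call_a": {"flight": 0.9, "fight": 0.1, "delay": 1.6, "the": 3.0},
    "call_b": {"light": 0.7, "flight": 0.2, "bulb": 1.1, "the": 2.5},
}
background = collection_model(docs)
query = ["flight", "delay"]

scores = {name: query_log_prob(query, counts, background) for name, counts in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the counts are expectations rather than integers, a word that appears only in a lower-ranked recognition hypothesis still contributes some probability mass to the document model, which a 1-best transcript would discard.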
Files in This Item: ChiaTK.pdf (925.67 kB, Adobe PDF; access: OPEN)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.