Please use this identifier to cite or link to this item:
https://doi.org/10.1145/1390334.1390397
Title: | A lattice-based approach to query-by-example spoken document retrieval |
Authors: | Chia, T.K.; Sim, K.C.; Li, H.; Ng, H.T. |
Keywords: | Algorithms; Experimentation; Performance; Theory |
Issue Date: | 2008 |
Citation: | Chia, T.K., Sim, K.C., Li, H., Ng, H.T. (2008). A lattice-based approach to query-by-example spoken document retrieval. ACM SIGIR 2008 - 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Proceedings: 363-370. ScholarBank@NUS Repository. https://doi.org/10.1145/1390334.1390397 |
Abstract: | Recent efforts on the task of spoken document retrieval (SDR) have made use of speech lattices: speech lattices contain information about alternative speech transcription hypotheses other than the 1-best transcripts, and this information can improve retrieval accuracy by overcoming recognition errors present in the 1-best transcription. In this paper, we look at using lattices for the query-by-example spoken document retrieval task - retrieving documents from a speech corpus, where the queries are themselves in the form of complete spoken documents (query exemplars). We extend a previously proposed method for SDR with short queries to the query-by-example task. Specifically, we use a retrieval method based on statistical modeling: we compute expected word counts from document and query lattices, estimate statistical models from these counts, and compute relevance scores as divergences between these models. Experimental results on a speech corpus of conversational English show that the use of statistics from lattices for both documents and query exemplars results in better retrieval accuracy than using only 1-best transcripts for either documents, or queries, or both. In addition, we investigate the effect of stop word removal which further improves retrieval accuracy. To our knowledge, our work is the first to have used a lattice-based approach to query-by-example spoken document retrieval. Copyright 2008 ACM. |
Source Title: | ACM SIGIR 2008 - 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Proceedings |
URI: | http://scholarbank.nus.edu.sg/handle/10635/40321 |
ISBN: | 9781605581644 |
DOI: | 10.1145/1390334.1390397 |
Appears in Collections: | Staff Publications |
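
The retrieval pipeline summarized in the abstract (expected word counts, statistical models estimated from those counts, relevance scored as a divergence between models) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, purely for illustration, that each lattice has already been expanded into a small list of transcription hypotheses with posterior probabilities, that the statistical models are unigram models smoothed by linear interpolation with a background model, and that the divergence is the Kullback-Leibler divergence from the query model to the document model. All function names and the smoothing weight `lam` are hypothetical.

```python
import math
from collections import Counter


def expected_counts(hypotheses):
    """Expected word counts over (posterior, words) hypotheses.

    Assumption: the hypothesis list stands in for a lattice and the
    posteriors sum to 1.
    """
    counts = Counter()
    for posterior, words in hypotheses:
        for word in words:
            counts[word] += posterior
    return counts


def unigram_model(counts, background, lam):
    """Interpolate the maximum-likelihood unigram model with a background model."""
    total = sum(counts.values())
    model = {}
    for word, bg_prob in background.items():
        ml_prob = counts.get(word, 0.0) / total if total > 0 else 0.0
        model[word] = (1.0 - lam) * ml_prob + lam * bg_prob
    return model


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0)


def rank_documents(query_hyps, docs_hyps, lam=0.5):
    """Return (doc_id, score) pairs sorted by score = -KL(query || document)."""
    q_counts = expected_counts(query_hyps)
    d_counts = {doc_id: expected_counts(h) for doc_id, h in docs_hyps.items()}

    # Background model pooled over the query and all documents (an assumption),
    # so every vocabulary word has non-zero probability after smoothing.
    pooled = Counter(q_counts)
    for counts in d_counts.values():
        pooled.update(counts)
    pooled_total = sum(pooled.values())
    background = {w: c / pooled_total for w, c in pooled.items()}

    q_model = unigram_model(q_counts, background, lam)
    scores = {
        doc_id: -kl_divergence(q_model, unigram_model(counts, background, lam))
        for doc_id, counts in d_counts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up toy data: two competing hypotheses per spoken item.
    query = [(0.6, ["cheap", "flights"]), (0.4, ["cheap", "fights"])]
    docs = {
        "d1": [(0.7, ["cheap", "flights", "today"]),
               (0.3, ["sheep", "flights", "today"])],
        "d2": [(1.0, ["weather", "report", "today"])],
    }
    for doc_id, score in rank_documents(query, docs):
        print(doc_id, round(score, 4))
```

In this toy data, `d1` shares words with the query under both of its hypotheses, so it ranks above `d2`. Restricting each hypothesis list to its single most probable entry would mimic the 1-best baseline that the abstract compares against.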