Title: Soft pattern matching models for definitional question answering

Keywords: Definitional question answering

Citation: Cui, H., Kan, M.-Y., Chua, T.-S. (2007). Soft pattern matching models for definitional question answering. ACM Transactions on Information Systems 25 (2). ScholarBank@NUS Repository. https://doi.org/10.1145/1229179.1229182

Abstract: We explore probabilistic lexico-syntactic pattern matching, also known as soft pattern matching, in a definitional question answering system. Most current systems use regular expression-based hard matching patterns to identify definition sentences. Such rigid surface matching often fares poorly when faced with language variations. We propose two soft matching models to address this problem: one based on bigrams and the other on the Profile Hidden Markov Model (PHMM). Both models provide a theoretically sound method to model pattern matching as a probabilistic process that generates token sequences. We demonstrate the effectiveness of the models on definition sentence retrieval for definitional question answering. We show that both models significantly outperform the state-of-the-art manually constructed hard matching patterns on recent TREC data. A critical difference between the two models is that the PHMM has a more complex topology. We experimentally show that the PHMM can handle language variations more effectively but requires more training data to converge. While we evaluate soft pattern models only on definitional question answering, we believe that both models are generic and can be extended to other areas where lexico-syntactic pattern matching can be applied. © 2007 ACM.

Source Title: ACM Transactions on Information Systems

Appears in Collections: Staff Publications
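The abstract describes soft pattern matching as a probabilistic process that generates token sequences, with one model built on bigrams. As a rough illustration of that idea (a minimal sketch, not the authors' actual formulation; the class name, smoothing scheme, and slot tokens such as `<T>` are assumptions for illustration), a bigram model can be trained on token sequences drawn from definition sentences and then used to score how well a candidate sentence's token sequence matches the learned pattern:

```python
# Minimal sketch of a bigram soft-pattern scorer. This is an illustrative
# assumption, not the paper's exact model: a smoothed bigram language model
# over pattern tokens, where higher log-probability means the candidate
# token sequence better fits the patterns seen in training.
import math
from collections import Counter


class BigramSoftPattern:
    def __init__(self, alpha=0.1):
        self.alpha = alpha          # add-alpha smoothing constant
        self.unigrams = Counter()   # counts of history tokens
        self.bigrams = Counter()    # counts of (prev, next) token pairs
        self.vocab = set()

    def train(self, sequences):
        """Count bigrams over training token sequences (e.g. tokens around
        a question-target slot in known definition sentences)."""
        for seq in sequences:
            tokens = ["<s>"] + list(seq)  # <s> marks the sequence start
            for prev, nxt in zip(tokens, tokens[1:]):
                self.unigrams[prev] += 1
                self.bigrams[(prev, nxt)] += 1
                self.vocab.update((prev, nxt))

    def log_prob(self, seq):
        """Smoothed log-probability of generating the token sequence."""
        tokens = ["<s>"] + list(seq)
        vocab_size = len(self.vocab) or 1
        lp = 0.0
        for prev, nxt in zip(tokens, tokens[1:]):
            num = self.bigrams[(prev, nxt)] + self.alpha
            den = self.unigrams[prev] + self.alpha * vocab_size
            lp += math.log(num / den)
        return lp


# Example: "<T>" is a hypothetical placeholder for the question target.
model = BigramSoftPattern()
model.train([
    ["<T>", "is", "a", "NP"],
    ["<T>", "is", "a", "NP"],
    ["<T>", ",", "a", "NP", ","],
])
# A sequence resembling the training patterns scores higher than a
# scrambled one, so matching degrades gracefully instead of failing
# outright as a hard regular-expression pattern would.
good = model.log_prob(["<T>", "is", "a", "NP"])
bad = model.log_prob(["NP", "a", "is", "<T>"])
```

Unlike this flat bigram chain, the PHMM variant described in the abstract adds match, insert, and delete states, which is what lets it absorb insertions and omissions in the token sequence at the cost of needing more training data.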