Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/41304
Title: A hybrid morpheme-word representation for machine translation of morphologically rich languages
Authors: Luong, M.-T.; Nakov, P.; Kan, M.-Y.
Issue Date: 2010
Source: Luong, M.-T., Nakov, P., Kan, M.-Y. (2010). A hybrid morpheme-word representation for machine translation of morphologically rich languages. EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference: 148-157. ScholarBank@NUS Repository.
Abstract: We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process. Our model extends the classic phrase-based model by means of (1) word boundary-aware morpheme-level phrase extraction, (2) minimum error-rate training for a morpheme-level translation model using word-level BLEU, and (3) joint scoring with morpheme- and word-level language models. Further improvements are achieved by combining our model with the classic one. The evaluation on English to Finnish using Europarl (714K sentence pairs; 15.5M English words) shows statistically significant improvements over the classic model based on BLEU and human judgments. © 2010 Association for Computational Linguistics.
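Illustration (not part of the original record): the abstract describes tuning a morpheme-level model against word-level BLEU, which implies that morpheme output must be reassembled into words before scoring. The sketch below shows one way that step could look. The trailing "+" continuation marker, the function name join_morphemes, and the Finnish example segmentation are all assumptions made for illustration; the record does not specify the authors' segmentation format or implementation.

```python
def join_morphemes(tokens, marker="+"):
    """Merge a morpheme sequence back into words.

    Assumed (hypothetical) convention: a token that ends with `marker`
    continues into the next token of the same word; a token without the
    marker closes the current word.
    """
    words = []
    current = ""
    for tok in tokens:
        if tok.endswith(marker):
            # Word-internal morpheme: strip the marker and keep building.
            current += tok[: -len(marker)]
        else:
            # Word-final morpheme: emit the completed word.
            words.append(current + tok)
            current = ""
    if current:
        # A dangling continuation at the end is emitted as its own word.
        words.append(current)
    return words


# Hypothetical segmentation of "taloissa on kylmä".
morphemes = ["talo+", "i+", "ssa", "on", "kylmä"]
print(join_morphemes(morphemes))
```

Under this assumed convention, word boundaries remain recoverable from the morpheme stream at any point, so word-level metrics can be computed on the joined output while translation itself operates on morphemes.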
Source Title: EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
URI: http://scholarbank.nus.edu.sg/handle/10635/41304
ISBN: 1932432868
Appears in Collections: Staff Publications
