A hybrid morpheme-word representation for machine translation of morphologically rich languages | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/41304

DC Field	Value
dc.title	A hybrid morpheme-word representation for machine translation of morphologically rich languages
dc.contributor.author	Luong, M.-T.
dc.contributor.author	Nakov, P.
dc.contributor.author	Kan, M.-Y.
dc.date.accessioned	2013-07-04T08:24:23Z
dc.date.available	2013-07-04T08:24:23Z
dc.date.issued	2010
dc.identifier.citation	Luong, M.-T., Nakov, P., Kan, M.-Y. (2010). A hybrid morpheme-word representation for machine translation of morphologically rich languages. EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference: 148-157. ScholarBank@NUS Repository.
dc.identifier.isbn	1932432868
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/41304
dc.description.abstract	We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process. Our model extends the classic phrase-based model by means of (1) word boundary-aware morpheme-level phrase extraction, (2) minimum error-rate training for a morpheme-level translation model using word-level BLEU, and (3) joint scoring with morpheme- and word-level language models. Further improvements are achieved by combining our model with the classic one. The evaluation on English to Finnish using Europarl (714K sentence pairs; 15.5M English words) shows statistically significant improvements over the classic model based on BLEU and human judgments. © 2010 Association for Computational Linguistics.
dc.source	Scopus
dc.type	Conference Paper
dc.contributor.department	COMPUTER SCIENCE
dc.description.sourcetitle	EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
dc.description.page	148-157
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Staff Publications

Files in This Item:

There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.