Publication

Character-level machine translation evaluation for languages with ambiguous word boundaries

Abstract
In this work, we introduce the TESLA-CELAB metric (Translation Evaluation of Sentences with Linear-programming-based Analysis - Character-level Evaluation for Languages with Ambiguous word Boundaries) for automatic machine translation evaluation. For languages such as Chinese, where words usually have meaningful internal structure and word boundaries are often fuzzy, TESLA-CELAB acknowledges the advantage of character-level evaluation over word-level evaluation. By reformulating the problem in the linear programming framework, TESLA-CELAB addresses several drawbacks of the character-level metrics, in particular the modeling of synonyms spanning multiple characters. We show empirically that TESLA-CELAB significantly outperforms character-level BLEU in the English-Chinese translation evaluation tasks. © 2012 Association for Computational Linguistics.
Source Title
50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference
Organizational Unit
COMPUTER SCIENCE
Date
2012
Type
Conference Paper