Language Technologies Institute
Student Research Symposium 2006

When Does Multi-Engine Machine Translation Work and Why?

Greg Hanneman

The goal of multi-engine machine translation (MEMT) is to combine output from several machine translation (MT) systems into a single higher-quality translation. Some approaches developed over the last decade have required the MT systems to share data structures, such as charts or lattices [1]. Others have built on the confusion network model used in speech recognition [2], which allows word- and phrase-level alternatives but does not allow reordering. Our MEMT system [3], developed over the past two years, requires no cooperation or shared data from the individual MT engines and allows some reordering of translation elements.

Given target-language output from a set of independent MT engines all processing a common source text, the CMU MEMT engine creates a new synthetic translation by mixing elements of the various target-language outputs. Words from the original systems are first aligned with each other using exact, stem, and synonym matches; the system then attempts to create forced or "artificial" alignments for remaining unaligned words based on their parts of speech. The alignment information is passed to the MEMT decoder, which explores the search space of combination hypotheses by iteratively selecting unused words from the original systems. Each hypothesis word is assigned a language model probability score and an alignment score based on the global confidence values of the original system(s) that produced the word. Total scores are accumulated for each hypothesis, and the highest-ranking completed hypothesis is returned as the MEMT output.
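The two stages above (word alignment across system outputs, then confidence-weighted hypothesis scoring) can be sketched in a few lines. This is a minimal illustration, not the CMU implementation: the synonym table, stemmer, unigram language model, and confidence values are invented stand-ins, and the real system uses POS-based forced alignments and a full decoder search rather than scoring isolated hypotheses.

```python
import math
from itertools import product

# Toy stand-ins (assumptions) for the real system's resources:
# a synonym table, a unigram language model, and per-system confidences.
SYNONYMS = {("big", "large"), ("car", "automobile")}
UNIGRAM_LM = {"the": 0.05, "big": 0.01, "large": 0.008,
              "car": 0.02, "automobile": 0.004}

def crude_stem(word):
    """Crude stemmer stand-in: strip a few common English suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def words_match(w1, w2):
    """Exact, stem, or synonym match between two output words."""
    if w1 == w2 or crude_stem(w1) == crude_stem(w2):
        return True
    return (w1, w2) in SYNONYMS or (w2, w1) in SYNONYMS

def align(outputs):
    """Pairwise alignment links ((sys_i, pos_i), (sys_j, pos_j))
    between the outputs of every pair of MT systems."""
    links = set()
    for (i, out_i), (j, out_j) in product(enumerate(outputs), repeat=2):
        if i >= j:
            continue  # visit each unordered pair of systems once
        for pi, wi in enumerate(out_i):
            for pj, wj in enumerate(out_j):
                if words_match(wi, wj):
                    links.add(((i, pi), (j, pj)))
    return links

def score_hypothesis(hyp, confidence):
    """Accumulate a language model log-probability plus an alignment
    score weighted by the confidence of each word's source system.
    hyp is a list of (word, system_index) pairs."""
    lm_score = sum(math.log(UNIGRAM_LM.get(w, 1e-6)) for w, _ in hyp)
    conf_score = sum(confidence[s] for _, s in hyp)
    return lm_score + conf_score

outputs = [["the", "big", "car"], ["the", "large", "automobile"]]
print(align(outputs))
print(score_hypothesis([("the", 0), ("big", 0), ("car", 1)],
                       confidence={0: 0.4, 1: 0.6}))
```

In the full decoder, partial hypotheses would be extended word by word from the aligned outputs, with each extension scored this way and the best completed hypothesis returned.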

We have experimented with the MEMT system on a variety of test data sets and original MT systems. In this talk, I will briefly describe our MEMT algorithm and then summarize results obtained over the last 18 months. By comparing MEMT performance on different sets of MT engines and under different conditions, I will point out advantages and limitations of our current system. In particular, I will explore the effects of using MT systems from different paradigms (statistical, rule-based, example-based, etc.) and of differing strengths (as measured by automatic scoring metrics). If time permits, I will also introduce some current work on an improved language model scoring system designed to boost the MEMT system's discriminative power.

[1] Frederking, R. and S. Nirenburg, "Three Heads are Better than One," Proceedings of the Fourth Conference on Applied Natural Language Processing, 1994.

[2] Woodland, P., "Hypothesis Selection Using Consensus Networks for MT System Combination," presentation at NIST Machine Translation Workshop, Sept. 7, 2006.

[3] Jayaraman, S. and A. Lavie, "Multi-Engine Machine Translation Guided by Explicit Word Matching," Proceedings of the 10th Annual Conference of the European Association for Machine Translation, 2005.