Syntax Augmented Machine Translation via Chart Parsing
Our open-source SAMT system consists of three parts:
- Extraction of statistical translation rules from a training corpus; either plain hierarchical rules à la Chiang (2005) or syntax-augmented rules à la Zollmann & Venugopal (2006).
- CKY+ style chart parser employing the statistical translation rules to translate test sentences
  - Fast C++ code: translates the 2000 (realtest) sentences of the Europarl French-English data in approx. 40 min, i.e., 46 sentences per minute, achieving state-of-the-art scores
  - Implements CKY+ for internal binarization during parsing
  - Can efficiently handle thousands of non-terminal categories
  - Performs LM intersection with the grammar at run-time, or optionally uses future cost estimates for LM cost, producing state-of-the-art scores
- A minimum-error-rate optimization and scoring tool (integrated into the chart parser) to tune the parameters of the underlying log-linear model on a held-out development corpus
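To give a feel for how a chart parser can apply translation rules under a log-linear model, here is a minimal Python sketch. It is not the SAMT code (which is C++) and does not reproduce its algorithms: it runs plain CKY over a grammar that is already binary, and it leaves out CKY+ binarization, LM intersection, and pruning. The toy grammar, the feature names `rule_logprob` and `word_penalty`, the weights, and the function `translate` are all assumptions made up for illustration.

```python
import math

# Hypothetical log-linear feature weights (in practice these would be tuned,
# e.g. by minimum-error-rate training on a development corpus).
WEIGHTS = {"rule_logprob": 1.0, "word_penalty": -0.1}

# Hypothetical lexical rules: source word -> [(category, target words, log-prob)]
LEXICAL = {
    "la": [("DT", ["the"], math.log(0.9))],
    "maison": [("NN", ["house"], math.log(0.8))],
    "bleue": [("JJ", ["blue"], math.log(0.9))],
}

# Hypothetical binary rules: (lhs, left child, right child, swap target order?, log-prob)
BINARY = [
    ("NP", "DT", "NB", False, math.log(0.9)),
    ("NB", "NN", "JJ", True, math.log(0.7)),
    ("NB", "NN", "JJ", False, math.log(0.1)),
]


def score(features):
    """Log-linear model score: weighted sum of feature values."""
    return sum(WEIGHTS[name] * value for name, value in features.items())


def translate(source_words, goal="NP"):
    """Return (translation, model score) of the best derivation, or None."""
    n = len(source_words)
    if n == 0:
        return None
    # chart[(i, j)][category] = (score, target words) of the best item
    # covering source_words[i:j] with that category.
    chart = {}
    for i, word in enumerate(source_words):
        cell = {}
        for lhs, target, logp in LEXICAL.get(word, []):
            s = score({"rule_logprob": logp, "word_penalty": len(target)})
            if lhs not in cell or s > cell[lhs][0]:
                cell[lhs] = (s, target)
        chart[(i, i + 1)] = cell
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):  # split point between the two children
                left, right = chart[(i, k)], chart[(k, j)]
                for lhs, b, c, swap, logp in BINARY:
                    if b in left and c in right:
                        s = left[b][0] + right[c][0] + score(
                            {"rule_logprob": logp, "word_penalty": 0}
                        )
                        if swap:
                            target = right[c][1] + left[b][1]
                        else:
                            target = left[b][1] + right[c][1]
                        if lhs not in cell or s > cell[lhs][0]:
                            cell[lhs] = (s, target)
            chart[(i, j)] = cell
    best = chart[(0, n)].get(goal)
    return None if best is None else (" ".join(best[1]), best[0])


print(translate(["la", "maison", "bleue"]))
```

Keeping only the best item per category and span is exact here because no language model is involved; once LM costs depend on neighboring target words, a real decoder has to track more state per chart item.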
The system is available as open source under the GNU General Public License. Click here to download it.
(Library LGPL version [needed if used for commercial purposes, no support provided]: here.)
Documentation for SAMT is available from the following sources:
- Readme.html documentation: detailed instructions on installing the system and running through a quick-start example.
- Detailed technical overview at the top of FastTranslateChart.cc, which complements the published work
- Doxygen comments on classes/functions, plus detailed notes in the code
- The samt-technical mailing list (see below), for all the points we forgot to explain fully
We will be updating the SAMT system regularly. We have created the following Google groups to manage announcements and host technical discussions regarding the system.
- samt-announce: to receive information on major updates.
- samt-technical: to participate in technical discussion regarding the SAMT system. Get your compiling / running / theory questions answered here.
Of course, you can also email us directly: {zollmann or ashishv} (at) cs.cmu.edu
InterACT homepage
Andreas's homepage
Ashish's homepage