Software
This software is maintained by me, my students, and many collaborators around the world.
cdec - a fast, mature decoder, aligner, and modeling toolkit for statistical machine translation and similar structured prediction problems.
creg - a small and fast toolkit for large-scale linear, logistic, and ordinal regression modeling.
GIZA++ and mkcls - venerable MT tools, originally written by Franz Och, that I have since adopted.
Data
Korean-English Wikipedia Titles - a parallel corpus of Wikipedia titles from January 2012.
Chinese-English place names - a parallel corpus of Chinese place names from Wikipedia.