This software was developed by me, my research group, and collaborators around the world.
- cdec - a fast, mature decoder, alignment, and modeling toolkit for statistical machine translation and similar structure-prediction problems.
- The CMU cross-lingual metaphor detector - a toolkit for identifying instances of figurative language in English and any other language for which a bilingual dictionary is available.
- fast_align - a very fast, but pretty effective, unsupervised bilingual word aligner.
- creg - a small and fast toolkit for large-scale linear, logistic, and ordinal regression modeling.
- English adjective supersenses - a 13-class supersense taxonomy of English adjectives developed by Yulia Tsvetkov.
- Korean-English Wikipedia Titles - a parallel corpus of Wikipedia titles from January 2012.
- Chinese-English place names - a parallel corpus of Chinese place names from Wikipedia.