This page contains links to software and data sets that I maintain.
- Rampion, a framework for training statistical machine translation models (coming soon!)
- Code for inference in monolingual and bilingual gappy pattern models
[link] [sample patterns]
- Code to find trigger word pairs using mutual information (reimplementation of Rosenfeld, 1994)
[code]
- Twitter part-of-speech tagger and a corpus of tweets manually annotated with POS tags
[link]
- Corpus of movie critic reviews and opening weekend revenues
[link]
- Scripts for performing bootstrap resampling for BLEU significance testing
[link]