Course Information
The LTI colloquium is a series of talks related
to language technologies. Topics include, but are not restricted
to, Computational Linguistics, Machine Translation, Speech Recognition
and Synthesis, Information Retrieval, Computational Biology, Machine
Learning, Text Mining, Knowledge Representation, Computer-Assisted
Language Learning, and Intelligent Language Tutoring. To get credit for
the course, students are required to write either a short critique of
one of the presentations or a comparison of two presentations.
| Time: | Fridays 2:30-4:00pm |
| Location: | Doherty Hall 2315 |
| Instructor: | Roni Rosenfeld, roni (at) cs.cmu.edu |
| TA: | Reyyan Yeniterzi, reyyan (at) cs.cmu.edu |
| Course Secretary: | Kate Schaich, kschaich (at) cs.cmu.edu |
Upcoming Talk
December 7, Friday, 2:30pm
Doherty Hall 2315
Anoop Sarkar
Simon Fraser University

Ensemble Decoding for Statistical Machine Translation
Statistical machine translation is often faced with the problem of combining data from many diverse sources into a single translation model. In this talk we introduce a novel approach called ensemble decoding that combines multiple translation models during the process of translation. We show that this technique is applicable in many diverse areas in machine translation:
(a) Domain adaptation is needed when the training data is from a different domain than the test data. We show that ensemble decoding can effectively combine out-of-domain and in-domain translation models.
(b) Multi-metric optimization modifies discriminative training for machine translation to prefer Pareto-optimal points with respect to multiple evaluation measures. We use ensemble decoding to combine the Pareto-optimal weight vectors obtained in multi-metric optimization. Furthermore, the ensemble weights are tuned to prefer Pareto-optimal solutions.
(c) In translation out of resource-poor languages, a pivot language is often used to augment the translation model from source to target. Ensemble models provide a novel way to combine the direct translation model (from source to target) and the pivot model (from source to pivot to target).
Bio: Anoop Sarkar is an Associate Professor at Simon Fraser University in British Columbia, Canada, where he co-directs the Natural Language Laboratory (http://natlang.cs.sfu.ca). He received his Ph.D. from the Department of Computer and Information Sciences at the University of Pennsylvania under Prof. Aravind Joshi for his work on semi-supervised statistical parsing using tree-adjoining grammars.
His research focuses on statistical parsing and machine translation, in the areas of syntax and morphology in MT, semi-supervised learning, and domain adaptation. His interests also include formal language theory and stochastic grammars, in particular tree automata and tree-adjoining grammars.
Schedule
| Date | Speaker | Host | Title of the Talk |
| Aug 31 | Chris Dyer, CMU | - | Feature-Rich Latent Variable Models for Statistical Machine Translation |
| Sep 7 | Alan Black, CMU | - | Statistical Parametric Speech Synthesis |
| Sep 14 | Eduard Hovy, CMU | - | NLP: Its Past and 3.5 Possible Futures |
| Sep 21 | Joakim Nivre, Uppsala University / Google | Chris | Beyond MaltParser -- Advances in Transition-Based Dependency Parsing |
| Sep 28 | Mirella Lapata, University of Edinburgh (and an LTI graduate) | Lori | Talk to Me in Plain English, Please! Explorations in Data-driven Text Simplification |
| Oct 5 | David Blei, Princeton University | Roni | Probabilistic Topic Models of Text and Users |
| Oct 12 | Daniel Povey, Johns Hopkins University | Bhiksha | Subspace Gaussian Mixture Models for Speech Recognition |
| Oct 19 | No colloquium - Mid-semester break | | |
| Oct 26 | Donald Metzler, Google | Jamie | Sanitizing, Searching, and Summarizing Microblog Streams |
| Nov 2 | Bhiksha Raj, CMU | - | Hearing Without Listening |
| Nov 9 | Micha Elsner, Ohio State University | Noah | Bridging the gap: from sounds to words |
| Nov 16 | Antoine Raux, Honda Research | Maxine | Understanding User Intention in Context for Robust Human-Machine Interaction |
| Nov 23 | No colloquium - Thanksgiving | | |
| Nov 30 | Dan Gildea, University of Rochester | Chris | Models and Algorithms for Machine Translation |
| Dec 7 | Anoop Sarkar, Simon Fraser University | Noah | Ensemble Decoding for Statistical Machine Translation |
Past Colloquia