Language Technologies Institute
Carnegie Mellon University
School of Computer Science

LTI Colloquium Fall 2012

Course Information

The LTI colloquium is a series of talks related to language technologies. Topics include, but are not restricted to, Computational Linguistics, Machine Translation, Speech Recognition and Synthesis, Information Retrieval, Computational Biology, Machine Learning, Text Mining, Knowledge Representation, Computer-Assisted Language Learning, and Intelligent Language Tutoring. To get credit for the course, students are required to write either a short critique of one of the presentations or a comparison of two.

Time: Fridays 2:30-4:00pm
Location: Doherty Hall 2315
Instructor: Roni Rosenfeld, roni (at) cs.cmu.edu
TA: Reyyan Yeniterzi, reyyan (at) cs.cmu.edu
Course Secretary: Kate Schaich, kschaich (at) cs.cmu.edu

Upcoming Talk

December 7, Friday, 2:30pm

Doherty Hall 2315

Anoop Sarkar

Simon Fraser University

Ensemble Decoding for Statistical Machine Translation

Statistical machine translation is often faced with the problem of combining data from many diverse sources into a single translation model. In this talk we introduce a novel approach called ensemble decoding that combines multiple translation models during the process of translation. We show that this technique is applicable in many diverse areas in machine translation:
(a) Domain adaptation is needed when the training data is from a different domain than the test data. We show that ensemble decoding can effectively combine out-of-domain and in-domain translation models.
(b) Multi-metric optimization modifies discriminative training for machine translation to prefer Pareto-optimal points with respect to multiple evaluation measures. We use ensemble decoding to combine the Pareto-optimal weight vectors obtained in multi-metric optimization. Furthermore, the ensemble weights are tuned to prefer Pareto-optimal solutions.
(c) In translation out of resource-poor languages, a pivot language is often used to augment the translation model from source to target. Ensemble models provide a novel way to combine the direct translation model (from source to target) and the pivot model (from source to pivot to target).

Bio: Anoop Sarkar is an Associate Professor at Simon Fraser University in British Columbia, Canada, where he co-directs the Natural Language Laboratory (http://natlang.cs.sfu.ca). He received his Ph.D. from the Department of Computer and Information Sciences at the University of Pennsylvania under Prof. Aravind Joshi for his work on semi-supervised statistical parsing using tree-adjoining grammars.
His research focuses on statistical parsing and machine translation, specifically syntax and morphology in MT, semi-supervised learning, and domain adaptation. His interests also include formal language theory and stochastic grammars, in particular tree automata and tree-adjoining grammars.

Schedule

Date | Speaker | Host | Title of the Talk
Aug 31 | Chris Dyer, CMU | - | Feature-Rich Latent Variable Models for Statistical Machine Translation
Sep 7 | Alan Black, CMU | - | Statistical Parametric Speech Synthesis
Sep 14 | Eduard Hovy, CMU | - | NLP: Its Past and 3.5 Possible Futures
Sep 21 | Joakim Nivre, Uppsala University / Google | Chris | Beyond MaltParser -- Advances in Transition-Based Dependency Parsing
Sep 28 | Mirella Lapata, University of Edinburgh (and an LTI graduate) | Lori | Talk to Me in Plain English, Please! Explorations in Data-driven Text Simplification
Oct 5 | David Blei, Princeton University | Roni | Probabilistic Topic Models of Text and Users
Oct 12 | Daniel Povey, Johns Hopkins University | Bhiksha | Subspace Gaussian Mixture Models for Speech Recognition
Oct 19 | No colloquium - Mid-semester break
Oct 26 | Donald Metzler, Google | Jamie | Sanitizing, Searching, and Summarizing Microblog Streams
Nov 2 | Bhiksha Raj, CMU | - | Hearing Without Listening
Nov 9 | Micha Elsner, Ohio State University | Noah | Bridging the gap: from sounds to words
Nov 16 | Antoine Raux, Honda Research | Maxine | Understanding User Intention in Context for Robust Human-Machine Interaction
Nov 23 | No colloquium - Thanksgiving
Nov 30 | Dan Gildea, University of Rochester | Chris | Models and Algorithms for Machine Translation
Dec 7 | Anoop Sarkar, Simon Fraser University | Noah | Ensemble Decoding for Statistical Machine Translation


Language Technologies Institute • 5000 Forbes Ave • Pittsburgh, PA 15213-3891 • (412) 268-6591