Course Information

The LTI colloquium is a series of talks related to language technologies. The topics include, but are not restricted to, Computational Linguistics, Machine Translation, Speech Recognition and Synthesis, Information Retrieval, Computational Biology, Machine Learning, Text Mining, Knowledge Representation, Computer-Assisted Language Learning and Intelligent Language Tutoring. To get credit for the course, students are required to write either a short critique of one of the presentations or a comparison of two.

Time:

 Fridays 2:30-3:50pm

Location:

 2315 Doherty Hall

Instructor:

 Bhiksha Raj, bhiksha (at) cs.cmu.edu

TA:

 Pallavi Baljekar, pbaljeka (at) cs.cmu.edu

 

Upcoming Talk

 

December 6th, Friday, 2:30pm

Sungjin Lee

CMU

Statistical modeling for Spoken Dialog System in Real World

 

With the recent remarkable growth of speech-enabled applications, statistical dialog modeling has become a critical component not only for typical telephone-based spoken dialog systems but also for multi-modal dialog systems on mobile devices and in automobiles. Because present-day Automatic Speech Recognition and Spoken Language Understanding are uncertain, it is crucial to track dialog states accurately by updating the probability distribution over possible dialog states as a dialog unfolds. Given the size of the dialog state space, it is almost impossible to design effective dialog strategies by hand. It is therefore desirable to have a machine optimize the dialog strategies automatically. These statistical approaches were initially developed on toy problems. They were then tested in simulation and in controlled laboratory studies, where they showed great benefits over conventional methods. Now the question is: will these results translate to real-world applications? Recently, the first public deployments were carried out as part of the Spoken Dialog Challenge, providing the first opportunity to empirically assess the real-world performance of statistical approaches to dialog state tracking. The Challenge identified some important issues that partly explain why statistical methods were not as successful as expected in the real world. In this talk, I will introduce some of the recent progress made in the subsequent Dialog State Tracking Challenge to address those issues. Even though we have seen decent improvements in intrinsic evaluations, some questions remain open, e.g., whether intrinsic evaluation results will translate to extrinsic evaluation, and which metric for dialog state tracking most accurately predicts extrinsic evaluation results. To partly address these questions, I will present some preliminary analysis of the relation between the performance of dialog state tracking and that of policy optimization.

 

Bio:

Dr. Sungjin Lee is a postdoctoral research fellow in the Language Technologies Institute at Carnegie Mellon University. He received his PhD from the Pohang University of Science and Technology in 2012. His research interests lie in various areas of speech and language processing as well as machine learning. He works primarily on statistical dialog modeling, which includes structured discriminative models for dialog state tracking, sparse Bayesian models for online dialog strategy learning, and unsupervised methods for user simulation. He is also interested in applying spoken language technologies to computer-assisted language learning settings. He serves on the advisory boards of the Dialog State Tracking Challenge and the Real Challenge, and he is a member of the program committees of many prestigious conferences.

 

SCHEDULE

Date | Speaker | Host | Title of the Talk | Talk Information
Aug 30 | Peter Turney | Ed Hovy | Experiments with Three Approaches to Recognizing Lexical Entailment | [slides]
Sept 6 | Andrew Moore | Yiming Yang | Computational Statistics Meets Fashion | Not Available
Sept 13 | Percy Liang | Noah Smith | Learning Latent-Variable Models of Language | [video]
Sept 20 | Richard Sproat | Prasanna Kumar | Corpora and Statistical Analysis of Non-Linguistic Symbol Systems: Was the Indus Civilization Literate? | [slides]
Sept 27 | Miles Osborne | Chris Dyer | Cross Stream Event Detection | [video]
Oct 4 | Jason Ernst | Meghana Kshirsagar | Epigenomic Signatures for Genome Annotation | available on request
Oct 11 | Sharon Goldwater | Chris Dyer | Modeling 'Bootstrapping' in Language Acquisition: A Probabilistic Approach | [slides]
Oct 18 | Mid-Semester Break (No Colloquium)
Oct 25 | Hynek Hermansky | Prasanna Kumar | Artificial Neural Networks: Deep, Long and Wide | [video] [slides]
Nov 1 | Faculty Retreat (No Colloquium)
Nov 8 | Chris Manning | Nathan Schneider | Parsing with Compositional Vector Grammars |
Nov 15 | Philipp Koehn | Nathan Schneider | Human Translation and Machine Translation | [slides]
Nov 22 | Wei-Ying Ma | Yiming Yang | Advancing Web Search with New Capabilities from Knowledge |
Nov 29 | Thanksgiving Holiday (No Colloquium)
Dec 6 | Sungjin Lee | Maxine Eskenazi | Statistical modeling for Spoken Dialog System in Real World |