Alexander Hauptmann
Video Information Extraction for Long Term Activity Analysis in Health Care

Roni Rosenfeld
Predicting Influenza

David Smith
Efficient Inference for Declarative Approaches to Language

The first part of this talk formulates syntactic dependency parsing as a dynamic Markov random field with the novel ingredient of global constraints. Global constraints are propagated by combinatorial optimization algorithms, which greatly improve on collections of local constraints. In particular, such factors enforce the constraint that the parser's output variables must form a tree. Even with second-order features or latent variables, which would make exact parsing asymptotically slower or NP-hard, accurate approximate inference with belief propagation is only a constant factor slower than a simple edge-factored parser. Inference can be further sped up by pruning the 98% of higher-order factors that contribute little to overall accuracy.

The second part extends these models to capture correspondences among non-isomorphic structures. When bootstrapping a parser in a low-resource target language by exploiting a parser in a high-resource source language, models that score the alignment and the correspondence of divergent syntactic configurations in translational sentence pairs achieve higher accuracy in parsing the target language. These noisy (quasi-synchronous) mappings have further applications in adapting parsers across domains, in learning features of the syntax-semantics interface, and in question answering, paraphrasing, and information retrieval.
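
To make the global tree constraint concrete, here is a minimal sketch (random toy scores and an invented sentence; not the talk's belief-propagation parser): an edge-factored model scores every head-modifier pair, and a global factor admits only head assignments that form a tree. The sketch enforces the constraint by brute force, which the talk's combinatorial factors accomplish efficiently inside approximate inference.

```python
import itertools
import numpy as np

words = ["ROOT", "dogs", "chase", "cats"]  # invented toy sentence
n = len(words)
rng = np.random.default_rng(0)
score = rng.normal(size=(n, n))  # score[h, m]: log-potential for arc h -> m

def is_tree(heads):
    """True iff every word reaches ROOT (index 0) without cycles."""
    for m in range(1, n):
        seen, h = set(), m
        while h != 0:
            if h in seen:
                return False
            seen.add(h)
            h = heads[h]
    return True

best_score, best_heads = float("-inf"), None
for assignment in itertools.product(range(n), repeat=n - 1):
    heads = (0,) + assignment          # heads[m] = head of word m
    if not is_tree(heads):
        continue                       # the global tree factor vetoes non-trees
    total = sum(score[heads[m], m] for m in range(1, n))
    if total > best_score:
        best_score, best_heads = total, heads

print([(words[best_heads[m]], words[m]) for m in range(1, n)])
```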

Jaime Carbonell
Language Technologies and Machine Learning in Computational Proteomics

- Given a new protein sequence, to which protein family or sub-family does it belong?
- Given a protein sequence, what will be its 3D structure (fold)?
- Given the proteins in an organism, what are their likely interactions?

The talk presents research addressing these and related challenges, much of it carried out by past and present LTI students.
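
To make the first question concrete, here is a hedged sketch of one standard machine-learning formulation (all sequences and family names below are invented toy data, not from the talk): represent each protein by its k-mer counts and assign it to the family whose aggregate profile shares the most k-mers.

```python
from collections import Counter

train = [  # (sequence, family): invented toy examples
    ("MKVLAAGLLV", "familyA"), ("MKVLGAGLIV", "familyA"),
    ("GHHEELTRRF", "familyB"), ("GHHDELTKRF", "familyB"),
]

def kmers(seq, k=3):
    """Counts of overlapping k-length substrings of the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def shared(a, b):
    return sum((a & b).values())  # number of k-mers in common

def classify(seq):
    profiles = {}
    for s, fam in train:  # pool k-mer counts per family
        profiles.setdefault(fam, Counter()).update(kmers(s))
    query = kmers(seq)
    return max(profiles, key=lambda fam: shared(profiles[fam], query))

print(classify("MKVLAAGLIV"))  # expected: familyA
```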

Steve Minton
Entity Resolution in an Open, Changing World

Bio: Dr. Steven Minton is the President and CTO of InferLink Corp., which he founded in 2011. InferLink is developing technology for integrating massive amounts of entity-oriented data. He is also the Chairman of Fetch Technologies.

Dr. Minton received his Ph.D. in Computer Science from Carnegie Mellon University in 1988. He then became a Principal Investigator at NASA's Ames Research Center, and subsequently a Project Leader and Research Associate Professor at USC's Information Sciences Institute. He left USC in 1999 to found Fetch Technologies.

Dr. Minton was the founder and first executive editor of the Journal of Artificial Intelligence Research (JAIR), and an editor of the Machine Learning journal. He also directs the AI Access Foundation, a nonprofit corporation that runs JAIR and is devoted to the electronic dissemination of research results in AI.

Dr. Minton is an AAAI Fellow. His awards include the 2008 AAAI Classic Paper Award.

Alice Oh (alumna)
Applications of Latent Dirichlet Allocation and Hierarchical Dirichlet Processes

Latent Dirichlet Allocation (LDA) and Hierarchical Dirichlet Processes (HDP) have become popular models for discovering latent semantics from text corpora. I will start this talk with a brief explanation of what LDA and HDP are and how they are used in common text analysis tasks. Then, I will describe our recent research that extends LDA and HDP to analyze two different text corpora: online reviews and conference proceedings. For the online reviews, we propose a variant of LDA called the Aspect and Sentiment Unification Model (ASUM) to analyze topics and sentiments jointly in an unsupervised fashion. For the conference proceedings, we propose a variant of HDP called the distance dependent Chinese Restaurant Franchise (ddCRF) to discover how new topics emerge through time. Unlike the HDP, the ddCRF makes no assumption of exchangeability of the data, and hence the model can capture relationships among data such as temporal patterns of topics.
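
For readers who want the "brief explanation" in runnable form, here is a minimal sketch of plain LDA on an invented toy corpus, assuming scikit-learn is available (the ASUM and ddCRF extensions described above are not shown):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [  # invented toy review snippets
    "battery life is great and the screen is bright",
    "the screen cracked and the battery died quickly",
    "delicious food friendly staff great service",
    "the service was slow but the food was delicious",
]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)  # bag-of-words counts

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
vocab = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [vocab[i] for i in topic.argsort()[::-1][:4]]
    print(f"topic {k}:", ", ".join(top))  # device words vs. restaurant words
```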

Bio: Alice Oh is an Assistant Professor of Computer Science at the Korea Advanced Institute of Science and Technology. She leads her research group, the Users and Information Lab, with the vision of delivering information that satisfies the user. To that end, she studies and employs methods from machine learning, human-computer interaction, and statistical natural language processing. Alice completed her M.S. in Language and Information Technologies at CMU and her Ph.D. in Computer Science at MIT.

Ellen Riloff
Adventures in Bootstrapping: Acquiring Lexical Knowledge for NLP
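
No abstract was listed; as a generic illustration of the bootstrapping idea in the title (invented toy data, and not necessarily the algorithms presented in the talk): start from a few seed words of a semantic category, harvest the contexts they occur in, then let those contexts nominate new category members.

```python
corpus = [  # invented toy sentences
    "troops were deployed to kabul", "aid arrived in baghdad",
    "troops were deployed to mosul", "aid arrived in kabul",
    "tourists flew to paris",
]
seeds = {"kabul", "baghdad"}  # seed LOCATION words

def contexts_of(word):
    """All sentence prefixes in which the word appears as the last token."""
    return {tuple(s.split()[:-1]) for s in corpus if s.split()[-1] == word}

for _ in range(2):  # two bootstrapping rounds
    trusted = set().union(*(contexts_of(w) for w in seeds))
    candidates = {s.split()[-1] for s in corpus
                  if tuple(s.split()[:-1]) in trusted} - seeds
    if not candidates:
        break
    best = max(candidates, key=lambda w: len(contexts_of(w) & trusted))
    seeds.add(best)  # admit the best-supported new word

print(sorted(seeds))  # mosul should join the category; paris should not
```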

David Ferrucci
Deep Dive into DeepQA and Natural Language Technology

IBM Research undertook a challenge to build a computer system that could compete at the human champion level in real time on the American TV quiz show Jeopardy!. The extent of the challenge includes fielding a real-time automatic contestant on the show, not merely a laboratory exercise. The Jeopardy! Challenge helped us address requirements that led to the design of the DeepQA architecture and the implementation of Watson. After three years of intense research and development by a core team of about 20 researchers, Watson is performing at human expert levels in terms of precision, confidence, and speed on the Jeopardy! quiz show. Our results strongly suggest that DeepQA is an effective and extensible architecture that can be used as a foundation for combining, deploying, evaluating, and advancing a wide range of algorithmic techniques to rapidly advance the field of question answering (QA).
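
As an entirely hypothetical toy in the spirit of the "evidence-based computing" mentioned above (the scorers, weights, and candidates here are invented, not Watson's actual components): several evidence scorers rate each candidate answer, a weighted logistic combination produces a confidence, and the system "buzzes" only when that confidence clears a threshold.

```python
import math

candidates = {  # hypothetical per-scorer evidence for one clue
    "Toronto": {"passage_support": 0.2, "type_match": 0.1, "popularity": 0.9},
    "Chicago": {"passage_support": 0.8, "type_match": 0.9, "popularity": 0.7},
}
weights = {"passage_support": 2.5, "type_match": 1.8, "popularity": 0.4}
bias, buzz_threshold = -2.0, 0.5  # illustrative values only

def confidence(evidence):
    z = bias + sum(weights[k] * v for k, v in evidence.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic combination of evidence

best, conf = max(((a, confidence(e)) for a, e in candidates.items()),
                 key=lambda t: t[1])
print(f"answer={best}, confidence={conf:.2f}, buzz={conf > buzz_threshold}")
```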

Bio: Dr. David Ferrucci is a Research Staff Member and leader of the Semantic Analysis and Integration Department at IBM's T.J. Watson Research Center. His team of 25 researchers focuses on developing technologies for discovering knowledge in natural language and leveraging those technologies in a variety of intelligent search, data analytics, and knowledge management solutions.

In 2007, Dr. Ferrucci began exploring the feasibility of designing a computer system that can rival human champions at the game of Jeopardy!. Dubbed DeepQA, the project focused on advancing natural language question answering using massively parallel evidence-based computing. After winning support, Ferrucci has set and driven the technical agenda for Jeopardy! The IBM Challenge. The Watson computer system designed by Ferrucci's team represents the integration and advancement of many search, natural language processing, and semantic technologies.

Following the Jeopardy! challenge, Dr. Ferrucci and his team plan to apply DeepQA technologies to areas like medicine, government, and law to drive advances in computer-supported intelligence and decision-making.

Eric Fosler-Lussier
Integrating speech science and technology

Traditional speech recognition techniques adopt a hierarchical, top-down approach to modeling speech data; linguistic information such as word pronunciations or language models typically acts as priors in statistical models for automatic speech recognition (ASR). One line of research has started to integrate linguistic information within the representation of the underlying speech data. However, the top-down approach typically used in ASR (Hidden Markov Models) does not easily allow for combining evidence from different linguistic representations.

Similarly, in speech separation (removing background noise from a speech-noise mixture), different cues have been identified that indicate speech or background noise. However, the techniques that have utilized multiple cues typically combine them in an ad hoc manner.

In this talk, I will discuss a line of research from my lab that looks at combining evidence using Conditional Random Fields (CRFs). CRFs have been utilized within the NLP community for many tasks, but their use in the speech community is only starting to take off. Applications of CRFs to the ASR and speech separation problems show that this type of model can be an effective combiner of information, and can allow us to easily integrate ideas from speech science into working systems.
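
As a concrete illustration of a CRF as an evidence combiner, here is a minimal sketch with invented scores (not the lab's models): two hypothetical evidence streams, acoustic scores and a phonological-feature detector, contribute additive log-potentials, and Viterbi decoding recovers the best phone sequence.

```python
import numpy as np

labels = ["sil", "s", "iy"]              # toy phone inventory
T, L = 4, len(labels)                    # frames x labels

rng = np.random.default_rng(1)
acoustic = rng.normal(size=(T, L))       # stream 1: per-frame acoustic evidence
phonological = rng.normal(size=(T, L))   # stream 2: detector evidence
transition = rng.normal(size=(L, L))     # learned transition weights

# In a linear-chain CRF, evidence streams simply add in the log domain.
unary = acoustic + 0.5 * phonological    # 0.5: hypothetical stream weight

# Viterbi decoding over the combined potentials.
delta = np.zeros((T, L))
back = np.zeros((T, L), dtype=int)
delta[0] = unary[0]
for t in range(1, T):
    scores = delta[t - 1][:, None] + transition + unary[t][None, :]
    back[t] = scores.argmax(axis=0)      # best previous label for each current
    delta[t] = scores.max(axis=0)

path = [int(delta[-1].argmax())]
for t in range(T - 1, 0, -1):
    path.append(int(back[t][path[-1]]))
path.reverse()
print([labels[i] for i in path])
```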

Bio: Eric Fosler-Lussier is an Associate Professor of Computer Science and Engineering, with a courtesy appointment in Linguistics, at The Ohio State University. After receiving a B.A.S. (Computer and Cognitive Science) and a B.A. (Linguistics) from the University of Pennsylvania in 1993, he received his Ph.D. in 1999 from the University of California, Berkeley, performing his dissertation research at the International Computer Science Institute under the tutelage of Prof. Nelson Morgan. He has also been a Member of Technical Staff at Bell Labs, Lucent Technologies, and a Visiting Researcher at Columbia University. In 2006, Prof. Fosler-Lussier was awarded an NSF CAREER award, and in 2010 he was presented with a Lumley Research Award by the Ohio State College of Engineering. He is also the recipient (with co-author Jeremy Morris) of the 2010 IEEE Signal Processing Society Best Paper Award. He has published over 90 papers in speech and language processing, is a member of the Association for Computational Linguistics and the International Speech Communication Association, and is a senior member of the IEEE. Fosler-Lussier serves on the IEEE Speech and Language Technical Committee (2006-2008, 2010-2013), as well as on the editorial boards of the ACM Transactions on Speech and Language Processing and the Journal of Experimental Linguistics. He is generally interested in integrating linguistic insights as priors in statistical learning systems.

Bonnie Webber
Discourse Structures and Language Technologies

In Part 1, I show that early computational work on discourse structure aimed to assign a simple tree structure to a discourse. At issue was what its internal nodes corresponded to. The debate was fierce, and suggestions that other structures might be more appropriate were ignored or subjected to ridicule. The main uses of discourse structure were text generation and summarization, but mostly in small-scale experiments.

In Part 2, I describe several different types of discourse structure now receiving computational attention, though perhaps not always clearly distinguished. There is an increasing number of credible efforts aimed at recognizing these structures automatically, though performance on unrestricted text still resembles that of the early days of robust parsing. Generic applications are also beginning to appear, as researchers recognize the value of these structures for tasks of interest to them.

In Part 3, I argue for the need for a middle ground between approaches held hostage to theory and empirical approaches free of theory. An empirical approach underpinned by theory will not only motivate sensible back-off strategies in the face of unseen data, but also enable us to understand how the different discourse structures inter-relate and thereby exploit their mutual recognition. This should allow more challenging applications, such as improving the performance of statistical machine translation (SMT) through the extended locality of discourse structures and the linguistic phenomena they correlate with.

Mark Steedman
The Statistical Problem of Language Acquisition

The talk will report on recent work with Tom Kwiatkowski, Sharon Goldwater, and Luke Zettlemoyer on semantic parser induction by machine from a number of corpora pairing sentences with logical forms, including a corpus of real child-directed utterances from the CHILDES corpus.

The problem of child language acquisition is often identified as a "logical" problem. The term refers to the fact that children learn language rapidly from exposure to a sample of utterances in the language, and seem to need access to some source of information other than mere positive examples of the sentences of the language.

The most obvious candidate for this other source of information, at least in the earliest stages of language acquisition, is representations of meaning, in the form of logical forms supported by the contextual situation, available in the case of the child on the basis of pre-linguistic sensory-motor cognition (in which we should include certain kinds of social cognition). In the case of machines, there has been some recent success on this task for datasets such as GeoQuery and ATIS, including multilingual versions.

The paper argues that the problem of language acquisition interpreted in this way is similar to the problem of inducing a grammar and a parsing model from a treebank such as the Penn Treebank, except that (a) the trees are unordered logical forms, in which the preterminals are not aligned with words in the target language, and (b) there may be noise and spurious distracting logical forms supported by the context but irrelevant to the utterance.

The talk shows that this class of problem can be solved if the child or machine initially parses with the entire space of possibilities that universal grammar allows under the assumptions of the Combinatory Categorial theory of grammar (CCG), and learns a generative statistical parsing model for that space using EM-related methods such as variational Bayesian learning. This can be done without all-or-none "parameter-setting" or attendant "triggers", and without invoking any "subset principle", provided the system is presented with a representative sample of reasonably short utterances from the target language.
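
The statistical core of the learning problem, deciding which words carry which pieces of meaning when only sentence/logical-form pairs are observed, can be illustrated with a toy EM learner (invented data; a drastic simplification of the CCG induction system described in the talk):

```python
from collections import defaultdict

pairs = [  # hypothetical child-directed data: (words, unordered LF constants)
    (["you", "like", "doggies"], {"like'", "dog'", "you'"}),
    (["doggies", "bark"], {"bark'", "dog'"}),
    (["you", "bark"], {"bark'", "you'"}),
]
words = {w for ws, _ in pairs for w in ws}
consts = {c for _, cs in pairs for c in cs}
p = {w: {c: 1.0 / len(consts) for c in consts} for w in words}  # uniform init

for _ in range(20):  # EM iterations
    counts = defaultdict(lambda: defaultdict(float))
    for ws, cs in pairs:
        for c in cs:
            z = sum(p[w][c] for w in ws)   # E-step: share responsibility
            for w in ws:
                counts[w][c] += p[w][c] / z
    for w in words:                        # M-step: renormalize per word
        total = sum(counts[w].values())
        for c in consts:
            p[w][c] = counts[w][c] / total

for w in sorted(words):
    best = max(p[w], key=p[w].get)
    print(f"{w:8s} -> {best}  ({p[w][best]:.2f})")  # word-meaning pairings
```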

Bio: Steedman's research covers a range of problems in computational linguistics, artificial intelligence, computer science, and cognitive science, including the syntax and semantics of natural language, and the parsing and comprehension of natural language discourse by humans and by machines using Combinatory Categorial Grammar (CCG). Much of his current NLP research concerns wide-coverage parsing for robust semantic interpretation and natural language inference. Some of his research concerns the analysis of music by humans and machines.

Jerome Bellegarda
A Guided Tour of Latent Semantic Mapping

Originally formulated in the context of information retrieval, latent semantic analysis exhibits three main characteristics: (i) words and documents (i.e., discrete entities) are mapped onto a continuous vector space; (ii) this mapping is determined by global correlation patterns; and (iii) dimensionality reduction is an integral part of the process. Because these fairly generic properties may be advantageous in a variety of contexts, they have sparked interest in a more inclusive interpretation of the underlying paradigm. The outcome is latent semantic mapping, a data-driven framework for modeling global relationships implicit in large volumes of data. The purpose of this talk is to give a broad overview of the framework, highlight the possibilities it offers for general feature extraction, and underscore the multi-faceted benefits it can bring to a number of problems in speech and language processing. We conclude with a discussion of the inherent trade-offs associated with the approach, and some perspectives on its likely role in information extraction going forward.
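
The three characteristics above fit in a few lines of code. Here is a minimal sketch on an invented toy corpus (plain SVD-based latent semantic analysis, not Bellegarda's full LSM framework): a word-document matrix is factored and truncated, leaving words and documents embedded in one continuous space determined by global co-occurrence patterns.

```python
import numpy as np

docs = ["human machine interface", "user interface system",
        "graph tree system", "graph minors tree"]  # invented toy documents
vocab = sorted({w for d in docs for w in d.split()})
W = np.array([[d.split().count(w) for d in docs] for w in vocab], float)

U, s, Vt = np.linalg.svd(W, full_matrices=False)  # global correlation patterns
k = 2                                   # (iii) dimensionality reduction
word_vecs = U[:, :k] * s[:k]            # (i) words in the continuous space
doc_vecs = Vt[:k, :].T * s[:k]          # (i) documents in the same space

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

i, j = vocab.index("graph"), vocab.index("tree")
print(f"sim(graph, tree) = {cos(word_vecs[i], word_vecs[j]):.2f}")
print(f"sim(doc 2, doc 3) = {cos(doc_vecs[2], doc_vecs[3]):.2f}")
```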