Jan 21

Alexander Hauptmann
SCS, CMU


 

Video Information Extraction for Long Term Activity Analysis in Health Care


Analyzing human activity is the key to understanding and searching surveillance video. I will discuss current results from a study that automatically analyzes video of human activities to improve geriatric health care. Beyond just recognizing human activities observed on video, we mine a 25-day archive of observations in a nursing home and link the observational results to medical records. This work explores the statistical patterns between a patient's daily activity and his/her clinical diagnosis. I will discuss some of the technical details of the work, as well as issues related to this type of health research. The main goal of this work is to develop an intelligent visual surveillance system based on efficient and robust activity analysis. The current results represent a feasibility demonstration of exploiting long-term human activity patterns through video analysis.

Bio:
Alex Hauptmann received his B.A. and M.A. in Psychology from Johns Hopkins University, studied Computer Science at the Technische Universität Berlin from 1982-1984, and received his Ph.D. in Computer Science from CMU in 1991.  He is currently on the faculty in the Computer Science Department and the Language Technologies Institute at CMU. His research interests have led him to pursue and combine several different areas: man-machine communication, natural language processing, speech understanding and synthesis, video analysis and machine learning. He worked on speech and machine translation from 1984-94, when he joined the Informedia project for digital video analysis and retrieval and led the development and evaluation of the News-on-Demand applications.

Feb 4

Roni Rosenfeld
LTI, CMU


 

Predicting Influenza


Influenza kills some 250,000 people worldwide every year, and seriously sickens hundreds of millions, all in spite of massive vaccination campaigns. Influenza is difficult to tame because it changes very fast, especially in response to current population immunity and antiviral drug use. In recent years, there has been exponential growth in the amount of genetic information available for influenza, making it possible, at least in principle, to quantitatively predict its course of evolution. At the same time, enhanced public health surveillance offers indirect information about the way influenza spreads from person to person, allowing for quantitative predictions of epidemic and pandemic behaviors. Combining the two types of prediction (evolution and spread) may lead to better vaccines, better vaccination and antiviral drug use policies, and better preparation of our healthcare system for the next pandemic. I will present a brief overview of influenza and epidemic modeling, and will discuss our ongoing prediction efforts. This is joint work with numerous faculty and students at both Pitt and CMU.
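As a concrete touchstone for the epidemic-modeling side of the talk, the classic SIR (Susceptible-Infected-Recovered) compartment model is sketched below. The transmission rate beta and recovery rate gamma are invented illustrative values, not fitted influenza parameters.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Euler-integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I."""
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    history = [(s, i, r)]
    for _ in range(int(days / dt)):
        new_inf = beta * s * i / n * dt   # new infections this step
        new_rec = gamma * i * dt          # new recoveries this step
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
        history.append((s, i, r))
    return history

# beta/gamma = 2 means each case infects about two others in a fully
# susceptible population, so an epidemic takes off from a single seed case
hist = simulate_sir(s0=9999, i0=1, r0=0, beta=0.5, gamma=0.25, days=120)
peak_infected = max(i for _, i, _ in hist)
```

Real influenza forecasting layers evolution, seasonality, and surveillance data on top of compartment models like this one, but the S/I/R bookkeeping is the common core.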

Feb 11

David Smith
University of Massachusetts, Amherst


Efficient Inference for Declarative Approaches to Language


Much recent work in natural language processing treats linguistic analysis as an inference problem over graphs. This development opens up useful connections between machine learning, graph theory, and linguistics.

The first part of this talk formulates syntactic dependency parsing as a dynamic Markov random field with the novel ingredient of global constraints. Global constraints are propagated by combinatorial optimization algorithms, which greatly improve on collections of local constraints. In particular, such factors enforce the constraint that the parser's output variables must form a tree. Even with second-order features or latent variables, which would make exact parsing asymptotically slower or NP-hard, accurate approximate inference with belief propagation is only a constant factor slower than a simple edge-factored parser. Inference can be further sped up by pruning the 98% of higher-order factors that contribute little to overall accuracy.

The second part extends these models to capture correspondences among non-isomorphic structures. When bootstrapping a parser in a low-resource target language by exploiting a parser in a high-resource source language, models that score the alignment and the correspondence of divergent syntactic configurations in translational sentence pairs achieve higher accuracy in parsing the target language. These noisy (quasi-synchronous) mappings have further applications in adapting parsers across domains, in learning features of the syntax-semantics interface, and in question answering, paraphrasing, and information retrieval.

Feb 18

Jaime Carbonell
LTI, CMU


Language Technologies and Machine Learning in Computational Proteomics


Proteomics is the study of proteins -- their sequence, structure and function -- in vivo or in vitro, and computational proteomics extends the research to in silico methods.  Since proteins are the active components of most biological processes, their study is crucial to understanding and modeling processes such as viral infection or organ rejection, and to designing new drugs. We borrow methods from language technologies and machine learning to address questions such as:

 - Given a new protein sequence, to which protein family or sub-family does it belong?

 - Given a protein sequence, what will be its 3D structure (fold)?

 - Given the proteins in an organism, what are their likely interactions?

The talk presents research addressing these and related challenges, much of it carried out by past and present LTI students.

Mar 4

Steve Minton
Fetch Technologies


Entity Resolution in an Open, Changing World


Recent improvements in technology have made it easier than ever to extract information about named entities from unstructured and semi-structured data sources on the internet.  However, when data is collected from multiple sources that use different data formats and/or terminology, it may be difficult to identify two or more references to the same entity.  In this talk, I will discuss the "entity resolution" problem, and in particular, consider how entity resolution can be addressed in open domains where there is no pre-specified "reference set" to match against.   I will describe cases where existing edit distance metrics for measuring the similarity of two data records are inadequate.  To address such cases, machine learning can be used to acquire more sophisticated world models that allow better entity resolution decisions to be made.
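The inadequacy of plain edit distance is easy to demonstrate. Below is the standard Levenshtein dynamic program applied to two invented record pairs: two names that plausibly refer to the same company sit far apart, while two names for genuinely different people are nearly identical.

```python
def edit_distance(a, b):
    """Dynamic-programming Levenshtein distance (insert/delete/substitute, cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution (free if match)
        prev = cur
    return prev[-1]

# same real-world entity, huge distance:
same_entity = edit_distance("IBM", "International Business Machines")
# different people, tiny distance:
diff_entity = edit_distance("John Smith", "Joan Smith")
```

Here `same_entity` is far larger than `diff_entity`, so any distance threshold gets at least one of the two pairs wrong; this is the kind of case where learned world models (abbreviation knowledge, reference data, context) are needed.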

Bio:

Dr. Steven Minton is the President and CTO of InferLink Corp., which he founded in 2011. InferLink is developing technology for integrating massive amounts of entity-oriented data. He is also the Chairman of Fetch Technologies.

Dr. Minton received his Ph.D. in Computer Science from Carnegie Mellon University in 1988. He then became a Principal Investigator at NASA's Ames Research Center, and subsequently a Project Leader and Research Associate Professor at USC's Information Sciences Institute. He left USC in 1999 to found Fetch Technologies.

Dr. Minton was the founder and first executive editor of the Journal of Artificial Intelligence Research (JAIR), and an editor of the Machine Learning Journal.   He also directs AI Access Foundation, a nonprofit corporation which runs JAIR and is devoted to the electronic dissemination of research results in AI.

Dr. Minton is an AAAI Fellow. His awards include the 2008 AAAI Classic Paper award.

Mar 18

Alice Oh (alumna)
Korea Advanced Institute of Science and Technology (KAIST)


Applications of Latent Dirichlet Allocation and Hierarchical Dirichlet Processes

 

Latent Dirichlet Allocation (LDA) and Hierarchical Dirichlet Processes (HDP) have become popular models for discovering latent semantics from text corpora. I will start this talk with a brief explanation of what LDA and HDP are and how they are used in common text analysis tasks. Then, I will describe our recent research that extends LDA and HDP to analyze two different text corpora: online reviews and conference proceedings. For the online reviews, we propose a variant of LDA called the Aspect and Sentiment Unification Model (ASUM) to analyze topics and sentiments jointly in an unsupervised fashion. For the conference proceedings, we propose a variant of HDP called the distance-dependent Chinese Restaurant Franchise (ddCRF) to discover how new topics emerge over time. Unlike the HDP, the ddCRF makes no exchangeability assumption about the data, and hence the model can capture relationships among data such as temporal patterns of topics.
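For readers new to LDA, a minimal collapsed Gibbs sampler makes the model concrete: each word token carries a topic assignment, and resampling it uses only three count tables. The toy corpus and hyperparameters below are invented for illustration; real applications use tested libraries and far more data.

```python
import random

docs = [["apple", "banana", "apple", "fruit"],
        ["goal", "match", "team", "goal"],
        ["fruit", "banana", "match", "team"]]
K, alpha, beta = 2, 0.5, 0.1             # topic count and symmetric priors (toy values)
vocab = sorted({w for d in docs for w in d})
V = len(vocab)
random.seed(0)

# z[d][i] is the topic of word i in document d; the count tables below are
# all the collapsed sampler needs
z = [[random.randrange(K) for _ in doc] for doc in docs]
ndk = [[0] * K for _ in docs]            # document-topic counts
nkw = [[0] * V for _ in range(K)]        # topic-word counts
nk = [0] * K                             # topic totals
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        t = z[d][i]
        ndk[d][t] += 1
        nkw[t][vocab.index(w)] += 1
        nk[t] += 1

for _ in range(200):                     # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t, wi = z[d][i], vocab.index(w)
            ndk[d][t] -= 1; nkw[t][wi] -= 1; nk[t] -= 1   # remove current assignment
            # full conditional: p(z = k | rest) ∝ (ndk + alpha)(nkw + beta)/(nk + V*beta)
            weights = [(ndk[d][k] + alpha) * (nkw[k][wi] + beta) / (nk[k] + V * beta)
                       for k in range(K)]
            r = random.uniform(0, sum(weights))
            t = 0
            for k, wt in enumerate(weights):
                r -= wt
                if r <= 0:
                    t = k
                    break
            z[d][i] = t
            ndk[d][t] += 1; nkw[t][wi] += 1; nk[t] += 1

# most frequent word index per topic, for inspection
top_words = [max(range(V), key=lambda w: nkw[k][w]) for k in range(K)]
```

The ASUM and ddCRF models in the talk extend exactly this machinery with sentiment variables and non-exchangeable seating, respectively.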

Bio:

Alice Oh is an Assistant Professor of Computer Science at Korea Advanced Institute of Science and Technology. She leads her research group, Users and Information Lab, with the vision of delivering information to satisfy the user. To that end, she studies and employs methods from machine learning, human-computer interaction, and statistical natural language processing. Alice completed her M.S. in Language and Information Technologies at CMU and her Ph.D. in Computer Science at MIT.

Mar 25

Ellen Riloff
University of Utah

Adventures in Bootstrapping: Acquiring Lexical Knowledge for NLP


Understanding natural language requires many types of lexical knowledge. Some lexical resources have been created (e.g., WordNet and FrameNet), but they are far from complete and they are rarely sufficient for informal jargon or specialized domains. Starting in 1997, the Utah NLP lab has been developing bootstrapping techniques to automatically acquire lexical knowledge from unannotated text collections. We have created several bootstrapping algorithms to induce semantic lexicons, as well as resources for subjectivity classification, event extraction, and plot unit analysis. Most recently, we used bootstrapping to create a contextual semantic tagger, given only seed words and domain-specific texts for training. In this talk, we will overview the bootstrapping methods that we have developed and try to distill out general lessons we have learned about what it takes to make bootstrapping work.
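A bare-bones sketch of pattern-based lexicon bootstrapping in the spirit of the talk: start from seed words, collect extraction patterns that co-occur with them, keep only patterns that fire on multiple known words, and harvest new words those patterns match. The corpus, seeds, and the reliability threshold are all toy examples invented for illustration.

```python
corpus = [
    "cities such as paris",
    "cities such as berlin",
    "cities such as tokyo",
    "traveled to paris",
    "traveled to osaka",
    "traveled to the office",
]
seeds = {"paris", "berlin"}

def patterns_for(word, sentences):
    """A 'pattern' here is simply the sentence with the word slot blanked out."""
    return {s.replace(word, "<X>") for s in sentences if word in s.split()}

def bootstrap(seeds, sentences, rounds=2):
    lexicon = set(seeds)
    for _ in range(rounds):
        # score patterns by how many known lexicon words they extract
        pats = {}
        for w in lexicon:
            for p in patterns_for(w, sentences):
                pats[p] = pats.get(p, 0) + 1
        best = {p for p, c in pats.items() if c >= 2}   # keep reliable patterns only
        for s in sentences:
            for p in best:
                pre, _, post = p.partition("<X>")
                if s.startswith(pre) and s.endswith(post) and s != p:
                    cand = s[len(pre):len(s) - len(post)]
                    if " " not in cand:                  # single-word candidates only
                        lexicon.add(cand)
    return lexicon

lex = bootstrap(seeds, corpus)
```

Note that "tokyo" is harvested because "cities such as <X>" matched two seeds, while "office" is not, because "traveled to <X>" matched only one; that reliability filter is the heart of what keeps bootstrapping from drifting.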

Mar 30

David Ferrucci
IBM

Deep Dive into Deep QA and Natural Language Technology

 

IBM Research undertook a challenge to build a computer system that could compete at the human champion level in real time on the American TV quiz show, Jeopardy!. The extent of the challenge includes fielding a real-time automatic contestant on the show, not merely a laboratory exercise. The Jeopardy! Challenge helped us address requirements that led to the design of the DeepQA architecture and the implementation of Watson. After three years of intense research and development by a core team of about 20 researchers, Watson is performing at human expert levels in terms of precision, confidence, and speed at the Jeopardy! quiz show. Our results strongly suggest that DeepQA is an effective and extensible architecture that can be used as a foundation for combining, deploying, evaluating, and advancing a wide range of algorithmic techniques to rapidly advance the field of question answering (QA).

Bio:

Dr. David Ferrucci is a Research Staff Member and leader of the Semantic Analysis and Integration Department at IBM’s T.J. Watson Research Center. His team of 25 researchers focuses on developing technologies for discovering knowledge in natural language and leveraging those technologies in a variety of intelligent search, data analytics, and knowledge management solutions.

In 2007, Dr. Ferrucci began exploring the feasibility of designing a computer system that can rival human champions at the game of Jeopardy!. Dubbed DeepQA, the project focused on advancing natural language question answering using massively parallel evidence-based computing. Having won support, Ferrucci set and drove the technical agenda for Jeopardy! The IBM Challenge.

The Watson computer system designed by Ferrucci’s team represents the integration and advancement of many search, natural language processing, and semantic technologies.

Following the Jeopardy! challenge, Dr. Ferrucci and his team plan to apply Deep QA technologies to areas like medicine, government, and law to drive advances in computer supported intelligence and decision-making.

Apr 1

Eric Fosler-Lussier
Ohio State University

Integrating speech science and technology:
New models for speech and audio processing

 

Traditional speech recognition techniques adopt a hierarchical, top-down approach to modeling speech data; linguistic information such as word pronunciations or language models typically acts as a prior in statistical models for automatic speech recognition (ASR).  One line of research has started to integrate linguistic information within the representation of the underlying speech data. However, the top-down approach typically used in ASR (Hidden Markov Models) does not easily allow for combining evidence from different linguistic representations.

Similarly, in speech separation (removing background noise from a speech-noise mixture), different cues have been identified that indicate speech or background noise.  However, the techniques that have utilized multiple cues typically combine them in an ad hoc manner.

In this talk, I will discuss a line of research from my lab that looks at combining evidence using Conditional Random Fields: CRFs have been utilized within the NLP community for many tasks, but their use in the speech community is only starting to take off.  Applications of CRFs to the ASR and speech separation problems show that this type of model can be an effective combiner of information, and can allow us to easily integrate ideas from speech science into working systems.
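A toy decoder illustrates why a globally normalized sequence model is a better evidence combiner than per-frame voting: per-frame cue scores and a transition weight that rewards label smoothness are combined in one Viterbi decode. All cue values and weights below are invented; a full CRF would also learn these weights from data.

```python
LABELS = ["speech", "noise"]

def viterbi(cue_scores, trans_weight):
    """cue_scores[t][label]: summed per-frame feature scores; trans_weight
    rewards keeping the same label across adjacent frames."""
    T = len(cue_scores)
    best = [dict(cue_scores[0])]         # best path score ending in each label
    back = [{}]                          # backpointers
    for t in range(1, T):
        best.append({}); back.append({})
        for lab in LABELS:
            prev = max(LABELS,
                       key=lambda p: best[t - 1][p] + (trans_weight if p == lab else 0.0))
            best[t][lab] = (cue_scores[t][lab] + best[t - 1][prev]
                            + (trans_weight if prev == lab else 0.0))
            back[t][lab] = prev
    lab = max(LABELS, key=lambda l: best[-1][l])
    path = [lab]
    for t in range(T - 1, 0, -1):
        lab = back[t][lab]
        path.append(lab)
    return path[::-1]

# the middle frame has weak, conflicting cues; per-frame scoring would call it
# "noise", but the transition weight pulls it toward "speech"
cues = [{"speech": 2.0, "noise": 0.0},
        {"speech": 0.4, "noise": 0.5},
        {"speech": 2.0, "noise": 0.0}]
smooth = viterbi(cues, trans_weight=1.0)          # -> all "speech"
frame_by_frame = viterbi(cues, trans_weight=0.0)  # -> flips on the weak frame
```

The same decode works whether the per-frame scores come from energy cues, periodicity cues, or phonological features, which is exactly the kind of principled combination the talk advocates.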

Bio:

Eric Fosler-Lussier is an Associate Professor of Computer Science and Engineering, with a courtesy appointment in Linguistics, at The Ohio State University. After receiving a B.A.S. (Computer and Cognitive Science) and B.A. (Linguistics) from the University of Pennsylvania in 1993, he received his Ph.D. in 1999 from the University of California, Berkeley, performing his dissertation research at the International Computer Science Institute under the tutelage of Prof. Nelson Morgan. He has also been a Member of Technical Staff at Bell Labs, Lucent Technologies, and a Visiting Researcher at Columbia University. In 2006, Prof. Fosler-Lussier was awarded an NSF CAREER award, and in 2010 was presented with a Lumley Research Award by the Ohio State College of Engineering. He is also the recipient (with co-author Jeremy Morris) of the 2010 IEEE Signal Processing Society Best Paper Award.  He has published over 90 papers in speech and language processing, is a member of the Association for Computational Linguistics, the International Speech Communication Association, and a senior member of the IEEE.

Fosler-Lussier serves on the IEEE Speech and Language Technical Committee (2006-2008, 2010-2013), as well as on the editorial boards of the ACM Transactions on Speech and Language Processing and the Journal of Experimental Linguistics. He is generally interested in integrating linguistic insights as priors in statistical learning systems.

Apr 21

Bonnie Webber
University of Edinburgh

Discourse Structures and Language Technologies


This talk tells a story about computational approaches to discourse structure. Like all such stories, it takes some liberty with actual events and times, but I think stories put things into perspective, and make it easier to understand where we are and how we might progress.

In Part 1, I show that early computational work on discourse structure aimed to assign a simple tree structure to a discourse. At issue was what its internal nodes corresponded to. The debate was fierce, and suggestions that other structures might be more appropriate were ignored or subjected to ridicule.  The main uses of discourse structure were text generation and summarization, but mostly in small-scale experiments.

In Part 2, I describe several different types of discourse structure receiving computational attention, though perhaps not always clearly distinguished. There is an increasing number of credible efforts aimed at recognizing these structures automatically, though performance on unrestricted text still resembles that of the early days of robust parsing. Generic applications are also beginning to appear, as researchers recognize the value of these structures to tasks of interest to them.

In Part 3, I argue for the need for a middle ground between approaches held hostage to theory and empirical approaches free of theory. An empirical approach underpinned by theory will not only motivate sensible back-off strategies in the face of unseen data, but also enable us to understand how the different discourse structures inter-relate and thereby exploit their mutual recognition. This should allow more challenging applications, such as improving the performance of statistical machine translation (SMT) through the extended locality of discourse structures and the linguistic phenomena they correlate with.

Bio:
Bonnie Webber is a Professor of Informatics at Edinburgh University. She is known for computational work on discourse and question answering, and also for research on animation from instructions, medical decision support systems and biomedical text processing. She is a Fellow of the Royal Society of Edinburgh and the American Association for Artificial Intelligence. She has given recent keynote talks on computational approaches to discourse at ACL'2009 in Singapore and at the International Conference on Natural Language Processing in Hyderabad, and (with Markus Egg and Valia Kordoni) a recent tutorial on computational approaches to discourse structure at ACL'2010.

Apr 22

Mark Steedman
University of Edinburgh

The Statistical Problem of Language Acquisition

 

The talk will report on recent work with Tom Kwiatkowski, Sharon Goldwater, and Luke Zettlemoyer on semantic parser induction by machine from a number of corpora pairing sentences with logical forms, including a corpus consisting of real child-directed utterances from the CHILDES corpus.

The problem of child language acquisition is often identified as a "logical" problem.  The term refers to the fact that children learn language rapidly from exposure to a sample of utterances in the language, yet seem to need access to some source of information other than mere positive examples of the sentences of the language.

The most obvious candidate for this other source of information, at least in the earliest stages of language acquisition, is representations of meaning, in the form of logical forms supported by the contextual situation, available in the case of the child on the basis of pre-linguistic sensory-motor cognition (in which we should include certain kinds of social cognition).  In the case of machines, there has been some success recently on this task for datasets such as GeoQuery and ATIS, including multi-lingual versions.

The talk argues that the problem of language acquisition interpreted in this way is similar to the problem of inducing a grammar and a parsing model from a treebank such as the Penn Treebank, except that a) the trees are unordered logical forms, in which the preterminals are not aligned with words in the target language, and b) there may be noise and spurious distracting logical forms supported by the context but irrelevant to the utterance.

The talk shows that this class of problem can be solved if the child or machine initially parses with the entire space of possibilities that universal grammar allows under the assumptions of the Combinatory Categorial theory of grammar (CCG), and learns a generative statistical parsing model for that space using EM algorithm-related methods such as Variational Bayes learning.

This can be done without all-or-none "parameter-setting" or attendant "triggers", and without invoking any "subset principle", provided the system is presented with a representative sample of reasonably short utterances from the target language.
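The core statistical idea, learning word-meaning correspondences with no alignment supervision, can be sketched with a tiny IBM Model 1-style EM learner over utterances paired with unordered logical-form symbols. The data and symbols below are invented; the systems in the talk learn full CCG derivations, not just lexical alignments.

```python
pairs = [(["you", "like", "doggies"], ["LIKE", "DOG"]),
         (["doggies", "bark"], ["BARK", "DOG"]),
         (["you", "like", "milk"], ["LIKE", "MILK"])]
words = sorted({w for ws, _ in pairs for w in ws})
syms = sorted({s for _, ss in pairs for s in ss})

# p[s][w]: probability that symbol s is expressed by word w, initialized uniform
p = {s: {w: 1.0 / len(words) for w in words} for s in syms}

for _ in range(50):                      # EM iterations
    counts = {s: {w: 0.0 for w in words} for s in syms}
    for ws, ss in pairs:
        for w in ws:
            total = sum(p[s][w] for s in ss)
            for s in ss:                 # E-step: fractional alignment counts
                counts[s][w] += p[s][w] / total
    for s in syms:                       # M-step: renormalize per symbol
        z = sum(counts[s].values())
        p[s] = {w: c / z for w, c in counts[s].items()}

best_word = {s: max(p[s], key=p[s].get) for s in syms}
```

After a few iterations the symbols compete for words across utterances: DOG claims "doggies" (seen in two contexts), freeing BARK to claim "bark" and MILK to claim "milk", with no example ever labeled.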

Bio:
Mark Steedman is Professor of Cognitive Science in the School of Informatics at the University of Edinburgh, to which he moved in 1998 from the University of Pennsylvania, where he taught for many years as Professor in the Department of Computer and Information Science.  He is a Fellow of the British Academy, the Royal Society of Edinburgh, and the American Association for Artificial Intelligence.

His research covers a range of problems in computational linguistics, artificial intelligence, computer science, and cognitive science, including syntax and semantics of natural language, and parsing and comprehension of natural language discourse by humans and by machine using Combinatory Categorial Grammar (CCG).  Much of his current NLP research concerns wide-coverage parsing for robust semantic interpretation and natural language inference.  Some of his research concerns the analysis of music by humans and machines.

Apr 29

Jerome Bellegarda
Apple Inc.

A Guided Tour of Latent Semantic Mapping

 

Originally formulated in the context of information retrieval, latent semantic analysis exhibits three main characteristics: (i) words and documents (i.e., discrete entities) are mapped onto a continuous vector space; (ii) this mapping is determined by global correlation patterns; and (iii) dimensionality reduction is an integral part of the process. Because such fairly generic properties may be advantageous in a variety of different contexts, this has sparked interest in a more inclusive interpretation of the underlying paradigm. The outcome is latent semantic mapping, a data-driven framework for modeling global relationships implicit in large volumes of data. The purpose of this talk is to give a broad overview of the framework, highlight the possibilities it offers for general feature extraction, and underscore the multi-faceted benefits it can bring to a number of problems in speech and language processing. We conclude with a discussion of the inherent trade-offs associated with the approach, and some perspectives on its likely role in information extraction going forward.
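A stripped-down illustration of the mapping step: build a word-by-document count matrix and use power iteration to recover its top singular direction, i.e. the first dimension of the latent space. The three-document corpus is a standard toy example, and real LSM uses a full weighted, truncated SVD rather than a single unweighted dimension.

```python
docs = ["shipment of gold damaged in a fire",
        "delivery of silver arrived in a silver truck",
        "shipment of gold arrived in a truck"]
vocab = sorted({w for d in docs for w in d.split()})
A = [[d.split().count(w) for d in docs] for w in vocab]   # term x document counts

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def norm(v):
    return sum(x * x for x in v) ** 0.5

At = [list(col) for col in zip(*A)]      # document x term transpose

# power iteration on A^T A converges to the top right singular vector of A,
# whose components are the documents' coordinates along the first latent dimension
v = [1.0] * len(docs)
for _ in range(100):
    v = matvec(At, matvec(A, v))
    n = norm(v)
    v = [x / n for x in v]
doc_coords = v
```

Discrete entities (here, documents; equally well, words via the left singular vectors) end up as points in one continuous space determined by global co-occurrence patterns, which is characteristics (i) and (ii) from the abstract in miniature.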

Bio:
Jerome R. Bellegarda is currently Apple Distinguished Scientist in Human Language Technologies at Apple Inc., Cupertino, California. His general interests span voice-driven man-machine communications, multiple input/output modalities, and multimedia knowledge management. In these areas he has written approximately 150 publications, and holds about 50 U.S. and foreign patents. He has served on many international scientific committees, review panels, and advisory boards. In particular, he has worked as Expert Advisor on speech technology for both the National Science Foundation and the European Commission, was Associate Editor for the IEEE Transactions on Audio, Speech and Language Processing, served on the IEEE Signal Processing Society Speech Technical Committee, and is currently an Editorial Board member for both Speech Communication and the ACM Transactions on Speech and Language Processing. He is a Fellow of the IEEE.