Place: Wean Hall 5409
Assistant: Phyllis Pomerantz (email@example.com). Unless a different person is named below, please contact Phyllis for appointments with the speakers.
9/26/03, Wean Hall 7220, 3:30pm
Note unusual date & room
Jason Hartline (This talk is part of the theory seminar series, but I highly recommend it to AI seminar series participants also.)
CMU Computer Science Department
UC Berkeley, CS
UMass Amherst, Computer Science
UC Berkeley, CS
Generalized Mean Field Inference in Graphical Models (For appointments, contact Jill Lentz [firstname.lastname@example.org].)
School of Computer Science and Engineering and
Note unusual date & room
Leslie Pack Kaelbling
MIT Computer Science and Artificial Intelligence Lab
Computational Induction of Explanatory Process Models
TBA (Brain science & machine learning)
One feature of the classical treatment of optimization problems is that the space over which the optimization is being performed, i.e., the input description of the problem, is assumed to be publicly known to the optimizer. This assumption does not always accurately represent reality, for example, in applications involving resource sharing and electronic commerce on the Internet where transactions must occur among parties with diverse and selfish interests. In such settings, the inputs to many optimizations being performed are not publicly known in advance. Instead they must be solicited from companies, computerized agents, individuals, etc. that may act selfishly to promote their own self-interests. In particular, they may lie about their values or may not adhere to specified protocols if it benefits them. One of the most fundamental problems where the selfish interests of the participants comes into play is in auctions. In this talk I will review the worst case competitive analysis of auctions for digital goods. This review will include a review of the necessary game theoretic notions, the solution concept of truthful mechanism design, and the framework for competitive analysis of auctions. I will then pose two open problems relating to this work and present what is known about them. The first open problem I will present is on the existence of competitive deterministic auctions. On this topic, I will show that there is no symmetric deterministic auction that is competitive. On the other hand, I will show that there is a competitive auction that uses only 1 random bit. The open question left is whether there exists asymmetric deterministic auctions which are competitive. The second open problem I will present is on the optimal competitive ratio. The best known auction achieves a competitive ratio of 3.39. I will discuss a lower bound on the competitive ratio of 2.42. Furthermore, I will present two simplifications of the auction problem, one for which the optimal competitive ratio is known, and the other for which it is not known.
(This talk presents work that is joint with Amos Fiat, Andrew Goldberg,
Anna Karlin, and Mike Saks)
We analyze incentive compatible (truthful) mechanisms over restricted domains of preferences, the leading example being combinatorial auctions. Our work generalizes the characterization of Roberts (1979) who showed that truthful mechanisms over unrestricted domains with at least 3 possible outcomes must be "affine maximizers". We show that truthful mechanisms for combinatorial auctions (and related restricted domains) must be "almost affine maximizers" if they also satisfy an additional requirement of "independence of irrelevant alternatives". This requirement is without loss of generality for unrestricted domains as well as for auctions between two players where all goods must be allocated. This implies unconditional results for these cases, including a new proof of Roberts' theorem. The computational implications of this characterization are severe, as reasonable "almost affine maximizers" are shown to be as computationally hard as exact optimization. This implies the near-helplessness of such truthful polynomial-time auctions in all cases where exact optimization is computationally intractable.
See also http://www.cs.huji.ac.il/~tron for more information.
Joint work with Ahuva Mu'alem and Noam Nisan.
Increasingly, learning systems are called upon to not merely predict,
but to take actions that affect the world. The fields of planning and
control are meanwhile asked to make good sequential decisions in
complex problems where simple analytic models are unavailable. This
confluence of learning and control has driven research to identify
algorithms that can search for good strategies while remaining
efficient in both sample and computational resources.
Policy search approaches are becoming a popular alternative to value
function methods in part because they gracefully handle both large
state-spaces and partial observability. I'll discuss two recent
developments that transplant techniques from supervised learning to
learning policies. First, I'll show how notions from information
geometry give simple algorithms with dramatically better performance
than existing reinforcement learning algorithms. Next, I'll discuss a
new approach that discards Bellman equations and searches directly in
a space of policies while retaining the central insight of dynamic
programming. With appropriate domain knowledge, this algorithm is
efficient in both computational and sample complexity and provides
much stronger guarantees than classical value-function approximation
can. Examples will illustrate how each theoretical advance leads to
state-of-the-art empirical performance on difficult benchmark
J.A. (Drew) Bagnell is a doctoral candidate in Robotics and an NSF
graduate fellow at Carnegie Mellon University. His interests include
rich probabilistic models for statistical learning, robust and
stochastic control, and the application of machine learning to large,
partially observable planning and control problems.
I will discuss the problem of learning topic hierarchies from text data. Model selection in this domain is daunting---which of the large collection of possible trees to use? Taking a Bayesian approach, we have developed the nested Chinese restaurant process, a flexible, nonparametric prior on tree structures which is built via a collection of distributions on partitions of integers. This prior allows arbitrarily large branching factors and readily accommodates growing data collections.
We have used this distribution as a prior in a hierarchical variant of latent Dirichlet allocation (LDA). LDA is a probabilistic graphical model of text collections in which documents can exhibit multiple latent topics. In the hierarchical variant, these topics lie in a hierarchy which itself is random and unknown. A first approach to inference in such a model is a straightforward Gibbs sampling algorithm which finds an approximate posterior distribution on topic hierarchies induced by a collection of documents. This yields interesting results when applied to a dataset of abstracts from the NIPS conferences.
(Joint work with Thomas Griffiths, Michael Jordan, and Josh Tenenbaum)
David Blei is a doctoral candidate in the Computer Science Department at U.C. Berkeley. He is interested in probabilistic graphical models applied to problems in text-processing and information retrieval.
extraction is the process of filling a structured database
from unstructured text. It is a difficult statistical and
computational problem often involving hundreds of thousands of
variables, complex algorithms, and noisy and sparse data. Recently
there has been significant interest in conditionally-trained
alternatives to joint probabilistic models such as HMMs. In this talk
I will briefly review previous work in finite state, Conditional
Random Field (CRF) models for information extraction, and then
describe four pieces of more recent work: (1) the application of CRFs
to extraction of tables from government reports, (2) feature induction
for these models, applied to named entity extraction, (3) a
random field method for noun co-reference resolution that has strong
ties to graph partitioning, (4) an extension of CRFs to factorial
state representation, enabling simultaneous part-of-speech tagging and
work with colleagues at UMass (Ben Wellner, Khashayar
Rohanimanesh, Charles Sutton, David Pinto, Xing Wei, Wei Li, Bruce Croft),
CMU (John Lafferty), and UPenn (Fernando Pereira).
McCallum is an Associate Professor at
and Development at WhizBang Labs, a company that used machine learning
for information extraction from the Web. In the late 1990's he was a
Research Scientist and Coordinator at
receiving his PhD from the
editorial board of the Journal of Machine Learning Research.
For the past
eight years, McCallum has been active in research on statistical machine
learning applied to text, especially information extraction, document
classification, finite state models, and learning from combinations of
labeled and unlabeled data. Web page: http://www.cs.umass.edu/~mccallum.
Recent years have seen increasing popularity of modeling complex phenomena in applied fields using "probabilistic graphical models", a formulism that exploits the conjoined talents of graphical theory and probability theory to build complex models out of simpler pieces. In this talk, I discuss a class of generalized mean field (GMF) inference algorithms for approximate inference in exponential family graphical models, and a novel combination of graph partitioning (GP) algorithms with the GMF algorithms, which potentially leads to a generic, autonomous, and distributed algorithm for deterministic approximate inference in arbitrary graphical models. I discuss the mathematical underpinnings for the GMF algorithms and formal relationships between GMF and GP. I also discuss applications of the GMF algorithms in both toy examples and a large-scale computational biology problem in genomic sequence analysis.
(Joint work with Michael Jordan and Stuart Russell.)
Eric Xing received his B.S. with honors in Physics and Biology from
Tsinghua university, his Ph.D. in Molecular Biology and Biochemistry
Science at UC Berkeley. His early work in molecular biology focused on
the genetic mechanisms of human carcinogenesis and the mutational
spectrum of tumor suppressor genes. Then he moved into machine
learning and has worked on probabilistic graphical models, approximate
inference and pattern recognition. He is interested in studying
biological problems (in particular, systems biology, genetic inference
and evolution) using statistical learning approaches, theory and
application of graphical models, nonparametric Bayesian analysis and
Poker is a challenging
problem for AI research: multiple agents (up to 10), stochastic element (cards
being dealt), imperfect information (don't know the opponent's cards), user modeling
(identifying player patterns), and risk management (betting decisions). For
over 10 years the
1) knowledge-based system,
3) game theory, and
4) tree searching with learning.
The prospects of a program successfully challenging
the best human players in the near future is excellent.
In this talk we will motivate the research, compare the
four different program designs, and discuss our future directions.
Jonathan Schaeffer is a
professor in Computing Science at the
A fundamental issue in computational learning theory, as
well as in
biological and cognitive information processing, is the best possible
relationship between model representation complexity and its prediction
accuracy. Clearly, we expect more complex models that require longer data
representation to be more accurate. Can one provide a quantitative - yet
general - formulation of this tradeoff? In this talk I will discuss this
this trade-off can be traced back to the basic duality between source and
channel coding and is also related to the notion of "coding with side
information". Moreover, using some resent results I will argue that when
formulated properly (as an "Information Bottleneck") the optimal
representation form a "perfectly matched source-channel" and is achievable
without delays or block-coding. I will review the theoretical achievability
results for such relevant data representations and discuss some of our
algorithms for extracting them. I will then demonstrate the application of
these ideas for the analysis of natural language corpora and speculate on
possibly-universal scaling between meaningful information and word-entropy
in human communication, that they reveal.
Over the last several years, a number of authors have developed graph-theoretic or network models for large-population game theory and economics. In such models, each player or organization is represented by a vertex in a graph, and payoffs and transactions are restricted to obey the topology of the graph. This allows the detailed specification of rich structure (social, technological, organizational, political, regulatory) in strategic and economic systems.
In this talk, I will survey these models and the attendant algorithms for certain basic computations, including Nash, correlated, and Arrow-Debreu equilibria. Connections to related topics, such as Bayesian and Markov networks for probabilistic modeling and inference, will be discussed. Time permitting, I will briefly discuss some recent experimental work marrying this general line of thought with topics in social network theory.
Many web recommendation systems direct users to webpages, from a single website, that other similar users have visited. By contrast, our WebIC web recommendation system is designed to locate "information content (IC) pages" --- pages the current user needs to see to complete her task --- from essentially anywhere on the web. WebIC first extracts the "browsing properties" of each word encountered in the user's current click-stream --- eg, how often each word appears in the title of a page, or in the "anchor" of a link that was followed, etc. It then uses a user- and site-independent model, learned from a set of annotated web logs acquired in a user study, to determine which of these words is likely to appear in an IC page. We discuss how to use these IC-words to find IC-pages, and demonstrate empirically that this browsing-based approach works effectively.
Joint work with Tingshao Zhu, Gerald Häubl and Bob Price.
After earning a PhD from Stanford, Russ Greiner worked
in both academic and industrial research before settling at the
For more information, see http://www.cs.ualberta.ca/~greiner/WebIC
Dr. Greiner is visiting CALD here at CMU this Spring.
In the early days of AI some researchers proposed that intelligent problem solving could be reduced to the application of general purpose theorem provers to an axiomatization of commonsense knowledge. Although automated first-order theorem proving was unwieldy, general reasoning engines for propositional logic turned out to be surprisingly efficient for a wide variety of applications. Yet many problems of interest to AI involve probabilities or quantification, and would seem to be beyond propositional methods. However, research research has shown that the basic backtrack search algorithm for satisfiability generalizes to a strikingly efficient approach for broader classes of inference. We describe our progress in building the world's fastest engine for model-counting (the core #P-completeproblem) by leveraging the techniques of clause learning and formula cacheing.
Henry Kautz is an Associate Professor in the Department of Computer
Science and Engineering at the
the faculty in the summer of the year 2000 after a career
Labs and AT&T Laboratories, where he was Head of the AI Principles
Research Department. His academic degrees include an A.B. in
and Thought Award" from the International Joint Conference on
Artificial Intelligence and a Fellow of the American Association
for Artificial Intelligence. In 1998 he was elected to the
Executive Council of AAAI, and in 2000 was Program Chair for
the AAAI National Conference.
In the last 10 years, the combination of techniques from machine
learning and operations research has allowed major advances in learning
and planning for uncertain environments. Reasonably large problems
can be solved using current techniques. But what if we want to scale
up to the uncertain learning and planning problem that you face every
day? It is many orders of magnitude larger than the biggest problem we
can solve currently.
In this talk, I'll describe three early pieces of work that try to
begin to address working in truly huge environments. The first is a
method for learning probabilistic rules to describe naive physics
models of the interactions between objects. The second is an uncertain
planning algorithm that uses the rules learned by the first method to
construct contingency plans that consider enough cases to perform
robustly, but are much smaller than complete policies. The last piece
is preliminary work on combining multiple abstraction methods
dynamically, in order to allow an agent to have a working model of the
environment that changes focus depending on the current situation.
Our perceptual systems are bewildering in their complexity. Although a
great deal has been learned about their structure and organization,
insights into the underlying information processing algorithms have
proved elusive. Recent theoretical work, however, has begun to shed
light on some of the underlying computational principles and has
provided explanations for both the properties of early perceptual
representations and their relationship to the natural environment. A
motivating hypothesis of these theories is that part of what makes
biological sensory systems so robust and powerful is their ability to
learn cues and features from the statistical regularities of natural
signals. By developing probabilistic models and algorithms for
learning structure from these regularities, we will both be able to
develop more robust artificial systems, and gain insight into the types
of representations and computations used by biological systems. I will
show that a simple set of theoretical principles can be used to learn
optimal codes for both natural images and sounds, and that these
representations can explain a wide range of complex neural properties
and provide new insights into their organization. I will also show
that this framework can be extended in order to learn abstract sensory
properties and invariant features. This framework allows us to make
novel predictions about the functional roles of many aspects of
biological visual and auditory systems.
Joint work with Yan Karklin and Evan Smith.
Dr. Lewicki received his B.S. degree in mathematics and cognitive
computation and neural systems from the California Institute of
Technology, and did postdoctoral studies in the Computational
Neurobiology Laboratory at the Salk Institute. His research involves
the study and development of computational and theoretical approaches
to the representation, processing, and learning of pattern structure in
natural visual and acoustic environments. He is the recipient of an
NSF CAREER award for his multidisciplinary work on the representation
of natural auditory scenes. He is currently an assistant professor in
the Computer Science Department at
the CMU-University of
Computational Induction of Explanatory Process Models
The growing amount of scientific data has led to the increased use of computational discovery methods to understand and interpret them. However, most work has relied on knowledge-lean techniques like clustering and classification learning, which produce descriptive rather than explanatory
models, and it has utilized formalisms developed in AI or statistics, so that results seldom make contact with current theories or scientific notations. In this talk, I present an approach to computational discovery that encodes explanatory scientific models as sets of quantitative processes, simulates these models' behavior over time, incorporates background knowledge to constrain model construction, and induces these models from time-series data in a robust manner. I illustrate this framework on data and models from Earth science and microbiology, two domains in which explanatory process accounts occur frequently. In closing, I describe our progress toward an interactive software
environment for the construction, evaluation, and revision of such explanatory scientific models.
This talk describes joint work with Kevin Arrigo, Nima Asgharbeygi,