models overview slides
Factor Graph Propagation slides
whiteboard jpg image
An Introduction to Variational Methods for Graphical Models
Michael Jordan, Zoubin Ghahramani, Tommi Jaakkola, and Lawrence Saul
Tutorial on variational approximation methods
J. M. Winn. Variational Message Passing and its Applications.
Ph.D. Thesis, Department of Physics, University of Cambridge, 2003
Chapter 2 and Chapter 5
Wiegerinck, W. Variational approximations between mean field theory and the junction tree algorithm, UAI-2000.
M. J. Wainwright, and M. I. Jordan. Graphical models, exponential families, and variational inference. UC Berkeley, Dept. of Statistics, Technical Report 649. September, 2003.
(Note: starts at 4:30pm, half an hour later than usual)
Yuan Qi: Intro to EP
A Family of Algorithms for Approximate Bayesian Inference
Thomas Minka, Ph.D. Thesis, Ch. 4
Tree-structured Approximations by Expectation Propagation,
Thomas Minka and Yuan Qi, NIPS 2003
EP for dynamic systems
Expectation Propagation for Signal Detection in Flat-fading Channels,
Yuan Qi and Thomas Minka,
in the Proceedings of the IEEE International Symposium on Information Theory,
June 2003, Yokohama, Japan
Structure learning (1)
Ricardo Silva, Anna Goldenberg
'Learning Bayesian Networks', book by Richard Neapolitan
A Tutorial on Learning with Bayesian Networks
Scheines, R. (1997) "An Introduction to Causal Inference", in Causality in Crisis, ed. by Steven Turner and Vaughan McKim, University of Notre Dame Press. Available at
(a nice overview of representational issues that are very relevant for structure learning)
Learning Bayesian Networks with Local Structure
Nir Friedman and Moises Goldszmidt
May 4
Bayesian Error-Bars for Belief Net Inference
(notice room change: NSH1507)
graphical model results
A Bayesian Belief Network (BN) models a joint distribution over a set of n
variables, using a DAG structure to represent the immediate dependencies
between the variables, and a set of parameters (aka "CPTables") to represent
the local conditional probabilities of a node, given each assignment to its
parents. In many situations, these parameters are themselves random variables
--- this may reflect the uncertainty of the domain expert, or may come from a
training sample used to estimate the parameter values. The distribution over
these "CPtable variables" induces a distribution over the response the BN
will return to any "What is Pr(Q=q | E=e)?" query. This paper investigates
properties of this response: showing first that it is asymptotically normal,
then providing, in closed form, its mean and asymptotic variance. We then
present an effective general algorithm for computing this variance, which has
the same complexity as simply computing (the mean value of) the response
itself --- i.e., O(n 2^w), where w is the effective tree width. Finally, we
provide empirical evidence that a Beta approximation works much better than
the normal distribution, especially for small sample sizes, and that our
algorithm works effectively in practice, over a range of belief net
structures, sample sizes and queries.
This is joint work with Tim Van Allen, Ajit Singh and Peter Hooper.
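The Beta moment-matching idea from the abstract can be illustrated on a one-node toy net. This is a sketch only, not the paper's algorithm: the Beta(4, 3) posterior over the single CPT entry and the Monte Carlo sample size are illustrative assumptions.

```python
import random

# Toy illustration (not the paper's algorithm): a one-node "net" whose
# CPT entry theta = Pr(Q=1) has a Beta posterior after a small training
# sample. The query response Pr(Q=1) is then itself a random variable;
# we match its mean and variance with a Beta, the approximation the
# abstract says beats a normal for small sample sizes.

def moment_match_beta(mean, var):
    """Return (alpha, beta) of the Beta with the given mean and variance."""
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common

def sample_mean_var(a, b, n=50_000, seed=0):
    """Monte Carlo estimate of the mean and variance of a Beta(a, b)."""
    rng = random.Random(seed)
    draws = [rng.betavariate(a, b) for _ in range(n)]
    m = sum(draws) / n
    v = sum((x - m) ** 2 for x in draws) / (n - 1)
    return m, v

# Assumed posterior: 3 positive, 2 negative cases with a uniform prior
# gives Beta(4, 3); moment matching should recover roughly (4, 3).
m, v = sample_mean_var(4, 3)
alpha, beta = moment_match_beta(m, v)
print(round(alpha, 1), round(beta, 1))
```

In the full algorithm the mean and variance come from the closed-form expressions over the whole network rather than from sampling; moment matching to a Beta is the final step either way.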
May 11
A Structural Extension to Logistic Regression: Discriminative Parameter Learning of Belief Net Classifiers
Bayesian belief nets (BNs) are
often used for classification tasks ---
typically to return the most likely class label for each specified instance.
Many BN-learners, however, attempt to find the BN that maximizes a different
objective function --- viz., likelihood, rather than classification accuracy
--- typically by first learning an appropriate graphical structure, then
finding the maximal likelihood parameters for that structure. As these
parameters may not maximize the classification accuracy, "discriminative
learners" follow the alternative approach of seeking the parameters that
maximize *conditional likelihood* (CL), over the distribution of instances the
BN will have to classify. This presentation first formally specifies this
task, and shows how it extends standard logistic regression. After analyzing
its inherent sample and computational complexity, we present a general
algorithm for this task, ELR, that applies to arbitrary BN structures and
works effectively even when given incomplete training data. We present
empirical evidence that ELR produces better classifiers than are produced by
the standard "generative" algorithms in a variety of situations, especially
in common situations where the given BN-structure is incorrect.
This is joint work with Wei Zhou, Xiaoyuan Su and Bin Shen.
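The abstract notes that maximizing conditional likelihood over a fixed BN structure extends logistic regression; indeed, for a naive-Bayes structure with complete data the two coincide. The sketch below shows that special case only (gradient ascent on the conditional log-likelihood), with toy data and learning-rate settings that are my own assumptions, not ELR itself.

```python
import math

# Sketch of the special case: discriminative parameter learning for a
# naive-Bayes structure with complete data reduces to logistic
# regression, i.e. gradient ascent on sum_i log Pr(c_i | x_i).
# Parameters are tuned for Pr(class | features), not joint likelihood.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_conditional(data, lr=0.5, iters=2000):
    """Gradient ascent on the conditional log-likelihood; returns (w, b)."""
    d = len(data[0][0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        gw, gb = [0.0] * d, 0.0
        for x, c in data:
            # Gradient of log Pr(c | x) is (c - p) times the features.
            err = c - sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            for j in range(d):
                gw[j] += err * x[j]
            gb += err
        w = [wi + lr * gi / len(data) for wi, gi in zip(w, gw)]
        b += lr * gb / len(data)
    return w, b

# Hypothetical toy data: the class follows feature 1; feature 2 is noise.
data = [([1, 1], 1), ([1, 0], 1), ([0, 1], 0),
        ([0, 0], 0), ([1, 1], 1), ([0, 0], 0)]
w, b = fit_conditional(data)
p = sigmoid(b + w[0])          # Pr(class=1 | x = (1, 0))
print(p > 0.5)                 # → True
```

ELR generalizes this ascent to arbitrary BN structures and incomplete data, where the conditional likelihood is no longer a simple logistic form.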
Structure learning (2)
Ricardo Silva, Anna Goldenberg
Scheines, R. (1997) "An Introduction to Causal Inference", in Causality
in Crisis, ed. by Steven Turner and Vaughan McKim, University of Notre
Dame Press. Available at
(a nice overview of representational issues that are very relevant for
structure learning)
Friedman, N. (1997). Learning belief networks in the presence of missing
values and hidden variables.
In Fourteenth International Conference on Machine Learning (ICML), 1997.
Friedman, N. and Koller, D. (2003). Being Bayesian about Network
Structure: A Bayesian Approach to Structure Discovery in Bayesian
Networks. Machine Learning, 50:95-126, 2003.
Elidan, G., Lotner, N., Friedman, N., and Koller, D. Discovering hidden
variables: a structure-based approach. Proceedings of the Neural
Information Processing Systems conference (NIPS), 2000.
Silva, R.; Scheines, R.; Glymour, C. and Spirtes P. (2003) "Learning
measurement models for unobserved variables". Proceedings of the 19th
Conference on Uncertainty in Artificial Intelligence.
N. Friedman, D. Pe'er, and I. Nachman
Learning Bayesian Network Structure from Massive Datasets: The "Sparse
Candidate" Algorithm.
UAI 15, 1999.
A. Moore and Weng-Keen Wong, Optimal Reinsertion: a new search operator for
accelerated and more accurate Bayesian network structure learning, ICML
2003.
A. Goldenberg and A. Moore, Tractable Learning of Large Bayes Net
Structures from Sparse Data, ICML 2004
(do we want to cover this?)
These readings cover the bulk of the causality literature, which concerns estimating causal effects from a structure given in advance.
David Edwards (2000): "Causal Inference", Chapter 8 (pp. 219-243) of his
book "Introduction to Graphical Modelling" (Springer, 2nd ed.).
Phil Dawid (2000): Causal inference without counterfactuals. J. Amer. Statist. Assoc. 95, 407-448. An earlier version is available for download at http://www.homepages.ucl.ac.uk/~ucak06d/reports.html, number 188 (year 1997).
J. Pearl, "Statistics and Causal Inference: A Review" In Test Journal,
Vol. 12(2), pp. 281-345, December 2003 (with discussions). Available at
J. Pearl, "Simpson's paradox: An anatomy". Extracted from Chapter 6
of CAUSALITY. Available at http://bayes.cs.ucla.edu/R264.pdf
dynamic Bayes nets,
probabilistic relational models,
other types of graphs,
feature selection for maxent models,
Kernel Conditional Random Fields
Kernel Conditional Random Fields: Representation, Clique Selection, and
Semi-Supervised Learning,
John Lafferty, Yan Liu, Xiaojin Zhu, CMU tech report CMU-CS-04-115
Monte Carlo
Introduction to Monte Carlo Methods
Probabilistic Inference Using Markov Chain Monte Carlo Methods, Radford Neal, 1993
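The MCMC readings above center on one core loop: propose a local move and accept it with probability min(1, p(x')/p(x)). A minimal Metropolis sampler targeting a standard normal (my own toy target, not an example from the readings) makes the loop concrete:

```python
import math
import random

# Minimal Metropolis sampler: random-walk proposals, accepted with
# probability min(1, p(x')/p(x)); the chain's samples approximate the
# target distribution, here an (unnormalized) standard normal.

def metropolis(log_p, x0=0.0, steps=20_000, scale=1.0, seed=1):
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(steps):
        prop = x + rng.gauss(0.0, scale)        # random-walk proposal
        if rng.random() < math.exp(min(0.0, log_p(prop) - log_p(x))):
            x = prop                            # accept; else stay put
        samples.append(x)
    return samples

log_normal = lambda x: -0.5 * x * x             # log p up to a constant
s = metropolis(log_normal)[5000:]               # discard burn-in
mean = sum(s) / len(s)
var = sum((x - mean) ** 2 for x in s) / len(s)
print(round(mean, 1), round(var, 1))            # near (0.0, 1.0)
```

Note that only the ratio p(x')/p(x) is needed, so the normalizing constant never has to be computed; this is exactly what makes MCMC usable for posterior inference in graphical models.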
Zoubin's 2003 unsupervised learning course website http://www.gatsby.ucl.ac.uk/~zoubin/course03/index.html
Lise's GM reading group http://glue.umd.edu/~acardena/graphmod/
Rutgers GM course: http://www.cs.rutgers.edu/~vladimir/class/cs500gm.html
Learning in Graphical Models (eds, Jordan): http://www.dai.ed.ac.uk/homes/felixa/jordancol.html
Kevin Murphy's reading list: http://www.ai.mit.edu/~murphyk/Bayes/bnintro.html
Luo Si (lsi)
Jian Zhang (jian.zhang)
Jerry Zhu (zhuxj)