Introduction to the Theory of Sets of Probabilities

(Probability Intervals, Belief Functions, Lower Probability, Lower Expectations, Choquet Capacities, Robust Bayesian Methods, etc...)

Fabio Cozman

I hope these pages give a brief but reasonably general presentation of the foundations of theories that handle sets of probability distributions. There are many such theories: Quasi-Bayesian theory, Lower Probability, Lower Expectations, Choquet Capacities, Robust Bayesian Methods, and other similar theories. I regret that I cannot refer to all the good work that has been published on these topics. I have tried to refer to some representative papers and books, mostly of foundational character; most of them were written before 1993.

Two main pieces of information are linked from this page:

The one-page Theory in a nutshell.
A longer, but still quite informal, description of the theory of sets of distributions (and related theories).

There are two other sources of material on these topics on the web (you can find more recent references through them):

The web site for the First Symposium on Imprecise Probabilities and Their Applications (which I co-organized).
The Imprecise Probabilities Project (which I co-edit).

Most of the content on this web site covers foundational concepts and basic results. A collection of practical results would also be useful, but I think the first step must be to present a consistent theory.

I focus only on proposals that maintain the basic infrastructure of Bayesian theory and augment, enrich, or generalize it. Proposals that require an entirely different view of uncertainty (like Dempster-Shafer theory), or that handle other concepts (like fuzzy logic), are not covered here. Among all the possible theories that use sets of probability distributions to represent uncertainty, there is a particular axiomatization that is very simple to present and understand. It is the axiomatization given by two statisticians, Giron and Rios, in 1980 [2]. Their paper is very nice; they call the resulting theory Quasi-Bayesian theory.

The original theory by Giron and Rios was quite elegant but did not include discussions of conditionalization and independence; they also did not have a clear statement of decision criteria. I try to present their theory and fill in those gaps with ideas that have been proposed in a variety of contexts in the last decade; the goal is to present the theory in a unified format so that its scope can be better analyzed.

I'm aware that there is a lot of excellent work that I have not reviewed; please send me e-mail with a pointer to your work (or other work that you think is relevant). Thanks.

There are postscript versions of the content that you can reach from this page.

Why so many words in the sub-title of this page?

There are several similar generalizations of probability that use sets of probability distributions:

The theories of Lower Expectations and Lower Previsions use intervals of expected losses to generate sets of distributions.
A slightly different approach is taken by theories that impose axioms on events. The resulting structures are generalizations of probability called Lower Probabilities or Choquet Capacities. Special cases of such structures are the Monotone Choquet Capacities and the Lower Envelopes. Infinitely Monotone Choquet Capacities are sometimes called Belief functions. In most cases such structures can be represented by convex sets of probability distributions.
From a slightly different perspective, many statisticians use sets of distributions to study the robustness of a statistical analysis.

These theories have points of divergence, but this work tries to emphasize the points where there is agreement.
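The common ground among these approaches can be illustrated with a small numerical sketch. It assumes a finite space of three outcomes and a set of distributions described by three hand-picked points; the names `CREDAL_SET`, `LOSS`, and the helper functions are hypothetical and are not taken from any of the cited papers. Because expectation is linear in the distribution, the minimum and maximum over the convex hull of the listed points are attained at the listed points themselves, so a finite list is enough here.

```python
from typing import Dict, List, Set, Tuple

Distribution = Dict[str, float]  # outcome -> probability


def expectation(p: Distribution, f: Dict[str, float]) -> float:
    """Expected value of f under a single distribution p."""
    return sum(p[w] * f[w] for w in p)


def lower_upper_expectation(
    credal_set: List[Distribution], f: Dict[str, float]
) -> Tuple[float, float]:
    """Lower and upper expectation of f over a finite set of distributions."""
    values = [expectation(p, f) for p in credal_set]
    return min(values), max(values)


def lower_upper_probability(
    credal_set: List[Distribution], event: Set[str]
) -> Tuple[float, float]:
    """Lower and upper probability of an event (the lower and upper envelopes).

    An event's probability is the expectation of its indicator function.
    """
    indicator = {w: 1.0 if w in event else 0.0 for w in credal_set[0]}
    return lower_upper_expectation(credal_set, indicator)


# Illustrative data only (an assumption, not from the source).
CREDAL_SET: List[Distribution] = [
    {"a": 0.2, "b": 0.3, "c": 0.5},
    {"a": 0.4, "b": 0.4, "c": 0.2},
    {"a": 0.1, "b": 0.6, "c": 0.3},
]
LOSS = {"a": 0.0, "b": 1.0, "c": 2.0}

if __name__ == "__main__":
    print("expected loss interval:", lower_upper_expectation(CREDAL_SET, LOSS))
    print("probability interval of {a, b}:",
          lower_upper_probability(CREDAL_SET, {"a", "b"}))
```

The same set of distributions yields both an interval of expected losses and an interval of probabilities for each event, and the lower probability of an event equals one minus the upper probability of its complement.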

I hope these pages are useful for anyone interested in sets of distributions, but because I work more in Robotics and Artificial Intelligence, I can better understand the theory from this point of view.

My work with sets of probabilities

I work on both foundational and algorithmic issues (with emphasis on the latter). Most of my work on this theory can be grasped through the papers

F. G. Cozman. Credal networks, Artificial Intelligence Journal, vol. 120, pp. 199-233, 2000. (This paper is a mature version of papers presented at UAI97 and UAI98.)
F. G. Cozman. Computing posterior upper expectations, International Journal of Approximate Reasoning, vol. 24, pp. 191-205, 2000.
F. G. Cozman. Calculation of Posterior Bounds Given Convex Sets of Prior Probability Measures and Likelihood Functions, Journal of Computational and Graphical Statistics, vol. 8(4), pp. 824-838, 1999.

Most of the results in the second paper, and a summary of the third paper, can be found in

F. G. Cozman. Computing Posterior Upper Expectations, First International Symposium on Imprecise Probabilities and Their Applications (ISIPTA), pp. 131-140, Ghent, Belgium, June/July, 1999.

My interests in the theory of sets of probabilities follow some general lines. First, I'm interested in efficient algorithms to obtain posterior quantities. I'm now trying to extend such algorithms to deal with more complicated cases; for example, situations where observations may have probability zero, and situations where judgements of independence are stated. Second, I'm interested in concepts and properties of irrelevance/independence connected to the theory of sets of probabilities. A starting point on this:

F. G. Cozman. Separation Properties of Sets of Probability Measures. XVI Conference on Uncertainty in Artificial Intelligence, pp. 107-115, San Francisco, California, July 2000.
F. G. Cozman. Irrelevance and Independence Axioms in Quasi-Bayesian Theory, European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU), London, England, published in Symbolic and Quantitative Approaches to Reasoning with Uncertainty, A. Hunter and S. Parsons (eds.), pp. 128-136, Springer, July, 1999.

While working with these things, I have developed algorithms for the JavaBayes system, where I explore robust inferences with Quasi-Bayesian networks, using both local and global perturbations.

I have also pursued some different directions, looking at the problem of sequential decision making associated with observations, and also exploring the possibility of learning convex sets of probabilities from data.

Basic references

Most of the foundational issues in the theory of sets of probabilities can be absorbed through the work of two researchers:

I. Levi, whose The Enterprise of Knowledge [3] is a great analysis of many philosophical issues related to the theory. If you want Philosophy, you probably want to read this.
P. Walley, whose Statistical Reasoning with Imprecise Probabilities [4] is a tremendous summary of all that has been said about the theory in the field of Statistics (and also tries some connections with Economics and Artificial Intelligence).

These books are very dense and require some background. I'm trying to construct these informal pages for readers who are not entirely familiar with the theory of sets of probabilities, but have already learned some probability and decision theory.

Here are four references that capture a vast portion of the theory of sets of probabilities:

1: J. O. Berger. Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, 1985.
2: F. J. Giron and S. Rios. Quasi-Bayesian behaviour: A more realistic approach to decision making? In J. M. Bernardo, M. H. DeGroot, D. V. Lindley, and A. F. M. Smith, editors, Bayesian Statistics, pages 17-38. University Press, Valencia, Spain, 1980.
3: I. Levi. The Enterprise of Knowledge. The MIT Press, Cambridge, Massachusetts, 1980.
4: P. Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, New York, 1991.

Thanks for the visit. The visitor counter on this page has been running since July 15, 1996.

Colophon

I typed almost all of this document with gnu-emacs using LaTeX commands. The LaTeX documents were converted to postscript with dvips, and to HTML with LaTeX2HTML. The translations from LaTeX to postscript and from LaTeX to HTML were coordinated by some simple makefiles. I drew the figures in CorelDraw (in Windows), and used xv (in Unix) to help me produce the imagemaps (wow, that was quite a pain).

Acknowledgement

This work started at the Robotics Institute at the School of Computer Science, Carnegie Mellon University. I had a scholarship from CNPq (Brazil). Thanks to these organizations!