

Foundation of the Theory of Sets of Probabilities

Decision theory starts from the states, acts and utilities that the acting agent has to specify. For the purposes of this discussion, we can usually represent the states, acts and utilities in a table. For example, you can go to the park, go to the market or stay home, and the weather can be either sunny or cloudy:

         sunny  cloudy
park        10     -10
market      -5       4
home         0       0

How do we choose the best act? Just by looking at the table, we could argue that park is best because it could give us the maximum reward (10); but home could also be best because it never gives us a strong punishment.
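The two informal arguments above correspond to two simple decision rules: pick the act with the best possible payoff, or pick the act with the best worst-case payoff. A minimal Python sketch, assuming the table entries are utilities (higher is better); the names `payoffs`, `best_case_act` and `worst_case_act` are illustrative and do not come from the original text:

```python
# Payoff table from the text: act -> payoffs under (sunny, cloudy).
payoffs = {
    "park": (10, -10),
    "market": (-5, 4),
    "home": (0, 0),
}

def best_case_act(table):
    """Act whose largest payoff is highest (optimistic rule)."""
    return max(table, key=lambda act: max(table[act]))

def worst_case_act(table):
    """Act whose smallest payoff is highest (cautious rule)."""
    return max(table, key=lambda act: min(table[act]))

print(best_case_act(payoffs))   # park
print(worst_case_act(payoffs))  # home
```

Neither rule uses any information about how likely each state is, which is what the probabilistic frameworks discussed next add.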

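The problem of decision theory is to specify how to choose the ``best act''. Bayesian theory has been very successful in this regard [4,5,21,23] as a prescription for what a rational agent should do. The Bayesian framework essentially says that a rational agent represents beliefs by a single probability distribution over the states and chooses the act with the best expected utility.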

The Bayesian framework is derived from a number of axioms that are supposed to apply to decision making.

The next idea is to start with a similar, but more general, set of axioms and generate a convex set of probability distributions, called the credal set [11,18]. When we follow this route, Bayesian theory is a particular case in which we assume that the agent always has a single distribution (the convex set of distributions has a single member).

Modifications of the axioms of the usual Bayesian decision theory have been proposed with a variety of justifications, ranging from psychological observations of human behavior to robustness techniques in statistical analysis. A theory of sets of probabilities represents one of the main ways in which one can relax the Bayesian framework in a principled manner. In Quasi-Bayesian theory, we ask: how can any agent be sure about preferences and decisions to the point that a single probability distribution can be chosen? This appears unreasonable for the kinds of agents that we have to deal with in real life; it also appears unreasonable if we consider agents composed of many entities (like organizations, for example).

In short: a rational agent has a utility function that translates his preferences and a convex set of probability distributions that translates his beliefs.

The Meaning of the Credal Set

Let us study carefully what a convex set of distributions means in terms of preferences. Consider a loss function $l(\cdot)$ and two acts a1 and a2. Since each act is a function of the states, we can obtain the expected loss of an act by picking a probability distribution.

Take a distribution p1. You can obtain the expected values E1[a1] and E1[a2] for the acts.

Take another distribution p2. You can obtain the expected values E2[a1] and E2[a2] for the same two acts.

Suppose E1[a1] < E1[a2] and E2[a1] > E2[a2]. Now a1 and a2 cannot be compared with respect to expected loss.

There is a lot of controversy about what the agent should do at this point; this will be discussed later. Right now, the important point is to understand that, in general, we cannot create a complete order of acts with a convex set of distributions.
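The situation can be reproduced numerically. The sketch below assumes that one act is preferred to another only when its expected loss is lower under every distribution considered; the loss values, the two distributions and the function names are invented for illustration:

```python
def expected_loss(losses, probs):
    """Expected loss of one act under one distribution over states."""
    return sum(l * p for l, p in zip(losses, probs))

def compare(losses_a, losses_b, distributions):
    """Compare two acts under every distribution in the given collection."""
    diffs = [expected_loss(losses_a, p) - expected_loss(losses_b, p)
             for p in distributions]
    if all(d < 0 for d in diffs):
        return "a preferred"      # a has lower expected loss everywhere
    if all(d > 0 for d in diffs):
        return "b preferred"      # b has lower expected loss everywhere
    if all(d == 0 for d in diffs):
        return "indifferent"
    return "incomparable"         # the distributions disagree

# Hypothetical losses over two states, and two hypothetical distributions.
a1 = (0, 10)
a2 = (5, 5)
p1 = (0.8, 0.2)   # E1[a1] = 2 < E1[a2] = 5
p2 = (0.2, 0.8)   # E2[a1] = 8 > E2[a2] = 5

print(compare(a1, a2, [p1]))      # a preferred
print(compare(a1, a2, [p1, p2]))  # incomparable
```

With p1 alone the acts are ordered; once p2 is also admitted, neither act dominates and the order is only partial.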

So an agent that uses a credal set has a partial order of preferences. What is this supposed to mean?

There are two basic ways to look at this situation [30]:

Incomplete beliefs
In this interpretation, the agent could possibly refine beliefs and establish a unique, complete order among acts. In other words, the agent could specify a single probability distribution that would reflect a complete order of acts. That would be the ``true'' distribution. Why doesn't the agent do that in the first place? Here we can have two answers:
Exhaustive beliefs
In this interpretation, the agent has already thought as much as possible about the situation, but still could not specify complete preferences. Some acts are just incomparable for the agent.

So here we have some similar but different interpretations of credal sets. Different interpretations have led to different technical questions and results, so it is important to pay attention to these issues.

A Digression: The Convexity of the Credal Set
Giron and Rios require that their axioms produce a convex set of distributions. A convex set of functions is a set of functions where, if f1 and f2 belong to the set, then any mixture of f1 and f2 also belongs to the set. A convex combination of a set of functions fj is given by $\sum_j a_j f_j$, where the aj are non-negative numbers that sum to unity.
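As a small illustration of the definition, the sketch below forms a convex combination of discrete distributions represented as lists of probabilities. The function name and the example numbers are assumptions made for this sketch:

```python
def convex_combination(functions, weights, tol=1e-9):
    """Return sum_j a_j f_j for equal-length lists f_j and weights a_j."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to one")
    n = len(functions[0])
    return [sum(w * f[i] for w, f in zip(weights, functions))
            for i in range(n)]

# Two hypothetical distributions over (sunny, cloudy).
p1 = [0.8, 0.2]
p2 = [0.2, 0.8]

mix = convex_combination([p1, p2], [0.25, 0.75])
print(mix)  # a distribution lying between p1 and p2
```

A convex combination of probability distributions is again a probability distribution: its entries are non-negative and sum to one.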

Why a convex set? A partial order can be created with a non-convex set of distributions (for example, by picking the boundary of a convex set).

But here is the point: all preferences that are valid with a given set of distributions are also valid if we pick the convex hull of this set! This is due to the linear character of the expected loss operation. Whatever holds for every distribution in a set also holds for all convex combinations of those distributions, hence for the convex hull. In general, the partial order of preferences is unchanged if we take the convex hull of a set of distributions.
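The linearity argument can be written out for two distributions. This is a sketch for a finite set of states $s$, with the mixing weight $\lambda$ and the notation $E_\lambda$ introduced here for illustration:

```latex
% Mixture of p_1 and p_2 with weight \lambda \in [0,1]:
p_\lambda = \lambda p_1 + (1-\lambda) p_2 .

% Expected loss is linear in the distribution:
E_\lambda[a] = \sum_s p_\lambda(s)\, l(a(s))
             = \lambda E_1[a] + (1-\lambda) E_2[a] .

% Hence, if a_1 beats a_2 under both p_1 and p_2,
E_1[a_1] < E_1[a_2] \quad\text{and}\quad E_2[a_1] < E_2[a_2]
\;\Longrightarrow\;
E_\lambda[a_1] < E_\lambda[a_2] \quad\text{for all } \lambda \in [0,1] .
```

The same argument extends to any finite convex combination $\sum_j a_j p_j$, so adding mixtures to a set of distributions never removes a preference that already held for every member.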

What can we make of this fundamental observation? If we justify our theory in terms of preferences, then it seems that there is a strong bias toward convex sets. Convex sets of distributions are the largest sets that induce a particular pattern of preferences. This is the path followed by Quasi-Bayesian theory. The theory is formalized axiomatically in terms of preference axioms, such that convex credal sets arise as the basic representation for beliefs and preferences.

But if we have a different interpretation for sets of distributions, then there may be no reason to take them to be convex. One can construct a theory that explicitly differentiates between sets of distributions when they are convex and non-convex. We will see that this point can be used to discuss independence concepts later.

Reasons to Adopt a Set of Probabilities

In short, there are strong reasons for adopting a set of probabilities as the basic model for uncertainty [25]:

Of course, the theory of sets of probabilities has the advantage of a solid axiomatic foundation (take for example the Quasi-Bayesian theory of Giron and Rios or the theory of coherent lower previsions of Walley), which is no small thing when you consider the number of possible ad hoc approaches to uncertainty.


Fabio Gagliardi Cozman
1999-12-30