This paper concentrates on the connection between data and convex sets of distributions. The results do not rely on the existence of a prior distribution for events. An alternative approach is to use prior distributions and Bayes rule to obtain posterior measures for events. The objective of this paper is not to replace Bayes rule, but rather to enhance one's intuition about probabilities constructed solely from data. There has been work on subjective approaches to the process of learning convex sets of distributions; we mention two approaches that are relevant to Bayesian networks.

The estimation of parameters for a Bayesian network usually has to deal with missing data, i.e., observations for some variables are not collected. The standard Bayesian assumption is that data are missing at random; if this assumption is violated, inferences may be biased. Ramoni and Sebastiani propose to lift the ``missing at random'' assumption [19] in a Bayesian network learning scenario. They consider all possible ways in which missing data could have occurred, and create a convex set of joint distributions that represents the gamut of possibilities for the data actually collected. The idea is to avoid unjustified assumptions, replacing them with sets of distributions, so that the effects of missing data can be evaluated.

The imprecise Dirichlet prior has been proposed by Walley [26] as a model for inferences associated with multinomial sampling. Here we indicate how this model can be used to learn Bayesian networks associated with convex sets of distributions.

An imprecise Dirichlet distribution for a vector-valued variable theta is the set of Dirichlet densities

p(theta) propto prod_{i=1}^{|theta|} theta_{i}^{s t_{i} - 1},

where s > 0 is fixed and the t_{i} range over all values such that t_{i} > 0 and sum_{i} t_{i} = 1.

This class of distributions can be used as a prior credal set;
the prior assumptions are much less restrictive than standard Bayesian assumptions. Note
that for any nontrivial event A, the prior imprecise Dirichlet model induces the vacuous
bounds: the lower probability of A is 0 and the upper probability of A is 1.
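To make the bounds concrete: under Walley's model with fixed s, the predictive probability of an outcome observed n_i times in N trials lies between n_i/(N + s) and (n_i + s)/(N + s); before any data (N = 0) these bounds are 0 and 1. A minimal sketch (the counts and the choice s = 2 are illustrative, not from the paper):

```python
def idm_bounds(n_i, N, s):
    """Lower and upper predictive probability of an outcome under the
    imprecise Dirichlet model, obtained by letting t_i range over (0, 1)."""
    return n_i / (N + s), (n_i + s) / (N + s)

# Prior bounds (no observations) are vacuous:
print(idm_bounds(0, 0, 2))   # (0.0, 1.0)

# After observing the outcome 4 times in 10 trials with s = 2:
lo, hi = idm_bounds(4, 10, 2)
print(lo, hi)                # 1/3 and 1/2
```

As data accumulate, the gap between the bounds, s/(N + s), shrinks toward zero.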

First consider standard Bayesian network learning when complete data is available. A Bayesian network codifies a joint distribution through the expression:

p(x) = prod_{i=1}^{n} p(x_{i} | pa(x_{i})),

where pa(x_{i}) denotes the parents of x_{i} in the network.
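As a sketch of this factorization (the two-node network and all probability values here are hypothetical, not from the paper):

```python
# Hypothetical network A -> B: the joint is p(a, b) = p(a) * p(b | a).
p_a = {0: 0.3, 1: 0.7}                      # p(A)
p_b_given_a = {0: {0: 0.9, 1: 0.1},         # p(B | A = 0)
               1: {0: 0.4, 1: 0.6}}         # p(B | A = 1)

def joint(a, b):
    # Product of the local conditional distributions over all nodes.
    return p_a[a] * p_b_given_a[a][b]

# The factored joint sums to one over all configurations.
total = sum(joint(a, b) for a in (0, 1) for b in (0, 1))
```

The same product form extends to any number of nodes, with each factor conditioned on that node's parents.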

Under the parameter independence assumption, the prior for the parameters Theta factorizes across variables and parent configurations:

p(Theta) = prod_{i=1}^{n} prod_{j=1}^{|pa(x_{i})|} p(theta_{ij}),

where theta_{ij} collects the parameters of p(x_{i} | pa(x_{i}) = j).

Suppose that every vector theta_{ij} is associated with an imprecise Dirichlet prior:

p(theta_{ij}) = D(theta_{ij} | s_{ij}, t_{ij}) propto prod_{k=1}^{|x_{i}|} theta_{ijk}^{s_{ij} t_{ijk} - 1},

where |x_{i}| is the number of values of x_{i}.

Suppose that in the data, n_{ij} observations are made with pa(x_{i}) = j,
and n_{ijk} observations are made with x_{i} = k and pa(x_{i}) = j.
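The counts n_{ij} and n_{ijk} are plain frequency counts over the complete data; a sketch with a hypothetical data set for a single variable x_i:

```python
from collections import Counter

# Hypothetical complete data: each record is
# (j, k) = (parent configuration of x_i, observed value of x_i).
records = [(0, 0), (0, 1), (0, 1), (1, 0), (1, 1), (1, 1), (1, 1)]

n_ij = Counter(j for j, k in records)   # n_ij: records with pa(x_i) = j
n_ijk = Counter(records)                # n_ijk: records with x_i = k, pa(x_i) = j

print(n_ij[1])        # 4
print(n_ijk[(1, 1)])  # 3
```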

Due to the parameter independence assumption and the
convexification convention, the posterior distribution for each theta_{ij} is again
a set of imprecise Dirichlet distributions, with marginals:

p(theta_{ij}) = D(theta_{ij} | s_{ij}', t_{ij}'),

where s_{ij}' = s_{ij} + n_{ij} and t_{ijk}' = (s_{ij} t_{ijk} + n_{ijk}) / (s_{ij} + n_{ij}).
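The posterior parameters follow from ordinary Dirichlet conjugacy applied to each density in the set; a sketch of the update, with an illustrative prior strength s = 2 and illustrative counts:

```python
def idm_update(s, t, counts):
    """Posterior parameters of a Dirichlet(s * t) density after observing
    multinomial counts n: s' = s + sum(n), t_k' = (s*t_k + n_k) / s'."""
    s_new = s + sum(counts)
    t_new = [(s * t_k + n_k) / s_new for t_k, n_k in zip(t, counts)]
    return s_new, t_new

s_new, t_new = idm_update(2, [0.5, 0.5], [4, 6])
print(s_new)   # 12
print(t_new)   # [5/12, 7/12]
```

Applying this update to every t_{ij} in the prior credal set, and then taking bounds over the resulting set, yields posterior lower and upper expectations.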
