Learning the Structure of Probabilistic Networks

Dimitris Margaritis, Sebastian Thrun

November 21, 1998

Problem:

Probabilistic or Bayesian networks (BNs), illustrated in Figure 1, are compact and semantically clean representations of probabilistic independencies and possible dependencies among sets of variables. As such, they can provide insight into the application domain as well as be used to characterize future data. The basic technique for constructing Bayesian networks is for an expert to provide causal relations among variables, which are then converted to a network structure under certain assumptions. However, expert-specified representations can be inaccurate, and manual specification becomes infeasible for large domains. In such cases, automatic structure inference is necessary. The research I am proposing aims to produce efficient algorithms for inferring BN structure from data by exploiting probabilistic relations that may exist among sets of variables.
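To make the representational claim concrete, the following small sketch (illustrative only; the variables and probabilities are invented and are not from this proposal) factors a three-variable joint distribution according to a chain-structured network, storing 5 parameters instead of the 7 needed for the full joint table:

    # Illustrative example: a three-node Bayesian net Cloudy -> Rain -> WetGrass.
    # The network stores one conditional probability table (CPT) per node,
    # and the joint distribution factors as the product of the CPTs.

    parents = {"Cloudy": [], "Rain": ["Cloudy"], "WetGrass": ["Rain"]}

    # P(node = True | parent values): 5 numbers instead of 2^3 - 1 = 7
    # for the full joint table over three binary variables.
    cpt = {
        "Cloudy":   {(): 0.5},
        "Rain":     {(True,): 0.8, (False,): 0.1},
        "WetGrass": {(True,): 0.9, (False,): 0.2},
    }

    def joint(assignment):
        """P(assignment) as the product of local conditional probabilities."""
        p = 1.0
        for node, pa in parents.items():
            p_true = cpt[node][tuple(assignment[q] for q in pa)]
            p *= p_true if assignment[node] else 1.0 - p_true
        return p

    print(joint({"Cloudy": True, "Rain": True, "WetGrass": True}))  # 0.5 * 0.8 * 0.9 = 0.36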

Impact:

Many prediction and classification tasks benefit from accurate modeling of the probabilistic relationships in the underlying representation of a problem. A practical solution to this problem would enable systems with a large number of variables, such as image retrieval or text categorization systems, to recover complex interdependencies among the domain variables and improve their classification accuracy. For example, the growing number of web search engines for multimedia information should benefit from such an advance.

State of the Art:

There are two approaches to recovering structure when all variables are observable. The first and more popular one performs a local, gradient-ascent-style (hill-climbing) search in the space of structures, where each step is an arc addition, removal, or reversal [3]. The conditional probability tables accompanying the structure are computed from data. This process is guaranteed to reach a local maximum of the score function, but not necessarily a global one. The second approach [5] exploits the statistical independences that hold among sets of variables when conditioning on another set, and is the one most closely related to the proposed research.
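The following sketch illustrates the search-based approach under stated assumptions: it presumes a user-supplied scoring function (e.g., BIC) and an acyclicity test, and represents a candidate structure as a frozen set of directed arcs. The function names are illustrative and not taken from [3].

    # Schematic hill-climbing structure search: at each step, apply the
    # single arc addition, removal, or reversal that most improves the
    # score, stopping at a local maximum. `score` and `is_acyclic` are
    # assumed supplied by the caller (e.g., BIC and a cycle check).
    from itertools import permutations

    def neighbors(dag, nodes):
        """All graphs one arc addition, removal, or reversal away."""
        for x, y in permutations(nodes, 2):
            if (x, y) in dag:
                yield dag - {(x, y)}                # removal
                yield (dag - {(x, y)}) | {(y, x)}   # reversal
            elif (y, x) not in dag:
                yield dag | {(x, y)}                # addition

    def hill_climb(nodes, data, score, is_acyclic):
        dag = frozenset()                           # start from the empty graph
        best = score(dag, data)
        while True:
            cands = [g for g in neighbors(dag, nodes) if is_acyclic(g)]
            g = max(cands, key=lambda g: score(g, data), default=None)
            if g is None or score(g, data) <= best:
                return dag                          # local, not necessarily global, maximum
            dag, best = g, score(g, data)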

When hidden variables exist, structure recovery is more difficult, especially since the exact number of hidden variables is usually unknown. Computing the score is also harder because the values of the hidden nodes are unobserved, and conditioning on variables for local structure recovery may no longer be possible. Gradient ascent [4] and EM [2] are the algorithms mainly used in this case.
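As a reminder of why unobserved values complicate estimation, here is the E-step/M-step pattern on the simplest hidden-variable model, a two-component Bernoulli mixture in which a hidden node selects which coin generated each trial. This is illustrative only; it is neither the structural algorithm of [2] nor the gradient method of [4], and all names are invented for the example.

    # Illustrative EM: each trial reports h heads out of n tosses; a hidden
    # variable chooses coin 1 (bias p, prior pi) or coin 2 (bias q).
    from math import comb

    def em_two_coins(heads, n, pi=0.6, p=0.7, q=0.4, iters=50):
        for _ in range(iters):
            # E-step: posterior responsibility that each trial used coin 1.
            w = []
            for h in heads:
                a = pi * comb(n, h) * p**h * (1 - p) ** (n - h)
                b = (1 - pi) * comb(n, h) * q**h * (1 - q) ** (n - h)
                w.append(a / (a + b))
            # M-step: re-estimate parameters from expected counts.
            pi = sum(w) / len(w)
            p = sum(wi * h for wi, h in zip(w, heads)) / (n * sum(w))
            q = sum((1 - wi) * h for wi, h in zip(w, heads)) / (n * (len(w) - sum(w)))
        return pi, p, q

    print(em_two_coins([9, 8, 2, 1, 8], n=10))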

Approach:

At present I am interested in structure discovery in the case where all variables are observed. I propose an algorithm that discovers the local structure when the true network is a polytree. The structure around each node is revealed by examining how its neighborhood sets (maximal sets of nodes that are all probabilistically dependent on each other) change when conditioning on that node. A graph-theoretic connectedness reconstruction then discovers sets of ancestor and descendant nodes, grouped by the link through which they are connected to the node in question. The algorithm runs in time polynomial in the number of nodes.
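A heavily hedged sketch of the conditioning primitive follows; it is not the full reconstruction algorithm. It assumes an independence oracle indep(a, b, S) (in practice a statistical test on data) and, as a simplification, approximates a neighborhood set by a connected component of the pairwise-dependence graph. All names are illustrative.

    # Sketch of the conditioning primitive: compare the grouping of the
    # nodes dependent on x before and after conditioning on x. In a
    # polytree, conditioning on x blocks paths through x (groups split
    # across the different links of x) and activates the collider at x
    # (groups of parents merge).
    from itertools import combinations

    def neighborhood_sets(nodes, cond, indep):
        """Group `nodes` into maximal mutually dependent sets given `cond`,
        approximated here as connected components of pairwise dependence."""
        comp = {v: {v} for v in nodes}
        for a, b in combinations(nodes, 2):
            if not indep(a, b, cond):        # a and b dependent given cond
                merged = comp[a] | comp[b]
                for v in merged:
                    comp[v] = merged
        return {frozenset(s) for s in comp.values()}

    def local_structure(x, nodes, indep):
        dep = [v for v in nodes if v != x and not indep(v, x, frozenset())]
        before = neighborhood_sets(dep, frozenset(), indep)
        after = neighborhood_sets(dep, frozenset({x}), indep)
        return before, after    # how the sets split/merge reveals x's links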

Future Work:

I am interested in extending the algorithm's domain of applicability to general Bayesian networks, which may contain (undirected) cycles. The general problem is NP-complete [1], so it would be interesting to see how well structure discovery can be done in practice. I am also interested in structure discovery in the presence of hidden variables, since that would apply much more broadly, to domains such as vision and image retrieval as well as a plethora of other application areas.


  
[Figure 1: Examples of Bayesian networks: (left) general, (right) polytree.]

Bibliography

1
D. M. Chickering.
Learning Bayesian networks is NP-complete.
Submitted to Proceedings of AI and Statistics, 1995.

2
N. Friedman.
Learning belief networks in the presence of missing values and hidden variables.
In Proceedings of the Fourteenth International Conference on Machine Learning (ICML), 1997.

3
D. Heckerman.
A tutorial on learning Bayesian networks.
Technical Report MSR-TR-95-06, Microsoft Research, Advanced Technology Division, March 1995.

4
S. Russell, J. Binder, D. Koller, and K. Kanazawa.
Local learning in probabilistic networks with hidden variables.
In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pages 1146-1152. Morgan Kaufmann, San Francisco, 1995.

5
P. Spirtes, C. Glymour, and R. Scheines.
Causation, Prediction, and Search.
Springer-Verlag, New York, 1993.
