\documentstyle[psfig]{article}

\newcommand{\normalfig}[2]{ %\begin{figure}
\centerline{\psfig{file=#1.ps,width=0.87\textwidth}}
%\begin{center}
%\begin{minipage}{0.85\textwidth}
%{\footnotesize
%\refstepcounter{figure}
%\label{#1}
%\noindent
%Figure~\ref{#1}: #2}
%\end{minipage}
%\end{center}
%\end{figure}
}

\begin{document}

Bayesian networks are graphical models for representing probability
distributions over multiple variables.  Each variable $X_i$ is
represented as a vertex in a directed acyclic graph (``DAG''); the
probability distribution $P(X_1, X_2, \ldots, X_N)$ is represented in
factorized form as follows:

\[ P(X_1, X_2, \ldots, X_N) = \prod_{i=1}^{N} P(X_i \mid \Pi_{X_i})\]

where $\Pi_{X_i}$ is the set of vertices that are $X_i$'s parents in
the graph.  A Bayesian network is thus fully specified by the combination of:

\begin{itemize}
\item The graph structure, i.e., what directed arcs exist in the graph.
\item The probability table $P(X_i \mid \Pi_{X_i})$ for each variable $X_i$.
\end{itemize}

A small example Bayesian network structure for a (somewhat
facetious/futuristic) medical diagnostic domain is shown below.  This
network might be used to diagnose whether a patient is suffering from
a mere common cold (C) or the more dangerous Martian Death Flu (F),
based on the patient's symptoms --- whether or not the patient has a
runny nose (R), whether or not the patient has a headache (H), and
whether or not the patient occasionally spontaneously bursts into
flames (S) --- as well as relevant background information, namely
whether or not he or she has previously visited Mars (V).

\vspace{.2in}

\normalfig{examplenet}{An example Bayesian network}

\vspace{.2in}
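Given the arcs in this graph (V is a parent of F; C and F are the
parents of both R and H; F is the sole parent of S), the general
factorization above specializes to

\[ P(V,C,F,R,H,S) = P(V)\,P(C)\,P(F \mid V)\,P(R \mid C,F)\,
P(H \mid C,F)\,P(S \mid F). \]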

Assuming all six variables are binary, with 1 representing ``true''
and 0 ``false'', the probability tables for the network might be
defined as follows:


\vspace{.2in}

\begin{tabular}{|ll|} \hline
\multicolumn{2}{|c|}{$P(V)$} \\ 
V=0 & V=1 \\ \hline
0.9999 & 0.0001 \\ \hline
\end{tabular}
\hspace{.05in}
\begin{tabular}{|ll|} \hline
\multicolumn{2}{|c|}{$P(C)$} \\ 
C=0 & C=1 \\ \hline
0.95 & 0.05 \\ \hline
\end{tabular}
\hspace{.05in}
\begin{tabular}{|l|ll|} \hline
\multicolumn{3}{|c|}{$P(F \mid V)$} \\ 
V & F=0 & F=1 \\ \hline
0 & 1.0 & 0.0 \\ 
1 & 0.001 & 0.999 \\ \hline
\end{tabular}
\hspace{.05in}
\begin{tabular}{|ll|ll|} \hline
\multicolumn{4}{|c|}{$P(R \mid C,F)$} \\ 
C & F & R=0 & R=1 \\ \hline
0 & 0 & 0.95 & 0.05 \\ 
0 & 1 & 0.50 & 0.50 \\ 
1 & 0 & 0.10 & 0.90 \\
1 & 1 & 0.02 & 0.98 \\ \hline
\end{tabular}
\vspace{.05in}
\begin{tabular}{|ll|ll|} \hline
\multicolumn{4}{|c|}{$P(H \mid C,F)$} \\ 
C & F & H=0 & H=1 \\ \hline
0 & 0 & 0.93 & 0.07 \\ 
0 & 1 & 0.02 & 0.98 \\ 
1 & 0 & 0.40 & 0.60 \\
1 & 1 & 0.01 & 0.99 \\ \hline
\end{tabular}
\hspace{.05in}
\begin{tabular}{|l|ll|} \hline
\multicolumn{3}{|c|}{$P(S \mid F)$} \\ 
F & S=0 & S=1 \\ \hline
0 & 1.0 & 0.0 \\ 
1 & 0.2 & 0.8 \\ \hline
\end{tabular}

\vspace{.2in}
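As a worked example of reading these tables, the probability that a
patient has a cold with a runny nose but no headache, and has neither
visited Mars, caught the flu, nor combusted, is the product of one
entry from each table:

\[ P(V{=}0,C{=}1,F{=}0,R{=}1,H{=}0,S{=}0)
   = 0.9999 \cdot 0.05 \cdot 1.0 \cdot 0.90 \cdot 0.40 \cdot 1.0
   \approx 0.018. \]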

Once a Bayesian network has been specified, it may be used to compute
any conditional probability of interest.  For example, given
that a person has recently visited Mars and has a runny nose, the
network above could be used to compute the probability $P(C=1,F=0 \mid
R=1,V=1)$ that the person has the common cold but not the Martian Death
Flu.
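Since the network has only six binary variables, a query such as
$P(C=1,F=0 \mid R=1,V=1)$ can be checked by brute-force enumeration
over all $2^6$ joint states.  A minimal sketch in Python (the CPT
entries are transcribed from the tables above; the function names are
illustrative, not from the text):

```python
from itertools import product

# CPTs from the tables above; each value is P(var = 1 | parents).
p_v1 = 0.0001
p_c1 = 0.05
p_f1 = {0: 0.0, 1: 0.999}                    # keyed by V
p_r1 = {(0, 0): 0.05, (0, 1): 0.50,
        (1, 0): 0.90, (1, 1): 0.98}          # keyed by (C, F)
p_h1 = {(0, 0): 0.07, (0, 1): 0.98,
        (1, 0): 0.60, (1, 1): 0.99}          # keyed by (C, F)
p_s1 = {0: 0.0, 1: 0.8}                      # keyed by F

def bern(p1, x):
    """P(X = x) for a binary X with P(X = 1) = p1."""
    return p1 if x == 1 else 1.0 - p1

def joint(v, c, f, r, h, s):
    """The factorized joint distribution of the network."""
    return (bern(p_v1, v) * bern(p_c1, c) * bern(p_f1[v], f) *
            bern(p_r1[(c, f)], r) * bern(p_h1[(c, f)], h) *
            bern(p_s1[f], s))

def query(target, evidence):
    """P(target | evidence) by summing the joint over all 2^6 states."""
    names = ["V", "C", "F", "R", "H", "S"]
    num = den = 0.0
    for vals in product([0, 1], repeat=6):
        world = dict(zip(names, vals))
        if any(world[k] != x for k, x in evidence.items()):
            continue
        p = joint(*vals)
        den += p
        if all(world[k] == x for k, x in target.items()):
            num += p
    return num / den

print(query({"C": 1, "F": 0}, {"R": 1, "V": 1}))  # about 8.6e-05
```

The answer is tiny because $P(V=1)$ already makes Mars visits rare,
and $F=0$ together with $V=1$, $R=1$ forces the runny nose to be
explained by the (unlikely) common cold alone.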

Bayesian networks are very convenient for representing systems of
probabilistic causal relationships.  The fact ``X often causes Y'' may
easily be modeled in the network by adding a directed arc from X to Y
and setting the probabilities appropriately.  On the other hand, if A
has no causal influence on B, we may simply leave out an arc from A to
B.  (For example, there is no arc from C to S in the network above,
since the common cold presumably neither causes nor prevents 
spontaneous combustion.)  

Some important Bayesian network caveats and research areas:
\begin{itemize}
\item Calculating conditional probabilities with general Bayesian networks
is NP-hard.  However, there are techniques that can often make these
calculations practical even with fairly large network structures.
\item How does one automatically learn Bayesian networks from data?
\begin{itemize}
\item In general, finding the best network structure is NP-hard; combinatorial
optimization techniques such as simulated annealing or hill-climbing
are often used to search for good network structures.
\item If not all the variables are observable in the data, calculating the
correct probabilities to use in the probability tables can also be difficult
(even when the network structure has already been fixed), requiring
the use of iterative methods such as EM.
\end{itemize}
\end{itemize}
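The structure-search idea in the first sub-bullet can be sketched as
greedy hill-climbing over single-arc changes.  This is only a sketch
under assumptions not fixed by the text: binary variables, fully
observed data, and a BIC score (maximum log-likelihood minus a
complexity penalty); all names here are made up for illustration.

```python
import math
import random

def loglik(data, parents, alpha=1.0):
    """Max-likelihood log-probability of binary data under a structure,
    with Laplace smoothing (alpha) so no count is ever zero."""
    ll = 0.0
    for var, pars in parents.items():
        counts = {}
        for row in data:
            key = tuple(row[p] for p in pars)
            counts.setdefault(key, [alpha, alpha])[row[var]] += 1
        for row in data:
            c = counts[tuple(row[p] for p in pars)]
            ll += math.log(c[row[var]] / (c[0] + c[1]))
    return ll

def bic(data, parents):
    """Log-likelihood penalized by 0.5 log(N) per free parameter."""
    n_params = sum(2 ** len(p) for p in parents.values())
    return loglik(data, parents) - 0.5 * math.log(len(data)) * n_params

def acyclic(parents):
    """True iff the parent sets describe a DAG (DFS cycle check)."""
    done, active = set(), set()
    def visit(v):
        if v in active:
            return False
        if v in done:
            return True
        active.add(v)
        ok = all(visit(p) for p in parents[v])
        active.discard(v)
        done.add(v)
        return ok
    return all(visit(v) for v in parents)

def hill_climb(data, variables):
    """Greedily toggle single arcs while the BIC score improves."""
    parents = {v: () for v in variables}
    best = bic(data, parents)
    improved = True
    while improved:
        improved = False
        for x in variables:
            for y in variables:
                if x == y:
                    continue
                trial = dict(parents)
                if x in trial[y]:            # try removing arc x -> y
                    trial[y] = tuple(p for p in trial[y] if p != x)
                else:                        # try adding arc x -> y
                    trial[y] = trial[y] + (x,)
                if not acyclic(trial):
                    continue
                score = bic(data, trial)
                if score > best + 1e-9:
                    parents, best, improved = trial, score, True
    return parents

# Toy data: B tends to copy A, so search should link A and B.
random.seed(0)
data = []
for _ in range(500):
    a = random.randint(0, 1)
    b = a if random.random() < 0.9 else 1 - a
    data.append({"A": a, "B": b})
print(hill_climb(data, ["A", "B"]))
```

Note that a greedy search like this only finds a local optimum, and
maximum-likelihood scores are indifferent to arc direction between two
dependent variables, so the search may recover $A \rightarrow B$ or
$B \rightarrow A$ depending on move order.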
\end{document}