The QEM algorithm

In this subsection we show how the original Expectation-Maximization algorithm [13] can be extended to a Quasi-Bayesian Expectation-Maximization (QEM) algorithm with the same convergence properties. We must maximize the posterior log-likelihood $L(\Theta)$ defined previously. The algorithm begins by assuming that the transparent variables are actual random quantities with distributions specified by $\Theta$. An initial estimate $\Theta^0$ is assumed for $\Theta$.

Suppose we had complete data for the transformed network, i.e., a set of observed trials in which every variable in the network, including the transparent variables, was recorded. The log-likelihood for this complete data would be $L(\Theta) = \sum_{ijk} l_i(j,k) \log \theta_{ijk}$, where $l_i(j,k)$ is the number of data points in which the variable $x_i$ takes its $j$th value while its parents take their $k$th configuration.
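As a small illustration of the complete-data log-likelihood, the following sketch evaluates the sum of counts times log-parameters. The dictionary layout keyed by `(i, j, k)`, the function name, and the one-variable example network are assumptions made for illustration only; they do not come from the original text.

```python
import math

def complete_data_log_likelihood(counts, theta):
    """Return the sum over (i, j, k) of counts[(i, j, k)] * log(theta[(i, j, k)])."""
    total = 0.0
    for key, n in counts.items():
        if n == 0:
            continue  # a zero count contributes nothing, even if theta is 0
        total += n * math.log(theta[key])
    return total

# Hypothetical network with a single binary variable x_0 and no parents (k = 0).
counts = {(0, 0, 0): 3, (0, 1, 0): 1}       # l_0(0,0) = 3, l_0(1,0) = 1
theta = {(0, 0, 0): 0.75, (0, 1, 0): 0.25}  # theta_000, theta_010
log_lik = complete_data_log_likelihood(counts, theta)
```

In the QEM setting the transparent variables are not observed, so these counts are unavailable; this is why the algorithm works with an expected value of the log-likelihood instead.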

The first step of the QEM algorithm is to obtain the expected value of the log-likelihood given the evidence, assuming that the current estimate $\Theta^k$ (initially $\Theta^0$) is correct [10]:

\[
Q(\Theta \mid \Theta^k) = E\left[ \log p(x_q = a, e) - \log p(e) \right] = \sum_{ijk} p(x_i, \mathrm{pa}(x_i) \mid x_q = a, e) \log \theta_{ijk} - \sum_{ijk} p(x_i, \mathrm{pa}(x_i) \mid e) \log \theta_{ijk}.
\]
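The posterior probabilities that weight each log-parameter can be illustrated on a very small example. The sketch below assumes a hypothetical three-node chain (a transparent variable z', a query variable x_q that depends on z', and an evidence variable e that depends on x_q), and uses brute-force enumeration as a stand-in for the standard Bayesian network algorithms; the function name and all numbers are illustrative only.

```python
def posterior_over_z(theta, p_xq_given_z, p_e_given_xq, query_value=None):
    """Return p(z' = j | e) or, if query_value is given, p(z' = j | x_q = query_value, e).

    theta[j]           = p(z' = j), taken from the current estimate
    p_xq_given_z[j][a] = p(x_q = a | z' = j)
    p_e_given_xq[a]    = p(observed evidence | x_q = a)
    """
    weights = []
    for j, t in enumerate(theta):
        total = 0.0
        for a, p_a in enumerate(p_xq_given_z[j]):
            if query_value is not None and a != query_value:
                continue  # keep only joint terms consistent with x_q = query_value
            total += t * p_a * p_e_given_xq[a]
        weights.append(total)
    norm = sum(weights)  # p(e) or p(x_q = query_value, e)
    return [w / norm for w in weights]

# Hypothetical numbers for the chain z' -> x_q -> e.
theta_k = [0.5, 0.5]
p_xq_given_z = [[0.9, 0.1], [0.2, 0.8]]
p_e_given_xq = [0.3, 0.6]

post_evidence = posterior_over_z(theta_k, p_xq_given_z, p_e_given_xq)
post_query = posterior_over_z(theta_k, p_xq_given_z, p_e_given_xq, query_value=0)
```

Both posteriors are computed under the current estimate and then held fixed while the parameters inside the logarithms are varied.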

The second step of the QEM algorithm is to maximize $Q(\Theta \mid \Theta^k)$ with respect to $\Theta$. Only a few terms in the expression for $Q(\Theta \mid \Theta^k)$ are free, since only the $\theta_{ij}$ for $z'_i$ are estimated. Collecting these terms we obtain:

\[
\sum_{ij} p(z'_i = j \mid x_q = a, e) \log \theta_{ij} - \sum_{ij} p(z'_i = j \mid e) \log \theta_{ij}.
\]
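One way to increase these collected terms is a gradient step with backtracking. The sketch below handles a single variable z'_i and assumes that its parameters range over the whole probability simplex; the actual feasible set is the one defined earlier in the document. The projection of the gradient, the step size, and the halving rule are illustrative choices, not the author's procedure, and the posterior values are hypothetical.

```python
import math

def q_terms(theta, post_query, post_evidence):
    """Free terms of Q for one variable: sum_j (a_j - b_j) * log(theta_j)."""
    return sum((a - b) * math.log(t)
               for t, a, b in zip(theta, post_query, post_evidence))

def q_gradient(theta, post_query, post_evidence):
    """Partial derivative of q_terms with respect to each theta_j."""
    return [(a - b) / t for t, a, b in zip(theta, post_query, post_evidence)]

def ascent_step(theta, post_query, post_evidence, step=0.1, max_halvings=30):
    """Take one gradient step that keeps theta a distribution and raises q_terms.

    Returns a copy of the input if no improving step is found.
    """
    grad = q_gradient(theta, post_query, post_evidence)
    mean = sum(grad) / len(grad)
    # Subtracting the mean makes the direction sum to zero, so sum(theta) is preserved.
    direction = [g - mean for g in grad]
    base = q_terms(theta, post_query, post_evidence)
    for _ in range(max_halvings):
        candidate = [t + step * d for t, d in zip(theta, direction)]
        if min(candidate) > 0.0 and q_terms(candidate, post_query, post_evidence) > base:
            return candidate
        step /= 2.0  # backtrack: halve the step and try again
    return list(theta)

# Hypothetical posteriors p(z' = j | x_q = a, e) and p(z' = j | e) for a binary z'.
theta_k = [0.5, 0.5]
post_query = [0.7, 0.3]
post_evidence = [0.5, 0.5]
theta_next = ascent_step(theta_k, post_query, post_evidence)
```

The acceptance test inside the loop enforces a strict increase of the collected terms over their value at the starting point, which is the only property this sketch guarantees; it does not claim to reach a maximizer.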

To perform the maximization, use gradient ascent with $\Theta^k$ as a starting point, and ensure that at the end of the process we have $Q(\Theta^{k+1} \mid \Theta^k) > Q(\Theta^k \mid \Theta^k)$. The gradient has essentially the same expression as the one used in the previous subsection, and it can be obtained through standard Bayesian network algorithms. Now set $\Theta^{k+1}$ to the maximizing value and go to the next iteration. The following theorem provides the justification for the QEM algorithm (the proof can be found in [10]):

Fri May 30 15:55:18 EDT 1997