next up previous
Next: Approximate Inference for QMR Up: Variational Methods Previous: Variational Methods

Variational Upper and Lower Bounds for Noisy-OR

 

Let us now return to the problem of computing the posterior probabilities in the QMR model. Recall that it is the conditional probabilities corresponding to the positive findings that need to be simplified. To this end, we write

  displaymath1003

where tex2html_wrap_inline1015 . Consider the exponent tex2html_wrap_inline1017 . For noisy-OR, as well as for many other conditional models involving compact representations (e.g., logistic regression), the exponent f(x) is a concave function of x. Based on the discussion in the previous section, we know that there must exist a variational upper bound for this function that is linear in x:

displaymath1004

Using Eq. (9) to evaluate the conjugate function tex2html_wrap_inline853 for noisy-OR, we obtain:

displaymath1005

The desired bound is obtained by substituting into Eq. (13) (and recalling the definition tex2html_wrap_inline1015 ):

  displaymath1006

Note that the ``variational evidence'' tex2html_wrap_inline1029 is the exponential of a term that is linear in the disease vector d. Just as with the negative findings, this implies that the variational evidence can be incorporated into the posterior in time linear in the number of diseases associated with the finding.

There is also a graphical way to understand the effect of the transformation. We rewrite the variational evidence as follows:

displaymath1007

Note that the first term is a constant, and note moreover that the product is factorized across the diseases. Each of the latter factors can be multiplied with the pre-existing prior on the corresponding disease (possibly itself modulated by factors from the negative evidence). The constant term can be viewed as associated with a delinked finding node tex2html_wrap_inline879 . Indeed, the effect of the variational transformation is to delink the finding node tex2html_wrap_inline879 from the graph, altering the priors of the disease nodes that are connected to that finding node. This graphical perspective will be important for the presentation of our variational algorithm--we will be able to view variational transformations as simplifying the graph until a point at which exact methods can be run.

We now turn to the lower bounds on the conditional probabilities tex2html_wrap_inline1037 . The exponent tex2html_wrap_inline1039 in the exponential representation is of the form to which we applied Jensen's inequality in the previous section. Indeed, since f is concave we need only identify the non-negative variables tex2html_wrap_inline975 , which in this case are tex2html_wrap_inline1045 , and the constant a, which is now tex2html_wrap_inline1049 . Applying the bound in Eq. (12) we have:

  displaymath1008

where we have allowed a different variational distribution tex2html_wrap_inline1051 for each finding. Note that once again the bound is linear in the exponent. As in the case of the upper bound, this implies that the variational evidence can be incorporated into the posterior distribution in time linear in the number of diseases. Moreover, we can once again view the variational transformation in terms of delinking the finding node tex2html_wrap_inline879 from the graph.


next up previous
Next: Approximate Inference for QMR Up: Variational Methods Previous: Variational Methods

Michael Jordan
Sun May 9 16:22:01 PDT 1999