next up previous
Next: Appendix B. Optimization of Up: Variational Probabilistic Inference and Previous: Discussion

Appendix A. Duality


The upper and lower bounds for individual conditional probability distributions that form the basis of our variational method are based on the ``dual'' or ``conjugate'' representations of convex functions. We present a brief description of convex duality in this appendix, and refer the reader to Rockafellar (1970) for a more extensive treatment.

Let f(x) be a real-valued, convex function defined on a convex set X (for example, tex2html_wrap_inline1213 ). For simplicity of exposition, we assume that f is a well-behaved (differentiable) function. Consider the graph of f, i.e., the points (x,f(x)) in an n+1 dimensional space. The fact that the function f is convex translates into convexity of the set tex2html_wrap_inline1225 called the epigraph of f and denoted by epi(f) (Figure 13).

Figure 13: Half-spaces containing the convex set epi(f). The conjugate function tex2html_wrap_inline853 defines the critical half-spaces whose intersection is epi(f), or, equivalently, it defines the tangent planes of f(x).

It is an elementary property of convex sets that they can be represented as the intersection of all the half-spaces that contain them (see Figure 13). Through parameterizing these half-spaces we obtain the dual representations of convex functions. To this end, we define a half-space by the condition:


where tex2html_wrap_inline967 and tex2html_wrap_inline1241 parameterize all (non-vertical) half-spaces. We are interested in characterizing the half-spaces that contain the epigraph of f. We require therefore that the points in the epigraph must satisfy the half-space condition: for tex2html_wrap_inline1245 , we must have tex2html_wrap_inline1247 . This holds whenever tex2html_wrap_inline1249 as the points in the epigraph have the property that tex2html_wrap_inline1251 . Since the condition must be satisfied by all tex2html_wrap_inline1253 , it follows that


as well. Equivalently,


where the right-hand side of this equation defines a function of tex2html_wrap_inline967 , which is known as the ``dual'' or ``conjugate'' function tex2html_wrap_inline853 . This function, which is also a convex function, defines the critical half-spaces which are needed for the representation of epi(f) as an intersection of half-spaces (Figure 13).

To clarify the duality between f(x) and tex2html_wrap_inline1263 , let us drop the maximum and rewrite the inequality as:


In this equation, the roles of the two functions are interchangeable and we may suspect that also f(x) can obtained from the dual function tex2html_wrap_inline1263 by an optimization procedure. This is in fact the case and we have:


This equality states that the dual of the dual gives back the original function. It provides the computational tool for calculating dual functions.

For concave (convex down) functions the results are analogous; we replace tex2html_wrap_inline1269 with tex2html_wrap_inline1271 , and lower bounds with upper bounds.

next up previous
Next: Appendix B. Optimization of Up: Variational Probabilistic Inference and Previous: Discussion

Michael Jordan
Sun May 9 16:22:01 PDT 1999