The purpose of this appendix is to demonstrate that the function:
is a convex function of the variational parameters. We note first that composing a convex function with an affine map preserves convexity. Thus convexity in the vector $X$, which depends affinely on the variational parameters, implies convexity in the variational parameters themselves. It remains to show that
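$$f(X) = \log \mathrm{E}\{e^{X}\} = \log \sum_j p_j e^{X_j}$$
is a convex function of the vector $X = (X_1, \ldots, X_n)^T$; here we have indicated the discrete values in the range of the random variable $X$ by $X_j$ and denoted the probability measure on such values by $p_j$. Taking the gradient of $f$ with respect to $X_k$ gives:
$$\frac{\partial f}{\partial X_k} = \frac{p_k e^{X_k}}{\sum_j p_j e^{X_j}} \equiv q_k$$
where $q_k$ defines a probability distribution. The convexity is revealed by a positive semi-definite Hessian $H$, whose components in this case are
$$H_{kl} = \frac{\partial^2 f}{\partial X_k \, \partial X_l} = \delta_{kl}\, q_k - q_k q_l$$
To see that $H$ is positive semi-definite, consider
$$Z^T H Z = \sum_k q_k Z_k^2 - \Big(\sum_k q_k Z_k\Big)^2 = \mathrm{Var}\{Z\} \geq 0$$
where $\mathrm{Var}\{Z\}$ is the variance of a discrete random variable $Z$ which takes the values $Z_k$ with probability $q_k$.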
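As a numerical sanity check of this argument, the following Python sketch evaluates $f(X) = \log \sum_j p_j e^{X_j}$, its gradient $q$, and its Hessian at a randomly chosen point. It then compares the quadratic form $Z^T H Z$ with the variance of $Z$ under $q$. The function names, the dimension, and the random test values are illustrative assumptions and are not part of the original derivation; the check illustrates the proof but does not replace it.

```python
import math
import random


def log_expect_exp(p, x):
    """f(X) = log sum_j p_j * exp(X_j), with a max shift for numerical stability."""
    m = max(x)
    return m + math.log(sum(pj * math.exp(xj - m) for pj, xj in zip(p, x)))


def gradient(p, x):
    """Gradient of f: q_k = p_k exp(X_k) / sum_j p_j exp(X_j)."""
    m = max(x)
    w = [pj * math.exp(xj - m) for pj, xj in zip(p, x)]
    total = sum(w)
    return [wk / total for wk in w]


def hessian(q):
    """Hessian of f: H_kl = delta_kl * q_k - q_k * q_l."""
    n = len(q)
    return [[(q[k] if k == l else 0.0) - q[k] * q[l] for l in range(n)]
            for k in range(n)]


def quadratic_form(h, z):
    """Z^T H Z for a square matrix h given as a list of rows."""
    n = len(z)
    return sum(z[k] * h[k][l] * z[l] for k in range(n) for l in range(n))


def variance(q, z):
    """Variance of a discrete variable taking value z_k with probability q_k."""
    mean = sum(qk * zk for qk, zk in zip(q, z))
    return sum(qk * zk * zk for qk, zk in zip(q, z)) - mean * mean


# Illustrative test data (assumed values, fixed seed for reproducibility).
rng = random.Random(0)
n = 5
raw = [rng.random() for _ in range(n)]
p = [r / sum(raw) for r in raw]               # probability measure p_j
x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
y = [rng.uniform(-2.0, 2.0) for _ in range(n)]
z = [rng.uniform(-2.0, 2.0) for _ in range(n)]

q = gradient(p, x)
h = hessian(q)
quad = quadratic_form(h, z)
var = variance(q, z)

# Midpoint convexity: f((x + y) / 2) <= (f(x) + f(y)) / 2.
mid = [(xi + yi) / 2.0 for xi, yi in zip(x, y)]
f_mid = log_expect_exp(p, mid)
f_avg = (log_expect_exp(p, x) + log_expect_exp(p, y)) / 2.0

print("Z^T H Z        :", quad)
print("Var{Z} under q :", var)
print("midpoint gap   :", f_avg - f_mid)
```

The two printed quantities in the first pair should agree up to floating-point rounding, and the midpoint gap should be non-negative, as convexity requires.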