The purpose of this appendix is to demonstrate that the function:
is a convex function of the variational parameters
.
We note first that affine transformations do not change convexity
properties. Thus convexity in $X$
implies convexity in the variational parameters, on which $X$ depends
affinely. It remains
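to show that
\[
f(X) = \log E\{e^{X}\} = \log \sum_i p_i e^{X_i}
\]
is a convex function of the vector $X = (X_1, \ldots, X_n)^T$;
here we have indicated the discrete values in the range of the random
variable $X$ by $X_i$ and denoted the probability measure on
such values by $p_i$. Taking the gradient of $f$ with respect to $X_k$
gives:
\[
\frac{\partial f}{\partial X_k} = \frac{p_k e^{X_k}}{\sum_i p_i e^{X_i}} \equiv q_k ,
\]
where $q_k$ defines a probability distribution. The convexity is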
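revealed by a positive semi-definite Hessian $H$, whose
components in this case are
\[
H_{kl} = \frac{\partial^2 f}{\partial X_k \, \partial X_l} = \delta_{kl} q_k - q_k q_l .
\]
To see that $H$ is positive semi-definite, consider
\[
Z^T H Z = \sum_k q_k Z_k^2 - \Big( \sum_k q_k Z_k \Big)^2 = \mathrm{Var}\{Z\} \geq 0 ,
\]
where $\mathrm{Var}\{Z\}$ is the variance of a discrete random variable $Z$
which takes the values $Z_i$ with probability $q_i$.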
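
The argument can also be checked numerically. The sketch below is only an illustration, and it assumes that the function in question has the form $f(X) = \log \sum_i p_i e^{X_i}$, with gradient $q_k = p_k e^{X_k} / \sum_i p_i e^{X_i}$ and Hessian $H_{kl} = \delta_{kl} q_k - q_k q_l$. The function names, the problem size, the number of trials, and the random seed are all hypothetical choices made for this example. It compares the quadratic form $Z^T H Z$ with the variance of $Z$ under $q$ for random vectors $Z$, and compares the gradient with a central finite difference of $f$.

```python
import math
import random


def f(p, x):
    """f(x) = log sum_i p_i e^{x_i} (assumed form of the function)."""
    m = max(x)  # shift by the max for numerical stability
    return m + math.log(sum(pi * math.exp(xi - m) for pi, xi in zip(p, x)))


def gradient(p, x):
    """q_k = p_k e^{x_k} / sum_i p_i e^{x_i}; the q_k sum to one."""
    m = max(x)  # the shift cancels in the ratio
    w = [pi * math.exp(xi - m) for pi, xi in zip(p, x)]
    total = sum(w)
    return [wi / total for wi in w]


def hessian(q):
    """H_kl = delta_kl q_k - q_k q_l."""
    n = len(q)
    return [[(q[k] if k == l else 0.0) - q[k] * q[l] for l in range(n)]
            for k in range(n)]


def quadratic_form(H, z):
    """z^T H z."""
    n = len(z)
    return sum(z[k] * H[k][l] * z[l] for k in range(n) for l in range(n))


def variance(q, z):
    """Variance of a variable that takes the value z_k with probability q_k."""
    mean = sum(qk * zk for qk, zk in zip(q, z))
    return sum(qk * zk * zk for qk, zk in zip(q, z)) - mean * mean


def check(n=5, trials=200, seed=0, eps=1e-6):
    """Return (largest |z^T H z - Var(z)|, smallest z^T H z, largest gradient error)."""
    rng = random.Random(seed)
    raw = [rng.random() + 0.1 for _ in range(n)]
    p = [r / sum(raw) for r in raw]           # arbitrary probability measure
    x = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    q = gradient(p, x)
    H = hessian(q)

    worst_gap, smallest = 0.0, float("inf")
    for _ in range(trials):
        z = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        quad = quadratic_form(H, z)
        worst_gap = max(worst_gap, abs(quad - variance(q, z)))
        smallest = min(smallest, quad)

    # Central finite differences of f, compared against q_k.
    grad_err = 0.0
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += eps
        xm[k] -= eps
        fd = (f(p, xp) - f(p, xm)) / (2.0 * eps)
        grad_err = max(grad_err, abs(fd - q[k]))
    return worst_gap, smallest, grad_err


if __name__ == "__main__":
    print(check())
```

A check of this kind does not replace the proof. It only confirms, for the sampled points, that the quadratic form agrees with the variance up to rounding error and is never noticeably negative.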