The diagnostic cases that we used to evaluate the performance of the variational techniques were abstracted from clinicopathologic conference (``CPC'') cases. These cases generally involve multiple diseases and are considered clinically difficult. They are the cases for which Middleton et al. (1990) did not find their importance sampling method to work satisfactorily.

Our evaluation of the variational methodology consists of three parts. In the first part we exploit the fact that for a subset of the CPC cases (4 of the 48 cases) the number of positive findings is small enough that we can calculate exact values of the posterior marginals using the Quickscore algorithm. That is, for these four cases we were able to obtain a ``gold standard'' for comparison. We assess the accuracy and efficiency of the variational methods on these four cases. We present variational upper and lower bounds on the likelihood, as well as scatterplots that compare variational approximations of the posterior marginals to the exact values. We also present comparisons with the likelihood-weighted sampler of Shwe and Cooper (1991).
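
As a reminder of why exact calculation is feasible only when there are few positive findings, the Quickscore computation can be sketched as follows. The notation is introduced here for illustration only and may differ from that used elsewhere: we assume independent disease priors $p_j$, noisy-OR parameters $q_{ij}$ with leak terms $q_{i0}$, and write $F^{+}$ and $F^{-}$ for the sets of positive and negative findings. Expanding the product over the positive findings by inclusion-exclusion gives
\[
P(F^{+}, F^{-}) \;=\; \sum_{F' \subseteq F^{+}} (-1)^{|F'|}
\Biggl[\, \prod_{i \in F' \cup F^{-}} (1 - q_{i0}) \Biggr]
\prod_{j} \Biggl[\, (1 - p_j) + p_j \prod_{i \in F' \cup F^{-}} (1 - q_{ij}) \Biggr].
\]
The sum ranges over all $2^{|F^{+}|}$ subsets of the positive findings, so the cost grows exponentially with the number of positive findings, whereas negative findings and diseases add only polynomial cost to each term.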

In the second part we present results for the remaining, intractable CPC cases. For these cases we use lengthy runs of the Shwe and Cooper sampling algorithm as a surrogate for the gold standard.

Finally, in the third part we consider the problem of obtaining interval bounds on the posterior marginals.

