In order to make sure that the AIS-BN algorithm performs well in general, we tested it on two other large networks.
The first network that we used in our tests is the PATHFINDER network [Heckerman et al.1990], which is the core element of an expert system that assists surgical pathologists with the diagnosis of lymph-node diseases. There are two versions of this network. We used the larger version, consisting of 135 nodes. In contrast to the CPCS network, PATHFINDER contains many conditional probabilities that are equal to 1, which reflects deterministic relationships in certain settings. To make the sampling challenging, we randomly selected 20 evidence nodes from among the leaf nodes. Each of these was an observable node (David Heckerman, personal communication). We verified in each case that the probability of so selected evidence was not equal to zero.
We fixed the execution time of the algorithms to be 60 seconds. The learning overhead for the AIS-BN algorithm in the PATHFINDER network was about 3.5 seconds. In 60 seconds, AIS-BN generated about 366,000 samples, SIS generated about 250,000 samples and LW generated about 2,700,000 samples. The reason why LW could generate more than 10 times as many samples as SIS within the same amount of time is that the LW algorithm terminates sample generation at a very early stage in many samples, when the weight of a sample becomes zero. This is a result of determinism in the probability tables, mentioned above. We will see that LW benefits greatly from generating more samples. The other parameters used in AIS-BN were the same as those used in the CPCS network.
We tested 20 cases, each with randomly selected 20 evidence nodes. The
reported MSE for each case was averaged over 10 runs. Some of the
runs of the SIS and LW algorithms did not
manage to generate any effective samples (the weight score sum was
equal to zero). SIS had only 75% effective runs and LW had
only 89% effective runs, which means that in some runs SIS and
LW were unable to yield any information about the posterior
distributions. In all those cases, we discarded the run and only
averaged over the effective runs. All runs in the AIS-BN algorithm
were effective. We report our experimental
results with the summary statistics in Table 4.
From these data, we can see that the AIS-BN algorithm is still
significantly better than the SIS and LW algorithms. Since
the LW algorithm can generate more than ten times the number of
samples than the SIS algorithm, its performance is better than
that of the SIS algorithm.
The second network that we tested was one of the ANDES networks [Conati et al.1997]. ANDES is an intelligent tutoring system for classical Newtonian physics that is being developed by a team of researchers at the Learning Research and Development Center at the University of Pittsburgh and researchers at the United States Naval Academy. The student model in ANDES uses a Bayesian network to do long-term knowledge assessment, plan recognition, and prediction of students' actions during problem solving. We selected the largest ANDES network that was available to us, consisting of 223 nodes.
In contrast to the previous two networks, the depth of the ANDES network was significantly larger and so was its connectivity. There were only 22 leaf nodes. It is quite predictable that this kind of networks will pose difficulties to learning. We selected 20 evidence nodes randomly from the potential evidence nodes and tested 20 cases. All parameters were the same as those used in the CPCS network. We fixed the execution time of the algorithms to be 60 seconds. The learning overhead for the AIS-BN algorithm in the ANDES network was 13.4 seconds. In 60 seconds, AIS-BN generated about 114,000 samples, SIS generated about 98,000 samples and LW generated about 180,000 samples. In this network, LW still can generate almost two times the number of samples generated by the SIS algorithm.
We report our experimental results with the summary statistics in Table 5. The results show that also in the ANDES network the AIS-BN algorithm was significantly better than the SIS and LW algorithms. Since LW generated almost two times the number of samples that were generated by the SIS algorithm, its performance was better than that of the SIS algorithm.
While the AIS-BN algorithm is on the average an order of magnitude more precise than the other two algorithms, the performance improvement is smaller than in the other two networks. The reason why the performance improvement of the AIS-BN algorithm over the SIS and LW algorithms in the ANDES network is smaller compared to that in the CPCS and PATHFINDER networks is that: (1) The ANDES network used in our tests was apparently not challenging enough for sampling algorithms in general. In the ANDES network, SIS and LW also can perform well in some cases. The minimum MSE of SIS and LW in our tested cases is almost the same as that of AIS-BN. (2) The number of samples generated by AIS-BN in this network is significantly smaller than that in the previous two networks and AIS-BN needs more time to learn. Although increasing the number of samples will improve the performance of all three algorithms, it improves the performance of AIS-BN more since the convergence ratio of the AIS-BN algorithm is usually larger than that of SIS and LW (see Figure 10). (3) The parameters that we used in this network were tuned for the CPCS network. (4) The large depth and fewer leaf nodes of the ANDES network pose some difficulties to learning.