Discussion

We have shown that bound propagation is a simple algorithm with surprisingly good results. It performs exact computations on local parts of the network and keeps track of the uncertainties that brings along. In this way it is capable to compute upper and lower bounds on any marginal probability of a small set of nodes in the intractable network. Currently we do not understand which network properties are responsible for the tightness of the bounds found. In Figure , for instance, we saw islands of worse results in a sea of tight bounds. It is obvious that one bad bound influences its direct neighbourhood, since bound propagation performs local computations in the network. We have no clue, however, why these islands are at the location we found them. We tried to find a correlation with the size of the weights (rewriting the network potentials to a Boltzmann distribution) or with the local frustration of the network (spin glass state), but could not explain the quality of the bounds in terms of these quantities. Here we pose it as an open question.

The computational complexity of the algorithm is mainly determined by the state space of the separator nodes one uses, which makes the algorithm unsuitable for heavily connected networks such as Boltzmann machines. Nevertheless, there is a large range of architectures for which bound propagation can easily be done. Such networks typically have a limited number of connections per node which makes the majority of the Markov blankets small enough to be tractable for the bound propagation algorithm.

We want to discuss one important property of the bound propagation algorithm we did not address before. In this article we found our set of 's by Equation . Due to this choice the separator nodes will always be in the neighbourhood of . We have, however, much more freedom to choose . In Section we stated that a sufficient condition to make the algorithm tractable is that is small enough to do exact computations. A more general, but still sufficient condition is that we should choose such that can be computed efficiently, since this is the quantity we need (see Equation ). If the network structure inside the area separated by can be written as a junction tree with a small enough tree width, we can compute these conditional probabilities even if the number of nodes enclosed by is very large. For certain architectures the separator can even in that case be small enough. One can think of a network consisting of a number of tractable junction trees that are connected to each other by a small number of nodes. One should be aware of the important difference between the method outlined here and the method of conditioning pearl88a, which does exact inference. We do not require the part of the network that is outside the separator (i.e. all nodes not in , or ) to be tractable. The open problem is to develop an efficient algorithm to find suitable separators, since the nodes in such an are generally spread widely.

To conclude this article we like to say a few more words about linear programming. This is a field of research that is still developing. Any improvements on LP-solving algorithms directly influence the results presented in this article. We could imagine that more advanced LP-methods could benefit from the fact that the matrix in Equation is sparse. At least half of the entries are zero. To be more precise: the constraints induced by a particular are exactly rows in the matrix with exactly non-zero entries, thus having a fraction of zero´s. Moreover, all the non-zero entries are ones. Finally, there is a promising future for LP-solvers, since the algorithms seems to be suitable for parallel computing alon90parallel.

This research is supported by the Technology Foundation STW, applied science devision of NWO and the technology programme of the Ministry of Economic Affairs.