The aggregation function plays an important role in how we measure the distance between belief states. When we compute more than one state distance measure, either exhaustively or by sampling a subset (as previously mentioned), we must combine the measures by some means, denoted . There is a range of options for taking the state distances and aggregating them into a belief state distance. We discuss several assumptions associated with potential measures:
.
The belief state distances are and . In this case we prefer to . If each state distance is admissible and we do not sample from belief states, then assuming positive interaction is also admissible.
.
In our example, , and . In this case we have no preference over and .
We notice that using the cardinality of a belief state to measure is a special case of assuming state independence, where . If we use cardinality to measure distance in our example, then we have , and . With cardinality we prefer over because we have better knowledge in .
.
Depending on the type of search, we define differently. We assume that sequences used in progression search start at the same time and those used in regression end at the same time. Thus, in progression all sequences are aligned at the first step before we union steps, and in regression all sequences are aligned at the last step before the union.
For example, in progression because we align the sequences at their first steps, then union each step. Notice that this resulting sequence has seven actions, giving , whereas defining as maximum gave a distance of five and as summation gave a distance of eight. Compared with overlap, positive interaction tends to under estimate distance, and independence tends to over estimate distance. As we will see during our empirical evaluation (in Section 6.5), accounting for overlap provides more accurate distance measures for many conformant planning domains.
There are two ways negative interactions play a role in belief state distances. Negative interactions can allow us to prove it is impossible for a belief state to reach a belief state , meaning , or they can potentially increase the distance by a finite amount. We use only the first, more extreme, notion of negative interaction by computing ``cross-world'' mutexes [30] to prune belief states from the search. If we cannot prune a belief state, then we use one of the aforementioned techniques to aggregate state distances. As such, we do not provide a concrete definition for to measure negative interaction.
While we do not explore ways to adjust the distance measure for negative interactions, we mention some possibilities. Like work in classical planning [23], we can penalize the distance measure to reflect additional cost associated with serializing conflicting actions. Additionally in conditional planning, conflicting actions can be conditioned on observations so that they do not execute in the same plan branch. A distance measure that uses observations would reflect the added cost of obtaining observations, as well as the change in cost associated with introducing plan branches (e.g., measuring average branch cost).
The above techniques for belief state distance estimation in terms of state distances provide the basis for our use of multiple planning graphs. We will show in the empirical evaluation that these measures affect planner performance very differently across standard conformant and conditional planning domains. While it can be quite costly to compute several state distance measures, understanding how to aggregate state distances sets the foundation for techniques we develop in the . As we have already mentioned, the conveniently allows us to implicitly aggregate state distances to directly measure belief state distance.