Next: Path Consistency Algorithm Up: Representing Temporal Information Previous: Allen's framework

## Test problems

We tested how well the heuristics we developed for improving path consistency and backtracking algorithms perform on a test suite of problems.

The purpose of empirically testing the algorithms is to determine the performance of the algorithms and the proposed improvements on ``typical'' problems. There are two approaches: (i) collect a set of ``benchmark'' problems that are representative of problems that arise in practice, and (ii) randomly generate problems and ``investigate how algorithmic performance depends on problem characteristics ... and learn to predict how an algorithm will perform on a given problem class'' [15].

For IA networks, there is no existing collection of large benchmark problems that actually arise in practice---as opposed to, for example, planning in a toy domain such as the blocks world. As a start to a collection, we propose an IA network with 145 intervals that arose from a problem in molecular biology (see Example 2, above; [4]). The proposed benchmark problem is not strictly speaking a temporal reasoning problem as the intervals represent segments of DNA, not intervals of time. Nevertheless, it can be formulated as a temporal reasoning problem. The value is that the benchmark problem arose in a real application. We will refer to this problem as Benzer's matrix.

In addition to the benchmark problem, in this paper we use two models of a random IA network, denoted B(n) and S(n, p), to evaluate the performance of the algorithms, where n is the number of intervals, and p is the probability of a (non-trivial) constraint between two intervals. Model B(n) is intended to model the problems that arise in molecular biology (as estimated from the problem discussed in [4]). Model S(n, p) allows us to study how algorithm performance depends on the important problem characteristic of sparseness of the underlying constraint graph. Both models, of course, allow us to study how algorithm performance depends on the size of the problem.

For B(n), the random instances are generated as follows.

Step 1. Generate a ``solution'' of size n as follows. Generate n real intervals by randomly generating values for the end points of the intervals. Determine the IA network by determining, for each pair of intervals, whether the two intervals either intersect or are disjoint.
Step 2. Change some of the constraints on edges to be the trivial constraint by setting the label to be I, the set of all 13 basic relations. This represents the case where no experiment was performed to determine whether a pair of DNA segments intersect or are disjoint. Constraints are changed so that the percentage of non-trivial constraints (approximately 6% are intersects and 17% are disjoint) and their distribution in the graph are similar to those in Benzer's matrix.

For S(n, p), the random instances are generated as follows.

Step 1. Generate the underlying constraint graph by indicating which of the possible
n (n - 1)/ 2 edges is present. Let each edge be present with probability p, independently of the presence or absence of other edges.
Step 2. If an edge occurs in the underlying constraint graph, randomly chose a label for the edge from the set of all possible labels (excluding the empty label) where each label is chosen with equal probability. If an edge does not occur, label the edge with I, the set of all 13 basic relations.
Step 3. Generate a ``solution'' of size n as follows. Generate n real intervals by randomly generating values for the end points of the intervals. Determine the consistent scenario by determining the basic relations which are satisfied by the intervals. Finally, add the solution to the IA network generated in Steps 1-2.

Hence, only consistent IA networks are generated from S(n, p). If we omit Step 3, it can be shown both analytically and empirically that almost all of the different possible IA networks generated by this distribution are inconsistent and that the inconsistency is easily detected by a path consistency algorithm. To avoid this potential pitfall, we test our algorithms on consistent instances of the problem. This method appears to generate a very reasonable test set for temporal reasoning algorithms with problems that range from easy to hard. It was found, for example, that instances drawn from S(n, 1/4) were hard problems for the backtracking algorithms to solve, whereas for values of p on either side (S(n, 1/2) and S(n, 1/8)) the problems were easier.

Next: Path Consistency Algorithm Up: Representing Temporal Information Previous: Allen's framework