# 3. The Easy-Hard-Easy Pattern

In this section, we present an example of how search cost varies with the tightness of constraints for a class of problems, and describe how this behavior can be understood in terms of changes in the structure of the problems, independent of particular search algorithms. This review and summary of previous studies of the transition then forms a basis for comparison with the new results presented in subsequent sections.

### 3.1 An Example

Figure 1: Typical transition pattern. Median solution cost for dynamic backtracking (solid line) and probability of a solution (dashed line) as a function of number of nogoods. Each point represents 1000 problems of 10 variables and domain size 3, each solved 100 times. Error bars showing 95% confidence intervals are included, but are in some cases smaller than the size of the plotted points.

Figure 1 shows a typical example of the easy-hard-easy pattern as a function of the constrainedness of the problem. Problems with few or many constraints tend to be easy to solve, while those with an intermediate number are more difficult. The fraction of solvable problems is also shown in Figure 1, scaled from 1.0 on the left to 0.0 on the right. This illustrates that the hard problems are concentrated in the so-called "mushy region" [19], where the probability of a solution is changing from 1.0 to 0.0. In particular, the peak in search cost is near the "crossover point," the point at which half the problems are solvable and half unsolvable. For this problem class, the crossover point occurs at just over 75 binary nogoods, and the peak in dynamic backtracking solution cost occurs at about 85 binary nogoods.

In all of our results in this paper, we include 95% confidence intervals [20]. For the estimate of the median obtained from our samples, these intervals are given approximately by the order statistics of the data at ranks $N/2 \pm \sqrt{N}$, where $N$ is the number of samples. For the estimate of fractions the intervals are given approximately by $f \pm 1.96\sqrt{f(1-f)/N}$, where $f$ is the estimated value of the fraction. Finally, for the estimate of means the intervals are approximately $\mu \pm 1.96\,\sigma/\sqrt{N}$, where $\mu$ is the estimate of the mean and $\sigma$ the standard deviation of the sample. In many cases in this paper, there are sufficient samples to make these intervals smaller than the size of the plotted points.
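These intervals are straightforward to compute directly from the raw samples. The sketch below is a minimal illustration (the helper names are ours, not from the paper), using the standard order-statistic rank bounds for the median and normal approximations for fractions and means:

```python
import math

def median_ci(samples):
    """Approximate 95% CI for the median: the order statistics
    at ranks N/2 - sqrt(N) and N/2 + sqrt(N)."""
    xs = sorted(samples)
    n = len(xs)
    lo = max(0, int(n / 2 - math.sqrt(n)))
    hi = min(n - 1, int(n / 2 + math.sqrt(n)))
    return xs[lo], xs[hi]

def fraction_ci(f, n):
    """Approximate 95% CI for a fraction f estimated from n samples."""
    half = 1.96 * math.sqrt(f * (1 - f) / n)
    return f - half, f + half

def mean_ci(samples):
    """Approximate 95% CI for the mean: mu +/- 1.96 * s / sqrt(N),
    with s the sample standard deviation."""
    n = len(samples)
    mu = sum(samples) / n
    s = math.sqrt(sum((x - mu) ** 2 for x in samples) / (n - 1))
    return mu - 1.96 * s / math.sqrt(n), mu + 1.96 * s / math.sqrt(n)
```

For example, a fraction of 0.5 estimated from 100 samples gets an interval of roughly 0.4 to 0.6, which matches the intuition that such intervals shrink as $1/\sqrt{N}$.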

A key point from examples such as this is that the difficult instances within a class of search problems tend to be concentrated near a particular value of the constraint tightness (here measured by the number of binary nogoods). Because this behavior is seen for a variety of search methods, it indicates that the concentration does not depend much on the details of the search algorithm. Instead, it appears to be associated with a change in a property of the problems themselves, namely their solvability.

### 3.2 An Explanation

These observations raise a number of questions, such as why a peak in search cost exists, why the peak occurs near the transition from mostly solvable to mostly unsolvable problems and is thus independent of the particular search algorithm, and why this behavior is seen for a large variety of constraint satisfaction problems.

The existing explanation for the concentration of hard problems relies on a competition between changes in the number of solutions and the amount of pruning provided by the problem constraints [21]. With few constraints, there are many solutions, so search is usually easy. As constraints are added, the number of solutions drops rapidly, making problems harder, but the new constraints also increase the pruning of unproductive search choices, tending to make search easier. While the problems remain relatively underconstrained, the decrease in the number of solutions overwhelms the increase in pruning, giving harder problems on average. Eventually the last solution is eliminated, and all that remains is the increased pruning from additional constraints, leading to easier problems. Thus the phase transition, the point at which there is a precipitous change from solvability to unsolvability, more or less coincides with the peak in search cost. All these effects become more pronounced for larger problems, leading to sharper peaks and more abrupt transitions. This qualitative description explains many features of the observed behavior. The same pruning explanation was also offered by [4] with respect to finding Hamiltonian circuits in highly constrained problems.
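The first half of this competition, the rapid loss of solutions as constraints are added, can be seen directly by exhaustive counting on a small instance. The sketch below is illustrative only (the problem size and helper name are our choices, smaller than the paper's 10-variable example so that enumeration stays cheap); it counts surviving assignments as randomly chosen binary nogoods accumulate:

```python
import itertools
import random

def count_solutions(n, d, nogoods):
    """Count complete assignments of n variables with domain size d
    that violate none of the binary nogoods. A nogood ((i, j), (vi, vj))
    forbids variables i and j from jointly taking values vi and vj."""
    count = 0
    for assign in itertools.product(range(d), repeat=n):
        if all(not (assign[i] == vi and assign[j] == vj)
               for (i, j), (vi, vj) in nogoods):
            count += 1
    return count

random.seed(0)
n, d = 6, 3  # 3^6 = 729 assignments: small enough to enumerate
pool = [((i, j), (vi, vj))
        for i in range(n) for j in range(i + 1, n)
        for vi in range(d) for vj in range(d)]
nogoods = random.sample(pool, 40)
for m in (0, 10, 20, 30, 40):
    print(m, count_solutions(n, d, nogoods[:m]))
```

Since each added nogood can only remove assignments, the count is monotonically non-increasing in the number of nogoods, typically falling steeply once constraints accumulate.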

This explanation can also be used to obtain a quantitative understanding of the behavior. For instance, the location of the transition region can be understood by an approximate theory predicting that the cost peak occurs when the expected number of solutions equals one [19, 21]. In our example there are $3^{10} = 59049$ possible assignments to the 10 variables in the problem. There are $\binom{10}{2} 3^2 = 405$ possible binary nogoods for the problem, which counts the number of ways to select a pair of variables and the different assignments for that pair. A given complete assignment for the 10 variables will be a solution provided each of the selected binary nogoods does not use the same assignment for its pair of variables as in the given complete assignment. This leaves $405 - \binom{10}{2} = 360$ possible choices for the binary nogoods. Thus the expected number of solutions is given by

$$\langle N_{\rm soln} \rangle = 3^{10} \, \frac{\binom{360}{m}}{\binom{405}{m}}$$

for problems with $m$ randomly selected binary nogoods. This expression equals one at $m = 82.9$, near the location of the observed cost peak. Furthermore, because the expected number of solutions grows exponentially with the number of variables when $m$ is smaller than this threshold value and decreases exponentially to zero when $m$ is larger, the range of $m$ values over which the expected number of solutions is near one rapidly decreases as variables are added. This accounts for the observed sharpening of the transition for larger problems.
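This expectation is easy to evaluate numerically. The sketch below (Python; `expected_solutions` is our name, not the paper's) applies the counting argument above for the example's 10 variables with domain size 3, then steps $m$ upward to find where the expectation first falls below one:

```python
from math import comb

def expected_solutions(m, n=10, d=3):
    """Expected number of solutions for a random problem with n variables,
    domain size d, and m distinct binary nogoods chosen uniformly."""
    pairs = comb(n, 2)           # 45 variable pairs
    total = pairs * d * d        # 405 possible binary nogoods
    compatible = total - pairs   # 360 nogoods consistent with a fixed assignment
    # P(all m chosen nogoods avoid a fixed assignment) = C(compatible, m) / C(total, m)
    return d ** n * comb(compatible, m) / comb(total, m)

m = 0
while expected_solutions(m) > 1:
    m += 1
print(m)  # first integer m with expected solutions below one
```

The loop stops between $m = 82$ and $m = 83$, consistent with the continuous crossing at $m = 82.9$ quoted above.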

A further quantitative success of relating the search cost peak to transition phenomena is the evaluation of scaling behavior of the transition and search cost peak [15, 6].
