The empirical and theoretical comparisons presented in the previous section indicate that alternative parallel search strategies perform well under different conditions, and that performance trends do exist which can be used to automatically select strategies and parameter settings for new problems. However, the results of these studies are not sufficient for automatically determining the appropriate set of strategies. One limitation is that information used to generate the formulas and to control the experiments, such as goal position, is not known in advance. Another limitation is that some of the assumptions, such as a constant branching factor, are not realistic for many applications. As a result, we need a method to automatically select optimal strategies for real-world problems given information that is available.
In response to this need, we add a machine learning component to the EUREKA system. EUREKA merges many of the approaches to parallel search discussed the previous section. Parameters can be set that control the task distribution strategy, the load balancing strategies, and the ordering techniques. In particular, the strategies that can be selected include:
The user is allowed to determine the types of strategies to be used for a given problem. However, because such a decision is difficult to make without significant information about the search space, EUREKA also has the capability of making all necessary strategy selections. Based on the characteristics of a search space, EUREKA automatically configures itself to optimize performance on a particular application. Sample problems are fed as training examples to a machine learning system, which in turn learns the optimal strategies to apply to particular classes of search spaces. Applying machine learning to optimize software applications has been pursued in other areas of research. For example, Minton  has applied a similar technique to automatically synthesize problem-specific versions of constraint-satisfaction algorithms. Research in other areas of computer science has yielded similar ideas of customizable environments applied to computer networks [4,33] and to interactive human-computer interfaces [15,20]. This work is unique in allowing both problem-specific and architecture-specific features to influence the choice of strategies and in applying adaptive software techniques to parallel search. EUREKA also offers a framework that can potentially automate both static and dynamic software customization.
To perform parallel search, EUREKA executes the following steps:
Features from the non-goal iterations represent attributes for the test case, and the strategy that results in the best performance (shortest run time) represents the correct classification of the problem instance for a given strategy choice.
The described set of features and the amount of time to spend on the initial EUREKA iteration are chosen based on our experimental data to yield the most helpful information in the shortest time. Searching enough iterations of the problem space until 200,000 nodes are generated takes less than 10 seconds on the problem domains we tested. Spending less time than this may yield erroneous information because features of the tree do not stabilize until several levels down in the tree. Searching additional iterations in general does not significantly improve the quality of information and the time requirements grow exponentially. Improved approaches may include performing the initial search until the stability of the space reaches a prescribed level, or periodically updating the C4.5 choices and adjusting the parallel search approaches dynamically during execution.
Performing an initial shallow search has also been shown to be effective in other parallel search research. For example, Suttner  performs parallel search of the first few IDA* iterations twice during parallel search in order to determine the number of individual tasks that should be distributed to each processor. Cook et al.  also perform an initial shallow search in order to obtain more accurate operator ordering information to use in reordering the search space. In each case, the amount of time required to perform the extra initial search is minimal yet greatly improves the performance of the overall search task.
The selected features each demonstrate a significant influence on the optimal search strategy. Although feature values can change dramatically from one problem to the next, each feature remains fairly stable between levels of the same tree. As a result, computing the values of these features at a low level in the tree provides a good indication of the structure of the entire tree. The ``Deviation Within Problems'' entries in Table 1 show the standard deviation of each feature value over all search tree levels. The results are averaged over 98 problem instances in the fifteen puzzle domain1 and 20 problem instances in the robot arm motion planning domain. Both of these test domains are described in the next section. The low values in these columns indicate that the features provide a good indicator of the structure of the problem space at all levels in the tree. In contrast, the ``Deviation Between Problems'' table entries show the standard deviation of the average feature value (averaged over all levels in each tree) over all problem spaces. The larger values in these columns indicate that the features can effectively distinguish between different problem spaces.
EUREKA is written in C and is currently implemented on an nCUBE 2, on Linux workstations using the Parallel Virtual Machine (PVM) communication software, on a DEC Alpha using an implementation of Posix threads, on Windows NT running Java threads and Cilk threads, and as a distributed Java system using RMI. We are also currently investigating means of dynamically switching between strategies during execution as the problem structure or environment changes.