Load Balancing Results

Load balancing significantly affects the performance of a majority of parallel algorithms. When work is divided evenly among processors, no load balancing is necessary. Heuristic search frequently creates highly irregular search spaces, which results in load imbalance between processors. EUREKA permits load balancing operations during iterations of IDA*. A processor with nodes available on its open list may donate some or all of the nodes to a requesting processor. Decisions that affect system performance include deciding when to balance the load, identifying a processor from which to request work, and deciding how much work to donate.

In the first load balancing experiment, we test EUREKA's ability to select the appropriate processor polling strategy. We have implemented the asynchronous round robin and the random polling approaches. On the nCUBE 2, a processor polls its D neighbors for work; in the nCUBE's hypercube topology, D = log₂ P, where P is the number of processors. In the PVM environment, a processor polls its right and left neighbors (D = 2, because the workstations are connected in a ring topology). The results of this experiment are listed in Table 9.
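The candidate sets behind these polling strategies can be sketched as follows. This is a minimal illustration and not EUREKA's implementation: the function and class names are hypothetical, processors are assumed to be numbered 0 to P - 1, and the text does not say how the round-robin counter interacts with the neighbor list, so the poller here simply cycles through whatever candidate list it is given.

```python
import random


def hypercube_neighbors(rank: int, num_procs: int) -> list[int]:
    """Return the D = log2(P) neighbors of `rank` in a hypercube of P processors."""
    if num_procs < 1 or num_procs & (num_procs - 1):
        raise ValueError("hypercube size must be a power of two")
    dims = num_procs.bit_length() - 1
    # Each neighbor's address differs from `rank` in exactly one bit.
    return [rank ^ (1 << d) for d in range(dims)]


def ring_neighbors(rank: int, num_procs: int) -> list[int]:
    """Return the left and right neighbors of `rank` on a ring (D = 2)."""
    return [(rank - 1) % num_procs, (rank + 1) % num_procs]


def random_target(rank: int, num_procs: int, rng: random.Random) -> int:
    """Random polling: pick a uniformly random processor other than `rank`."""
    if num_procs < 2:
        raise ValueError("need at least two processors to poll")
    target = rng.randrange(num_procs - 1)
    # Shift the upper part of the range by one so that `rank` is never chosen.
    return target if target < rank else target + 1


class RoundRobinPoller:
    """Asynchronous round robin: each processor advances its own private
    counter, with no coordination between processors (assumed reading)."""

    def __init__(self, candidates: list[int]):
        if not candidates:
            raise ValueError("candidate list must not be empty")
        self._candidates = list(candidates)
        self._index = 0

    def next_target(self) -> int:
        target = self._candidates[self._index]
        self._index = (self._index + 1) % len(self._candidates)
        return target
```

For example, `hypercube_neighbors(5, 8)` gives `[4, 7, 1]` and `ring_neighbors(0, 4)` gives `[3, 1]`.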

Table 9: Load Balancing Speedup Results

Approach	15Puzzle	Fil-15P	RMP	Fil-RMP
Neighbor	52.02	65.21	58.81	57.67
Random	55.35	70.75	58.17	56.03
C4.5	50.55	75.01	61.31	60.84
Combined-C4.5	56.71

Table 9 shows that once again C4.5 yields the best speedup in most cases (three of the four data sets) and always yields the best speedup on the filtered data sets. On the unfiltered fifteen puzzle, C4.5 trails both fixed strategies. Among the fixed strategies, no single approach outperforms the other on all data sets.
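The comparison can be checked directly from the speedups in Table 9. The short sketch below copies the three complete rows of the table (the Combined-C4.5 row is omitted because it reports a single value) and finds the best approach per data set; the helper name is illustrative only.

```python
# Speedups copied from Table 9 (rows with values for all four data sets).
SPEEDUPS = {
    "Neighbor": {"15Puzzle": 52.02, "Fil-15P": 65.21, "RMP": 58.81, "Fil-RMP": 57.67},
    "Random": {"15Puzzle": 55.35, "Fil-15P": 70.75, "RMP": 58.17, "Fil-RMP": 56.03},
    "C4.5": {"15Puzzle": 50.55, "Fil-15P": 75.01, "RMP": 61.31, "Fil-RMP": 60.84},
}


def best_per_dataset(table: dict, approaches: list[str]) -> dict:
    """Map each data set to the approach with the highest speedup."""
    datasets = table[approaches[0]].keys()
    return {
        ds: max(approaches, key=lambda name: table[name][ds])
        for ds in datasets
    }


overall = best_per_dataset(SPEEDUPS, ["Neighbor", "Random", "C4.5"])
fixed_only = best_per_dataset(SPEEDUPS, ["Neighbor", "Random"])
print(overall)
print(fixed_only)
```

Among all three approaches, C4.5 is best on Fil-15P, RMP, and Fil-RMP, while Random is best on 15Puzzle. Restricted to the fixed strategies, Random wins on the two fifteen puzzle data sets and Neighbor wins on the two RMP data sets.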

Table 10: Load Balancing Classification Results

Approach	15Puzzle	Fil-15P	RMP	Fil-RMP
Neighbor	.5306 (.08)	.5762 (.00)	.2937 (.21)	.2000 (.04)
Random	.4694 (.03)	.4238 (.00)	.7063 (.03)	.8000 (.00)
C4.5	.3806	.1429	.4048	.0000

Table 10 summarizes the classification results of the fixed strategies in comparison to the C4.5 classifications. For each of the filtered data sets, C4.5 outperforms any fixed strategy with a significance of p ≤ 0.04 or better.

The second load balancing experiment demonstrates EUREKA's ability to determine the optimal amount of work to donate upon request. If too little work is donated, the requesting processor will soon return for more work. If too much work is donated, the granting processor will soon be in danger of becoming idle. Table 11 lists the results of this experiment, demonstrating once again that the learning system is capable of effectively selecting load balancing strategies, except when the unfiltered test cases from the fifteen puzzle are used (on the nCUBE and on the distributed network of workstations). The combined results are generated using training cases from the fifteen puzzle and robot arm motion planning nCUBE examples.
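The donation step itself can be sketched as a split of the granting processor's open list by a fixed fraction, such as the 30% and 50% settings compared in Table 11. This is an illustration under stated assumptions, not EUREKA's code: the function name is hypothetical, the open list is modeled as a plain Python list, and the choice to donate nodes from the tail of the list is an assumption, since the text does not say which nodes are given away.

```python
import math


def split_open_list(open_list: list, fraction: float) -> tuple[list, list]:
    """Split an open list into (kept, donated) for a work request.

    `fraction` is the share of nodes to donate, e.g. 0.3 or 0.5.
    Assumption: the donated nodes are taken from the tail of the list.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Round down so the granting processor never gives away more than asked.
    donate_count = math.floor(len(open_list) * fraction)
    keep_count = len(open_list) - donate_count
    return open_list[:keep_count], open_list[keep_count:]


nodes = list(range(10))  # stand-ins for search nodes
kept_30, donated_30 = split_open_list(nodes, 0.3)
kept_50, donated_50 = split_open_list(nodes, 0.5)
```

With ten nodes, a 30% donation keeps seven nodes and gives away three, while a 50% donation splits the list evenly. Rounding down means that a processor holding a single node donates nothing at either setting.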

Table 11: Distribution Amount Speedup Results

Approach	15Puzzle	Fil-15P	RMP	Fil-RMP	PVM-15P
30%-nCUBE	52.02	65.21	63.10	55.49	7.69
50%-nCUBE	53.68	61.95	61.26	60.13	7.49
C4.5-nCUBE	51.28	76.35	63.67	62.13	7.50
Combined-C4.5	55.44				--

Table 12: Distribution Amount Classification Results

Approach	15Puzzle	Fil-15P	RMP	Fil-RMP
30%	.5639 (.26)	.5333 (.00)	.1984 (.00)	.3000 (.04)
50%	.4361 (.17)	.4667 (.00)	.8016 (.01)	.7000 (.01)
C4.5	.5056	.1429	.1984	.0667

Table 12 lists the classification accuracy results. C4.5 does not perform significantly better than the fixed strategies for the unfiltered data, but does perform significantly better (p ≤ 0.04) than the fixed strategies for the filtered data.