next up previous
Next: The Mystery Cause Unmasked Up: A Study in Program Previous: Minimal Cost Results with

Negative Results with MBPR

 

Symbolic Regression was chosen for a third benchmark on which to test the MBPR paradigm both because it is a traditional GP benchmark problem, and more importantly, because it has characteristics that suggest that MBPR will not help. Table 3 shows the basic function set that all the regression problems discussed in this paper used (except where explicitly noted).

The relative simplicity of the function set was an important factor in the decision to choose regression as our third problem domain. The algebraic primitives do not lead directly to intron activity (this is discussed more fully in the next section) because 0 and 1 are not provided as terminals. In addition, the algebraic primitives for regression tend to lead to most or all of the tree being used. This set of functions is relatively homogeneous. This homogeneity leads us to suspect that MBPR will hamper rather than help the search process.

Table 3:1 presents the results of the first set of regression runs with the Gudermannian function and 20 indexed memory units. Not only does the MBPR paradigm not help, it is extremely detrimental to performance. In fourteen runs, it never finds a solution. While this is in the direction that was predicted, it is so negative that it is puzzling. Again, we see that adding useless READ and WRITE functions (the Reg-RPBR-X runs) causes a factor of five rise in computational cost and a 28% reduction in number of solutions. It is at this point that we may begin to suspect that the MBPR effects are being swamped by some other effect.

   table145
Table 3: Regression

In an effort to find the cause of the performance drop, several experiments were run in which the memory size was reduced. The rationale is that, since the the MBPR paradigm did well on the 5-Parity and had only two memory cells whereas the MBPR method did less well on the lawn-mower problem where there were 20 memory cells, perhaps reducing the indexed memory size will, for this problem, lower the cost of using MBPR. Table 3:2-4 shows the results.

It has been repeatedly suggested, though never demonstrated, that unnecessarily large indexed memory sizes can degrade performance. Table 3:2-4 indicates that reducing memory size does not seem to explain the poor performance of the RPBR-X, RPBR-M, and MBPR runs.

The last set of runs in table 3:4 (0 cells for indexed memory) shows runs in which there is no indexed memory to confuse the program while it is computing an answer. Since (Write-Answer x) returns X, it basically has no effect except to save off the answer. In this situation, any tree that has WRITE-ANSWER as its root is exactly the same as a tree using RPBR from the point of view of the return value. This is true even if there are other WRITE-ANSWERs in the tree, since the root is evaluated last. It is hard to believe that the problem is eight times easier than ``Solve the Gudermannian with Add, Subtract, etc and put a WRITE-ANSWER at the root node.''

The runs in table 3:4 have two main points of interest. Notice that while the computational effort of the RPBR-X runs is worse than that of RPBR runs, it is much closer to that of the RPBR runs that to that of the MBPR runs. The former ratio is 1.67; the later is 3.68. This means two things. First, even inert WRITE-ANSWER alone (RPBR-X) is enough to cause a performance degradation. Second, and at least as importantly, MBPR is causing more of a problem than the the intron effect caused by inert WRITE-ANSWER in this case.



next up previous
Next: The Mystery Cause Unmasked Up: A Study in Program Previous: Minimal Cost Results with



Eric Teller
Tue Oct 29 16:58:50 EST 1996