
Using a Credit-Blame map

The second phase of internal reinforcement is the use of the Credit-Blame map to increase the probability that program updates will lead either to a better solution or to a similar solution in less time. There are two basic ways the Credit-Blame map can provide this enhancement: by improving either the mutation strategy or the crossover strategy.

The possibility of using internal reinforcement (explicit credit assignment) not only for mutation (which has analogies in the world of ANNs) but for crossover as well is important. Traditional GP uses random crossover and relies entirely on the mechanism of empirical credit assignment. Work has been done to bootstrap this mechanism by using the evolutionary process itself to evolve improved crossover procedures (e.g., [3, 12]). Thus, NP has the potential not only to improve on the existing GP mechanism, but also to help explain the central mystery of GP: crossover.

Mutation can take a variety of forms in NP. These mutations are best divided into six types: add an arc, delete an arc, swap two arcs, change a node's function, add a node to the program, and delete a node from the program. In the simple experiment shown in the next section, each of these mutations took place with equal likelihood in both the random and the internal-reinforcement recombination cases. For example, to add an arc to an NP program under random mutation, we simply pick a source and a destination node at random from the program to be mutated and add an arc between them.
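As a rough sketch of this random mutation (in Python, assuming a hypothetical program representation of a node list plus a list of directed arcs; NP's actual data structures are not specified here):

    import random

    def mutate_add_arc(nodes, arcs):
        """Random 'add an arc' mutation: connect two randomly chosen nodes."""
        src = random.choice(nodes)    # random source node
        dst = random.choice(nodes)    # random destination node
        arcs.append((src, dst))       # arcs are (source, destination) pairs
        return arcs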

Internal reinforcement can have a positive effect on this recombination procedure. For each mutation type, we pick a few nodes (eight in this paper's experiment) at random and then select the node or arc (depending on the mutation type) whose credit score is maximal or minimal, as appropriate. For example, when deleting a program node, we pick a few nodes at random and delete the one with the lowest credit score.
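A minimal sketch of this credit-guided deletion, assuming credit maps each node to its Credit-Blame score (the names and representation are illustrative, not NP's own):

    import random

    def pick_node_to_delete(nodes, credit, sample_size=8):
        """Sample a few nodes at random and return the one with the lowest
        credit score, so deletion preferentially removes low-credit structure."""
        candidates = random.sample(nodes, min(sample_size, len(nodes)))
        return min(candidates, key=lambda n: credit[n])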

In the random version of crossover, we simply pick a ``cut'' from each graph (i.e., a subset of the program nodes) at random, then exchange and reconnect the resulting fragments. As with mutation, we keep the same underlying mechanism (for this paper) and simply try to ensure that ``good'' program fragments are the ones exchanged. Given that we separate a program into two fragments before crossover, let us define CutCost to be the sum of the credit scores of all inter-fragment arcs and InternalCost to be the sum of the credit scores of all intra-fragment arcs in the program to be crossed over. For these definitions we take the credit score of an arc to be the credit score of its destination node. The cost of a particular fragmentation of a program is then CutCost/(InternalCost + CutCost). If we choose fragments that minimize this value in both parent programs, we are much less likely to disrupt a crucial part of either program during crossover.
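The fragmentation cost follows directly from these definitions. The sketch below assumes arcs are (source, destination) pairs and credit maps each node to its credit score, as in the earlier sketches:

    def fragmentation_cost(arcs, fragment, credit):
        """Cost of splitting a program along `fragment` (a set of nodes).
        An arc's credit score is the credit score of its destination node."""
        cut_cost = 0.0       # summed credit of arcs crossing the fragment boundary
        internal_cost = 0.0  # summed credit of arcs wholly inside either side
        for src, dst in arcs:
            if (src in fragment) != (dst in fragment):
                cut_cost += credit[dst]      # inter-fragment arc
            else:
                internal_cost += credit[dst] # intra-fragment arc
        return cut_cost / (internal_cost + cut_cost)

A crossover operator could sample several candidate cuts per parent and keep the cut with the smallest cost before exchanging fragments.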

In addition to these ``aids to random search'', we introduce specific changes to a program that can be effected with the use of the Credit-Blame map. For example, the IRNP procedure may promote high-sensitivity parts of programs, but may not necessarily connect those parts to the output nodes. We can refine a program in the following way. First, let X be the node with the highest credit score, Credit(X). Second, let u be the output arc of X to the node Y that minimizes Credit(Y). Third, over all arcs whose destination is an OUTPUT node, pick the arc v whose source node Z has minimal Credit(Z). Finally, switch arcs u and v, so that the high-credit node X is connected to an OUTPUT node.
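A sketch of this refinement, under the assumption that ``switching'' arcs u and v means exchanging their destinations (so that X comes to feed the OUTPUT node); the representation is the same hypothetical one used above:

    def promote_high_credit_node(arcs, credit, output_nodes):
        """Rewire the highest-credit node so that it feeds an OUTPUT node."""
        x = max(credit, key=credit.get)  # X: node with the highest credit score
        # u: X's output arc to the destination Y with minimal credit
        x_out = [i for i, (s, d) in enumerate(arcs) if s == x]
        ui = min(x_out, key=lambda i: credit[arcs[i][1]])
        # v: an arc into an OUTPUT node whose source Z has minimal credit
        to_out = [i for i, (s, d) in enumerate(arcs) if d in output_nodes]
        vi = min(to_out, key=lambda i: credit[arcs[i][0]])
        # switch u and v by exchanging their destinations:
        # X now feeds the OUTPUT node, and Z feeds Y
        (ux, uy), (vx, vy) = arcs[ui], arcs[vi]
        arcs[ui], arcs[vi] = (ux, vy), (vx, uy)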





