Newsgroups: comp.ai.games
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news4.ner.bbnplanet.net!cam-news-hub1.bbnplanet.com!howland.erols.net!newsfeed.internetmci.com!in3.uu.net!EU.net!usenet2.news.uk.psi.net!uknet!usenet1.news.uk.psi.net!uknet!uknet!newsfeed.ed.ac.uk!dcs.ed.ac.uk!cnews
From: "Mike Moran" <mxm@dcs.ed.ac.uk>
Subject: Re: Programming a computer to play
In-Reply-To: jerry@fvc.com's message of Thu, 05 Sep 1996 16:33:20 -0700
X-Nntp-Posting-Host: orosay.dcs.ed.ac.uk
Message-ID: <nc5sp8rn7cq.fsf@orosay.dcs.ed.ac.uk>
Sender: mxm@orosay.dcs.ed.ac.uk
Organization: eduni
X-Newsreader: Gnus v5.1
References: <50cveb$rlf@hil-news-svc-6.compuserve.com>
	<50li3v$eh5@usenet10.interramp.com>
	<jerry-0509961633200001@206.7.170.151>
Date: Mon, 9 Sep 1996 13:30:45 GMT
Lines: 86


In article <jerry-0509961633200001@206.7.170.151> jerry@fvc.com (Jerry Whitnell) writes:

 > In article <50li3v$eh5@usenet10.interramp.com>, cd000450@interramp.com (Bryan Stout) wrote:
>> In article <50cveb$rlf@hil-news-svc-6.compuserve.com>, 
>> 102346.3612@compuserve.com says...
>> >
>> >
>> >I am working on an othello program, currently I am using the MinMax 
>> 
>> Neural networks really don't apply here.  You'd have to assemble thousands of 
>> positions -- covering all potential types of situations -- with a score 
>> attached to each, for a NN to learn an evaluation on its own.

 > Not necessarily.  Seems like you could set up two computer players to
 > play each other, each controlled by a different neural net.  Initial
 > moves would be generated by a random play generator (picking any
 > random legal move).  Play until one side wins.  Feed both sides into
 > both nets with a win/lose flag.  Rinse, lather, repeat.  You should
 > end up with a neural net that has built its own good move controller.
 > Might take awhile, however.

	I've just recently read a paper by C. Reynolds ([1]) in the
	field of genetic programming which takes this approach for the
	game of Tag (you know, "you're *it*" and all that). Each
	"player" is a program, consisting of functions which can be
	crossed over and mutated. For each `generation', pairs of
	these programs are chosen to compete against each other in a
	game of Tag (the programs control simple robots in a simulated
	environment). Many such games are played with different
	pairs, and the `fitness' of a player is dictated by a measure
	of the time it was not "it". This fitness is then used in a
	standard genetic algorithm.
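	To make that concrete, here is a rough Python sketch of one
	`generation' of such a coevolution scheme. The Tag simulation
	itself is replaced by a stub, and all the names here
	(play_tag, random_player and so on) are my own inventions,
	not from the paper:

```python
import random

POP_SIZE = 8
GAME_STEPS = 100

def random_player():
    # A "program" reduced to a short list of numeric genes, purely
    # for illustration; Reynolds evolves actual program trees.
    return [random.random() for _ in range(4)]

def play_tag(a, b, steps=GAME_STEPS):
    """Stub simulation: returns the number of steps each player spent
    NOT being "it".  A real version would run the simulated robots."""
    score_a = sum(a) * random.random()
    score_b = sum(b) * random.random()
    total = score_a + score_b or 1.0
    return steps * score_a / total, steps * score_b / total

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genes, rate=0.1):
    return [g + random.gauss(0, 0.1) if random.random() < rate else g
            for g in genes]

def generation(pop):
    fitness = [0.0] * len(pop)
    # Every player meets a few random opponents; its fitness is the
    # accumulated fraction of game time it was not "it".
    for i, p in enumerate(pop):
        for _ in range(3):
            j = random.randrange(len(pop))
            if j == i:
                continue
            time_free, _ = play_tag(p, pop[j])
            fitness[i] += time_free / GAME_STEPS
    # Fitness-proportionate selection, then crossover and mutation.
    new_pop = []
    for _ in range(len(pop)):
        a, b = random.choices(pop, weights=[f + 1e-9 for f in fitness], k=2)
        new_pop.append(mutate(crossover(a, b)))
    return new_pop

pop = [random_player() for _ in range(POP_SIZE)]
for _ in range(5):
    pop = generation(pop)
```

	Note that no external `optimal' player appears anywhere: the
	only source of fitness is play against other members of the
	same population.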

	Now, there is nothing to say that the players could not be
	neural nets. Indeed, the approach of using nets to control a
	virtual creature is taken by Karl Sims in [2]. The relevant
	feature for our discussion is the choice of fitness function
	used, or rather, how we come to choose the fitness
	function. Rather than comparing a player against some
	`optimal' player or strategy, we compare each player against
	other players in the same population.

	It seems fairly obvious that this comparison could be in the
	form of a game (in [1] it is Tag, in [2] it is "getting
	closest to a block"). Now, I have never played Othello, so I
	don't know how directly applicable this is. However, if it
	has a straightforward points structure, you could perhaps
	make the fitness of a net be, say: given N games, in which P
	points were won in total, and p points were won by the
	player, its fitness would be p/P. This, of course, only
	considers the number of points won, not how many games were
	won. We could factor in the number of games won, say n, using
	n/N. So, the overall fitness might be (p/P + n/N)/2, or we
	could add weightings to each factor (I'm just taking this off
	the top of my head, but I hope you get the point).
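	As a Python sketch, that off-the-top-of-my-head fitness would
	be (the weights are arbitrary and would need tuning):

```python
def fitness(p, P, n, N, w_points=0.5, w_games=0.5):
    """Fitness of one player over a tournament: p of P total points won,
    n of N games won, combined as a weighted sum."""
    return w_points * (p / P) + w_games * (n / N)

# With equal weights this is just (p/P + n/N)/2; e.g. a player that
# won 30 of 64 total points and 1 of 2 games:
print(fitness(30, 64, 1, 2))  # 0.484375
```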

	So, you can see how you could use nets, or perhaps evolvable
	programs, framed in a genetic algorithm, as long as you can
	make up a fitness function based on games played by your
	players. Making up this fitness function seems to me to be
	the hardest part; it may not be so hard if, as described
	above, your game has a points structure and an easily
	discernible winner and loser.

	The major caveat to all this, of course, is the computational
	resources required to run the genetic algorithm, and play
	tournaments for each cycle. In practice we can get round
	these limits by cutting down on the length of the game, the
	size of the tournament and player population, and other such
	variables.

	So, you might want to have a look at the papers mentioned if
	you wish to go in that direction.

					Mike

	refs:

	1: "Competition, Coevolution and the Game of Tag", Craig
	W. Reynolds, http://reality.sgi.com/employees/craig/

	2: "Evolving 3D Morphology and Behavior by Competition", Karl
	Sims, ftp://think.com/public/users/karl/Welcome.html
-- 

 MI   CH AE   LM "You misspelled CHRYSANTHEMUM. Use abbreviations  ->   ,__o
 O R A N G R A D        for long flower names in C code."        -->  _-\_<,
 U  A  T E '9  6        - C Infrequently Asked Questions       ---> (*)'(*)
 A     I C     S Home: http://www.dcs.ed.ac.uk/home/mxm/ !eat emacs biscuits!
 mxm@dcs.ed.ac.uk (Temporary)                rfc1149@tardis.ed.ac.uk (Future)
