Newsgroups: rec.gambling.poker,rec.games.programmer,comp.ai.games,rec.gambling.misc,alt.sources,rec.games.design
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.mathworks.com!newsfeed.internetmci.com!howland.reston.ans.net!ix.netcom.com!netcom.com!imgidata
From: imgidata@netcom.com (Robert Fagen)
Subject: Re: 5 cards stud poker logic/algorithm/strategy ?
Message-ID: <imgidataDM1xp5.5n8@netcom.com>
Followup-To: rec.gambling.poker,rec.games.programmer,comp.ai.games,rec.gambling.misc,alt.sources,rec.games.design
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
X-Newsreader: TIN [version 1.2 PL1]
References: <cJp$mCEGv8$D078yn@hk.super.net> <Pine.PMDF.3.91.960122225915.541243530B-100000@minna.acc.iit.edu> <4eirk4$4np@tpd.dsccc.com> <mhallDLyIn0.A4q@netcom.com> <pudaite-3101960248510001@via-annex3-16.cl.msu.edu>
Date: Wed, 31 Jan 1996 15:29:29 GMT
Lines: 98
Sender: imgidata@netcom.netcom.com

Well, guess I could throw in my relatively uneducated thoughts here on
an approach that I was thinking about.

The basic premise is to classify different dimensions of the game into
a relatively small number of 'buckets'. This keeps the total number of
states down in the sub-gigabyte range. The following #defines are part of
a first attempt at how I was thinking about it:

/* playing_round */
#define PREFLOP 0
#define FLOP    1
#define TURN    2
#define RIVER   3

/* position */
#define BLIND            1
#define EARLY            2
#define MIDDLE           4
#define LATE             8
#define BUTTON          16

/* action */
#define OPENED          32
#define CALLED_M        64 /* multiple callers */
#define CALLED_1       128 /* one caller */
#define RAISED         256
#define RERAISED       512

/* hand_description */
#define ONE_OVER      1024 /* overcards */
#define TWO_OVER      2048

#define TOP           4096 /* 'pair'/'set' modifier */
#define MID           8192
#define BOT          16384

#define ONE_PR       32768 /* hand rank */ 
#define TWO_PR       65536
#define SET         131072
#define GUTSHOT     262144
#define OPENEND     524288
#define THREEFLUSH 1048576
#define FOURFLUSH  2097152
#define FLUSH      4194304
#define STR8       8388608
#define BOAT      16777216
#define QUADS     33554432
#define STR8FLSH  67108864
#define PAIRBRD  134217728 /* board state */
#define TRIPBRD  268435456
#define STR8BRD  536870912
#define FLSHBRD 1073741824

(note that the values of these defines could probably be done a better way,
 but my thought was to be able to use each one of these descriptors as
 a bit-switch, so this is not literally an index into the 2400 positions
 of the array below, but more the label for that array position)
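
To make the bit-switch idea concrete, here is a rough sketch of how a
label could be put together and tested. It only works if every
descriptor is an exact power of two, so I've written a handful of them
as shifts. The helpers has_all() and example_label() are made up for
illustration and are not part of any existing program.

```c
#include <assert.h>
#include <stdint.h>

/* A few of the descriptors, written as shifts so that the
   one-bit-per-descriptor intent is explicit. */
#define LATE      (UINT32_C(1) << 3)   /* 8 */
#define RAISED    (UINT32_C(1) << 8)   /* 256 */
#define TOP       (UINT32_C(1) << 12)  /* 4096 */
#define ONE_PR    (UINT32_C(1) << 15)  /* 32768 */
#define FOURFLUSH (UINT32_C(1) << 21)  /* 2097152 */

/* Hypothetical helper: nonzero if every bit in 'mask' is set in 'label'. */
int has_all(uint32_t label, uint32_t mask)
{
    return (label & mask) == mask;
}

/* Hypothetical helper: label for "late position, facing a raise,
   top pair with a four-flush". */
uint32_t example_label(void)
{
    uint32_t label = LATE | RAISED | TOP | ONE_PR | FOURFLUSH;

    /* With distinct single bits, OR-ing and adding give the same value. */
    assert(label == LATE + RAISED + TOP + ONE_PR + FOURFLUSH);
    return label;
}
```

A call like has_all(label, TOP | ONE_PR) then asks "is this a top-pair
situation?" without caring about position or action.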

These values are added together to form the label for one position in a
matrix with the following dimensions:

play_matrix[pocket_cards][playing_round][position/action/hand_description]
	     (169)	      (4)	   (2400 = 5*5*2*3*16)

play_matrix is an array of structures, each holding the 'array position
label' and a structure that indicates what the player's action should be
for this pair of pocket cards, for this betting round and this position, etc.
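
Here is one way the matrix and the third index could be laid out. Big
caveat: I am assuming that 5*5*2*3*16 means 5 positions, 5 actions, 2
overcard states, 3 pair/set modifiers and 16 hand-rank/board-state
classes, and that each of those has already been reduced to a small
integer. The struct layout and the name bucket_index() are made up for
the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { N_POCKET = 169, N_ROUND = 4 };

/* Assumed reading of 2400 = 5*5*2*3*16 (see the caveat above). */
enum { N_POS = 5, N_ACT = 5, N_OVER = 2, N_MOD = 3, N_HAND = 16 };
enum { N_BUCKET = N_POS * N_ACT * N_OVER * N_MOD * N_HAND };

struct bucket {
    uint32_t label;  /* sum of the descriptor bits for this position */
    short fold;      /* share of 100 given to folding */
    short call;      /* share of 100 given to calling */
};

/* 169 * 4 * 2400 = 1,622,400 buckets in total. */
struct bucket play_matrix[N_POCKET][N_ROUND][N_BUCKET];

/* Hypothetical helper: mixed-radix index into the third dimension.
   Each argument is a zero-based class number within its own dimension. */
size_t bucket_index(int pos, int act, int over, int mod, int hand)
{
    assert(pos  >= 0 && pos  < N_POS);
    assert(act  >= 0 && act  < N_ACT);
    assert(over >= 0 && over < N_OVER);
    assert(mod  >= 0 && mod  < N_MOD);
    assert(hand >= 0 && hand < N_HAND);

    return ((((size_t)pos * N_ACT + act) * N_OVER + over) * N_MOD + mod)
           * N_HAND + hand;
}
```

At 8 bytes per bucket on a typical machine, that is around 13 MB for the
whole table, which is comfortably inside the sub-gigabyte target.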

The action data is stored in two short ints, 'fold' and 'call'; whatever
is left over out of 100 is the share for raising. Before this program
'learns' anything by playing, both ints contain the value 33. When the
program finds that this is the appropriate bucket to work from for the
current situation, it picks a random number from 1 to 100. If the random
number is from 1 to 'fold', it folds; from fold+1 to (call+fold), it
calls; and greater than (call+fold), it raises. The program keeps track
of which bucket it used in each of the four playing rounds. If this was
the correct decision, i.e. the program drags the pot, then the action
taken is incremented.  Otherwise, if the program loses this hand, the
action taken is decremented.  Likewise, if the program folds, it looks
to see if it 'would have won', and updates the buckets used in each
playing round appropriately (maybe I should call this 'Suckbot' or
'RunnerRunnerBot').  In the case of a failed raise, I'm guessing that
fold should be incremented, since "it's hardly ever correct to just
call (or something like that)".
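
A minimal sketch of the pick-and-update step, under a couple of
assumptions of mine: the raise share is whatever is left of 100 after
fold and call, and bumping the raise share is done by taking a point
from the larger of the other two. All of the names here (struct weights,
choose_action, roll_1_to_100, reinforce) are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

enum action { FOLD, CALL, RAISE };

/* Raise share is implied: 100 - fold - call. */
struct weights { short fold; short call; };

/* Map a roll in 1..100 onto an action using the two thresholds. */
enum action choose_action(const struct weights *w, int roll)
{
    assert(roll >= 1 && roll <= 100);
    if (roll <= w->fold)
        return FOLD;
    if (roll <= w->fold + w->call)
        return CALL;
    return RAISE;
}

/* Random roll in 1..100; rand() % 100 is slightly biased, which
   should not matter for a sketch. */
int roll_1_to_100(void)
{
    return rand() % 100 + 1;
}

/* Move the share of action 'a' by 'delta' (+1 after a win, -1 after
   a loss). Assumption: since raise is not stored, changing it means
   moving the larger of fold/call the opposite way. */
void reinforce(struct weights *w, enum action a, int delta)
{
    short *target;
    int other, v;

    if (a == FOLD) {
        target = &w->fold;
    } else if (a == CALL) {
        target = &w->call;
    } else {
        target = (w->fold >= w->call) ? &w->fold : &w->call;
        delta = -delta;
    }

    other = (target == &w->fold) ? w->call : w->fold;
    v = *target + delta;
    if (v < 0)
        v = 0;               /* a share cannot go negative */
    if (v > 100 - other)
        v = 100 - other;     /* fold + call cannot exceed 100 */
    *target = (short)v;
}
```

Starting from 33/33, a roll of 1 to 33 folds, 34 to 66 calls, and 67 to
100 raises, so the untrained program raises very slightly more often
than it does anything else.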

After several billion simulated hands against a table of 9 other players
like itself, they should all reach roughly the same matrix state,
although, once again, I'm just guessing. At that point, I suspect this
'bot could do ok on IRC.

Suggestions, criticisms, flames, encouragement and ideas that I can steal
are all gratefully and humbly accepted.

Rob

-- 
-------------------------------------------------------------------
Rob Fagen                 voice 415-432-8101   
I only represent myself   imgidata@netcom.com      http://sdbs.org/
