13 Mar 1996

Outline

Playing games

In Assignment 5, you will write programs to play a version of the game Connect 4.

A game playing program has three basic components:

Opening book

Good computer game playing programs have good opening books. (In Assignment 5, however, we will not use opening books.)

An opening book stores the ``right'' responses to opening moves made by an opponent. These responses are either predetermined by expert human players (eg, chess grandmasters) or precomputed using a lot of computer time.

For example: Connect 3: just like tic-tac-toe, but you can only make a move to the lowest point in each column (think of your pieces as "falling down" to the bottom)

Opponent moves first on a 3 X 3 Connect-3 board.

  . . .
  . . .
  . X .
  -----
  0 1 2
Program looks this position up in the book and finds that the right response is 1.
  . . .
  . O .
  . X .
  -----
  0 1 2
(Any other response leads to a loss by O.)

The computer sticks to the book as long as possible. As soon as it does not recognize a position, it switches over to the general game tree search algorithm. Some programs have a second book for the endgame, when there are few pieces on the board (in chess).

Game tree search algorithm

A game tree is a tree in which each node represents the state of the game, and a node's children are the states that can be reached from the node. In this example of Connect 3, the first three levels of the game tree for a 3 x 3 game is shown. X moves first.


                                . . .
                                . . .
                                . . .
                                -----
                                0 1 2
                             /    |    \
                /-----------/     |     \-----------\
               /                  |                  \
              /                   |                   \
        . . .                   . . .                   . . .
        . . .                   . . .                   . . .
        X . .                   . X .                   . . X
        -----                   -----                   -----
        0 1 2                   0 1 2                   0 1 2
     /    |    \             /    |    \             /    |    \     
    /     |     \           /     |     \           /     |     \    
   /      |      \         /      |      \         /      |      \   
. . .   . . .   . . .   . . .   . . .   . . .   . . .   . . .   . . .
O . .   . . .   . . .   . . .   . O .   . . .   . . .   . . .   . . O
X . .   X O .   X . O   O X .   . X .   . X O   O . X   . O X   . . X
-----   -----   -----   -----   -----   -----   -----   -----   -----
0 1 2   0 1 2   0 1 2   0 1 2   0 1 2   0 1 2   0 1 2   0 1 2   0 1 2
We don't build this tree all at once. Instead, we search it recursively, as follows.

Suppose it is X's turn to move.

  1. For each possible move for X, recursively find the best response for O.
  2. Make the move for X that leads to the worst best response for O.

In Assignment 5, exhaustive search will be used. From any starting board position, every branch of the tree will be evaluated until the game ends in either a WIN, LOSS, or TIE.

Board position evaluator

A game-playing must also include code to evaluate a board's state. For exhaustive search this will detect whether a given state is a win, a loss, a tie, or none of the above. Evaluating a state is more complicated if you don't want an exhaustive search.

Speeding up game tree search

As you can see, the real bottleneck is the game tree search. Consequently, people have done a lot of work with making this work better. There are many methods for speeding up the search:

Memoizing

The idea is to save the solution to each subproblem that is solved so that if it is needed later it can be used without recomputing it.

We saw how to do this with the fib function earlier.

(If you're wondering about the difference between memoizing and dynamic programming, it is this: dynamic programming is a more organized way of memoizing. We build a table containing solutions to all of the subproblems in a buttom up fashion, and we ensure that no subproblem is solved more than once. Memoizing is usually added to a program that computes solutions in a top-down fasion.)

How do we apply memoizing in a game playing program?

Each time we compute the correct response to a board position for either player X or player O, we store the solution. If we ever encounter that position again, we just look up the answer.

(In some sense, the opening book can be thought of as a special case of memoizing.)

How many different positions are there in a Connect-4 game?

If player X always goes first, then we can determine whose turn it is from the number of pieces on the board. Thus, all that matters is what the board looks like. How many different boards are there?

A crude bound: 3^(ROWS*COLS), since since there are ROW*COLS squares, and each is either X, O, or EMPTY. We're overcounting, though, since many of these board positions are not possible.

For a 3 X 4 board, this is 3^12 = 531,441. For a 4 X 6 board, this is 3^24, which is about 281,000,000,000.

We can't allocate space to store the solution to every board position. A solution is hashing.

Convert each board position into a bucket number from 0 to M-1. One simple method: treat each square as a digit in a base 3 number modulo M (ie, . == 2, X == 1, O == 0), and scan board from left to right.

As an example:

  . X .
  O O .
  O X X
  -----
  0 1 2
Would be hashed to
  2*3^8 + 1*3^7 + 2*3^6
+ 0*3^5 + 0*3^4 + 2*3^3
+ 0*3^2 + 1*3^1 + 1*3^0 mod M
Picking the right value for M is important (because we don't want to put too many board positions in any one bucket), but also difficult (because it's hard to estimate how many legal board positions we'll encounter in the search -- we know it's a lot less than 3^(ROWS*COLS)).

Terminating the search before reaching a leaf

It may take too long to search from a board position all the way to the end of the game. How many leaves are there in the search tree?

A crude bound would be COLS^(ROWS*COLS), since there are at most ROWS*COLS moves, and COLS choices for each). For a 4 X 6 board, this is 6^24 - a very large number indeed.

But how can we stop at a board position that is not a WIN, LOSS, or TIE?

Well, we can use a heuristic to evaluate the board - assign a numerical value to it, so that if the state looks better for your player, then the value will be higher. Try to think of some heuristics for Connect 4.

What about checkers? Well, one heuristic is to count the number of pieces on each side. And in chess? Well, we talked about that before: we assign a value to each piece and sum up the values for each side.

These are all rather simple heuristics; more sophisticated heuristics tend to be better, but they must be quick to evaluate in order to be worthwhile, since the heuristic function would be used for each leaf in the tree.

Successive deepening

This technique is typically used in tournaments, when the time spent determining a move is limited.

First search the game tree to depth 1 (one move). Then search the game tree to depth 2 (one move by each player) Then search the game tree to depth 3, etc.

Why do this? If time runs out, the program should have a move to fall back on.

Important observation: The last search costs more than all of the other searches put together. So the other searches don't really cost much.

If we're playing Connect 4 on a 4x6 board, the number of evaluations for a search to depth 1 is COLS: 6. On a search to depth 2, the number of states searched is COLS^2, 36. On a search to depth 3, the number of states search is COLS^3, 216. To depth 4: 1296. As you can see, the later searches are so numerous that the added time for the previous searches is trivial.

Pruning

There is an important and useful technique of pruning a tree called alpha-beta search. This is beyond the scope of this class, though.