Playing games

Reading: Chapter 11

Game playing is one of AI's greatest successes. We study it through tic-tac-toe.

Games covered

Classical AI techniques apply when:

two players alternate turns
all information is known to both players
the number of possible moves is finite

Examples: tic-tac-toe, Connect-4, Othello/reversi, checkers, chess, go

Non-examples: poker, Boggle, Scrabble, Diplomacy

Game trees

X | O | O
--+---+--
X |   |  
--+---+--
O | X |

Question: How should X play?

Answer: Draw a game tree.

          XOO
          X--
      ____OX-____
     /     |     \
    /      |      \
  XOO     XOO     XOO
  XX-     X-X     X--
  OX-     OX-     OXX
 /   \   /   \   /   \
XOO XOO XOO XOO XOO XOO
XXO XX- XOX X-X XO- X-O
OX- OXO OX- OXO OXX OXX
 |   |       |       |
XOO XOO     XOO     XOO
XXO XXX     XXX     XXO
OXX OXO     OXO     OXX

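Question: How big is the game tree for an empty tic-tac-toe board?

Answer: We said 9! in class, but later I noticed that this only bounds the number of distinct tic-tac-toe games, which are the leaves of the tree. A tic-tac-toe game passes through up to 9 boards after the empty one, and the tree has a node for each of them. The level with k moves played has at most 9!/(9-k)! boards, so the true bound is 9! + 9! + 9!/2! + 9!/3! + 9!/4! + ... + 9!/9! = 986,410.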
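
The sum is easy to check by machine. The following is a small sketch, not part of the original notes; the names level_sizes and bound are made up for illustration. It counts each level as if no game ever ended early, so the result is an upper bound on the tree size rather than the exact number of nodes.

```python
from math import factorial

# Level k of the tree (k moves played) has at most 9!/(9-k)! boards:
# 9 choices for the first move, 8 for the second, and so on.
level_sizes = [factorial(9) // factorial(9 - k) for k in range(10)]

bound = sum(level_sizes)

print(level_sizes)  # [1, 9, 72, 504, 3024, 15120, 60480, 181440, 362880, 362880]
print(bound)        # 986410
```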

Evaluating a tree

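Win for X: 1
Win for O: -1
Tie game: 0

Score each finished game this way, then work upward: a board where X moves gets the maximum of its children's values, and a board where O moves gets the minimum.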

          XOO
          X--
      ____OX-____
     /     1     \
    /      |      \
  XOO     XOO     XOO
  XX-     X-X     X--
  OX-     OX-     OXX
 / 1 \   /-1 \   /-1 \
 |   |   |   |   |   |
XOO XOO XOO XOO XOO XOO
XXO XX- XOX X-X XO- X-O
OX- OXO OX- OXO OXX OXX
 1   1  -1   1  -1   1
 |   |       |       |
XOO XOO     XOO     XOO
XXO XXX     XXX     XXO
OXX OXO     OXO     OXX
 1   1       1       1

Pseudocode

Algorithm Minimax(board, player):
if board is win for X, then return 1.
else if board is win for O, then return -1.
else if board is tie game, then return 0.
end of if
let best be -infinity if player is X, or infinity if player is O.
for each legal move on board, do:
  Make move on board.
  let value be Minimax(board, other player).
  Undo move from board.

  if player is X and value > best, then let best be value.
  else if player is O and value < best, then let best be value.
  end of if
end of loop
return best.
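
One possible Python rendering of the pseudocode above is sketched here. The board representation (a 9-character string in row-major order, with '-' for an empty square) and the helper names LINES and winner are assumptions made for this example, not something the notes prescribe. Because strings are immutable, the sketch builds a new board for each move instead of making and undoing it.

```python
# Board: 9-character string in row-major order, '-' marks an empty square.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != '-' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board, player):
    """Value of board with player to move: 1 = X wins, -1 = O wins, 0 = tie."""
    won = winner(board)
    if won == 'X':
        return 1
    if won == 'O':
        return -1
    if '-' not in board:
        return 0  # full board and nobody has won

    other = 'O' if player == 'X' else 'X'
    best = float('-inf') if player == 'X' else float('inf')
    for i, square in enumerate(board):
        if square != '-':
            continue
        # Building a new string plays the role of "make move" / "undo move".
        child = board[:i] + player + board[i + 1:]
        value = minimax(child, other)
        if player == 'X' and value > best:
            best = value
        elif player == 'O' and value < best:
            best = value
    return best


# The board from the game-tree example, with X to move.
print(minimax("XOOX--OX-", 'X'))  # 1
```

The result agrees with the value 1 written at the root of the evaluated tree.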

Heuristics

Usually, the game tree is much too big to search completely.

Idea: Use a heuristic function to estimate the "goodness" of a board.

When we get to a particular depth, stop there and return the estimate instead of recursing.

Example

Consider this heuristic: Start with 0. Add 1 for each line (row, column, or diagonal) that contains two X's and one empty square. Subtract 1 for each line that contains two O's and one empty square.
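
As a rough sketch, this heuristic could be written in Python as follows. The board format (a 9-character row-major string with '-' for empty) and the names LINES and heuristic are assumptions made for this example.

```python
# Board: 9-character string in row-major order, '-' marks an empty square.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def heuristic(board):
    """+1 per line with two X's and an empty square, -1 per such line for O."""
    score = 0
    for a, b, c in LINES:
        cells = board[a] + board[b] + board[c]
        if cells.count('-') != 1:
            continue  # only lines with exactly one empty square count
        if cells.count('X') == 2:
            score += 1
        elif cells.count('O') == 2:
            score -= 1
    return score


print(heuristic("OXXOO-XOX"))  # 0  (one open line for each player)
print(heuristic("OXX-OOXOX"))  # -1
print(heuristic("OX-OOXXOX"))  # 1
```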

Here is the game tree searched to a depth of 2, with X to move at A:

          OX- A
          -O-
      ____XOX____
     /     0     \
    /      |      \
  OXX     OX-     OX- B
  -O-     XO-     -OX
  XOX     XOX     XOX
 /-1 \   / 0 \   / 0 \
 |   |   |   |   |   |
OXX OXX OXO OX- OXO OX- C
OO- -OO XO- XOO -OX OOX
XOX XOX XOX XOX XOX XOX
 0  -1   0   0   0   1

Pseudocode

Algorithm Minimax(board, player, depth):
if board is win for X, then return 1000000.
else if board is win for O, then return -1000000.
else if board is tie game, then return 0.
else if depth = 0, then return Heuristic(board, player).
end of if
let best be -infinity if player is X, or infinity if player is O.
for each legal move on board, do:
  Make move on board.
  let value be Minimax(board, other player, depth - 1).
  Undo move from board.

  if player is X and value > best, then let best be value.
  else if player is O and value < best, then let best be value.
  end of if
end of loop
return best.
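
A Python sketch of the depth-limited version follows, under the same assumed board format as before (a 9-character row-major string, '-' for empty). The helper names are illustrative. The heuristic here ignores which player is to move, which is a simplification of the Heuristic(board, player) call in the pseudocode.

```python
INF = float('inf')
WIN = 1_000_000  # larger than any value the heuristic can return

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != '-' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def heuristic(board):
    """+1 per line with two X's and an empty square, -1 per such line for O."""
    score = 0
    for a, b, c in LINES:
        cells = board[a] + board[b] + board[c]
        if cells.count('-') == 1:
            if cells.count('X') == 2:
                score += 1
            elif cells.count('O') == 2:
                score -= 1
    return score


def minimax(board, player, depth):
    """Minimax that stops recursing and estimates once depth reaches 0."""
    won = winner(board)
    if won == 'X':
        return WIN
    if won == 'O':
        return -WIN
    if '-' not in board:
        return 0
    if depth == 0:
        return heuristic(board)

    other = 'O' if player == 'X' else 'X'
    best = -INF if player == 'X' else INF
    for i, square in enumerate(board):
        if square != '-':
            continue
        child = board[:i] + player + board[i + 1:]
        value = minimax(child, other, depth - 1)
        if player == 'X' and value > best:
            best = value
        elif player == 'O' and value < best:
            best = value
    return best


# Board A from the example, X to move, searched to depth 2.
print(minimax("OX--O-XOX", 'X', 2))  # 0
```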

Alpha-beta search

Brilliant observation: Sometimes entire subtrees are irrelevant.

In the above case, before we get to C, we know A is at least 0, since X can choose the middle move, which is worth 0. And B can't be more than 0, since its left child is worth 0 and O chooses the minimum. So B cannot beat the move X already has, and there's no point in looking at C.
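
The same reasoning can be sketched in Python. This is one common way to write alpha-beta, not necessarily the formulation used in class: alpha is the best value X is already guaranteed, beta is the best value O is already guaranteed, and a node stops looking at moves once alpha >= beta. The board format, the helper names, and the leaves list (used only to record which boards get evaluated) are assumptions made for this example.

```python
INF = float('inf')
WIN = 1_000_000

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def winner(board):
    for a, b, c in LINES:
        if board[a] != '-' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def heuristic(board):
    score = 0
    for a, b, c in LINES:
        cells = board[a] + board[b] + board[c]
        if cells.count('-') == 1:
            if cells.count('X') == 2:
                score += 1
            elif cells.count('O') == 2:
                score -= 1
    return score


def alphabeta(board, player, depth, alpha=-INF, beta=INF, leaves=None):
    """Depth-limited minimax that skips subtrees which cannot matter."""
    won = winner(board)
    if won == 'X':
        return WIN
    if won == 'O':
        return -WIN
    if '-' not in board:
        return 0
    if depth == 0:
        if leaves is not None:
            leaves.append(board)  # record that this board was evaluated
        return heuristic(board)

    other = 'O' if player == 'X' else 'X'
    best = -INF if player == 'X' else INF
    for i, square in enumerate(board):
        if square != '-':
            continue
        child = board[:i] + player + board[i + 1:]
        value = alphabeta(child, other, depth - 1, alpha, beta, leaves)
        if player == 'X':
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # the remaining moves here cannot change the result
    return best


leaves = []
print(alphabeta("OX--O-XOX", 'X', 2, leaves=leaves))  # 0
print(len(leaves))                                    # 5 of the 6 leaves
print("OX-OOXXOX" in leaves)                          # False: C was skipped
```

The value at A is the same as with plain minimax; only the amount of work changes.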

Other tricks

opening books
avoiding recomputation
expand horizons of good subtrees (Deep Thought, CMU, 1988)
fast, parallel hardware (Deep Blue, IBM, 1996)
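
One way to read "avoiding recomputation" is caching: the same board can be reached by different move orders, so its value only needs to be computed once. The sketch below is an illustration of that idea using Python's functools.lru_cache on the plain minimax; the board format and helper names are assumptions for this example, and real game programs typically use more elaborate tables.

```python
from functools import lru_cache

# Board: 9-character string in row-major order, '-' marks an empty square.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def winner(board):
    for a, b, c in LINES:
        if board[a] != '-' and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)  # remember the value of every board already solved
def minimax(board, player):
    won = winner(board)
    if won == 'X':
        return 1
    if won == 'O':
        return -1
    if '-' not in board:
        return 0

    other = 'O' if player == 'X' else 'X'
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, square in enumerate(board) if square == '-']
    return max(values) if player == 'X' else min(values)


print(minimax("---------", 'X'))       # 0: perfect play from an empty board ties
print(minimax.cache_info().currsize)   # distinct boards solved, far fewer than tree nodes
```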