Winning Hand

There's high drama at the 2008 World Series of Poker in Las Vegas: Antic poker pro and 11-time champ Phil Hellmuth is all-in, betting his entire stake on this hand. With one card yet to be dealt, he holds the ace of hearts and queen of diamonds in his hand; the king of diamonds and the three, four and ten of hearts are face up in front of him.

Any number of cards could win the pot for Hellmuth and keep him alive in the competition--any ace, any queen, any of three jacks, or any heart at all, for heaven's sake. But the dealer turns over the two of spades. Hellmuth is done, his dream of a 12th championship on hold for another year.

Halfway across the country, at the Association for the Advancement of Artificial Intelligence conference in Chicago, the tension at the third annual Computer Poker Competition is — nonexistent. Play ended before the AAAI's July conference even began and it only remains for the competition chair, a computer science PhD student from Carnegie Mellon named Andrew Gilpin, to announce the winners.

"Computer poker is not exactly a spectator sport," Gilpin says later. That's an understatement. Working by himself, Gilpin spent four solid weeks running the competition on Carnegie Mellon computers, pitting 15 computer poker programs, or "pokerbots," from seven countries against one another in more than three million hands of Texas Hold 'Em. As the computers cranked away, he resisted the urge to peek at the results. "I just made sure the programs didn't crash or didn't do something malicious and that the results were being generated," Gilpin says.

. . .

What computer poker lacks in drama, however, it more than makes up in intellectual challenge. Poker has taken the place of chess as the game that's driving artificial intelligence research, giving rise to these annual competitions at the AAAI and, in 2009, at the International Joint Conference on Artificial Intelligence, or IJCAI.

"The 2008 competition was great for us," says Tuomas Sandholm, professor of computer science and leader of Carnegie Mellon's computer poker initiative. For the first time, a Carnegie Mellon pokerbot developed by Sandholm, Gilpin and a collaborator at the University of Munich, Troels Bjerre Sørensen, collected the most chips in Heads-Up Limit Texas Hold 'Em, winning one of the contest's four categories. Carnegie Mellon also collected the most chips in Heads-Up No Limit Texas Hold 'Em, but that category's winner was determined by a runoff rather than total chips; the Carnegie Mellon bot ended up in third place.

Winning any contest is nice, but the stakes are particularly high for poker, Sandholm says. The players must make strategic decisions while knowing little about the cards their opponents hold. That contrasts sharply with chess, which offers a dizzying array of possible moves, but also provides complete information about every move that's made. Adding to poker's uncertainty is the possibility that an opponent's bet is a bluff.

. . .

Making decisions based on incomplete or misleading information is a common dilemma for business and military leaders, so the payoff in developing computer programs that can solve a poker game is potentially great. "A lot of real-world situations have uncertainty in them, and you have to deal with uncertainty," says Sandholm, whose research focuses primarily on e-commerce, such as automatically setting rules for auctions or negotiations. Take auctions of radio spectrum by the Federal Communications Commission, for instance. "How do you bid rationally in these auctions?" Sandholm asks. "Nobody knows how. Even if you knew the true value of a piece of the spectrum, bidding at its true value is not an optimal strategy."

Perhaps a poker-playing algorithm could help. "It may be rational to bid on things you don't intend to win, just to drive up your competitor's prices, or to steer them to other parts of the spectrum," Sandholm says.

Mobile robotics is another realm where uncertainty is a given. On its way to winning the Defense Advanced Research Projects Agency's 2007 Urban Challenge robot race, Tartan Racing's robotic SUV, Boss, had to make allowances for vehicles that proceeded out of turn at four-way stops. (See "A November to Remember," The Link, Issue 3.0.) But as autonomous vehicles move from closed race courses to city streets, they'll have to judge the skills and intentions of other drivers, pedestrians and even the occasional squirrel.

Poker also is relevant to "games of national importance," one of the topics that DARPA asked its Information Science and Technology Board to address. Some of those games have military significance--the Allies' feint toward a landing at Calais before launching their invasion of Normandy is a classic example--but others might address information warfare or international finance.

"So many interactions in the world are games--it just comes down to whether you can model them," Sandholm says.

As in life, "luck plays a large role in poker," Gilpin says. But it's possible to minimize the role of chance when evaluating poker algorithms because the computer can remember each hand it dealt. In the AAAI competition, each two-player match consisted of 10,000 hands--5,000 as originally dealt, and 5,000 with the cards reversed, with pokerbot A getting pokerbot B's original hands and vice versa.
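That card-reversal trick, sometimes called duplicate dealing, can be sketched in a few lines of Python. This is a toy model, not the competition's actual code: each hand's outcome for bot A is treated as the luck of the deal plus a hypothetical skill edge plus noise, and playing every deal twice with the cards reversed cancels the luck term.

```python
import random

def duplicate_match(n_hands, skill_edge, seed=0):
    """Score bot A over a duplicate match: every deal is played twice,
    with A and B swapping cards the second time, so the luck of the
    deal cancels out of A's combined score."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_hands):
        luck = rng.gauss(0, 5.0)        # chip swing due to this deal's cards
        # A plays the deal as dealt, then again with the cards reversed;
        # only the (assumed) skill edge and a little noise remain.
        total += luck + skill_edge + rng.gauss(0, 0.5)
        total += -luck + skill_edge + rng.gauss(0, 0.5)
    return total / (2 * n_hands)        # average chips per hand for A

# The per-deal luck swings are 50 times larger than the skill edge,
# yet the duplicate average lands close to the edge itself.
print(duplicate_match(5000, skill_edge=0.1))
```

The point of the design is variance reduction: without the reversed replay, an average over 10,000 hands would still be dominated by which bot happened to be dealt the better cards.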

. . .

It's not as easy to filter luck out of a multi-player game, Gilpin points out, because seating order is such a big factor. A player sitting to the left of a weak player, for instance, is at an advantage, while a player to the right is at a disadvantage. So, in the AAAI competition, seats were randomly assigned prior to each of the 25 matches of 6,000 hands.

One of the challenges of poker is simply whittling down the size of the game to something that is computationally manageable. Heads-Up Limit Texas Hold 'Em, a two-player game in which the size of each bet is restricted, has 10 to the 18th power--or a billion times a billion--possible combinations of hands and bets. For no-limit games, in which the amount wagered is limited only by a player's bankroll, the number of possible combinations is a staggering 10 to the 71st power.

As complex as the no-limit game might be, the multi-player game is a far greater challenge. "I think it's one of the big open problems in multi-agent systems and artificial intelligence," says Sam Ganzfried, a computer science PhD student who joined Sandholm and Gilpin in the poker group in 2007 and helped devise a six-player pokerbot for the 2008 competition.

Sandholm's group has developed a number of ways of automatically abstracting the games, such as recognizing strategically similar hands and grouping them together, to make the problem manageable. Gilpin and Sandholm have recently developed algorithms that take advantage of the greater computing power available through the Pittsburgh Supercomputing Center.
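The flavor of grouping strategically similar hands can be shown with a simple Python sketch (an illustration, not the group's actual algorithms). Before the flop, the specific suits of a hold 'em hand matter only insofar as the two cards are suited or not, so all 1,326 two-card combinations collapse losslessly into 169 classes--a first, easy abstraction step before coarser bucketing.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]    # 'Ah' = ace of hearts, etc.

def preflop_class(hole):
    """Collapse a two-card starting hand to its strategic class: before
    the flop, suits matter only insofar as the cards are suited, so
    Ah-Kh and As-Ks both map to 'AKs', while Ah-Kc maps to 'AKo'."""
    (r1, s1), (r2, s2) = hole
    hi, lo = sorted((r1, r2), key=RANKS.index, reverse=True)
    if r1 == r2:
        return hi + lo                           # a pair, e.g. 'QQ'
    return hi + lo + ("s" if s1 == s2 else "o")

buckets = {}
for hole in combinations(DECK, 2):
    buckets.setdefault(preflop_class(hole), []).append(hole)

# All 1,326 two-card combinations collapse into 169 classes.
print(len(buckets), sum(len(v) for v in buckets.values()))
```

Real abstraction algorithms go much further, merging classes (and later betting situations) whose expected strength is close enough that one strategy can serve for all of them.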

. . .

Early attempts at pokerbots, including pioneering work at the University of Alberta, Canada, often attempted to encode the expertise of human card pros. Another approach is to use game theory--the branch of applied mathematics that analyzes interactions between entities--to develop strategy based simply on the rules of the game. Simplified poker games have been tackled with game theory since the 1940s. The Alberta group began experimenting with game theory on full-scale two-person Texas Hold 'Em in 2002 and Sandholm opted to use game theory exclusively when he delved into computer poker research in 2004.

"I haven't read a single poker book in my life," says Sandholm, who makes no claim to being much of a card player. But he's convinced that game theory gave his pokerbots an edge: "Absent computational limitations, computers will play poker optimally."

Sandholm's group introduced the approach of using an automated abstraction algorithm to the game, followed by the use of a custom equilibrium-finding algorithm.  Though the Carnegie Mellon programs didn't notch a win in the AAAI tournament until this year, they have always been strong contenders and all of the competitive pokerbots now employ the Sandholm group's approach.

Computers certainly play the game differently than most human players. Pokerbots are typically programmed to seek a Nash equilibrium--finding an optimal strategy that you wouldn't change even if you knew everything your opponent was trying to do. "But human players aren't optimizing their play," says Michael Bowling (CS'96, '99, '03), leader of Alberta's Computer Poker Research Group. "Their goal is to extract money from weaker players."
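The equilibrium idea can be seen on a much smaller game than poker. In rock-paper-scissors, the only Nash equilibrium mixes each action with probability one-third--analogous to the randomized bluffing an equilibrium poker strategy requires. A minimal Python sketch (illustrative only; pokerbots use far more sophisticated solvers) uses fictitious play, a classic method in which each player repeatedly best-responds to the other's history of play; in two-player zero-sum games the empirical frequencies are known to approach equilibrium.

```python
def best_response(payoff, opp_counts):
    """Pure strategy maximizing expected payoff against the opponent's
    empirical mix of past plays."""
    scores = [sum(row[j] * opp_counts[j] for j in range(len(opp_counts)))
              for row in payoff]
    return scores.index(max(scores))

def fictitious_play(payoff, iters=50000):
    """Both players repeatedly best-respond to the other's history of
    play; empirical frequencies drift toward a Nash equilibrium."""
    n, m = len(payoff), len(payoff[0])
    neg_t = [[-payoff[i][j] for i in range(n)] for j in range(m)]  # column player's view
    row_counts, col_counts = [1] * n, [1] * m
    for _ in range(iters):
        r = best_response(payoff, col_counts)
        c = best_response(neg_t, row_counts)
        row_counts[r] += 1
        col_counts[c] += 1
    return ([x / sum(row_counts) for x in row_counts],
            [x / sum(col_counts) for x in col_counts])

# Rock-paper-scissors, payoffs to the row player.
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
row_mix, col_mix = fictitious_play(RPS)
print(row_mix)   # each entry near 1/3
```

Playing the equilibrium mix guarantees you can't be exploited--but, as Bowling notes, it also passes up chances to exploit a weak opponent.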

. . .

Dramatic advances have been made in both abstraction algorithms and equilibrium-finding algorithms in the past three years, Sandholm says. For example, in 2006 the largest two-person zero-sum games that could be solved were those with no more than 10 million decision points; today, games 100,000 times bigger can be solved nearly optimally. And the supremacy of human players has begun to yield. Last summer, just prior to the AAAI conference, an Alberta pokerbot with the ability to adapt its play to its opponent's style edged out a couple of human poker champions in a human versus machine Heads-Up Limit Texas Hold 'Em competition in Vegas, just a year after narrowly losing to a couple of pros.

But Bowling, who earned his PhD in computer science at Carnegie Mellon in 2003, says it's too early for pokerbots to declare victory over humans. "The fact is, we just don't know how strong human players are," he says. In repeated matches, humans might alter their style of play to exploit weaknesses in the pokerbots. And last year's win was in "limit" poker, not the more popular and more difficult "no-limit" game.

When it comes to no-limit, "we're good enough to play at the lowest professional ranks or at the upper amateur level," says Bowling, whose pokerbots won three out of the four categories in the 2008 AAAI contest, including the no-limit competition. Pokerbots for the limit game were operating at that level about two or three years ago.

Though Alberta has long dominated computer game research, Carnegie Mellon's entry into the computer poker field in 2005 was something the Alberta researchers welcomed because it helped sharpen the competition, Bowling says. "If you look at how much progress has been made on computer poker worldwide in the last three years, it's probably five times as much as the previous 12 years combined."

Which, as Sandholm sees it, means there's still a long way to go. "Humans are just one benchmark along the way," he says. "In 1997, a computer beat the world champion in chess, but that didn't mean that computer chess programs played chess optimally." To achieve the goal of optimal play, pokerbots will need improved abstraction algorithms so they can create a closer approximation of the real game, he says. Also, achieving a Nash equilibrium may not be sufficient; in multi-player games in particular, it may prove impossible to reach a state where all players have optimized their strategy. As player modeling improves, computers may steal a trick from their human cousins and find that colluding against a weaker opponent has advantages in some instances.

"We haven't been focusing on exploiting humans," Sandholm says. "But if some human wants to challenge us on a heads-up limit game, we'll put money on it."
For More Information: 

Jason Togyer | 412-268-8721 | jt3y@cs.cmu.edu