Source : http://xkcd.com/601/
In this homework, you will implement the randomized weighted majority algorithm for finding the Nash equilibrium mixed strategy profile of a zero sum game.
The code for this project contains the following files, which are available in a zip archive:
||Your interface with the game solver.|
||Contains unit test cases and test suite harness. Also defines the payoff matrices for the games that will be tested.|
What to submit: You are required to make the test cases pass by implementing the weighted majority algorithm.
You will fill in the
weighted_majority function of
game.py for the assignment. You should submit only this one file by placing it in
/afs/andrew.cmu.edu/usr/dannyz/hw6/[your_andrew_id]. Please do not change or submit any of the other files from the distribution.
Evaluation: Your code will be autograded for technical correctness. Please do not change the names of any provided functions within the code, or you will wreak havoc on the autograder. However, the correctness of your implementation -- not the autograder's judgements -- will be the final judge of your score. If necessary, we will review and grade assignments individually to ensure that you receive due credit for your work.
Academic Dishonesty: We will be checking your code against other submissions in the class for logical redundancy. If you copy someone else's code and submit it with minor changes, we will know. We trust you all to submit your own work only; please don't let us down. If you do, we will pursue the strongest consequences available to us.
Getting Help: You are not alone! If you find yourself stuck on something, contact the course staff for help. Office hours and Piazza are there for your support; please use them. If you can't make our office hours, let us know and we will schedule more. We want these projects to be rewarding and instructional, not frustrating and demoralizing. But, we don't know when or how to help unless you ask.
As discussed in lecture, randomized weighted majority is an iterative algorithm that finds the mixed-strategy profile for both players by playing repeated games. As a reminder, the algorithm proceeds by iterating the following procedure (assuming that the matrix entries represent payoffs for the Row player, as is the convention):
After sufficient iterations, the time-averaged weights on Column's moves converge to his Nash equilibrium mixed strategy, and the distribution of best moves made by Row converges to his Nash equilibrum strategy. You can tell that the distributions have converged by testing whether the expected values of each player's strategies sum to approximately 0.
To compute the expected value for Player X, compute the expected values for all possible moves of Player Y given Player X's mixed strategy. The value that is best for Player Y corresponds to the move that Y will presumably make, so that value (or its negation, if X is the minimizer) is the expected value of X's strategy. Remember that the minimizer will prefer low expected values, while the maximizer will prefer high expected values.
More details on the algorithm can be found in the first half of this paper (note that in this paper's notation, Q -- the Column player -- is the maximizer). You can find more information on the game theory concepts involved in Jerry Zhu's lecture notes on the topic.
To get started, see the options available to you
python main.py --help
The game theory payoff matrices are stored as test cases, which you can ask the utility to print. For example, you can view the Rock-Paper-Scissors-Lizard-Spock matrix by running:
python main.py --view rockSpock
To run the test cases, use the
--test flag. You can run test case for the Rock-Paper-Scissors-Lizard-Spock game
python main.py --test rockSpock
You can run all tests with:
python main.py --test all
You will find that the test cases initially fail with a
weighted_majority function is an abstraction for the weighted majority algorithm. You will need to define your own code and functions for proposing strategies, finding the best response to a mixed strategy, updating weights, assessing stopping criteria, etc. The unit-testing framework interacts with your code only via
weighted_majority function with the correct signature; how you implement the internals is up to you. Please don't hardcode values, as we will be reading your code for algorithmic correctness as well.
Time to code!
HTML template borrowed from the Pacman AI projects