CMU 15-418 (Spring 2012) Final Project:
Parallelizing the Forward Search Sparse Sampling Algorithm for Markov Decision Processes
Christopher Mohr
Reports
Working Schedule
Week |
What We Plan To Do |
What We Actually Did |
Apr 1-7 |
Begin to implement a correct, sequential version of FSSS in C. |
Implemented a correct, sequential version of FSSS in C. |
Apr 8-14 |
Complete the sequential version of FSSS in C and test it for correctness against my existing Matlab version of the algorithm. |
Adapted FSSS to work for Belief MDPs. Implemented a small POMDP called the “Tiger” problem and tested the C version of the algorithm against the Matlab version for various inputs. |
Apr 15-21 |
Begin writing the OpenMP version of this algorithm. This will include considering where the sequential nature of the algorithm can be weakened while preserving correctness. Simultaneously, begin trying to adapt the sequential version to deal with Belief MDPs. |
Implemented a larger POMDP called the “RockSample” problem. Tested results of the algorithm of this problem. Began writing OpenMP version of FSSS. Currently running iterations of the algorithm in parallel and using locks for each node for data update. |
Apr 22-28 |
|
|
Apr 29-May 5 |
|
|
May 6-11 |
|
|
Working Log
Made locks at every node. The problem is determining when exactly locks need to be set. Also struggling with flush directive.
Places where locks need set:
Adding a new branch to the tree (and all associated memory operations). This is necessary to prevent the tree structure from being corrupted, e.g., writing two branches to the same place, or adding too many branches
Other things to implement:
It seems wasteful for each thread to update the values of each node, because only the last thread to make it back to that node will have the most updated values.
Distribution of actions and states to choose. Currently, every thread picks the same action, and the same state if all states have been explored once.
Suggestion from Prof. Brunskill: Use something like the Boltzmann distribution to pick actions and states