The POMDP framework is a generalized model for planning under
uncertainty. However, many still believe that POMDPs cannot scale to
real-world-sized problems. There are two distinct but interdependent
reasons for the limited scalability of POMDP algorithms. The first is
the so-called curse of dimensionality, which was discussed in last
week's ML lunch. The second is what we will call the curse of
history: the number of distinct action-observation histories
considered for planning grows exponentially with the planning horizon.
In this talk I will present the Point-Based Value Iteration algorithm for POMDP planning, which specifically addresses the curse of history. It approximates an exact value iteration solution by selecting a small set of representative belief points and planning only for those. By using stochastic trajectories to choose belief points, and by maintaining only one value hyperplane per point, it is able to solve large problems, including the robotic Tag domain, a POMDP version of the popular game of lasertag.
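The core of the approach is the point-based Bellman backup: at each chosen belief point, it computes the single best alpha-vector (value hyperplane) rather than all vectors an exact backup would generate. The sketch below illustrates this on the classic tiger POMDP; the model parameters, the fixed belief set, and all names are illustrative assumptions rather than details from the talk, and the stochastic-trajectory expansion of the belief set is omitted.

```python
import numpy as np

# Illustrative tiger POMDP (assumed parameters, not from the talk):
# states {tiger-left, tiger-right}, actions {listen, open-left, open-right},
# observations {hear-left, hear-right}.
S, A, Z = 2, 3, 2
GAMMA = 0.95

# T[a][s, s']: opening a door resets the tiger's position uniformly.
T = np.array([np.eye(S),
              np.full((S, S), 0.5),
              np.full((S, S), 0.5)])
# O[a][s', z]: listening is 85% accurate; opening yields no information.
O = np.array([[[0.85, 0.15], [0.15, 0.85]],
              [[0.50, 0.50], [0.50, 0.50]],
              [[0.50, 0.50], [0.50, 0.50]]])
# R[a][s]: immediate reward for each action in each state.
R = np.array([[-1.0, -1.0],      # listen
              [-100.0, 10.0],    # open-left
              [10.0, -100.0]])   # open-right

def backup(b, Gamma):
    """Point-based backup: return the single best alpha-vector at belief b."""
    best_alpha, best_val = None, -np.inf
    for a in range(A):
        g = R[a].copy()
        for z in range(Z):
            # g_{a,z}(s) = sum_{s'} T[a][s,s'] * O[a][s',z] * alpha(s')
            cands = [T[a] @ (O[a][:, z] * alpha) for alpha in Gamma]
            g = g + GAMMA * cands[int(np.argmax([c @ b for c in cands]))]
        if g @ b > best_val:
            best_alpha, best_val = g, g @ b
    return best_alpha

# A small fixed belief set; PBVI proper would grow this set by simulating
# stochastic action-observation trajectories.
B = [np.array(p) for p in ([1.0, 0.0], [0.0, 1.0], [0.5, 0.5],
                           [0.85, 0.15], [0.15, 0.85])]
Gamma = [np.zeros(S)]                  # trivial initial value function
for _ in range(30):                    # repeated point-based backups
    Gamma = [backup(b, Gamma) for b in B]   # one hyperplane per belief point

def value(b):
    return max(alpha @ b for alpha in Gamma)

print(value(np.array([0.5, 0.5])))
```

Because each backup keeps only one vector per belief point, the size of the value-function representation stays bounded by the number of belief points, which is what breaks the exponential growth of an exact backup.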
Charles Rosenberg Last modified: Wed Mar 5 11:59:58 EST 2003