Speeding Up the Convergence of Value Iteration
in Partially Observable Markov Decision Processes

Nevin L. Zhang
Weihong Zhang
Department of Computer Science
Hong Kong University of Science & Technology
Clear Water Bay Road, Kowloon, Hong Kong, CHINA


Partially observable Markov decision processes (POMDPs) have recently become popular among AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs, but it typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: it enabled value iteration to converge after only a few iterations on all the test problems.

Dr. Lian Wen Zhang
Thu Feb 15 14:47:09 HKT 2001