Next: Computing POMDP policies Up: Finding Approximate POMDP Solutions Previous: Exponential Family PCA

Subsections

E-PCA Performance

Using the loss function from equation (8) with the iterative optimization procedure described by equation (26) and equation (27) to find the low-dimensional factorization, we can look at how well this dimensionality-reduction procedure performs on some POMDP examples.

Toy Problem

Recall from Figure 9 that we were unable to find good representations of the data with fewer than 10 or 15 bases, even though our domain knowledge indicated that the data had 3 degrees of freedom (horizontal position of the mode along the corridor, concentration about the mode, and probability of being in the top or bottom corridor). Examining one of the sample beliefs in Figure 10, we saw that the representation was worst in the low-probability regions. We can now take the same data set from the toy example, use E-PCA to find a low-dimensional representation and compare the performance of PCA and E-PCA. Figure 11(a) shows that E-PCA is substantially more efficient at representing the data, as we see the KL divergence falling very close to 0 after 4 bases. Additionally, the squared error at 4 bases is . (We need 4 bases for perfect reconstruction, rather than 3, since we must include a constant basis function. The small amount of reconstruction error with 4 bases remains because we stopped the optimization procedure before it fully converged.)

 [Reconstruction Performance] [An Example Belief and Reconstruction] [The Belief and Reconstruction Near the Peak] [The Belief and Reconstruction In Low-Probability Region]

Figure 11(b) shows the E-PCA reconstruction of the same example belief as in Figure 10. We see that many of the artifacts present in the PCA reconstruction are absent. Using only 3 bases, we see that the E-PCA reconstruction is already substantially better than PCA using 10 bases, although there are some small errors at the peaks (e.g., Figure 11c) of the two modes. (Using 4 bases, the E-PCA reconstruction is indistinguishable to the naked eye from the original belief.) This kind of accuracy for both 3 and 4 bases is typical for this data set.

Robot Beliefs

Although the performance of E-PCA on finding good representations of the abstract problem is compelling, we would ideally like to be able to use this algorithm on real-world problems, such as the robot navigation problem in Figure 2. Figures 12 and 13 show results from two such robot navigation problems, performed using a physically-realistic simulation (although with artificially limited sensing and dead-reckoning). We collected a sample set of 500 beliefs by moving the robot around the environment using a heuristic controller, and computed the low-dimensional belief space according to the algorithm in Table 1. The full state space is , discretized to a resolution of per pixel, for a total of 799 states. Figure 12(a) shows a sample belief, and Figure 12(b) the reconstruction using 5 bases. In Figure 12(c) we see the average reconstruction performance of the E-PCA approach, measured as average KL-divergence between the sample belief and its reconstruction. For comparison, the performance of both PCA and E-PCA are plotted. The E-PCA error falls to at 5 bases, suggesting that 5 bases are sufficient for good reconstruction. This is a very substantial reduction, allowing us to represent the beliefs in this problem using only 5 parameters, rather than 799 parameters. Notice that many of the states lie in regions that are outside'' the map; that is, states that can never receive probability mass were not removed. While removing these states would be a trivial operation, the E-PCA is correctly able to do so automatically.

 (a) Original Belief (b) Reconstruction (c) Reconstruction performance

In Figure 13, similar results are shown for a different environment. A sample set of 500 beliefs was again collected using a heuristic controller, and the low-dimensional belief space was computed using the E-PCA. The full state space is , with a resolution of per pixel. An example belief is shown in Figure 13(a), and its reconstruction using 6 bases is shown Figure 13(b). The reconstruction performance as measured by the average KL divergence is shown in Figure 13(c); the error falls very close to 0 around 6 bases, with minimal improvement thereafter.

 [A sample belief] [The reconstruction] [Average reconstruction performance]

Next: Computing POMDP policies Up: Finding Approximate POMDP Solutions Previous: Exponential Family PCA
Nicholas Roy 2005-01-16