
E-PCA Performance

Using the loss function from equation (8) together with the iterative optimization procedure of equations (26) and (27) to find the low-dimensional factorization, we can examine how well this dimensionality-reduction procedure performs on some example POMDPs.
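To make the procedure concrete, the following is a minimal sketch of this factorization for the Poisson link used throughout this section. It substitutes plain alternating gradient steps for the Newton updates of equations (26) and (27), omits the constant basis function discussed below, and uses illustrative names and step sizes (epca_factorize, lr, iters) that are not part of the original algorithm:

\begin{verbatim}
import numpy as np

def epca_factorize(B, k, iters=2000, lr=1e-4, seed=0):
    """Factor a beliefs matrix B (states x samples) as exp(U @ V).

    Minimizes the Poisson-style E-PCA loss
        L(U, V) = sum(exp(U @ V) - B * (U @ V)),
    whose reconstruction is exp(U @ V).  Plain alternating gradient
    steps stand in for the Newton updates of equations (26)-(27).
    """
    n, m = B.shape
    rng = np.random.default_rng(seed)
    U = 0.01 * rng.standard_normal((n, k))
    V = 0.01 * rng.standard_normal((k, m))
    for _ in range(iters):
        G = np.exp(U @ V) - B        # dL/d(U V) for the Poisson loss
        U -= lr * (G @ V.T)          # step in U with V held fixed
        G = np.exp(U @ V) - B        # recompute with the updated U
        V -= lr * (U.T @ G)          # step in V with U held fixed
    return U, V
\end{verbatim}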

Toy Problem

Recall from Figure 9 that we were unable to find good representations of the data with fewer than 10 or 15 bases, even though our domain knowledge indicated that the data had 3 degrees of freedom (the horizontal position of the mode along the corridor, the concentration about the mode, and the probability of being in the top or bottom corridor). Examining one of the sample beliefs in Figure 10, we saw that the representation was worst in the low-probability regions. We can now take the same data set from the toy example, use E-PCA to find a low-dimensional representation, and compare the performance of PCA and E-PCA. Figure 11(a) shows that E-PCA is substantially more efficient at representing the data: the KL divergence falls very close to 0 after 4 bases. Additionally, the squared $L_2$ error at 4 bases is $4.64 \times 10^{-4}$. (We need 4 bases for perfect reconstruction, rather than 3, because we must include a constant basis function. The small amount of reconstruction error that remains with 4 bases is due to stopping the optimization procedure before it fully converged.)
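The reconstruction quality in Figure 11(a) is measured as the average KL divergence between each sampled belief and its reconstruction. A minimal sketch of that measurement follows; the renormalization of the reconstruction is an assumption, since exp(U @ V) is not guaranteed to sum exactly to one:

\begin{verbatim}
import numpy as np

def avg_kl(B, B_hat, eps=1e-30):
    """Mean and std. dev. of KL(b || b_hat) over the columns of B."""
    B_hat = B_hat / B_hat.sum(axis=0, keepdims=True)  # renormalize (assumption)
    mask = B > eps                                    # 0 log 0 terms contribute 0
    ratio = np.where(mask, B / np.maximum(B_hat, eps), 1.0)
    kl = (np.where(mask, B, 0.0) * np.log(ratio)).sum(axis=0)
    return kl.mean(), kl.std()
\end{verbatim}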

Figure 11: (a) The average KL divergence between the original sample set and the reconstructions; the KL divergence is $0.018$ with 4 bases. The error bars show the standard deviation from the mean over the 500 beliefs. (b) The same example belief from Figure 10 and its reconstruction using 3 bases. The reconstruction shows small errors at the peak of each mode. Not shown is the reconstruction using 4 bases, for which the original belief and its reconstruction are indistinguishable to the naked eye. (c) and (d) show fine detail of the original belief and the reconstruction in two parts of the state space. Although the reconstruction in the low-probability area is not perfect, the error there is approximately $2\times10^{-9}$.
\includegraphics[height=2in]{figs/toy-kl-epca.ps}
[Reconstruction Performance]
\includegraphics[height=2in]{figs/toy-example-3bases.ps}
[An Example Belief and Reconstruction]
\includegraphics[height=2in]{figs/toy-example-3bases.zoom1.ps}
[The Belief and Reconstruction Near the Peak]
\includegraphics[height=2in]{figs/toy-example-3bases.zoom2.ps}
[The Belief and Reconstruction In Low-Probability Region]

Figure 11(b) shows the E-PCA reconstruction of the same example belief as in Figure 10. Many of the artifacts present in the PCA reconstruction are absent. Using only 3 bases, the E-PCA reconstruction is already substantially better than PCA with 10 bases, although there are some small errors at the peaks of the two modes (e.g., Figure 11(c)). (With 4 bases, the E-PCA reconstruction is indistinguishable to the naked eye from the original belief.) This level of accuracy with both 3 and 4 bases is typical for this data set.

Robot Beliefs

Although the performance of E-PCA in finding good representations of the abstract problem is compelling, we would ideally like to be able to use this algorithm on real-world problems, such as the robot navigation problem in Figure 2. Figures 12 and 13 show results from two such robot navigation problems, performed using a physically realistic simulation (although with artificially limited sensing and dead-reckoning). We collected a sample set of 500 beliefs by moving the robot around the environment using a heuristic controller, and computed the low-dimensional belief space $\tilde{B}$ according to the algorithm in Table 1. The full state space is $47.7m \times 17m$, discretized at a resolution of $1m \times 1m$ per pixel, for a total of 799 states. Figure 12(a) shows a sample belief, and Figure 12(b) its reconstruction using 5 bases. Figure 12(c) shows the average reconstruction performance of the E-PCA approach, measured as the average KL divergence between the sample beliefs and their reconstructions; for comparison, the performance of both PCA and E-PCA is plotted. The E-PCA error falls to $0.02$ at 5 bases, suggesting that 5 bases are sufficient for good reconstruction. This is a very substantial reduction, allowing us to represent the beliefs in this problem using only 5 parameters rather than 799. Notice that many of the states lie in regions ``outside'' the map; that is, states that can never receive probability mass were not removed. Although removing these states by hand would be trivial, E-PCA correctly discards them automatically.
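As a concrete, hypothetical usage of the sketches above for this environment, with sampled_beliefs standing in for the 500 collected beliefs (a name assumed here for illustration):

\begin{verbatim}
# Hypothetical usage: 500 sampled beliefs over the 799-state grid.
B = np.column_stack(sampled_beliefs)   # shape (799, 500)
U, V = epca_factorize(B, k=5)          # 5 bases, per Figure 12(c)
B_hat = np.exp(U @ V)                  # low-dimensional reconstructions
print(avg_kl(B, B_hat))                # mean KL near 0.02 in the paper's runs
\end{verbatim}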

Figure 12: (a) A sample belief for the robot navigation task. (b) The reconstruction of this belief from the learned E-PCA representation using 5 bases. (c) The average KL divergence between the sample beliefs and their reconstructions, plotted against the number of bases used. Notice that the E-PCA error falls close to 0 at 5 bases, whereas conventional PCA has much worse reconstruction error even at 9 bases and is improving only slowly.
\fbox{\includegraphics[height=.9in]{figs/wean.navigation.start.eps}}
(a) Original Belief
\fbox{\includegraphics[height=.9in]{figs/wean.navigation.reconstruction.ps}}
(b) Reconstruction




\includegraphics[height=2in]{figs/robot_kl.ps}
(c) Reconstruction performance

In Figure 13, similar results are shown for a different environment. A sample set of 500 beliefs was again collected using a heuristic controller, and the low-dimensional belief space $\tilde{B}$ was computed using E-PCA. The full state space is $53.6m \times 37.9m$, at a resolution of $0.5m \times 0.5m$ per pixel. An example belief is shown in Figure 13(a), and its reconstruction using 6 bases is shown in Figure 13(b). The reconstruction performance, measured by the average KL divergence, is shown in Figure 13(c); the error falls very close to 0 around 6 bases, with minimal improvement thereafter.
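In both environments, the number of bases was chosen by inspecting where the KL curve flattens. A thresholded version of that visual rule, reusing the earlier sketches, might look as follows; the tolerance tol is an assumed stand-in for inspection, not a parameter from the original method:

\begin{verbatim}
def choose_num_bases(B, k_max=10, tol=0.05):
    """Smallest k whose average reconstruction KL falls below tol."""
    for k in range(1, k_max + 1):
        U, V = epca_factorize(B, k)
        mean_kl, _ = avg_kl(B, np.exp(U @ V))
        if mean_kl < tol:
            return k
    return k_max
\end{verbatim}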

Figure 13: (a) A sample belief for the navigation problem in Longwood, cf. Figure 2. (b) The reconstruction from the learned E-PCA representation using 6 bases. (c) The average KL divergence between the sample beliefs and their reconstructions against the number of bases used.
\fbox{\includegraphics[height=\picheight]{figs/longwood_example.original.ps}}
[A sample belief]
\fbox{\includegraphics[height=\picheight]{figs/longwood_example.reconstruction.ps}}
[The reconstruction]
\includegraphics[height=\picheight]{figs/longwood_kl.ps}
[Average reconstruction performance]

