Dynamic Irrelevance

Earlier we claimed that one advantage of PLTLSTR and FLTL over PLTLMIN and PLTLSIM is that the former perform a dynamic analysis of rewards capable of detecting the irrelevance of variables to particular policies, e.g. to the optimal policy. Our experiments confirm this claim. However, as with reachability, whether the goal or the triggering condition in a response formula becomes irrelevant plays an important role in determining whether a PLTLSTR or an FLTL approach should be taken: PLTLSTR is able to dynamically ignore the goal, while FLTL is able to dynamically ignore the trigger. This is illustrated in Figures 17 and 18. In both figures, the domain considered is ON/OFF with $n = 6$ propositions, and the response formula is $g \wedge \circleddash^{n} c$ as before, here with both $g$ and $c$ achievable. This response formula is assigned a fixed reward. To study the effect of dynamic irrelevance of the goal, in Figure 17 achievement of $\neg g$ is rewarded by the value $r$ (i.e. we have $(\neg g : r)$ in PLTL). In Figure 18, on the other hand, we study the effect of dynamic irrelevance of the trigger: achievement of $\neg c$ is rewarded by the value $r$. Both figures show the runtime of the methods as $r$ increases. Achieving the goal, resp. the trigger, is made less attractive as $r$ increases, up to the point where the response formula becomes irrelevant under the optimal policy. When this happens, the runtime of PLTLSTR, resp. FLTL, exhibits an abrupt but lasting improvement. The figures show that FLTL is able to pick up irrelevance of the trigger, while PLTLSTR is able to exploit irrelevance of the goal. As expected, PLTLMIN, whose analysis is static, picks up neither and performs consistently badly. Note that in both figures, PLTLSTR takes progressively longer as $r$ increases, because value iteration requires additional iterations to converge.
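The tipping point described above can be illustrated with a deliberately simplified sketch (not the paper's actual benchmark or algorithms): an agent repeatedly chooses between pursuing the goal $g$, collecting the fixed response reward $R$ each time $g$ holds with the trigger satisfied $n$ steps earlier, and maintaining $\neg g$, collecting $r$ every step. All names and values (`R`, `r`, `n`, `gamma`) are illustrative assumptions; comparing the two discounted returns shows the threshold on $r$ beyond which the response formula becomes irrelevant to the optimal policy.

```python
# Toy model of dynamic irrelevance (illustrative only, not the paper's
# experimental setup). The agent commits to one of two stationary behaviours:
#   - "pursue g": a trigger-then-goal cycle of n+1 steps that pays the fixed
#     response reward R once per cycle (when g holds, c having held n steps ago);
#   - "avoid g":  maintaining ~g, which pays r on every step.
# Discounted returns of the two behaviours determine the optimal choice.

def best_policy(R, r, n, gamma=0.95):
    # Pursuing g: R is earned at the end of each (n+1)-step cycle,
    # i.e. at steps n, 2n+1, 3n+2, ...  (geometric series in gamma^(n+1)).
    v_pursue = R * gamma**n / (1 - gamma**(n + 1))
    # Avoiding g: r is earned on every step.
    v_avoid = r / (1 - gamma)
    return "pursue g" if v_pursue > v_avoid else "avoid g"
```

For small $r$ the optimal policy pursues $g$ and the response formula matters; once $r$ crosses the threshold, the optimal policy abandons $g$ entirely, which is exactly the regime in which a method that detects this irrelevance can prune the formula.

```python
print(best_policy(10, 0.01, 6))  # small r: pursuing g is worthwhile
print(best_policy(10, 5, 6))     # large r: maintaining ~g dominates
```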
Figure 17: Response Formula with Unrewarding Goal
Figure 18: Response Formula with Unrewarding Trigger
Sylvie Thiebaux 2006-01-20