Proof of theorem

Theorem 3   PDDL+$ _{PTA}$ has indirectly equivalent expressive power to Priced Timed Automata.

We begin by showing that an arbitrary PTA can be expressed as a PDDL+$ _{PTA}$ domain without any blow-up in the size of the encoding. We then show that the converse is also the case.

Given a PTA $ \langle L,l_{0},E,I,P \rangle$ we construct a PDDL+$ _{PTA}$ model as follows. For each edge $ e = (l_{i},g,a,r,l_{j})$, where $ g$ is the clock constraint, $ a$ is the action and $ r$ is the subset of clock variables reset by this edge, we construct the following instantaneous action schema, called a transition action, which models the instantaneous transition from $ l_i$ into $ l_j$.

(:action transition$ _e$

:parameters ()
:precondition (and (in l$ _{i}$) g)
:effect (and $ \{\forall x \in r  \cdot$ (assign ($ x$) 0)$ \}$
(increase (c) P(e))
(not (in l$ _i$))
(in l$ _j$)))

We also construct, for each location $ l_i$, the following process schema:

(:process process-location$ _{l_i}$

:parameters ()
:precondition (in l$ _i$)
:effect (and (increase (c) (* #t P(l$ _{i}$))
$ \{\forall x \in C  \cdot$ (increase ($ x$) (* #t 1))$ \}$

Finally, we construct an event for each location:

(:event event-location$ _{l_i}$

:parameters ()
:precondition (and (in l$ _i$) (not (I(l$ _i$))))
:effect (not (in l$ _i$)))

The initial state specifies that each clock variable and the cost variable starts with value 0. It asserts (in $ l_0$). The special error state is the empty state (if an event is triggered then it will remove the current location proposition, leaving it impossible to progress).

Note that this domain is valid PDDL+$ _{PTA}$. No two actions can be applied in parallel because the system can only ever satisfy one condition of the form (in $ l_i$).

For this construction to correctly capture the PTA, it remains to be shown that any PTA trajectory corresponds to a PDDL+$ _{PTA}$ plan for this domain and that any plan corresponds to a trajectory.

Consider a (valid) PTA trajectory,

$\displaystyle (l_0,u_0) \stackrel{\delta_1}{\rightarrow} (l_1,u_1) \stackrel{\delta_2}{\rightarrow} \ldots (l_n,u_n)$

where each $ u_i$ is a clock valuation and each $ \delta_i$ is a transition which may either be an edge or a positive time delay. In the case of time delay transitions, the location remains the same but the clock valuation is updated by the delay duration, while for edge transitions the location is updated but the clock valuation remains the same. This trajectory is mapped to a plan by creating an action instance for each edge transition (using the corresponding action in the PDDL+$ _{PTA}$ description). The action is set to be applied at the time corresponding to the sum of the time delay transitions that precede the transition in the trajectory. This is a valid plan since the processes are active at exactly the same times in the PDDL model as the corresponding time transitions in the trajectory.

Similarly, given any valid PDDL plan for the above domain, the plan defines a trajectory by mapping the instantaneous actions into their corresponding edge transitions and the gaps between them into time delay transitions. The trajectory will be a valid one because of the construction of the actions and process models. Note that no valid plan can trigger an event, since this will leave the system in a state that cannot be progressed, preventing satisfaction of any goal. This means that invariant conditions for each location are always maintained.

We now consider the opposite direction of the proof: PDDL+$ _{PTA}$ planning instances yield state space and transition models that can be expressed as PTAs with a constant factor transformation. We observe that the constraints on the PDDL+$ _{PTA}$ language ensures that clock variables and a cost variable are distinguished and behave exactly as required for a PTA. In the ground state space, all states correspond directly to locations of a PTA and the legal actions that transition between states correspond to the edges of the corresponding PTA. With the exception of event transitions leading into an exceptional error state, each edge in a PDDL+$ _{PTA}$ transition system will, by the constraints defining the language, always correspond to an instantaneous action and this action determines the clock constraints and reset effects of the corresponding PTA edge. Exactly one process is active in each state, and causes the variables to behave as clocks (except for the cost variable). The events govern the invariants for the states, causing any violation of an invariant condition to trigger a transition into an error state.

The correspondence between trajectories for the PTA defined in this way and plans for the PDDL+$ _{PTA}$ transition system is immediate, subject to the same observations made above.

$ \Box$
Derek Long 2006-10-09