
Orbital library  
public interface MarkovDecisionProblem

Type Parameters:
A - the type of transition actions.
S - the type of transition states.
M - the class representing transitions.
Hook interface for the MarkovDecisionProcess algorithm. Objects implementing this interface represent a Markov decision process model.
A Markov decision problem (MDP) for a Markov decision process is a mathematical model for a class of sequential decision problems. As long as the state space is discrete, it is a combinatorial optimization problem. An MDP is characterized by
transition model
P : S×A(s)→[0,1] with a transition relation, where the transition relation satisfies the Markov property for states:
P(S_t+1=s′ | S_t=s_t, A_t=a_t, …, S_0=s_0, A_0=a_0) = P(S_t+1=s′ | S_t=s_t, A_t=a_t)
i.e. the distribution of the successor state depends only on the current state and action, not on the earlier history.
action costs
c(s,a)>0 for taking the action a∈A(s) in the state s∈S.
If the immediate costs are bounded random numbers depending on state and action, then
c(s,a) denotes the expected immediate cost, instead.
Also note that the requirement c(s,a)>0 is not strict. Instead there are several
criteria that ensure convergence even for c(s,a)∈[0,∞).
However, they have even more restrictive constraints, like finite state sets (see Barto et al, 1995).
These state- and action-dependent costs ensure that the utility function is separable.
goal states
G⊆S. More often, the goal states are given implicitly by an accessible description of the goal states.

A Partially Observable Markov decision problem (POMDP) is a kind of Markov decision problem with incomplete information; POMDPs are an extension of MDPs. POMDPs do not rely on an accessible environment, but work on an inaccessible environment where the percepts may not suffice to determine the state, so we have incomplete information about the state. Utilitarian ideas have been around in artificial intelligence for a long time. For the idea of a nonobserving agent see (Lem 1971).
For POMDPs, the Markov property does not hold for percepts, but only for states. In the absence of feedback, a POMDP is a deterministic search in belief state space. In the presence of feedback, a POMDP is an MDP over belief space (Astrom, 1965); a single belief state is then a probability distribution over physical states. In fact, solving a POMDP can be quite complex.
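The MDP ingredients listed above (a transition model with the Markov property, positive action costs, and goal states) can be sketched in plain Java. This is an illustrative toy model, not part of the Orbital API; all class and method names here are assumptions chosen for the example.

```java
// Minimal sketch of an MDP on the states 0..3 of a line, with a single
// action "move". Illustrative only; names are not Orbital classes.
public class SimpleMdp {
    // Transition model P(next | state, action): "move" reaches state+1 with
    // probability 0.8 and stays put with probability 0.2. The distribution
    // depends only on the current state and action (Markov property).
    static double transition(int next, int state, String action) {
        if (!"move".equals(action)) return 0.0;
        if (next == state + 1) return 0.8;
        if (next == state) return 0.2;
        return 0.0;
    }

    // Immediate action cost c(s,a) > 0 for taking action a in state s.
    static double cost(int state, String action) {
        return 1.0;
    }

    // Goal states G = {3}.
    static boolean isGoal(int state) {
        return state == 3;
    }

    public static void main(String[] args) {
        // The transition relation is a probability distribution over successors:
        double total = transition(1, 0, "move") + transition(0, 0, "move");
        System.out.println("P sums to " + total);  // prints: P sums to 1.0
    }
}
```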
See Also:
MarkovDecisionProcess

References:
Hector Geffner. Modelling and Problem Solving.
D. P. Bertsekas. Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall, Englewood Cliffs, NJ, 1989.
S. Ross. Introduction to Stochastic Dynamic Programming. Academic Press, New York, 1983.
A. Barto, S. Bradtke, and S. Singh. Learning to act using real-time dynamic programming. Artificial Intelligence, 72:81-138, 1995.
Stanisław Lem. Doktor Diagoras. In: Sterntagebücher, suhrkamp p491, 1978. (original edition 1971)

Nested Class Summary

static class 
MarkovDecisionProblem.DefaultTransition
Default implementation of transition options for Markov decision processes. 
static interface 
MarkovDecisionProblem.Transition
Represents a transition option during a Markov decision process. 
Method Summary  

boolean 
isSolution(java.lang.Object state)
Check whether the given state is a goal state (a valid solution to the problem). 
Methods inherited from interface orbital.algorithm.template.TransitionModel 

actions, states, transition 
Method Detail 

boolean isSolution(java.lang.Object state)
Optional variation: Search algorithms generally seek out a single solution. In order to find all solutions to a problem, simply let this method store the solutions found and return false, until enough solutions have been found. However, expanding solution nodes must eventually result in an empty list in order to ensure termination.
Parameters:
state - the state s∈S to check for being a goal state.
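The "find all solutions" variation described above can be sketched as follows. This is a hypothetical standalone class, not Orbital code; the goal test and the stopping criterion are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the optional variation of isSolution: record each goal state
// and keep returning false, so the search continues until enough
// solutions have been collected. Names are illustrative.
public class AllSolutionsCollector {
    private final List<Object> solutions = new ArrayList<>();
    private final int wanted;

    public AllSolutionsCollector(int wanted) {
        this.wanted = wanted;
    }

    /** The true goal test of the underlying problem (here: the string "goal"). */
    private boolean isGoalState(Object state) {
        return "goal".equals(state);
    }

    /** Variation of isSolution: store goal states, claim failure until enough found. */
    public boolean isSolution(Object state) {
        if (isGoalState(state)) {
            solutions.add(state);
            return solutions.size() >= wanted;  // stop only once enough solutions occurred
        }
        return false;
    }

    public List<Object> getSolutions() {
        return new ArrayList<>(solutions);
    }
}
```

Note that with this variation the search algorithm still terminates only if expanding the stored solution nodes eventually yields no further successors, as the method detail above points out.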

Orbital library 1.3.0: 11 Apr 2009 
