EDU.cmu.cs.coral.learning
Class i_PriQLearner_id

java.lang.Object
  |
  +--EDU.gatech.cc.is.learning.i_ReinforcementLearner_id
        |
        +--EDU.cmu.cs.coral.learning.i_PriQLearner_id

public class i_PriQLearner_id
extends i_ReinforcementLearner_id
implements java.lang.Cloneable, java.io.Serializable

An object that learns to select from several actions based on a reward. Uses the Prioritized Sweeping technique of Moore.

The module will learn to select a discrete output based on state and a continuous reinforcement input. The "i"s in front of and behind the name imply that this class takes integers as input and output. The "d" indicates a double for the reinforcement input (i.e. a continuous value).
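As a rough illustration of the prioritized sweeping idea named above, the sketch below keeps a learned model of transitions and a priority queue of states whose value estimate is expected to change the most, and updates those states first. It is a simplified variant for a deterministic model and state values only, not this class's implementation. The class PrioritizedSweepingSketch and all of its members are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Simplified illustration only; not the i_PriQLearner_id implementation.
class PrioritizedSweepingSketch {
    final double gamma;        // discount rate
    final double threshold;    // smallest value change worth propagating
    final int[][] next;        // learned model: next[s][a], -1 if never tried
    final double[][] reward;   // learned model: reward[s][a]
    final double[] value;      // current value estimate per state
    final List<List<Integer>> preds = new ArrayList<List<Integer>>(); // predecessors per state
    // Entries are {priority, state}; the largest priority is polled first.
    final PriorityQueue<double[]> queue =
        new PriorityQueue<double[]>((a, b) -> Double.compare(b[0], a[0]));

    PrioritizedSweepingSketch(int numStates, int numActions, double gamma, double threshold) {
        this.gamma = gamma;
        this.threshold = threshold;
        next = new int[numStates][numActions];
        reward = new double[numStates][numActions];
        value = new double[numStates];
        for (int s = 0; s < numStates; s++) {
            Arrays.fill(next[s], -1);
            preds.add(new ArrayList<Integer>());
        }
    }

    /** Best one-step backup for s over the actions tried so far. */
    double backup(int s) {
        double best = Double.NEGATIVE_INFINITY;
        for (int a = 0; a < next[s].length; a++) {
            if (next[s][a] >= 0) {
                best = Math.max(best, reward[s][a] + gamma * value[next[s][a]]);
            }
        }
        // A state with no tried actions keeps its current value.
        return best == Double.NEGATIVE_INFINITY ? value[s] : best;
    }

    /** Queue s if its value would change by more than the threshold. */
    void push(int s) {
        double change = Math.abs(backup(s) - value[s]);
        if (change > threshold) {
            queue.add(new double[] {change, s});
        }
    }

    /** Record one real transition, then sweep up to maxSweeps queued states. */
    void observe(int s, int a, double r, int s2, int maxSweeps) {
        next[s][a] = s2;
        reward[s][a] = r;
        if (!preds.get(s2).contains(s)) {
            preds.get(s2).add(s);
        }
        push(s);
        for (int i = 0; i < maxSweeps && !queue.isEmpty(); i++) {
            int u = (int) queue.poll()[1];
            double updated = backup(u);
            double change = Math.abs(updated - value[u]);
            value[u] = updated;
            if (change > threshold) {
                // A large change here may change the predecessors too.
                for (int p : preds.get(u)) {
                    push(p);
                }
            }
        }
    }

    public static void main(String[] args) {
        PrioritizedSweepingSketch ps = new PrioritizedSweepingSketch(3, 1, 0.9, 1e-6);
        ps.observe(0, 0, 0.0, 1, 10); // nothing to propagate yet
        ps.observe(1, 0, 1.0, 2, 10); // reward found; the change sweeps back to state 0
        System.out.println(ps.value[0] + " " + ps.value[1]);
    }
}
```

In the example run, the reward seen on the transition out of state 1 is propagated back to state 0 in the same call, because state 0 is queued as a predecessor of state 1.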

See Also: Serialized Form

Inner Class Summary

protected class i_PriQLearner_id.state


Field Summary

protected PriorityQueue changeQueue


protected int criteria


static int DISCOUNTED
          Used to indicate the learner uses discounted rewards.

protected int numactions


protected i_PriQLearner_id.state[] states


Fields inherited from class EDU.gatech.cc.is.learning.i_ReinforcementLearner_id

logging, numactions, numstates, policyfilename

Constructor Summary

i_PriQLearner_id(int numstatesin, int numactionsin)
          Instantiate a Prioritized Sweeping learner using default parameters and discounted rewards.

i_PriQLearner_id(int numstatesin, int numactionsin, int criteriain)
          Instantiate a Prioritized Sweeping learner using default parameters and a seed of 0.

i_PriQLearner_id(int numstatesin, int numactionsin, int criteriain, long seedin)
          Instantiate a Prioritized Sweeping learner using default parameters.

Method Summary

void endTrial(double Vn, double rn)
          Called when the current trial ends.

double getAvgReward()
          Report the average reward per step in the trial.

int getPolicyChanges()
          Report the number of policy changes in the trial.

int getQueries()
          Report the number of queries in the trial.

int initTrial(int s)
          Called to initialize for a new trial.

int query(int yn, double rn)
          Select an output based on the state and reward.

void readPolicy()
          Read the policy from a file.

void savePolicy()
          Write the policy to a file.

void saveProfile(java.lang.String profile_filename)
          Write the policy profile to a file.

void setGamma(double g)
          Set gamma for the Q-learner.

void setRandomRate(double r)
          Set the random rate for the Q-learner.

void setRandomRateDecay(double r)
          Set the random decay for the Q-learner.

java.lang.String toString()
          Generate a String that describes the current state of the learner.

protected void updateState(i_PriQLearner_id.state st)


Methods inherited from class EDU.gatech.cc.is.learning.i_ReinforcementLearner_id

log, loggingOff, loggingOn, loggingOn, setPolicyFileName

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Detail

DISCOUNTED

public static final int DISCOUNTED

Used to indicate the learner uses discounted rewards.

criteria

protected int criteria

states

protected i_PriQLearner_id.state[] states

changeQueue

protected PriorityQueue changeQueue

numactions

protected int numactions

Constructor Detail

i_PriQLearner_id

public i_PriQLearner_id(int numstatesin,
                        int numactionsin,
                        int criteriain,
                        long seedin)

Instantiate a Prioritized Sweeping learner using default parameters. Parameters may be adjusted using accessor methods.

Parameters:
    numstatesin - int, the number of states the system could be in.
    numactionsin - int, the number of actions or outputs to select from.
    criteriain - int, should be DISCOUNTED or AVERAGE.
    seedin - long, the seed.

i_PriQLearner_id

public i_PriQLearner_id(int numstatesin,
                        int numactionsin,
                        int criteriain)

Instantiate a Prioritized Sweeping learner using default parameters. This version uses a seed of 0. Parameters may be adjusted using accessor methods.

Parameters:
    numstatesin - int, the number of states the system could be in.
    numactionsin - int, the number of actions or outputs to select from.
    criteriain - int, should be DISCOUNTED or AVERAGE.

i_PriQLearner_id

public i_PriQLearner_id(int numstatesin,
                        int numactionsin)

Instantiate a Prioritized Sweeping learner using default parameters. This version uses discounted rewards. Parameters may be adjusted using accessor methods.

Parameters:
    numstatesin - int, the number of states the system could be in.
    numactionsin - int, the number of actions or outputs to select from.

Method Detail

setGamma

public void setGamma(double g)

Set gamma for the Q-learner. This is the discount rate; 0.8 is a typical value. It should be between 0 and 1.

Parameters: g - double, the new value for gamma (0 < g < 1).
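To show what the discount rate does, the sketch below computes the discounted sum of a reward sequence, r0 + g*r1 + g^2*r2 + ..., for a given gamma. The class DiscountDemo and the helper discountedReturn are hypothetical and not part of this package; the learner's own update rule is not described on this page.

```java
// Hypothetical illustration; not part of EDU.cmu.cs.coral.learning.
class DiscountDemo {
    /** Sum of rewards[t] * gamma^t, accumulated from the last reward backwards. */
    static double discountedReturn(double[] rewards, double gamma) {
        double total = 0.0;
        for (int t = rewards.length - 1; t >= 0; t--) {
            total = rewards[t] + gamma * total;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] rewards = {1.0, 1.0, 1.0};
        // With gamma = 0.8 this is 1 + 0.8 + 0.64, up to floating-point rounding.
        System.out.println(discountedReturn(rewards, 0.8));
    }
}
```

A gamma close to 1 makes later rewards count almost as much as immediate ones; a gamma close to 0 makes the learner weigh mostly the next reward.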

setRandomRate

public void setRandomRate(double r)

Set the random rate for the Q-learner. This reflects how frequently it picks a random action. It should be between 0 and 1.

Parameters: r - double, the new value for random rate (0 < r < 1).

setRandomRateDecay

public void setRandomRateDecay(double r)

Set the random decay for the Q-learner. This reflects how quickly the rate of choosing random actions decays. A value of 1 means the rate never decays; a value of 0 causes the learner to stop choosing random actions immediately. It should be between 0 and 1.

Parameters: r - double, the new value for random decay (0 < r < 1).
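The sketch below shows one way the random rate and its decay could interact. It assumes the rate is multiplied by the decay once per step, which matches the endpoints described above (1 never decays, 0 stops random choices at once) but is not confirmed by this page. RandomRateDemo, rateAfter and choose are hypothetical names.

```java
import java.util.Random;

// Hypothetical illustration; the real learner's internals are not documented here.
class RandomRateDemo {
    /** Random rate after some steps, assuming it is multiplied by decay each step. */
    static double rateAfter(double rate, double decay, int steps) {
        for (int i = 0; i < steps; i++) {
            rate *= decay;
        }
        return rate;
    }

    /** Pick a random action with probability rate, otherwise the greedy one. */
    static int choose(Random rng, double rate, int greedyAction, int numActions) {
        if (rng.nextDouble() < rate) {
            return rng.nextInt(numActions);
        }
        return greedyAction;
    }

    public static void main(String[] args) {
        Random rng = new Random(0);
        double rate = 0.1;
        for (int step = 0; step < 5; step++) {
            int action = choose(rng, rate, 2, 4);
            System.out.println("step " + step + " rate " + rate + " action " + action);
            rate *= 0.99; // assumed multiplicative decay
        }
    }
}
```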

toString

public java.lang.String toString()

Generate a String that describes the current state of the learner.

Overrides: toString in class i_ReinforcementLearner_id

Returns: a String describing the learner.

updateState

protected void updateState(i_PriQLearner_id.state st)

query

public int query(int yn,
                 double rn)

Select an output based on the state and reward.

Overrides: query in class i_ReinforcementLearner_id

Parameters:
    yn - int, the current state.
    rn - double, the reward for the last output; positive numbers are "good."

endTrial

public void endTrial(double Vn,
                     double rn)

Called when the current trial ends.

Overrides: endTrial in class i_ReinforcementLearner_id

Parameters:
    Vn - double, the value of the absorbing state.
    rn - double, the reward for the last output.

initTrial

public int initTrial(int s)

Called to initialize for a new trial.

Overrides: initTrial in class i_ReinforcementLearner_id

Tags copied from class: i_ReinforcementLearner_id

Parameters: s - int, the current state.
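The sketch below shows the call order that initTrial, query and endTrial suggest for a single trial. So that it compiles without the EDU.* packages, it uses a hypothetical stand-in interface, TrialLearner, that copies the three signatures from this page; with the real package available, an i_PriQLearner_id would be used in its place. It assumes that the int returned by initTrial is the first action, which this page does not state. The corridor environment and the names TrialLoopDemo, runTrial and GOAL are made up for the example.

```java
// Stand-in with the three trial methods documented on this page.
// Hypothetical; not the real class.
interface TrialLearner {
    int initTrial(int s);
    int query(int yn, double rn);
    void endTrial(double Vn, double rn);
}

class TrialLoopDemo {
    static final int GOAL = 4; // hypothetical corridor: states 0..4, goal at 4

    /** Runs one trial and returns the number of query() calls made. */
    static int runTrial(TrialLearner learner, int startState, int maxSteps) {
        int state = startState;
        int action = learner.initTrial(state); // assumed to return the first action
        int queries = 0;
        for (int step = 0; step < maxSteps; step++) {
            // Action 1 moves right; anything else moves left (clamped at 0).
            state = (action == 1) ? state + 1 : Math.max(0, state - 1);
            if (state == GOAL) {
                learner.endTrial(0.0, 1.0); // absorbing-state value 0, final reward 1
                return queries;
            }
            action = learner.query(state, 0.0); // no reward before the goal
            queries++;
        }
        learner.endTrial(0.0, 0.0); // ran out of steps without reaching the goal
        return queries;
    }

    public static void main(String[] args) {
        // Trivial stand-in policy that always moves right.
        TrialLearner alwaysRight = new TrialLearner() {
            public int initTrial(int s) { return 1; }
            public int query(int yn, double rn) { return 1; }
            public void endTrial(double Vn, double rn) { }
        };
        System.out.println("queries: " + runTrial(alwaysRight, 0, 100));
    }
}
```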

getAvgReward

public double getAvgReward()

Report the average reward per step in the trial.

Overrides: getAvgReward in class i_ReinforcementLearner_id

Returns: the average.

getQueries

public int getQueries()

Report the number of queries in the trial.

Overrides: getQueries in class i_ReinforcementLearner_id

Returns: the total.

getPolicyChanges

public int getPolicyChanges()

Report the number of policy changes in the trial.

Overrides: getPolicyChanges in class i_ReinforcementLearner_id

Returns: the total.

readPolicy

public void readPolicy()
                throws java.io.IOException

Read the policy from a file.

Overrides: readPolicy in class i_ReinforcementLearner_id

This method takes no arguments; the file name is presumably the one set with the inherited setPolicyFileName.

savePolicy

public void savePolicy()
                throws java.io.IOException

Write the policy to a file.

Overrides: savePolicy in class i_ReinforcementLearner_id

This method takes no arguments; the file name is presumably the one set with the inherited setPolicyFileName.

saveProfile

public void saveProfile(java.lang.String profile_filename)
                 throws java.io.IOException

Write the policy profile to a file.

Parameters: profile_filename - String, the name of the file to write to.
