CMUnited99 (code for off-line trainer)
Patrick Riley <pfr+@cs.cmu.edu>
Computer Science Department
Carnegie Mellon University
Copyright (C) 1999 Patrick Riley

CMUnited-99 was created by Peter Stone, Patrick Riley, and Manuela Veloso

You may copy and distribute this program freely as long as you retain
this notice.  If you make any changes or have any comments we would
appreciate a message.
**********************************************************************************

(see README for an overview of the system)

**********************************************************************************
CONTENTS
**********************************************************************************
*Details Of The Object Interactions
*Summary Of The Training Module's Functions
*Program Options
*The Provided Training Modules
*Interaction With Clients
*To Add A Training Module


**********************************************************************************
DETAILS OF THE OBJECT INTERACTIONS
**********************************************************************************

In order to write your own training modules, you should understand how
the TrainingInfo class, the epoch controller, and the training module
interact. 

The TrainingInfo class serves as an interface to the main program and
handles all of the allocation.  The first method of the training
module to be called is the Initialize method. Next, the
InitializeEpochController method is called; in it, the training module
should specify all of the parameter sets that should be iterated
through. If the training module is not designed to be used with
epochs, InitializeEpochController should do nothing.

After receiving a new sensation, the main program calls the
TrainingInfo method NotifyTMOfSight, and the TrainingInfo passes that
onto the epoch controller, along with a pointer to the allocated
training module.

The epoch controller then calls the training module handler function
(NewSightHandler). If epochs are turned off, the epoch controller
stops there. Otherwise it looks at how long this epoch has been
running. If the appropriate amount of time has elapsed, the epoch
controller then calls the training module's LogPerformance method.
That method should write to a file whatever performance features
have been recorded. The training module can call back to the epoch
controller to get the current option names or values through the
WriteOptionNames and WriteCurrentOptionValues methods. Lastly, the
training module's ResetForEpoch method is called.
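
The per-sensation flow described above can be sketched roughly as
follows. This is only an illustration: SketchModule and
SketchEpochController are hypothetical stand-ins, not the real classes,
and the sketch assumes one sight per cycle and drops the
EpochController* arguments that the real methods take.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a training module; it only records which
// methods were called so that the call order can be inspected.
struct SketchModule {
  std::vector<std::string> calls;
  void NewSightHandler() { calls.push_back("NewSightHandler"); }
  void LogPerformance()  { calls.push_back("LogPerformance"); }
  void ResetForEpoch()   { calls.push_back("ResetForEpoch"); }
};

// Hypothetical stand-in for the epoch controller.
class SketchEpochController {
 public:
  SketchEpochController(bool use_epochs, int epoch_length)
      : use_epochs_(use_epochs), epoch_length_(epoch_length), cycles_(0) {
    assert(epoch_length > 0);
  }

  // Called once per new sensation (by TrainingInfo in the real trainer).
  void NotifyOfSight(SketchModule* tm) {
    tm->NewSightHandler();                  // always runs
    if (!use_epochs_) return;               // epochs off: stop here
    if (++cycles_ < epoch_length_) return;  // epoch still running
    tm->LogPerformance();                   // epoch over: write results
    tm->ResetForEpoch();                    // then set up the next epoch
    cycles_ = 0;
  }

 private:
  bool use_epochs_;
  int epoch_length_;
  int cycles_;  // sights seen in the current epoch (assumed one per cycle)
};
```

With an epoch length of 2, the second call to NotifyOfSight triggers
LogPerformance followed by ResetForEpoch; with epochs off, only
NewSightHandler is ever called.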


**********************************************************************************
SUMMARY OF THE TRAINING MODULE'S FUNCTIONS
**********************************************************************************

Bool Initialize()
	Called immediately after allocation. This function can be used
	instead of a constructor.
void InitializeEpochController(EpochController*)
	Responsible for specifying what parameter sets should be
	iterated through.
void NewSightHandler()
	Called after every new sight info. In general, this function
	will record some performance features (like speed of dribbling
	or accuracy of kick), as well as resetting the world to run
	the trial again.
void LogHeader(EpochController*)
	If epochs are being used, this function is called before epoch
	0 is recorded, allowing you to put a header in the file,
	possibly reminding you what the columns of the logfile mean.
	The WriteOptionNames method of the epoch controller should
	usually be called to indicate what parameters are being varied.
void LogPerformance(EpochController*)
	Called at the end of the epoch to write the performance
	characteristics to a file. The WriteCurrentOptionValues method
	of the epoch controller should typically be called from here
	in order to record the current parameter values being used.
void ResetForEpoch()
	Called at the beginning of every epoch. Typically this
	function is used to put the world into a known state as well
	as resetting whatever performance metrics are being used.
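
As a rough illustration of how the six functions fit together, here is
a minimal sketch of a module that just counts sights per epoch. The
base class TrainingModuleSketch, the module TMcount, the Bool typedef,
and logging to a std::ostream are all assumptions made for this
example; the real declarations live in the trainer source.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

typedef int Bool;       // assumption: the real code defines its own Bool
class EpochController;  // only used through a pointer in this sketch

// Hypothetical base class mirroring the six functions listed above.
class TrainingModuleSketch {
 public:
  virtual ~TrainingModuleSketch() {}
  virtual Bool Initialize() = 0;
  virtual void InitializeEpochController(EpochController*) = 0;
  virtual void NewSightHandler() = 0;
  virtual void LogHeader(EpochController*) = 0;
  virtual void LogPerformance(EpochController*) = 0;
  virtual void ResetForEpoch() = 0;
};

// Hypothetical module whose only "performance feature" is the number
// of sights received during the epoch.
class TMcount : public TrainingModuleSketch {
 public:
  explicit TMcount(std::ostream* log) : log_(log), sights_(0) {
    assert(log != nullptr);
  }
  Bool Initialize() override { sights_ = 0; return 1; }
  void InitializeEpochController(EpochController*) override {}  // no parameter sets
  void NewSightHandler() override { ++sights_; }
  void LogHeader(EpochController*) override { *log_ << "# sights\n"; }
  void LogPerformance(EpochController*) override { *log_ << sights_ << "\n"; }
  void ResetForEpoch() override { sights_ = 0; }

 private:
  std::ostream* log_;
  int sights_;
};
```

A real module would also call WriteOptionNames from LogHeader and
WriteCurrentOptionValues from LogPerformance, as described above; their
signatures are not shown in this file, so they are left out here.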


**********************************************************************************
PROGRAM OPTIONS
**********************************************************************************

All of the provided training modules have some specific options, which
are not documented here. The options here are the ones which affect all
of the training modules. The trainer also recognizes all of the
options that the server takes.  All options can be given on the
command line as "-<option> <val>" or in an option file (with "-file
<optfile>") as "<option>: <val>".

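To make the two option syntaxes concrete, here is a small sketch that
turns either form into a (name, value) pair. It is not the trainer's
actual option parser; the function names and the whitespace handling
are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

typedef std::pair<std::string, std::string> Option;  // (name, value)

// Command-line form: the flag is "-<option>" and the value is the next
// argument. Returns false if the flag does not look like "-<option>".
bool ParseCommandLineOption(const std::string& flag, const std::string& val,
                            Option* out) {
  assert(out != nullptr);
  if (flag.size() < 2 || flag[0] != '-') return false;
  *out = Option(flag.substr(1), val);
  return true;
}

// Option-file form: "<option>: <val>". Returns false if there is no
// colon, no option name, or no value.
bool ParseOptionFileLine(const std::string& line, Option* out) {
  assert(out != nullptr);
  const char* ws = " \t\r\n";
  std::string::size_type colon = line.find(':');
  if (colon == std::string::npos || colon == 0) return false;
  std::string::size_type start = line.find_first_not_of(ws, colon + 1);
  if (start == std::string::npos) return false;  // nothing after the colon
  std::string::size_type end = line.find_last_not_of(ws);
  *out = Option(line.substr(0, colon), line.substr(start, end - start + 1));
  return true;
}
```

Under these assumptions, the arguments "-use_epochs on" and the option
file line "use_epochs: on" both yield the pair (use_epochs, on).
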
cycles_to_store: (integer) as discussed in shared/README, the memory
                 model can store information about several past
                 cycles. This specifies how many cycles to store
                 total. Generally you should store at least 2 (the
                 current cycle and the last cycle).

alarm_interval
max_cycles_since_io: (integer) These options affect how the trainer
                     identifies when the server is dead. The trainer
                     has a constant alarm signal running (whose
                     interval is determined by alarm_interval). If
                     enough alarms come between ios from the server
                     that max_cycles_since_io cycles must have
                     occurred (as judged from the simulation_step
                     parameter), the trainer considers the server
                     dead.

save_log: (on/off) If this is on, then every message that the trainer
          sends and every message it receives is stored in the file
          "save.log"

training_module: (string) determines which training module to use. It
                 is matched against the "Name" element of each of the
                 training modules.

training_log_fn: (string) The filename in which the training module
                 should store data.

use_epochs: (on/off) indicates whether or not to use epochs

epoch_length: (integer) the length (in cycles) of an epoch

epoch_start
epochs_to_run: (integer) These parameters are useful in order to run a
               large number of epochs, where you want to restart the
               server so that the current time does not get too
               large. epoch_start indicates what epoch to start at (0
               based), and epochs_to_run indicates how many epochs to
               run before shutting down. An example of a script to do
               this sort of thing is in extrain.sh.


**********************************************************************************
THE PROVIDED TRAINING MODULES
**********************************************************************************

This first group of training modules are the most clearly
written. They are designed for use with and without epochs. It is
recommended you look at these to understand how training modules
generally work.
-TMdribble
	Used to evaluate the speed and reliability of dribbling with
	various parameters.
-TMhardkick
	Varies the angle the player tries to get relative to the
	kick and the buffer around the player.
-TMturnball
	Used to evaluate the speed and reliability with which the ball
	can be turned around the player.
-TMturnball2
	Used to evaluate the speed, reliability, and accuracy that the
	player can get the ball to a particular angle around itself
	and stop it.

This group of training modules are not designed to be used with
epochs. They may, however, record useful data, in which case you could
probably change them to use epochs effectively.
-TMbreakaway
	Sets up a goalie and a striker with the striker at varying
	distances. Optionally, a defender (a client on the same team
	as the goal) will be placed slightly behind the striker.
-TMgame
	This is a very simple training module that resets the ball to
	the center of the field if it is stuck for a long period of time.
-TMintercept
	For use with 2 players. They are started randomly around the
	origin and then the ball is kicked near them. Records which
	player gets the ball. Useful for evaluating different
	interception strategies.
-TMintercept2
	For use with one player. Puts the player in the middle and
	kicks the ball generally in its direction.
-TMkeepaway
	In a given rectangle around the middle, each team should try
	to simply keep the ball away from the other. If the right team
	controls the ball too long, it is given back to the left team.
-TMsetplay
	When a drop ball is given near certain spots of the field
	(like near the corner), this training module turns them into
	set play situations like corner kicks and throw ins.
-TMshot
	A single goalie and striker are set up. The striker should
	give the goalie a few cycles to find the ball after it is
	moved. Records the number of goals/misses/saves.

This group of training modules are strange for one reason or another,
and may be more difficult to understand.
-TMgoalie
	Puts the ball in a specified rectangle on the field, gives the
	goalie time to adjust and then simulates a kick of the ball.
	Records success rates of various shot spots.
-TMhardkick2
	Puts the ball at various places around a player at the middle
	to see how placement of the ball around the player affects
	kick velocity and time.
-TMtest
	Random stuff


**********************************************************************************
INTERACTION WITH CLIENTS
**********************************************************************************

Any message from the trainer sent with the Say command in client.h
will automatically have "training" put in front of it. Therefore if
you call Say("A B C") what will be sent is "(say training A B
C)".

If you use the epoch controller to tell the clients what parameter
sets to use (as is recommended), the epoch controller will send a
string that resembles command line options. For example, if the option
name is "foo" and you set it to value 3.14, then the trainer will
Say("-foo 3.14"), which means the client will hear something like
"training -foo 3.14".
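
The string handling described above can be sketched as follows. The
three helpers are hypothetical and are not functions from client.h. The
client-side parser assumes the heard text has already been reduced to
the "training -foo 3.14" form; what the server actually delivers has
more framing around it.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// What is sent when the trainer says `msg`, per the description above.
std::string FormatSay(const std::string& msg) {
  return "(say training " + msg + ")";
}

// Option string in the command-line-like form used for parameter sets.
std::string FormatOption(const std::string& name, double value) {
  std::ostringstream os;
  os << "-" << name << " " << value;
  return os.str();
}

// Client side: parse "training -<name> <value>". Returns false if the
// text does not have that shape.
bool ParseHeardOption(const std::string& heard, std::string* name,
                      double* value) {
  assert(name != nullptr && value != nullptr);
  std::istringstream is(heard);
  std::string tag, flag;
  if (!(is >> tag >> flag >> *value)) return false;
  if (tag != "training" || flag.size() < 2 || flag[0] != '-') return false;
  *name = flag.substr(1);
  return true;
}
```

Here FormatSay(FormatOption("foo", 3.14)) gives
"(say training -foo 3.14)", and ParseHeardOption recovers the name
"foo" and the value from "training -foo 3.14".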


**********************************************************************************
TO ADD A TRAINING MODULE
**********************************************************************************

The following steps are needed to add a training module. Let's say your
TM is named foo.

1. Create (probably by copying) a TMfoo.h and TMfoo.C and program
   accordingly.
2. Add TMfoo.o to the OFILES macro in Makefile
3. Add a #include "TMfoo.h" to MemTrain.C
4. Add an if case into TrainingInfo::Initialize in MemTrain.C
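
Step 4 amounts to matching the training_module option against each
module's name. The sketch below shows the general shape only: the base
class, the Name() method, the name strings "dribble" and "foo", and the
function AllocateTrainingModule are all assumptions, and the real code
in TrainingInfo::Initialize may look quite different.

```cpp
#include <cassert>
#include <cstring>
#include <memory>

// Hypothetical base class; the real one is declared in the trainer source.
struct TrainingModuleBase {
  virtual ~TrainingModuleBase() {}
  virtual const char* Name() const = 0;
};

// Stand-in for an existing module (name string assumed).
struct TMdribble : TrainingModuleBase {
  const char* Name() const override { return "dribble"; }
};

// The newly added module (name string assumed).
struct TMfoo : TrainingModuleBase {
  const char* Name() const override { return "foo"; }
};

// Returns the module whose name matches `requested`, or null if none does.
std::unique_ptr<TrainingModuleBase> AllocateTrainingModule(
    const char* requested) {
  assert(requested != nullptr);
  if (std::strcmp(requested, "dribble") == 0)
    return std::unique_ptr<TrainingModuleBase>(new TMdribble);
  if (std::strcmp(requested, "foo") == 0)  // the new if case from step 4
    return std::unique_ptr<TrainingModuleBase>(new TMfoo);
  return nullptr;
}
```

With this in place, running with "-training_module foo" would select
the new module, assuming its name string is "foo".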
