17 November 1993, 3:30pm, WeH 4601
Talk to joint meeting of the Reinforcement Learning group and Manipulation group

A Vision-Based System for the Learning of Pushing Manipulation

Marcos Salganicoff*
The General Robotics and Active Sensory Perception (GRASP) Lab
Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104
sal@grip.cis.upenn.edu

*Joint work with G. Metta, A. Oddera and G. Sandini, LIRA Lab, University of Genova, Italy.

Abstract

We describe an approach for combining image-based task constraints with memory-based learning for the control of robotic manipulation, and discuss related issues in using memory-based learners for time-varying mappings. Image-based constraints express task constraints in terms of equivalent perceptual constraints. We demonstrate their effectiveness and simplicity by describing two reference real-time robotic tasks and their corresponding implementations: the insertion of a pen into a ``cap'' (the capping experiment) and the rotational non-sliding point-contact pushing of an object of unknown shape, mass and friction to a specified goal point in the image-space. An unsupervised memory-based learning system is described that allows a robot to rapidly learn to point-contact push an unknown object towards an image-space goal without knowledge of the object's frictional and mass distributions. By having the robot observe the results of its actions on the object's orientation directly in image-space, the system learns a forward model. This acquired model is inverted on-line for manipulation planning and control. Rather than explicitly inverting the forward model to achieve trajectory control, a stochastic action selection technique [Moore, 1990] is used to select the most informative and promising actions, thus allowing the integration of model exploitation and exploration. We conclude with a discussion of three explicit forgetting algorithms for memory-based learners.
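The learn-a-forward-model-then-select-actions loop described above can be sketched in a few lines. This is a hypothetical, heavily simplified illustration (scalar actions and outcomes, a k-nearest-neighbor forward model, uniform candidate sampling), not the talk's actual implementation; all names and parameters here are invented for the sketch.

```python
import random

class MemoryForwardModel:
    """Memory-based forward model: stores (action, observed outcome)
    exemplars and predicts a candidate action's outcome by averaging
    its k nearest stored actions. A stand-in for the image-space
    forward model described in the abstract."""

    def __init__(self, k=3):
        self.k = k
        self.exemplars = []  # list of (action, outcome) pairs

    def observe(self, action, outcome):
        """Record the observed result of an executed action."""
        self.exemplars.append((action, outcome))

    def predict(self, action):
        """Average the outcomes of the k nearest stored actions."""
        if not self.exemplars:
            return 0.0
        nearest = sorted(self.exemplars,
                         key=lambda e: abs(e[0] - action))[:self.k]
        return sum(outcome for _, outcome in nearest) / len(nearest)

def select_action(model, goal, n_candidates=50, rng=None):
    """Stochastic action selection in the spirit of [Moore, 1990]:
    sample candidate actions at random and pick the one whose
    predicted outcome lies closest to the goal, rather than
    explicitly inverting the forward model."""
    rng = rng or random.Random(0)
    candidates = [rng.uniform(-1.0, 1.0) for _ in range(n_candidates)]
    return min(candidates, key=lambda a: abs(model.predict(a) - goal))
```

In use, the robot would alternate `select_action` and `observe`: each executed push adds an exemplar, so the forward model improves exactly in the regions the task visits.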
A forgetting algorithm allows memory-based learners to track time-varying concepts by deleting obsolete exemplars from learning sets. Time-weighted forgetting (TWF) is a well-known algorithm which deletes exemplars based on their time of arrival. Two alternatives to TWF are introduced: locally-weighted forgetting (LWF) uses the proximity of subsequent observations to a previous observation to control the previous observation's decay rate, while performance-error weighted forgetting (PEWF) decays an observation based on its recent predictive accuracy. We compare these algorithms and argue that they successfully overcome some of the previous limitations of memory-based learners in time-varying environments.
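The three forgetting schemes differ only in what triggers an exemplar's decay. The sketch below is a minimal illustration under assumed simplifications (scalar inputs, a per-exemplar strength that is decayed multiplicatively and pruned below a floor, and a 1-nearest-neighbor predictor for PEWF); the function names, thresholds, and decay constants are invented for the example, not taken from the talk.

```python
class Exemplar:
    """A stored observation with a decaying retention strength."""
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.strength = 1.0

def twf_step(memory, decay=0.95, floor=0.1):
    """Time-weighted forgetting: every exemplar decays at every
    time step, based purely on its time of arrival."""
    for e in memory:
        e.strength *= decay
    return [e for e in memory if e.strength >= floor]

def lwf_step(memory, new, radius=0.2, decay=0.5, floor=0.1):
    """Locally-weighted forgetting: only exemplars near the new
    observation decay, so revisited regions are refreshed while
    untouched regions of the input space are preserved."""
    for e in memory:
        if abs(e.x - new.x) < radius:
            e.strength *= decay
    memory.append(new)
    return [e for e in memory if e.strength >= floor]

def pewf_step(memory, new, tol=0.1, decay=0.5, floor=0.1):
    """Performance-error weighted forgetting: the exemplar that
    would have predicted the new observation (here, its nearest
    neighbor) decays when its prediction error exceeds tol."""
    if memory:
        nearest = min(memory, key=lambda e: abs(e.x - new.x))
        if abs(nearest.y - new.y) > tol:
            nearest.strength *= decay
    memory.append(new)
    return [e for e in memory if e.strength >= floor]
```

The contrast is visible immediately: TWF eventually discards everything, LWF discards only where new data arrive, and PEWF discards only exemplars that have actually started predicting badly.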