Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!europa.eng.gtefsd.com!howland.reston.ans.net!spool.mu.edu!caen!crl.dec.com!crl.dec.com!pa.dec.com!fisher.bio.uci.edu!rwa
From: rwa@fisher.bio.uci.edu ("Russell W. Anderson")
Message-ID: <199410171938.AA03525@pinus>
Subject: Reinforcement learning, ref's wanted
Date: Mon, 17 Oct 1994 12:38:56 -0700
X-Received: by usenet.pa.dec.com; id AA16353; Mon, 17 Oct 94 13:29:42 -0700
X-Received: by pobox1.pa.dec.com; id AA00883; Mon, 17 Oct 94 13:29:40 -0700
X-Received: from fisher.bio.uci.edu by inet-gw-3.pa.dec.com (5.65/10Aug94)
	id AA18546; Mon, 17 Oct 94 12:40:28 -0700
X-Received: from pinus by fisher with SMTP id AA13527
  (5.67a/IDA-1.5 for <comp.ai.neural-nets.usenet@decwrl.dec.com>); Mon, 17 Oct 1994 12:38:57 -0700
X-Received: by pinus id AA03525
  (5.67a/IDA-1.5 for comp.ai.neural-nets.usenet@decwrl.dec.com); Mon, 17 Oct 1994 12:38:56 -0700
X-Received: by NeXT.Mailer (1.100)
X-Received: by NeXT Mailer (1.100)
X-To: comp.ai.neural-nets.usenet@decwrl.dec.com
Lines: 123



To: M. Reiss

I recently wrote a review about the biological plausibility of
trial-and-error learning rules (not to be confused with the more complicated,
adaptive critic reinforcement learning).

Below is an abstract and a few other references.

Best regards,

Russell W. Anderson
Dept. of Ecology and Evolutionary Biology
University of California
Irvine, CA 92717
Phone: (714) 856-7307
Fax: 714-725-2181
email: rwa@fisher.bio.uci.edu
   or  RWANDERS@uci.edu
---------------------------------
PREPRINT AVAILABLE:

"Biased Random-Walk Learning:
A Neurobiological Correlate to Trial-and-Error"
(In press: Progress in Neural Networks)

Russell W. Anderson
Los Alamos National Laboratory

Abstract:
Neural network models offer a theoretical testbed for
the study of learning at the cellular level.
The only experimentally verified learning rule,
Hebb's rule, is extremely limited in its ability
to train networks to perform complex tasks.
An identified cellular mechanism responsible for
Hebbian-type long-term potentiation, the NMDA receptor,
is highly versatile.  Its function and efficacy are
modulated by a wide variety of compounds and conditions
and are likely to be directed by non-local phenomena.
Furthermore, it has been demonstrated that NMDA receptors
are not essential for some types of learning.
We have shown that another neural network learning
rule, the chemotaxis algorithm, is theoretically much more powerful
than Hebb's rule and is consistent with experimental data.
A biased random-walk in synaptic weight space is
a learning rule immanent in nervous activity and
may account for some types of learning -- notably the
acquisition of skilled movement.
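
A minimal sketch of the kind of biased random walk in weight space
described above, for readers who want something concrete. This is an
illustration under stated assumptions, not the code from the preprint:
the function name chemotaxis_train, the parameters steps and sigma, and
the toy quadratic loss are all hypothetical. The rule assumed here is:
draw a random Gaussian direction, keep stepping in that direction while
the error keeps dropping, and draw a new direction when it does not.

```python
import random


def chemotaxis_train(loss, weights, steps=2000, sigma=0.1, seed=0):
    """Biased random walk over a weight vector (illustrative sketch).

    loss    -- callable mapping a list of weights to a scalar error
    weights -- initial list of weights
    sigma   -- std. dev. of each Gaussian step component (assumed value)
    """
    rng = random.Random(seed)
    weights = list(weights)
    best = loss(weights)
    direction = None
    for _ in range(steps):
        if direction is None:
            # "Tumble": pick a fresh random direction in weight space.
            direction = [rng.gauss(0.0, sigma) for _ in weights]
        trial = [w + d for w, d in zip(weights, direction)]
        trial_loss = loss(trial)
        if trial_loss < best:
            # "Run": the step helped, so keep it and reuse the direction.
            weights, best = trial, trial_loss
        else:
            # The step did not help: discard it and tumble next time.
            direction = None
    return weights, best


# Toy problem (hypothetical): recover a fixed target weight vector.
TARGET = [1.0, -2.0, 0.5]


def toy_loss(w):
    return sum((wi - ti) ** 2 for wi, ti in zip(w, TARGET))


final_weights, final_loss = chemotaxis_train(toy_loss, [0.0, 0.0, 0.0])
```

Note that the rule uses only a single scalar error signal and no
gradient information, which is the property the abstract appeals to
when arguing for biological plausibility.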

------------------------
Other references:

H. J. Bremermann and R. W. Anderson (1989).
An Alternative to Back-propagation: A Simple Rule of Synaptic
Modification For Neural Net Training and Memory
Technical Report PAM-483, U. C. Berkeley Center for
Pure and Applied Mathematics.

H. J. Bremermann and R. W. Anderson (1991).
How the Brain Adjusts Synapses - Maybe
In: Automated Reasoning: Essays in Honor of Woody Bledsoe,
R. S. Boyer (ed.), Chapter 6, pp. 119-147,
Kluwer Academic Pub., Boston.

R. W. Anderson and V. Vemuri (1992).
Neural Networks can be used for Open-Loop, Dynamic Control
Int. J. Neural Networks, Vol. 2 (3)
(Abstract in: Proc. Int. AMSE Conf. Neural Networks, San Diego, CA,
Vol. 2: 227-237 (May 29-31, 1991)).

R. W. Anderson (1991).
Stochastic Optimization of Neural Networks and Implications
for Biological Learning
Ph.D. Dissertation, University of California, San Francisco.

R. L. Barron (1968).
Self-Organizing and Learning Control Systems
In: Cybernetic Problems in Bionics (Bionics Symposium, May 2-5, 1966,
Dayton, Ohio), New York, Gordon and Breach, pp. 147-203.

A. N. Mucciardi (1972).
Neuromime Nets as the Basis for the Predictive Component
of Robot Brains
In: Cybernetics, Artificial Intelligence, and Ecology,
H. W. Robinson and D. E. Knight (eds.),
(Fourth Annual Symposium Amer. Soc. of Cybernetics),
Spartan Books, pp. 159-193.

R. Smalz and M. Conrad (1991).
A Credit Apportionment Algorithm for Evolutionary Learning
with Neural Networks
In: Neurocomputers and Attention Vol. II: Connectionism
and Neurocomputers, A. V. Holden and V. I. Kryukov (eds.),
Manchester University Press: New York, pp. 663-673.

D. L. Styer and V. Vemuri (1992a).
Adaptive Critic and Chemotaxis in Adaptive Control
Conf. Artificial Neural Networks in Engineering (ANNIE),
St. Louis, MO. (Nov.).

J. M. Wilson (1991).
Back-Propagation Neural Networks: A Comparison
of Selected Algorithms and Methods of Improving Performance
Proc. 2nd Annual Workshop Neural Networks WNN-AIND, Auburn,
Alabama (Feb. 11-13, 1991).

M. J. Willis, C. Di Massimo, G. A. Montague, M. T. Tham, and others (1991).
Artificial Neural Networks in Process Engineering
IEE Proceedings-D Control Theory and Applications,
Vol. 138 (3): 256-266 (May 1991).



