Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!gatech!howland.reston.ans.net!news.sprintlink.net!redstone.interpath.net!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Q: Super SAB training
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <D52zpy.EGu@unx.sas.com>
Date: Tue, 7 Mar 1995 17:39:34 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References:  <1995Mar2.175335.28905@nosc.mil>
Organization: SAS Institute Inc.
Lines: 55


In article <1995Mar2.175335.28905@nosc.mil>, "R. Scott Starsman" <r_starsman@nise-p.nosc.mil> writes:
|> I've been trying to implement a Super Self-Adapting Backpropagation
|> (Super SAB) network and I'm having difficulty getting it to converge
|> in an on-line (non-batch) mode.
|>
|> Has anyone used super SAB and is it worth pursuing?

Super SAB works poorly in batch mode and is undoubtedly even worse
on-line. For batch learning, Quickprop (see Fahlman) and RPROP (see
Riedmiller and Braun) are far superior to Super SAB:

   Fahlman, S.E. (1988), "An empirical study of learning speed in
   back-propagation networks", CMU-CS-88-162, School of Computer Science,
   Carnegie Mellon University.

   Fahlman, S.E. (1989), "Faster-Learning Variations on
   Back-Propagation: An Empirical Study", in Touretzky, D., Hinton, G., and
   Sejnowski, T., eds., _Proceedings of the 1988 Connectionist Models
   Summer School_, Morgan Kaufmann, 38-51.

   Riedmiller, M. and Braun, H. (1993), "A Direct Adaptive Method for
   Faster Backpropagation Learning: The RPROP Algorithm", _Proceedings of
   the IEEE International Conference on Neural Networks 1993_, San
   Francisco: IEEE.
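For anyone who hasn't seen RPROP, the key idea is that each weight gets
its own step size, adapted from the *sign* of successive batch gradients
while the gradient's magnitude is ignored. Here is a minimal sketch (my
own toy code, not from the paper; the constants 1.2 and 0.5 are the
values Riedmiller and Braun suggest, and the sign-change handling follows
the common variant that simply skips the update after a sign flip):

```python
import numpy as np

def rprop_step(w, grad, prev_grad, step, eta_plus=1.2, eta_minus=0.5,
               step_min=1e-6, step_max=50.0):
    """One batch-mode RPROP update.

    Grow each weight's step when the gradient sign is unchanged,
    shrink it on a sign change, then move by the sign of the
    gradient only -- the magnitude is deliberately ignored.
    """
    sign_change = grad * prev_grad
    step = np.where(sign_change > 0, np.minimum(step * eta_plus, step_max), step)
    step = np.where(sign_change < 0, np.maximum(step * eta_minus, step_min), step)
    # On a sign change, zero the gradient so this weight skips one update
    # (a common RPROP variant).
    grad = np.where(sign_change < 0, 0.0, grad)
    w = w - np.sign(grad) * step
    return w, grad, step

# Toy problem: minimize f(w) = w^2, whose batch gradient is 2w.
w = np.array([5.0])
prev_grad = np.zeros_like(w)
step = np.full_like(w, 0.1)
for _ in range(100):
    grad = 2 * w
    w, prev_grad, step = rprop_step(w, grad, prev_grad, step)
print(w)  # converges near 0
```

Note what happens if you feed this per-pattern gradients instead of
batch gradients: the signs flip at random from pattern to pattern, the
step sizes get driven down toward step_min, and the adaptation buys you
nothing. That is the batch-vs-on-line problem in miniature.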

In general, methods with adaptive learning rates that work well in batch
mode cannot be used on-line without modification, since the on-line
version would fruitlessly attempt to adapt the learning rate to random
variation in the training data. For on-line methods to converge, it is
necessary to reduce the learning rate gradually; see, e.g.:

   Robbins, H. and Monro, S. (1951), "A Stochastic Approximation Method",
   Annals of Mathematical Statistics, 22, 400-407.

   Kushner, H. and Clark, D. (1978), _Stochastic Approximation Methods for
   Constrained and Unconstrained Systems_, Springer-Verlag.

   White, H. (1989), "Some Asymptotic Results for Learning in Single
   Hidden Layer Feedforward Network Models", Journal of the American
   Statistical Association, 84, 1008-1013.
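The Robbins-Monro conditions say the rates eta_t must satisfy
sum(eta_t) = infinity (so the iterate can get anywhere) while
sum(eta_t^2) < infinity (so the noise averages out); eta_t = eta0/(1+t)
satisfies both. A toy illustration (my own example, not from the
references above) -- on-line estimation of a mean from noisy samples:

```python
import random

random.seed(0)

# Stochastic approximation of the mean of noisy samples:
#   w_{t+1} = w_t - eta_t * (w_t - x_t)
# with the Robbins-Monro schedule eta_t = eta0 / (1 + t).
eta0 = 1.0
w = 0.0
true_mean = 3.0
for t in range(10000):
    x = true_mean + random.gauss(0, 1)   # noisy sample, true mean 3.0
    eta = eta0 / (1 + t)
    w -= eta * (w - x)
print(w)  # should approach 3.0
```

With eta0 = 1 this schedule makes w exactly the running sample mean, so
convergence here is just the law of large numbers. Hold eta fixed
instead and w never converges -- it keeps bouncing around the mean with
variance proportional to eta, which is precisely why a method that
adapts eta *upward* in response to noisy per-pattern gradients can
defeat convergence.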

For a method with an adaptive learning rate to converge on-line,
it is necessary to balance the learning rate adaptation against the
required gradual decline in learning rate, and I don't know any good
way to do that. If anybody does know a good way (with either a
convergence proof or extensive empirical tests [i.e., not just on XOR]),
please let us know about it!


-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
