Newsgroups: comp.ai.neural-nets,comp.soft-sys.matlab
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.mathworks.com!newsfeed.internetmci.com!in1.uu.net!brighton.openmarket.com!decwrl!sunsite.doc.ic.ac.uk!yama.mcc.ac.uk!thor.cf.ac.uk!scmcms
From: C.M.Sully@cs.cf.ac.uk (Chris Sully)
Subject: Re: BP and local optima
Sender: news@cf.ac.uk (USENET News System)
Message-ID: <DpwxE7.H8x@cf.ac.uk>
Date: Mon, 15 Apr 1996 16:49:19 GMT
X-Nntp-Posting-Host: tourmaline.cs.cf.ac.uk
References: <4kkkl4$r82@ustsu10.ust.hk> <4kn9p0$qi1@dfw-ixnews6.ix.netcom.com>
Organization: Dept of Computer Science, Univ of Wales, CARDIFF
Lines: 53
Xref: glinda.oz.cs.cmu.edu comp.ai.neural-nets:31077 comp.soft-sys.matlab:21003

In article <4kn9p0$qi1@dfw-ixnews6.ix.netcom.com>, jdadson@ix.netcom.com(Jive Dadson ) writes:
|> In <4kkkl4$r82@ustsu10.ust.hk> lsavio@cs.ust.hk (Lam Lai Yin Dominic
|> Savio) writes: 
|> 
|> >
|> >Can anyone point me to some materials or papers on how to minimize the
|> >chance of BP being stuck in a local optimum?
|> 
|> The short answer is, no. Not if the algorithm is pure BP and not
|> some hybrid that restarts repeatedly.
|> 
|> The various flavors of backpropagation (gradient-directed search) are
|> designed specifically to find the first local minimum they can. If the
|> error-surface has lots of local minima, you are likely to wind up in
|> one. There are methods other than pure BP that try to spread the search
|> around. Some are "guaranteed" to find a global minimum, but the
|> guarantee either has some strong qualifications attached or it is the
|> same kind of guarantee that says you have a 100% chance of hitting the
|> lottery if you play long enough. Some of the magic words are "simulated
|> annealing", "Gibbs sampling", and "Monte Carlo".
|> 
|>                   Jive
|> 

How about a guided choice of weight initialisation, for example? Some
sort of heuristic approach to start the search off nearer a 'good'
minimum?
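
To make the idea concrete, here is a rough sketch of the kind of thing I
have in mind. Everything in it is made up for illustration: a
one-weight "error surface" with one poor minimum and one better one, and
a crude heuristic that evaluates the error at a handful of candidate
starting weights and begins plain gradient descent from the best of
them. I'm not claiming this is an established method, only one way the
heuristic could look.

```python
# Toy 1-D error surface (hypothetical): a poor local minimum near
# w = 1.13 and a better one near w = -1.30.
def error(w):
    return w**4 - 3 * w**2 + w

def grad(w):
    return 4 * w**3 - 6 * w + 1

def descend(w, lr=0.01, steps=2000):
    """Plain gradient descent from the starting weight w."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def guided_start(candidates):
    """Heuristic: pick the candidate start with the lowest error."""
    return min(candidates, key=error)

candidates = [-2.0, -1.0, 0.0, 1.0, 2.0]

w_naive = descend(1.0)                        # arbitrary start
w_guided = descend(guided_start(candidates))  # heuristic start

print("naive :", w_naive, error(w_naive))
print("guided:", w_guided, error(w_guided))
```

On this toy surface the arbitrary start settles in the poorer minimum
and the guided start in the better one. Whether anything like this
scales to a real network with many weights is exactly what I'm asking.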

I thought that with online gradient descent (as opposed to batch) you
could escape local minima. I think I read this in Bishop, though he had
his own term for 'online'. If anybody can explain why this works, I'd
be glad to hear it.
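
For concreteness, this is a minimal sketch of the two update schemes I
mean, assuming a single linear unit y = w*x with squared error. The
function names and the data are my own invention. The only difference
is that batch averages the gradient over all patterns before making one
update, whereas online updates after every pattern, so each step follows
one pattern's gradient rather than the gradient of the total error. This
toy problem has a single minimum, so it shows only the mechanics and not
any escape from local minima.

```python
import random

def batch_epoch(w, data, lr):
    """One batch update: average the gradient over every pattern first."""
    g = sum((w * x - t) * x for x, t in data) / len(data)
    return w - lr * g

def online_epoch(w, data, lr, rng):
    """One pass of online updates: change w after each single pattern."""
    order = data[:]
    rng.shuffle(order)  # present the patterns in random order
    for x, t in order:
        w -= lr * (w * x - t) * x
    return w

# Hypothetical noise-free data generated with a true weight of 2.0.
data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
rng = random.Random(0)

w_batch = w_online = 0.0
for _ in range(200):
    w_batch = batch_epoch(w_batch, data, 0.1)
    w_online = online_epoch(w_online, data, 0.1, rng)

print(w_batch, w_online)
```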

I wouldn't mind a few references to the aforementioned global optimum
search techniques if anybody has some to hand, particularly if they
think they're any good.
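
So it's clear what sort of technique I'm asking about, here is a
bare-bones sketch of simulated annealing as I understand it, run on a
made-up one-weight error surface. The cooling schedule, step size and
starting temperature are arbitrary guesses of mine, not recommended
values, and a real network would need the weight vector perturbed rather
than a single number.

```python
import math
import random

# Toy 1-D error surface (hypothetical) with two minima of different depth.
def error(w):
    return w**4 - 3 * w**2 + w

def anneal(w, rng, t0=2.0, cooling=0.999, steps=5000, step_size=0.5):
    """Simulated annealing sketch; returns the best weight seen and its error."""
    e = error(w)
    best_w, best_e = w, e
    t = t0
    for _ in range(steps):
        cand = w + rng.gauss(0.0, step_size)  # random perturbation
        cand_e = error(cand)
        # Always accept an improvement; accept a worse point with
        # probability exp(-(increase in error) / temperature).
        if cand_e <= e or rng.random() < math.exp((e - cand_e) / t):
            w, e = cand, cand_e
            if e < best_e:
                best_w, best_e = w, e
        t *= cooling  # geometric cooling
    return best_w, best_e

# Start near the poorer minimum, where plain gradient descent would stay.
best_w, best_e = anneal(1.0, random.Random(0))
print(best_w, best_e)
```

The point of the temperature is that uphill moves are tolerated early on
and become rarer as it falls, which is what lets the search leave the
basin it starts in.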

Cheers.

Chris.

An old Welsh proverb: "Gorau arf, arf dysg" 
the best weapon is the weapon of learning

==================================================================
Christopher Sully
Ph.D. Student (Parallel and Scientific Computation Research Group)  
Department of Computer Science (Room C2.06), 
University of Wales College of Cardiff,   
Newport Road, CARDIFF, CF2 3XF. (Wales, UK) 
E-Mail: C.M.Sully@cs.cf.ac.uk
WWW: http://www.cs.cf.ac.uk/User/C.M.Sully
Phone:  +44 (0)1222 874000 x6068
Fax:	+44 (0)1222 666182
Home:	+44 (0)1222 484494
==================================================================
