Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!MathWorks.Com!news.kei.com!ub!acsu.buffalo.edu!jn
From: jn@cs.Buffalo.EDU (Jai Natarajan)
Subject: Re: Visualizing Neural Networks
Message-ID: <CxsBtK.5zn@acsu.buffalo.edu>
Originator: jn@hadar.cs.Buffalo.EDU
Sender: nntp@acsu.buffalo.edu
Nntp-Posting-Host: hadar.cs.buffalo.edu
Organization: State University of New York at Buffalo/Computer Science
References: <HkLekqLVTBCC065yn@login.dknet.dk> <37s1uh$ogp@cantaloupe.srv.cs.cmu.edu>
Date: Sun, 16 Oct 1994 21:32:07 GMT
Lines: 57


In article <37s1uh$ogp@cantaloupe.srv.cs.cmu.edu>, sef@CS.CMU.EDU (Scott Fahlman) writes:
|> 
|> In article <HkLekqLVTBCC065yn@login.dknet.dk> thn.cls@login.dknet.dk (Thomas Honore Nielsen) writes:
|> 
|>    Can anyone help me to find a way to visualize the inner workings of
|>    a backprop? I need to 1) get a hint as to which parts of the input
|>    pattern are characterizing their respective categories and 2) get
|>    a view of the network's performance when training.
|> 
|> I don't know what 2) means.  As for 1), one possibility is to put in
|> an input, let the output settle, and then compute do/di for all the
|> inputs i, by basically back-propagating the output values (rather than
|> error) and by taking this all the way back to the inputs.  It has to
|> be done with a specific set of activations in the net so that you can
|> compute the sigmoid-prime values.
|> 
|> This may not show anything interesting if units in the net are
|> saturated.  It may well be that no input, by itself, can change
|> things, so all the derivatives are zero.  In that case, you may want
|> to pretend that the sigmoid units are linear ones (or add a small
|> linear component), just to see which inputs are contributing at all.
|> But it's hard to characterize just what the results mean if you do
|> this.
|> 
|> -- Scott
|> 
|> ===========================================================================
|> Scott E. Fahlman			Internet:  sef+@cs.cmu.edu
|> Principal Research Scientist		Phone:     412 268-2575
|> School of Computer Science              Fax:       412 268-5576 (new!)
|> Carnegie Mellon University		Latitude:  40:26:46 N
|> 5000 Forbes Avenue			Longitude: 79:56:55 W
|> Pittsburgh, PA 15213			Mood:      :-)
|> ===========================================================================
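Scott's do/di suggestion can be sketched in a few lines of numpy. The network here is hypothetical (a random-weight 4-3-2 sigmoid net, made up just to show the mechanics): the key step is backpropagating the output values, not an error, using sigmoid-prime evaluated at the activations for one specific input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Hypothetical 4-3-2 network with random weights, purely illustrative.
W1 = rng.normal(size=(3, 4))   # hidden x input
W2 = rng.normal(size=(2, 3))   # output x hidden

x = rng.normal(size=4)         # one specific input pattern
h = sigmoid(W1 @ x)            # hidden activations for this input
o = sigmoid(W2 @ h)            # output activations for this input

# Backpropagate the output values (not an error) all the way to the inputs.
# sigmoid'(z) = s*(1-s), evaluated at THIS input's activations.
do_dh = (o * (1 - o))[:, None] * W2    # d output / d hidden, shape (2, 3)
dh_dx = (h * (1 - h))[:, None] * W1    # d hidden / d input,  shape (3, 4)
do_di = do_dh @ dh_dx                  # d output / d input,  shape (2, 4)

print(do_di)   # large-magnitude columns mark the inputs the net is sensitive to
```

As Scott notes, if the hidden or output units are saturated, the sigmoid-prime factors go to zero and the whole matrix can vanish.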
If I understand your second question correctly, you want to know how to
judge whether your training is progressing properly. One strict rule you
must observe (and which is tempting to violate) is that you must NEVER
show your test set to the net while training. That means you can't just
pause occasionally, feed some test samples into the net, and check that the
recognition rate is increasing. Doing so introduces a bias into your
training process (both human and neural).
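A minimal sketch of that discipline (the data and names here are made up): shuffle once, split once, and don't touch the test half until training is frozen.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))        # stand-in data, just to show the split
y = (X[:, 0] > 0).astype(float)

# Shuffle once and split once.
idx = rng.permutation(len(X))
cut = int(0.8 * len(X))
train_idx, test_idx = idx[:cut], idx[cut:]
X_train, y_train = X[train_idx], y[train_idx]
X_test,  y_test  = X[test_idx],  y[test_idx]

# ... train only on (X_train, y_train); all mid-run monitoring uses the
# training set, or a validation split carved out of the training half ...

# The test set is consulted exactly once, after training is frozen.
```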
There are two other techniques you could use. One is to track the mean
squared error at the output nodes (perhaps feeding it to a graph) and check
that it is generally decreasing, apart from small local fluctuations.
The other is to feed your training set through a recognition run at regular
intervals and check the accuracy. The goal is 100% accuracy on the training
set. A word of caution, though: even that may not mean optimality. I have
sometimes seen the error rate keep decreasing after the net reaches 100%
training-set accuracy.
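Both monitoring tricks are just a little bookkeeping inside the training loop. Here is a toy sketch (a single sigmoid unit learning logical AND, so it is not a real backprop net; the logging is the point, not the architecture):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy training set: logical AND (linearly separable, so one sigmoid
# unit trained with the delta rule suffices for this illustration).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=2)
b = 0.0
lr = 2.0

for epoch in range(5000):
    y = sigmoid(X @ w + b)
    err = y - t
    grad = err * y * (1 - y)            # delta rule for a sigmoid unit
    w -= lr * (X.T @ grad) / len(t)
    b -= lr * grad.mean()
    if epoch % 1000 == 0:
        mse = np.mean(err ** 2)                   # should trend downward
        acc = np.mean((y > 0.5) == (t > 0.5))     # training-set recognition rate
        print(f"epoch {epoch:5d}  mse {mse:.4f}  acc {acc:.2f}")
```

The MSE curve is what you would feed to a graph; the accuracy line is the periodic recognition run on the training set.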
It's not easy to map the theoretical ideas of local and global minima into
practical terms, so the above seems the way to go about it. I hope I
answered what you wanted. All the best for those many, many training
cycles!
Jai Natarajan
Dept. of Computer Science
SUNY at Buffalo
