Newsgroups: comp.ai.neural-nets,sci.stat.math
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.mathworks.com!newsfeed.internetmci.com!howland.reston.ans.net!psinntp!psinntp!psinntp!psinntp!megatest!news
From: Dave Jones <djones>
Subject: Where'd the 2 go in the gradient?
Content-Type: multipart/mixed;
	boundary="-------------------------------327562359116799"
Message-ID: <DJsLDF.D4p@Megatest.COM>
Sender: news@Megatest.COM (News Admin)
Nntp-Posting-Host: pluto
Organization: Megatest Corporation
References: <4a736b$ajv@goya.eunet.es> <4agrgb$eho@fstgal00.tu-graz.ac.at> <petercd.30.000A8932@io.org> <4ajcla$jto@fstgal00.tu-graz.ac.at> <4ana2s$sid@bright.ecs.soton.ac.uk>
Mime-Version: 1.0
Date: Mon, 18 Dec 1995 17:17:38 GMT
X-Mailer: Mozilla 1.1N (X11; I; SunOS 5.4 sun4m)
X-Url: file:/p405a3/djones/wave/letter2
Lines: 76
Xref: glinda.oz.cs.cmu.edu comp.ai.neural-nets:28725 sci.stat.math:8425

This is a multi-part message in MIME format.

---------------------------------327562359116799
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

My apologies, but the text of this message is in the attachment. I'm using
"Netscape", and I cannot figure out how to insert a file I've already typed.
Cut and paste doesn't even work. You'd think that with the billion dollars they
collected from selling stock certificates they could hire an undergraduate
for a couple of afternoons to implement that feature. Anyway, please open the
attachment and read it, if your newsreader supports attachments. If not, sorry
I bothered you.

             Dave

---------------------------------327562359116799
Content-Transfer-Encoding: 7bit
Content-Type: text/plain


I have a problem with the "gradient" definition in Masters' _Practical
Neural Network Recipes in C++_. I've often praised and recommended his
books on the net, so I hope Dr. Masters will forgive a little grousing.

At the bottom of page 95 he begins with the dreaded "simply". (I always
say a silent "Uh oh," when I see the "s" word.)

    Here we simply state that for a single presentation [read
    "datum" -- D.J.], the derivative of the output layer weight
    connecting the previous layer i to output neuron j is


            dE/dw_ij  =  -o_i f'(net_j) (t_j - o_j)

       E  -- an error measure
       w  -- the weight matrix for a layer
     o_i  -- output of neuron i from previous layer
       f  -- the activation function for the layer (i.e. logistic func.)
   net_j  -- the weighted sum of inputs to neuron j
     t_j  -- the "training" or expected output for neuron j
     o_j  -- the actual output of neuron j [equal to f(net_j)]


Notice first off that the sentence beginning with the word "simply" does
not even make sense. I presume it is supposed to read,

    For a single presentation, the partial derivative of the error
    function E with respect to the weight w_ij is...

But if that's what it is supposed to mean, then either he dropped a "2" or
I have forgotten how to do elementary calculus. (I will admit that the
second contingency is not out of the question.) Assuming that the error
function E is defined as on page 95,

            E(o_j)  =  (t_j - o_j)^2


then I get an answer that is exactly twice what the "simply" formula
says it is.

What gives?


               Dave



          
---------------------------------327562359116799--
