Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!nntp.club.cc.cmu.edu!goldenapple.srv.cs.cmu.edu!das-news2.harvard.edu!cam-news-feed3.bbnplanet.com!news.inc.net!arclight.uoregon.edu!news.mathworks.com!newsgate.duke.edu!interpath!news.interpath.net!news.interpath.net!sas!newshost.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Bogus facts about trained MLP's
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <E7x8sD.AzK@unx.sas.com>
Date: Mon, 31 Mar 1997 19:00:13 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References:  <Pine.SOL.3.91.970330104654.11493B@miles>
Organization: SAS Institute Inc.
Lines: 39


On 25 Mar 1997, Marklaw2 <marklaw2@aol.com> wrote:
> Hi Ian,
> I believe the assumption you are talking about comes from a Bayesian
> treatment of Neural Networks first proposed by Radford Neal of the
> University of Toronto. A Bayesian treatment of any system requires a prior.
> The prior being your initial belief about the way the system should work.
> This prior is often taken to be a Gaussian due to the ease of
> computational application. You are right, there is no physical reason why
> this should be so, it is just another case of Gaussians being applied due
> to their computational simplicity. 

I missed Ian's post, but I take it the question had to do with the claim
that the set of weights in a well-trained NN should have a Gaussian
distribution.  I have seen this claim in a number of popular articles by
people who couldn't possibly understand Radford's Bayesian NNs. Besides
which, having a Gaussian prior distribution for the weights does not in
any way imply that the set of weights in a particular trained network
should have anything resembling a Gaussian distribution. Even if the
regularity conditions hold under which the posterior distribution is
approximately normal, that says nothing about the distribution of the
weights in a particular trained network.
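To make the distinction concrete, here is a minimal sketch (my own
illustration, not anything from Radford's papers; the function and
parameter names, and the values, are made up). With a squared-error
likelihood and an independent Gaussian prior on every weight, the
negative log posterior is just the error sum of squares plus a
weight-decay penalty. Minimizing it gives you one particular weight
vector, the posterior mode; nothing about that penalty forces the
handful of numbers in that vector to look like a Gaussian sample.

    import numpy as np

    def neg_log_posterior(weights, errors, sigma_noise=0.1, sigma_prior=1.0):
        # Up to an additive constant:
        #   -log p(w | data) = sum(err^2)/(2*sigma_noise^2)
        #                      + sum(w^2)/(2*sigma_prior^2)
        # The second term is ordinary weight decay.  It discourages large
        # weights, but it says nothing about the shape of the histogram of
        # the fitted weights in any one network.
        return (np.sum(errors**2) / (2 * sigma_noise**2)
                + np.sum(weights**2) / (2 * sigma_prior**2))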

This is easier to explain in a frequentist context. If you take a large
number of random samples from some population, and train a network on
each sample, and if the regularity conditions hold (which they usually
don't), and if you look at the weights corresponding to any particular
connection in all of these networks, then that "sampling distribution"
should be approximately Gaussian. If you look at all of the weights in a
particular network, there is no reason that I know of for that
distribution to be Gaussian.
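If you want to see the difference, a small simulation makes the point.
The sketch below is mine, not anything Ian or Marklaw posted; the toy
population (y = sin(3x) plus noise), the network size, and all the names
are arbitrary. It trains the same tiny network on many independent
samples, then compares the distribution of one particular weight across
the fitted networks with the distribution of all the weights within a
single fitted network.

    import numpy as np

    rng = np.random.default_rng(0)

    def train_net(x, y, hidden=5, epochs=2000, lr=0.05):
        # Train a one-hidden-layer tanh network by plain gradient descent
        # on squared error.  The starting weights are the same for every
        # call, so "the weight on a particular connection" means the same
        # thing in every fitted network.
        init = np.random.default_rng(12345)
        W1 = init.normal(0.0, 0.5, (1, hidden)); b1 = np.zeros(hidden)
        W2 = init.normal(0.0, 0.5, (hidden, 1)); b2 = np.zeros(1)
        n = x.shape[0]
        for _ in range(epochs):
            h = np.tanh(x @ W1 + b1)        # hidden activations, (n, hidden)
            yhat = h @ W2 + b2              # outputs, (n, 1)
            err = yhat - y
            dW2 = h.T @ err / n;  db2 = err.mean(axis=0)
            dh  = (err @ W2.T) * (1.0 - h**2)
            dW1 = x.T @ dh / n;   db1 = dh.mean(axis=0)
            W1 -= lr * dW1; b1 -= lr * db1
            W2 -= lr * dW2; b2 -= lr * db2
        return np.concatenate([W1.ravel(), b1, W2.ravel(), b2.ravel()])

    def draw_sample(n=100):
        # Toy population: y = sin(3x) + Gaussian noise.
        x = rng.uniform(-1.0, 1.0, (n, 1))
        y = np.sin(3.0 * x) + rng.normal(0.0, 0.1, (n, 1))
        return x, y

    # (a) one particular weight, across many nets trained on different samples
    all_fits = np.array([train_net(*draw_sample()) for _ in range(200)])
    one_weight = all_fits[:, 0]
    print("one connection across 200 nets: mean %.3f  std %.3f"
          % (one_weight.mean(), one_weight.std()))

    # (b) all of the weights within a single trained network
    print("all weights in one net:", np.round(all_fits[0], 3))

Starting every fit from the same initial weights matters here: otherwise
the hidden units can come out in a different order in each network, the
"same" connection is not comparable across fits, and the sampling
distribution need not look Gaussian at all -- which is one of the ways
the regularity conditions fail in practice.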

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
 *** Do not send me unsolicited commercial or political email! ***

