Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!bloom-beacon.mit.edu!uhog.mit.edu!news.mathworks.com!gatech!howland.reston.ans.net!news.sprintlink.net!redstone.interpath.net!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: bias
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <D7x5u6.Fvo@unx.sas.com>
Date: Mon, 1 May 1995 21:46:54 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References: <1995Apr28.195623.7533@cm.cf.ac.uk> <D7t3E1.ECz@unx.sas.com> <799200709snz@longley.demon.co.uk>
Organization: SAS Institute Inc.
Lines: 72


In article <799200709snz@longley.demon.co.uk>, David@longley.demon.co.uk (David Longley) writes:
|> In article <D7t3E1.ECz@unx.sas.com>
|>            saswss@hotellng.unx.sas.com "Warren Sarle" writes:
|> > ... The statisticians should realize that, if they understand
|> > that feedforward neural nets are nonlinear regression models.
|>
|> This being so, why all the fuss about 'artificial neural networks'?
|> Statisticians have been building better and better regression techniques
|> over the years, and those techniques are based on clear *extensional*
|> principles where the weights can be readily attributed to the independent
|> variables. I can't help but feel that all that artificial neural networks
|> do is model the somewhat daft, *intensional* way in which biological systems
|> model relations - ie 'overfitting', getting stuck in ruts and all the rest
|> of the little pitfalls we tend to get ourselves into (Tversky & Kahneman 1974).

I am not familiar with the terminology of *extensional* and *intensional*
but I will try to say something useful nevertheless. 

There has indeed been a lot of "fuss", or "hype" as it is more commonly
called, regarding NNs. The April 15th issue of _The Economist_ has a
good example on pp. 75-77. This hype attributes the success of NNs to
their alleged emulation of biological nervous systems and hence human
intelligence.

The NNs that have been most widely and successfully applied are
multilayer perceptrons (MLPs) using some form of "backpropagation". I
would argue that MLPs have been successful because they are nonlinear
regression models, not because they emulate biological nervous systems.
The widespread publicity regarding NNs, such as the article in _The
Economist_, shows that neural networkers are better at marketing than
are statisticians, and that neural networkers have been more daring in
their applications than have statisticians. Statisticians are
professionally compelled to control error rates, while neural networkers
extrapolate with wild abandon. It seems that extrapolating with wild
abandon works more often than statisticians would have expected.  But
"backpropagation" networks are still nonlinear regression models based
on 1950s statistical technology.

It is true that statisticians have been building better and better
regression techniques, but I would include NNs among the class of
"better regression techniques", especially when combined with current
statistical technology like bootstrapping.  You might use an MLP, rather
than linear regression, if you have reason to expect nonlinear
relationships. You might use a flexible nonlinear model such as an MLP,
rather than a specific parametric nonlinear model, if you have no prior
knowledge from which to construct a specific parametric model. You might
use an MLP, rather than kernel or k-nearest-neighbor regression, if you
have many predictors, some of which are likely to be irrelevant. You
might use an MLP, rather than projection pursuit regression or LOESS, if
you want a simple formula for rapidly computing predicted values.
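To make that last point concrete: a one-hidden-layer MLP's predicted
value is just a short closed-form expression in the inputs once the
weights have been estimated. A minimal sketch in Python/NumPy (my own
illustration, not from the thread; the weights here are arbitrary
made-up numbers, not fitted values):

```python
import numpy as np

# A one-hidden-layer MLP for regression is the nonlinear model
#   yhat = b2 + W2 . tanh(W1 x + b1)
# i.e. a fixed, cheap-to-evaluate formula once the weights are known.
# The weights below are arbitrary illustrative values, not fitted ones.
W1 = np.array([[0.5, -1.2],
               [1.0,  0.3]])   # hidden-layer weights (2 hidden units, 2 inputs)
b1 = np.array([0.1, -0.4])     # hidden-layer biases
W2 = np.array([2.0, -1.5])     # output weights
b2 = 0.25                      # output bias

def mlp_predict(x):
    """Predicted value for one input vector x."""
    return b2 + W2 @ np.tanh(W1 @ x + b1)

y = mlp_predict(np.array([0.3, -0.7]))
```

Compare this with LOESS or kernel regression, where every prediction
requires revisiting the training data.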

As for attributing weights to independent variables (i.e. inputs in NN
terminology), one can do that in some statistical models such as linear
regression or, more generally, additive models, but not in more flexibly
nonlinear models such as projection pursuit and kernel regression.
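For instance, in the linear-regression case the attribution is
immediate: each fitted coefficient is the weight on one independent
variable. A sketch (my own example) in Python/NumPy:

```python
import numpy as np

# Simulate data where y depends linearly on two inputs.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Ordinary least squares: design matrix with an intercept column.
A = np.column_stack([np.ones(100), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
# coef[1] estimates the effect of the first input (about 3.0) and
# coef[2] the effect of the second (about -2.0): each weight is
# readily attributed to one independent variable.
```

No such one-to-one reading of the weights exists for projection
pursuit or kernel regression, nor for the hidden-layer weights of
an MLP.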

Overfitting is just as much a problem in traditional statistical methods
as it is in neural networks. You can overfit in linear regression by
using too many predictors, in polynomial regression by using too many
terms, in autoregressive models by using too many lags, or in kernel
regression by using too small a bandwidth. I see no reason to believe
that NNs would be more prone to the sorts of human foibles that Tversky
& Kahneman described than would other flexible nonlinear methods such as
projection pursuit and kernel regression.
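The polynomial case is easy to demonstrate. A hedged sketch (again my
own example): fit polynomials of increasing degree to noisy data whose
underlying relationship is linear. Training error always falls as terms
are added, whether or not the extra terms reflect anything real:

```python
import numpy as np

rng = np.random.default_rng(0)

# Underlying truth is linear; the noise is what a high-degree
# polynomial ends up chasing.
x = np.linspace(-1, 1, 20)
y = 2.0 * x + rng.normal(scale=0.3, size=x.size)

def train_rmse(degree):
    """Root-mean-square error on the TRAINING data for a fitted polynomial."""
    coefs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coefs, x)
    return np.sqrt(np.mean(resid ** 2))

low, high = train_rmse(1), train_rmse(10)
# The degree-10 fit has the smaller training error, but its extra
# terms are fitting noise -- exactly the overfitting described above.
```

Judged on fresh data rather than the training sample, the degree-10
fit would generally do worse, which is why honest error estimates
(holdout data, cross-validation, bootstrapping) matter for NNs and
traditional methods alike.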


-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
