Newsgroups: comp.ai.neural-nets
From: David@longley.demon.co.uk (David Longley)
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!gatech!howland.reston.ans.net!news.sprintlink.net!demon!news.demon.co.uk!longley.demon.co.uk!David
Subject: Re: bias
References: <1995Apr28.195623.7533@cm.cf.ac.uk> <D7t3E1.ECz@unx.sas.com> <799200709snz@longley.demon.co.uk> <D7x5u6.Fvo@unx.sas.com>
Organization: Relational Technology
Reply-To: David@longley.demon.co.uk
X-Newsreader: Demon Internet Simple News v1.29
Lines: 71
X-Posting-Host: longley.demon.co.uk
Date: Tue, 2 May 1995 10:49:17 +0000
Message-ID: <799411757snz@longley.demon.co.uk>
Sender: usenet@demon.co.uk

In article <D7x5u6.Fvo@unx.sas.com>
           saswss@hotellng.unx.sas.com "Warren Sarle" writes:

<snip>
 
> As for attributing weights to independent variables (i.e. inputs in NN
> terminology), one can do that in some statistical models such as linear
> regression or, more generally, additive models, but not in more flexibly
> nonlinear models such as projection pursuit and kernel regression.
> 
> Overfitting is just as much a problem in traditional statistical methods
> as it is in neural networks. You can overfit in linear regression by
> using too many predictors, in polynomial regression by using too many
> terms, in autoregressive models by using too many lags, or in kernel
> regression by using too small a bandwidth. I see no reason to believe
> that NNs would be more prone to the sorts of human foibles that Tversky
> & Kahneman described than would other flexible nonlinear methods such as
> projection pursuit and kernel regression.
> 
> 
> -- 
> 
> Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
> saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
> (919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
> 
Thank you very much for this. I do have some experience with NNs and regression
but not to your level of expertise. Most of my professional work looks to the
use of multiple or logistic regression (SPSS I'm afraid; as IT coordinator I
had both SPSS and SAS licences at one time, but my colleagues (psychologists)
were not up to the latter, so I'm no longer familiar with what SAS now
offers). As I tried to elaborate in a series of articles here on the net
('Fragments of Behaviour' 1-9, ...25/4/95), my ultimate concern is the relative
merits of actuarial vs. clinical judgment. My background in neuroscience and
learning naturally led me to follow developments in the 'cell assembly' area,
but it was when I started to look at the relationship between conventional
stats techniques and NNs that I really thought I was on to something.

As I see it, the *key* difference between NNs and regression techniques such
as logistic regression (which also uses a squashing function to generate a
probability) is that the latter is designed to give you weightings for your
input (independent) variables, whilst NNs do not, and in principle cannot
(mind you, on reflection... B values are partial unstandardised regression
coefficients, aren't they... hmmm). SPSS 5.0 now has a couple of chapters on
Non-Linear Regression; do you know of them (and *can* you talk about them?)
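To make the contrast concrete, here is a minimal toy sketch in Python (my
own illustration, not SPSS or SAS output; the data and the fitting routine
are hypothetical). Logistic regression passes a weighted sum through the
logistic squashing function, and the fitted B value for each predictor is
directly interpretable as a (log-odds) weighting; a network with hidden
units spreads that same information across many weights, with no single
coefficient per input.

```python
import math

def sigmoid(z):
    # The logistic "squashing" function: maps any real number into (0, 1),
    # so the output can be read as a probability.
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit a one-predictor logistic regression by gradient descent on the
    log-loss. Returns (b0, b1): intercept and unstandardised B coefficient."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            g0 += (p - y)        # gradient of the log-loss w.r.t. b0
            g1 += (p - y) * x    # gradient of the log-loss w.r.t. b1
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Toy data: the outcome becomes more likely as x grows.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 0, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)

# b1 is the weighting attributed to the input variable: each unit increase
# in x multiplies the odds of y=1 by exp(b1). No comparable single number
# falls out of a trained hidden-layer network.
print(b0, b1)
```

The point of the sketch is only that the fitted b1 answers "how much does
this input matter, and in which direction?" in one number, which is exactly
what the distributed weight space of an NN declines to give you.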

To be philosophical for a moment: Quine (1951) may be seen as the source of
a 'distributed quality space' conception of meaning in his 'Two Dogmas of
Empiricism'. This notion can be found today in the connectionist dictum that
there are no referents to beliefs & other *intensions*, only connection weights
in a distributed weight space. Now this may be a fine model of how we & other
animals extract functions when operating on the world, ie this may be a good
way of modelling *knowing how*, but the essence of scientific method lies in
our being able to articulate *what* it is which functionally relates to what.
Here, I think conventional statisticians are 'hiding their lights..'. The
ability to step through a multiple regression programme & say exactly
what each step is doing algorithmically (effectively) is the real value of the
technology, as doing so, along with extracting & analysing residuals, looking
at measures of fit and so on, is analysing *extensionally*, ie according to the
explicit principles embodied in the predicate calculus (substitutivity of
identicals is of course essential for the solution of simultaneous equations).
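By way of illustration, here is a small Python sketch (again my own toy
example, with made-up data) of what I mean by stepping through a regression
extensionally: the coefficients are obtained by explicitly solving the two
normal equations simultaneously, and the residuals and the measure of fit
are then extracted and inspected, with every step open to articulation.

```python
def fit_ols(xs, ys):
    """One-predictor least-squares fit, solved explicitly.
    Returns (a, b): intercept and slope from the normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Solving the two normal equations simultaneously
    # (substitutivity of identicals at work):
    #   n*a  + sx*b  = sy
    #   sx*a + sxx*b = sxy
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

# Made-up data, roughly y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
a, b = fit_ols(xs, ys)

# Extract & analyse the residuals...
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# ...and a measure of fit (R-squared).
ss_res = sum(e * e for e in residuals)
mean_y = sum(ys) / len(ys)
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot
```

Every quantity here (a, b, each residual, R-squared) is nameable and
checkable; that is the extensional transparency I'm claiming a distributed
weight space lacks.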

Having said that, I *do* think that conventional stats technology could be
marketed better. In places, the SPSS coverage of logistic regression is very
good, but then the authors just let go of the neophyte's hand & they're lost.

This is just a quick response to say that I'd like to pursue the comparison of
conventional regression technology, cluster analysis, discriminant analysis,
etc. and NNs further - thanks again.

-- 
David Longley
