Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!europa.chnt.gtegsc.com!library.ucla.edu!agate!howland.reston.ans.net!news.sprintlink.net!redstone.interpath.net!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Mult.linear regression beats NN. Why?
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <D883zK.Bp@unx.sas.com>
Date: Sun, 7 May 1995 19:40:32 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References: <3odatr$131n@columba.udac.uu.se> <3ogprj$k7g@newsbf02.news.aol.com> <3oi58s$ok@mailnews.kub.nl>
Organization: SAS Institute Inc.
Lines: 49


In article <3oi58s$ok@mailnews.kub.nl>, rutger@kub.nl () writes:
|> TiedNBound (tiednbound@aol.com) wrote:
|> : < Well, it's because I thought that a NN ALWAYS
|> : should be able to do AT LEAST AS GOOD as any linear statistical
|> : method... Is this assumption wrong?>
|> Well, I've done some research on performance of Multiple Linear
|> Regression (MLR) models vs. ANN models for forecasting the Amsterdam
|> European Options Exchange Index. Although I found that ANNs were better
|> in forecasting than MLR, the difference was rather small.
|> As you may well know, MLR models will yield good forecasts if the
|> underlying data conforms to certain formal assumptions:
|>
|> 1) the errors are independent of each other,
|> 2) the errors all have expected value zero: E(error) = 0,
|> 3) the errors are all normally distributed,
|> 4) the errors all have the same variance,

The above assumptions are implicit in the use of least-squares training
and apply whenever any model is fitted by least squares, whether it be a
linear regression model or a multilayer perceptron (MLP) or anything
else.
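To underline the point, the criterion being minimized is identical
whatever the model form. A minimal sketch in modern Python/NumPy (the
function names here are mine, for illustration only):

```python
import numpy as np

def sse(params, model, X, y):
    """Sum-of-squared-errors criterion: the same objective
    regardless of the form of the model being fitted."""
    return np.sum((y - model(X, params)) ** 2)

def linear(X, w):
    # Linear-in-the-inputs model: predictions are X @ w.
    return X @ w

def mlp(X, params):
    # Tiny one-hidden-layer perceptron with tanh hidden units.
    W1, b1, w2 = params
    return np.tanh(X @ W1 + b1) @ w2
```

Training either model "by least squares" just means minimizing sse over
params; the error assumptions above are implicit in that choice either way.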

Let's try to make the question more precise: When will a model that is
linear in the inputs, such as multiple linear regression, generalize
better than a flexible, nonlinear-in-the-inputs model, such as an MLP?

The linear-in-the-inputs model will generalize better if the true
function to be learned is linear in the inputs, although with a large
enough training set, both models will do about equally well. As Scott
pointed out, it helps the nonlinear model if it contains a linear model
as a special case, such as by including direct input-to-output
connections in a feedforward net.
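Concretely, "direct input-to-output connections" means a skip path
alongside the hidden layer. A minimal NumPy sketch (the function name is
mine): zeroing the hidden-to-output weights leaves exactly the linear model.

```python
import numpy as np

def mlp_with_skip(X, W1, b1, w2, w_skip, b_out):
    """One-hidden-layer feedforward net plus direct input-to-output
    connections (w_skip). With w2 = 0 the hidden path vanishes and
    the net reduces exactly to the linear model X @ w_skip + b_out,
    so the linear model is a special case rather than something the
    sigmoidal hidden layer must struggle to approximate."""
    hidden = np.tanh(X @ W1 + b1)
    return hidden @ w2 + X @ w_skip + b_out
```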

The linear-in-the-inputs model will also generalize better if the
true function is mildly nonlinear but there is insufficient training
data to learn the nonlinearities accurately. With a small training set,
a nonlinear-in-the-inputs model is more liable to overfitting than
a linear model, since the nonlinear model has more weights.

Or, to turn things around, the flexible, nonlinear-in-the-inputs model
will generalize better only if the true function is nonlinear and if
there is sufficient training data to learn the nonlinearities.

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
