Newsgroups: comp.ai.neural-nets
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: NN vs. Linear Regression
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <E4My5n.6ns@unx.sas.com>
Date: Sun, 26 Jan 1997 21:53:47 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References: <01bc094b$5b201e40$cb9901be@IS3203>
Organization: SAS Institute Inc.
Lines: 36


In article <01bc094b$5b201e40$cb9901be@IS3203>, "Mark Walker" <mwalker@aisvt.bfg.com> writes:
|> I'm attempting to approximate non-linear, continuous functions of several
|> variables using MLP networks with sigmoidal hidden units, but, due to
|> pretty decent performance using a simple linear regression including 2nd
|> and 3rd order terms, am finding it hard to justify the computational
|> overhead.

Linear regression is basically an MLP with no hidden layer, i.e. the
inputs are connected directly to the outputs. If you construct an MLP
with a hidden layer _and_ with direct connections from inputs to
outputs, it is guaranteed to have training error at least as good as the
linear regression, and probably better, if your software is any good at
all. However ...
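To make the equivalence concrete: a minimal numpy sketch (the data, target
function, and learning rate here are invented for illustration) showing that
ordinary least squares and a "network" with no hidden layer, trained by
gradient descent on squared error, arrive at the same solution:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))            # 20 patterns, 4 inputs
y = np.sin(X[:, 0]) + X[:, 1] ** 2      # some nonlinear target

# Linear regression = network with no hidden layer: closed-form least squares.
Xb = np.hstack([X, np.ones((20, 1))])   # append a bias column
w_ols, *_ = np.linalg.lstsq(Xb, y, rcond=None)
rmse_ols = np.sqrt(np.mean((Xb @ w_ols - y) ** 2))

# The same linear model trained by gradient descent (i.e. an MLP with zero
# hidden units) converges to essentially the same weights and training error.
w = np.zeros(5)
for _ in range(50000):
    w -= 0.02 * Xb.T @ (Xb @ w - y) / 20
rmse_gd = np.sqrt(np.mean((Xb @ w - y) ** 2))

print(rmse_ols, rmse_gd)
```

An MLP that adds hidden units *and* keeps these direct input-output weights
can only lower the training error further, since setting the hidden-layer
output weights to zero recovers the linear fit exactly.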

|> With 4 inputs, it takes on the order of 40 hidden units and 100,000 epochs
|> of 20 input patterns to get RMSE less than a linear regression using 13
      ^^^^^^^^^^^^^^^^^
|> input combinations with no cross terms (RMSElr = 0.14% full scale, and
|> RMSEnn=0.04% full scale).  The results are encouraging, but I am sure I can
|> further reduce the lr error with minimal effort (by adding a few more
|> terms). 

With about 6 more terms, you should be able to get the training error
down to zero: once a regression has as many free parameters as you have
training patterns (20), it can interpolate the data exactly. This is
called "overfitting" and is usually not desirable.
See "What is overfitting and how can I avoid it?" in the Neural Network
FAQ, part 3 of 7: Generalization, at
ftp://ftp.sas.com/pub/neural/FAQ3.html.
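The arithmetic behind "about 6 more terms": with 20 training patterns, any
regression whose design matrix has 20 linearly independent columns fits the
training data perfectly, no matter what the target is. A throwaway numpy
illustration (the inputs, target, and particular choice of extra terms are
invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20                                   # training patterns, as in the post
X = rng.normal(size=(n, 4))              # 4 inputs

# Build a polynomial design matrix and keep adding terms until there are
# as many columns as training patterns.
cols = [np.ones(n)]                                                  # intercept
cols += [X[:, i] for i in range(4)]                                  # 4 linear
cols += [X[:, i] * X[:, j] for i in range(4) for j in range(i, 4)]   # 10 quadratic
cols += [X[:, i] ** 3 for i in range(4)]                             # 4 cubic -> 19
cols.append(X[:, 0] ** 2 * X[:, 1])                                  # 20th column
A = np.column_stack(cols)

y = rng.normal(size=n)                   # any target at all, even pure noise
w, *_ = np.linalg.lstsq(A, y, rcond=None)
rmse = np.sqrt(np.mean((A @ w - y) ** 2))
print(rmse)                              # ~0: a perfect (over)fit of pure noise
```

Zero training error on arbitrary noise is exactly why training RMSE alone
says nothing about how the model will generalize.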

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
 *** Do not send me unsolicited commercial or political email! ***

