Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.sprintlink.net!redstone.interpath.net!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Error in AI Expert Paper
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <D6zJ5L.os@unx.sas.com>
Date: Thu, 13 Apr 1995 17:56:09 GMT
Distribution: usa
X-Nntp-Posting-Host: hotellng.unx.sas.com
References: <3mduma$qir$1@heifetz.msen.com> <3mhiar$ogh@kocrsv08.delcoelect.com>
Organization: SAS Institute Inc.
Lines: 42


In article <3mhiar$ogh@kocrsv08.delcoelect.com>, ddturner@kocrsv01.delcoelect.com writes:
|>
|> In article <3mduma$qir$1@heifetz.msen.com>, <csi@garnet.msen.com> writes:
|> >
|> > ... I still contend that overtraining is always possible when you
|> > are using real-world (noisy) data.
|>
|> I have to say that the statement that it was not possible to overtrain an
|> overdetermined network was pretty bold.  I wish they would have backed that up
|> with a little more fact/reference.  I tend to agree with you.  If your best
|> data is still noisy data, then you could certainly overtrain.

I do not know exactly what the article in question said, but if the
training cases are randomly sampled from the set of all cases you want
to generalize to, then the excess generalization error due to
overtraining goes to zero as the sample size goes to infinity. See,
for example, equation 1 in: 

   Moody, J.E. (1992), "The Effective Number of Parameters: An Analysis
   of Generalization and Regularization in Nonlinear Learning Systems",
   NIPS 4, 847-854,

which, translated into English, says:

   Generalization error = Training error + 

           2 * Noise variance * Effective number of parameters
           ---------------------------------------------------
                       Number of training cases 

and note that as the number of training cases goes to infinity, the
second term vanishes, so the training error approaches the
generalization error.
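If it helps to see the numbers, here is a minimal sketch of the formula
above (the function name and the example values are mine, not Moody's):

```python
def moody_generalization_error(training_error, noise_variance,
                               effective_params, n_train):
    """Estimated generalization error per the formula above:
    training error plus a complexity penalty that shrinks as 1/N."""
    return training_error + 2.0 * noise_variance * effective_params / n_train

# The penalty term vanishes as the number of training cases grows,
# so the estimate converges to the training error (0.10 here):
for n in (100, 1000, 100000):
    print(n, moody_generalization_error(0.10, 0.05, 20, n))
```

With 100 cases the penalty is 2*0.05*20/100 = 0.02; with 100000 cases
it is only 0.00002, illustrating why a large random sample limits the
damage overtraining can do.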

If the training cases are not representative of the cases you want
to generalize to, all bets are off.

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
