Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!gatech!newsfeed.internetmci.com!in1.uu.net!news.interpath.net!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: some questions about FF-nets
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <DoAqIr.2rB@unx.sas.com>
Date: Fri, 15 Mar 1996 06:40:03 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References: <Pine.A32.3.91.960312161055.12696A-100000@tharros.dipchim.uniss. <x1KocEp.predictor@delphi.com> <4i7aj6$6vl@airgun.wg.waii.com> <4i7s7v$7ki@delphi.cs.ucla.edu> <4ialh8$k6@llnews.ll.mit.edu>
Organization: SAS Institute Inc.
Lines: 39


In article <4ialh8$k6@llnews.ll.mit.edu>, heath@ll.mit.edu (Greg Heath) writes:
|> In article <4i7s7v$7ki@delphi.cs.ucla.edu>, edwin@cs.ucla.edu (E. Robert Tisdale) writes:
|> |> ...
|> |> In general, you will need as many distinct, independent training pairs
|> |> as you have connection weights and biases in your network.  Assuming
|> |> that you have an additional constant input and an additional constant
|> |> hidden node, you will need 4*(13 + 1) + (4 + 1) = 61 training pairs.
|> |> If your inputs and output are random variables, you will need more
|> |> training data.  If you have 30 times as many distinct, independent
|> |> training pairs as you have connection weights and biases, then your
|> |> confidence in your statistical estimate of them will be almost as good
|> |> as if you had an infinite number of training pairs.

Umm, 30=infinity?  Am I missing something here?  It is true that you are
not likely to see overfitting if you have 30 times as many cases as
weights, but that's not the same thing. And I have an example with 441
training cases, 35 weights, and no noise that demonstrates spectacular
overfitting, so I wouldn't even want to swear that you can't get
overfitting with a 30:1 ratio of cases to weights.
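To make the arithmetic concrete, here is a quick sketch (plain Python, nothing NN-specific) that counts the weights and biases in the 13-4-1 network from the quoted post and works out the case-to-weight ratios under discussion:

```python
# Count weights and biases in a fully connected 13-4-1 network
# (13 inputs, 4 hidden units, 1 output, all with bias terms).
n_in, n_hid, n_out = 13, 4, 1

# Each hidden unit gets n_in weights plus a bias; the output unit
# gets n_hid weights plus a bias.
n_weights = n_hid * (n_in + 1) + n_out * (n_hid + 1)
print(n_weights)       # 61, matching 4*(13+1) + (4+1)

# The "30 times as many cases as weights" rule of thumb:
print(30 * n_weights)  # 1830 training pairs

# The ratio in the 441-case, 35-weight counterexample:
print(441 / 35)        # 12.6 cases per weight
```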

|> |> These are results that you should be able to find in any elementary
|> |> text book on Statistics.  They should be part of the FAQ for this
|> |> newsgroup.  Hope this helps, Bob Tisdale.
|> 
|> As far as rules of thumb go, 30 is a reasonable upper limit, but I thought
|> that the theoretical lower limit was 2, not 1. I typically perform a binary
|> search starting with a power of 2 in the interval (2,32) (usually 8 or 16).

The theoretical lower limit is 1 if you have no noise and the model is
specified correctly and parsimoniously. For typical NN applications,
none of those conditions hold, so the lower limit is >1, but how much
greater isn't known.
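As a toy illustration of that noise-free, correctly and parsimoniously specified case (my assumption here: a linear model, which is about as parsimonious as it gets), exactly one training pair per parameter suffices to recover the weights exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# True model: y = X @ w_true, no noise, 5 parameters.
n_params = 5
w_true = rng.standard_normal(n_params)

# Exactly one training pair per parameter: a 5x5 square system.
X = rng.standard_normal((n_params, n_params))
y = X @ w_true

# With no noise and the correct model class, solving the square
# system recovers the weights exactly (up to rounding error).
w_hat = np.linalg.solve(X, y)
print(np.allclose(w_hat, w_true))  # True
```

With noise, a misspecified model, or redundant parameters, a square system like this fits the data perfectly while estimating the weights badly, which is why the practical lower limit is greater than 1.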

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
