Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!oitnews.harvard.edu!purdue!ames!enews.sgi.com!news.mathworks.com!howland.erols.net!netcom.com!komodo
From: komodo@netcom.com (Tom Johnson)
Subject: Re: Training data
Message-ID: <komodoE3zJwM.4Du@netcom.com>
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
X-Newsreader: TIN [version 1.2 PL1]
References: <komodoE3yrHH.7v5@netcom.com> <32DACB6A.6976@ais.net>
Date: Tue, 14 Jan 1997 06:41:10 GMT
Lines: 48
Sender: komodo@netcom4.netcom.com

uthed@ais.net wrote:
: Tom Johnson wrote:
: > 
: > I am working on a problem in which there are trillions of possible inputs
: > and one output.
: > 
: > Here is my question: what are the pros and cons of inputting 100,000
: > random samples and training once on each, versus randomizing 5000
: > samples and then cycle-training them 20 times each?
: > 
: > Will giving it a wider range of inputs make it more sensitive, or is
: > it better to get a stronger correlation on the smaller sample?
: > 
: > I will do both experiments, but advice from people who have been there
: > will be quite helpful.
: > 
: > Thanks in advance for any replies.
: > 
: > TJ

: Whoa! . . . . 100,000 inputs? (OK, I'm breathing into my paper bag,
: trying to calm down enough to finish this reply.)

: First, there is a carryover from conventional statistics to ANNs that
: encourages a "parsimonious" solution. That means using the fewest
: inputs that still give the best-generalizing solution. There are
: technical, theoretical reasons for this, having to do with the
: significance of your analysis, that I won't go into here.

: With all these inputs, there MUST be "families" that share
: characteristics. If you can identify these, it may be a useful exercise
: to determine which element(s) of a family most influence the solution.
: After finding the best within each broad family, you can start building
: a final solution.

: It has been my experience that if you have, say, five inputs that share
: characteristics, the ANN will settle almost at random on one of them in
: terms of weights, to the exclusion of the others. Preprocessing will
: drastically reduce the training phase, and in your case may make a
: training phase possible at all. A fully connected ANN with 100,000
: inputs and any sensibly sized hidden layer is, for all practical
: purposes, unwieldy.

Pardon my imprecision. My NN configuration has about 120 inputs. Out of 
several trillion possible input patterns, is it better to present, say, 
100,000 patterns once each, or to train on 5000 patterns for 20 
iterations each?
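To make the two schedules concrete, here is a toy sketch of what I mean
(not my actual network): a single linear unit trained with plain online
delta-rule updates, with the pattern counts scaled down (10,000 vs.
500 x 20) so it runs quickly. The `make_pattern` generator and all the
numbers in it are made up purely for illustration.

```python
import random

random.seed(0)

def make_pattern(n_inputs=8):
    # Toy stand-in for a ~120-input pattern (8 inputs to keep it fast):
    # target is a noiseless linear function of the inputs.
    x = [random.gauss(0.0, 1.0) for _ in range(n_inputs)]
    y = sum(0.5 * xi for xi in x)
    return x, y

def train(patterns, epochs, lr=0.01):
    """Online (per-pattern) delta-rule training of one linear unit."""
    n = len(patterns[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        random.shuffle(patterns)          # fresh presentation order each pass
        for x, y in patterns:
            err = y - sum(wi * xi for wi, xi in zip(w, x))
            for i in range(n):
                w[i] += lr * err * x[i]
    return w

# Schedule A: many distinct patterns, one pass each.
big = [make_pattern() for _ in range(10000)]
w_a = train(big, epochs=1)

# Schedule B: fewer distinct patterns, cycled many times.
small = [make_pattern() for _ in range(500)]
w_b = train(small, epochs=20)
```

Both schedules perform the same number of weight updates (10,000 here),
so the real question is whether the net generalizes better from 10,000
distinct patterns seen once or from 500 patterns it has fit more tightly.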

Tom
