Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!howland.reston.ans.net!ix.netcom.com!netcom.com!harris86
From: harris86@netcom.com (Donna Lee Harris)
Subject: net construction
Message-ID: <harris86D66K9A.KqA@netcom.com>
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
X-Newsreader: TIN [version 1.2 PL1]
Date: Wed, 29 Mar 1995 02:29:33 GMT
Lines: 49
Sender: harris86@netcom20.netcom.com

I am very new to the field of neural nets but find them very 
interesting so far.  My long-term goal is to eventually (meaning in a 
few years) start working on a chess program, but I know how hard that 
is, etc. (no lectures from anyone :)).  My real question for right now 
is this: I understand that you want one input node for each piece of 
information you'd like the net to consider.  I also realize that the 
number of output nodes is pretty easy to determine (1 for yes/no, 8 
for a character, etc.).  But I'm not sure how anyone reasons out how 
many "hidden layers" should be put into the net.  Here is my example: 
I want a yes/no decision to be made, namely whether or not to move a 
pawn up one square (chess analogy).  I picked 8 things to use as input, 
like this:
I1 : Is there a piece on the square in front of the pawn?
I2 : Does the pawn attack a piece after it moves up one?
I3 : Is the pawn attacked after moving up one?
I4 : Does the pawn defend a piece after it moves up one?
I5 : Is the pawn defended after moving up one?
I6 : Does the pawn defend a piece right now (without moving)?
I7 : Does the pawn attack a piece right now (without moving)?
I8 : Is the pawn defended right now (without moving)?

and the output will be one node (yes, move the piece up, or no, don't),
so I've got this much of the net built:

Input layer    Hidden layers?   Output layer
-----------    --------------   ------------
(I1)      o
(I2)      o
(I3)      o
(I4)      o                          o (O1)
(I5)      o
(I6)      o
(I7)      o
(I8)      o
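
To make the question concrete, here is the net above written out as a
forward pass (plain NumPy, just my own sketch; the hidden layer of 4
units is an arbitrary guess, which is exactly the number I don't know
how to choose, and the weights are random since the net is untrained):

```python
import numpy as np

rng = np.random.default_rng(0)

# 8 inputs (I1..I8), one hidden layer of 4 (arbitrary size), 1 output (O1)
W1 = rng.standard_normal((4, 8))   # input -> hidden weights
b1 = np.zeros(4)
W2 = rng.standard_normal((1, 4))   # hidden -> output weights
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """x holds the 8 yes/no answers I1..I8 as 0/1 values."""
    h = sigmoid(W1 @ x + b1)        # hidden layer activations
    return sigmoid(W2 @ h + b2)     # single output in (0, 1)

x = np.array([1, 0, 0, 1, 1, 0, 0, 1], dtype=float)  # one example position
y = forward(x)
print(y)  # a number strictly between 0 and 1
```

With trained weights, an output above 0.5 could be read as "yes, move
the pawn up one square."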

It seems obvious to me that more hidden layers make for a more complex 
(and thus more effective for complicated problems) system of neurons, 
but I don't understand the sizing.  Suppose, for example, I put in one 
hidden layer of 4 and another of 2 before the single output node (thus 
cutting the size in half each time): why not just put in one hidden 
layer of 6 instead?  I have also heard about the common "XOR" problem, 
apparently trying to learn the result of XOR'ing two inputs that are 
each 0 or 1, but why was that net structured the way it is?  The two 
input and one output nodes are obvious to me, but the two hidden nodes: 
why?  I appreciate all responses; since I haven't found anything on 
this subject, I suppose it must be something really basic, right? :)
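
One thing that did help me was checking that two hidden nodes are at
least *sufficient* for XOR, by writing the weights down by hand (using
step-function neurons; this is just my own sketch, not anything from a
reference): one hidden unit computes OR, the other computes AND, and
the output fires on "OR but not AND", which is exactly XOR.

```python
import numpy as np

def step(z):
    """Threshold neuron: fires (1) when its weighted input exceeds 0."""
    return (z > 0).astype(float)

# Hand-picked weights for the classic 2-2-1 XOR net.
W1 = np.array([[1.0, 1.0],    # hidden unit 1: x1 + x2 > 0.5  => OR
               [1.0, 1.0]])   # hidden unit 2: x1 + x2 > 1.5  => AND
b1 = np.array([-0.5, -1.5])
W2 = np.array([[1.0, -1.0]])  # output: OR minus AND > 0.5  => XOR
b2 = np.array([-0.5])

def net(x):
    h = step(W1 @ x + b1)
    return step(W2 @ h + b2)[0]

for x1 in (0.0, 1.0):
    for x2 in (0.0, 1.0):
        print(x1, x2, "->", net(np.array([x1, x2])))
# prints the XOR truth table: 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```

That shows two hidden units are enough, though it still doesn't tell me
how to pick the number for a problem where I can't work it out by hand.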

-Chris

