Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!news.alpha.net!uwm.edu!spool.mu.edu!howland.reston.ans.net!ix.netcom.com!netcom.com!park
From: park@netcom.com (Bill Park)
Subject: Re: Degrees of freedom in a net
Message-ID: <parkCz4BGC.9tv@netcom.com>
Followup-To: comp.ai.neural-nets
Keywords: symmetry degrees of freedom
Organization: Netcom Online Communications Services (408-241-9760 login: guest)
References: <Cz294r.LJt@cs.dal.ca>
Date: Fri, 11 Nov 1994 19:28:59 GMT
Lines: 61

In article <Cz294r.LJt@cs.dal.ca> ab340@cfn.cs.dal.ca (John Shimeld) writes:

> I'm wondering how to determine the number of degrees of freedom
> in a neural network.
> ab340@cfn.cs.dal.ca

A related question, I think: Can we remove the symmetry in a neural
network?

It seems to me that, in general, the amount of symmetry in a system
must be related to its degrees of freedom somehow, regardless of what
sort of system it is -- neural net, economy, crystal structure,
chaotic dynamic process, etc.  Can anyone shed some light here?

Consider a three-layer (input, hidden, output) feedforward net with a
full set of connections from one layer to the next.  If we train this
net with the same set of examples, but starting with different random
weights and biases, we get very different sets of weights, because any
neuron in the hidden layer can do the job of any other.  The net is
symmetric in this respect.  What "job" a neuron winds up doing is a
matter of chance, even though the neurons, taken together, end up
covering the same set of "jobs" after each training session.

It might be useful if training on a given data set always produced the
same value for each weight and bias.  Then we could quantify the
difference between two trained nets by comparing corresponding weights
and biases.  That might let us say something about the similarities
and differences between the data sets on which they were trained.
Perhaps others can think of more ways to use nets that always "come
out the same."
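
One way to attempt this after the fact: put each trained net into a
canonical form before comparing, by sorting the hidden neurons on some
fixed key.  A sketch (the sorting key is my own arbitrary choice, and
this factors out only the permutation symmetry, not others such as the
tanh sign-flip symmetry):

import numpy as np

def canonicalize(W1, b1, W2):
    # Sort hidden neurons by descending L1 norm of incoming weights,
    # so nets that differ only by a permutation line up.
    order = np.argsort(-np.abs(W1).sum(axis=1))
    return W1[order], b1[order], W2[:, order]

def weight_distance(net_a, net_b):
    # Compare two same-sized trained nets weight-by-weight after
    # canonicalizing both; each net is a (W1, b1, W2) triple.
    a = np.concatenate([p.ravel() for p in canonicalize(*net_a)])
    b = np.concatenate([p.ravel() for p in canonicalize(*net_b)])
    return np.linalg.norm(a - b)

Of course, this only helps if the two trainings really did arrive at
the same set of "jobs" in the first place.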

One approach to symmetry-breaking would be to add to the training
procedure a systematic bias of some sort that treats neurons in a
layer differently.  For example, number the neurons in the hidden
layer consecutively from 1 up.  Then number their input weights
consecutively from 1 up, starting with the weights of the first hidden
neuron and proceeding through the other neurons in order.  During
training, after each weight update cycle, shift a little of the value
of each weight n to weight n-1, n > 1.  I think the result would be
that, in a successfully trained net, the lower-numbered neurons would
have the largest average incoming weights.  I am under the impression
that a low average magnitude of incoming weights is one useful
criterion for "pruning" a neuron (is this correct?).  If so, perhaps
we could say that the lower-numbered neurons would then be doing the
more "important" partitionings of the input space.

The same process could be applied to each layer in a net.

A similar process could be applied to the biases of the neurons in
each layer.  However, it's not clear that a larger bias means that a
neuron is doing something more important.  Comments?

One may ask whether this sort of systematic "weight drift" or "bias
drift" would prevent training.  Perhaps this is an open question worth
investigating.

Gentlemen: Start your proposals!

Bill Park
=========
-- 
Grandpaw Bill's High Technology Consulting & Live Bait, Inc.
