Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.mathworks.com!news.bluesky.net!news.sprintlink.net!crash!apagani.cts.com!user
From: jdirbas@partech.com (Joseph Dirbas)
Subject: Re: Relation between MSE and % correct class.?
Organization: pgsc
Date: Wed, 14 Jun 1995 22:55:48 GMT
Message-ID: <jdirbas-1406951455480001@apagani.cts.com>
References: <3rjtl8$lbt@uuneo.neosoft.com>  <19950613090858.BKAMP@pi0192.kub.nl> <19950614095953.BKAMP@pc0055.kub.nl>
Sender: news@crash.cts.com (news subsystem)
Nntp-Posting-Host: apagani.cts.com
Lines: 74

In article <19950614095953.BKAMP@pc0055.kub.nl>, BKAMP@kub.nl wrote:

> In Article <3rjtl8$lbt@uuneo.neosoft.com> "hav@neosoft.com" says:
> > >   BKAMP@kub.nl  (Kamp B.) writes:
> > >  Hi,
> > >   
> > >  I am working on a NN that will classify data into seven classes.
> > >  I measure performance both by Mean Square Error (MSE) and the
> > >  percentage of correct classification: I use one output which
> > >  ranges from 0 to 6, and round the classification to the nearest
> > >  integer and compare to the actual class.
> > >   
> > >  My question: While measuring performance on many trainings,
> > >  I found that there is only a very vague relation between the MSEs 
> > >  and the percentage correct classification.
> > >  How could this be?       (I am not a great statistician)
> > >   
> > >  Regards,
> > >   
> > >  Bart Kamp
> > 
> > Hi Bart,
> > 
> > A question: How are you calculating MSE - that is, do you use the
> > actual (continuous) network output or the rounded output?  
> > 
> > Also - to parrot recent postings here - how are you calculating %err
> > for class zero - have you tried using classes 1...7 instead? 
> > 
> > In any event, as has been pointed out recently here, there are certain
> > fundamental differences between absolute and relative error. 
> > 
> > Also, since RMS (so MSE?) seems related to difference-vector measurement and
> > % seems more related to quantity measurements (I'm ducking Warren {;-),
> > I've often wondered about what differences in dynamics might exist between
> > single-output vs multiple output topologies for the same problem. (I know, 
> > I know...one output is simply a one-element vector ... still I wonder...)
> > Have you tried using 3 (or even 7) outputs instead of just 1?
> > 
> > ooba ooba,
> > Horace
>  
> Indeed, I am now working on output definitions other than just one output
> that ranges 0-6. Preliminary results suggest that a 7-output-neuron
> definition (one neuron per class) slightly outperforms the other
> alternatives, including the 3-output-neuron version you mentioned (I
> suppose we are talking about the same idea: 3 outputs ranging 0-1, so
> 2^3 = 8 permutations, enough combinations to define 7 classes). That
> definition performed slightly worse, probably because my classes (bond
> ratings) have a ranking, which is ignored by that encoding.
>  
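To make the three encodings being compared concrete, here is a minimal
sketch (my own illustration, not code from anyone in this thread; the
array layout and example labels are assumptions):

```python
# Sketch of the three target encodings discussed above, for K = 7 classes.
# Example labels and layout are illustrative, not from the original posts.
import numpy as np

K = 7
labels = np.array([0, 3, 6])            # a few example class labels in 0..6

# 1) single ordinal output: one neuron whose target ranges 0..6
ordinal = labels.astype(float)

# 2) one-hot: one output neuron per class (the 7-output version)
one_hot = np.eye(K)[labels]

# 3) 3-bit binary code: 2**3 = 8 patterns, enough for 7 classes,
#    but this code ignores any ranking among the classes
binary = ((labels[:, None] >> np.arange(3)) & 1).astype(float)

print(ordinal)       # [0. 3. 6.]
print(one_hot[1])    # class 3 -> [0. 0. 0. 1. 0. 0. 0.]
print(binary[1])     # 3 = 0b011 -> [1. 1. 0.]
```

Note how the binary code places classes 3 (011) and 4 (100) at maximal
Hamming distance even though they are adjacent in the ranking, which is
one plausible reason it underperforms on ranked classes like bond ratings.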

You shouldn't use one output neuron for unordered classes, because you're
teaching the network that order is important (that is, that calling class 2
class 3 is not as bad as calling class 2 class 4).  If the classes are not
ordered, then you should do what has been suggested already: one output
neuron per class.  (If your output classes are inherently ordered, like
purple, blue, ..., then a single output neuron is fine, and even
encouraged.)
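As for Bart's original puzzle, a toy example (my own numbers, not from the
thread) shows why MSE on the continuous output and % correct on the
rounded output need only be loosely related: one net can make many
near-boundary errors and still classify perfectly, while another with
lower MSE misclassifies.

```python
# Why MSE and % correct can disagree with a single rounded output.
# Numbers are invented for illustration.
import numpy as np

targets = np.array([2.0, 2.0, 2.0, 2.0])

# Net A: every output sits just inside the rounding boundary,
# so it is 100% correct, yet its squared errors are sizeable.
out_a = np.array([2.45, 1.55, 2.45, 1.55])

# Net B: three exact hits plus one output past the boundary,
# so only 75% correct, yet its MSE is lower than Net A's.
out_b = np.array([2.0, 2.0, 2.0, 2.6])

def mse(out, t):
    return float(np.mean((out - t) ** 2))

def pct_correct(out, t):
    return float(np.mean(np.round(out) == t)) * 100.0

print(mse(out_a, targets), pct_correct(out_a, targets))  # ~0.2025, 100.0
print(mse(out_b, targets), pct_correct(out_b, targets))  # ~0.09,    75.0
```

So minimizing MSE pulls outputs toward the targets on average, but %
correct only cares which side of the rounding boundary each output lands
on, hence the vague relation Bart observed.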

-- 
Joseph Dirbas                    |
PAR Government Systems Corp.     |
1010 Prospect St., Suite 200     |
La Jolla, CA 92037               |
jdirbas@partech.com              |
