Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!godot.cc.duq.edu!newsgate.duke.edu!news.mathworks.com!newsfeed.internetmci.com!in1.uu.net!news.thepoint.net!not-for-mail
From: myrddin@iosys.net (Myrddin Emrys)
Subject: Re: How to classify with only 1 class?
Message-ID: <31a2fd3a.95084169@tisXnews.thepoint.net>
Date: Wed, 22 May 1996 12:17:53 GMT
References: <4m2rid$51b@leopard.wmin.ac.uk> <4nshlh$l5b@llnews.ll.mit.edu>
Organization: SimBusiness
Reply-To: myrddin@iosys.net
X-Newsreader: Forte Agent .99e/32.227
Lines: 79

We intercepted this transmission from heath@ll.mit.edu (Greg Heath):

:In article <4ni509$7kh@lorne.stir.ac.uk>, Kevin Swingler <kms@cs.stir.ac.uk>
: writes:
:|> A method I've found to work well proceeds as follows:
:|> 
:|> 1. Build a MLP with the same number of inputs as outputs
:
:Sounds like principal component analysis (PCA).
:
:|> 2. Ensure the MLP has FEWER hidden units than inputs
:
:No hidden units for 1-D problems? What about multiple 1-D Gaussian mixtures?
:
:|> 3. Train the network to take an input pattern and reproduce it on the output
:|>    layer. This is made harder by the compression on the hidden layer.
:
:Yep. PCA!
:
:|> 4. Train until the error is sufficiently low.
:
:I love that s-word. I use it a lot.
:
:|> 5. Record the mean and standard deviation of the training error
:|> 
:|> To test new data:
:|> 
:|> 1. Present the pattern to be classified
:|> 2. Calculate the error (MSE between input pattern and reproduced output
:|> pattern)
:|> 3. If the error is significantly far from the mean training error (t-test)
:|> 4. Then the data is not from the original class.
:|> 
:|> This is similar to doing a non linear projection onto a lower space and 
:|> looking for outliers, but easier.
:
:But, in general, it won't work because only the first principal coordinate
:is learned by the net. What happens if, instead of only one dominant eigenvalue 
:of the covariance matrix you have many that are nearly equal, i.e., an almost 
:spherical distribution of data? My guess is you'd have to learn the dominant 
:multi-dimensional subspace instead of one direction. This should lead to
:Kohonen's Novelty Filter!
:
:See my earlier posts in this thread re density hypothesis testing.
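(Aside for anyone following along at home: Kevin's recipe, with a linear
one-component bottleneck standing in for the trained MLP, which is the PCA
connection Greg keeps pointing at, can be sketched in a few lines. This is
purely my own toy code and made-up data, not anything from Kevin's book.)

```python
# Toy sketch of the reconstruction-error novelty test: compress to a
# 1-D bottleneck (here, the first principal direction instead of a
# trained MLP), reconstruct, and flag patterns whose error is far from
# the training mean.  All names and thresholds are my own inventions.
import numpy as np

rng = np.random.default_rng(1)

# "Training" class: points scattered near the line y = 2x.
x = rng.normal(size=500)
train = np.column_stack([x, 2 * x + 0.1 * rng.normal(size=500)])

# Steps 1-3: compress and reproduce the input.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
w = vt[:1]                       # 1-D bottleneck (first principal direction)

def reconstruct(p):
    """Project onto the bottleneck, then map back to input space."""
    return mean + ((p - mean) @ w.T) @ w

def mse(p):
    """Reconstruction error between input and reproduced pattern."""
    return np.mean((p - reconstruct(p)) ** 2, axis=-1)

# Step 5: mean and standard deviation of the training error.
err = mse(train)
mu, sigma = err.mean(), err.std()

# Test phase: a pattern whose error is far from the training mean is
# judged to be outside the original class.
def is_novel(p, k=3.0):
    return abs(mse(p) - mu) > k * sigma

print(is_novel(np.array([1.0, 2.0])))    # on the line  -> False
print(is_novel(np.array([3.0, -6.0])))   # off the line -> True
```

The 3-sigma cutoff here is a crude stand-in for Kevin's t-test; the point
is only that an out-of-class pattern reconstructs badly through the
bottleneck, so its error lands far from the training distribution.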

GAK! I'm reminded of numerous comedy sketches where the scientist spouts
meaningless technobabble that sounds real.

Now, I'm not saying I think Greg is spouting meaningless technobabble (however
much it sounds like it to my amateur ear :), but please, can anyone else see the
humor there?

I actually understood everything up to that paragraph, so if someone would be
kind enough to translate, my gratitude would be endless.

Here is my interpretation (guessing at possible meanings):

But, in general, it won't work because only [a single dimension represented in
the data] is learned by the net. What happens if, instead of [the data set lying
on a plane/line] [its dimensions in multi-space] are nearly equal, i.e., an
almost spherical distribution of data? My guess is you'd have to learn the
[primary multi-dimensional cross-section??]. This should lead to [a known method
of highlighting aberrant data called Kohonen's Novelty Filter]!

Ok, how'd I do?
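And to check my reading of the "nearly equal eigenvalues" bit, here's a quick
numerical poke (again my own sketch, nothing from Greg):

```python
# Greg's point, as I understand it: the bottleneck trick only works
# when one eigenvalue of the covariance matrix dominates.  For a
# nearly spherical cloud the eigenvalues come out almost equal, so no
# single direction summarizes the data.
import numpy as np

rng = np.random.default_rng(0)

# Elongated cloud: variance 9 along x, 1 along y -> one dominant eigenvalue.
elongated = rng.normal(size=(5000, 2)) * np.array([3.0, 1.0])

# Spherical cloud: equal variance in every direction.
spherical = rng.normal(size=(5000, 2))

def eig_ratio(data):
    """Ratio of largest to smallest eigenvalue of the sample covariance."""
    vals = np.linalg.eigvalsh(np.cov(data.T))
    return vals.max() / vals.min()

print("elongated ratio:", eig_ratio(elongated))   # roughly 9
print("spherical ratio:", eig_ratio(spherical))   # close to 1
```

When the ratio is near 1, a one-direction projection throws away as much as it
keeps, which (if I've got it right) is why Greg says you'd need the dominant
multi-dimensional subspace instead.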

:|> For more details, see Applying Neural Networks, A Practical Guide. Academic
:|> Press. 1996. There is a section in there devoted to this sort of problem.
:|> 
:|> Kevin Swingler
:|> 
:
:Gregory E. Heath     heath@ll.mit.edu      The views expressed here are
:M.I.T. Lincoln Lab   (617) 981-2815        not necessarily shared by 
:Lexington, MA        (617) 981-0908(FAX)   M.I.T./LL or its sponsors
:02173-0073, USA

--
Myrddin Emrys                                 mailto:myrddin@iosys.net

