Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!howland.reston.ans.net!swiss.ans.net!solaris.cc.vt.edu!news.mathworks.com!news.duke.edu!concert!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: He who knows what he does not know is wise
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <Cz6B9q.KJL@unx.sas.com>
Date: Sat, 12 Nov 1994 21:20:14 GMT
References:  <3a2j8a$cp4@fileserv.aber.ac.uk>
Nntp-Posting-Host: hotellng.unx.sas.com
Organization: SAS Institute Inc.
Lines: 54


In article <3a2j8a$cp4@fileserv.aber.ac.uk>, dbk@aber.ac.uk (Douglas B. Kell) writes:
|> In article <Cz47Ay.G4E@unx.sas.com> saswss@hotellng.unx.sas.com (Warren Sarle) writes:
|> >
|> >In article <TAP.94Nov10180312@eagle.epi.terryfox.ubc.ca>, tap@eagle.epi.terryfox.ubc.ca (Tony Plate) writes:
|> >|>
|> >|> In article <parkCyxFB0.5Ko@netcom.com>, park@netcom.com (Bill Park) writes:
|> >|> |> What are some good ways to get a neural network to report that the inputs
|> >|> |> you gave it are too different from its training set to permit it to
|> >|> |> give you an accurate answer?
|> >|> ...
|> >|> One way of doing this is to use another neural network as an
|> >|> auto-encoder, and then treat the goodness of reconstruction
|> >|> of a new pattern as a measure of familiarity.  The idea is
|> >|> that the auto-encoder will only be able to accurately
|> >|> reconstruct patterns it is familiar with.
|> >
|> >That idea, unfortunately, is wrong. Consider principal components
|> >as the autoencoder. Patterns will be accurately reconstructed if
|> >they are near the subspace spanned by the components, regardless
|> >of how far they are from the training data.
|> >
|> I should know better than to query Warren when he talks about
|> principal components, but.....this trick of autoencoding has been
|> used to great effect in seeing if a SENSOR (input) in a set-up
|> for controlling e.g. chemical plant is working or not. The idea is to
|> train with data that come from sensors that are KNOWN to be behaving.
|> You then test the sensors when one is NOT working, and the autoassociative
|> net flags it because it doesn't reproduce the output that the faulty
|> sensor is giving you. The idea is essentially that after training
|> the net has learnt the DOMAIN in which the process works. Sensors that
|> lie make the process appear to go out of the training domain, and this
|> is what is flagged. Which is, in my reading of it, the question that
|> was being asked (how to tell if test data are outside the domain of
|> training data).

As Tony Plate pointed out, the autoencoder method is not guaranteed
to fail, but "in some situations it will fail miserably." An
autoencoder can encode the training data well if the training data
fall near some linear (in the case of PCA) or nonlinear (in the case
of a NN) manifold. The net learns this manifold, not the training
domain. So if a test point lies in this manifold but is far from the
training data, the autoencoder will not detect the problem. If the
method seems to work in practical situations, that just means you
haven't had any bad luck yet. :-)
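To make the failure mode concrete, here is a small sketch of my own (the
data and numbers are illustrative, not from any real sensor setup): a
linear "autoencoder" built from the first principal component. A test
point that lies ON the learned manifold but far OUTSIDE the training
domain reconstructs almost perfectly, so reconstruction error never
flags it, while a point near the training data but off the manifold
gets a large error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: 200 points hugging the x-axis with x in [0, 1].
# The interval [0, 1] is the training DOMAIN; the x-axis is the manifold.
x = rng.uniform(0.0, 1.0, size=200)
train = np.column_stack([x, 0.001 * rng.standard_normal(200)])
mean = train.mean(axis=0)

# "Train" the linear autoencoder: first principal component via SVD.
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
pc = vt[0]  # unit vector along the learned 1-D manifold

def reconstruction_error(point):
    """Distance from a point to its projection onto the PC line."""
    d = point - mean
    return float(np.linalg.norm(d - (d @ pc) * pc))

on_manifold_far = np.array([100.0, 0.0])  # on the manifold, far outside [0, 1]
off_manifold    = np.array([0.5, 1.0])    # inside the x-range, off the manifold

# The faraway point gets a tiny reconstruction error; the nearby
# off-manifold point gets a large one.  The autoencoder has learned
# the manifold, not the training domain.
```

The nonlinear (NN) case behaves the same way in principle: whatever
surface the net learns to reconstruct, points on that surface look
"familiar" no matter how far they sit from the training data.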

|> If there are follow-ups on this can people email me a copy as I'll
|> be in the US next week.
|> Douglas.
-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
