Newsgroups: comp.ai,comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!godot.cc.duq.edu!newsgate.duke.edu!news.mathworks.com!newsfeed.internetmci.com!newsreader.sprintlink.net!news.interpath.net!sas!newshost.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Noise (Was: Reference required.)
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <DrrtwE.KyA@unx.sas.com>
Date: Tue, 21 May 1996 19:53:02 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References:  <4nq6eb$bhm@sol.sun.csd.unb.ca>
Organization: SAS Institute Inc.
Keywords: Training nets, strategies, noise
Followup-To: comp.ai.neural-nets 
Lines: 49
Xref: glinda.oz.cs.cmu.edu comp.ai:38940 comp.ai.neural-nets:31645


In article <4nq6eb$bhm@sol.sun.csd.unb.ca>, rajan@ultra1.ee.unb.ca (Sreeraman Rajan) writes:
|> I am trying to collect references on the training strategies followed by researchers to
|> train neural nets in general and for pattern classification in particular.  I am also
|> interested in the strategies that they follow when the patterns are noisy.  I have not been
|> very successful in getting references related to noisy patterns.  I have references which
|> refer to advantages in having noise in the patterns during training but do not address the
|> strategy they follow in selecting the exemplars for the training set.  

The basic remedy for noise is more training data. For more details,
see one of the statistically-oriented references on neural nets such
as:

   Bishop, C.M. (1995), Neural Networks for Pattern Recognition,
   Oxford: Oxford University Press. 

   Geman, S., Bienenstock, E. and Doursat, R. (1992), "Neural Networks
   and the Bias/Variance Dilemma", Neural Computation, 4, 1-58. 

   Ripley, B.D. (1996), Pattern Recognition and Neural
   Networks, Cambridge: Cambridge University Press.

Noise in the actual data is never a good thing, since it limits the
accuracy of generalization that can be achieved no matter how extensive
the training set is. On the other hand, injecting artificial noise
(jitter) into the inputs during training is one of several ways to
improve generalization for smooth functions when you have a small
training set. See "What is jitter?  (Training with noise)" in
ftp://ftp.sas.com/pub/neural/FAQ2.html .
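
As a toy illustration of jitter (a Python sketch; the model, noise
level, and learning rate are all assumed for the example), a fresh
noisy copy of the inputs is drawn on every pass through the training
data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Small training set for a smooth 1-D function (assumed toy problem).
x = np.linspace(-1.0, 1.0, 10)
y = np.sin(np.pi * x)

# Fit a cubic polynomial by gradient descent; on each epoch the design
# matrix is rebuilt from a freshly jittered copy of the inputs.
degree = 3
w = np.zeros(degree + 1)
lr = 0.1        # assumed learning rate
sigma = 0.05    # assumed jitter standard deviation

for epoch in range(3000):
    x_jit = x + rng.normal(0.0, sigma, size=x.shape)  # inject jitter
    X = np.vander(x_jit, degree + 1)
    grad = X.T @ (X @ w - y) / len(x)
    w -= lr * grad

# Evaluate the fit on the clean (unjittered) inputs.
mse = float(np.mean((np.vander(x, degree + 1) @ w - y) ** 2))
```

Training on the jittered copies acts roughly like a smoothness
penalty on the fitted function, which is why it helps with small
training sets.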

If you have noise in the target values, the mean squared generalization
error can never be less than the variance of the noise, no matter how
much training data you have. But you can estimate the _mean_ of the
target values, conditional on a given set of inputs, to any desired
degree of accuracy by obtaining a sufficiently large and representative
training set, assuming that the function you are trying to learn is
one that can indeed be learned by the type of net you are using.
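
A quick numerical check of this point (in Python; the conditional
mean, noise level, and sample sizes are assumed for the example): the
per-case squared error against the noisy targets never drops below
the noise variance, while the estimate of the conditional mean itself
gets as accurate as you like with more data:

```python
import numpy as np

rng = np.random.default_rng(1)

true_mean = 2.0   # assumed conditional mean of the target at a fixed input
noise_sd = 0.5    # assumed standard deviation of the target noise

results = {}
for n in (10, 10_000):
    y = true_mean + rng.normal(0.0, noise_sd, size=n)
    est = y.mean()                         # estimate of the conditional mean
    mse = float(np.mean((y - est) ** 2))   # error against the noisy targets
    results[n] = (est, mse)

# As n grows, est converges to true_mean, but mse stays near
# noise_sd**2 = 0.25 -- the irreducible noise variance.
```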

Noise in the inputs also limits the accuracy of generalization, but in
a more complicated way than does noise in the targets. In a region of
the input space where the function being learned is fairly flat, input
noise will have little effect. In regions where that function is steep,
input noise can degrade generalization severely. 
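
To first order, input noise with standard deviation sigma inflates the
squared output error by roughly (f'(x))^2 * sigma^2, so the damage
depends on the local slope. A quick Python check (the function and
noise level are assumed for the example):

```python
import numpy as np

rng = np.random.default_rng(2)

f = np.tanh      # steep near x = 0, nearly flat for |x| >> 1
sigma = 0.1      # assumed input-noise standard deviation
eps = rng.normal(0.0, sigma, size=100_000)

# Mean squared output error caused by the same input noise at a
# steep point (f'(0) = 1) and at a flat point (f'(3) ~ 0.01).
steep = float(np.mean((f(0.0 + eps) - f(0.0)) ** 2))
flat = float(np.mean((f(3.0 + eps) - f(3.0)) ** 2))
```

The steep point suffers an output error near sigma^2, while the flat
point is affected orders of magnitude less.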

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
