Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!gatech!news.mathworks.com!nntp.primenet.com!newspump.sol.net!spool.mu.edu!agate!newsgate.duke.edu!interpath!news.interpath.net!sas!newshost.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Help: training problems: scale?
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <DwKKEn.L8H@unx.sas.com>
Date: Fri, 23 Aug 1996 02:32:47 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References:  <5pg25gr7yo.fsf@sst10c.lanl.gov>
Organization: SAS Institute Inc.
Lines: 44


In article <5pg25gr7yo.fsf@sst10c.lanl.gov>, Reid Priedhorsky <priedhor@sst10c.lanl.gov> writes:
|> ...
|> What happens (I think) is that the net expends all its effort in training
|> for the high concentration: an error of 0.1 on an output node is fairly
|> irrelevant if the desired output is 1.0; however if the desired output is
|> 0.001 it is overwhelming. If I demand accuracy to 0.0001 over all samples,
|> however, the higher concentrations are likely to overfit, are they not?

That's correct, assuming you're using the usual least-(mean-)squares
error function. The network would treat an error of .1 as having the
same importance regardless of whether the target concentration is 1.0
or 0.001.
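
As a minimal sketch of that point (not from the original post), here is the usual per-sample squared-error loss applied to two samples that are both off by the same absolute amount, 0.1. The loss is identical even though the relative error differs by four orders of magnitude:

```python
def squared_error(target, output):
    """Per-sample least-squares loss: (t - y)^2."""
    return (target - output) ** 2

# Same absolute error of 0.1 in both cases:
e_high = squared_error(1.0, 0.9)      # relative error: 10%
e_low  = squared_error(0.001, 0.101)  # relative error: 10000%

# e_high and e_low are both (essentially) 0.01, so gradient descent
# on this loss treats the two mistakes as equally important.
```

This is why training effort concentrates on getting the large targets right: a 10% miss on a target of 1.0 costs as much as a catastrophic miss on a target of 0.001.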

|> How can I fix this? I have tried logarithmically normalizing the input to
|> no avail; however I can't figure out how to do the same with the output since
|> it is already in the range [0, 1].

If you want the importance of an error (in terms of concentrations) to
be measured relative to the magnitude of the actual concentration, then
taking logarithms of the concentrations would be appropriate. But this
would be asking for perfect accuracy for concentrations of zero, and
your computer would probably object to taking logarithms of zero.  The
most common approach to this problem is to add some small constant to
the target concentrations before you take logarithms. This would mean
that errors for large concentrations would have importance relative to
the magnitude of the actual concentration, but as the concentration
decreased, there would be a gradual shift toward using the absolute
(rather than relative) size of the error as its importance. The size
of the small additive constant determines the rate at which that
shift occurs.
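
A small sketch of that transform (the constant 1e-4 here is an arbitrary illustrative choice; the post only says "some small constant"):

```python
import math

EPS = 1e-4  # hypothetical small additive constant

def transform(c):
    """Log-transform a concentration, shifted so c = 0 is legal."""
    return math.log(c + EPS)

# For concentrations well above EPS, a fixed 10% relative error maps to
# a nearly constant difference on the transformed scale:
d_large = transform(1.0 * 1.1) - transform(1.0)    # ~ log(1.1) ~ 0.095
d_mid   = transform(0.01 * 1.1) - transform(0.01)  # still ~ 0.094

# As c drops toward (or below) EPS, the transform flattens out, and the
# same relative error maps to a much smaller difference -- the effective
# importance of an error shifts toward its absolute size:
d_tiny  = transform(1e-5 * 1.1) - transform(1e-5)  # ~ 0.009

# And zero concentrations no longer blow up:
z = transform(0.0)  # log(EPS), finite
```

Training a least-squares network on `transform(c)` as the target then makes errors on large concentrations count in roughly relative terms, while errors near zero count in roughly absolute terms, with `EPS` setting where the crossover happens.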

Note that there is no need for the target values to be in [0, 1]. This
is the topic of another current thread titled "Re: Apply sigmoid
activation fun".

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
