Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!portc02.blue.aol.com!howland.erols.net!cs.utexas.edu!newshost.convex.com!newsgate.duke.edu!interpath!news.interpath.net!news.interpath.net!sas!newshost.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Scaling Question
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <E3yxIE.21B@unx.sas.com>
Date: Mon, 13 Jan 1997 22:37:26 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References: <32AA4E4C.7ABB@netreach.net> <32AC2994.68D1@ais.net> <E2xM0n.2r1@unx.sas.com> <Pine.SOL.3.91.961231170451.6371F-100000@miles> <E3Lqp5.Mq3@unx.sas.com> <Pine.SOL.3.91.970107051441.7103A-100000-100000@miles>
Organization: SAS Institute Inc.
Lines: 72


In article <Pine.SOL.3.91.970107051441.7103A-100000-100000@miles>, Greg Heath <heath@ll.mit.edu> writes:
|> On Mon, 6 Jan 1997, Warren Sarle wrote:
|> ...
|> > If all the inputs have a small coefficient of variation, it
|> > is quite possible that all the initial hyperplanes will miss the
|> > data entirely.  With such a poor initialization, local minima are
|> > very likely to occur. It is therefore important to center the
|> > inputs to get good random initializations. 
|> 
|> I can see that it may get brutal for unscaled data. But for 0-1 
|> scaling and random presentation order it can't really be all that bad.
|> Or can it.?

It depends on a lot of details, but [0,1] scaling can be fairly bad for
nonpathological examples, one of which is shown below. For a nasty
problem like the two spirals with a single hidden layer, where it's hard
to find a global minimum in the first place, it could be well nigh
impossible with [0,1] scaling (I'm speculating--I haven't actually tried
it).

Number of runs, out of 100 random initializations, that reached a
global optimum, with 3 hidden units and softmax output (data shown at
end of post):

   Standard
   Deviation
   of
   Initial    Input Scaling
   Weights    [-1,1]  [0,1]
   ------------------------
    0.01         1      1
    0.1         18      3
    1.          45     33
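
The geometric point in the quoted paragraph (initial hyperplanes missing
the data) can be checked roughly without training anything. The sketch
below is only an illustration under assumptions that are not part of the
experiment above: two inputs, weights and bias drawn independently from
a normal distribution with a common standard deviation, and "hitting the
data" approximated by the hyperplane crossing the bounding box of the
inputs. The names crosses_box and fraction_crossing are made up for this
sketch.

```python
import random

def crosses_box(w1, w2, b, lo, hi):
    """True if the line w1*x + w2*y + b = 0 passes through [lo, hi]^2."""
    # A linear function on a box takes its extreme values at the corners,
    # so the line crosses the box iff the corner values straddle zero.
    corners = [(x, y) for x in (lo, hi) for y in (lo, hi)]
    values = [w1 * x + w2 * y + b for x, y in corners]
    return min(values) <= 0.0 <= max(values)

def fraction_crossing(lo, hi, n_units=10000, sd=1.0, seed=0):
    """Fraction of randomly initialized hidden units whose hyperplane
    crosses the box [lo, hi]^2 (assumed N(0, sd) weights and bias)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_units):
        w1, w2, b = (rng.gauss(0.0, sd) for _ in range(3))
        hits += crosses_box(w1, w2, b, lo, hi)
    return hits / n_units

# Same seed, hence the same random hyperplanes for both scalings.
centered = fraction_crossing(-1.0, 1.0)
unit = fraction_crossing(0.0, 1.0)
print(f"[-1,1]: {centered:.2f}   [0,1]: {unit:.2f}")
```

Since [0,1]^2 lies inside [-1,1]^2, any hyperplane that crosses the
smaller box also crosses the larger one, so the centered fraction can
never be the smaller of the two. Multiplying the weights and the bias by
a common factor does not move the hyperplane, so under these assumptions
the sd argument changes only the steepness of the units, not where the
hyperplanes fall.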

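/* Read the character picture below into one observation per  */
/* nonblank cell: x = column, y = row, c = class (1, 2, or 3). */
data rings;
   keep x y c;
   input line $char30.;   /* $char keeps leading blanks */
   y=_n_;                 /* row number of the picture */
   do x=1 to 30;
      if substr(line,x,1)^=' ' then do;
         c=input(substr(line,x,1),1.);   /* digit to numeric class */
         output;
      end;
   end;
cards;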
            333
          3333333
        3333   33333
      3333        3333
     333   22222     333
    333  222  2222    3333
   333  222     222    3333
  333  222   1   222    3333
 3333  222  111   222    333
 3333  222  111   222    333
  333   222  1   222    333
   333   222    222    333
    333   222 222     333
     333   22222    3333
       333        333
         333     333
          33333333
;
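
For readers without SAS, here is a rough Python equivalent of the data
step above, followed by the two input scalings compared in the table.
It is a sketch, not the code used for the experiment: parse_grid and
rescale are names invented here, simple min-max scaling is assumed, and
the three-line toy picture is a made-up stand-in for the real data.

```python
def parse_grid(text):
    """Return (x, y, c) tuples, one per nonblank character:
    x = 1-based column, y = 1-based row, c = class digit."""
    points = []
    for y, line in enumerate(text.splitlines(), start=1):
        for x, ch in enumerate(line, start=1):
            if ch != " ":
                points.append((x, y, int(ch)))
    return points

def rescale(values, lo, hi):
    """Linearly map values so that the minimum goes to lo and the
    maximum goes to hi. Assumes the values are not all equal."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    return [lo + (hi - lo) * (v - vmin) / span for v in values]

# Toy picture (hypothetical), in the same format as the cards above.
toy = " 3 \n313\n 3 "
points = parse_grid(toy)
xs = [p[0] for p in points]

unit_x = rescale(xs, 0.0, 1.0)       # [0,1] scaling
centered_x = rescale(xs, -1.0, 1.0)  # [-1,1] scaling
print(unit_x)
print(centered_x)
```

To use it on the real data, pass the picture between "cards;" and the
closing ";" to parse_grid as one string, with the leading blanks intact,
and rescale the x and y columns separately.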


-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
 *** Do not send me unsolicited commercial or political email! ***

