Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!news.duke.edu!concert!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Why hidden layer (not VERY stupid Q)
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <Cy9Iqs.D74@unx.sas.com>
Date: Wed, 26 Oct 1994 04:20:52 GMT
References:  <v9110104-251094120955@igwemc25.vub.ac.be>
Nntp-Posting-Host: hotellng.unx.sas.com
Organization: SAS Institute Inc.
Lines: 33


In article <v9110104-251094120955@igwemc25.vub.ac.be>, v9110104@is2.vub.ac.be (Johan Ovlinger) writes:
|> we can model an n layer net by matrix multiplication ( repr by :*):
|> outputs  == An * A(n-1) * ... * A1 * inputs
|> ...
|> B ==  An * A(n-1) * ... * A1
|> outputs == B * inputs
|> However, B has no hidden layers, therefore my question:
|> Why do we need hidden layers?

Because there are usually nonlinear operators (activation functions)
in between the linear operators, so you cannot reduce the model to
a linear model.
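To make the collapse (and why a nonlinearity blocks it) concrete, here is a
minimal NumPy sketch; the layer sizes, the random weights, and the choice of
tanh as the activation are my own illustration, not anything from the
original poster:

```python
import numpy as np

rng = np.random.default_rng(0)
A1 = rng.standard_normal((4, 3))   # first-layer weight matrix (3 inputs -> 4 units)
A2 = rng.standard_normal((2, 4))   # second-layer weight matrix (4 units -> 2 outputs)
x = rng.standard_normal(3)         # an arbitrary input vector

# Purely linear net: the two layers collapse into one matrix B = A2 * A1,
# exactly as in the quoted question.
B = A2 @ A1
assert np.allclose(A2 @ (A1 @ x), B @ x)

# Insert a nonlinearity (here tanh) between the layers and no single
# matrix B reproduces the mapping any more.
nonlinear_out = A2 @ np.tanh(A1 @ x)
linear_out = B @ x
assert not np.allclose(nonlinear_out, linear_out)
```

The first assertion is the questioner's reduction; the second shows it
fails as soon as an activation function sits between the linear maps.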

However, a hidden layer (a bottleneck) can be useful for dimensionality
reduction even without nonlinearities. A bottleneck layer has fewer
units than the layers on either side. For example, one linear hidden layer in a net where
the inputs and targets are the same gives you the equivalent of
principal components (technicalities omitted). If the inputs and
targets are different, you get principal components of instrumental
variables, aka maximum redundancy analysis. By adding more hidden
layers _with_ nonlinearities, you can get nonlinear generalizations
of principal components and maximum redundancy analysis.
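A sketch of the linear-bottleneck case (technicalities still omitted): by the
Eckart-Young theorem, the best rank-k linear map from the data back to itself
projects onto the top k principal axes, so no other k-unit linear bottleneck
can reconstruct better. The toy data, the value of k, and the use of SVD here
are my own illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data: 200 centered samples in 5 dimensions.
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))
X -= X.mean(axis=0)

# A linear "autoencoder" with a k-unit bottleneck computes X @ E @ D.
# The optimal choice is E = V_k, D = V_k.T, where V_k holds the top k
# right singular vectors of X -- i.e. the principal axes.
k = 2
_, _, Vt = np.linalg.svd(X, full_matrices=False)
Vk = Vt[:k].T                      # 5 x k encoder (the bottleneck weights)
recon_pca = X @ Vk @ Vk.T          # encode, then decode

# Projecting onto any other k-dimensional subspace reconstructs no better.
R, _ = np.linalg.qr(rng.standard_normal((5, k)))
recon_rand = X @ R @ R.T

def sq_err(Xhat):
    return np.sum((X - Xhat) ** 2)

assert sq_err(recon_pca) <= sq_err(recon_rand)
```

The assertion holds for any rank-k projection, which is the sense in which
the linear bottleneck "gives you the equivalent of principal components."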




-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
