Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.mathworks.com!zombie.ncsc.mil!news.duke.edu!news-feed-1.peachnet.edu!gatech!howland.reston.ans.net!news.sprintlink.net!redstone.interpath.net!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Pseudoinverse instead of backpropagation  -  www.txt [1/1]
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <D5urC4.4KL@unx.sas.com>
Date: Wed, 22 Mar 1995 17:31:16 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References:  <3kktvu$1b8r@campus.mty.itesm.mx>
Organization: SAS Institute Inc.
Lines: 63


In article <3kktvu$1b8r@campus.mty.itesm.mx>, rgmorales@iievms1.iiecuer.unam.mx (Rafael Morales Gamboa) writes:
|> We are already interested in using pseudo-inverses instead of
|> gradient-descent backprop.
|>
|> We have an edition of the book "Numerical Recipes," by William H.
|> Press, Cambridge Univ. Press, 1988, with matrix operation source
|> code, but this particular topic (pseudo-inverses) is not addressed.
|>
|> We would like to know if some of you have information relative to:
|> 1)   Experiences using pseudo-inverses instead of traditional
|>      gradient-descent backpropagation in forecasting problems.
|> 2)   Where can we find source code for pseudo-inverse calculation?
|>
|> Thanks,
|> MANUEL MEJIA LAVALLE
|> e-mail: mmlavalle@iievms1.iiecuer.unam.mx

What you need to look for are algorithms for solving linear least-squares
problems. Most elementary textbooks on numerical analysis and linear
algebra discuss this topic. The classic work is Lawson and Hanson (1974),
_Solving Least Squares Problems_.

Note that a pseudo-inverse gives you a direct solution only to linear
models, i.e. networks in which the output is a linear function of the
weights. This includes ordinary linear regression, various types of
functional-link and higher-order nets, RBF nets for which the centers
and bandwidths are fixed, etc.
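
To illustrate one such linear case: an RBF net whose centers and
bandwidths are held fixed is linear in its output weights, so those
weights can be fit directly by least squares rather than by backprop.
A minimal numpy sketch (the toy data, centers, and bandwidth below are
made up for illustration, not taken from the post):

```python
import numpy as np

# Toy 1-D regression data (illustrative values only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)

centers = np.linspace(0.0, 1.0, 10)   # fixed centers
width = 0.1                           # fixed bandwidth

# Design matrix X: one Gaussian basis column per center, plus a
# column of 1s for the bias term.
X = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))
X = np.hstack([X, np.ones((len(x), 1))])

# Least-squares weights B; lstsq uses an SVD internally, so it
# also copes with a rank-deficient X.
B, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ B
print("training RMSE:", np.sqrt(np.mean((fitted - y) ** 2)))
```

The fit is obtained in one direct solve; no learning rate, no epochs.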

Let the training data be represented by the cases-by-inputs matrix X
and the cases-by-targets matrix Y. Assume X contains a column of 1s for the
bias term. Let B be the weight matrix. Then the outputs are XB and the
errors are Y-XB. The least-squares solution is B=(X'X)^(-1)X'Y (where '
represents the matrix transpose) if X is full rank. (X'X)^(-1)X' is
called the pseudo-inverse or, more often, the generalized inverse of
X. If X is well conditioned, you can compute B by any of numerous
obvious algorithms. If X is ill-conditioned, you need to use any of
various numerically stable algorithms such as those described by Lawson
and Hanson. If X is singular, you have a choice of numerous varieties of
generalized inverses. The simplest generalized inverse just involves
dropping some subset of columns of X so that the remaining columns are
full rank. The Moore-Penrose generalized inverse has the most elegant
properties of all the generalized inverses and yields the minimum-norm
solution for B (although it is possible to get that solution without
using the Moore-Penrose inverse). The conceptually simplest way to
compute the Moore-Penrose inverse of X is to obtain the singular value
decomposition X=USV', where U and V are orthogonal and S is diagonal.
Define a diagonal matrix T with elements t_ii such that:

   t_ii = 0 if s_ii is within numerical error of zero,
   t_ii = 1/s_ii otherwise.

Then the Moore-Penrose inverse of X is VTU'. See Lawson and Hanson or
Masters, T. (1993), _Practical Neural Network Recipes in C++_,
Academic Press.
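
The SVD route above can be sketched in a few lines of numpy. The
matrix below is made up for illustration (its third column duplicates
the first, so X is rank-deficient), and the cutoff for "within
numerical error of zero" is a deliberately generous choice for this
toy example:

```python
import numpy as np

# A made-up rank-deficient matrix: column 3 duplicates column 1.
X = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0],
              [3.0, 1.0, 3.0],
              [4.0, 3.0, 4.0]])

U, s, Vt = np.linalg.svd(X, full_matrices=False)  # X = U diag(s) V'

# t_ii = 1/s_ii, or 0 when s_ii is within numerical error of zero.
tol = 1e-12 * s.max()
t = np.array([1.0 / si if si > tol else 0.0 for si in s])

# Moore-Penrose inverse: V T U'  (numpy's svd returns V', i.e. Vt).
X_pinv = Vt.T @ np.diag(t) @ U.T

# Agrees with numpy's built-in pinv, which does the same thing.
print(np.allclose(X_pinv, np.linalg.pinv(X)))
```

Zeroing (rather than inverting) the negligible singular values is what
yields the minimum-norm solution mentioned above.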

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
