Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!howland.reston.ans.net!news.sprintlink.net!redstone.interpath.net!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: variable elimination
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <D5K0KK.FGL@unx.sas.com>
Date: Thu, 16 Mar 1995 22:17:08 GMT
Distribution: usa
X-Nntp-Posting-Host: hotellng.unx.sas.com
References:  <1995Mar14.153301.19587@Princeton.EDU>
Organization: SAS Institute Inc.
Keywords: neural nets
Lines: 35


In article <1995Mar14.153301.19587@Princeton.EDU>, bagalman@tucson.princeton.edu (Michael D. Bagalman) writes:
|> I am seeking a little advice, if anyone out there can help me.  I am
|> interested in applying neural nets to a problem in which I have more than
|> 100 predictor/independent variables (inputs) but I know that probably
|> only 12 or so are really necessary.
|>
|> Once I make a neural net, how can I go about determining which inputs are
|> unnecessary?  Can I rank them in importance?  Basically I am wishing that
|> there is some way, even a primitive unreliable way, to get the equivalent
|> of the coefficients and t-tests that you get with linear regression.
|>
|> Please either post responses or email to "mbagalman@attmail.com".

Since there will usually be more than one weight associated with each
input, you will have an F test rather than a t test. There are three
asymptotically equivalent ways of doing such tests: the Wald test,
the Lagrange multiplier test, and the likelihood ratio test. The first
two require computing the inverse of the Hessian matrix, which is not
available with most neural net software. The likelihood ratio test
is simpler and usually more accurate: you just retrain the net 100
(or however many inputs you have) times, each time omitting one of
the inputs. The increase in the training or test error gives you
a direct measure of the importance of each input considered by itself.
The likelihood ratio can also be computed from the increase in error:
under a Gaussian error model, the likelihood ratio statistic is n times
the log of the ratio of the two error sums of squares. If the training
criterion is least squares, the likelihood ratio test works just like
the usual F test for comparing two linear models.
Of course, the distributions are only approximate with finite sample
sizes. See Gallant, A.R. (1987), Nonlinear Statistical Models, New York: Wiley.
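
The drop-one-input procedure above can be sketched in a few lines. This is
only a toy illustration: it uses an ordinary least-squares fit (via numpy)
as a stand-in for retraining the net, since for least squares the F
comparison works the same way; the data, coefficients, and sample sizes
here are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
# Toy data: only inputs 0 and 1 actually influence the target.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

def sse(X, y):
    """Residual sum of squares of a least-squares fit (stand-in for
    the training error of a fully retrained model)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

sse_full = sse(X, y)
df_resid = n - p          # residual degrees of freedom, full model

# Refit once per input, each time omitting that input, and form the
# F statistic for the single restriction (q = 1).
f_stats = []
for j in range(p):
    sse_reduced = sse(np.delete(X, j, axis=1), y)
    f_stats.append((sse_reduced - sse_full) / (sse_full / df_resid))

for j, f in enumerate(f_stats):
    print(f"input {j}: F = {f:.1f}")
```

The relevant inputs (0 and 1) produce huge F statistics; the three noise
inputs give values near the chi-squared(1) range. With a neural net you
would replace the lstsq call with a full retraining run, and remember
that the F distribution is then only an approximation.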

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
