Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!howland.reston.ans.net!swrinde!gatech!news.sprintlink.net!redstone.interpath.net!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Cascading models question
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <D8Murr.Hnp@unx.sas.com>
Date: Mon, 15 May 1995 18:45:27 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References: <3os15l$flh@uuneo.neosoft.com> <xwy876Y.predictor@delphi.com> <3p7kcc$gtb@lyra.csx.cam.ac.uk>
Organization: SAS Institute Inc.
Lines: 66


In article <3p7kcc$gtb@lyra.csx.cam.ac.uk>, "G. D. Cook" <gdc> writes:
|> The following references provide a comprehensive coverage of the theory
|> of stacking multiple models :
|>
|> D. H. Wolpert, Stacked Generalization
|> Neural networks, vol 5, no 2, 1992
|>
|> L. Breiman, Stacked Regression
|> Tech. report 367,
|> Department of statistics,
|> University of California, Berkeley, 1992

No, "stacking" is different from "cascading". Stacking means training
several networks (or other models) independently and then combining
the predictions. Here is Will's description of cascading:

> "Cascading models", here referring to the
> following: a model is built using the original input variables and is
> tested using an appropriate validation technique (CV, etc.).  Now, a
> second model is built which accepts as input the original input
> variables as well as the output of the first model.  It has been
> proposed that a hierarchy of such models might be used, with the
> first model handling the broadest, most general aspects of the problem,
> and successive models in the cascade handling finer and finer details
> of the problem, in effect "correcting" the cascade which is composed
> of their predecessors.  Each model (in this case) would be constructed
> separately, so that the first model would be built, frozen (no more
> adjustment or learning), then the second one would be built and frozen,
> etc.
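For concreteness, here is a minimal numerical sketch of that cascade (my
own illustration, not from Will's post), using ordinary least squares for
both stages. Stage 1 is fit on the original inputs and frozen; stage 2 is
fit on the original inputs plus stage 1's output. On the training data the
second stage can only match or reduce the residual, since it has the first
stage's prediction available as an extra input:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: mostly linear, with a small nonlinearity the first
# (linear) model cannot capture.
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * np.sin(5 * X[:, 0])

def fit_ols(A, t):
    # Ordinary least squares with an intercept column appended.
    A1 = np.column_stack([A, np.ones(len(A))])
    w, *_ = np.linalg.lstsq(A1, t, rcond=None)
    return w

def predict_ols(w, A):
    return np.column_stack([A, np.ones(len(A))]) @ w

# Stage 1: built on the original input variables, then frozen.
w1 = fit_ols(X, y)
p1 = predict_ols(w1, X)

# Stage 2: inputs are the original variables plus stage 1's output.
X2 = np.column_stack([X, p1])
w2 = fit_ols(X2, y)
p2 = predict_ols(w2, X2)

mse1 = np.mean((y - p1) ** 2)
mse2 = np.mean((y - p2) ** 2)
```

Further stages would repeat the pattern, each frozen before the next is
built. In practice the later stages would be nonlinear (otherwise stage 2
gains nothing over refitting stage 1), and the comparison should of course
be made on validation data, not the training set as above.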

Also, the above references are hardly comprehensive, although Breiman's
is one of the better ones. Here are some other references culled from
previous posts on the topic:

    Clemen, R.T. (1989), "Combining Forecasts: A Review and Annotated
    Bibliography", International Journal of Forecasting, Vol. 5, No. 4,
    pp. 559-584.

    Genest, C., and J. Zidek (1986), "Combining Probability Distributions:
    A Critique and an Annotated Bibliography", Statistical Science,
    Vol. 1, No. 1, pp. 114-148.

    Guerard, J.B. Jr., and R.T. Clemen (1989), "Collinearity and the Use
    of Latent Root Regression for Combining GNP Forecasts", Journal of
    Forecasting, Vol. 8, pp. 231-238.

    Hashem, S. (1993), "Optimal Linear Combinations of Neural Networks",
    PhD thesis, School of Industrial Engineering, Purdue University,
    Dec. 1993 (Technical Report SMS 94-4).
    Available via ftp from archive.cis.ohio-state.edu,
    file /pub/neuroprose/Thesis/hashem.thesis.ps.Z

    Hashem, S., and B. Schmeiser (1992), "Improving Model Accuracy Using
    Optimal Linear Combinations of Trained Neural Networks", Tech. Rep.
    SMS 92-16, School of Industrial Engineering, Purdue University.
    (IEEE Transactions on Neural Networks, May 1995, forthcoming.)

    Hashem, S., B. Schmeiser, and Y. Yih (1994), "Optimal Linear
    Combinations of Neural Networks: An Overview", in Proceedings of the
    1994 IEEE International Conference on Neural Networks, Vol. 3,
    pp. 1507-1512.

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
