Newsgroups: comp.ai.neural-nets
From: owens@slivova.es.dupont.com (Aaron J. Owens)
Subject: Great Energy Shootout -- Reference
Message-ID: <1996May7.192555.4272@es.dupont.com>
Sender: news@es.dupont.com (USENET News System)
Nntp-Posting-Host: slivova.es.dupont.com
Organization: DuPont Experimental Station
X-Newsreader: TIN [version 1.2 PL0]
Date: Tue, 7 May 1996 19:25:55 GMT

Recently John Chandler asked for references to the Great Energy Shootout,
which was won (over ARIMA and many other modeling methodologies) by Bayesian
neural networks. Here are the abstract and reference:


Bayesian Non-linear Modeling for the Energy Prediction Competition

David J C MacKay 

Bayesian probability theory provides a unifying framework for data
modeling. A model space may include numerous control parameters which
influence the complexity of the model (for example regularisation
constants). Bayesian methods can automatically set such parameters so
that the model becomes probabilistically well-matched to the data.
The 1993 energy prediction competition involved the prediction of a
series of building energy loads from a series of environmental input
variables. Non-linear regression using `neural networks' is a popular
technique for such modeling tasks. Since it is not obvious how large
a time-window of inputs is appropriate, or what preprocessing of
inputs is best, this can be viewed as a regression problem in which
there are many possible input variables, some of which may actually
be irrelevant to the prediction of the output variable. Because a
finite data set will show random correlations between the irrelevant
inputs and the output, any conventional neural network (even with
`weight decay') will not set the coefficients for these junk inputs
to zero. Thus the irrelevant variables will hurt the model's
performance. The Automatic Relevance Determination (ARD) model puts a
prior over the regression parameters which embodies the concept of
relevance. This is done in a simple and `soft' way by introducing
multiple `weight decay' constants, one `alpha' associated with each
input. Using Bayesian methods, the decay rates for junk inputs are
automatically inferred to be large, preventing those inputs from
causing significant overfitting. An entry using the ARD model won the
prediction competition by a significant margin.
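
As a rough illustration of the ARD idea described in the abstract, here is
a minimal Python sketch. It is not MacKay's competition entry, which used
neural networks; it applies one `alpha` per input to a plain linear model
and re-estimates the alphas with evidence-framework updates
(alpha = gamma / w^2). The function name, the `alpha_cap` ceiling, and the
synthetic data are assumptions made for the example.

```python
import numpy as np


def ard_linear_regression(X, y, n_iter=50, alpha_cap=1e6):
    """ARD sketch for a linear model y ~ X @ w + Gaussian noise.

    Each input column i gets its own 'weight decay' constant alpha[i].
    alpha_cap is an assumed ceiling that keeps the matrix well conditioned.
    """
    n, d = X.shape
    alpha = np.ones(d)        # one decay constant per input
    beta = 1.0 / np.var(y)    # noise precision, crude initial guess
    for _ in range(n_iter):
        # Gaussian posterior over the weights for the current alpha, beta.
        A = np.diag(alpha) + beta * (X.T @ X)
        Sigma = np.linalg.inv(A)
        w = beta * (Sigma @ X.T @ y)
        # gamma[i] lies in [0, 1]: how well the data determine weight i.
        gamma = np.clip(1.0 - alpha * np.diag(Sigma), 0.0, 1.0)
        # Re-estimate the hyperparameters from the current posterior.
        alpha = np.minimum(gamma / (w ** 2 + 1e-12), alpha_cap)
        resid = y - X @ w
        beta = (n - gamma.sum()) / (resid @ resid)
    # w is the posterior mean computed in the last pass of the loop.
    return w, alpha, beta


if __name__ == "__main__":
    # Synthetic data (assumed): inputs 0 and 1 matter, inputs 2..4 are junk.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ np.array([2.0, -3.0, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=200)
    w, alpha, beta = ard_linear_regression(X, y)
    print("weights:", np.round(w, 3))
    print("alphas: ", np.round(alpha, 2))
```

On data like this the alphas for the junk inputs should come out much
larger than those for the relevant inputs, which is the `soft' pruning the
abstract refers to.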

Reference:

@INPROCEEDINGS{MacKay94:pred_ashrae,
 KEY            ="MacKay",
 AUTHOR         ="D. J. C.  MacKay",
 TITLE          ="Bayesian non-linear modelling for the energy prediction 
        competition",
 BOOKTITLE      ="ASHRAE Transactions, V.100, Pt.2",
 EDITOR         ="",
 PUBLISHER      ="ASHRAE",
 ADDRESS        ="Atlanta, Georgia",
 YEAR           ="1994",
 PAGES ="1053-1062",
 ANNOTE ="Date submitted: ; Date accepted: ; Collaborating institutes: none"}

***************************************************************************

-- Aaron --

Aaron J. Owens, Research Fellow     Telephone Numbers:
Modeling and Simulation               Office  (302) 695-7341
Engineering Research Laboratory       FAX     (302) 695-9658
DuPont Company, E320/201              Home    (302) 733-7836
Wilmington, DE 19880-0320           Internet: owens@prism.es.dupont.com

-----------------------------------------------------------------------------
Opinions expressed in this electronic message should *NOT* be taken to
represent the official view(s) of the DuPont Company.  ANY OPINIONS EXPRESSED
ARE THE PERSONAL VIEWS OF THE AUTHOR ONLY.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
