The goal of Bayesian forecasting is to obtain the predictive distribution

$$p(y_{n+1} \mid D) \qquad (1)$$

where $y_{n+1}$ is the variable we want to predict at time $n+1$ and $D$ is the knowledge we have, expressed in a database that must at least contain the values $y_1, \dots, y_n$. In the general case

$$D = \{(x_t, y_t),\; t = 1, \dots, n\},$$

where $x$ is the "cause" of a causal (or multivariate) forecasting model. So finally $p(y_{n+1} \mid x_{n+1}, D)$ is the predictive distribution.
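For a single model with fixed hyperparameters, this predictive distribution can be written explicitly as the posterior average of the model's forecasts (a standard expansion, stated here for orientation; it is the special case of the full marginalization in eq. 3 below):

$$p(y_{n+1} \mid x_{n+1}, D) = \int p(y_{n+1} \mid x_{n+1}, w)\, p(w \mid D)\, dw.$$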
Model fitting is done in the Bayesian framework by writing the probability of a parameter vector $w$ given the data $D$ and the model $H$ as

$$p(w \mid D, H) = \frac{p(D \mid w, H)\, p(w \mid H)}{p(D \mid H)}, \qquad (2)$$

which is equivalent to saying that

$$\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}.$$

The denominator $p(D \mid H)$ at the first level of inference is just a normalizing constant, but at the second level it is called the evidence.
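As a concrete illustration of this first level of inference (a minimal sketch, not taken from the paper), the following evaluates likelihood times prior on a grid for a one-parameter Gaussian model and normalizes by the evidence; the data, noise level, and prior are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data D: 20 noisy observations of an unknown mean w.
sigma = 0.5                                   # known noise std (assumed)
y = 1.3 + sigma * rng.standard_normal(20)

# Grid over the single parameter w.
w = np.linspace(-3.0, 3.0, 2001)
dw = w[1] - w[0]

# log likelihood p(D | w, H): independent Gaussian noise around w.
log_lik = np.array([-0.5 * np.sum((y - m) ** 2) / sigma**2 for m in w])
# log prior p(w | H): zero-mean, unit-variance Gaussian (assumed).
log_prior = -0.5 * w**2

logp = log_lik + log_prior
unnorm = np.exp(logp - logp.max())            # rescaled for numerical stability
evidence = np.sum(unnorm) * dw                # proportional to p(D | H)
posterior = unnorm / evidence                 # posterior = likelihood * prior / evidence

print("posterior mean of w:", np.sum(w * posterior) * dw)
```

At this first level the evidence merely normalizes the posterior; comparing two models at the second level would compare their values of $p(D \mid H)$.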
The evaluation of this distribution involves marginalizing over all levels of uncertainty: model selection ($H$), hyperparameters ($\alpha$), and parameters ($w$):

$$p(y_{n+1} \mid x_{n+1}, D) = \sum_{H} \iint p(y_{n+1} \mid x_{n+1}, w, \alpha, H)\, p(w, \alpha, H \mid D)\, dw\, d\alpha, \qquad (3)$$

where $w$ are the parameters (weights and biases) and $\alpha$ are the hyperparameters, governing the noise and the prior over the parameters. The evaluation of $p(y_{n+1} \mid x_{n+1}, w, \alpha, H)$ only requires a single pass through the network.
Typically, marginalization over $w$ and $H$ affects the predictive distribution (eq. 3) significantly, but integration over the hyperparameters $\alpha$ has a lesser effect. Marginalization can rarely be done analytically; the alternatives are Gaussian approximations [5], [6] and Monte Carlo methods [7].
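To make the Monte Carlo alternative concrete, here is a minimal sketch (a crude stand-in, not necessarily the methods cited as [7]): a random-walk Metropolis sampler draws weights of a tiny network from the posterior, and the predictive mean of eq. 3 (with $H$ and $\alpha$ fixed) is estimated by averaging single-pass network outputs over the samples. The data, architecture, and hyperparameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D regression data standing in for the database D.
x = np.linspace(-2, 2, 30)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

H = 8                      # hidden units; w packs both layers' weights and biases
dim = 3 * H + 1

def unpack(w):
    w1 = w[:H].reshape(1, H);       b1 = w[H:2 * H]
    w2 = w[2 * H:3 * H].reshape(H, 1); b2 = w[3 * H]
    return w1, b1, w2, b2

def net(w, x):
    # Single forward pass of a one-hidden-layer tanh network.
    w1, b1, w2, b2 = unpack(w)
    h = np.tanh(x[:, None] @ w1 + b1)
    return (h @ w2).ravel() + b2

def log_posterior(w, beta=100.0, alpha=1.0):
    # log p(w | D) up to a constant: Gaussian likelihood with noise
    # precision beta, Gaussian prior on weights with precision alpha.
    r = y - net(w, x)
    return -0.5 * beta * (r @ r) - 0.5 * alpha * (w @ w)

# Random-walk Metropolis over the weights (crude, for illustration only).
w = np.zeros(dim)
lp = log_posterior(w)
samples = []
for t in range(20000):
    w_new = w + 0.02 * rng.standard_normal(dim)
    lp_new = log_posterior(w_new)
    if np.log(rng.random()) < lp_new - lp:   # accept/reject step
        w, lp = w_new, lp_new
    if t > 5000 and t % 10 == 0:             # discard burn-in, then thin
        samples.append(w.copy())

# Predictive mean at new inputs: average single-pass outputs over samples.
x_new = np.linspace(-2.5, 2.5, 10)
pred_mean = np.mean([net(s, x_new) for s in samples], axis=0)
print(pred_mean)
```

Averaging the single-pass outputs over posterior samples approximates the integral over $w$ in eq. 3 by a finite sum.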
Complex models may overfit the data, and they also carry an obvious extra computational cost. Bayesian inference can help tackle the problems of complexity, uncertainty, and selection of probabilistic models.