## Theoretical Foundation

In this section we study the distribution of the $i$-th gene and the construction of a confidence interval for the localization parameter associated with that distribution.

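Let $\beta$ be the set of $N$ individuals with $p$ genes that make up the population, and $\beta^* \subset \beta$ the set of the best $n$ individuals. If we assume that the genes $\beta_i^*$ of the individuals belonging to $\beta^*$ are independent random variables with a continuous distribution $H(\beta_i^*)$ with a localization parameter $\mu_{\beta_i^*}$, we can define the model

$$\beta_i^* = \mu_{\beta_i^*} + e_i, \quad \text{for } i = 1, \ldots, p \tag{1}$$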

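where $e_i$ is a random variable. If we suppose that, for each gene $i$, the best $n$ individuals form a random sample $\{\beta_{i1}^*, \beta_{i2}^*, \ldots, \beta_{in}^*\}$ of the distribution of $\beta_i^*$, then the model takes the form

$$\beta_{ij}^* = \mu_{\beta_i^*} + e_{ij}, \quad \text{for } i = 1, \ldots, p \text{ and } j = 1, \ldots, n \tag{2}$$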

Using this model, we analyze an estimator of the localization parameter for the $i$-th gene based on the minimization of the dispersion function induced by the $L_2$ norm. The squared $L_2$ norm of the residuals $e_{ij}$ is defined as

$$\|e_i\|_2^2 = \sum_{j=1}^{n} e_{ij}^2 \tag{3}$$

hence the associated dispersion induced by the $L_2$ norm in model (2) is

$$D_2(\mu_{\beta_i^*}) = \sum_{j=1}^{n} \left(\beta_{ij}^* - \mu_{\beta_i^*}\right)^2 \tag{4}$$

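and the estimator of the localization parameter $\mu_{\beta_i^*}$ is

$$\hat{\mu}_{\beta_i^*} = \arg\min_{\mu_{\beta_i^*}} D_2(\mu_{\beta_i^*}) = \arg\min_{\mu_{\beta_i^*}} \sum_{j=1}^{n} \left(\beta_{ij}^* - \mu_{\beta_i^*}\right)^2 \tag{5}$$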

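Minimizing with the steepest gradient descent method, whose descent direction is

$$S_2(\mu_{\beta_i^*}) = -\frac{\partial D_2(\mu_{\beta_i^*})}{\partial \mu_{\beta_i^*}} \tag{6}$$

we obtain

$$S_2(\mu_{\beta_i^*}) = 2 \sum_{j=1}^{n} \left(\beta_{ij}^* - \mu_{\beta_i^*}\right) \tag{7}$$

and setting (7) equal to 0 yields

$$\hat{\mu}_{\beta_i^*} = \frac{1}{n} \sum_{j=1}^{n} \beta_{ij}^* = \bar{\beta}_i^* \tag{8}$$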

So, the estimator of the localization parameter for the $i$-th gene based on the minimization of the dispersion function induced by the $L_2$ norm is the mean of the distribution of $\beta_i^*$ [KS77], that is, $\hat{\mu}_{\beta_i^*} = \bar{\beta}_i^*$.
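As a numerical check of this result, the following minimal sketch runs a fixed-step gradient descent on the $L_2$ dispersion of one gene and compares the result with the sample mean. The function names, the starting point, and the step size $1/(4n)$ are illustrative assumptions, not part of the original method description.

```python
from statistics import mean


def dispersion(mu, genes):
    """L2 dispersion: sum of squared deviations of the gene values from mu."""
    return sum((b - mu) ** 2 for b in genes)


def descent_direction(mu, genes):
    """Negative derivative of the dispersion with respect to mu: 2 * sum(b - mu)."""
    return 2.0 * sum(b - mu for b in genes)


def minimise_dispersion(genes, mu0=0.0, iterations=200):
    """Fixed-step gradient descent on the dispersion.

    The step 1 / (4n) is an illustrative choice: it halves the distance
    to the minimiser at every iteration.
    """
    step = 1.0 / (4 * len(genes))
    mu = mu0
    for _ in range(iterations):
        mu += step * descent_direction(mu, genes)
    return mu


if __name__ == "__main__":
    # Hypothetical values of one gene across the best individuals.
    best_genes = [0.2, 0.5, 0.9, 1.4]
    print(minimise_dispersion(best_genes), mean(best_genes))
```

The descent direction vanishes exactly at the sample mean, so the iteration and the closed-form estimator should agree up to floating-point error.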

The sample mean estimator is a linear estimator, so it has the properties of unbiasedness and consistency, and it follows a normal distribution when the distribution of the genes $H(\beta_i^*)$ is normal. Under this hypothesis, we construct a bilateral confidence interval for the localization of the genes of the best individuals, using the studentization method, the mean $\bar{\beta}_i^*$ as the localization parameter, and the standard deviation $S_{\beta_i^*}$ as the dispersion parameter:

$$I^{CI} = \left[\, \bar{\beta}_i^* - t_{n-1,\alpha/2} \frac{S_{\beta_i^*}}{\sqrt{n}} \;;\; \bar{\beta}_i^* + t_{n-1,\alpha/2} \frac{S_{\beta_i^*}}{\sqrt{n}} \,\right] \tag{9}$$

where $t_{n-1,\alpha/2}$ is the value of Student's $t$ distribution with $n-1$ degrees of freedom, and $1-\alpha$ is the confidence coefficient, that is, the probability that the interval contains the true value of the population mean.
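The interval can be computed for a single gene as in the sketch below, which uses only the Python standard library. The helper names (`t_quantile`, `confidence_interval`) are hypothetical, and the Student quantile is obtained here by Simpson integration of the density plus bisection, an illustrative stand-in for the quantile function a statistics library would normally provide. The sample standard deviation is assumed to use the $n-1$ denominator.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + total * h / 3


def t_quantile(prob, df):
    """Quantile for 0.5 < prob < 1, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < prob:  # grow the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def confidence_interval(sample, alpha=0.05):
    """Bilateral (1 - alpha) interval for the mean of one gene (needs n >= 2)."""
    n = len(sample)
    centre = mean(sample)
    # stdev uses the n - 1 denominator, matching the n - 1 degrees of freedom.
    half_width = t_quantile(1 - alpha / 2, n - 1) * stdev(sample) / math.sqrt(n)
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Hypothetical values of one gene across the best n = 5 individuals.
    print(confidence_interval([1.0, 2.0, 3.0, 4.0, 5.0], alpha=0.05))
```

A smaller $\alpha$ or a smaller $n$ widens the interval, since both increase the Student quantile.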

Domingo 2005-07-11