A naive description of the estimation task would be to recover (learn) a.c. the underlying credal set from an infinite sequence of outcomes. As Examples 1 and 2 show, an underlying credal set does not necessarily reveal itself in any single infinite sequence of trials. In simple terms, this is because our very loose assumptions about data generation leave entirely open the manner in which ``nature'' selects individual trial distributions. The deeper ramifications of this are reflected in a series of estimation results [27, Theorems 5.1-5.4], which state that it is not possible to detect the full extent of the underlying credal set with asymptotic certainty, although it can be done with asymptotic favorability (i.e., if you happen to be fortunate).
We keep the requirement that a good estimator must produce estimates that dominate the lower envelope of the underlying credal set. This means that the estimated credal set is no larger than the underlying credal set: the estimate must not contain any distribution that lies outside the credal set that generated the data.
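The dominance requirement can be made concrete with a small sketch. It assumes a finite outcome space and represents each credal set by a finite list of distributions (for instance its extreme points, which yield the same lower envelope as their convex hull). The function names and the example numbers are hypothetical and chosen only for illustration; they are not taken from the estimation results cited above.

```python
from fractions import Fraction as F
from itertools import combinations

def events(outcomes):
    """All non-empty events (subsets) of a finite outcome space."""
    items = sorted(outcomes)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def lower_envelope(credal_set, outcomes):
    """Map each event to the smallest probability any distribution gives it."""
    return {e: min(sum(p[x] for x in e) for p in credal_set)
            for e in events(outcomes)}

def dominates(estimate, underlying, outcomes):
    """True if the estimate's lower envelope is >= the underlying one on every event."""
    lo_est = lower_envelope(estimate, outcomes)
    lo_und = lower_envelope(underlying, outcomes)
    return all(lo_est[e] >= lo_und[e] for e in lo_est)

# Illustrative (assumed) data: three outcomes, exact rational probabilities.
outcomes = {"a", "b", "c"}
underlying = [
    {"a": F(1, 2), "b": F(1, 4), "c": F(1, 4)},
    {"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
    {"a": F(1, 4), "b": F(1, 4), "c": F(1, 2)},
]
# Both distributions below are mixtures of the three above, so the
# estimate lies inside the underlying credal set.
estimate = [
    {"a": F(3, 8), "b": F(3, 8), "c": F(1, 4)},
    {"a": F(1, 3), "b": F(1, 3), "c": F(1, 3)},
]

estimate_ok = dominates(estimate, underlying, outcomes)
reverse_ok = dominates(underlying, estimate, outcomes)
```

Here `estimate_ok` holds because the estimate is contained in the underlying set, while `reverse_ok` fails: the larger set assigns event {a} a lower probability of 1/4, which is below the estimate's 1/3.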
Given two estimators that always dominate a credal set, which is better?
Even if an estimator asymptotically favors the underlying credal set and guarantees a dominating credal set with asymptotic certainty, this does not mean it is the best possible estimator. It is possible to have two different estimators, both with these properties, that produce distinct credal sets from the same sequence. Often these estimators will be incomparable (in which case an even better estimator can be obtained using our Theorem 1). However, it is also possible that the first estimator will always dominate the second (a.c.). If this is the case, the second estimator, which is consistently dominated, is the better estimator. This is because both are guaranteed to dominate the underlying credal set, but the second estimator's resulting credal set will be larger, and therefore closer to the true underlying credal set.
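The comparison can be illustrated for a binary outcome, where a convex credal set reduces to an interval of values for the probability of outcome 1 and dominance reduces to interval containment. The sketch below is only an illustration under that assumption: the names `Interval`, `dominates`, and `preferred` are hypothetical, the underlying interval is supplied by hand (in practice it is unknown), and the incomparable case simply returns `None` rather than attempting the construction of Theorem 1.

```python
from fractions import Fraction as F
from typing import NamedTuple, Optional

class Interval(NamedTuple):
    """Credal set on a binary outcome: all p = P(outcome = 1) with lo <= p <= hi."""
    lo: F
    hi: F

def dominates(a: Interval, b: Interval) -> bool:
    """a dominates b when a's lower envelope is at least b's, i.e. a lies inside b."""
    return a.lo >= b.lo and a.hi <= b.hi

def preferred(a: Interval, b: Interval, underlying: Interval) -> Optional[Interval]:
    """Among two estimates that both dominate the underlying set, return the
    dominated (larger) one; return None when the two are incomparable."""
    if not (dominates(a, underlying) and dominates(b, underlying)):
        raise ValueError("both estimates must dominate the underlying credal set")
    if dominates(a, b):
        return b
    if dominates(b, a):
        return a
    return None

# Illustrative (assumed) numbers.
underlying = Interval(F(1, 5), F(7, 10))   # [0.20, 0.70]
first = Interval(F(2, 5), F(1, 2))         # [0.40, 0.50]
second = Interval(F(3, 10), F(3, 5))       # [0.30, 0.60]
third = Interval(F(1, 4), F(9, 20))        # [0.25, 0.45]

choice = preferred(first, second, underlying)
no_choice = preferred(first, third, underlying)
```

With these numbers `choice` is `second`: it is dominated by `first`, yet still lies inside the underlying interval, so it recovers more of it. `first` and `third` overlap without either containing the other, so `no_choice` is `None`.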
In fact, if data is generated by any non-vacuous credal set, K, it is possible to construct a mathematically equivalent generator using any credal set dominated by K (i.e., larger than K) with a simple alteration to the method for selecting the distributions. Insofar as a credal set (partially) summarizes the data generation process, one always has the option of reducing the information content of the summary by loosening the bounds. Of all these equivalent generators, we are interested in the credal set conveying the most informative description of the data generation process -- i.e., the credal set that dominates the others.
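One possible reading of this construction is sketched below for a binary outcome. The larger credal set contains an extra distribution, but its selection rule never picks it, so the two generators produce identical sequences. The function `generate`, the selection rules, and the particular probabilities are assumptions made for illustration only; the text does not prescribe a specific selection method.

```python
import random

def generate(credal_set, select, n, seed):
    """Draw n binary outcomes; on trial t the rule `select` picks which
    distribution (given as p = P(outcome = 1)) governs the trial."""
    rng = random.Random(seed)
    outcomes = []
    for t in range(n):
        p = credal_set[select(t)]
        outcomes.append(1 if rng.random() < p else 0)
    return outcomes

# K is the (assumed) underlying credal set; K_larger is dominated by K.
K = [0.3, 0.6]
K_larger = [0.3, 0.6, 0.9]

def alternate(t):
    """Selection rule for K: alternate between its two distributions."""
    return t % 2

def restricted(t):
    """Altered rule for K_larger: same choices as before, index 2 is never used."""
    return t % 2

seq_small = generate(K, alternate, 20, seed=0)
seq_large = generate(K_larger, restricted, 20, seed=0)
same_data = seq_small == seq_large
```

Because the altered rule only ever selects distributions that belong to K, `same_data` is true: no sequence of outcomes can distinguish the two descriptions, which is why the less informative one is always available.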
In short, our requirements are as follows. From an infinite sequence of outcomes, we desire an estimator whose estimate is guaranteed to dominate the underlying credal set with asymptotic certainty and, subject to that guarantee, contains as many distributions as possible.
Sun Jun 29 22:16:40 EDT 1997