Schapire \& Singer (1998) have investigated classification problems where the aim of the
procedure is not to provide an accurate class for some observation. Rather,
the algorithm outputs a set of values (one for each class) and we expect the
class of the observation to receive the largest value of all, thus being
ranked higher than all others. This approach is particularly useful when
a given example can belong to more than one class (multilabel problems), a case where we expect each of these classes to receive greater values than the classes to which the example does not belong.
The ranking loss informally represents the number of times the hypothesis fails to rank the class of an example higher than a class to which it does not belong. Before going further, we first generalize our classification setting and replace the common notation $(\boldsymbol{o}, c)$ for an example by the more general one $(\boldsymbol{o}, \boldsymbol{c})$. Here, $\boldsymbol{c}$ is a vector giving, for each class, the membership of the corresponding observation $\boldsymbol{o}$ to that class (``0'' is no and ``1'' is yes). It is important to note that this setting is more general than the usual Bayesian setting, in which there can exist examples $(\boldsymbol{o}, c)$ and $(\boldsymbol{o}', c')$ (using the non-vector notation) for which $\boldsymbol{o} = \boldsymbol{o}'$ but $c \neq c'$. The ranking loss generalizes the Bayesian setting to multilabel problems, and postulates that there can be some examples for which we cannot provide a single class at a time, even if, e.g., any of the classes to which the example belongs may later appear independently with the same observation.
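To make this definition concrete, the following sketch (in Python; the function name \texttt{ranking\_loss\_single} and the normalization by the number of class pairs are our own illustrative choices, not notation from Schapire \& Singer) counts, for a single example, the pairs formed by a class the example belongs to and a class it does not belong to that the hypothesis ranks in the wrong order.
\begin{verbatim}
import numpy as np

def ranking_loss_single(scores, membership):
    """Ranking loss of one example: fraction of (relevant, irrelevant) class
    pairs for which the hypothesis fails to rank the relevant class higher.

    scores     : length-C array, one real-valued output per class
    membership : length-C 0/1 vector c of class memberships
    """
    scores = np.asarray(scores, dtype=float)
    membership = np.asarray(membership, dtype=int)
    relevant = np.flatnonzero(membership == 1)    # classes the example belongs to
    irrelevant = np.flatnonzero(membership == 0)  # classes it does not belong to
    # One mistake per pair ranked in the wrong order (ties counted as mistakes).
    mistakes = sum(1 for c1 in relevant for c0 in irrelevant
                   if scores[c1] <= scores[c0])
    return mistakes / (len(relevant) * len(irrelevant))

# 4 classes; the example belongs to classes 0 and 2.
print(ranking_loss_single([0.9, 0.2, 0.4, 0.5], [1, 0, 1, 0]))  # -> 0.25
\end{verbatim}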
Ranking loss boosting replaces each example $(\boldsymbol{o}, \boldsymbol{c})$ by a set of $n_{\boldsymbol{c}} \times (C - n_{\boldsymbol{c}})$ examples, where $n_{\boldsymbol{c}}$ denotes the Hamming weight of $\boldsymbol{c}$ (i.e. the number of classes to which the example belongs) and $C$ the total number of classes. Each of these new examples is denoted $(\boldsymbol{o}, c_0, c_1)$, where $c_0$ and $c_1$ span all values in $\{1, 2, \ldots, C\}$. The distribution of the new examples is renormalized, so that $w(\boldsymbol{o}, c_0, c_1) > 0$ whenever $c_{c_0} = 0$ and $c_{c_1} = 1$, and $w(\boldsymbol{o}, c_0, c_1) = 0$ otherwise.
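As an illustration of this replacement, the sketch below (Python again; the function name \texttt{expand\_example} and the uniform renormalization over the retained pairs are assumptions made for the example, not the paper's exact weighting) expands one multilabel example into its pairs $(\boldsymbol{o}, c_0, c_1)$ and assigns a nonzero weight only to the pairs where the example does not belong to $c_0$ but does belong to $c_1$.
\begin{verbatim}
import numpy as np

def expand_example(observation, membership, weight=1.0):
    """Replace one example (o, c) by its pairs (o, c0, c1): c0 is a class the
    example does not belong to, c1 a class it belongs to.  Pairs violating
    this condition would receive weight 0 and are simply omitted."""
    membership = np.asarray(membership, dtype=int)
    C = len(membership)
    n_c = int(membership.sum())       # Hamming weight of c
    n_pairs = n_c * (C - n_c)         # number of pairs with nonzero weight
    pairs = []
    for c0 in range(C):
        for c1 in range(C):
            if membership[c0] == 0 and membership[c1] == 1:
                # Split the example's weight uniformly over its retained pairs.
                pairs.append(((observation, c0, c1), weight / n_pairs))
    return pairs

# 4 classes; the example belongs to classes 0 and 2, giving 2 * 2 = 4 pairs.
for pair, w in expand_example("o1", [1, 0, 1, 0]):
    print(pair, w)
\end{verbatim}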
Take some monomial $t$ obtained from the large DC, and all examples satisfying it. We now work with this restricted subset of examples, while calculating the corresponding vector $\boldsymbol{v}_t$ of $t$.
Schapire \& Singer (1998) propose a cost function which we should minimize in order to minimize the ranking loss. This function is