To avoid confusion, we call the value of computed over the transformed set of examples, and for
and
as the value of criterion using vector . It is simple to obtain a ``sufficient'' bound to check the theorem. We have

(30) | |||

(31) |

Here, is the sum of weights of the examples in the original set, whose vectors have ``1'' matching the elements of . Note that , since our algorithm is optimal, and we obtain

By taking only the positive part of the right-hand side, and remarking that

- , (the right sum is and the left one is ),
- the coefficient of in is ,

as claimed.