- ...entropy
- A more common notation for the conditional entropy is *H*(*Y* | *X*), where *Y* and *X* are random variables with joint distribution 79#79. To emphasize the fact that *H* is a functional, depending on the probability distribution *p*, we have adopted the alternate notation *H*(*p*).

- ...Lagrangian
- Ignoring
the set of weak inequalities 100#100 when forming the Lagrangian
doesn't change the problem, since for the solution that emerges, these
constraints will not be binding anyway.

- ...by
- We will henceforth abbreviate 145#145 by
146#146 when the empirical distribution 147#147 is clear from context.

- ...numbers,
- Technically, we require the *strong* law of large numbers, which asserts that for all 228#228, the event inside the braces in (21) holds almost everywhere. We also require the assumption that the sample distribution is stationary, but discussing either of these details at length would take us too far afield.

Fri Jul 5 11:43:50 EDT 1996