
Maximum likelihood


The log-likelihood $L_{\tilde{p}}(p)$ of the empirical distribution $\tilde{p}$ as predicted by a model $p$ is defined by

$$L_{\tilde{p}}(p) \equiv \log \prod_{x,y} p(y|x)^{\tilde{p}(x,y)} = \sum_{x,y} \tilde{p}(x,y) \log p(y|x)$$

It is easy to check that the dual function $\Psi(\lambda)$ of the previous section is, in fact, just the log-likelihood for the exponential model $p_\lambda$; that is,

$$\Psi(\lambda) = L_{\tilde{p}}(p_\lambda)$$

where $p_\lambda$ has the parametric form of (11). With this interpretation, the result of the previous section can be rephrased as:

The model $p_* \in \mathcal{C}$ with maximum entropy is the model in the parametric family $p_\lambda(y|x)$ that maximizes the likelihood of the training sample $\tilde{p}$.

This result provides an added justification for the maximum entropy principle: if the notion of selecting a model $p_*$ on the basis of maximum entropy isn't compelling enough, it so happens that this same $p_*$ is also the model which, from among all models of the same parametric form (11), can best account for the training sample.
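
The equivalence between the dual function and the log-likelihood can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the training sample, the two binary features, and all function names are hypothetical. The dual function is assumed to have the form $\Psi(\lambda) = -\sum_x \tilde{p}(x) \log Z_\lambda(x) + \sum_i \lambda_i \tilde{p}(f_i)$ from the previous section, which is not reproduced here. The fitting step uses plain gradient ascent on the log-likelihood, chosen for brevity; it is not the procedure this tutorial describes.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training sample of (context x, output y) pairs.
SAMPLE = [("a", 0), ("a", 0), ("a", 1), ("b", 0), ("b", 1), ("b", 1), ("b", 1)]
OUTPUTS = [0, 1]

# Hypothetical binary feature functions f_i(x, y).
FEATURES = [
    lambda x, y: 1.0 if (x == "a" and y == 0) else 0.0,
    lambda x, y: 1.0 if y == 1 else 0.0,
]


def empirical(sample):
    """Empirical joint distribution p~(x, y) as relative frequencies."""
    n = len(sample)
    return {xy: count / n for xy, count in Counter(sample).items()}


def marginal_x(p_tilde):
    """Empirical context distribution p~(x)."""
    p_x = defaultdict(float)
    for (x, _y), p in p_tilde.items():
        p_x[x] += p
    return dict(p_x)


def score(lam, x, y):
    """Weighted feature sum: sum_i lambda_i * f_i(x, y)."""
    return sum(w * f(x, y) for w, f in zip(lam, FEATURES))


def log_z(lam, x):
    """log Z_lambda(x), the log of the normalizing constant for context x."""
    return math.log(sum(math.exp(score(lam, x, y)) for y in OUTPUTS))


def prob(lam, x, y):
    """Exponential model p_lambda(y | x)."""
    return math.exp(score(lam, x, y) - log_z(lam, x))


def log_likelihood(lam, p_tilde):
    """L(p_lambda) = sum_{x,y} p~(x, y) log p_lambda(y | x)."""
    return sum(p * math.log(prob(lam, x, y)) for (x, y), p in p_tilde.items())


def empirical_expectations(p_tilde):
    """p~(f_i) = sum_{x,y} p~(x, y) f_i(x, y) for each feature."""
    return [sum(p * f(x, y) for (x, y), p in p_tilde.items()) for f in FEATURES]


def model_expectations(lam, p_tilde):
    """p(f_i) = sum_x p~(x) sum_y p_lambda(y | x) f_i(x, y) for each feature."""
    p_x = marginal_x(p_tilde)
    return [
        sum(px * prob(lam, x, y) * f(x, y) for x, px in p_x.items() for y in OUTPUTS)
        for f in FEATURES
    ]


def dual(lam, p_tilde):
    """Assumed dual: Psi = -sum_x p~(x) log Z(x) + sum_i lambda_i p~(f_i)."""
    p_x = marginal_x(p_tilde)
    emp = empirical_expectations(p_tilde)
    normalizer_term = sum(px * log_z(lam, x) for x, px in p_x.items())
    return -normalizer_term + sum(w * e for w, e in zip(lam, emp))


def fit(p_tilde, steps=2000, rate=1.0):
    """Gradient ascent on the log-likelihood (illustrative choice only).

    The gradient with respect to lambda_i is p~(f_i) - p(f_i).
    """
    lam = [0.0] * len(FEATURES)
    emp = empirical_expectations(p_tilde)
    for _ in range(steps):
        mod = model_expectations(lam, p_tilde)
        lam = [w + rate * (e - m) for w, e, m in zip(lam, emp, mod)]
    return lam


p_tilde = empirical(SAMPLE)

# The dual and the log-likelihood should agree at any lambda.
lam_any = [0.3, -1.2]
print(dual(lam_any, p_tilde), log_likelihood(lam_any, p_tilde))

# At the maximum-likelihood lambda, model and empirical expectations should match.
lam_star = fit(p_tilde)
print(empirical_expectations(p_tilde), model_expectations(lam_star, p_tilde))
```

The first printed pair compares the two functions at an arbitrary parameter vector. The second compares the empirical feature expectations with those of the fitted model: when they match, the likelihood maximizer satisfies the constraints that define the maximum entropy problem.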

Adam Berger
Fri Jul 5 11:43:50 EDT 1996