
Maxent Modeling


We consider a random process which produces an output value y, a member of a finite set $\mathcal{Y}$. For the translation example just considered, the process generates a translation of the word in, and the output y can be any word in the set {dans, en, à, au cours de, pendant}. In generating y, the process may be influenced by some contextual information x, a member of a finite set $\mathcal{X}$. In the present example, this information could include the words in the English sentence surrounding in.

Our task is to construct a stochastic model that accurately represents the behavior of the random process. Such a model is a method of estimating the conditional probability that, given a context x, the process will output y.
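To make the idea of a model concrete, here is a minimal Python sketch of one possible model for the translation example. It is only an illustration: the function name `uniform_model`, the sample context, and the choice of a uniform distribution are assumptions made for this sketch, not something the text prescribes. The point is only that a model takes a context x and an output y and returns a probability.

```python
from fractions import Fraction

# Output set from the translation example: candidate translations of "in".
Y = ["dans", "en", "à", "au cours de", "pendant"]


def uniform_model(y, x):
    """Hypothetical model p(y|x): ignores the context x and gives every
    candidate translation in Y the same probability."""
    if y not in Y:
        return Fraction(0)
    return Fraction(1, len(Y))


# Hypothetical context: words surrounding "in" in an English sentence.
context = ("he", "lives", "in", "Paris")

# The model assigns a probability to every output for this context,
# and those probabilities sum to one.
probs = {y: uniform_model(y, context) for y in Y}
total = sum(probs.values())
```

A more useful model would let the returned probabilities depend on x; this one is deliberately the simplest function that still qualifies as a conditional distribution.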

A word here on notation: a rigorous protocol requires that we differentiate a random variable from a particular value it may assume. One approach is to write a capital letter for the first and lowercase for the second: X is the random variable (in the case of a six-sided die, $X \in \{1, 2, \ldots, 6\}$), and x is a particular value assumed by X. Furthermore, we should distinguish a probability distribution, say $p(X)$ (the uniform distribution, $p(X = x) = 1/6$ for every x, is appropriate for a fair die), from a particular value assigned by the distribution to a certain event, say $p(X = x)$. Having conceded what we should do, we shall henceforth (when appropriate) dispense with the capitalized letters and let the context disambiguate the meaning of $p(x)$: an entire model $p(X)$ or the value assigned by the model to the event X=x. Furthermore, we will denote by $\mathcal{P}$ the set of all conditional probability distributions. Thus a model $p(y|x)$ is, by definition, just an element of $\mathcal{P}$.
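The two distinctions in this paragraph can be sketched in Python. The die example separates a whole distribution from a single value it assigns, and the helper checks whether a table of numbers is a member of the set of all conditional probability distributions. The helper name `is_conditional_distribution`, the nested-dictionary layout, and the sample tables are assumptions made for illustration only.

```python
from fractions import Fraction

# The whole distribution for a fair six-sided die: one entry per value of X.
die = {x: Fraction(1, 6) for x in range(1, 7)}

# A particular value assigned by that distribution to the event X = 3.
p_three = die[3]


def is_conditional_distribution(p, contexts, outputs, tol=1e-9):
    """Return True if p[x][y] is a valid conditional distribution: for every
    context x, the entries over all outputs y are nonnegative and sum to 1."""
    for x in contexts:
        row = [p[x].get(y, 0) for y in outputs]
        if any(v < 0 for v in row):
            return False
        if abs(sum(row) - 1) > tol:
            return False
    return True


# Hypothetical two-context, two-output tables.
contexts = ["x1", "x2"]
outputs = ["dans", "en"]
good = {"x1": {"dans": 0.7, "en": 0.3}, "x2": {"dans": 0.5, "en": 0.5}}
bad = {"x1": {"dans": 0.7, "en": 0.7}, "x2": {"dans": 0.5, "en": 0.5}}

good_ok = is_conditional_distribution(good, contexts, outputs)
bad_ok = is_conditional_distribution(bad, contexts, outputs)
```

Here `die` plays the role of the entire model and `p_three` the value it assigns to one event; `good` passes the membership check while `bad` fails because its first row sums to more than one.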

Adam Berger
Fri Jul 5 11:43:50 EDT 1996