Large Deviation Theory
- Large deviation theory or
Large Deviation Principle (LDP) characterizes the logarithmic rate of the
convergence, specified by Weak Law of Large Numbers (WLLN). Note that WLLN
tells
where S_n is
the sample mean of a random sequence {X_i}, E[X_i]=E[X].
- For a sample mean sequence,
its LDP contains WLLN since the lower and upper bound of the rate function
are both 0, if the event A contains the expectation; so with probability
one, the event A will happen, i.e., WLLN.
- Three levels of large
deviations:
- Level 1 is about
sample means.
- Level 2 is about
empirical distribution.
- Level 3 is about
empirical process (sample path large deviations).
- Finite time cumulant
generating function or finite time logarithmic moment generating
function of a stochastic process
is defined as

- The asymptotic log-moment
generating function of a stochastic process
is defined as
, if the limit exists.
- Cramer's theorem addresses
i.i.d. processes. The log-moment generating function is not related
to time.
- Gaetner-Ellis theorem
addresses non-i.i.d. processes, specifically, short-range-dependent
processes. The log-moment generating function is related to
time. The asymptotic log-moment generating function is intended to
remove the effect of time by taking the limit of log-moment generating
function as time goes to infinity, assuming the limit exists.
- Denote H(p) the entropy of
distribution {p,1-p}. Then we have

- Subadditive property:
f(a+b)<= f(a) + f(b).
- It characterizes
extrapolation behavior. Compared with concavity:
f(a*c+b*(1-c))>= c* f(a) + (1-c)* f(b), where c \in [0,1].
Concavity characterizes interpolation behavior.
- Why subadditive
property is very useful in network analysis? Because it
characterizes the key property of the arrival curve behavior. The
output of one or more concatenated leaky buckets is subadditive.