next up previous contents
Next: EC distribution Up: Moment matching algorithm Previous: Characterizing PH distributions   Contents


Characterizing phase type distributions

Set ${\cal S}^{(n)}$ is characterized by Theorem 1: ${\cal S}^{(n)}\subset {\cal T}^{(n)} \subset {\cal S}^{(n+1)}$ for all $n \geq 2$. In this section we prove the first part of the theorem:

Lemma 1   For all integers $n \geq 2$, ${\cal S}^{(n)}\subset {\cal T}^{(n)}$.

The second part, ${\cal T}^{(n)} \subset {\cal S}^{(n+1)}$, follows immediately from the construction of the Complete solution (see Corollary 3 in Section 2.7).

We begin by defining the ratio of the normalized moments.

Definition 10   The ratio of the normalized moments of a distribution $F$, $r^F$, is defined as $r^F = m_3^F / m_2^F$ and is also referred to as the $r$-value of $F$.

A nice property of the $r$-value is that it is insensitive to the mass probability at zero:

Proposition 5   Let $Z$ be a mixture of two distributions, $X$ and $O$, where where $X$ is a nonnegative distribution with $\mu_1^X> 0$ and $O$ is the distribution of the degenerate random variable $V\equiv 0$. Then, $r^Z=r^X$.


Proof:Let $Z(\cdot)$, $X(\cdot)$, and $O(\cdot)$ be the (cumulative) distribution functions of $Z$, $X$, and $O$, respectively. Then, there exists $0<p<1$ such that $Z(\cdot) = pX(\cdot) + (1-p)O(\cdot)$. Therefore, by definition,


\begin{displaymath}
r^Z = \frac{(p\mu_3^X)(p\mu_1^X)}{(p\mu_2^X)^2} = \frac{(\mu_3^X)(\mu_1^X)}{(\mu_2^X)^2} = r^X.
\end{displaymath}

    width 1ex height 1ex depth 0pt

Below, unless otherwise stated, we denote the (cumulative) distribution function of a distribution, $X$, by $X(\cdot)$. Below, we use the notation $O$ repeatedly.

Definition 11   Let $O$ denote the distribution of the degenerate random variable $V\equiv 0$.

Note that, using the normalized second moment and the $r$-value, ${\cal T}^{(n)}$ can be redefined as the set of distributions, $F$, that satisfy exactly one of the following two conditions:

\begin{eqnarray*}
& & \mbox{(i) } m_2^F > \frac{n+1}{n} \quad\mbox{and}\quad r^F...
..._2^F = \frac{n+1}{n} \quad\mbox{and}\quad r^F = \frac{n+2}{n+1}.
\end{eqnarray*}

To show ${\cal S}^{(n)}\subset {\cal T}^{(n)}$, consider an arbitrary distribution, $X\in {\cal S}^{(n)}$. By definition of ${\cal S}^{(n)}$, there exists an $n$-phase acyclic PH distribution, $P$, that well-represents $X$. Then $X\in {\cal T}^{(n)}$ iff $P\in {\cal T}^{(n)}$. Hence, it suffices to prove that all $n$-phase acyclic PH distributions are in ${\cal T}^{(n)}$. This can be shown by proving the two properties of the Erlang-$n$ distribution: (i) the set of Erlang-$n$ distributions has the least normalized second moment among all the $n$-phase (acyclic) PH distributions and (ii) the Erlang-$n$ distribution has the least $r$-value among all the $n$-phase acyclic PH distributions. Note that the Erlang-$n$ distribution, $E_n$, has

\begin{displaymath}
m_2^{E_n} = \frac{n+1}{n}
\quad\mbox{and}\quad
r^{E_n} = \frac{n+2}{n+1}.
\end{displaymath}

Property (i) of the Erlang-$n$ distribution immediately follows from the prior work by Aldous and Shepp [5] and O'Cinneide [142], who prove that the set of Erlang-$n$ distributions is the unique class of $n$-phase PH distributions with the least second moment among all the $n$-phase PH distributions with a fixed first moment. Thus, all that remains is to prove property (ii).

Our approach is different from Aldous and Shepp [5] and O'Cinneide [142]. Aldous and Shepp prove the least variability of the Erlang-$n$ distribution via quadratic variation (a property related to the second moment), and hence it is unlikely that their approach can be applied to prove property (ii), which relies on higher moments, in particular the $r$-value. O'Cinneide extends the work by Aldous and Shepp, considering a convex function, $f(\cdot)$, applied to a random variable having an $n$-phase PH distribution with a fixed mean. He proves, via the theory of majorization, that the expectation of $f(V)$ is minimized when the random variable $V$ has an Erlang distribution. Unfortunately, the $r$-value of a distribution, $G$, is not an expectation of $f(V)$, where $V$ is a random variable with distribution $G$, for a convex function $f(\cdot)$. Hence, the theory of majorization does not directly apply to the $r$-value.

Our proof makes use of the recursive structure of PH distributions and shows that an $n$-phase Erlang distribution has no greater $r$-value than any $n$-phase acyclic PH distribution. The key idea in our proof is that any acyclic PH distribution, $P$, can be seen as a mixture of the convolutions of exponential distributions, and one of the convolutions of exponential distributions has no greater $r$-value than $P$. This allows us to relate the minimal convolution to an Erlang distribution when all the rates of the exponential distributions are the same. The following lemma provides the key property of the $r$-value used in our proof.

Lemma 2   Let $Z(\cdot) = \sum_{i=1}^{n}p_iX_i(\cdot)$, where $n \geq 2$ and $X_i$ are nonnegative distributions with $\mu_1^{X_i}> 0$ for $1\leq i\leq n$. Then, there exists $i\in\{1,...,n\}$ such that $r^Z\geq r^{X_i}$.


Proof:We prove the lemma by induction on $n$. Without loss of generality, we let $r^{X_1} \geq \cdots \geq r^{X_n}$.

Base case ($n=2$): Let $v=\mu_1^{X_2} / \mu_1^{X_1}$ and $w=\mu_2^{X_2} / \mu_2^{X_1}$. Then,

\begin{eqnarray*}
\hspace{-2mm}r^Z-r^{X_2}
& = & \frac{p_1^2r^{X_1}+p_1p_2r^{X_1...
...^2}\\
& = & \frac{p_1p_2(w-v)^2r^{X_2}}{v(p_1+p_2w)^2}
\geq 0,
\end{eqnarray*}

where the first inequality follows from $r^{X_1}\geq r^{X_2}$.

Inductive case: Suppose that the lemma holds for $n\leq k$. When $n=k+1$, $Z$ can be seen as a mixture of two distributions, $Y(\cdot) = \frac{1}{1-p_{k+1}}\sum_{i=1}^{k}p_iX_i(\cdot)$ and $X_{k+1}(\cdot)$. When $r^{X_{k+1}}\leq r^Z$, the lemma holds for $n=k+1$. When $r^{X_{k+1}}> r^Z$, we have $r^Y\leq r^Z$ by the base case. By the inductive hypothesis, there exist $i\in\{1,...,k\}$ such that $r^Y\geq r^{X_i}$. Thus, the lemma holds for $n=k+1$.     width 1ex height 1ex depth 0pt

We are now ready to prove that an $n$-phase Erlang distribution has no greater $r$-value than any $n$-phase PH distribution, which completes the proof of Lemma 1. A proof of the following lemma is postponed to Appendix A.

Lemma 3   The Erlang distribution has the least $r$-value among all the acyclic PH distribution with a fixed number of phases, $n$, for all $n\geq 1$.


next up previous contents
Next: EC distribution Up: Moment matching algorithm Previous: Characterizing PH distributions   Contents
Takayuki Osogami 2005-07-19