Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!pipex!uunet!seas.gwu.edu!marshall
From: marshall@seas.gwu.edu (Christopher Marshall)
Subject: Null Transitions in HMMs
Message-ID: <1993Jan20.031204.1652@seas.gwu.edu>
Sender: news@seas.gwu.edu
Organization: George Washington University
Date: Wed, 20 Jan 1993 03:12:04 GMT
Lines: 78

Back in October I posted a plea for help in understanding null transitions in
HMMs. My thanks to Imoto Takashi, Les Niles, and Wieland Eckert, who spent a
lot of time corresponding with me by email trying to explain them to me.

Unfortunately, although I learned some things, I never learned how
to implement null transitions in the evaluation problem of HMMs. I have
since set the problem aside and have now come back to it.

I am trying to understand them again and I need help.  My apologies to
my teachers who labored in vain the first time.

Here is a restatement of the problem I am trying to solve.  Below I describe
the notation I use in painful detail.  I am hoping it is possible to express
the answer in the same notation.

I have a Hidden Markov Model (HMM) in which symbols are output as the model
changes state, as opposed to the kind where symbols are output as the
model occupies states.

Let X1, X2, ... XT+1 be a sequence of random variables representing the
sequence of states of length T+1 that the HMM traverses.

Let Y1, Y2, ... YT be a sequence of random variables representing
the sequence of output symbols of length T that the HMM produces.

Let the HMM have N states numbered 1,2,...,N.

Let the HMM have an output symbol alphabet of size M, with the letters
numbered 1,2,...,M.

Let the initial state probabilities P(X1=i) be denoted PI(i), i=1,2,...,N.

Let the transition probabilities P(Xt+1= j | Xt= i) be denoted A(i,j),
i,j=1,2,...,N.

Let the output symbol probabilities P(Yt= k | Xt= i, Xt+1= j) be denoted
B(i,j,k), i,j=1,2,...,N, and k=1,2,...,M.

Now, evaluate P(Y1..YT=y1..yT) in terms of y1..yT, PI(i), A(i,j), and B(i,j,k).
This is the evaluation problem.

As long as all allowed transitions produce output symbols, the solution is:

P(Y1..YT=y1..yT)=
   sum over xT+1 of
   sum over xT of  B(xT,xT+1,yT) * A(xT,xT+1) *
   sum over xT-1 of  B(xT-1,xT,yT-1) * A(xT-1,xT) *
      ...
   sum over x2 of  B(x2,x3,y2) * A(x2,x3) *
   sum over x1 of  B(x1,x2,y1) * A(x1,x2) * PI(x1).
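
To make the nested sum concrete, here is a minimal Python sketch that
evaluates it literally, by enumerating every state sequence x1..xT+1. It is
only an illustration under my own assumptions: states and symbols are
numbered from 0 rather than 1, PI, A and B are plain nested lists, and the
2-state, 2-symbol model and the name brute_force are made up for the example.
The cost grows as N^(T+1), so it is useful only as a check on small cases.

```python
from itertools import product

# Hypothetical toy model (not from any real data): N=2 states, M=2 symbols.
PI = [0.6, 0.4]                      # PI[i]      = P(X1 = i)
A = [[0.7, 0.3],                     # A[i][j]    = P(Xt+1 = j | Xt = i)
     [0.4, 0.6]]
B = [[[0.9, 0.1], [0.2, 0.8]],       # B[i][j][k] = P(Yt = k | Xt = i, Xt+1 = j)
     [[0.5, 0.5], [0.3, 0.7]]]


def brute_force(pi, a, b, y):
    """Sum P(x, y) over every state sequence x of length len(y) + 1."""
    n, big_t = len(pi), len(y)
    total = 0.0
    for x in product(range(n), repeat=big_t + 1):
        p = pi[x[0]]
        for t in range(big_t):
            # One factor B(xt, xt+1, yt) * A(xt, xt+1) per output symbol.
            p *= b[x[t]][x[t + 1]][y[t]] * a[x[t]][x[t + 1]]
        total += p
    return total


if __name__ == "__main__":
    print(brute_force(PI, A, B, [0, 1, 1]))
```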

This procedure is usually not written this way but expressed recursively as
follows (called the forward algorithm):

   Let alpha(0,j)= PI(j).

   alpha(1,j)= sum over i of  B(i,j,y1) * A(i,j) * alpha(0,i).
   ...
   alpha(t,j)= sum over i of  B(i,j,yt) * A(i,j) * alpha(t-1,i).
   ...
   alpha(T,j)= sum over i of  B(i,j,yT) * A(i,j) * alpha(T-1,i).

   P(Y1..YT=y1..yT)= sum over i of alpha(T,i).
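
For reference, the recursion above can be sketched in Python as follows. This
covers only the case where every transition emits a symbol; it does not
handle null transitions. The assumptions are mine: indices start at 0, PI, A
and B are nested lists, and the toy 2-state model and the name forward are
invented for illustration.

```python
# Hypothetical toy model (not from any real data): N=2 states, M=2 symbols.
PI = [0.6, 0.4]                      # PI[i]      = P(X1 = i)
A = [[0.7, 0.3],                     # A[i][j]    = P(Xt+1 = j | Xt = i)
     [0.4, 0.6]]
B = [[[0.9, 0.1], [0.2, 0.8]],       # B[i][j][k] = P(Yt = k | Xt = i, Xt+1 = j)
     [[0.5, 0.5], [0.3, 0.7]]]


def forward(pi, a, b, y):
    """Return P(Y1..YT = y1..yT) when every transition emits a symbol."""
    n = len(pi)
    alpha = list(pi)                 # alpha(0, j) = PI(j)
    for symbol in y:
        # alpha(t, j) = sum over i of B(i,j,yt) * A(i,j) * alpha(t-1, i)
        alpha = [
            sum(b[i][j][symbol] * a[i][j] * alpha[i] for i in range(n))
            for j in range(n)
        ]
    return sum(alpha)                # sum over i of alpha(T, i)


if __name__ == "__main__":
    print(forward(PI, A, B, [0, 1, 1]))
```

Each step costs N^2 multiplications, so the whole evaluation is on the order
of T * N^2 rather than N^(T+1) for the nested sum written out in full.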

Now, I cannot for the life of me figure out how to modify this procedure
to incorporate null transitions, that is, transitions that change state
without producing an output symbol.

Does anyone know how to do this?

Thanks in advance,

Chris Marshall
marshall@seas.gwu.edu

Did I request thee, Maker, from my clay
To mould me Man, did I solicit thee
From darkness to promote me?
		Paradise Lost, X, 743-45
