Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!pipex!uknet!cam-eng!mjfg
From: mjfg@eng.cam.ac.uk (M. J. F. Gales)
Subject: Technical report available
Sender: mjfg@eng.cam.ac.uk (Mark Gales)
Message-ID: <1993Jun28.151958.22148@eng.cam.ac.uk>
Date: Mon, 28 Jun 1993 15:19:58 GMT
Nntp-Posting-Host: dsl.eng.cam.ac.uk
Organization: Cambridge University Engineering Department, UK
Lines: 64

The following technical report is available by anonymous ftp from the
archive of the Speech, Vision and Robotics Group at the Cambridge
University Engineering Department.

                      THE THEORY OF SEGMENTAL
                       HIDDEN MARKOV MODELS

                   Mark Gales and Steve Young

               Technical Report CUED/F-INFENG/TR 133

            Cambridge University Engineering Department
                        Trumpington Street
                        Cambridge CB2 1PZ
                             England


                             Abstract

The most popular and successful acoustic model for speech recognition
is the Hidden Markov Model (HMM). To use HMMs for speech recognition, a
series of assumptions is made about the waveform, some of which are
known to be poor. In particular, the `Independence Assumption' implies
that each observation depends only on the state that generated it,
not on neighbouring observations. This paper describes a new form of
acoustic model, the Segmental Hidden Markov Model (SHMM), in which the
effect of the `Independence Assumption' on the observation likelihood
is greatly reduced. In the SHMM, observations are still assumed to be
independent given the state that generated them, but they are
additionally conditioned on the mean of the segment of speech to which
they belong. Re-estimation formulae are presented for the training of
both single and multiple Gaussian Inter Mixture models, and a
recognition algorithm is described. It is also shown that the standard
HMM, in both the single and multiple Gaussian mixture cases, is just a
subset of the SHMM. The new model is shown to provide better
recognition performance on a wider set of synthetic data than the
standard HMM.
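
As a rough illustration of the idea in the abstract (not the formulae
from the report), the sketch below compares the likelihood of one
segment of scalar observations under the usual independent-frame
Gaussian with a segmental version. It assumes, purely for illustration,
that a segment mean is drawn once per segment from a Gaussian around
the state mean, and that each frame is then drawn from a Gaussian
around that segment mean. The function names and the parameters
inter_var and intra_var are hypothetical.

```python
import math


def iid_loglik(obs, mean, var):
    """Log-likelihood of scalar frames under a single state Gaussian,
    treating frames as independent given the state (standard HMM)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (o - mean) ** 2 / var)
        for o in obs
    )


def segment_loglik(obs, mean, inter_var, intra_var):
    """Log-likelihood of one segment under the assumed segmental model.

    Assumption (not taken from the report): mu ~ N(mean, inter_var) is
    drawn once per segment, then each frame o_t ~ N(mu, intra_var).
    Integrating mu out gives a joint Gaussian with covariance
    intra_var * I + inter_var * 1 1^T, evaluated here in closed form.
    """
    n = len(obs)
    d = [o - mean for o in obs]
    sum_d = sum(d)
    sum_d2 = sum(x * x for x in d)
    total = intra_var + n * inter_var
    # Determinant of the covariance: intra_var^(n-1) * (intra_var + n*inter_var)
    log_det = (n - 1) * math.log(intra_var) + math.log(total)
    # Quadratic form via the Sherman-Morrison identity
    quad = (sum_d2 - inter_var * sum_d ** 2 / total) / intra_var
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)
```

With inter_var set to 0 the segmental expression reduces to the
independent-frame one, which is one way to picture the standard HMM as
a subset of the segmental model under these assumptions. A segment
whose frames all sit at a similar offset from the state mean scores
higher under segment_loglik than under iid_loglik with the same total
variance.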

************************ How to obtain a copy ************************

a) Via FTP:

unix> ftp svr-ftp.eng.cam.ac.uk
Name: anonymous
Password: (type your email address)
ftp> cd reports
ftp> binary
ftp> get gales_tr133.ps.Z
ftp> quit
unix> uncompress gales_tr133.ps.Z
unix> lpr gales_tr133.ps (or however you print PostScript)

b) Via postal mail:

Request a hardcopy from

Mark Gales,
Cambridge University Engineering Department, 
Trumpington Street, 
Cambridge CB2 1PZ,
England.

or email me: mjfg@eng.cam.ac.uk
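
For anyone who would rather script the FTP steps in (a), here is a
minimal sketch using Python's standard ftplib module. It assumes the
server still accepts anonymous FTP exactly as described above. The
function name fetch_report and its ftp_factory and dest parameters are
hypothetical, added only so the transfer can be exercised without a
network connection.

```python
from ftplib import FTP

HOST = "svr-ftp.eng.cam.ac.uk"
DIRECTORY = "reports"
FILENAME = "gales_tr133.ps.Z"


def fetch_report(email, ftp_factory=FTP, dest=FILENAME):
    """Download the compressed report by anonymous FTP.

    email is sent as the anonymous password, as in the manual session.
    ftp_factory and dest are illustrative parameters, not part of any
    official procedure.
    """
    ftp = ftp_factory(HOST)  # connects to the host
    try:
        ftp.login("anonymous", email)
        ftp.cwd(DIRECTORY)
        with open(dest, "wb") as out:
            # retrbinary performs the transfer in binary mode
            ftp.retrbinary("RETR " + FILENAME, out.write)
    finally:
        ftp.quit()
    return dest
```

Calling fetch_report("you@your.site") would leave gales_tr133.ps.Z in
the current directory; it still has to be uncompressed and printed as
in the last two steps of (a).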
