Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!pipex!uknet!cam-eng!jn106
From: jn106@eng.cam.ac.uk (J.A. Nolazco Flores)
Subject: Technical report available
Sender: jn106@eng.cam.ac.uk (J.A. Nolazco Flores)
Message-ID: <1993May28.100041.10611@eng.cam.ac.uk>
Date: Fri, 28 May 1993 10:00:41 GMT
Nntp-Posting-Host: dsl.eng.cam.ac.uk
Organization: Cambridge University Engineering Department, UK
Lines: 104

The following technical report is available by anonymous ftp from the
archive of the Speech, Vision and Robotics Group at the Cambridge
University Engineering Department.

          ADAPTING A HMM-BASED RECOGNISER FOR NOISY SPEECH 
                  ENHANCED BY SPECTRAL SUBTRACTION

                J. A. Nolazco Flores and S. J. Young

               Technical Report CUED/F-INFENG/TR 123

            Cambridge University Engineering Department 
                        Trumpington Street 
                        Cambridge CB2 1PZ 
                             England 


                             Abstract

Training HMMs under the same conditions as those met in recognition
makes the models learn not only the features of the speech, but also
those of the environment. Matched training therefore gives better
recognition performance, but building models for every possible
environment is impractical. One way to solve this problem is to
compensate models trained on clean speech to give `artificially'
adapted models. The goal of these noise adaptation techniques is to
reach the same recognition performance as would be obtained by
training in the noisy conditions.  Parallel Model Combination (PMC) is
one adaptation technique which has been successful in adapting a clean
speech model to noise by automatically generating `noisy speech
models'.

However, even training in noise can only achieve limited recognition
performance, because the high variance at low SNR makes the features
begin to overlap, which makes the discrimination problem more
difficult. The problem is even worse when the vocabulary grows; for
example, some experiments have shown that recognition performance is
below 80% at 0 dB, even when training and testing were in the same
environment.
Therefore, in very noisy environments, or when the vocabulary grows,
even training in noise is not enough to obtain good recognition
performance. In order to improve recognition performance in very noisy
environments, some sort of enhancement technique may be useful. An
enhancement scheme could improve the SNR, minimise the variance, or
emphasise the main features of the signal of interest. However, all
of these improvements usually come at the expense of signal
distortion. By minimising both signal distortion and noise, a signal
with better features and lower variability is obtained. However, if we
want to exploit the strengths of both the noise adaptation techniques
and the enhancement techniques, then we need to compensate the speech
models for the distorted signal. In other words, we need to adapt the
models to the enhanced signal.

In this work, we study how to adapt clean speech models to a signal
enhanced by Spectral Subtraction (SS). This scheme improves the SNR,
but at the expense of signal distortion. Nevertheless, it has been
successful both for signal enhancement and for speech recognition in
noisy environments. Here, the distortion is compensated for so that
SS can deal with very noisy environments. It will be shown that
the signal distortion can be represented in the linear domain by a
correction term. PMC transforms the noise and speech model parameters
from the cepstral domain to the linear domain, adds these parameters,
and then creates an adapted model by returning to the cepstral domain.
Therefore, PMC can be modified to compensate an SS distorted signal in
the linear domain by including the correction term. This modified
version of PMC will be called the SS-PMC method.
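
As a rough illustration of the cepstral -> linear -> cepstral round trip
described above, here is a minimal sketch that combines mean vectors
only. It is not the method of the report: variances are ignored, the
cepstra are assumed to come from a square orthonormal DCT with no
truncation, and the `correction' argument is a hypothetical additive
placeholder, since the actual form of the SS correction term is given
in the report and not in this abstract. All function names are
illustrative.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); its transpose is its inverse."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                     for j in range(n)])
    return rows

def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def pmc_combine_means(speech_cep, noise_cep, correction=None):
    """Mean-only model combination (assumption: variances ignored).

    speech_cep, noise_cep: cepstral mean vectors of equal length.
    correction: hypothetical additive linear-domain term standing in
                for an SS correction; defaults to zero (plain combination).
    """
    n = len(speech_cep)
    c = dct_matrix(n)
    c_inv = transpose(c)
    if correction is None:
        correction = [0.0] * n
    # Cepstral -> log-spectral (inverse DCT) -> linear (exp).
    speech_lin = [math.exp(x) for x in matvec(c_inv, speech_cep)]
    noise_lin = [math.exp(x) for x in matvec(c_inv, noise_cep)]
    # Add speech, noise and the placeholder correction in the linear domain.
    combined = [s + z + d for s, z, d in zip(speech_lin, noise_lin, correction)]
    # Floor before the log so a negative correction cannot break it.
    combined_log = [math.log(max(x, 1e-10)) for x in combined]
    # Log-spectral -> cepstral (forward DCT).
    return matvec(c, combined_log)
```

With `correction' left at its default the function reduces to a plain
mean-only combination of the speech and noise models; a non-zero
vector shows where a linear-domain correction could enter.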

The results obtained by the SS-PMC technique are very encouraging,
showing that adaptation techniques are an effective way to compensate
for the signal distortion that is a side effect of an SS-based
enhancement scheme.

************************ How to obtain a copy ************************

a) Via FTP:

unix> ftp svr-ftp.eng.cam.ac.uk
Name: anonymous
Password: (type your email address)
ftp> cd reports
ftp> binary
ftp> get nolazco_tr123.ps.Z
ftp> quit
unix> uncompress nolazco_tr123.ps.Z
unix> lpr nolazco_tr123.ps (or however you print PostScript)

b) Via postal mail:

Request a hardcopy from

J. A. Nolazco Flores,
Cambridge University Engineering Department, 
Trumpington Street, 
Cambridge CB2 1PZ,
England.

or email me: jn106@eng.cam.ac.uk

