Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!pipex!uunet!munnari.oz.au!manuel!nimbus!tridge
From: tridge@nimbus.anu.edu.au (Andrew Tridgell)
Subject: Re: Help on design of speech recognition
Message-ID: <1992Oct6.233648.7486@newshost.anu.edu.au>
Keywords: speech recognition
Sender: tridge@nimbus (Andrew Tridgell)
Organization: Comp. Sci. Lab., Australian National Uni.
References:  <lels.9.718355427@unpcs1.cs.unp.ac.za>
Date: Tue, 6 Oct 92 23:36:48 GMT
Lines: 32

In article <lels.9.718355427@unpcs1.cs.unp.ac.za>, lels@unpcs1.cs.unp.ac.za (Leonard Els) writes:
|> I am working on a Speaker Independent Speech Recognition system and was
|> wondering what the best measurements are, to use in the analysis.
|> 
|> I have come across the following: Zero-Crossing rate, Energy, LPC,
|> Cepstral Coefficients, and formant analysis

For speaker independent recognition the most popular are probably
the cepstrals.

There are problems with using formants - not the least of which is
finding out how to calculate them! If you know of some way to robustly
calculate their values in a speaker independent way then please share
it with us.

My own personal favourite is the engineering approach. This is to throw
all the ingredients which might be useful into a cooking pot and stir
vigorously.

The way I do this is to use LDA (linear discriminant analysis) on a
feature vector comprising a few of the common parameters (maybe
cepstrals, delta-cepstrals, band-power, delta-band-power, zcr and
energy). This gives me a new feature vector of reduced dimension, but
where each parameter discriminates well for the task at hand.
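
To make the LDA step concrete, here is a minimal sketch in Python. It
assumes numpy is available and uses the textbook multi-class formulation
(maximise between-class scatter relative to within-class scatter), which
may differ in detail from what any particular recogniser does. The name
lda_projection, the small ridge term and the toy data are all made up
for illustration: the random vectors merely stand in for real frames of
cepstrals, band-powers and so on, and the labels for whatever classes
(phonemes, words) you want to separate.

```python
import numpy as np

def lda_projection(X, y, n_components):
    """Return an (n_features, n_components) projection matrix W.

    Columns of W solve Sb w = lambda Sw w for the largest lambdas, where
    Sw is the within-class and Sb the between-class scatter matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_features = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((n_features, n_features))
    Sb = np.zeros((n_features, n_features))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centred = Xc - mean_c
        Sw += centred.T @ centred
        diff = (mean_c - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Assumption: a tiny ridge keeps Sw positive definite if features are
    # nearly collinear; it is not part of the basic method.
    Sw += 1e-6 * np.trace(Sw) / n_features * np.eye(n_features)
    # Whiten with the Cholesky factor of Sw so that the generalised
    # eigenproblem becomes an ordinary symmetric one.
    L = np.linalg.cholesky(Sw)
    Linv = np.linalg.inv(L)
    eigvals, eigvecs = np.linalg.eigh(Linv @ Sb @ Linv.T)
    order = np.argsort(eigvals)[::-1][:n_components]
    return Linv.T @ eigvecs[:, order]

# Toy data: 3 classes, 6 "parameters" per frame, of which only the
# first two actually differ between classes (the rest are pure noise).
rng = np.random.default_rng(0)
class_means = np.zeros((3, 6))
class_means[1, 0] = 4.0
class_means[2, 1] = 4.0
y = np.repeat(np.arange(3), 200)
X = class_means[y] + rng.normal(size=(600, 6))

W = lda_projection(X, y, n_components=2)
Z = X @ W          # reduced feature vectors, one row per frame
print(Z.shape)     # (600, 2)
```

With C classes there are at most C-1 useful discriminant directions, so
n_components larger than that buys nothing. In practice you would
estimate W on labelled training frames and then apply the same W to
everything that follows.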


Andrew

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Andrew Tridgell                 CSLab, Research School of Physical Sciences
Andrew.Tridgell@anu.edu.au      Australian National University (x3064)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
