Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!pipex!howland.reston.ans.net!agate!iat.holonet.net!nickm
From: nickm@iat.holonet.net (Nicholas G. Marino)
Subject: Re: VQ based HMM Question
Message-ID: <CEGsyq.16I@iat.holonet.net>
Organization: HoloNet National Internet Access System: 510-704-1058/modem
References: <1993Oct5.064907.6327@ccsun2.sogang.ac.kr>
Date: Wed, 6 Oct 1993 07:32:49 GMT
Lines: 31

ibrato@ailab1.sogang.ac.kr (Yang JoonYong) writes:
: 
: 
: Hi, all!
: I'm a beginner in the speech recognition field and have been experimenting with digit recognition. I'm trying to build a VQ-based discrete HMM system similar to the one described in the 1983 Bell paper (Rabiner et al.).
: I have about 15000 frames of 12-dimensional LPC cepstral coefficients and have generated a 256-vector codebook using the K-means method. The frames were extracted from 700 samples (70 samples per digit).
: Using this codebook, I have encoded each speech sample, and I have noticed that each encoded discrete symbol sequence is quite different from the others. 
: For example, for the digit 'zero', which has 70 speech samples:
: 
: 1 th) 218 51 104 104 104 104 104 104 5 5 5 5 249 79 249 249 249 249 78 78
: 2 th) 194 194 194 31 170 170 170 170 170 34 34 34 56 56 56 56 56 56 56 74
: 3 th) 31 142 31 31 31 194 194 142 142 142 142 194 194 194 194 194 142 31 142 142
:    .............
: 70th) 235 89 218 126 126 126 126 126 126 126 126 126 126 126 126 126 51 239
: 
: As shown above, the symbol sequences are completely dissimilar to one another.
: Is this the expected, correct outcome? I don't understand it. In the end, I got a very poor recognition rate using these sequences. Thanks a lot for your help!
: 
It looks like you're using a rate-8 (256 element) codebook.
Is the training data also taken from instances of the word 'zero'?
If so, or if the training data is similarly limited in the number of
different speech sounds present, the LBG and similar codebook construction
algorithms will produce a codebook in which many elements are acoustically
similar to each other. Frames of the same sound then get split almost
arbitrarily among those near-duplicate elements, which would account for
the many different VQ indices.

Try this - construct an 8-element codebook and examine the VQ'd data.
With such a small codebook, the templates for a given word should be
very similar to each other. If not, there's probably a bug in your code
(VQ or LPC code).
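
The check suggested above could be sketched roughly as follows. This is
only an illustration under some assumptions: the frames are taken to be
a NumPy array of shape (n_frames, 12), plain k-means with random
initialisation stands in for whatever K-means/LBG variant is actually in
use, and the names train_codebook, quantize and collapse_runs are made
up for this example.

```python
import numpy as np

def quantize(frames, codebook):
    """Return the index of the nearest codeword for each frame."""
    # Squared Euclidean distance via |x|^2 - 2 x.c + |c|^2, which avoids
    # building a large (n_frames, size, dim) intermediate array.
    d = ((frames ** 2).sum(axis=1)[:, None]
         - 2.0 * frames @ codebook.T
         + (codebook ** 2).sum(axis=1)[None, :])
    return d.argmin(axis=1)

def train_codebook(frames, size, iters=20, seed=0):
    """Plain k-means (Lloyd iterations); an assumption, not LBG splitting."""
    rng = np.random.default_rng(seed)
    # Initialise from distinct, randomly chosen training frames.
    start = rng.choice(len(frames), size=size, replace=False)
    codebook = frames[start].astype(float)
    for _ in range(iters):
        idx = quantize(frames, codebook)
        for k in range(size):
            members = frames[idx == k]
            if len(members):  # an empty cell keeps its old codeword
                codebook[k] = members.mean(axis=0)
    return codebook

def collapse_runs(indices):
    """Drop consecutive repeats so sequences are easier to compare by eye."""
    indices = np.asarray(indices)
    keep = np.ones(len(indices), dtype=bool)
    keep[1:] = indices[1:] != indices[:-1]
    return indices[keep]
```

With all training frames in one array, codebook = train_codebook(all_frames, 8)
followed by collapse_runs(quantize(utterance, codebook)) for several
'zero' utterances gives short index strings that can be compared
directly. If those still look unrelated, the LPC front end or the
nearest-neighbour search is the first place to look.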

