Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!doc.ic.ac.uk!agate!ames!han.hana.nm.kr!ccsun2.sogang.ac.kr!news
From: vibrato@ailab1.sogang.ac.kr (Yang JoonYong)
Subject: VQ-based HMM Question
Message-ID: <1993Oct5.064907.6327@ccsun2.sogang.ac.kr>
Sender: news@ccsun2.sogang.ac.kr
Organization: Sogang University
X-Newsreader: TIN [version 1.1 PL9]
Date: Tue, 5 Oct 93 06:49:07 GMT
Lines: 25



Hi, all..!

I'm a beginner in the speech recognition field and have been experimenting with digit recognition. I'm trying to build a VQ-based discrete HMM system similar to the one described in the 1983 Bell paper (Rabiner et al.).

I have about 15000 frames of 12-dimensional LPC-cepstral coefficients, extracted from 700 samples (70 samples per digit), and I have generated a 256-vector codebook from them using the K-means method.
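To make the codebook step concrete, here is a minimal sketch of what I mean by K-means, assuming plain Lloyd-style iterations with squared Euclidean distance on the cepstral vectors. The function names, the iteration count, the random initialisation, and the toy random data are all illustrative assumptions, not my actual code or data.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_codebook(frames, k, iters=20, seed=0):
    # Assumption: initialise the codebook with k randomly chosen frames.
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, k)]
    dim = len(frames[0])
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        # Assignment step: each frame goes to its nearest codeword.
        for f in frames:
            j = min(range(k), key=lambda i: sq_dist(f, codebook[i]))
            counts[j] += 1
            for d in range(dim):
                sums[j][d] += f[d]
        # Update step: move each codeword to the mean of its cell.
        for j in range(k):
            if counts[j]:  # an empty cell keeps its old codeword
                codebook[j] = [s / counts[j] for s in sums[j]]
    return codebook

# Toy stand-in for the real data: 500 random 12-dimensional frames, 8 codewords.
rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(12)] for _ in range(500)]
codebook = kmeans_codebook(frames, k=8)
print(len(codebook), len(codebook[0]))
```

In my setup the same procedure would be run with k=256 on the roughly 15000 real frames.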

Using this codebook, I have encoded each speech sample, and I have noticed that the encoded discrete symbol sequences differ greatly from one another.
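By "encoded" I mean ordinary nearest-neighbour quantization: each frame is replaced by the index of the closest codeword. A small sketch follows, assuming squared Euclidean distance; the name vq_encode and the tiny 2-dimensional codebook are made up for illustration only.

```python
def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def vq_encode(frames, codebook):
    # Map each frame to the index of its nearest codeword.
    return [
        min(range(len(codebook)), key=lambda i: sq_dist(f, codebook[i]))
        for f in frames
    ]

# Hypothetical 3-entry codebook and a 4-frame "utterance".
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
utterance = [[0.1, -0.2], [0.9, 1.2], [1.1, 0.8], [3.7, 0.3]]
symbols = vq_encode(utterance, codebook)
print(symbols)  # [0, 1, 1, 2]
```

The real system does the same thing with the 256-entry codebook and 12-dimensional frames, giving one symbol per frame.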
For example, here are some of the encoded sequences for the 70 speech samples of the digit 'zero':

 1st) 218 51 104 104 104 104 104 104 5 5 5 5 249 79 249 249 249 249 78 78
 2nd) 194 194 194 31 170 170 170 170 170 34 34 34 56 56 56 56 56 56 56 74
 3rd) 31 142 31 31 31 194 194 142 142 142 142 194 194 194 194 194 142 31 142 142
   .............
70th) 235 89 218 126 126 126 126 126 126 126 126 126 126 126 126 126 51 239

As shown above, the symbol sequences look completely dissimilar to one another.
Is this an expected and correct outcome? I don't understand it. In the end, I got a very poor recognition rate using these sequences. Thanks a lot for your help!
 

     E-mail: vibrato@ailab1.sogang.ac.kr


