Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!pipex!sunic!uts!diku!sporring
From: sporring@diku.dk (Jon Sporring)
Subject: Re: computing speech spectrograms?
Message-ID: <1993May25.135621.23497@odin.diku.dk>
Sender: sporring@embla.diku.dk
Date: Tue, 25 May 1993 13:56:21 GMT
References: <C7K30K.LKu.1@cs.cmu.edu>
Organization: Department of Computer Science, U of Copenhagen
Lines: 27

mkant+@cs.cmu.edu (Mark Kantrowitz) writes:

>I'm trying to write code to compute and display speech spectrograms, and am
>getting weird results. My code for displaying waveforms works fine, so
>the problem is probably with my understanding of how to compute
>spectrograms.

>My input is ulaw-encoded speech files (captured using a mike attached
>to a Sparc) which I've converted to linear. I pass 64 (or 128)
>datapoints at a time to a FFT routine (shifting each time by 16
>points), which returns 64 complex numbers. I interpret the magnitude
>of the ith complex number as the intensity of the sr*i/64 Hz
>frequency, where sr is the sampling rate. I plot only those
>frequencies whose intensity is greater than some appropriate threshold
>value. The resulting plots do not look like spectrograms.

>Any suggestions?

>--mark

You do not say what your sampling rate is, but each line of the spectrogram
should be the transform of about 10 ms of speech.  Furthermore, if you are
looking for the formants in the spectrogram, you should afterwards convolve
each line with a square filter of some width.  This is the same as calculating
the average magnitude in a window of that width around each point on the line.
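
As a rough illustration of the two steps above, here is a minimal sketch in
Python.  It is not taken from anyone's actual code: the function names, the
8000 Hz sampling rate used in the example, and the smoothing width of 3 bins
are all assumptions made for illustration.  The hop of 16 samples is the one
mentioned in the question.  The DFT is written out naively to keep the sketch
self-contained; a real FFT routine would replace it.

```python
import cmath
import math


def frame_length(sample_rate_hz, frame_ms=10.0):
    """Number of samples covering about frame_ms milliseconds."""
    return max(1, int(round(sample_rate_hz * frame_ms / 1000.0)))


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of a real frame (naive O(N^2) DFT).

    Bin k corresponds to the frequency k * sample_rate / N.  For real
    input the bins above N/2 mirror the ones below, so they are skipped.
    """
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def box_smooth(line, width):
    """Average each point over a window of `width` points (width odd).

    This is convolution with a square filter scaled to unit area.
    The window is clipped at the two ends of the line.
    """
    half = width // 2
    out = []
    for i in range(len(line)):
        lo = max(0, i - half)
        hi = min(len(line), i + half + 1)
        out.append(sum(line[lo:hi]) / (hi - lo))
    return out


def spectrogram(samples, sample_rate_hz, hop=16, smooth_width=3):
    """One smoothed magnitude line per frame of about 10 ms."""
    n = frame_length(sample_rate_hz)
    lines = []
    for start in range(0, len(samples) - n + 1, hop):
        frame = samples[start:start + n]
        lines.append(box_smooth(magnitude_spectrum(frame), smooth_width))
    return lines


if __name__ == "__main__":
    sr = 8000  # assumed sampling rate, not stated in the question
    tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(160)]
    lines = spectrogram(tone, sr)
    print(len(lines), "lines of", len(lines[0]), "bins each")
```

At the assumed 8000 Hz, 10 ms is 80 samples, so each line has 41 distinct
frequency bins spaced 100 Hz apart.  A wider smoothing window blurs the
individual pitch harmonics more, which is what makes the broader formant
peaks easier to see.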

                                  Jon Sporring (e-mail:sporring@diku.dk)
