Newsgroups: comp.speech
Path: lyra.csx.cam.ac.uk!warwick!zaphod.crihan.fr!jussieu.fr!centre.univ-orleans.fr!univ-lyon1.fr!swidir.switch.ch!scsing.switch.ch!xlink.net!howland.reston.ans.net!europa.eng.gtefsd.com!newsxfer.itd.umich.edu!nntp.cs.ubc.ca!torn!watserv2.uwaterloo.ca!watserv1!anderson
From: anderson@crypto2.uwaterloo.ca (Bill Anderson)
Subject: Entropy measure for speech
Message-ID: <CpwHtt.5Gz@watserv1.uwaterloo.ca>
Sender: news@watserv1.uwaterloo.ca
Organization: University of Waterloo
Date: Mon, 16 May 1994 15:03:28 GMT
Lines: 26

I have been attempting to calculate the entropy of
a digitized speech waveform.  The first-order entropy
is easy: I determine a maximum-likelihood estimate of
the pmf of the process, assume it is i.i.d., and from
that estimate calculate the entropy as -\sum_k p(k) \log p(k).
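A minimal sketch of that first-order (plug-in) estimate, assuming the
samples arrive as a plain sequence of integer codes:

```python
import math
from collections import Counter

def first_order_entropy(samples):
    """Maximum-likelihood (plug-in) estimate of first-order entropy
    in bits per sample, treating the samples as i.i.d."""
    counts = Counter(samples)
    n = len(samples)
    # H = -sum_k p(k) * log2 p(k), with p(k) = count(k)/n
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Example: equiprobable binary symbols give 1 bit per sample.
print(first_order_entropy([0, 1, 0, 1]))  # prints 1.0
```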

Of course, an accurate measure of entropy is not that simple
because the real speech signal is highly correlated.  I would
have to determine a joint-pmf function for a set of consecutive
samples that is large enough to contain all correlation effects
and calculate entropy from that.  Trouble is, this state-space
can be huge.  For mu-law speech, for instance, if we assume only
10 consecutive samples are correlated, we still require a state
space of 256^10 (about 10^24) cells to hold the joint-pmf estimator.
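For small block lengths the brute-force joint estimate is still
feasible; a sketch (same plug-in estimator as above, applied to
length-m blocks) might look like this:

```python
import math
from collections import Counter

def block_entropy(samples, m):
    """Plug-in entropy (bits) of length-m blocks of the sequence.
    The per-sample rate is block_entropy(samples, m) / m.  Only
    practical for small m: 8-bit mu-law has 256**m possible blocks,
    so the estimator's state space explodes long before m = 10."""
    blocks = [tuple(samples[i:i + m]) for i in range(len(samples) - m + 1)]
    n = len(blocks)
    counts = Counter(blocks)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Per-sample entropy rate estimate for blocks of 2 samples.
print(block_entropy([0, 1, 0, 1, 0, 1], 2) / 2)
```

In practice the estimate is also badly biased unless the number of
observed blocks is much larger than the number of possible blocks,
which is part of why the direct approach breaks down so quickly.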

Does anyone know of an alternative to this brute force approach?
I would like to be able to get a fairly "accurate" measurement
for a variety of speech signals, ie: raw TIMIT (16 bits), mu-law
PCM, DPCM, ADPCM and CELP - compressed speech.  Has any work been
done in this area before?  Any pointers?

Bill Anderson
Electrical & Computer Engineering
University of Waterloo
