Newsgroups: sci.lang
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!udel!gatech!howland.reston.ans.net!vixen.cso.uiuc.edu!uwm.edu!fnnews.fnal.gov!gw1.att.com!gw2.att.com!pacbell.com!amdahl.com!amd!amd.com!txnews.amd.com!news
From: Brett Stewart <brett.stewart@amd.com>
Subject: looking for intelligibility metrics
Content-Type: text/plain; charset=iso-8859-1
Message-ID: <D7EntA.3wC@txnews.amd.com>
Sender: news@txnews.amd.com
Nntp-Posting-Host: mondo
Content-Transfer-Encoding: 7bit
Organization: Advanced Micro Devices, Austin, TX, USA
Mime-Version: 1.0
Date: Fri, 21 Apr 1995 22:00:43 GMT
X-Mailer: Mozilla 1.1b3 (Macintosh; I; 68K)
X-Url: news:sci.lang
Lines: 30

I would appreciate pointers to researchers or reference works on the 
above subject.  More precisely, I am seeking an accepted process for 
quantifying how coding changes the intelligibility of speech.  For 
example, if you run speech through a coder, a communication channel, 
and then a decoder, you get, say, speech-prime.  I would like to know 
whether there is an automated process that gives a quantitative 
assessment of how much the coding-channel-decoding chain garbled the 
speech into speech-prime.
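
To make the comparison concrete, here is a minimal sketch of one simple 
automated measure, segmental signal-to-noise ratio, computed between 
speech and speech-prime.  It assumes the two signals are time-aligned 
lists of samples of equal length.  The function name, the 160-sample 
frame (20 ms at 8 kHz), and the clamping limits are illustrative 
assumptions, not an accepted standard.  Note that this scores waveform 
fidelity only; it is at best a crude proxy for intelligibility.

```python
import math


def segmental_snr_db(reference, degraded, frame_len=160,
                     floor_db=-10.0, ceil_db=35.0):
    """Mean per-frame SNR in dB between two time-aligned signals.

    reference -- samples of the original speech
    degraded  -- samples of speech-prime (after coder, channel, decoder)
    frame_len, floor_db, ceil_db are assumed, illustrative defaults.
    """
    if len(reference) != len(degraded):
        raise ValueError("signals must be time-aligned and equal in length")

    frame_scores = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        deg = degraded[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        noise_energy = sum((r - d) ** 2 for r, d in zip(ref, deg))
        if signal_energy == 0.0:
            continue  # silent frame: nothing to score
        if noise_energy == 0.0:
            snr = ceil_db  # frame reproduced exactly
        else:
            snr = 10.0 * math.log10(signal_energy / noise_energy)
        # Clamp so a few extreme frames do not dominate the mean.
        frame_scores.append(min(max(snr, floor_db), ceil_db))

    if not frame_scores:
        raise ValueError("no non-silent frames to score")
    return sum(frame_scores) / len(frame_scores)


# Toy input: a 440 Hz tone sampled at 8 kHz, and a copy attenuated by 10%
# standing in for the output of a real coder-channel-decoder chain.
speech = [math.sin(2.0 * math.pi * 440.0 * n / 8000.0) for n in range(800)]
speech_prime = [0.9 * s for s in speech]
print(round(segmental_snr_db(speech, speech_prime), 2))
```

A score like this treats every frame alike, so it cannot tell whether 
the lost energy mattered to a listener.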

I suspect that the answer depends on the phonemes of the language 
being spoken.  Is there a language-specific listing of how often 
different phonemes are used, or how important they are?  As an example, 
consider the unvoiced tongue-palate 'click', which I believe occurs in 
some languages but not in English.  In English, on the phone, a click 
would sound like static, and a transformation that thought it was 
eliminating static from speech might decide to remove it.  In English, 
I reason, no speech information would be lost.  But in some other 
language, clicks might carry x percent of the information content of a 
large standard sample of speech, and removing them would therefore 
reduce the intelligibility of the speech by at least x percent.  If 
such statistics exist, they might guide the design of telecommunication 
products toward different parameters for English vs., say, B'mbutu 
pygmy language.
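
The "x percent" reasoning above can be sketched as a calculation, 
assuming phoneme counts from a large sample were available.  Under a 
simple model that treats each phoneme as independent of its neighbours 
(an assumption, and a strong one), a phoneme with relative frequency p 
contributes -p*log2(p) bits to the per-phoneme entropy, and x is the 
fraction of the total contributed by the phonemes of interest.  The 
function name and the counts below are hypothetical, made up purely for 
illustration; they are not measurements of any language.

```python
import math


def information_share(counts, targets):
    """Fraction of per-phoneme entropy contributed by the target phonemes.

    counts  -- dict mapping phoneme label to its count in a sample
    targets -- labels whose combined share is wanted (e.g. clicks)
    Assumes a context-free (unigram) model of phoneme occurrence.
    """
    total = sum(counts.values())
    probs = {label: c / total for label, c in counts.items() if c > 0}
    entropy = -sum(p * math.log2(p) for p in probs.values())
    if entropy == 0.0:
        raise ValueError("need at least two distinct phonemes")
    target_bits = -sum(probs[t] * math.log2(probs[t])
                       for t in targets if t in probs)
    return target_bits / entropy


# Hypothetical counts for an imaginary language with one click phoneme.
made_up_counts = {"a": 400, "t": 250, "n": 200, "s": 100, "click": 50}
x = 100.0 * information_share(made_up_counts, ["click"])
print(f"clicks carry about {x:.1f} percent of the information")
```

Real speech is far from context-free, so a figure obtained this way 
would only be a rough first estimate of what removing a phoneme costs.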

I want such a process in order to make quantitative quality 
assessments of telecommunication products.  Any suggestions would be 
gratefully appreciated.



