Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!doc.ic.ac.uk!warwick!pipex!howland.reston.ans.net!vixen.cso.uiuc.edu!cs.uiuc.edu!asimov!blix
From: blix@asimov.cs.uiuc.edu (Gunnar Blix)
Subject: Re: Human ear
Message-ID: <CF9J2y.EFM@cs.uiuc.edu>
Keywords: Psychoacoustics, Speech recognition
Sender: news@cs.uiuc.edu
Organization: University of Illinois, Dept. of Comp. Sci., Urbana, IL
References: <2a26s9$a2h@male.EBay.Sun.COM> <CF7Hw2.J1r@walter.bellcore.com>
Date: Thu, 21 Oct 1993 19:49:45 GMT
Lines: 55

spiegel@bellcore.com (Murray Spiegel) writes:

>   andrewb@europe.EBay.Sun.COM (Andrew Bulucea) writes:

>> Any references - please !
>> I wonder if the human ear was "measured" for the following:

>> 1) how many mixed tones (at the same time) a human can recognize
>> 2) how fast we can follow a signal that is changing its freq. (if
>> the signal is changing too fast we perceive it as one 'tone', vs. a
>> slower freq. variation)
>> 3) its amplitude resolution (dB)

>> I strongly believe that these factors are an important key in speech
>> recognition software.

>Human psychoacoustics (which "measures" the ear's signal-processing 
>characteristics) is a very active field, and its findings are not entirely 
>overlooked by those working in speech recognition.

>Many of the first-order effects associated with the auditory system,
>such as frequency warping, energy summation, masking, 
>and other Critical Band effects, are already incorporated into 
>the best automatic speech recognition systems.  Many second-order effects 
>are fairly well known, but the exact manner in which they should be 
>applied to ASR isn't entirely clear.

Although many of these effects have indeed been taken into account in
the design of speech recognition systems as a whole, and of the
auditory models that serve as their front ends, in few cases have
these systems actually been *measured* to show that they exhibit the
desired (or at least comparable) characteristics.  In a project not
yet published, Gary Bradshaw and I have run some of the basic
psychoacoustic tests on several auditory models commonly used in
speech recognition systems, and we find that the measured values are
often an order of magnitude lower than the theoretical claims would
suggest.

>Most basic psychoacoustic research, which appears to be the thrust 
>of your note, is published in the Journal of the Acoustical Society
>of America.  I would suggest searching the areas categorized in the 
>volume indexes as 43.66 (Psychological Acoustics), 43.71 (Speech Perception), 
>and 43.64 (Physiological Acoustics).  (If you are unfamiliar with JASA, 
>check their cumulative index, published every 5 yrs.)

Brian C. J. Moore's book "An Introduction to the Psychology of
Hearing" (now in its third edition, published by Academic Press)
outlines the basic research on human psychoacoustics.  Easy reading,
and a good place to start.

--
******************************************************************
* Gunnar Blix      * Good advice is one of those insults that    *
* blix@cs.uiuc.edu * ought to be forgiven.              -Unknown *
******************************************************************
