Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!doc.ic.ac.uk!uknet!pipex!howland.reston.ans.net!usc!elroy.jpl.nasa.gov!ames!pacbell.com!att-out!walter!spiegel@bellcore.com
From: spiegel@bellcore.com (Murray Spiegel)
Subject: Re: Human ear
Message-ID: <CF7Hw2.J1r@walter.bellcore.com>
Keywords: Psychoacoustics, Speech recognition
Sender: news@walter.bellcore.com
Nntp-Posting-Host: din.bellcore.com
Organization: Bellcore (Bell Communications Research)
References:  <2a26s9$a2h@male.EBay.Sun.COM>
Date: Wed, 20 Oct 1993 17:28:49 GMT
Lines: 54

   andrewb@europe.EBay.Sun.COM (Andrew Bulucea) writes:

> Any references - please !
> I wonder if the human ear was "measured" for the following:

> 1) how many mixed tones (at the same time) a human can recognize 
> 2) How fast can we follow a signal that is changing its freq. (If
> the signal is changing too fast we perceive the signal as one 'tone'
> vs a slower freq variation)
> 3) its amplitude resolution (dB)

> I strongly believe that these factors are an important key in speech
> recognition software.

Human psychoacoustics (which "measures" the ear's signal-processing
characteristics) is a very active field, and its findings are not
entirely overlooked by those working in speech recognition.

Many of the first-order effects associated with the auditory system,
such as frequency warping, energy summation, masking, 
and other Critical Band effects, are already incorporated into 
the best automatic speech recognition systems.  Many second-order effects 
are fairly well known, but the exact manner in which they should be 
applied to ASR isn't entirely clear.
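
As an illustration of the frequency warping mentioned above, here is a
minimal sketch using the commonly quoted mel-scale approximation
m = 2595 * log10(1 + f/700).  The choice of the mel scale, the function
names, and the example band edges are assumptions made for
illustration; the post does not say which warping any particular
recognizer uses.

```python
import math

def hz_to_mel(f_hz):
    """Map frequency in Hz to mels (common 2595*log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spaced_centers(f_lo, f_hi, n):
    """Return n center frequencies (Hz) equally spaced on the mel axis.

    Hypothetical helper: n must be at least 2.
    """
    m_lo, m_hi = hz_to_mel(f_lo), hz_to_mel(f_hi)
    step = (m_hi - m_lo) / (n - 1)
    return [mel_to_hz(m_lo + i * step) for i in range(n)]

if __name__ == "__main__":
    # Equal steps in mels become progressively wider steps in Hz.
    for fc in mel_spaced_centers(100.0, 4000.0, 10):
        print(round(fc, 1))
```

Channels spaced this way are narrow at low frequencies and wide at high
frequencies, which is the qualitative behavior a warped front end is
meant to capture.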

Most basic psychoacoustic research, which appears to be the thrust 
of your note, is published in the Journal of the Acoustical Society
of America.  I would suggest searching the areas categorized in the 
volume indexes as 43.66 (Psychological Acoustics), 43.71 (Speech Perception),
and 43.64 (Physiological Acoustics).  (If you are unfamiliar with JASA,
check their cumulative index, published every 5 years.)

For topic (1), I suggest searching out the papers Brian CJ Moore 
and Brian R Glasberg published in 1984-85.  The frequency separation 
of tones that can be isolated is roughly related to the Critical Band.
See also more recent material by Houtsma.
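
To make "roughly related to the Critical Band" concrete, here is a
rough sketch that compares the separation of two tones with an
estimate of the critical bandwidth at their mean frequency.  The
bandwidth formula is the analytic approximation usually attributed to
Zwicker and Terhardt (1980), not taken from the Moore and Glasberg
papers, and the "separation greater than one critical band" rule is an
assumed rule of thumb rather than a measured threshold.

```python
def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth (Hz) at center frequency f_hz.

    Analytic fit: 25 + 75 * (1 + 1.4 * f_kHz**2) ** 0.69
    """
    f_khz = f_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69

def within_one_critical_band(f1_hz, f2_hz):
    """True if the two tones are closer than one critical bandwidth.

    Assumption: the bandwidth is evaluated at the arithmetic mean
    of the two frequencies.
    """
    center = 0.5 * (f1_hz + f2_hz)
    return abs(f2_hz - f1_hz) < critical_bandwidth_hz(center)

if __name__ == "__main__":
    print(critical_bandwidth_hz(1000.0))           # roughly 160 Hz
    print(within_one_critical_band(1000.0, 1050.0))
    print(within_one_critical_band(1000.0, 1500.0))
```

Tones that fall inside one band by this estimate would be expected to
interact strongly; tones well outside it are more likely to be heard
out separately.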

I don't recall the primary reference for (2), but there 
are related studies, the original of which is quite old 
(perhaps from the '40s), on two-tone dissonance.  
As delta f (the frequency separation of the two tones) increases from 0,
the sensations/perceptions move from a warbling tone, to roughness,
through dissonance, to 2 distinct tones.
Again, the boundaries are roughly related to critical bandwidths.
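
The warbling at small delta f can be seen directly in the signal: the
sum of two equal-amplitude tones is identical to a single tone at the
mean frequency whose amplitude is modulated, giving |f2 - f1| beats per
second.  The sketch below checks that identity numerically.  It
describes only the waveform; the function names are hypothetical, and
where warbling gives way to roughness or to two separate tones is a
perceptual question the code does not attempt to answer.

```python
import math

def two_tone(t, f1_hz, f2_hz):
    """Sum of two unit-amplitude sine tones at time t (seconds)."""
    return (math.sin(2.0 * math.pi * f1_hz * t)
            + math.sin(2.0 * math.pi * f2_hz * t))

def carrier_times_envelope(t, f1_hz, f2_hz):
    """Same signal written as a mean-frequency carrier times an envelope.

    Uses sin A + sin B = 2 sin((A + B)/2) cos((A - B)/2).
    """
    mean_f = 0.5 * (f1_hz + f2_hz)
    half_df = 0.5 * (f2_hz - f1_hz)
    envelope = 2.0 * math.cos(2.0 * math.pi * half_df * t)
    return envelope * math.sin(2.0 * math.pi * mean_f * t)

def beat_rate_hz(f1_hz, f2_hz):
    """Number of envelope maxima per second, equal to |f2 - f1|."""
    return abs(f2_hz - f1_hz)

if __name__ == "__main__":
    f1, f2 = 1000.0, 1004.0
    print(beat_rate_hz(f1, f2))
    for k in range(5):
        t = k * 0.0123
        print(two_tone(t, f1, f2), carrier_times_envelope(t, f1, f2))
```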

The just-noticeable-difference (jnd) threshold for intensity increments
of tones is widely quoted as roughly 0.5-1.0 dB; the threshold
depends on several factors.  See L Braida and N Durlach's series 
on intensity perception from 1969-1980; Jesteadt, Wier and Green
(1977) recalibrated the classic studies of intensity discrimination.
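
For a sense of scale, a jnd quoted in dB can be converted to a relative
intensity change (the Weber fraction, delta I / I) with
delta L = 10 * log10(1 + delta I / I).  The sketch below does only that
arithmetic; the function names are hypothetical, and the 0.5 and 1.0 dB
inputs are simply the ends of the range quoted above.

```python
import math

def weber_fraction_from_db(delta_db):
    """Relative intensity increment delta I / I for a level change in dB."""
    return 10.0 ** (delta_db / 10.0) - 1.0

def db_from_weber_fraction(w):
    """Level change in dB for a relative intensity increment w."""
    return 10.0 * math.log10(1.0 + w)

if __name__ == "__main__":
    for jnd_db in (0.5, 1.0):
        w = weber_fraction_from_db(jnd_db)
        print(jnd_db, "dB ->", round(100.0 * w, 1), "% intensity change")
```

So a 1 dB jnd corresponds to an intensity increase of about 26%, and a
0.5 dB jnd to about 12%.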

- Murray Spiegel
  Bellcore

