Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!doc.ic.ac.uk!agate!library.ucla.edu!news.mic.ucla.edu!unixg.ubc.ca!acs.ucalgary.ca!cpsc.ucalgary.ca!hill
From: hill@cpsc.ucalgary.ca (David Hill)
Subject: Re: speech, non-speech discrimination
Message-ID: <CIs36y.BDn@cpsc.ucalgary.ca>
Sender: news@cpsc.ucalgary.ca (News Manager)
Organization: University of Calgary Computer Science
References: <63442@ogicse.ogi.edu>
Date: Wed, 29 Dec 1993 04:01:45 GMT
Lines: 44

In article <63442@ogicse.ogi.edu> ldcolton@chico.cse.ogi.edu (L Don Colton) writes:
>For my ph.d. research proficiency exam, i am interested in identifying
>speech versus non-speech (background noises, etc) in telephone
>waveforms. The idea of course is that by first identifying the time
>periods that are speech, subsequent stages of recognition can be more
>productive.
>
>My cursory literature search on speech / non-speech in general (not
>limited to telephone speech) turned up only a very small handful of
>references (less than six, of varying antiquity). I am left to wonder
>whether
>
>  (a) this is too easy, so no one deems it important enough to write
>about, or
>
>  (b) this is too difficult, and no one knows how to do it, or
>
>  (c) this is written up in journals or proceedings that are outside
>my pitiful excuse for a literature search.
>
>I would very gratefully accept observations, opinions, miscellaneous
>feedback (and references!). Email me or post here, as I read this
>newsgroup religiously. Thanks much!
>
>-- 
>Don Colton               ___e     Center for Spoken Language Understanding
>ldcolton@cse.ogi.edu   _`\ <;     Oregon Graduate Institute, 20000 NW Walker Rd
>bicycle commuter______(_)/_(_)____P.O.Box 91000, Portland, OR 97291-1000

Because voiced waveforms in speech are fairly heavily damped, it turns out
that speech tends to be much more asymmetrical in the time domain than
other sorts of noise you may pick up.  That is, if you compare the rms value
of the negative-going half-cycles with the rms value of the positive-going
half-cycles, the difference between the two measures will be much greater
when voiced speech is present than when it is not.  I think a guy called
Dersch, working for IBM (his machine was called "Shoebox"), actually used
this fact in his processing.
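
To make the comparison above concrete, here is a rough sketch in Python.
It is not Dersch's method or anything tested on real telephone speech.
The function names, the DC-offset removal, the normalisation by the sum of
the two rms values, and the synthetic test signal are all my own
assumptions for illustration.

```python
import math

def half_wave_rms(samples):
    """Return (rms of positive samples, rms of negative samples).

    Assumption: the frame mean is removed first so that a DC offset
    does not masquerade as asymmetry.
    """
    mean = sum(samples) / len(samples)
    centred = [s - mean for s in samples]
    pos = [s for s in centred if s > 0]
    neg = [s for s in centred if s < 0]

    def rms(xs):
        return math.sqrt(sum(x * x for x in xs) / len(xs)) if xs else 0.0

    return rms(pos), rms(neg)

def asymmetry(samples):
    """Normalised difference between the two half-wave rms values.

    0.0 means a symmetric frame; values nearer 1.0 mean one polarity
    dominates.  The normalisation is an illustrative choice.
    """
    pos, neg = half_wave_rms(samples)
    total = pos + neg
    return abs(pos - neg) / total if total else 0.0

def damped_pulse_train(periods=10, period_len=80, cycle_len=16, decay=10.0):
    """Hypothetical stand-in for voiced speech: a sinusoid that is
    re-excited at the start of every pitch period and then decays."""
    out = []
    for n in range(periods * period_len):
        t = n % period_len
        out.append(math.exp(-t / decay) * math.sin(2 * math.pi * t / cycle_len))
    return out

if __name__ == "__main__":
    voiced_like = damped_pulse_train()
    # Undamped tone as a stand-in for a symmetric, non-speech signal.
    tone = [math.sin(2 * math.pi * n / 16 + 0.3) for n in range(800)]
    print("damped pulse train:", asymmetry(voiced_like))
    print("steady tone:       ", asymmetry(tone))
```

On these synthetic signals the damped pulse train scores well above the
steady tone.  Any decision threshold for real data would have to be
found empirically, frame by frame.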

david

-- 
david hill: hill@cpsc.ucalgary.ca	|	Imagination is more
voice: 403-282-6481, fax: 403-282-6778	|	important than knowledge.
nextmail: hill@trillium.ab.ca		|		(Albert Einstein)
