Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!warwick!doc.ic.ac.uk!agate!library.ucla.edu!news.ucdavis.edu!madrone.ece.ucdavis.edu!lowerre
From: lowerre@madrone.ece.ucdavis.edu (Bruce Lowerre)
Subject: Re: speech, non-speech discrimination
Message-ID: <CIrI8y.67I@ucdavis.edu>
Sender: usenet@ucdavis.edu (News Administrator)
Organization: U.C. Davis - Department of Electrical Engineering and Computer Science
References: <63442@ogicse.ogi.edu>
Date: Tue, 28 Dec 1993 20:29:20 GMT
Lines: 27

>For my Ph.D. research proficiency exam, I am interested in identifying
>speech versus non-speech (background noises, etc) in telephone
>waveforms. The idea of course is that by first identifying the time
>periods that are speech, subsequent stages of recognition can be more
>productive.
>
>My cursory literature search on speech / non-speech in general (not
>limited to telephone speech) turned up only a very small handful of
>references (fewer than six, of varying antiquity). I am left to wonder
>whether
>
>  (a) this is too easy, so no one deems it important enough to write
>      about, or
>
>  (b) this is too difficult, and no one knows how to do it, or
>
>  (c) this is written up in journals or proceedings that are outside
>      my pitiful excuse for a literature search.

My guess is (b).  Identifying the voiced parts of speech may be easy,
depending on the type of "noise" one is trying to exclude.  If
you're trying to exclude background music, then you have a very
difficult task, because music is itself pitched and harmonic, much
like voiced speech.  Also, the unvoiced parts of speech look
suspiciously like background white noise.  The real fun is dealing
with speech contaminated with noise.
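
To make the voiced/unvoiced point concrete, here is a rough sketch (not
a recommended detector) of the two classic per-frame features,
short-time energy and zero-crossing rate.  Everything in it is an
assumption for illustration: the 8 kHz rate, the 30 ms frames, the
hand-picked thresholds, and the synthetic signals (a 120 Hz tone
standing in for voiced speech, differenced noise standing in for an
unvoiced fricative).

```python
import math
import random

RATE = 8000        # assumed telephone-band sampling rate, Hz
FRAME = 240        # 30 ms frames at 8 kHz (an illustrative choice)

def frame_features(samples, frame_len=FRAME):
    """Return a (mean energy, zero-crossing rate) pair per non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between neighbouring samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

def classify(energy, zcr, energy_thresh=0.05, zcr_thresh=0.25):
    """Hypothetical hand-picked thresholds; real data would need tuning."""
    if energy >= energy_thresh and zcr < zcr_thresh:
        return "voiced-like"
    return "noise-like"   # unvoiced speech and white noise both end up here

random.seed(0)
n = RATE  # one second of each signal

# Crude stand-in for voiced speech: a 120 Hz tone.
voiced = [0.5 * math.sin(2 * math.pi * 120 * i / RATE) for i in range(n)]
# Background white noise.
white = [random.gauss(0.0, 0.3) for _ in range(n)]
# Crude stand-in for an unvoiced fricative: differenced (high-passed) noise.
raw = [random.gauss(0.0, 0.3) for _ in range(n + 1)]
unvoiced = [raw[i + 1] - raw[i] for i in range(n)]

labels = {
    name: [classify(e, z) for e, z in frame_features(sig)]
    for name, sig in (("voiced", voiced), ("white noise", white), ("unvoiced", unvoiced))
}

for name, frame_labels in labels.items():
    print(name, sorted(set(frame_labels)))
```

The tone separates cleanly on a low zero-crossing rate, but the
fricative stand-in and the white noise land in the same bin, which is
the unvoiced problem in miniature.  Neither feature says anything
useful about music or about speech mixed with noise.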

