Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!doc.ic.ac.uk!agate!ames!sun-barr!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!menudo.uh.edu!lobster!nuchat!texhrc!texhrc!ak45ldp
From: someone@Texaco.com (Larry D. Pyeatt)
Subject: Re: Could S.Blaster recognize sounds??HELP
Message-ID: <1992Nov20.214205.15858@texhrc.uucp>
Sender: news@texhrc.uucp
Nntp-Posting-Host: aisun
Organization: Texaco
References:  <1992Nov10.154616.6281@gw.wmich.edu>
Date: Fri, 20 Nov 1992 21:42:05 GMT
Lines: 53

In article <1992Nov10.154616.6281@gw.wmich.edu>, x89olarte1@gw.wmich.edu writes:
|> I may be naive but that doesn't take my right to ask away  :")

That's okay, we all start out naive.

|> The sound board Soundblaster Pro has a mic input.
|> would it be too hard to make a program that once you have in a disk say
|> 100 words saved (the wavelength or whatever) . when you talk into the board 
|> w/the mic it compares it and if it's close enough it performs some action
|> corresponding to that word , and if no one matches do nothing or say
|> "repeat" or some like that ??

Well..  It depends on your definition of too hard.  If you have spent the 
last four years learning everything you can about speech recognition systems,
then you would probably be able to do it after a few weeks of heavy programming.
If, however, you are totally new to speech recognition, it will take at least
6 months of reading and experimentation before you understand enough to begin
solving the problem. I did my MS research in speech recognition with neural
networks.  I have been thinking about trying to do what you are describing.
The main drawback that I see is the fact that the PC really does not have a
lot of computing power, and it is a pain to try and do multi-tasking under
MS-DOS.

If you are really interested, you should go to the library and start reading
the scientific journals on speech recognition.  You may want to look at using
a Hidden Markov Model (HMM), although it is really not that easy to
understand at first.  You should also limit the vocabulary to about ten words
for your first attempt.  You should modularize your code as much as possible.
The major modules should perform the following:
1. signal acquisition,
2. preprocessing, and
3. recognition.
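
As a rough illustration of that split, here is a minimal Python sketch
of how the three modules might be wired together.  The names
(run_pipeline, acquire, preprocess, recognize) are made up for this
example and do not come from any particular library.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical type aliases, used only in this sketch.
Samples = Sequence[float]            # raw audio samples from the sound card
FeatureVectors = List[List[float]]   # one feature vector per analysis frame


def run_pipeline(
    acquire: Callable[[], Samples],
    preprocess: Callable[[Samples], FeatureVectors],
    recognize: Callable[[FeatureVectors], Optional[str]],
) -> Optional[str]:
    """Run the three stages in order and return the recognized word.

    Each stage is passed in as a function, so any one of them can be
    replaced (a different preprocessor, say) without touching the others.
    A result of None means that no known word matched.
    """
    samples = acquire()               # 1. signal acquisition
    features = preprocess(samples)    # 2. preprocessing
    return recognize(features)        # 3. recognition
```

Keeping the stages behind plain function interfaces like this makes it
possible to test the recognizer on samples read from disk long before
the code that talks to the sound card works.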

There are several different approaches to preprocessing.  The preprocessing
technique can make the difference between success and failure.  The purpose of
preprocessing is to convert the raw signal into a series of "feature vectors"
which can be fed into the pattern recognizer.  You may want to use
LPC cepstra, wavelet coefficients, or any of a number of techniques.  Some
preprocessing techniques model the physical and neuronal processing which
takes place in the cochlea and aural pathways.  Other techniques ignore
biological systems and strike out in their own direction.
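
To make the idea of "feature vectors" concrete, here is a small Python
sketch that cuts a signal into overlapping frames and computes two very
simple features per frame: log energy and zero-crossing rate.  These are
deliberately much cruder than LPC cepstra or wavelet coefficients and
are only stand-ins for illustration.  The function names, the frame
length of 256 samples, and the hop of 128 samples are all assumptions of
this example.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames of frame_len samples each."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


def frame_features(frame):
    """Return [log energy, zero-crossing rate] for one frame.

    Assumes the frame holds at least two samples.
    """
    energy = sum(x * x for x in frame) / len(frame)
    log_energy = math.log(energy + 1e-10)  # small offset avoids log(0) on silence
    # Count sign changes between neighbouring samples.
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (len(frame) - 1)
    return [log_energy, zcr]


def extract_features(samples, frame_len=256, hop=128):
    """Convert a raw signal into a list of per-frame feature vectors."""
    return [frame_features(f) for f in frame_signal(samples, frame_len, hop)]
```

Whatever features you choose, the output has the same shape: one short
vector of numbers per frame, in time order.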

The recognizer takes a series of feature vectors and tries to relate them
to known patterns.  There are a lot of ways to solve this problem.  Some
approaches are good for single-speaker, small-vocabulary problems while
others may be better suited to speaker-independent, medium-vocabulary
problems.
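
For a single-speaker, ten-word system, one of the simpler approaches is
template matching with dynamic time warping (DTW).  This is not an HMM;
it is a different and easier technique, sketched here only to show what
"relating feature vectors to known patterns" can look like.  The
function names and the reject_above threshold are assumptions of this
example, and the threshold would have to be tuned by experiment.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Length-normalized DTW distance between two non-empty sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # seq_a advances, seq_b frame repeats
                cost[i][j - 1],      # seq_b advances, seq_a frame repeats
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m] / (n + m)


def recognize(features, templates, reject_above):
    """Return the best-matching word, or None if nothing is close enough.

    templates maps each word to one stored feature-vector sequence.
    """
    best_word, best_dist = None, float("inf")
    for word, template in templates.items():
        dist = dtw_distance(features, template)
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word if best_dist <= reject_above else None
```

Returning None when even the best match is too far away corresponds to
the "do nothing or say repeat" case in the original question.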

-- 
Larry D. Pyeatt                 The views expressed here are not
Internet : pyeatt@texaco.com    those of my employer or of anyone
Voice    : (713) 975-4056       that I know of with the possible
                                exception of myself.
