Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!pipex!uunet!gatech!hubcap!blackhawk!bdbryan
From: bdbryan@eng.clemson.edu (Ben Bryant)
Subject: Phonemic analyzer construction
Message-ID: <1992Nov23.133836.11680@hubcap.clemson.edu>
Sender: news@hubcap.clemson.edu (news)
Reply-To: bdbryan@eng.clemson.edu
Organization: College of Engineering, Clemson Univ.
Date: Mon, 23 Nov 1992 13:38:36 GMT
Lines: 30

G'day Sirs,
I am thinking about building a connectionist phoneme analyzer, and I am looking
for ideas on how to design the "higher-level" classifier that will discriminate
among the outputs of several previously trained "subclass instance" neural
nets.

Basically, a suitable NN architecture would be chosen for the "lower-level"
signal analysis stage, and instances of this architecture would be trained
using TIMIT or some other large speech database.

Training would proceed as follows:
1) The training tokens for each phonemic subclass would be extracted from
   the database.
2) The tokens extracted in step 1 would be preprocessed with an appropriate
   feature representation technique.
3) One network instance of the chosen architecture would be trained for each
   phonemic subclass (e.g., voiced stops, unvoiced stops, diphthongs, vowels).
4) After all network instances are trained, their outputs would "somehow" be
   arbitrated to decide which phoneme was uttered within a given region of
   the signal.
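
To make step 4 concrete, here is one candidate arbitration scheme, offered only
as an untested sketch: a separate "gating" classifier estimates which subclass
the region belongs to, and its output weights the phoneme scores of each
subclass network (product rule). This assumes every network's outputs can be
treated as normalized, posterior-like probabilities, which may not hold for
real network outputs. The function names and all numbers below are hypothetical.

```python
from typing import Dict


def arbitrate(subclass_post: Dict[str, float],
              phoneme_post: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Combine P(subclass | x) with P(phoneme | subclass, x) by the product rule.

    subclass_post: assumed output of a hypothetical gating classifier.
    phoneme_post:  assumed outputs of the per-subclass networks.
    Returns renormalized scores over all phonemes.
    """
    joint: Dict[str, float] = {}
    for subclass, p_sub in subclass_post.items():
        for phoneme, p_ph in phoneme_post[subclass].items():
            # Sum in case the same phoneme label appears in two subclasses.
            joint[phoneme] = joint.get(phoneme, 0.0) + p_sub * p_ph
    total = sum(joint.values())
    if total <= 0.0:
        raise ValueError("all scores are zero; nothing to arbitrate")
    return {phoneme: p / total for phoneme, p in joint.items()}


def decide(scores: Dict[str, float]) -> str:
    """Pick the phoneme with the highest combined score."""
    return max(scores, key=scores.get)


# Made-up outputs for one region of signal (illustration only).
subclass_post = {"voiced_stops": 0.7, "vowels": 0.3}
phoneme_post = {
    "voiced_stops": {"b": 0.6, "d": 0.3, "g": 0.1},
    "vowels": {"iy": 0.5, "aa": 0.5},
}
scores = arbitrate(subclass_post, phoneme_post)
print(decide(scores), scores)
```

Whether a gating stage like this beats simply taking the largest raw output
across all subnetworks is exactly the kind of thing I am unsure about.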

The "somehow" in step 4 is what I could really use some help with. Any other
ideas for this system would be welcome as well. Thank you very much.

Sincerely,
-Benjamin Bryant 



