Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!doc.ic.ac.uk!agate!spool.mu.edu!uunet!pipex!demon!gtoal
From: gtoal@pizzabox.demon.co.uk (Graham Toal)
Subject: Re: WANTED: word to phoneme dictionary and digitized phonemes
Message-ID: <C0MvKz.Br1@demon.co.uk>
Keywords: text to speech, phonemes, dictionaries
Sender: news@demon.co.uk
Nntp-Posting-Host: pizzabox.demon.co.uk
Organization: Cuddlehogs Anonymous
References: <1993Jan10.003413.16272@hacker.UUCP>
Date: Sun, 10 Jan 1993 10:24:34 GMT
Lines: 25

In article <1993Jan10.003413.16272@hacker.UUCP> steve@hacker.uucp (Stephen M. Youndt) writes:
:The subject line pretty much says what I'm after. I'm trying to put
:together a *very* basic text to speech engine.  What this requires
:is a dictionary to convert from a latin alphabet to a phonetic
:alphabet, and digitized samples of spoken phonemes (I believe there
:are 39 in English).

I have several of these - the most easily accessible one is from the
nettalk program that comes with aspirin (the neural net package).  Another
good-quality set of data, though not so easily accessible, is the COED data
on black.ox.ac.uk.  That site also has Daniel Jones' "pronouncing
dictionary" and Roger Mitton's data from his PhD project on spelling, as
well as the MRCD database.

I'm currently working on taking all the sets of data and converting them
to a single representation.  This also involves writing parsers to decode
the text descriptions in docs like the 1911 Webster's.  (The IPA is
represented rather strangely...)

I'm going to use the ASCII IPA layout recently posted to alt.usage.english.
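
To make the conversion step concrete, here is a minimal sketch in Python
of mapping two differently-coded sources onto one representation.  It is
only an illustration under stated assumptions: the source names "dict_a"
and "dict_b", their symbol tables, and the target codes are all invented
for the example, and are not taken from any of the data sets above or
from the alt.usage.english layout.

```python
# Hypothetical symbol tables: neither the source notations nor the target
# codes are taken from a real dictionary or from the a.u.e. ASCII IPA layout.
SOURCE_TABLES = {
    "dict_a": {"ch": "tS", "sh": "S", "a": "&", "t": "t", "k": "k"},
    "dict_b": {"C": "tS", "S": "S", "@": "&", "t": "t", "k": "k"},
}


def convert(pron, table):
    """Translate one pronunciation string into a list of target symbols.

    Uses greedy longest-match, so a two-character source symbol such as
    "ch" is preferred over any single-character symbol starting there.
    """
    longest = max(len(sym) for sym in table)
    out = []
    i = 0
    while i < len(pron):
        # Try the widest possible symbol first, then narrower ones.
        for width in range(min(longest, len(pron) - i), 0, -1):
            chunk = pron[i:i + width]
            if chunk in table:
                out.append(table[chunk])
                i += width
                break
        else:
            # No symbol of any width matched at this position.
            raise ValueError(f"unknown symbol at position {i} in {pron!r}")
    return out


def merge(entries):
    """Combine (source_name, word, pronunciation) triples into one dict.

    Each word maps to the set of distinct pronunciations seen for it,
    expressed in the common target symbols.
    """
    merged = {}
    for source, word, pron in entries:
        phones = tuple(convert(pron, SOURCE_TABLES[source]))
        merged.setdefault(word.lower(), set()).add(phones)
    return merged
```

Calling merge([("dict_a", "chat", "chat"), ("dict_b", "Chat", "C@t")])
would collapse both entries into a single pronunciation for "chat",
since the two notations map to the same target symbols.  Raising an
error on an unknown symbol, rather than skipping it, is a deliberate
choice here: it shows up gaps in a symbol table early.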

Graham
PS I posted rather than replied just to let anyone who knows me know that
I read this group, since I seldom post here, in case anyone I haven't
spoken to for a couple of years wants to get in touch with me...
