Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!doc.ic.ac.uk!agate!headwall.Stanford.EDU!kithrup.com!mrs
From: mrs@kithrup.com (Mike Stump)
Subject: English word database (6,100 words) for klatt
Organization: Kithrup Enterprises, Ltd.
Message-ID: <CBKCtu.FpG@kithrup.com>
Reply-To: mrs@cygnus.com
Date: Tue, 10 Aug 1993 21:53:39 GMT
Lines: 70

You can use the program below with the klatt-0.02 code and the
TIMIT.mostlikely.Z database to obtain a 6,100-word English
pronunciation database.  It is not perfect, but it is pretty
reasonable.

The single most horrible pronunciation is ss.


Info on the TIMIT.mostlikely.Z database (from a news posting):

	If you would like to get these pronunciations, they are
	available via anonymous ftp from ftp.icsi.berkeley.edu in the
	directory pub/speech. The file is called "TIMIT.mostlikely.Z".

Info on the klatt-0.02 code:

	The package klatt-0.02.tar.Z exists on svr-ftp.eng.cam.ac.uk
	in directory comp.speech/sources.
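
As a rough sketch of how the pieces fit together, assuming the script
below has been saved as timit2klatt.sh (a made-up name, as is the output
file) and the database sits in the current directory, a run might look
like the commented pipeline here.  The function underneath copies two of
the script's rules plus its final space-removing pass, so the effect can
be tried on a made-up allophone string.  It needs a sed that understands
\< and \> (GNU sed does).

```shell
#!/bin/sh
# Hypothetical file names; the post does not name the script or its output.
#   zcat TIMIT.mostlikely.Z | sh timit2klatt.sh > words.klatt

# Two rules copied from the script below, followed by its final
# space-removing pass.
demo_rules() {
	sed 's/\<bcl b\>/bb/g;
	s/\<tcl t\>/tt/g;' | sed 's/ \(.\)/\1/g;'
}

# Made-up input, not a line from the database.
echo 'bcl b ae tcl t' | demo_rules    # prints: bbaett
```

The closure/release pairs (bcl b, tcl t) collapse into one code each, and
the second sed then joins the codes into a single string.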

#!/bin/sh

# This script translates the allophone codes used in the TIMIT database
# into the phoneme codes used by klatt-0.02.  It reads standard input
# and writes standard output.

# Phoneme codes that are OK as is:
# ix en
sed '	s/\<ax-r r\>/rr/g;
	s/\<ax-r\>/rr/g;
	s/\<r\>/rr/g;
	s/\<v\>/vv/g;
	s/\<bcl b\>/bb/g;
	s/\<bcl\>/bb/g;
	s/\<b\>/bb/g;
	s/\<dcl d\>/dd/g;
	s/\<dcl\>/dd/g;
	s/\<d\>/dd/g;
	s/\<m\>/mm/g;
	s/\<n\>/nn/g;
	s/\<z\>/zz/g;
	s/\<q\>//g;
	s/\<l\>/ll/g;
	s/\<tcl t\>/tt/g;
	s/\<tcl\>/tt/g;
	s/\<t\>/tt/g;
	s/\<er\>/oxr/g;
	s/\<jh\>/jj/g;
	s/\<s\>/ss/g;
	s/\<f\>/ff/g;
	s/\<pcl p\>/pp/g;
	s/\<pcl\>/pp/g;
	s/\<pct\>/pp/g;
	s/\<p\>/pp/g;
	s/\<kcl k\>/kk/g;
	s/\<kcl\>/kk/g;
	s/\<k\>/kk/g;
	s/\<ux\>/uw/g;
	s/\<epi\>//g;
	s/\<g\>/gg/g;
	s/\<w\>/ww/g;
	s/\<y\>/yu/g;
	s/\<pau\>/ /g;
	s/\<ax-h\>/ah/g;
	s/\<nx\>/nn/g;
	s/\<eng\>/ng/g;
	s/\<ax-hv\>/owhh/g;
	s/\<hv\>/hh/g;
	s/\<gcl g\>/gg/g;
	s/\<gcl\>/ /g;
' | sed 's/ \(.\)/\1/g;'
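
One thing worth checking when editing the rule list: sed applies the
substitutions top to bottom, and since - is not a word character, \<r\>
also matches the r inside ax-r.  A small sketch with GNU sed, using only
rules that appear in the script (the function names are made up):

```shell
# r rule first: the r inside ax-r is doubled, so the ax-r rule can no
# longer match.
r_first()   { sed 's/\<r\>/rr/g; s/\<ax-r\>/rr/g;'; }
# ax-r rule first: the whole code is rewritten, and the resulting rr is
# left alone by the r rule.
axr_first() { sed 's/\<ax-r\>/rr/g; s/\<r\>/rr/g;'; }

echo 'ax-r' | r_first      # prints: ax-rr
echo 'ax-r' | axr_first    # prints: rr
```

The same applies to hv and ax-hv: the longer, hyphenated code has to be
handled before the shorter one it contains.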
