From apple!motcsd!lance Sat Nov 23 00:22:12 1991
Return-Path: <apple!motcsd!lance>
Received: from netcomsv.netcom.com by netcom.netcom.com (4.1/SMI-4.1)
	id AA11792; Sat, 23 Nov 91 00:22:11 PST
Received: from apple.UUCP by netcomsv.netcom.com (4.1/SMI-4.1)
	id AA12781; Fri, 22 Nov 91 23:31:42 PST
Received: by apple.com (5.61/18-Oct-1991-eef)
	id AA17793; Thu, 21 Nov 91 15:25:10 -0800
	for 
Received: by motcsd.csd.mot.com (/\=-/\ Smail3.1.18.1 #18.4)
	id <m0kkN0f-0001QBC@motcsd.csd.mot.com>; Thu, 21 Nov 91 14:36 PST
Message-Id: <m0kkN0f-0001QBC@motcsd.csd.mot.com>
Date: Thu, 21 Nov 91 14:36 PST
From: apple!csd.mot.com!lance (lance.norskog)
To: thinman@netcom.com
Subject: klatt ref letter
Status: O

Path: motcsd!apple!csl!fernwood!uunet!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!jabberwock.shs.ohio-state.edu!jackson
From: jackson@jabberwock.shs.ohio-state.edu (Michel Jackson)
Newsgroups: comp.dsp
Subject: Re: Help needed with Klatt synthesiser
Summary: MITALK system, KLSYN88, other info
Keywords: Klatt
Message-ID: <805@jabberwock.shs.ohio-state.edu>
Date: 18 Jul 91 15:47:21 GMT
References: <8521@acorn.co.uk>
Reply-To: Michel Jackson <jackson@cis.ohio-state.edu>
Followup-To: comp.dsp
Distribution: comp
Organization: The Ohio State University, Division of Speech and Hearing Science
Lines: 79

In article <8521@acorn.co.uk> pcolmer@acorn.co.uk (Philip Colmer) writes:
>1) where abouts does the synthesizer actually form the sounds? What I want
>   is to get the output into a format other than u-law. I notice that you
>   are actually converting from PCM into u-law, but I don't know the format
>   of the PCM data either ...

The Klatt synthesizer(s) do digital synthesis. You need a D/A converter
to produce sound from the digital waveform they produce.  "PCM" is
typically a signed twelve-bit integer, but it may have more bits than
that, or be an unsigned twelve-bit integer with a constant offset.
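
To make the two PCM layouts above concrete, here is a minimal sketch in
Python (not taken from any Klatt source).  It assumes 12-bit samples and
a mid-scale offset of 2048 for the unsigned case; the function names are
hypothetical, and you should check the sample width your own version of
the synthesizer actually writes.

```python
import struct

BITS = 12                 # assumed sample width; check your synthesizer
OFFSET = 1 << (BITS - 1)  # 2048: assumed mid-scale offset for unsigned data

def unsigned_to_signed(sample):
    """Map an unsigned 12-bit sample (0..4095) to signed (-2048..2047)."""
    return sample - OFFSET

def signed12_to_signed16(sample):
    """Scale a signed 12-bit sample to the 16-bit range (shift left 4 bits)."""
    return sample << (16 - BITS)

def pack_16bit_le(samples):
    """Pack signed 16-bit samples as little-endian bytes for a raw PCM file."""
    return struct.pack("<%dh" % len(samples), *samples)
```

If the synthesizer output is already signed, skip the first step.  Most
playback and conversion tools will accept the resulting raw 16-bit data
as long as you tell them the sampling rate separately.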

>2) could you provide me with more details on the Klatt book, eg ISBN number
>   and title

The standard book-style reference on the system within which the
original Klatt synthesizer (circa 1970) worked is

Allen, J., Hunnicutt, M. S., & Klatt, D. 1987. _From text to speech_
(Cambridge studies in speech science and communication). New York:
Cambridge University Press.

The Klatt synthesizer was originally written in FORTRAN.  A very common
adaptation of that version was written by Diane Kewley-Port; the
version you have is most likely descended from that one, as it is by
far the commonest and most widely distributed.  A C re-write of the
code was produced by Klatt himself at MIT (runs on VAXstations); and
(I believe but do not know for sure) independently by Louis Goldstein
at Haskins Labs (runs on PCs & compatibles with Data Translation
28xx-style A/D - D/A). There is or was a bug in the Haskins version
having to do with setting sampling rates which I fixed in 88-89 &
which may have also been fixed in their version.  If your version is
in C, it may be descended from one of those; my guess
would be the VAXstation version.

Various versions of the original FORTRAN have been licensed by
Speech+ and DEC (the basis of the commercial DECTALK system).  It is
possible but unlikely that your code is based on one of those.  My
understanding is that the fundamental-frequency interpolation
mechanism in all post-MITALK versions of the Klatt synthesizer may be
proprietary to DEC (i.e. patented). Any commercial user should be
aware of this and take proper precautions.

Before his death, Dr. Klatt and Ken Stevens of MIT made arrangements
for the Klatt synthesizer to be made available to the public. Dr.
Bunnel's posting correctly identifies Sensimetrics as the source for
what you might call the "authorized version".  Although Sensimetrics
is very close to making this version available, when I was at MIT six
months ago, there were some delays due to DEC's concern about the
proprietary nature of some of the code.

The most recent version of the Klatt synthesizer, known as KLSYN88, is
described in

Klatt, D. & Klatt, L. 1990. "Analysis, synthesis, and perception of
voice quality variations among female and male talkers", J. Acoust.
Soc. Am. v. 87, n. 2, pp. 820-857.

I believe that Jerry Lane at UT-Austin has produced a modified version
of the Klatt synthesizer which uses one of the windowing systems
available on VAXstations to display parameter tracks etc.

>3) does the Klatt book explain the various parameters?

Chapters 11 & 12 of Allen, Hunnicutt, & Klatt describe the parameters.

>4) does the Klatt book contain the original sources? If not, do you know
>   where I can find them. I want to do a comparison.

See the above. The notion of "original source" of a program that was
widely and generously shared by Dennis Klatt and made available to a
large community of researchers is fuzzy.  However, since there is a
lot of collaborative work between the speech labs at Brown & MIT, and
because the quality of the software put out by John Mertus at Brown is
generally high, I would expect that your version has not had any bugs
introduced.

	---michel (jackson@shs.ohio-state.edu)

>--Philip


