Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!pipex!uunet!wupost!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!eng.ufl.edu!news
From: george@beta.ee.ufl.edu (George L)
Subject: Re: WANTED: ms-dos text-to-speech
Message-ID: <1992Oct3.083503.23450@eng.ufl.edu>
Sender: george@alpha.ee.ufl.edu
Organization: EE Dept at UF
References: <1992Oct2.060425.18579@uc.msc.edu> <MARTY.92Oct2083309@predator.cs.utexas.edu> <1992Oct3.032227.20557@beaver.cs.washington.edu>
Date: Sat, 3 Oct 92 08:35:03 GMT
Lines: 88

In article <1992Oct3.032227.20557@beaver.cs.washington.edu> mef@june.cs.washington.edu (Marc E. Fiuczynski) writes:
>Hi,
>
>In comp.newprod a company is offering a text-to-speech API callable (I
>assume) from a C program.  However, it is written for the NeXT
>workstation and probably requires to use the built in DSP 56001.
>
>Marc

Since nobody has mentioned a DOS-based product yet, I dug around in my
archives and pulled up a program I tested a few years ago. It is a
sampled-phoneme text-to-speech converter that probably isn't as
intelligible as you would like, but it was pushing the limits of the
hardware it was written for.

Even though most of the PC world is still stuck with one-bit modulation on
stock hardware, the quality of this technique could be substantially
improved by exploiting the speed of today's machines. Microsoft has done a
reasonable job with their PC Speaker driver under Windows 3.1. It would be
relatively simple for someone to combine that approach with the
sampled-phoneme text-to-speech converter to produce a better package.
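
To make the one-bit idea concrete, here is a rough Python sketch of one
common way to turn multi-level samples into a stream of speaker on/off
states: a first-order delta-sigma modulator. This is not a description of
how the Microsoft driver works (that isn't covered here); the function
names and the oversample parameter are made up for illustration.

```python
def delta_sigma_1bit(samples, oversample=16):
    """Convert samples in [-1.0, 1.0] to a list of 0/1 speaker states.

    First-order delta-sigma: accumulate the error between the input and
    the previous output, and emit 1 while the accumulator is non-negative.
    """
    bits = []
    integrator = 0.0
    feedback = 0.0
    for s in samples:
        s = max(-1.0, min(1.0, s))        # clamp out-of-range input
        for _ in range(oversample):       # hold each sample for several bits
            integrator += s - feedback
            feedback = 1.0 if integrator >= 0.0 else -1.0
            bits.append(1 if feedback > 0 else 0)
    return bits


def mean_level(bits):
    """Average of the bit stream mapped back to the range [-1, 1]."""
    return 2.0 * sum(bits) / len(bits) - 1.0
```

Feeding in a constant 0.5 gives a stream that is high about three quarters
of the time, so the average speaker position tracks the input level. A
faster machine can afford a larger oversample value, which is where the
quality improvement would come from.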

The program is available from SIMTEL-20 and all of its mirror sites. Some 
text from the original documentation follows:
----------------------------------------------------------------------------
TRAN is a text-to-speech program for the IBM-PC.  It can read ASCII text
files, translate normal English spelling to phonemes, and sound out each
phoneme through the speaker of the IBM-PC.

    usage: tran [+/-flags] [-options] [filename]

The filename is an ASCII text file (with no word processor formatting
codes).  If no filename is given TRAN reads input from the keyboard.

There are two timing parameters, d1 and d2, that control the rate at which
TRAN speaks.  Making d1 larger increases the pauses between words and
making d2 larger lowers the pitch of the voice phonemes.  Both d1 and d2
must have a value of 1 or greater.  On an IBM-PC/XT, good values for the
timing parameters are d1=2 and d2=1.  If these parameters are not set
explicitly, the program will try to determine acceptable values
automatically.  Setting these values lets TRAN bypass the automatic
setting, which saves a second or two when starting the program.
These values can be set on the command line:

    tran -d1 2 -d2 1 ...

On a 10 MHz IBM-PC/AT the timing parameters need to be larger: d1=4 and
d2=13.

Most of the text-to-speech rules used in the tran program come from an
article in an IEEE journal:

Elovitz, H.S., Johnson, R., McHugh, A., and Shore, J.E.  (1976).
"Letter-to-Sound Rules for Automatic Translation of English Text to
Phonetics," IEEE Transactions on Acoustics, Speech, and Signal
Processing, Vol. ASSP-24, No. 6, pp. 446-458.
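
As an illustration of what context-sensitive letter-to-sound rules look
like, here is a toy Python sketch. Each rule says "this letter group,
with this text to its left and right, becomes these phonemes", and the
first matching rule wins. The handful of rules and the phoneme names
below are invented for the example; they are not taken from TRAN or from
the rule tables in the article.

```python
# Each rule is (left_context, match, right_context, phonemes).
# Contexts are literal strings; an empty string means "anything".
# Toy rules for illustration only, ordered so longer matches come first.
RULES = [
    ("", "ch", "",  ["CH"]),
    ("", "ck", "",  ["K"]),
    ("", "ee", "",  ["IY"]),
    ("", "c",  "e", ["S"]),   # "c" before "e" is soft
    ("", "c",  "",  ["K"]),   # any other "c" is hard
    ("", "e",  "",  ["EH"]),
    ("", "k",  "",  ["K"]),
    ("", "n",  "",  ["N"]),
    ("", "p",  "",  ["P"]),
    ("", "s",  "",  ["S"]),
    ("", "t",  "",  ["T"]),
]


def to_phonemes(word, rules=RULES):
    """Translate one word to a phoneme list using ordered context rules."""
    text = " " + word.lower() + " "      # spaces mark the word boundaries
    pos = 1
    out = []
    while pos < len(text) - 1:
        for left, match, right, phonemes in rules:
            end = pos + len(match)
            if (text.startswith(match, pos)
                    and text[:pos].endswith(left)
                    and text.startswith(right, end)):
                out.extend(phonemes)
                pos = end
                break
        else:
            pos += 1                      # no rule matched: skip the letter
    return out


print(to_phonemes("speech"))
print(to_phonemes("cent"))
```

A real rule set needs several hundred rules plus wildcard context classes
(vowels, consonants and so on), which this sketch leaves out.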

The program contains a set of 35 phonemes, each encoded as a sequence of
bits controlling the position (in or out) of the PC speaker.  The
phoneme codes come from the public domain program SPEECH by Andy
McGuire.
-------------------------------------------------------------------------
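The bit-sequence playback and the two timing parameters described in the
documentation can be sketched roughly as follows. This is a Python
illustration, not TRAN's code: the bit patterns, the PAUSE_TICKS constant,
and the exact way d1 and d2 are applied are all assumptions made to show
the idea of holding each bit longer (lower pitch) and stretching the gap
between words.

```python
# Hypothetical bit patterns; TRAN's real phoneme tables are not shown here.
PHONEME_BITS = {
    "AA": [1, 1, 0, 0, 1, 0, 1, 0],
    "S":  [1, 0, 1, 0, 1, 0, 1, 0],
}
PAUSE_TICKS = 50   # assumed base gap between words, scaled by d1


def render(words, d1=2, d2=1):
    """Return speaker states (assumed 0 = in, 1 = out), one per timing tick.

    words is a list of words, each a list of phoneme names.
    """
    if d1 < 1 or d2 < 1:
        raise ValueError("d1 and d2 must be 1 or greater")
    ticks = []
    for i, word in enumerate(words):
        if i > 0:
            ticks.extend([0] * (PAUSE_TICKS * d1))   # pause between words
        for name in word:
            for bit in PHONEME_BITS[name]:
                ticks.extend([bit] * d2)             # hold each bit d2 ticks
    return ticks


print(render([["S"]], d2=1))
print(len(render([["S"], ["AA"]], d1=2, d2=1)))
```

Doubling d2 doubles the length of every speaker pulse, which halves the
frequency of whatever waveform the bits encode. On a faster machine each
tick takes less time, which would explain why the AT needs larger values
than the XT.
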
I just checked, and the program is still there along with some new ones:

-------------------------------------------------------------------------
Directory PD1:<MSDOS.VOICE>
 Filename   Type Length   Date   Description
 ==============================================
 AUTOTALK.ARC  B   23618  881216  Digitized speech for the PC
 CVOICE.ARC    B   21335  891113  Tells time via voice response on PC speaker
 HEARTYPE.ARC  B   10112  880422  Hear what you are typing, crude voice synth.
 HELPME2.ARC   B    8031  871130  Voice cries out 'Help Me!' from PC speaker
 SAY.ARC       B   20224  860330  Computer Speech - using phonemes
 SPEECH98.ZIP  B   41003  910628  Build speech (voice) on PC using 98 phonemes
 TALK.ARC      B    8576  861109  BASIC program to demo talking on a PC speaker
 TRAN.ARC      B   39766  890715  Repeats typed text in digital voice
 VDIGIT.ZIP    B  196284  901223  Toolkit: Add digitized voice to your programs
 VGREET.ARC    B   45281  900117  Voice says good morning/afternoon/evening
-------------------------------------------------------------------------

So let's have some USENET discussion on:
    How good is each of these packages?
    What don't you like about them?
    How would you improve them?

Hope this helps,
george@alpha.ee.ufl.edu
