Newsgroups: comp.speech
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!howland.reston.ans.net!ix.netcom.com!netcom.com!ebohlman
From: ebohlman@netcom.com (Eric Bohlman)
Subject: Re: Voice Synthesizer
Message-ID: <ebohlmanCyqJIG.6w6@netcom.com>
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
X-Newsreader: TIN [version 1.2 PL1]
References: <Cynt3w.4JH@esd.dl.nec.com>
Date: Fri, 4 Nov 1994 08:56:40 GMT
Lines: 45

Long Van Tran (tran@esd.dl.nec.com) wrote:
: Hello,
: I am looking for a PC board that accepts ASCII text and produces English
: text message... I heard of a company named Covox(?) but lately I did not
: see any advertisement on computer magazines.

I'm not sure whether Covox is still in business.  In any case, their 
products are basically D/A converters that use software to generate speech.

Dedicated speech synthesizer boards offload most of the speech generation 
work from the host CPU.  They fall into two categories, which I call 
"smart" and "dumb".  With a smart board, the CPU simply sends characters 
and the board handles everything from there.  With a dumb board, a 
software driver translates the text into phonemes, and the board then 
takes care of outputting them (this is much less CPU-intensive than 
requiring the CPU to generate the sound waveforms for the phonemes).
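
To make the split concrete, here is a toy sketch in Python of where the 
text-to-phoneme step runs in each case.  Everything in it is made up for 
illustration: the phoneme table, the class names and the speak() helper 
are hypothetical and do not model any real board, driver, or the SSI263 
command set.

```python
# Toy illustration only: the phoneme table and both "board" classes are
# hypothetical and do not correspond to any real product or chip.

TOY_PHONEMES = {"h": "HH", "i": "IY", "o": "OW", "n": "N"}  # invented mapping


def text_to_phonemes(text):
    """Map letters to phoneme codes, skipping anything not in the toy table."""
    return [TOY_PHONEMES[ch] for ch in text.lower() if ch in TOY_PHONEMES]


class SmartBoard:
    """Accepts raw characters; the conversion happens on the board."""

    def __init__(self):
        self.spoken = []

    def write(self, text):
        # Stands in for on-board firmware doing the text-to-phoneme step.
        self.spoken.extend(text_to_phonemes(text))


class DumbBoard:
    """Accepts phoneme codes only; the host driver must do the conversion."""

    def __init__(self):
        self.spoken = []

    def write_phonemes(self, phonemes):
        self.spoken.extend(phonemes)


def speak(board, text):
    """Send text to either kind of board and report what the host had to send."""
    if isinstance(board, SmartBoard):
        board.write(text)
        return "host sent characters"
    # Dumb board: the host CPU runs the text-to-phoneme driver first.
    board.write_phonemes(text_to_phonemes(text))
    return "host sent phonemes"
```

Either way the same phonemes end up being spoken; the only difference is 
which side of the bus does the translation.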

Some popular "smart" boards are the DecTalk from DEC (very human-sounding 
speech, but loses intelligibility at very high speeds), the Accent PC 
from Aicom (more robotic sound, but stays intelligible even when speaking 
very fast), the Prose 4000 from Centigram, and the Doubletalk from RC 
Systems (good speech for the price, and also available as an OEM board 
for non-PC applications).

Dumb boards include the Accent Mini from Aicom, the Soundingboard from GW 
Micro, and a series of boards made by Artic Technologies.  Most of these 
are based on the SSI263 phoneme synthesizer chip.

Several of these vendors have external units with parallel or serial 
interfaces; these are all "smart" devices.

These all vary in several quality characteristics; which ones are
important depends on the application.  For example, a blind user who's
using speech to access screen displays will be very concerned with how
quickly the synthesizer can stop speaking one piece of text and start
speaking another; this is much less important in other applications. 
Someone building a telephone voice-response system will be concerned with
how the voice sounds when filtered through phone lines.  The importance of
human-sounding speech decreases with the amount of time a user spends
listening to the speech, while the importance of intelligibility and speed
increases with it.  Different characteristics are needed for reading
paragraphs' worth of text versus reading short prompts. 
 

