Next: All of the people Up: Perfect Synthesis for all Previous: A finite or infinite

Some of the people all of the time

In spite of our desire to produce perfect, natural-sounding synthesis all the time, a number of users do not actually require it. In fact, given the current limitations of general unit selection synthesis, more traditional techniques may be adequate or even better for them.

People who must listen to synthetic output a lot very quickly learn to accept the limitations of its voice quality. In fact, they often prefer a voice that is less natural but more consistent. As those who work in speech synthesis know, the more you listen to a voice, the more acceptable it becomes, as the ear tunes to that voice's idiosyncrasies.

In some applications the content being spoken is more important than the style in which it is delivered. Listening to lots of data through an audio channel is slow, and many people would prefer it to be delivered faster than a natural voice could manage. This raises the more general issue of delivering information by audio using voice-like methods that need not be restricted to the techniques of the human voice. [14] has a number of examples which exploit the fact that a synthesizer, rather than a natural voice, is being used, allowing more information to be packed into the channel. For example, Emacspeak can use pitch to denote the level of superscripts and subscripts in formulae.
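As a rough illustration of the pitch idea, the sketch below maps a script level to a speaking pitch. It is not Emacspeak's implementation: the function names, the 110 Hz baseline, and the three-semitone step per level are all assumptions chosen for the example, with positive levels standing for superscripts and negative levels for subscripts.

```python
# Hypothetical sketch: none of these names or values come from Emacspeak.
BASE_PITCH_HZ = 110.0        # assumed baseline speaking pitch
SEMITONES_PER_LEVEL = 3.0    # assumed pitch step per script level


def pitch_for_level(level, base_hz=BASE_PITCH_HZ, step=SEMITONES_PER_LEVEL):
    """Return a pitch in Hz for a script level.

    level > 0 means superscript depth, level < 0 means subscript depth,
    and level == 0 is the baseline. Each level shifts the pitch by
    `step` semitones (a semitone is a factor of 2 ** (1 / 12)).
    """
    return base_hz * 2.0 ** (level * step / 12.0)


def annotate(tokens):
    """Attach a pitch to each (text, level) pair in a tokenized formula."""
    return [(text, pitch_for_level(level)) for text, level in tokens]


if __name__ == "__main__":
    # "x squared plus y sub i": the exponent is spoken higher, the index lower.
    formula = [("x", 0), ("2", 1), ("plus", 0), ("y", 0), ("i", -1)]
    for text, hz in annotate(formula):
        print(f"{text:5s} {hz:6.1f} Hz")
```

A real system would pass such values to whatever pitch control its synthesizer exposes; the point here is only that the script level is carried by pitch, so no extra words such as "superscript" need to be spoken.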

What is important to note here is that for some applications a natural voice is not ideal. When we consider the task as information presentation through the somewhat constrained bandwidth of audio, other techniques may be better. These are not limited to voice-based features: ear-cons, background noises, etc. can also be used to improve information transfer rates.


Alan W Black 2002-09-30