\documentstyle{article}

\setlength{\textwidth}{6.2in} % was 6.5
\setlength{\oddsidemargin}{0.1in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0.0in}
\setlength{\headsep}{0.0in}
\setlength{\textheight}{8.7in} % was 9.0

\begin{document}

\noindent
{\large \bf Building Synthetic Voices}\\[+6pt]
{\it Alan W Black and Kevin A. Lenzo}\\
{\tt awb@cs.cmu.edu} and {\tt lenzo@cs.cmu.edu}\\

This tutorial will give an overview of the basic techniques available
for building synthetic voices for speech synthesis systems, including
an actual example of voice building. The first part will describe the
basic components of a speech synthesis system, covering the
state-of-the-art techniques used within them.  Specifically:
\begin{description}
\item[Text Analysis]:
addressing the expansion of symbols, numbers, acronyms, etc., and the
resolution of homographs.
\item[Linguistic Analysis]:
``from words to how to say them'', addressing issues in lexical entries,
letter-to-sound rules and prosodic modeling (phrasing, intonation and
duration).
\item[Waveform Synthesis]:
``from phones and prosody to waveforms'', describing basic techniques for
making computers talk using recorded prompts, diphones, and general
unit selection synthesis.
\end{description}
The second part will describe the basic stages required in building
new synthetic voices (in English or other languages): 
\begin{itemize}
\item building a text analysis system
\item building a lexicon and letter-to-sound rules
\item building phrasing, intonation and duration models
\item recording data for concatenative speech synthesis 
(diphones, unit selection and/or limited domain)
\end{itemize}
This tutorial is based on the techniques, documentation and tools
freely distributed through CMU's FestVox project ({\tt http://festvox.org}),
leading to voices that can be run on Edinburgh University's Festival
Speech Synthesis System.
\end{document}
