Newsgroups: alt.comp.blind-users,comp.speech,comp.ai.nat-lang,comp.infosystems.www.authoring.html
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.mathworks.com!newsfeed.internetmci.com!news.ac.net!news.cais.net!peer.news.xara.net!xara.net!uknet!newsfeed.ed.ac.uk!edcogsci!cnews
From: pault@cogsci.ed.ac.uk (Paul Taylor)
Subject: Re: Making text-to-speech more comprehensible?
In-Reply-To: jorn@MCS.COM's message of 28 Mar 1996 08:31:39 -0600
X-Nntp-Posting-Host: scott
Message-ID: <qjwhgv1jq34.fsf@scott.cogsci.ed.ac.uk>
Sender: pault@scott.cogsci.ed.ac.uk
Organization: Centre for Cognitive Science, University of Edinburgh
References: <4jcqou$91e@mars.mcs.com> <4je7sc$r1v@Venus.mcs.com>
Date: Wed, 3 Apr 1996 15:15:59 GMT
Lines: 84
Xref: glinda.oz.cs.cmu.edu comp.speech:8869 comp.ai.nat-lang:4752 comp.infosystems.www.authoring.html:64255


In article <4je7sc$r1v@Venus.mcs.com> jorn@MCS.COM (Jorn Barger) writes:

   Certainly, the speech synthesizer's job would be easier if imperatives
   were 'tagged': "<imperative>Close the door.</imperative>"

   Another way they might be tagged would be more like a musical score:

      <f-note>Close <d-note>the <g-note>door.

   SGML prefers that markup deal with spans rather than points, which
   is a bit annoying for this application, perhaps.

   Another approach might be to pre-parse each sentence:

      <verb>Close</verb> <object-phrase>the door</object-phrase>.

   This is actually done, a lot, in natural-language research, but of
   course it's not very realistic for the WWWeb...

We are actually developing a markup language for speech synthesis. It
is called SSML and follows SGML-style syntax. The idea is to give
users of synthesis systems basic control over pronunciation without
their needing to be speech experts. It is also meant to be
synthesis-system independent, so that users can write SSML documents
which will work on a variety of systems. SSML is still in development,
and we hope to release a prototype soon.

There are lots of ways in which SSML could be done, but to begin with
we have kept it pretty simple. Here are some examples:

	<language="your-language"> 
 
	SSML is not tied to any one language, and assumes the system's
	default language unless told otherwise. For multi-lingual
	synthesis, this command changes language in the middle of a
	document. Examples: <language="English">, <language="Spanish">
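
	As a rough illustration of how a front end might act on this
	command, here is a minimal Python sketch that splits a document
	into (language, text) segments. It is not part of SSML itself:
	the function name split_by_language and the "system-default"
	label are made up for the example, and it assumes tags are
	written exactly as in the examples above.

```python
import re

# Matches tags written exactly like <language="English">.
LANGUAGE_TAG = re.compile(r'<language="([^"]+)">')


def split_by_language(document, default_language="system-default"):
    """Return (language, text) pairs in document order.

    Hypothetical helper: text before the first tag is labelled with
    default_language, standing in for the system's own default.
    """
    segments = []
    current = default_language
    position = 0
    for match in LANGUAGE_TAG.finditer(document):
        text = document[position:match.start()].strip()
        if text:
            segments.append((current, text))
        # Everything after this tag is in the newly named language.
        current = match.group(1)
        position = match.end()
    tail = document[position:].strip()
    if tail:
        segments.append((current, tail))
    return segments


example = 'Good morning. <language="Spanish"> Buenos dias. <language="English"> Goodbye.'
segments = split_by_language(example)
for language, text in segments:
    print(language, "|", text)
```

	Each segment could then be handed to whichever synthesis voice
	the system has for that language.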
 
	<phrase> 
 
	Major (intonation) phrases are used as the primary unit of
	prosodic structure in SSML. Phrases can take attributes which
	specify the speech act of the phrase, for example
	"yn-question" (yes/no question), "wh-question" or
	"statement". If no speech act type is given, "statement" is
	assumed as the default.
 
	<emph> word </emph>
 
	Words can be emphasised by surrounding them with the <emph>
	tags. In English one word per phrase is always emphasised, and
	therefore if no <emph> tags exist within a phrase, default
	rules are used for emphasis. Example: there is an emphasised
	word <emph> here</emph>
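
	The fallback described here can be sketched in Python as
	follows. The default rules themselves are not given in this
	post, so the sketch simply picks the last word of the phrase as
	a placeholder; that choice, and the function name
	emphasised_words, are assumptions for illustration only.

```python
import re

# Text between <emph> and </emph>, ignoring surrounding whitespace.
EMPH = re.compile(r"<emph>\s*(.*?)\s*</emph>", re.DOTALL)
ANY_TAG = re.compile(r"<[^>]+>")
PUNCTUATION = ".,;:!?"


def emphasised_words(phrase):
    """Return the words to emphasise in one phrase.

    Explicit <emph> markup wins. Otherwise a stand-in default rule
    is used: emphasise the last word. The real default rules are not
    specified here, so treat this rule as a placeholder.
    """
    marked = [
        word.strip(PUNCTUATION)
        for chunk in EMPH.findall(phrase)
        for word in chunk.split()
    ]
    if marked:
        return marked
    plain_words = ANY_TAG.sub(" ", phrase).split()
    # Slicing keeps this safe for an empty phrase (returns []).
    return [word.strip(PUNCTUATION) for word in plain_words[-1:]]


tagged = emphasised_words("there is an emphasised word <emph> here</emph>")
untagged = emphasised_words("Close the door.")
print(tagged, untagged)
```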
 
	<define word=(identifier) pro=(pronunciation)
		standard=(lexicon standard)>
 
	This command is used to define or redefine the pronunciation
	of a word. The word field serves as an identifier and the pro
	field defines the pronunciation. Different synthesis systems
	will have different ways of specifying the pronunciation of
	entries in a lexicon and so care was taken to make these
	definitions flexible. The "standard" field gives the name of a
	pre-defined lexicon standard, allowing different phonemic
	alphabets to be used. Example: <define word="Edinburgh"
	pro="E * d . i n . b r @@" standard="cstr">.
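
	To show how a system might collect these definitions, here is a
	minimal Python sketch that reads <define> commands into a
	lexicon table. It uses the field names from the syntax line
	(word, pro, standard). Splitting the pronunciation on white
	space is my own simplification; what each symbol means depends
	on the named lexicon standard and is not interpreted here. The
	function name read_definitions is hypothetical.

```python
import re

# One <define ...> command; fields may be spread over several lines.
DEFINE_TAG = re.compile(r"<define\s+([^>]*)>")
# A single field of the form name="value".
FIELD = re.compile(r'(\w+)="([^"]*)"')


def read_definitions(document):
    """Map each defined word to (pronunciation symbols, standard).

    A later definition of the same word replaces an earlier one,
    which is how redefinition is modelled in this sketch.
    """
    lexicon = {}
    for tag in DEFINE_TAG.finditer(document):
        fields = dict(FIELD.findall(tag.group(1)))
        if "word" not in fields or "pro" not in fields:
            continue  # skip incomplete definitions
        # "standard" is optional here; None means "not stated".
        lexicon[fields["word"]] = (fields["pro"].split(), fields.get("standard"))
    return lexicon


example = '<define word="Edinburgh"\n\tpro="E * d . i n . b r @@" standard="cstr">'
lexicon = read_definitions(example)
print(lexicon["Edinburgh"])
```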
 

I'll post information about how to get SSML when we have a demo
version ready.

----------------------------------------------------------------------
Paul Taylor
Research Fellow,
Centre for Speech Technology Research,
University of Edinburgh,
U.K.
----------------------------------------------------------------------
email: Paul.Taylor@ed.ac.uk
tel:   +44 131 650 2793
WWW: http://www.cstr.ed.ac.uk
----------------------------------------------------------------------

