Next: Synthesis Up: Speechalator: two-way speech-to-speech translation Previous: Recognition


Our translation uses an explicit language-independent interlingua formalism, so that new languages can be supported without affecting existing ones. Designing an interlingua formalism is not easy, but we already have experience in this area [8].

Our interlingua representation is based on speaker intention rather than literal meaning. The speaker's intention is represented as a domain-independent speech act followed by domain-dependent concepts. We use the term domain action to refer to the combination of a speech act with domain-specific concepts. Examples of domain actions and speech acts are shown in Figure 1. Domain actions are constructed compositionally from an inventory of speech acts and an inventory of concepts. Specific information concerning predicate participants, objects, etc. is represented by arguments and values. The allowable combinations of speech acts, concepts, arguments and values are formalized in a human- and machine-readable specification document.
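The compositional construction described above can be sketched in code. This is an illustrative model only, not the actual Speechalator formalism: the inventory names and the validation function are hypothetical stand-ins for the checks implied by the specification document.

```python
# Illustrative sketch of a compositional "domain action": a
# domain-independent speech act plus domain-specific concepts, with
# specifics carried as argument=value pairs. All names are invented
# for illustration; the real inventories live in the spec document.

SPEECH_ACTS = {"give-information", "request-information", "acknowledge"}
CONCEPTS = {"availability", "body-state", "personal-data"}

def make_domain_action(speech_act, concepts, arguments):
    """Check a combination against the inventories and return it."""
    if speech_act not in SPEECH_ACTS:
        raise ValueError(f"unknown speech act: {speech_act}")
    for c in concepts:
        if c not in CONCEPTS:
            raise ValueError(f"unknown concept: {c}")
    return {"speech_act": speech_act,
            "concepts": list(concepts),
            "arguments": dict(arguments)}

# "I have a husband and two children ages two and eleven."
da = make_domain_action(
    "give-information", ["personal-data"],
    {"family": {"husband": 1, "children": [2, 11]}})
print(da["speech_act"])
```

In a real system the allowable argument names and value types would also be validated per concept, as the specification document formalizes.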

Figure 1: Examples of Speech Acts and Domain Actions
[Figure content not fully recoverable in this version: the examples included the utterance "I have a husband and two children ages two and eleven." and an interlingua fragment ending "body-object-spec=(whose=you, shoulder))".]

Our initial system used an off-device interlingua-to-text generation system, as the generator had not yet been ported to the PDA device. This worked well and could handle large grammars, but given the network overhead it was slower than we wished.

We took two parallel tracks to solve this. The first was to investigate porting the existing generator to the PDA, but as it was written in C++ this would take time. The second was to use a statistical translation mechanism.

Statistical machine translation has become more popular as its performance has improved. Normally, models are trained on corpora of parallel text, with each utterance in one language paired with its translation in the target language. This basic model, however, would remove the advantages of an interlingua, as conventional statistical MT techniques would require parallel corpora for each language pair we wished to support. Thus, instead of pairing textual utterances in two languages, we used a parallel corpus of interlingua representations and their realizations as textual utterances in the target language. This model does introduce different problems, as the interlingua representation is effectively a tree structure. These techniques are relatively new and will be published elsewhere. But because they were successful, and because we had an efficient implementation of the engine on the PDA, it was possible to use this engine with the Speechalator as a fully untethered translation device.
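The actual tree-based techniques are published elsewhere; purely as illustration of training on interlingua/target-text pairs rather than text/text pairs, here is a toy sketch. It is not the Speechalator engine: it linearizes each interlingua representation to a string, counts target realizations seen in a (made-up) parallel corpus, and backs off to the argument-free domain-action head when the full representation is unseen.

```python
from collections import Counter, defaultdict

# Toy generation-by-lookup sketch (NOT the authors' algorithm).
# Corpus entries pair a linearized interlingua string with a
# target-language utterance; both sides here are invented examples.
corpus = [
    ("c:give-information+body-state (pain=head)", "my head hurts"),
    ("c:give-information+body-state (pain=head)", "I have a headache"),
    ("c:request-information+availability (time=tomorrow)",
     "are you free tomorrow"),
]

full = defaultdict(Counter)   # full interlingua string -> realizations
head = defaultdict(Counter)   # domain-action head only -> realizations
for ilt, text in corpus:
    full[ilt][text] += 1
    head[ilt.split()[0]][text] += 1

def generate(ilt):
    """Return the most frequent realization, backing off to the head."""
    if ilt in full:
        return full[ilt].most_common(1)[0][0]
    key = ilt.split()[0]
    if key in head:
        return head[key].most_common(1)[0][0]
    return ""

print(generate("c:request-information+availability (time=today)"))
```

A real tree-to-string model would decompose the interlingua tree and combine partial realizations, rather than matching whole linearized strings; the backoff here only hints at that compositionality.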

Thus we have two methods: one running solely on the device, and a second that covers a much larger translation problem when a wireless connection to a server is available.

Only basic evaluation has been carried out on these generation models; this work is ongoing.

Alan W Black 2003-10-27