
Recording in multiple styles

[13] identify two basic methods for dealing with multiple styles in a unit selection speech synthesis paradigm. In the first, separate voices are built for different styles or domains, such as a command voice and an interview voice, and the application using the synthesizer switches between them. This is called tiering. The technique works well when there is a well-defined distinction between the voice types, for example when the domain changes in a well-defined way, such as from weather information to flight information, or even from good-weather to bad-weather information.
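As a rough illustration of tiering, the sketch below keeps one separately built voice per domain and leaves the switching decision to the application. The class `TieredSynthesizer`, its methods, and the voice names are hypothetical and are not taken from [13] or from any particular synthesizer; `synthesize` only reports which voice would be used rather than producing audio.

```python
class TieredSynthesizer:
    """Holds one separately built voice per domain (hypothetical sketch)."""

    def __init__(self, voices, default_domain):
        # voices maps a domain name to the name of the voice built for it.
        if default_domain not in voices:
            raise KeyError(f"no voice built for domain {default_domain!r}")
        self.voices = dict(voices)
        self.active = default_domain

    def switch(self, domain):
        # The application, not the synthesizer, decides when to change voice.
        if domain not in self.voices:
            raise KeyError(f"no voice built for domain {domain!r}")
        self.active = domain

    def synthesize(self, text):
        # Stand-in for real synthesis: tag the text with the active voice.
        return f"[{self.voices[self.active]}] {text}"


synth = TieredSynthesizer(
    {"weather": "weather_voice", "flight": "flight_voice"},
    default_domain="weather",
)
first = synth.synthesize("Rain is expected this afternoon.")
synth.switch("flight")  # explicit, well-defined domain change
second = synth.synthesize("The flight departs at noon.")
```

Every utterance here is rendered entirely by one voice, which is why the approach suits applications where the boundary between domains is clear.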

The second method for combining voice types is called blending. In this model the recordings for the different styles are mixed into a single database. This allows a more gradual change between voice types, and offers the potential for mixed styles. The style is selected automatically, based on the requested units, and may be influenced implicitly by the words and phrases being synthesized: command words are more likely to be synthesized from the command phrases in the database, while general information may come from a more neutral part of the database.
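The implicit style selection described above can be sketched with a toy word-level selector. This is a simplification made under stated assumptions: real unit selection works on much smaller units and also uses join costs, and the names `Unit`, `target_cost`, and `select_units`, the cost values, and the tiny database are all hypothetical. The point is only that the `style` label is never consulted during selection, yet the chosen units tend to come from the recordings whose context matches the request.

```python
from dataclasses import dataclass

INF = float("inf")


@dataclass(frozen=True)
class Unit:
    word: str       # the word this unit realizes
    prev_word: str  # the word that preceded it in the recording
    style: str      # which recordings it came from; not used by selection


def target_cost(unit, word, prev_word):
    # A unit for the wrong word is unusable; a matching context is preferred.
    if unit.word != word:
        return INF
    return 0.0 if unit.prev_word == prev_word else 1.0


def select_units(words, database):
    # Greedy selection: pick the cheapest unit for each word in turn.
    selected = []
    prev = "<s>"  # sentence-start marker
    for word in words:
        best = min(database, key=lambda u: target_cost(u, word, prev))
        if target_cost(best, word, prev) == INF:
            raise KeyError(f"no unit for word {word!r}")
        selected.append(best)
        prev = word
    return selected


# One blended database holding command-style and neutral recordings.
database = [
    Unit("stop", "<s>", "command"),
    Unit("now", "stop", "command"),
    Unit("the", "<s>", "neutral"),
    Unit("train", "the", "neutral"),
    Unit("will", "train", "neutral"),
    Unit("stop", "will", "neutral"),
    Unit("now", "is", "neutral"),
]

command_styles = [u.style for u in select_units(["stop", "now"], database)]
mixed_styles = [
    u.style
    for u in select_units(["the", "train", "will", "stop", "now"], database)
]
```

In this toy database the short command is drawn entirely from the command recordings, while the longer sentence draws mostly on neutral units and finishes with a command-style one, giving a mixed-style result without any explicit style request.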

This technique works well for mixed-domain synthesis, where a domain-based database is combined with other domain-based databases and/or general ones, though it helps if they are all in basically the same style. Mixing domain-based and general databases in a blended voice can produce excellent quality in domain and reasonable quality out of domain, which is useful for many applications.


Alan W Black 2003-09-07