Although we have improved the quality of speech synthesis substantially with unit selection techniques, we are far from providing the general, flexible, efficient system that users actually require. When demonstrating synthesis, we have to be careful that people understand its limitations, and that good examples do not necessarily translate into consistently good and appropriate synthesis when embedded in an application.
In the short term, domain-directed synthesis is clearly better than pre-recorded prompts, and we can already cater for very large domains. But we have to be thinking about the next stage. We must represent the speech signal better to allow for variation, and we must define the controls at a suitable level of abstraction so that applications can choose the style and quality they desire.