We believe we have shown that an F0 generation algorithm based on the Tilt theory of intonation can produce acceptable contours from models trained on databases of natural speech. Our results suggest accuracy on held-out test data that is equal to or better than that of other similar experiments on the same database. Our best scores are an RMSE of 32.5 Hz and a correlation of 0.60, while a ToBI-based approach [2] gives 34.5 Hz and 0.62, and the dynamical system model [6] produces an RMSE of 33 Hz.
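
For readers who want to compute comparable figures, the following is a minimal sketch of the two measures quoted above: root mean squared error (in Hz) and Pearson correlation between a generated and a natural F0 contour. It assumes the two contours are already time-aligned and sampled at the same frames, and that only frames to be scored (for example, voiced frames) have been kept; how the frames were selected for the scores reported here is not specified in this section. The function names and the F0 values are hypothetical and chosen purely for illustration.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between two equal-length F0 sequences (Hz)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("contours must be non-empty and of equal length")
    squared_error = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared_error / len(predicted))


def correlation(predicted, reference):
    """Pearson correlation coefficient between two equal-length F0 sequences."""
    n = len(predicted)
    if n != len(reference) or n < 2:
        raise ValueError("contours must have equal length of at least 2")
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    covariance = sum((p - mean_p) * (r - mean_r)
                     for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant contour")
    return covariance / math.sqrt(var_p * var_r)


# Illustrative values only (Hz); these are not taken from the database.
natural_f0 = [120.0, 130.0, 140.0, 150.0]
generated_f0 = [110.0, 140.0, 130.0, 160.0]

print("RMSE (Hz):", rmse(generated_f0, natural_f0))
print("Correlation:", correlation(generated_f0, natural_f0))
```

The two measures are complementary: RMSE penalises absolute distance from the natural contour, while correlation rewards a matching shape regardless of overall offset or scale, so one system can score better on one measure and worse on the other.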

Kurt Dusterhoff
Tue Jul 1 11:51:11 BST 1997