
Comparison with other systems

Our results compare favourably with those of other systems. For example, our best breaks-correct score is 79.27%, compared with 70% in Ostendorf and Veilleux [ostendorf:94]. Wang and Hirschberg [hirschberg:92] prefer a measure that is the average of breaks correct and non-breaks correct. Our best score on this measure is 86.6%, compared with their 81.7% on a text-to-speech task (as opposed to their other tasks, in which acoustic information is also allowed). When we ran our system on the test sentences used by Ostendorf and Veilleux, we achieved 72.72% breaks correct with 4.27% juncture insertions, compared with the 70% and 5% reported in their paper. This shows that our system has some ability to transfer to completely unseen data in a different domain. Although our system gives the best results on the common test data, the improvement cannot be attributed solely to differences in the techniques themselves: we trained our system on substantially more data, and this may have been a large factor in the improvement. Performance aside, we believe our system is somewhat simpler than the two others mentioned here, which may make it more attractive from an implementation point of view.
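The measures compared above can be illustrated with a minimal sketch. It assumes each word juncture carries a binary label (break or non-break) and that reference and predicted labels are aligned one-to-one. The function name `break_scores` and the toy data are hypothetical, and this is not the evaluation code used for the reported figures.

```python
def break_scores(reference, predicted):
    """Return (breaks correct %, non-breaks correct %, their average).

    reference, predicted: equal-length sequences of booleans, one per
    word juncture, where True marks a phrase break (assumed encoding).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    n_breaks = sum(1 for ref, _ in pairs if ref)
    n_non_breaks = len(pairs) - n_breaks
    if n_breaks == 0 or n_non_breaks == 0:
        raise ValueError("need at least one break and one non-break")

    # A juncture counts as correct when prediction and reference agree.
    breaks_hit = sum(1 for ref, pred in pairs if ref and pred)
    non_breaks_hit = sum(1 for ref, pred in pairs if not ref and not pred)

    breaks_correct = 100.0 * breaks_hit / n_breaks
    non_breaks_correct = 100.0 * non_breaks_hit / n_non_breaks
    # The averaged measure weights the two classes equally, however
    # rare breaks are relative to non-breaks.
    average = (breaks_correct + non_breaks_correct) / 2.0
    return breaks_correct, non_breaks_correct, average


# Toy data: six junctures, two of which are reference breaks.
reference = [True, False, False, True, False, False]
predicted = [True, False, True, False, False, False]
print(break_scores(reference, predicted))
```

Because non-breaks usually far outnumber breaks, a system that rarely predicts a break can score well on non-breaks correct alone; averaging the two per-class figures avoids rewarding that behaviour.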


Alan W Black
1999-03-20