
Recognizer

We use the CMU Sphinx II speech recognizer with gender-specific, telephone-quality acoustic models from the Communicator system [2]. The training data consists of the CMU Communicator data collected over the last four years. We automatically split this data into male and female speech and trained a separate model on each. Both models are run in parallel on every utterance, and the better of the two results is selected. Like others, we have found that this improves recognition accuracy.
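The parallel-decoding step can be illustrated with a minimal sketch. It is not the Sphinx II interface: the `Hypothesis` type, the `recognize` function, and the stub decoders are hypothetical names introduced here, and the selection rule (keep the hypothesis with the highest score) is an assumption, since the text does not state how the best result is chosen.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Hypothesis:
    text: str
    score: float  # assumption: higher is better (e.g. a log-likelihood)


# Hypothetical decoder signature: raw audio in, one scored hypothesis out.
Decoder = Callable[[bytes], Hypothesis]


def recognize(audio: bytes, decoders: Dict[str, Decoder]) -> Tuple[str, Hypothesis]:
    """Run every decoder on the same audio concurrently and keep the top-scoring result."""
    with ThreadPoolExecutor(max_workers=len(decoders)) as pool:
        futures = {name: pool.submit(decode, audio) for name, decode in decoders.items()}
        results = {name: future.result() for name, future in futures.items()}
    # Assumed selection rule: the model whose hypothesis has the highest score wins.
    best = max(results, key=lambda name: results[name].score)
    return best, results[best]


# Stub decoders standing in for the male and female acoustic models.
def decode_male(audio: bytes) -> Hypothesis:
    return Hypothesis("when is the next bus", -120.0)


def decode_female(audio: bytes) -> Hypothesis:
    return Hypothesis("when is the next bus to the airport", -95.5)


model, hyp = recognize(b"", {"male": decode_male, "female": decode_female})
```

With these stubs, `model` is `"female"` because its hypothesis carries the higher score. In a real system the two scores would have to be comparable across models for this rule to be meaningful.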

We do have access to recordings from the PAT help line, but their content is often more general than bus schedules, and the archiving compression has introduced acoustic artifacts, so the recordings do not reflect the acoustic conditions of the telephone speech we expect. For the present, existing telephone-bandwidth models are therefore appropriate; as we collect data, we will retrain our system.



Alan W Black 2003-10-27