Carnegie Mellon Sphinx Speech Group
Sphinx-3 Decoder User Guide
Mosur Ravishankar (Ravi)
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
10 Nov 1997
Copyright (c) 1997 Carnegie Mellon University. ALL RIGHTS RESERVED.
Introduction
Sphinx-3 (S3) is the successor to the Sphinx-II speech recognition system from
Carnegie Mellon University. The main differences between the two are:
- S3 supports a much wider range of acoustic models. Specifically, it can
handle discrete, semi-continuous, or fully continuous acoustic models. Sphinx-II
can handle only semi-continuous models.
- S3 was written entirely from scratch and does not contain any historical
appendages.
The entire set of Sphinx-3 decoder programs can be found here.
The binaries include the following:
- s3decode: The Viterbi decoder using beam search.
- s3dag: Shortest-path search of the DAG constructed from the Viterbi decoder word lattice.
- s3astar: N-best list generation from the Viterbi decoder word lattice.
- s3align: Viterbi forced alignment.
- s3allphone: Allphone Viterbi decoder.
Running any of these programs without arguments prints a short description of
its command-line arguments and their defaults.
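
The usage descriptions of all five programs can be collected in one pass. The following is a minimal Python sketch, not part of the Sphinx-3 distribution. It assumes the binaries are installed under the names listed above and are reachable through PATH. The helper name collect_usage and the 10-second timeout are illustrative choices. Because this guide does not say which output stream the description is written to, the sketch captures both.

```python
import shutil
import subprocess

# Program names as listed in this guide; their presence on PATH is an assumption.
S3_PROGRAMS = ["s3decode", "s3dag", "s3astar", "s3align", "s3allphone"]


def collect_usage(programs=S3_PROGRAMS, timeout=10):
    """Run each program with no arguments and return the text it prints.

    Returns a dict mapping program name to captured output.
    Programs that are not found on PATH, or that do not finish
    within `timeout` seconds, map to None.
    """
    usage = {}
    for name in programs:
        path = shutil.which(name)
        if path is None:
            usage[name] = None  # binary not installed or not on PATH
            continue
        try:
            # Merge stderr into stdout: the stream used for the usage
            # text is not stated in the guide, so capture both.
            result = subprocess.run(
                [path],
                stdin=subprocess.DEVNULL,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                text=True,
                errors="replace",
                timeout=timeout,
                check=False,  # a non-zero exit status is not treated as an error here
            )
        except subprocess.TimeoutExpired:
            usage[name] = None
            continue
        usage[name] = result.stdout
    return usage


if __name__ == "__main__":
    for name, text in collect_usage().items():
        print(f"== {name} ==")
        print(text if text is not None else "(not available)")
```

Saving the captured text alongside experiment logs records the defaults that were in effect for a given build of the decoder.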