Name | Remarks |
CVSROOT |
|
s3.2 |
- Location:
$CVSROOT/s3.2
- Fast Sphinx-3 decoder using lextree organization:
- Runs at 5-10x real time on large vocabulary tasks (that is, 5-10 times slower than real time)
- Continuous density acoustic models only
- Batch-Mode operation only
gausubvq : Sub-vector clustered acoustic model building
- Needed for fast acoustic model evaluation
- Documentation:
|
libutil |
- Location:
$CVSROOT/libutil
- Miscellaneous utilities needed by s3.2 (some of them written by Eric Thayer
and Paul Placeway):
- Platform-independent data types
- Command-line argument parsing
- Hash tables
- Heap structures (for sorting)
- Memory allocation
- CPU usage profiling
- Error reporting
- Not Sphinx specific
|
s3 |
- Location:
/net/alf20/usr2/rkm/s3
- Original Sphinx-3 decoder
- Slow; 50-100x real time speed on large vocabulary tasks
- Any kind of acoustic model (discrete, semi-continuous, continuous, others)
- Major applications:
s3decode and s3decode-anytopo : Speech-to-text Decoding
s3align : Forced alignment
s3allphone : Allphone decoding
s3astar : A* search, N-best generation
s3dag : Shortest-path search
- Other utilities:
stseg-read : State-segmentation binary file reader
sen2s2 : Sphinx-II "sendump" file creation from Sphinx-3
acoustic model
- Documentation:
|
s2 (fbs8) |
- Location: Open source (search for
"CMU Sphinx")
- Sphinx-II decoder
- Real-time operation
- Semi-continuous acoustic models only (Sphinx-II format)
- User applications support:
- Compiled into a library with a straightforward API for building
speech-enabled applications
- Continuous-listening support
- Dynamic language model loading and switching
- Several test applications:
- Basic dictation with and without "push-to-talk"
- Basic audio recording and playback
- Audio segmentation using the continuous listener
- Additional recognition modes:
- Forced alignment
- Allphone decoding
- A* search, N-best generation
- Shortest-path search
|
lm3g2dmp |
- Location:
$CVSROOT/lm3g2dmp
- Conversion from "arpabo" format language model file to binary ("dump")
format used by all decoders
|
SphinxOCX |
|
data |
- Location:
/net/alf20/usr/rkm/SHARED
- An attempt at collecting all available data and models under one roof (not
entirely successful):
- Cepstrum files
- Control files
- HMMs (acoustic models)
- Language models
- Dictionaries
- Implemented mainly via symbolic links, rather than physical copies
|
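
The "x real time" figures quoted for the decoders above can be read as the ratio of decoding time to audio duration, so a value above 1 means slower than real time. A minimal sketch of that arithmetic follows; the function name `real_time_factor` and the sample timings are hypothetical illustrations, not measurements or code from any of the Sphinx decoders.

```python
def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Return decoding time divided by audio duration (the "x real time" figure)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds


# Hypothetical example: a 10-second utterance that takes 75 seconds to decode.
xrt = real_time_factor(decode_seconds=75.0, audio_seconds=10.0)
print(f"{xrt:.1f}x real time")  # prints "7.5x real time"
```

Under this reading, a decoder described as operating in real time would keep the ratio at or below 1.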