Sphinx Resource

Disclaimer: Information provided here may not represent the standpoints of Carnegie Mellon University or the CMU Sphinx Group.

During the last few years, many sites have written extensions of Sphinx. We have started to coordinate all this effort and are trying to incorporate some of these extensions into the standard distribution of Sphinx. This page is far from complete. Please contact me if you are interested in writing an extension for Sphinx. I will be more than happy to include a link to your extension on this page.

20060415 Todo: This page badly needs to be re-organized.

20060220 Todo: This page needs to be re-organized.

A Sphinx 2 Russian model

From User Shmyrev Nick:

"Russian sphinx model By: Shmyrev Nick (nshmyrev) - 2006-12-28 10:19 Hey all, just to let you know, I built some time ago russian sphinx2 model: ftp://ftp.berlios.de/pub/festlang/sphinx-rus.tar.gz Now I have more data available and it will be possible to train better model so if someone interested help is appreciated. Also language model is missing still. "

Enabling applications with more than 65536 unigrams (words)

I have made some changes so that Sphinx 3.X (X=6, RC1) and the CMU-Cambridge Language Modeling Toolkit V2 can use more than 65536 words. That basically means some of the data structures have to be changed from 16 bits to 32 bits. I have not had a chance to test either tool in a real-life scenario. However, I have spent quite a lot of time making sure the 32-bit code generates output that is exactly the same as the 16-bit code. The code has also passed some simple tests.
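To illustrate why a 16-bit field caps the vocabulary at 65536 entries, here is a minimal Python sketch. It is not code from Sphinx or the toolkit (both are written in C); the function names are hypothetical, and the real data structures may reserve some values for special purposes, so the usable limit there may be slightly lower.

```python
import struct

def pack_word_id(word_id, wide=False):
    """Pack a word ID as an unsigned 16-bit ("<H") or 32-bit ("<I") integer.

    Hypothetical helper for illustration only; it does not reproduce any
    Sphinx or CMU-Cambridge toolkit file format.
    """
    fmt = "<I" if wide else "<H"
    return struct.pack(fmt, word_id)

def fits(word_id, wide=False):
    """Return True if the word ID can be stored at the chosen width."""
    try:
        pack_word_id(word_id, wide)
        return True
    except struct.error:
        return False

def truncate16(word_id):
    """Show what silent truncation to 16 bits would do to a word ID."""
    return word_id & 0xFFFF

if __name__ == "__main__":
    print(fits(65535))             # last ID that fits in 16 bits
    print(fits(65536))             # one past the 16-bit range
    print(fits(65536, wide=True))  # fits once the field is 32 bits wide
    print(truncate16(65536))       # would collide with word ID 0
```

With a 16-bit field, the 65537th word either cannot be stored at all or, if truncated silently, collides with an earlier word ID. Widening the field to 32 bits removes the limit at the cost of larger structures.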

Neither program is officially versioned because of the risky nature of the changes, so use them as is.

For downloads and detailed instructions, please go here.

A French Acoustic Model created by LIUM

LIUM has open-sourced their Sphinx III setup for French. You can find it here.

A version of lm3g2dmp and dag that can accept four-grams.

LIUM has produced a speech transcription system using Sphinx 3.3. They also worked out a way to make lm3g2dmp and dag accept four-grams. I personally think this is amazing work. You can find the tools here.

Thanks for your good work, Prof. Yannick Estève. I personally believe that the code could also be extended to accept n-grams with n > 4. Anyone who is interested, please kindly talk to me or Yannick about this.

A sped-up version of Sphinx 3.0

Download here.

As you may know, the standard CVS distribution of Sphinx 3 contains both Sphinx 3.0, the flat-lexicon version of the decoder, and Sphinx 3.3, the tree-lexicon version of the decoder. To warm myself up at CMU, I started adding some speed-ups to Sphinx 3.0 in December 2004. This branch of the code was later abandoned. I put it here so that, if you need a flat-lexicon decoder in your research, you can still enjoy speed-ups such as GMM selection, Gaussian selection, and down-sampling.

A branch of Sphinx 3.5 with end-pointing

Download here.

(20041024) This branch has been merged into Sphinx 3.5 and will be released in 3.5.RCIII.

(20040921) This package is mainly motivated by the CALO project, which requires a high-performance end-pointing routine for a meeting understanding task. It is currently not in the trunk of 3.X because we may need to invest a lot of work before we can get a nice interface for it. If you want to try the end-pointing functionality, you can download it from here. Remember, you need to train the end-pointing model yourself. :-)

A branch of Sphinx 3.5 with backtracking without using silence.

Click here

(20050913) This is now incorporated in a development version of Sphinx 3.6.

(20050526) This package has a minor modification that allows Sphinx 3.5 to backtrack without assuming silence is the final word. Provided as is, and barely tested.

A branch of SphinxTrain with LAPACK

Download here.

(20041028) An attempt to use LAPACK in SphinxTrain. I did it just for fun.

A branch of SphinxTrain that does Hua and Schultz's flexible tying

Download here.

(20041221) Created by David Huggins-Daines. An attempt to combine Sphinx's automatic question generation technique with Hua's flexible tying technique.

Click here

Resource for building Spanish models in Sphinx's format.

Click Here

(20041221) Carried out by Dr. Juan Arturo Nolazco Flores. It is a great effort by Juan. It makes use of an older version of Sphinx 3, with SphinxTrain as the trainer. Note that the resources in "Files for Training" need to be explicitly specified in the URL: http://speech.mty.itesm.mx/~jnolazco/proyectos/(file name)

A Perl/Tk interface for Sphinx 2

Click here

This is a nice Perl/Tk GUI written by my friend Thomas Harris. It can be pretty useful if you hate tweaking shell scripts. I really want to port it to Sphinx 3.4. Later. Later. I also made a local copy of it here.

A Perl module for Sphinx 2

Click here

A Perl interface written by David Huggins-Daines for Sphinx 2. Again, I really want to port it to Sphinx 3.4. Later. Later.

An implementation of voice authentication software based on Sphinx 3

Click here

Conducted by Roderick de Jong and Harmen van der Spek. An interesting project that tries to use the Sphinx 3 source code to build a voice authentication routine. Currently, it uses VQ as the model and DTW as the recognition algorithm.
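For readers unfamiliar with DTW (dynamic time warping), here is a minimal Python sketch of the textbook algorithm applied to an accept/reject decision. It is not taken from the project above: the function names, the Euclidean local cost, and the threshold are all assumptions for illustration, and the VQ modelling step is not shown.

```python
import math

def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of feature vectors.

    Uses Euclidean distance as the local cost and allows the three
    standard steps (insertion, deletion, match). Illustration only.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def accept(enrolled, attempt, threshold):
    """Hypothetical decision rule: accept if the DTW cost is small enough."""
    return dtw_distance(enrolled, attempt) <= threshold

if __name__ == "__main__":
    enrolled = [(0.0,), (1.0,), (2.0,)]
    slower = [(0.0,), (1.0,), (1.0,), (2.0,)]  # same shape, spoken slower
    print(dtw_distance(enrolled, slower))
    print(accept(enrolled, slower, threshold=0.5))
```

The point of the warping is that two utterances of the same phrase at different speeds can still align with low cost, which a frame-by-frame comparison would not allow. A real system would compare sequences of acoustic feature vectors and would need a threshold tuned on data.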