Long Qin

 

Office


GHC 6225

Carnegie Mellon University

5000 Forbes Avenue

Pittsburgh, PA 15213


Email


lqin (at) cs (dot) cmu (dot) edu

long.qin (at) mmodal (dot) com

 

I’m currently a Software Engineer at Duolingo, working on speech tasks in the Duolingo learning app and test center. Before that, I worked at M*Modal as a Research Scientist, improving speech recognition for medical transcription. I received my PhD and MS degrees from the Language Technologies Institute of Carnegie Mellon University under the supervision of Prof. Alex Rudnicky. I also received MS and BS degrees from the University of Science and Technology of China.


CV [pdf]


Research

  1. Deep learning (DNN) in speech recognition

  2. Automatic Speech Assessment

  3. Voice Activity Detection (VAD)

  4. Out-of-vocabulary (OOV) word learning

  5. Discriminative acoustic modeling

  6. Speaker adaptive training (SAT)

  7. Unsupervised / semi-supervised lexicon learning

  8. Statistical parametric speech synthesis


Selected Publications

  1. PhD Dissertation: Learning out-of-vocabulary words in automatic speech recognition, Carnegie Mellon University. [document] [presentation]

  2. Building a vocabulary self-learning speech recognition system, Interspeech-2014. [pdf]

  3. Learning better lexical properties for recurrent OOV words, ASRU-2013. [pdf]

  4. Using web text to improve keyword spotting in speech, ASRU-2013. [pdf]

  5. Finding recurrent OOV words, Interspeech-2013. [pdf]

  6. OOV word detection using hybrid models with mixed types of fragments, Interspeech-2012. [pdf]

  7. System combination for out-of-vocabulary word detection, ICASSP-2012. [pdf]

  8. OOV detection and recovery using hybrid models with different fragments, Interspeech-2011. [pdf]

  9. The effect of lattice pruning on MMIE training, ICASSP-2010. [pdf]

  10. Implementing and improving MMIE training in SphinxTrain, CMU Sphinx Workshop 2010. [pdf]


Courses

  1. 10-701 Machine Learning

  2. 11-711 Algorithms for NLP

  3. 11-721 Grammars and Lexicons

  4. 11-733 Multilingual Speech to Speech Translation

  5. 11-741 Information Retrieval

  6. 11-751 Speech Recognition and Understanding

  7. 11-752 Speech II

  8. 11-754 Dialog System

  9. 11-756 Design and Implementation of ASR Systems

  10. 11-761 Language and Statistics

  11. 11-791 Software Engineering


Interests

Football, Soccer, Movies, Skiing, Skating

CMU Sphinx


Benchmark Testing Results [pdf]

Discriminative Training Equations [pdf]


Links


CMUSphinx

Speech at CMU

Sphinx Inner Web

Online LM tool

CMU-CAM LM toolkit

Sphinx Tutorial