Maxine Eskenazi www.cs.cmu.edu/~max
Principal Systems Scientist Tel. 412 268-3858
Language Technologies Institute Fax 412 268-6298
Carnegie Mellon University 1st three letters of my 1st name at 3 initials of the University dot edu
6413 Gates Hillman Complex
5000 Forbes Ave
Pittsburgh, PA 15213 USA
!!! Now available !!!
Eskenazi, M., Levow, G., Meng, H., Parent, G., Suendermann, D., 2013, Crowdsourcing for Speech Processing, Wiley. All proceeds go to the ISCA Student Travel Fund.
I was chair of the SLaTE special interest group (Speech and Language Technology in Education) of ISCA (International Speech Communication Association) from 2006 to 2011. I am Director of the Dialog Research Center, and in 2013 I was co-chair of SIGDIAL’13.
MY INTERESTS, FROM SOME OF MY PUBLICATIONS:
Understanding the variability of the speech signal. Creating automatic systems (using spoken dialogue architectures, automatic speech recognition and synthesis) that benefit from this knowledge and that in turn provide a real benefit to end users. This endeavor implies studying groups of speakers, input conditions, styles of speech and detecting the acoustic and upper-level indices that are indicative of these variants. End benefits may include language learning and information giving and gathering.
Learning a foreign language is not only a question of getting the words and syntax right. You can’t be understood until you can pronounce it well! FLUENCY is designed to let you speak, then give you feedback as to how you did – what to correct, and how to correct it. Using state-of-the-art speech recognition technology, SPHINX from Carnegie Mellon, this interactive software allows you to speak, to get corrections, to listen to yourself and a native speaker and try again, over and over, as many times as you want. FLUENCY is patient and can be customized to do what you want. Practice in the privacy of your PC or phone and then speak to others when you’re ready!
FLUENCY detects phonetic and duration errors in speech. It can be used for English as a second language and can be adapted to other languages. The user can make it easier or harder to succeed. Other user-adaptive options to maximize learning also exist. The software can be used alone or in conjunction with language classes.
I am interested in the variability of the speech signal – relating its sources to its manifestations, characterizing groups of speakers, grouping observable phenomena. Non-native speech is a specific interest within this area, as is speaking style.
Spoken dialogue systems are one of my areas of interest. I am less interested in play systems than systems that have real users and serve a real purpose, such as Let’s Go. The Let’s Go system has been answering the phone for the Port Authority of Allegheny County every evening since the beginning of March 2005. A system with real users provides the ideal platform to run experiments on spoken dialogue such as timing and lexical entrainment. When one is lucky enough to be endowed with a system that has a constant pipeline of real users, one feels compelled to share this treasure with the research community, freely distributing the data we collect and giving them access to the system to conduct experiments of their own. This is part of the mission of the Dialog Research Center.
I am also interested in how to teach foreign languages effectively, both by a human and by a computer. This includes curriculum creation and navigation, interface issues and issues leading to robust learning. Culture underlies all language and so is also an interest of mine. And of course I am interested in pinpointing errors in non-native speech (patented!) and then providing appropriate corrective feedback.
The system I created to detect and correct foreign speakers’ pronunciation errors in English was called Fluency, and the basic algorithms developed in that project were spun off into the NativeAccent™ product sold by the company I started, Carnegie Speech™. Another example of research results put to use in real life! And the REAP system is also getting into the hands of many students and teachers.
I am the past chair of the ISCA special interest group on Speech and Language Technology in Education (SLaTE). Please visit our website for more information (www.sigslate.org).
Fluency – a project to use automatic speech recognition to detect pronunciation errors and to provide appropriate correction information – contact me directly for more information.
Let’s Go – a project using a spoken dialogue system to expand access to such systems to the elderly and to non-native speakers. http://www.speech.cs.cmu.edu/letsgo/
DialRC – a Center that is at the service of the Spoken Dialog community, providing data, running studies, educating and running a Spoken Dialog Challenge http://dialrc.org/
LexE – a project on lexical entrainment on both sides of the dialog with spoken dialog systems, sometimes with non-native speakers – see publications below
Lopes, J., Eskenazi, M., Trancoso, I., 2015, From rule-based to data-driven lexical entrainment models in spoken dialog systems, Computer Speech and Language, 31 (1), 87-112.
Saz, O., Lin, Y., Eskenazi, M., 2015, Measuring the impact of translation on the accuracy and fluency of vocabulary acquisition of English, Computer Speech and Language, 31 (1), 49-64.
Ghigi, F., Eskenazi, M., Torres, I., Lee, S., 2014, Incremental Dialog Processing in a Task-Oriented Dialog, Proc Interspeech2014, 308-312.
Black, A., Eskenazi, M., 2014, Real Users and Real Dialog Systems: the Hard Challenge for SDS, in Natural Interaction with Robots, Knowbots and Smartphones: Putting Spoken Dialog Systems into Practice, eds J Mariani, S Rosset, M Garnier-Rizet, L Devillers, Springer, p. 29-36.
Correia, R., Mamede, N., Baptista, J., Eskenazi, M., 2014, Toward Automatic Classification of Metadiscourse, in Proc PolTAL2014, Warsaw, p. 262-269.
Correia, R., Mamede, N., Baptista, J., Eskenazi, M., Using the Crowd to Annotate Metadiscourse Acts, in Proc Joint ACL ISO Workshop on Interoperable Semantic Annotation (LREC), Reykjavik.
Pellow, D., Eskenazi, M., 2014, Tracking Human Process Using Crowd Collaboration to Enrich Data, Proc HCOMP2014.
You can learn about the REAL Challenge 2014 (and 2015!) here.
Eskenazi, M., 2013, Can I use crowdsourcing to process my data?, Special workshop on Sharing, Structuring and Processing Data, NWAV2013. Pittsburgh PA, October 17, 2013.
Dela Rosa, K., Eskenazi, M., 2013, Self-Assessment in the REAP Tutor: Knowledge, Interest, Motivation and learning, in International Journal of Artificial Intelligence in Education, June 2013.
Davis, E., Saz, O., Eskenazi, M., 2013, POLLI: a handheld-based aid for non-native student presentations, Proc. ISCA SLaTE workshop, Grenoble, p. 43-47.
Eskenazi, M., Lin, Y., Saz, O., 2013, Tools for non-native readers: the case for translation and simplification, Proceedings of the Workshop on Natural Language Processing for Improving Textual Accessibility, NAACL2013, Atlanta, p.20-28
J. Lopes, M. Eskenazi and I. Trancoso, 2013, Automated Two-Way Entrainment to Improve Spoken Dialog System Performance, Proceedings ICASSP 2013.
Eskenazi, M., Levow, G.A., Meng, H., Parent, G., Suendermann, D., 2013, Crowdsourcing for Speech Processing, Wiley.
Sungjin Lee and Maxine Eskenazi, Recipe For Building Robust Spoken Dialog State Trackers: Dialog State Tracking Challenge System Description, Proceedings of SIGDIAL 2013, Metz, France, 2013.
Pellegrini, T., Correia, R., Trancoso, I., Baptista, J., Mamede, N., Eskenazi, M., 2013, ASR-based exercises for listening comprehension practice in European Portuguese, Computer Speech and Language.
Andrew Fandrianto, Maxine Eskenazi, 2012, Prosodic Entrainment in an Information-Driven Dialog System, Proceedings of Interspeech2012, Portland, OR, USA
Sungjin Lee and Maxine Eskenazi, Exploiting Machine-Transcribed Dialog Corpus to Improve Multiple Dialog States Tracking Methods, Proceedings of SIGDIAL 2012, Seoul, South Korea, 2012.
Sungjin Lee and Maxine Eskenazi, An Unsupervised Approach to User Simulation: toward Self-Improving Dialog Systems, Proceedings of SIGDIAL 2012, Seoul, South Korea, 2012.
Sungjin Lee and Maxine Eskenazi, POMDP-Based Let's Go System for Spoken Dialog Challenge, Proceedings of the 2012 IEEE Workshop on Spoken Language Technology (SLT 2012), Miami, USA, 2012.
J. Lopes, M. Eskenazi and I. Trancoso, 2012, Incorporating ASR information in Spoken Dialog System confidence score. Proceedings of PROPOR 2012.
José Lopes, Andrew Fandrianto, Maxine Eskenazi, Isabel Trancoso, 2012, Can a spoken dialog system be used as a tool for convergence?, International Symposium on Imitation and Convergence in Speech, Aix-en-Provence, France.
Rui Correia, Jorge Baptista, Maxine Eskenazi, Nuno J. Mamede, Automatic Generation of Cloze Question Stems, In International Conference on Computational Processing of Portuguese (Propor 2012), Springer-Verlag, vol. 7243, series Lecture Notes in Artificial Intelligence, pages 168–178, Coimbra, Portugal, April 2012.
Saz, O., Eskenazi, M., 2012, Addressing Confusions in Spoken Language in ESL Pronunciation Tutors, Proc. Interspeech 2012, Portland.
Correia, R., Pellegrini, T., Eskenazi, M., Trancoso, I., Baptista, J., Mamede, N., 2011, Listening Comprehension Games for Portuguese: exploring the best features, Proceedings SLaTE 2011.
Dela Rosa, K., Eskenazi, M., 2011, Effect of Word Complexity on L2 Vocabulary Learning, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies's 6th Workshop on Innovative Use of NLP for Building Educational Applications (ACL-HLT: BEA 2011). 2011.
Dela Rosa, K., Eskenazi, M., 2011, Self-Assessment of Motivation: Explicit and Implicit Indicators in L2 Vocabulary Learning, Proceedings of the 15th International Conference on Artificial Intelligence in Education (AIED 2011).
Dela Rosa, K., Eskenazi, M., 2011, Impact of Word Sense Disambiguation on Ordering Dictionary Definitions in Vocabulary Learning Tutors, Proceedings of the 24th International FLAIRS Conference (FLAIRS 2011).
J. Lopes, M. Eskenazi and I. Trancoso, 2011, Towards choosing better primes for Spoken Dialog Systems. Proceedings ASRU 2011.
J. Lopes, I Trancoso, R Correia, T Pellegrini, H Meinedo, N Mamede, M Eskenazi, Multimedia Learning Materials, In Spoken Language Technology Workshop (SLT), 2010 IEEE, Berkeley, CA, USA, 24 January 2011
Parent, G., Eskenazi, M., 2011, Sources of variability and adaptive tasks, Proceedings CHI2011.
Parent, G., Eskenazi, M., 2011, Speaking to the Crowd: looking at past achievements in using crowdsourcing for speech and predicting future challenges, Proceedings Interspeech 2011, special session on crowdsourcing, Florence Italy.
Saz, O., Eskenazi, M., Identifying Confusable Contexts for Automatic Generation of Activities in Second Language Pronunciation Training, Proceedings SLaTE 2011, Venice.
Skory, A., Eskenazi, M., 2011, Generation of Educational Content through Gameplay, Proc. SLaTE2011, Venice, August 2011.
Alan W Black, Susanne Burger, Alistair Conkie, Helen Hastie, Simon Keizer, Oliver Lemon, Nicolas Merigaud, Gabriel Parent, Gabriel Schubiner, Blaise Thomson, Jason D. Williams, Kai Yu, Steve Young and Maxine Eskenazi, Spoken Dialog Challenge 2010: Comparison of Live and Control Test Results, Proc. SIGDIAL2011, Portland, OR.
Rui Correia, Jorge Baptista, Nuno J. Mamede, Isabel Trancoso, Maxine Eskenazi, Automatic Generation of Cloze Question Distractors, In Second Language Studies: Acquisition, Learning, Education and Technology, SLaTE: the ISCA SIG on Speech and Language Technology in Edu, Waseda University, Tokyo, Japan, September 2010
Dela Rosa, K., Parent, G., Eskenazi, M., 2010, Multimodal learning of words: A study on the use of speech synthesis to reinforce written text in L2 language learning, Proceedings of the ISCA Workshop on Speech and Language Technology in Education (SLaTE 2010)
Eskenazi, M., Black, A., 2010, SDC: The Spoken Dialog Challenge, invited short presentation at SIGDIAL 2010.
M. Heilman, K. Collins-Thompson, M. Eskenazi, A. Juffs, L. Wilson. 2010. Personalization of Reading Passages Improves Vocabulary Acquisition. International Journal of Artificial Intelligence in Education, Vol. 20 (1).
José Lopes, Isabel Trancoso, Rui Correia, Thomas Pellegrini, Hugo Meinedo, Nuno J. Mamede, Maxine Eskenazi, Multimedia Learning Materials, In IEEE Spoken Language Technology Workshop, IEEE, Berkeley, USA, December 2010.
Parent, G. and Eskenazi, M. 2010. Clustering dictionary definitions using Amazon Mechanical Turk. In Proc. of the NAACL/HLT workshop on Creating Speech and Language Data With Amazon's Mechanical Turk.
Parent, G. and Eskenazi, M. 2010. Lexical Entrainment of Real Users in the Let's Go Spoken Dialog System. In Proceedings of ISCA Interspeech 2010, Tokyo, Japan
Parent, G., Eskenazi, M. 2010. Toward better crowdsourced transcription: Transcription of a year of the let’s go bus information system data. In Proc SLT, Berkeley, CA.
Skory, A. and Eskenazi, M. (2010), Predicting Cloze Quality for Vocabulary Training, Proc. of NAACL HLT Workshop on Innovative Use of NLP for Building Educational Applications, Los Angeles, California
Skory, A. and Eskenazi, M. (2010), Automatic Selection of Collocations for Instruction, Proc. of SLaTE Workshop on Speech and Language Technology in Education, Tokyo, Japan
Black, A., and Eskenazi, M., "The Spoken Dialogue Challenge" SIGDIAL 2009, Queen Mary University, London. 2009.
Eskenazi, M., 2009, An overview of spoken language technology for education, Speech Communication, Elsevier, vol 51 issue 10 p. 832-844
Gonzalez-Brenes, J., Black, A., and Eskenazi, M. "Describing Spoken Dialogue Systems Differences" IWSDS 2009, Irsee, Germany.
Marujo, L., Lopes, J., Mamede, N., Trancoso, I., Pino, J., Eskenazi, M., Baptista, J., Viana, C., Porting REAP to European Portuguese, Proceedings ISCA Workshop on Speech and Language for Education SLaTE2009, Warwickshire England.
Pino, J., Eskenazi, M., 2009, Semi-Automatic Generation of Cloze Question Distractors: Effect of Students’ L1, Proceedings ISCA Workshop on Speech and Language for Education SLaTE2009, Warwickshire England.
Pino, J., Eskenazi, M., 2009, Measuring Hint Level in Open Cloze Questions, Proceedings FLAIRS09.
Pino, J., Eskenazi, M., 2009, An Application of Latent Semantic Analysis to Word Sense Discrimination for Words with Related and Unrelated Meanings, Proceedings of the Fourth Workshop on Innovative Use of NLP for Building Educational Applications at NAACL, Boulder Colorado.
Eskenazi, M., Black, A., Raux, A. and Langner, B., 2008, Let's Go Lab: a platform for evaluation of spoken dialog systems with real world users, Interspeech 2008, Brisbane, Australia.
M. Heilman, K. Collins-Thompson, and M. Eskenazi. 2008. An analysis of statistical models and features for reading difficulty prediction. In Proc. of The 3rd Workshop on Innovative Use of NLP for Building Educational Applications.
M. Heilman, L. Zhao, J. Pino, and M. Eskenazi. 2008. Retrieval of reading materials for vocabulary and reading practice. In Proc. of the 3rd Workshop on Innovative Use of NLP for Building Educational Applications.
M. Heilman and M. Eskenazi. 2008. Self-assessment in vocabulary tutoring. In Proc. of the Young Researcher's Track. Ninth International Conference on Intelligent Tutoring Systems.
J. Pino, M. Heilman, and M. Eskenazi. 2008. A selection strategy to improve cloze question quality. In Proc. of the Workshop on Intelligent Tutoring Systems for Ill-Defined Domains. Ninth International Conference on Intelligent Tutoring Systems.
A. Raux and M. Eskenazi, 2008, Optimizing Endpointing Thresholds using Dialogue Features in a Spoken Dialogue System, SIGdial 2008, Columbus, OH, USA.
Raux, A., Langner, B., Black, A. and Eskenazi, M. Building Practical Spoken Dialog Systems ACL/HLT 2008 Tutorial, Columbus, Ohio. PLEASE SEE DIALRC.ORG
Ai, H., Raux, A., Bohus, D., Eskenazi, M., and Litman, D., 2007, Comparing Spoken Dialog Corpora Collected with Recruited Subjects versus Real Users, 8th SIGDial Workshop on Discourse and Dialogue, Antwerp, Belgium.
D. Bohus, A. Raux, T. Harris, M. Eskenazi, and A. Rudnicky, 2007, Olympus: an open-source framework for conversational spoken language interface research, HLT-NAACL 2007 workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technology, Rochester, NY, USA.
M. Heilman and M. Eskenazi. 2007. Application of automatic thesaurus extraction for computer generation of vocabulary questions. In Proc. of the SLaTE Workshop on Speech and Language Technology in Education.
M. Heilman, A. Juffs, and M. Eskenazi. 2007. Choosing reading passages for vocabulary learning by topic to increase intrinsic motivation. In Proc. of AIED.
M. Heilman, K. Collins-Thompson, J. Callan, and M. Eskenazi. 2007. Combining lexical and grammatical features to improve readability measures for first and second language texts. In Proc. NAACL-HLT.
A. Raux and M. Eskenazi, A Multi-Layer Architecture for Semi-Synchronous Event-Driven Dialogue Management, ASRU 2007, Kyoto, Japan.
Brown, J., Eskenazi, M., 2006, Using Simulated Students for the Assessment of Authentic Document Retrieval, ITS2006, Taiwan, June 2006. Lecture Notes in Computer Science, Editors: Mitsuru Ikeda, Kevin D. Ashley, Tak-Wai Chan, Publisher: Springer Berlin Heidelberg, pp. 685 – 688. http://dx.doi.org/10.1007/11774303_68
Callan, J., Eskenazi, M., Perfetti, C., 2006, Progress in Providing Reader-Specific lexical Practice for Improved Reading Comprehension, presented at IES 2006 research conference, June 15-16 2006, Washington DC
Eskenazi, M., Raux, A., Harris, E., 2006, Using speech recognition for just-in-time language learning, J. Acoust. Soc. Am., vol 120, no. 5, pt.2, p.3138.
Heilman, M., Collins-Thompson, K., Callan, J., Eskenazi, M., 2006, Classroom Success of an Intelligent Tutoring System for Lexical Practice and Reading Comprehension, Proc. Interspeech2006, Pittsburgh September 2006.
Heilman, M., Eskenazi, M., 2006, Language Learning: Challenges for Intelligent Tutoring Systems, Workshop on Ill-defined Domains in Intelligent Tutoring, Taiwan, June 2006.
Juffs, A., Eskenazi, M., Wilson, L., Pelletreau, T., Sanders, J., Callan, J., Brown, J., 2006, Promoting robust learning of vocabulary through computer assisted language learning, Proc. Joint conference of AAAL and ACLA/CAAL 2006, Montreal, June 2006.
Juffs, A., Wilson, L., Eskenazi, M., Callan, J., Brown, J., Collins-Thompson, K., Heilman, M., Pelletreau, T., Sanders, J., 2006, Robust learning of vocabulary: investigating the relationship between learner behaviour and the acquisition of vocabulary (poster). At The 40th Annual TESOL Convention and Exhibit (TESOL 2006).
Raux, A., Bohus, D., Langner, B., Black, A., Eskenazi, M., 2006, Doing Research on a Deployed Spoken Dialogue System: One Year of Let’s Go! Experience, Proc. Interspeech 2006, Pittsburgh, September 2006.
Eskenazi, M., Brown, J., 2006, Teaching the Creation of Software that Uses Speech Recognition, in Teacher Education in CALL, P. Hubbard and M. Levy Eds., Language Learning and Language Teaching series, John Benjamins Publishing.
J. Brown and M. Eskenazi. (2005). "Student, text and curriculum modeling for reader-specific document retrieval." In Proceedings of the IASTED International Conference on Human-Computer Interaction 2005. Phoenix, AZ.
J. Brown, G. Frishkoff, and M. Eskenazi. (2005). "Automatic question generation for vocabulary assessment."
Eskenazi, M., Pelton, G. 2002, Pinpointing pronunciation errors in children’s speech: examining the role of the speech recognizer, Proposed to the Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology Workshop, Sept 2002, Colorado.
Eskenazi, M., Hansma, S., Semp, M., Warner, R., 1998, By ear and by eye - adaptive tutoring for foreign language pronunciation training, in Proc. STiLL Workshop on Speech Technology in Language Learning, Marholmen.
Eskenazi, M., Rudnicky, A., Gregory, K., Constantinides, P., Brennan, R., Bennett, C., Allen, J., 1998, Data Collection and Processing in the Carnegie Mellon Communicator, in Proc. ESCA Eurospeech 98.
Placeway, P., Chen, S., Eskenazi, M., Jain, U., Parikh, V., Raj, B., Ravishankar, M., Rosenfeld, R., Seymore, K., Siegler, M., Stern, R., Thayer, E., 1997, The 1996 HUB-4 Sphinx-3 System, Proc. DARPA Speech Recognition Workshop, Chantilly, Virginia, Morgan Kaufmann Publishers.
Seymore, K., Chen, S., Eskenazi, M., Rosenfeld, R., 1997, Language and Pronunciation Modeling in the CMU 1996 HUB-4 Evaluation, Proc. DARPA Speech Recognition Workshop, Chantilly, Virginia, Morgan Kaufmann Publishers.
Eskenazi, M., 1995, Hot Topics in Speaking Style Research, in European Studies in Phonetics and Speech Communication, Bloothooft, Hazan, Huber, Llisterri, eds., OTS Publications, The Netherlands. P. 58 - 62.
Eskenazi, M., Vormes, E., Monguillot, G., Frachet, B., 1993, A new training and assessment technique for cochlear implants, in Advances in Cochlear Implants, Hochmair-Desoyer and Hochmair eds., International Science Seminars, Vienna, Austria, p. 572-577.
Eskenazi, M. 1992. Changing speech styles, speakers’ strategies in read speech and careful and casual spontaneous speech. Proceedings of the International Conference on Spoken Language Processing, Banff.
AFNOR, 1990, norme experimentale S 31-115, Evaluation de systemes de traitement automatique de la parole Partie 1: Definitions et methode d'evaluation de systemes de reconnaissance automatique de la parole - systemes de reconnaissance globale.
1. GRAPHEME-TO-PHONEME DICTIONARY: I am one of the people who have worked on CMUDICT, a large grapheme-to-phoneme dictionary containing over 130,000 entries. CMUDICT can be used for a variety of applications and research topics. It is distributed as open source software and can be found at: http://www.speech.cs.cmu.edu/cgi-bin/cmudict
2. NEW FINDINGS IN LANGUAGE LEARNING: One of the most promising directions that I know of for language learning is the one that started with D. Pisoni, A. Bradlow (Northwestern) and R. Yamada (ATR). They create pairs of sounds (R and L for Japanese learners of English) and acoustically “pull them apart” until the student can hear the difference between them. Students for whom this training works often can pronounce a new sound without having pronunciation training on it.
3. FAUX AMIS: I compiled a list of words that appear to be the same but have very different meanings in French and English. The list is available here; if you find other entries or would like to suggest modifications or corrections, please download it and let me know. I will be glad to post your comments and make changes to the list for all to use. Eventually I will add the meanings!
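For readers curious about how a grapheme-to-phoneme dictionary such as CMUDICT (item 1 above) might be used in practice, here is a minimal Python sketch. It assumes the plain-text layout commonly seen in CMUDICT releases: comment lines starting with ";;;", one entry per line with the word followed by its phoneme symbols, a "(1)" suffix for alternate pronunciations, and a stress digit on vowels. The sample lines and the function names parse_cmudict and strip_stress are illustrative only and are not part of any official distribution or API.

```python
from collections import defaultdict

# Illustrative lines in the style assumed above; not the full dictionary.
SAMPLE = """\
;;; sample entries for demonstration only
HELLO  HH AH0 L OW1
HELLO(1)  HH EH0 L OW1
READ  R EH1 D
READ(1)  R IY1 D
"""


def parse_cmudict(text):
    """Map each word to a list of pronunciations (lists of phoneme symbols)."""
    entries = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        head, *phonemes = line.split()
        word = head.split("(", 1)[0]  # drop the "(1)" variant marker
        entries[word].append(phonemes)
    return dict(entries)


def strip_stress(phonemes):
    """Remove the trailing stress digit (0, 1 or 2) from vowel symbols."""
    return [p.rstrip("012") for p in phonemes]


lexicon = parse_cmudict(SAMPLE)
print(lexicon["READ"])                    # both pronunciations of "READ"
print(strip_stress(lexicon["HELLO"][0]))  # first pronunciation, stress removed
```

Keeping every variant in a list, rather than overwriting earlier ones, matters for words such as "READ", whose pronunciation depends on context.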