
Maxine Eskenazi
Principal Systems Scientist
Language Technologies Institute
Carnegie Mellon University
6413 Gates Hillman Complex
www.cs.cmu.edu/~max
Tel. 412 268-3858
Fax 412 268-6298
Email: first three letters of my first name at three initials of the University dot edu
I have been chair of the SLaTE special interest group (Speech and Language
Technology in Education) of ISCA (International Speech Communication
Association) from 2006 to 2011. I am Director of the Dialog Research Center.
My goals are to understand the variability of the speech signal and to create
automatic systems (using spoken dialogue architectures, automatic speech
recognition and synthesis) that benefit from this knowledge and that in turn
provide a real benefit to end users. This endeavor implies studying groups of
speakers, input conditions and styles of speech, and detecting the acoustic and
upper-level indices that are indicative of these variants. End benefits may
include language learning and information giving and gathering.
I am interested in the variability of the speech signal – relating its
sources to its manifestations, characterizing groups of speakers, grouping observable
phenomena. Non-native speech is a specific interest within this area, as is
speaking style.
Spoken dialogue systems are one of my areas of interest. I am less interested
in play systems than in systems that have real users and serve a real purpose,
such as Let’s Go. The Let’s Go system has been answering the phone for the Port
Authority of Allegheny County every evening since the beginning of March 2005.
A system with real users provides the ideal platform for running experiments on
spoken dialogue, on topics such as timing and lexical entrainment. Being lucky
enough to have a system with a constant pipeline of real users, we feel
compelled to share this treasure with the research community by freely
distributing the data we collect and giving researchers access to the system to
conduct experiments of their own.
I am also interested in how to teach foreign languages effectively,
both by a human and by a computer. This includes curriculum creation and
navigation, interface issues and issues leading to robust learning. Culture
underlies all language and so is also an interest of mine. And of course I am
interested in pinpointing errors in non-native speech (patented!) and then
providing appropriate corrective feedback.
The system I created to detect and correct foreign speakers’ pronunciation
errors in English was called Fluency, and the basic algorithms developed in
that project were spun off into the NativeAccent™ product sold by the company I
started, Carnegie Speech™. Another example of research results in use in real
life! And the REAP system is also getting into the hands of many students and
teachers.
I am the chair of the ISCA special interest group on Speech and Language
Technology in Education (SLaTE). Please visit our website for more information
(www.sigslate.org).
Fluency – a project to use automatic speech
recognition to detect pronunciation errors and to provide appropriate
correction information – contact me directly for more information.
Let’s Go – a project using a spoken
dialogue system to expand access to such systems to the elderly and to
non-native speakers. http://www.speech.cs.cmu.edu/letsgo/
DialRC – a Center that is at the service of
the Spoken Dialog community, providing data, running studies, educating and
running a Spoken Dialog Challenge http://dialrc.org/
LexE – a project on lexical entrainment on
both sides of the dialog with spoken dialog systems, sometimes with non-native
speakers
REAP – a project to retrieve appropriate, individuated texts for students
learning to read. It exists in French and Portuguese as well as in English.
http://reap.cs.cmu.edu/
Present: Jose David Aguas Lopes, Rui Correira, Sukhada Palkar, Sungjin Lee, Yibin Lin, Andrew Fandrianto
Postdoc: Oscar Saz
Past: Adam Skory, Kevin Dela Rosa, Gabriel Parent, Antoine Raux, Jonathan Brown, Jie Hu, Juan Pino, Michael Heilman, Aleata Hubbard, Elizabeth Harris, Kathrin Probst, Yan Ke, Helene Maynard-Bonneau, Paul-Eric Stern, Anne Lacheret-Dujour
Spoken Dialogue Systems and Automatic Speech Recognition
Parent, G., Eskenazi, M. Speaking to the Crowd: looking at past achievements in using crowdsourcing for speech and predicting future challenges, Proceedings Interspeech 2011, special session on crowdsourcing.
Parent, G., Eskenazi, M. 2010. Toward better crowdsourced transcription: Transcription of a year of the Let’s Go bus information system data. In Proc SLT, Berkeley, CA.
Parent, G., Eskenazi, M., 2011, Sources of variability and adaptive tasks, Proceedings CHI2011.
Alan W Black, Susanne Burger, Alistair Conkie, Helen Hastie, Simon Keizer, Oliver Lemon, Nicolas Merigaud, Gabriel Parent, Gabriel Schubiner, Blaise Thomson, Jason D. Williams, Kai Yu, Steve Young and Maxine Eskenazi, Spoken Dialog Challenge 2010: Comparison of Live and Control Test Results, Proc. SIGDIAL2011, Portland, OR.
Eskenazi, M., Black, A., 2010, SDC: The Spoken Dialog Challenge, invited short presentation at SIGDIAL 2010.
Parent, G. and Eskenazi, M. 2010. Lexical Entrainment of Real Users in the Let's Go Spoken Dialog System. In Proceedings of ISCA Interspeech 2010.
Black, A., and Eskenazi, M., "The Spoken Dialogue Challenge", SIGDIAL 2009.
Gonzalez-Brenes, J., Black, A., and Eskenazi, M. "Describing Spoken Dialogue Systems Differences", IWSDS 2009.
Raux, A., Langner, B., Black, A. and Eskenazi, M. Building Practical Spoken Dialog Systems, ACL/HLT 2008 Tutorial.
Eskenazi, M., Black, A., Raux, A. and Langner, B., 2008, Let's Go Lab: a platform for evaluation of spoken dialog systems with real world users, Interspeech 2008.
A. Raux and M. Eskenazi, 2008, Optimizing Endpointing Thresholds using Dialogue Features in a Spoken Dialogue System, SIGdial 2008, Columbus, OH, USA.
A. Raux and M. Eskenazi, A Multi-Layer Architecture for Semi-Synchronous Event-Driven Dialogue Management, ASRU 2007.
Ai, H., Raux, A., Bohus, D., Eskenazi, M., and Litman, D., 2007, Comparing Spoken Dialog Corpora Collected with Recruited Subjects versus Real Users, 8th SIGDial Workshop on Discourse and Dialogue, Antwerp, Belgium.
D. Bohus, A. Raux, T. Harris, M. Eskenazi, and A. Rudnicky, 2007, Olympus: an open-source framework for conversational spoken language interface research, HLT-NAACL 2007 workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technology.
Raux, A., Bohus, D., Langner, B., Black, A., Eskenazi, M., 2006, Doing Research on a Deployed Spoken Dialogue System: One Year of Let’s Go! Experience, Proc. Interspeech 2006.
A. Raux, B. Langner, D. Bohus, A. W Black and M. Eskenazi, Let's Go Public! Taking a Spoken Dialog System to the Real World, Interspeech 2005.
Eskenazi, M., 1998, User Come Back, DARPA Communicator Compare and Contrast Meeting, June 16-17, 1998.
Ravishankar, M. and Eskenazi, M., 1997, Automatic Generation of Context-dependent Pronunciations, Proc. Eurospeech ’97.
Placeway, P., Chen, S., Eskenazi, M., Jain, U., Parikh, V., Raj, B., Ravishankar, M., Rosenfeld, R., Seymore, K., Siegler, M., Stern, R., Thayer, E., 1997, The 1996 HUB-4 Sphinx-3 System, Proc. DARPA Speech Recognition Workshop, Chantilly, Virginia, Morgan Kaufmann Publishers.
Seymore, K., Chen, S., Eskenazi, M., Rosenfeld, R., 1997, Language and Pronunciation Modelling in the CMU 1996 HUB-4 Evaluation, Proc. DARPA Speech Recognition Workshop, Chantilly, Virginia, Morgan Kaufmann Publishers.
Computer-Assisted Language Learning
Eskenazi, M., 2009, An overview of spoken language technology for education, Speech Communication, Elsevier, vol 51 issue 10 p. 832-844.
Vocabulary Learning
Dela Rosa, K., Eskenazi, M., 2011, Effect of Word Complexity on L2 Vocabulary Learning, Proceedings of the 6th Workshop on Innovative Use of NLP for Building Educational Applications at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT: BEA 2011).
Dela Rosa, K., Eskenazi, M., 2011, Self-Assessment of Motivation: Explicit and Implicit Indicators in L2 Vocabulary Learning, Proceedings of the 15th International Conference on Artificial Intelligence in Education (AIED 2011).
Dela Rosa, K., Eskenazi, M., 2011, Impact of Word Sense Disambiguation on Ordering Dictionary Definitions in Vocabulary Learning Tutors, Proceedings of the 24th International FLAIRS Conference (FLAIRS 2011).
Skory, A., Eskenazi, M., 2011, Generation of Educational Content through Gameplay, Proc. SLaTE2011.
M. Heilman, K. Collins-Thompson, M. Eskenazi, A. Juffs, L. Wilson. 2010. Personalization of Reading Passages Improves Vocabulary Acquisition. International Journal of Artificial Intelligence in Education, Vol. 20 (1).
Dela Rosa, K., Parent, G., Eskenazi, M., 2010, Multimodal learning of words: A study on the use of speech synthesis to reinforce written text in L2 language learning, Proceedings of the ISCA Workshop on Speech and Language Technology in Education (SLaTE 2010).
Skory, A. and Eskenazi, M. (2010), Predicting Cloze Quality for Vocabulary Training, Proc. of NAACL HLT Workshop on Innovative Use of NLP for Building Educational Applications.
Skory, A. and Eskenazi, M. (2010), Automatic Selection of Collocations for Instruction, Proc. of SLaTE Workshop on Speech and Language Technology in Education.
Data collection and assessment
A book chapter on the material in 11-717:
Here is an informal course I teach on how to write a scientific paper.
And see: http://www.postgazette.com/pg/09013/941345-298.stm?cmpid=newspanel1
Here are some things that may be of interest to you.
1. THE PITTSBURGH SCIENCE OF LEARNING CENTER: Through an NSF SLC award,
cognitive scientists, language technologists, psychologists and others work
together in this center to explore robust learning. You can find out more about
our Center at learnlab.org
3. AUTOMATIC SPEECH RECOGNIZER:
The first time I used SPHINX II, I was astounded at how robust it could
be. It is not perfect – none are, as we all know. But with an understanding of
the strong and weak points of the recognizer and some smart engineering, it is
possible to modify it to perform nicely in well-defined applications (like
Carnegie Speech™’s NativeAccent™). It is also open source software and can be
found at: http://www.cmusphinx.org
4. AN AUTOMATIC SPEECH RECOGNITION (ASR) DIALOGUE SYSTEM: One of the precursors
of our Let’s Go dialogue system, and one of the best known, is the Galaxy
system from MIT. http://www.sls.csail.mit.edu/GALAXY.html
5. NEW FINDINGS IN LANGUAGE LEARNING:
One of the most promising directions that I know of for language learning is
the one that started with D. Pisoni, A. Bradlow (Northwestern) and R. Yamada
(ATR). They create pairs of sounds (R and L for Japanese learners of English)
and acoustically “pull them apart” until the student can hear the difference
between them. Students for whom this training works can often pronounce a new
sound without having had pronunciation training on it.
6. AUTHORING NEW TUTORING SYSTEMS SUMMER SCHOOL: The great people who have made
some of the most advanced and successful intelligent tutoring systems that
exist hold a summer school each year where you can come and use their authoring
tools to create your own tutor.
You can find it here: http://learnlab.org/opportunities/summer/
7. FAUX AMIS: I compiled a list of words that appear to be the same, but have
very different meanings in French and English. The list is available here; if
you find other entries or would like to suggest modifications or corrections,
please download it and let me know. I will be glad to post your comments and
make changes to the list for all to use. Eventually I will add the meanings!