NineOneOne

Exploratory Research on Recognizing Non-English Speech for Emergency Triage in Disaster Response

Dr. Robert Frederking, PI; Dr. Tanja Schultz, co-PI; and Dr. Alan W Black, co-PI.
Udhay Nallasamy, graduate student. Jerry Weltman (LSU), volunteer graduate student.
Officer Julio Schrodel, point of contact at Cape Coral Police Dept. (Florida)
Contact: email Dr. Robert Frederking, phone +1-412-268-6656.

Funded by the National Science Foundation, as NSF Award IIS-0627957.
Began 1 May 2006.
NSF Project Summary: see below.


Status updates:
June 2006: Kick-off meeting with Cape Coral Police Department, Cape Coral, FL
PI co-organized the related First International Workshop on Medical Speech Translation ("MedSLT1") at HLT-NAACL-2006.
August 2006: Collecting and transcribing 9-1-1 calls, as training data for building a targeted Spanish speech recognition system.
September 2006: Set up internal website for co-ordinating speech transcription tasks.
November 2006: Built initial context-independent ASR (automatic speech recognizer) test system, using other speech data (GlobalPhone).
January 2007: Re-organized and documented internal web interface and speech database preparation process.
February 2007: Continued receiving and transcribing data, and ASR experimentation and development.
August 2007: ASR achieved accuracy of 74% on test phonecall data.
Began development of proof-of-concept demo system.
Began negotiating access to Spanish 9-1-1 recordings from cities in addition to Cape Coral (FL).
September 2007: Began receiving additional 9-1-1 recordings from Charlotte-Mecklenburg (NC) Police Department.
October 2007: Began receiving additional 9-1-1 recordings from Mesa (AZ) Police Department.
We were granted a No-Cost Extension from the NSF, to continue working through April 2008.
Utterance classification achieved accuracy of 69% on test data; this is the first step in our MT (Machine Translation) approach.
November 2007: PI visited Mesa (AZ) 9-1-1 communications center.
December 2007: PI visited Charlotte-Mecklenburg (NC) 9-1-1 communications center.
January 2008: Jerry Weltman, PhD student in Computer Science/Linguistics at LSU, joined project for Independent Study credit.
His primary activity is analyzing utterance classes, in preparation for MT work.
February 2008: Conference paper on this project's research accepted for presentation at LREC-2008 this coming May.
PI co-organizing the related Workshop on Speech Translation for Medical and Other Safety-Critical Applications ("MedSLT2") at COLING-2008.
March 2008: PI presented a local seminar on the project.
LREC-2008 paper finalized. [pdf]
Shared our bilingual phonecall transcriptions with a UTDallas researcher working on Spanish/English code-switching.
April 2008: Continued analyzing utterance classes, in preparation for MT work.
Began analyzing parameters of utterance classes as well.
May 2008: PI presented paper at LREC-2008.
Continued analyzing parameters of utterance classes for MT work.
Submitted paper to Workshop on Speech Translation for Medical and Other Safety-Critical Applications ("MedSLT2") at COLING-2008.
(Now renamed to "Workshop on Speech Processing for Safety Critical Translation and Pervasive Applications".)
June 2008: MedSLT2 paper finalized. [pdf]
Continued analyzing parameters of utterance classes for MT work.
July 2008: Negotiated access to Spanish 9-1-1 recordings from Washington County 9-1-1 (Oregon)
and New Brunswick (NJ) Police Department.
Began experimenting with software parsing of parameters.
August 2008: PI presented paper at the MedSLT2 workshop at COLING-2008.
Continued experimenting with software parsing of parameters.
September 2008: LTI PhD student Shinjae Yoo working on improving utterance classification as his 11-732 MT Lab class project.
Continued experimenting with software parsing of parameters.
December 2008: 11-732 MT Lab class project completed; resulted in significant improvement in utterance classification, using nonterminals from parameter parsing as features.
Continued experimenting with software parsing of parameters.
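The December 2008 result above rests on a simple idea: when a parameter expression (an address, a phone number, etc.) inside an utterance is parsed, the nonterminal labels of the resulting parse tree can be added to the word features used for utterance classification. The sketch below illustrates that feature-extraction step only; the tree encoding and the labels (ADDRESS, STREET, NUMBER, DIGIT) are invented for illustration and are not the project's actual grammar.

```python
def nonterminals(tree):
    """Collect nonterminal labels from a (label, child, ...) tuple tree.

    Leaves are plain strings (terminal words) and contribute no labels.
    """
    if isinstance(tree, str):
        return []
    labels = [tree[0]]
    for child in tree[1:]:
        labels += nonterminals(child)
    return labels

# Hypothetical parse of "calle cinco cuatro dos" as a street address.
parse = ("ADDRESS",
         ("STREET", "calle"),
         ("NUMBER", ("DIGIT", "cinco"), ("DIGIT", "cuatro"), ("DIGIT", "dos")))

words = ["calle", "cinco", "cuatro", "dos"]
# Augment the bag-of-words features with the parse's nonterminal labels.
features = words + nonterminals(parse)
print(features)
```

A classifier trained on such augmented feature vectors can generalize across specific addresses or numbers, since distinct parameter values share the same nonterminal labels.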
January 2009: LTI PhD student Rohit Kumar (advisor: Carolyn Penstein Rose) conducting Independent Study on a human-computer interaction (HCI) design and evaluation for the NineOneOne system.
March 2009: Work continues on human-computer interaction (HCI) design and evaluation for the NineOneOne system.


For COLING 2008 workshop submission: Full list of current (as of May 11, 2008) utterance classification tags: tags-list.txt.

NSF Project Summary:
In any major disaster, such as a hurricane, there will be a huge number of 9-1-1 emergency calls in a short period of time; many of these will be non-English. Even in the best of times, there is a chronic shortage of translation for triage of non-English calls at U.S. emergency call centers. This exploratory research seeks to demonstrate that new approaches to Automatic Speech Recognition (ASR) and Machine Translation (MT) will permit the automatic, real-time translation of these calls well enough to allow emergency triage. Full recognition and translation is not necessary; only the type of emergency and relevant details, such as location, need to be recognized.

Broader Impacts: This project is directly relevant to Homeland Defense. In any major disaster, nearby 9-1-1 dispatching centers will be overwhelmed with large volumes of calls in a short period of time. In large cities, this will naturally include a significant number of non-English calls. If successful, this exploratory project will demonstrate the feasibility of speech translation technology for serving 9-1-1 call centers. This should enable follow-on projects to eventually produce deployable speech translation systems for 9-1-1 dispatching centers. Dispatching centers will thereby be much better equipped for dealing with any such large-scale emergencies.

In addition, this project has clear, compelling broader impacts, due to the 9-1-1 dispatching domain. It specifically addresses the needs of local government agencies and the participation in society of disadvantaged ethnic groups, by ameliorating (if successful) the chronic lack of Spanish translation at emergency dispatch centers. This applies in principle to other ethnic groups as well, since the fundamental techniques and software designs to be investigated are language-independent and transferable to other task domains.

Intellectual Merit: The 9-1-1 domain is very challenging, but we believe it is feasible due to inherent domain constraints, and it will stimulate new advances in the basic science underlying speech translation. The challenging aspects of this domain require significant new basic research in the areas of Automatic Speech Recognition (ASR) and Machine Translation (MT), building on much previous work in both areas.

For ASR, techniques will be investigated for using multilingual acoustic models enhanced by articulatory features and combining multilingual grammars with n-gram language models, in order to recognize speech with the acoustic and lexical characteristics of a Spanish-language 9-1-1 call in the U.S. These characteristics include the use of English phrases and the distressed emotional state of the callers. For MT, the research focus will be the extension and significant adaptation in novel ways of techniques (developed in earlier NSF-funded projects) for Domain Action classification, in order to recognize the basic intent of agitated, disfluent Spanish 9-1-1 calls. The system does not attempt a detailed translation of everything said, but rather tries to understand the type of emergency, using categories relevant to triage such as "Request-Ambulance" or "Report-Flooding", and then (in a full system developed in a follow-on project) focus on translating just the critical details that the dispatcher needs to know.
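To make the Domain Action classification idea concrete, the toy sketch below classifies an ASR transcript into triage categories using a minimal bag-of-words Naive Bayes model. The tags "Request-Ambulance" and "Report-Flooding" come from the project description above; the training utterances, and the choice of Naive Bayes rather than the project's actual classifier, are illustrative assumptions only.

```python
from collections import Counter, defaultdict
import math

# Invented toy training data: (Spanish transcript, Domain Action tag).
TRAIN = [
    ("necesito una ambulancia rapido", "Request-Ambulance"),
    ("mi esposo no respira manden una ambulancia", "Request-Ambulance"),
    ("el agua esta entrando a la casa", "Report-Flooding"),
    ("la calle esta inundada", "Report-Flooding"),
]

class NaiveBayes:
    """Minimal bag-of-words Naive Bayes with add-one smoothing."""

    def fit(self, pairs):
        self.word_counts = defaultdict(Counter)  # tag -> word frequencies
        self.class_counts = Counter()            # tag -> utterance count
        self.vocab = set()
        for text, tag in pairs:
            self.class_counts[tag] += 1
            for w in text.split():
                self.word_counts[tag][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, text):
        best, best_lp = None, float("-inf")
        total = sum(self.class_counts.values())
        for tag, c in self.class_counts.items():
            lp = math.log(c / total)  # class prior
            denom = sum(self.word_counts[tag].values()) + len(self.vocab)
            for w in text.split():
                lp += math.log((self.word_counts[tag][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = tag, lp
        return best

clf = NaiveBayes().fit(TRAIN)
print(clf.predict("manden una ambulancia por favor"))  # expect Request-Ambulance
```

The point of the sketch is that triage needs only the category, not a full translation: once the Domain Action is identified, only the parameters the dispatcher needs (location, victim status) require further processing.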

The exploratory system works with Spanish, but the approaches used are applicable to any language. Both the ASR and MT results should also transfer to other domains. Increasing ASR quality in noisy multi-speaker environments is clearly transferable to other domains. The MT is based on prior work in several other domains, and our novel extensions here should also be transferable across task domains.

We expect to make the resulting transcribed and annotated speech corpus publicly available for other researchers.