Bernhard Suhm's Home Page

Bernhard Suhm
PhD graduate of the Interactive Systems Laboratories (Carnegie Mellon University and Karlsruhe University),
soon to join the Speech and Language Research Group at BBN Technologies in Cambridge (MA)
Online resume
My Research:
The research question I am pursuing is: given unreliable spoken language
technology, how can the user's effort to recover from
interpretation errors be minimized?
My approach was multimodal interactive error correction. Repeated
recognition errors are avoided by switching modality for correction,
for example from continuous speech to spelling, writing, or
gesturing. Recognition errors can thus be corrected efficiently.
I demonstrated the concept by building a multimodal dictation system
which integrates multimodal error correction with a state-of-the-art
large vocabulary dictation recognizer. Formal user studies showed
that switching modality increases correction speed
compared to unimodal correction. Not only recognition accuracy but
also correction speed determines the productivity of speech interfaces
such as dictation systems.
Other research interests include human-computer interaction in
general, machine learning, and cognitive science.
Publications:
(limited to publications where I was first author)
- (accepted for publication in) CHI'99 Conference: Empirical and Model-based Evaluation of Multimodal Error Correction
- IEEE Workshop on Speech Recognition and Understanding, Santa Barbara (USA): Empirical Evaluation of Interactive Multimodal Error Correction
- EUROSPEECH 97, Rhodes (Greece): Exploiting Repair Context in Interactive Multimodal Error Recovery
- ICASSP 97, Munich (Germany): Multimodal Interfaces for Multimedia Information Agents
- ICSLP 96, Philadelphia (PA): Interactive Error Recovery for Speech User Interfaces
- SIG-CHI 96, Workshop on Designing the User Interface for Speech Recognition Applications: Designing Interactive Error Recovery Methods for Speech Interfaces
- MaxEnt 95, XV Workshop on Maximum Entropy and Bayesian Methods, 1995, Los Alamos: Efficient iterative scaling of a class of maximum entropy language models
- ARPA SLT 95, Speech Language Technology Workshop, Austin (TX): JANUS - Towards Multilingual Spoken Language Translation
- ICSLP 94, Yokohama (Japan): Towards better Language Models for Spontaneous Speech (Word Cluster and Word Phrase LMs)
- AAAI 94 Workshop on Integration of Speech and Natural Language Processing, Seattle (WA): Speech-Language Integration in a Multi-Lingual Speech Translation System
- EUROSPEECH 93, Berlin (Germany): Detection and Transcription of New Words
Other Publications:
bsuhm@hotmail.com
20 Bristol Street #3
Cambridge MA 02141
currently reachable at 07803-2467