Arthur R. Toth

Graduated May 17, 2009 with Ph.D.
Language Technologies Institute
School of Computer Science
Carnegie Mellon University
email: atoth@cs.cmu.edu
Temporarily using my old office:
Newell-Simon Hall 4508
Office Phone: 412-268-2067

Education

  • Ph.D. Language and Information Technologies, School of Computer Science, Carnegie Mellon University, May 2009
    "Using Articulatory Position Data to Improve Voice Transformation"
    Advisor: Alan W Black
  • M.S. Language Technologies, School of Computer Science, Carnegie Mellon University, May 2001
  • A.B. Mathematics, Harvard University, June 1993

Teaching Assistant Positions

  • 15-453: Formal Languages, Automata, and Computation, Spring 2003
  • 11-682/15-492: Intro to IR, NLP, MT, and Speech, Fall 2002

Research

    I received my Ph.D. on May 17th and am looking for a job for June 2009 and beyond.

    I am continuing work with Dr. Tanja Schultz for one month (May 2009), but now from Pittsburgh. My task is to construct an on-line system that converts electromyographical data to speech. The surface electromyographical data we use is collected by attaching probes to a person's face in order to measure the activation potentials of certain muscles which are used during speech. As this data can also be collected while a person pantomimes speech, we are investigating its use for silent speech interfaces which take this data and produce speech from it. The on-line system we are working on will serve as a demonstration and proof-of-concept of a silent speech interface based on certain machine learning and signal processing concepts.

    From February through April 2009, I worked with Dr. Tanja Schultz in the Cognitive Systems Lab at the University of Karlsruhe. I worked with her group to apply voice transformation techniques to synthesize speech from electromyographical data that they had collected and previously used for speech recognition experiments. This work led to two paper submissions to Interspeech 2009. During this time, Tanja and I also continued our collaboration with Dr. Alan W Black and Dr. Qin Jin. I constructed some human listening evaluations on various types of de-identified speech to determine how difficult it was for people to identify speakers when we tried to obscure who was speaking. This work was combined with some other work we had performed and was part of another paper we submitted to Interspeech 2009 and part of an article we submitted to IEEE Transactions on Audio, Speech, and Language Processing.

    From 2005 until January 2009, I worked with Dr. Alan W Black on the TRANSFORM project. My primary work was on using articulatory position data, specifically the MOCHA database, to improve voice transformation. We also investigated and implemented harmonic plus noise and harmonic stochastic models for speech signals. In our last year and a half, we collaborated with Dr. Qin Jin and Dr. Tanja Schultz, pitting our voice transformation systems against their speaker identification systems. We investigated security issues, such as whether voice transformation could be used to fool speaker identification systems, and privacy issues, such as whether voice transformation could be used to obscure the speaker's identity in speech presented to speaker identification systems.

    From September 2002 until 2005, I worked with Dr. Alan W Black on the Storyteller project. I worked primarily on the automatic detection of prosodic boundaries in speech, especially in the context of multi-sentence recordings that are longer than what is typically used for constructing concatenative speech synthesizers.

    Previously, from August 1999 through August 2002, I worked with Dr. Roni Rosenfeld on Statistical Language Modeling and the Universal Speech Interface project.


Publications