Robert G. Malkin

Now featuring a stunningly plain, still-under-construction web presence!

Me with Emma at Point State Park

interACT
Language Technologies Institute
Carnegie Mellon University
407 South Craig Street
Pittsburgh, PA USA 15213
(412) 268-2518
robert dot malkin at gmail dot com

Synopsis

I'm a newly minted Ph.D. at the Language Technologies Institute, a division of Carnegie Mellon University's School of Computer Science. I defended my thesis, titled Machine Listening for Context-Aware Computing, on December 14. In addition to the Ph.D., I hold a Master of Language Technologies and a B.S. in Computational Linguistics, both from CMU. I'm currently a member of Prof. Alex Waibel's interACT group, a research and education center jointly located at CMU and Karlsruhe University in Germany. I recently accepted an employment offer from Google and I'll be starting at their Pittsburgh office in January.

Research Interests

My primary research interest is in computational audition; specifically, I am interested in how to develop systems that exploit the information present in the audio signal in order to accomplish some real-world task. Examples of these kinds of tasks include sensory enhancement or replacement, multimedia content analysis and retrieval, surveillance and intelligence gathering, and context awareness for control of smart spaces and devices.

Audio has many attractive qualities for these kinds of applications. The sensors are robust, cheap, and omnidirectional. Storing and processing the signal is relatively cheap. The audio signal does not suffer from adverse conditions like sensor motion, occlusion, or changes in lighting. Noise can be a problem for applications like automatic speech recognition, but auditory noise is often characteristic of certain event and environment classes and can be seen as a source of information. Perhaps most importantly, real-world events often leave behind clear evidence in the acoustic signal which is amenable to detection. This is of course not always the case, which is why (in my view) computational audition should be viewed as complementary to computational vision in most context awareness applications.

More details coming soon!

Demos and Screenshots

Coming soon!

The Koios Smartroom Package

The Koios smartroom package is designed to make it easy to build large smartroom systems composed of arbitrary collections of sensors and services. Koios is a socket-based, context-aware communication layer which receives information from smartroom components and forwards this information to other components based on a user-defined state graph in which each node contains a set of conditional actions. Taken together, these state-dependent conditional actions completely define the behavior of the smartroom. Koios is currently used here at interACT, as well as at two partner sites in the CHIL project: the Polytechnic University of Catalonia and Karlsruhe University.
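To give a rough feel for the state-graph idea described above, here is a minimal Python sketch. It is not the Koios code or its API: the class names, the message format, the room states, and the component names ("lights", "recorder") are all hypothetical, and the socket layer is replaced by an in-memory function call so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical message format, e.g. {"source": "door", "event": "person_entered"}.
Message = Dict[str, str]


@dataclass
class ConditionalAction:
    """If `condition` holds for a message, forward it to `targets`
    and optionally move the room to `next_state`."""
    condition: Callable[[Message], bool]
    targets: List[str]
    next_state: Optional[str] = None


@dataclass
class StateGraphRouter:
    """Toy in-memory stand-in for a socket-based communication layer."""
    state: str
    nodes: Dict[str, List[ConditionalAction]] = field(default_factory=dict)

    def handle(self, message: Message) -> List[Tuple[str, Message]]:
        """Return the (target, message) pairs to deliver for the current state,
        then apply any state transition. If several matching actions name a
        next state, the last one wins (an arbitrary choice for this sketch)."""
        deliveries: List[Tuple[str, Message]] = []
        next_state = self.state
        for action in self.nodes.get(self.state, []):
            if action.condition(message):
                deliveries.extend((target, message) for target in action.targets)
                if action.next_state is not None:
                    next_state = action.next_state
        self.state = next_state
        return deliveries


def build_demo_router() -> StateGraphRouter:
    """Two-state example graph: the room is either 'empty' or 'occupied'."""
    nodes = {
        "empty": [
            ConditionalAction(
                condition=lambda m: m["event"] == "person_entered",
                targets=["lights"],
                next_state="occupied",
            ),
        ],
        "occupied": [
            ConditionalAction(
                condition=lambda m: m["event"] == "speech_detected",
                targets=["recorder"],
            ),
            ConditionalAction(
                condition=lambda m: m["event"] == "person_left",
                targets=["lights"],
                next_state="empty",
            ),
        ],
    }
    return StateGraphRouter(state="empty", nodes=nodes)
```

In this sketch the same "speech_detected" message is dropped while the room is "empty" and forwarded to the recorder once the room is "occupied", which is the sense in which the conditional actions attached to each state define the room's behavior.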

Contact me for more information about Koios, or if you would like to use it for your smartroom project.

Publications

PDF Links Coming Soon!

R. Malkin, D. Chen, J. Yang, A. Waibel. Multimodal estimation of user interruptibility for smart mobile applications. To appear, 2006 ACM International Conference on Multimodal Interfaces.
D. Chen, J. Yang, R. Malkin, H. Wactlar. Detecting social interactions of elderly in a nursing home environment. To appear, October 2006 ACM Transactions on Multimedia Computing, Communications, and Applications.
R. Malkin, D. Chen, J. Yang, A. Waibel. Directing attention in online aggregate sensor streams via auditory blind value assignment. In Proceedings, 2006 IEEE International Conference on Multimedia and Expo.
A. Temko, R. Malkin, C. Zieger, D. Macho, C. Nadeu, M. Omologo. CLEAR evaluation of acoustic event detection and classification systems. In Proceedings, 2006 CLEAR Evaluation Workshop.
R. Malkin, A. Waibel. The CLEAR 2006 CMU acoustic environment classification system. In Proceedings, 2006 CLEAR Evaluation Workshop.
R. Malkin, A. Waibel. Classifying user environment for mobile applications using linear autoencoding of ambient audio. In Proceedings, 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing.
R. Malkin, D. Macho, A. Temko. First evaluation of acoustic event classification systems in the CHIL project. In Proceedings, 2005 Workshop on Hands-Free Speech Communication and Microphone Arrays.
M. Danninger, G. Flaherty, R. Malkin, K. Nickel, K. Bernardin, R. Stiefelhagen, A. Waibel. The Connector --- facilitating context-aware communication. In Proceedings, 2005 ACM International Conference on Multimodal Interfaces.
F. Kraft, R. Malkin, T. Schaaf, A. Waibel. Temporal ICA for classification of acoustic events in a kitchen environment. In Proceedings, 2005 ISCA International Conference on Speech and Language Processing / Interspeech.
D. Chen, R. Malkin, J. Yang. Multimodal detection of human interaction events in a nursing home environment. In Proceedings, 2004 ACM International Conference on Multimodal Interfaces.
A. Waibel, T. Schultz, R. Malkin, R. Stiefelhagen, J. Yang, M. Denecke, I. Rogina. SMART: The smart meeting room task at ISL. In Proceedings, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing.
C. Hori, S. Furui, R. Malkin, H. Yu, A. Waibel. Automatic speech summarization applied to English broadcast news speech. In Proceedings, 2002 Workshop on Human Language Technologies.
B. Myers, R. Malkin, A. Waibel, B. Bostwick, R. Miller, J. Yang, M. Denecke, E. Seeman, J. Zhu, C. Peck, D. Kong, J. Nichols, B. Scherlis. Flexi-modal and multi-machine user interfaces. In Proceedings, 2002 ACM International Conference on Multimodal Interfaces.
H. Yu, C. Clark, R. Malkin, A. Waibel. Experiments in automatic meeting transcription using JRTk. In Proceedings, 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing.

Resume

Coming Soon!

Bio, short form

I am originally from Norwalk, CT, a town of 80,000 located about 45 miles northeast of Manhattan. I left Norwalk in 1992 to attend CMU as an undergrad, where I majored in computational linguistics (which was, at the time, a major offered by the Philosophy department of CMU's College of Humanities and Social Sciences). Legend has it that when I received my B.S. in 1996, I was one of two American citizens to have ever received such a degree. This story is likely apocryphal. Upon graduation, the computational linguistics program folded, and the Language Technologies Institute was born as a unit of the School of Computer Science. I received a Master of Language Technologies degree from the LTI in 1998. I've been working on the Ph.D. ever since, shifting topics not once, but twice: first from statistical language modeling to speech recognition, and subsequently to computational audition. I also spent a year or so working at Interactive Systems Incorporated (now Multimodal Technologies Incorporated) on English Broadcast News speech recognition and automatic JSGF grammar expansion to account for disfluencies, politeness, and other spontaneous effects in speech-enabled applications.

I met Katya (pictured below) in 1995, and we got married in 1999. We have two daughters, Emma Jane (b. 2003) and Hannah Elizabeth (b. 2005). We live in Squirrel Hill, about 2.5 miles from CMU.

Wife Katya

With the surname Malkin, I hardly ever used to get asked if I am related to someone famous. This is no longer the case. My clan can be traced back to one Thomas Malkin, a silkweaver from Cheshire, England, who emigrated to Connecticut in the 1850s. There is apparently a group of Malkins from Russia as well, currently represented by Evgeni Malkin, a center with the Pittsburgh Penguins. To my knowledge, the Cheshire Malkins are not related to this fellow, but who knows? Maybe there was some cultural exchange eons ago between Cheshire and Magnitogorsk related to the imperial stout trade. I'm also not related to the infamous Michelle Malkin, who is such a cartoon that I'm not even going to link to her.
