About Me

Last update: September 2016

 

I graduated from Carnegie Mellon University in May 2016. My PhD advisor is Alexander I. Rudnicky. My thesis focuses on adaptive spoken dialog systems that accommodate users' language (vocabulary, grammar, syntax, etc.) and high-level intentions (e.g., organizing a dinner or planning a trip). I am now a post-doc at Disney Research Pittsburgh, working on conversational agents for adults and kids.

PhD thesis. CV.

 

Projects

 
  • Adaptive Spoken Dialog Systems
    • Out-of-vocabulary Learning:

      1. Detect OOVs during conversation and recover these words (detect-and-learn [Interspeech'11] [ICASSP'12]).
      2. Anticipate which new words may occur and add them to the recognition models ahead of time (expect-and-learn [Interspeech'15]).

    • Language Adaptation for Cloud-based Speech Recognition:

      Combine a local, adaptive, lightweight ASR with a cloud ASR to provide both domain/user adaptation and broad language coverage (Chapter 3 in thesis).

    • Cross-domain Intent Adaptation:

      Use existing domains to help users achieve complex intents [IUI'16].

  • Human Machine Interactions (based on Olympus Framework)
    • Robust Awareness Detection using Microsoft Kinect [pdf]
    • Multiparty Dialog Management [pdf]
    • Multi-app Dialog Framework based on Olympus
    • Batch Dialog Framework
 

Publications

 

0. Child-robot Interaction

(NEW!) M. Sun, I. Leite, J. Lehman and B. Li, "Collaborative Storytelling with Children: A Feasibility Study". 16th ACM SIGCHI Interaction Design and Children Conference (IDC) 2017.

1. Multi-domain Dialog System

(NEW!) M. Sun, A. Pappu, YN. Chen, A. I. Rudnicky, "Weakly Supervised User Intent Detection for Multi-Domain Dialogues". (to appear) IEEE Workshop on Spoken Language Technology (SLT) 2016. [PDF] [BIB] [DATA & CODE]

YN. Chen, M. Sun, A. I. Rudnicky and Anatole Gershman, "Unsupervised User Intent Modeling by Feature-Enriched Matrix Factorization". IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2016. [PDF] [BIB]

M. Sun, YN. Chen and A. I. Rudnicky, "AppDialogue: Multi-App Dialogues for Intelligent Assistants". International Conference on Language Resources and Evaluation (LREC) 2016. [PDF] [BIB] [DATA]

M. Sun, YN. Chen and A. I. Rudnicky, "An Intelligent Assistant for High-Level Task Understanding". ACM Conference on Intelligent User Interfaces (IUI) 2016. [PDF] [BIB] [DEMO]

M. Sun, YN. Chen and A. I. Rudnicky, "Learning User Intentions Spanning Multiple Domains", Workshop on Smart Connected and Wearable Things (SCWT) 2016 [PDF]

YN. Chen, M. Sun and A. I. Rudnicky, "Towards Spoken Language Interfaces for Mobile Applications". CHI Workshop on Designing Speech and Language Interaction for Mobile and Wearable Computing (DSLI) 2016.

M. Sun, YN. Chen and A. I. Rudnicky, "HELPR: A Framework to Break the Barrier across Domains in Spoken Dialog Systems". International Workshop on Spoken Dialogue Systems (IWSDS) 2016.

M. Sun, YN. Chen and A. I. Rudnicky, "Understanding User's Cross-Domain Intentions in Spoken Dialog Systems". NIPS Workshop on Machine Learning for SLU and Interaction (NIPS-SLU) 2015. [PDF] [BIB]

YN. Chen, M. Sun, A. I. Rudnicky and Anatole Gershman, "Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken Language Understanding". International Conference on Multimodal Interaction (ICMI) 2015. [PDF] [BIB] [DATA]

2. Multi-modal & Multi-party Dialog Systems

A. Pappu, M. Sun, S. Sridharan and A. I. Rudnicky, "Situated Multiparty Interactions between Humans and Agents". HCII, 2013. [PDF] [BIB]

A. Pappu, M. Sun, S. Sridharan and A. I. Rudnicky, "Conversational Strategies for Robustly Managing Dialog in Public Spaces". European Chapter of the Association for Computational Linguistics (EACL) Dialog in Motion Workshop, 2014. [PDF] [BIB]

3. Out-of-vocabulary Learning

M. Sun, YN. Chen and A. I. Rudnicky, "Learning OOV through Semantic Relatedness in Dialog Systems". Interspeech, 2015. [PDF] [BIB]

L. Qin, M. Sun, A. I. Rudnicky, "System Combination for Out-of-vocabulary Word Detection", ICASSP, 2012. [PDF] [BIB]

L. Qin, M. Sun, A. I. Rudnicky, "OOV Detection and Recovery Using Hybrid Models with Different Fragments", Interspeech, 2011. [PDF] [BIB]

4. Adaptive Dialog Systems

M. Sun, A. I. Rudnicky, U. Winter "User Adaptation in Automotive Environments", AHFE, 2012

M. Sun and A. I. Rudnicky, "Spoken Dialog Systems Adaptation for Domains and for Users", in U. Winter (Ed.), Design of Multimodal Mobile Interfaces.

 

Demo

 

Cross-domain Intent Understanding. The agent understands complex, cross-domain user intentions. It suggests a set of apps with different functionality to assist the user, and it reveals its understanding of the user's intention through understandable language references. Credit to Avnish Saraf for building the UI, Chenran Li for implementing the models, and Zhenhao Hua, Yulian Tamres-Rudnicky, and Arnab Dash for helping with the data collection.

Multiparty Interaction. The agent is capable of carrying out dialogs in the following multiparty situations: 1) two users (friends) in a scene; 2) one user leaves the scene while the other stays; 3) a new user interrupts the current conversation. Joint work with Aasish Pappu and Seshadri Sridharan. Thanks to Matt for letting us use his name for easy speech recognition.

Multi-app Dialog Framework. The demos below were made when smart TVs had just come out. In the first demo, before a specific app is installed, the system cannot understand app-related user input, such as the app's name. After the app is installed, vocabulary related to it is added to the recognition and understanding components, so the system can then recognize both the app name and contact names. In the second demo, several speech-enabled apps have already been installed, and upon receiving user input the system decides which app to launch.

Multimodal Interaction. The robot can detect who is speaking and turns toward the active speaker. Joint work with Aasish Pappu and Seshadri Sridharan.

 

Older News

 

I could not attend LREC 2016 (Slovenia) due to visa issues, but we have a paper on a multi-domain dialog corpus [pdf].

I could not attend ICASSP 2016 (China) due to visa issues, but we have a paper on user intent understanding [pdf].

I could not attend IWSDS 2016 (Finland) due to visa issues, but we have a paper on multi-domain dialog.

I could not attend NIPS 2015 (Canada) due to visa issues, but we have a workshop paper on multi-domain dialog [pdf].

I could not attend Interspeech 2015 (Germany) due to visa issues, but we have a paper on OOV [pdf].

 

Contact

 

Email: mings AT cs dot cmu dot edu; first_name.last_name AT disneyresearch dot com