Shrimai Prabhumoye

I am a PhD student at the Language Technologies Institute, School of Computer Science, Carnegie Mellon University. I am advised by Prof. Alan W. Black and Prof. Ruslan Salakhutdinov. I work on controllable text generation with a focus on style, content, and structure. I am also exploring the ethical considerations of controllable text generation. I co-designed the Computational Ethics for NLP course, which was offered for the first time in Spring 2018 at CMU.

I graduated with a Master's in Language Technologies in Aug 2017. During that time, I led the CMU Magnus team in the Amazon Alexa Prize competition. I completed my undergraduate degree at the National Institute of Technology, Karnataka, India.


Email: sprabhum at cs.cmu[dot]edu; sprabhum at andrew[dot]cmu.edu
Office: 5511 Gates and Hillman Center, Carnegie Mellon University

News

Jul 2020 Invited talk at Salesforce, Mila, and Apple on Controllable Text Generation: Should machines reflect the way humans interact in society?
Jul 2020 Our work on politeness transfer is featured in SCS CMU News, TechCrunch, CNET, Pittsburgh Post-Gazette, msn, Hindustan Times, and Axios.
May 2020 New paper titled Exploring Controllable Text Generation Techniques
May 2020 Excited to join Salesforce Research as an intern.
Apr 2020 I successfully proposed my thesis titled Controllable Text Generation: Should machines reflect the way humans interact in society?
Apr 2020 New paper titled Politeness Transfer: A Tag and Generate Approach is accepted at ACL 2020
Apr 2020 New paper titled Topological Sort for Sentence Ordering is accepted at ACL 2020
Oct 2019 Invited talk at UMass Amherst
Jun 2019 Invited talk at Google AI, NYC

Publications

13. Exploring Controllable Text Generation Techniques

Shrimai Prabhumoye, Alan W Black, Ruslan Salakhutdinov.
arXiv:2005.01822 [cs.CL]


12. Topological Sort for Sentence Ordering

Shrimai Prabhumoye, Ruslan Salakhutdinov, Alan W Black.
In the proceedings of Association for Computational Linguistics Conference (ACL) 2020.


11. Politeness Transfer: A Tag and Generate Approach

Aman Madaan*, Amrith Setlur*, Tanmay Parekh*, Barnabas Poczos, Graham Neubig, Yiming Yang,
Ruslan Salakhutdinov, Alan W Black, Shrimai Prabhumoye.
In the proceedings of Association for Computational Linguistics Conference (ACL) 2020.


10. I love your chain mail! Making knights smile in a fantasy game world:
Open-domain goal-oriented dialogue agents

Shrimai Prabhumoye*, Margaret Li*, Jack Urbanek, Emily Dinan, Douwe Kiela, Jason Weston, Arthur Szlam.
arXiv:2002.02878 [cs.AI]


9. Generating Interactive Worlds with Text

Angela Fan*, Jack Urbanek*, Pratik Ringshia, Emily Dinan, Emma Qian, Siddharth Karamcheti, Shrimai Prabhumoye,
Douwe Kiela, Tim Rocktäschel, Arthur Szlam, Jason Weston.
In the Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence.


8. Principled Frameworks for Evaluating Ethics in NLP Systems

Shrimai Prabhumoye, Elijah Mayfield, Alan W Black.
Widening NLP Workshop at ACL 2019.


7. "My Way of Telling a Story": Persona based Grounded Story Generation

Shrimai Prabhumoye*, Khyathi Chandu*, Ruslan Salakhutdinov, Alan W Black.
In the proceedings of Storytelling Workshop at ACL 2019.


6. Equity Beyond Bias in Language Technologies for Education

Elijah Mayfield, Michael Madaio, Shrimai Prabhumoye, David Gerritsen, Brittany McLaughlin,
Ezekiel Dixon-Román, Alan W Black.
In the Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications at ACL 2019.


5. Towards Content Transfer Through Grounded Text Generation

Shrimai Prabhumoye, Chris Quirk, Michel Galley.
In the proceedings of North American Chapter of the Association for Computational Linguistics (NAACL) 2019.


4. A Dataset for Document Grounded Conversations

Kangyan Zhou, Shrimai Prabhumoye, Alan W Black.
In the proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018.


3. Style Transfer Through Back-Translation

Shrimai Prabhumoye, Yulia Tsvetkov, Ruslan Salakhutdinov, Alan W Black.
In the proceedings of Association for Computational Linguistics Conference (ACL) 2018.


2. Linguistic Markers of Influence in Informal Interactions

Shrimai Prabhumoye*, Samridhi Choudhary*, Evangelia Spiliopoulou, Christopher Bogart, Carolyn Penstein Rosé, Alan W Black.
In the proceedings of Workshop on NLP+CSS at ACL 2017.


1. Building CMU Magnus from User Feedback

Shrimai Prabhumoye*, Fadi Botros*, Khyathi Chandu*, Samridhi Choudhary*, Esha Keni*, Chaitanya Malaviya*, Thomas Manzini*, Rama Pasumarthi*, Shivani Poddar*, Abhilasha Ravichander*, Zhou Yu, Alan W Black.
In the proceedings of Alexa Prize 2017.

Talks

Controllable Text Generation: Should machines reflect the way humans interact in society?

Montreal Institute for Learning Algorithms (Mila), July 2020.
Apple, Seattle, July 2020.
The LTI Summer Seminar, July 2020.
Salesforce, July 2020.

Controlling style, content and structure in Natural Language Generation

University of Massachusetts Amherst, October 2019.
Google AI Research, NYC, June 2019.

Towards Content Transfer Through Grounded Text Generation

NAACL, June 2019, oral.

Style Transfer Through Back-Translation

ACL, July 2018, oral.

Conversational Agents

Apple, April 2018.

Mentored Students

Politeness Transfer: A Tag and Generate Approach

This work introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning. We also provide a dataset of more than 1.39 million instances automatically labeled for politeness to encourage benchmark evaluations on this new task. We design a tag and generate pipeline that identifies stylistic attributes and subsequently generates a sentence in the target style while preserving most of the source content.

Associated Publication: Politeness Transfer: A Tag and Generate Approach at ACL 2020

Aman Madaan
Amrith Setlur
Tanmay Parekh

Downstream tasks to evaluate style transfer

Mukul Bhutani

Downstream tasks are influenced by the demographic skew of their training sets: for example, sentiment analysis is affected by the gender confound, and part-of-speech (POS) tagging is affected by the age confound. By building a generation engine that can preserve content while controlling for style, we can produce demographically balanced datasets for these NLP tasks. We are also looking at using these downstream tasks to automatically evaluate style transfer models.

A Dataset for Document Grounded Conversations

Kangyan Zhou

This work introduces a document grounded dataset for conversations using Wikipedia articles on movies. The dataset contains 4112 conversations with an average of 21.43 turns per conversation. We describe two neural architectures that provide benchmark performance on the task of generating the next response.

Associated Publication: A Dataset for Document Grounded Conversations at EMNLP 2018

Teaching

Guest Lectures

Style Transfer

Machine Translation and Sequence-to-sequence Models
CS 11-731, Carnegie Mellon University, Fall 2018

Ethics in Conversational Agents

Computational Ethics for NLP
CS 11-830, Carnegie Mellon University, Spring 2018, Spring 2019 and Spring 2020

Chatbots

Speech Processing
CS 11-492 11-692 11-892, Carnegie Mellon University, Fall 2017, Fall 2018, and Fall 2019

Neural Dialogue

Speech Processing
CS 11-492 11-692 11-892, Carnegie Mellon University, Fall 2017, Fall 2018, and Fall 2019

Building an Alexa Skill

Speech Processing
CS 11-492 11-692 11-892, Carnegie Mellon University, Fall 2017, Fall 2018, and Fall 2019

Chatting with Computers Workshop
OurCS, Carnegie Mellon University, Fall 2017.

Teaching Assistant

Computational Ethics for NLP

CS 11-830, Carnegie Mellon University, Spring 2018

Speech Processing

CS 11-492 11-692 11-892, Carnegie Mellon University, Fall 2017