Design practical AIs that can help users with complex tasks, where users are neither oracles nor static.
Recent Publications
2026
Modeling Multi-Party Interaction in Couples Therapy: A Multi-Agent Simulation Approach
Canwen Wang,
Angela Chen,
Catherine Bao,
Siwei Jin,
Yee Kit Chan,
Jessica R Mindel,
Sijia Xie,
Holly Swartz,
Tongshuang Wu,
Robert E Kraut,
Haiyi Zhu
ArXiv 2026
General Scales Unlock AI Evaluation with Explanatory and Predictive Power
Lexin Zhou,
Lorenzo Pacchiardi,
Fernando Martínez-Plumed,
Katherine M. Collins,
Yael Moros-Daval,
Seraphina Zhang,
Qinlin Zhao,
Yitian Huang,
Luning Sun,
Jonathan E. Prunty,
Zongqian Li,
Pablo Sánchez-García,
Kexin Jiang Chen,
Pablo A. M. Casares,
Jiyun Zu,
John Burden,
Behzad Mehrbakhsh,
David Stillwell,
Manuel Cebrian,
Jindong Wang,
Peter Henderson,
Sherry Tongshuang Wu,
Patrick C. Kyllonen,
Lucy Cheke,
Xing Xie,
José Hernández-Orallo
Nature 2026
Improving Automated Feedback Systems for Tutor Training in Low-Resource Scenarios through Data Augmentation
Chentianye Xu,
Jionghao Lin,
Tongshuang Wu,
Vincent Aleven,
Kenneth R. Koedinger
TLT 2026
Improving Attributed Long-form Question Answering with Intent Awareness
Xinran Zhao,
Aakanksha Naik,
Jay DeYoung,
Joseph Chee Chang,
Jena D. Hwang,
Tongshuang Wu,
Varsha Kishore
ICLR 2026
Evidotes: Integrating Scientific Evidence and Anecdotes to Support Uncertainties Triggered by Peer Health Posts
Better Synthetic Data by Retrieving and Transforming Existing Datasets
Saumya Gandhi,
Ritu Gala,
Vijay Viswanathan,
Tongshuang Wu,
Graham Neubig
ACL Findings 2024
Generating Situated Reflection Triggers About Alternative Solution Paths: A Case Study in Generative AI for Computer-Supported Collaborative Learning
Best Paper Nominee
Atharva Naik,
Jessica Ruhan Yin,
Anusha Kamath,
Qianou Ma,
Sherry Tongshuang Wu,
Charles Murray,
Majd Sakr,
Carolyn P. Rose
AIED 2024
Large Language Models Help Humans Verify Truthfulness – Except When They are Convincingly Wrong
Chenglei Si,
Navita Goyal,
Tongshuang Wu,
Chen Zhao,
Shi Feng,
Hal Daumé III,
Jordan Boyd-Graber
NAACL 2024
Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia
Tzu-Sheng Kuo,
Aaron Halfaker,
Zirui Cheng,
Jiwoo Kim,
Meng-Hsin Wu,
Tongshuang Wu,
Ken Holstein,
Haiyi Zhu
CHI 2024
Self-Guide: Better Task-Specific Instruction Following via Self-Synthetic Finetuning
Chenyang Zhao,
Xueying Jia,
Vijay Viswanathan,
Graham Neubig,
Tongshuang Wu
CoLM 2024
"Merge Conflicts!" Exploring the Impacts of External Distractors to Parametric Knowledge Graphs
Cheng Qian,
Xinran Zhao,
Tongshuang Wu
CoLM 2024
Beyond Relevance: Evaluate and Improve Retrievers on Perspective Awareness
Selenite: Scaffolding Online Sensemaking with Comprehensive Overviews Elicited from Large Language Models
Michael Xieyang Liu,
Tongshuang Wu,
Tianying Chen,
Franklin Mingzhe Li,
Aniket Kittur,
Brad A. Myers
CHI 2024
What Is Wrong with My Model? Identifying Systematic Problems with Semantic Data Slicing
Chenyang Yang,
Yining Hong,
Grace A. Lewis,
Tongshuang Wu,
Christian Kästner
ASE 2024
HiMemFormer: Hierarchical Memory-Aware Transformer for Multi-Agent Action Anticipation
Zirui Wang,
Xinran Zhao,
Simon Stepputtis,
Woojun Kim,
Tongshuang Wu,
Katia Sycara,
Yaqi Xie
Video-Language Models Workshop @ NeurIPS 2024
2023
Large Language Models Enable Few-Shot Clustering
Vijay Viswanathan,
Kiril Gashteovski,
Carolin Lawrence,
Tongshuang Wu,
Graham Neubig
TACL 2023
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Patrick Fernandes,
Aman Madaan,
Emmy Liu,
António Farinhas,
Pedro Henrique Martins,
Amanda Bertsch,
José G. C. de Souza,
Shuyan Zhou,
Tongshuang Wu,
Graham Neubig,
André F. T. Martins
TACL 2023
Seeing Seeds Beyond Weeds: Green Teaming Generative AI for Beneficial Uses
Logan Stapleton,
Jordan Taylor,
Sarah Fox,
Tongshuang Wu,
Haiyi Zhu
ArXiv 2023
DataFinder: Scientific Dataset Recommendation from Natural Language Descriptions
Vijay Viswanathan,
Luyu Gao,
Tongshuang Wu,
Pengfei Liu,
Graham Neubig
ACL 2023
Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs