Jielin Qiu

Final-year Ph.D. candidate
Computer Science Department
Carnegie Mellon University
Advisors: Prof. Lei Li and Prof. Christos Faloutsos

Email / Google Scholar / GitHub / LinkedIn

About

I am a final-year Ph.D. candidate in the Computer Science Department of the School of Computer Science at Carnegie Mellon University. I am fortunate to be advised by Prof. Lei Li and Prof. Christos Faloutsos. Before that, I received my B.Eng. from Shanghai Jiao Tong University, advised by Prof. Bao-Liang Lu. I've worked as a research intern at Google, Meta, Microsoft, Amazon Web Services, and Adobe.

My research interests lie in Multimodal Machine Learning. The central goal of my research is to design scalable inference and learning algorithms to connect language, perception, and control for robust multimodal learning. I strive to achieve this by learning the unique modality equivalence through abstract multimodal representations. My current research lies in the foundations of multimodal learning with applications in multimedia, computer vision, natural language processing, healthcare, and embodied AI. My research is generously supported by CMU CSD fellowships and funding from DARPA, NSF, Adobe, Allegheny Health Network, and Cleveland Clinic.

News

[2024-04] MMSum dataset gets accepted by CVPR 2024 as Poster Highlight (Top 11.9%). Check our MMSum dataset!
[2024-03] Embodied Policy Learning with Language-based Scene Summarization gets accepted by NAACL 2024.
[2024-01] MMRobustness gets accepted as the very first paper in the Journal of Data-centric Machine Learning Research (DMLR) 2024. Check our MMRobustness benchmark!
[2023-11] One paper about cardiovascular record retrieval gets accepted by PMLR ML4H 2023.
[2023-10] Start a research internship at Google.
[2023-10] One paper about human languages and brain signals gets accepted by EMNLP Findings 2023.
[2023-06] One paper accepted as spotlight by ICML 2023 Workshop on Interactive Learning with Implicit Human Feedback.
[2023-06] Two papers accepted by ICML 2023 Workshop on Machine Learning for Multimodal Healthcare Data.
[2023-05] Start a research internship at Meta.
[2023-05] One paper about multimodal summarization by Optimal Transport gets accepted by ACL Findings 2023.
[2023-04] One paper about data augmentation on geodesics gets accepted by ICML 2023.
[2023-04] Invited talk at Microsoft Research Cambridge.
[2023-02] One paper accepted by CVPR 2023.
[2023-02] One paper accepted by ICASSP 2023.
[2023-01] Start a research internship at Microsoft.
[2023-01] One paper accepted by EACL Findings 2023.
[2023-01] One paper accepted by AISTATS 2023.
[2022-10] One paper accepted by WACV 2023.
[2022-10] One paper accepted by NeurIPS 2022 Workshop on Distribution Shifts.
[2022-10] Top Reviewer at NeurIPS 2022.
[2022-06] One paper accepted by MLHC 2022.
[2022-05] Start a research internship at AWS AI.
[2022-05] One paper accepted by ICML 2022 Workshop on Principles of Distribution Shift.
[2022-04] One paper accepted by ICLR 2022 Workshop on Socially Responsible Machine Learning.
[2021-09] Receive gift funding from Adobe. Thanks, Adobe!
[2021-05] Start a research internship at Adobe Research.

Work Experience

[2023/10 - Now] - Google. Student Researcher.
Work on multimodal watermarking.

[2023/05 - 2023/08] - Meta. Research Scientist Intern.
Work on entity-centric VQA and retrieval-augmented multimodal LLM.

[2023/01 - 2023/04] - Microsoft. Research Intern.
Work on a multimodal video summarization dataset and an entity recognition image dataset.

[2022/05 - 2022/12] - Amazon Web Services. Applied Scientist Intern.
Work on a robustness study of multimodal image-text models under distribution shifts.

[2021/05 - 2021/12] - Adobe. Research Intern.
Work on multimodal livestream video segmentation and summarization.

Selected Publications

* denotes equal contribution

SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM
Jielin Qiu, Andrea Madotto, Zhaojiang Lin, Paul Crook, Ethan Xu,
Luna Dong, Christos Faloutsos, Lei Li, Babak Damavandi, Seungwhan Moon
Under Review
[paper]

Embodied Executable Policy Learning with Language-based Scene Summarization
Jielin Qiu*, Mengdi Xu*, William Han*, Seungwhan Moon, Ding Zhao
NAACL 2024
ICML 2023 Workshop on Interactive Learning with Implicit Human Feedback (spotlight)
[paper] [code]

MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
Jielin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Ding Zhao,
Bo Li, Lijuan Wang
CVPR 2024 (Poster Highlight 11.9%)
[paper] [website] [dataset] [code]

Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift
Jielin Qiu, Yi Zhu, Xingjian Shi, Florian Wenzel, Zhiqiang Tang, Ding Zhao,
Bo Li, Mu Li
Journal of Data-centric Machine Learning Research (DMLR) 2024
[paper] [website] [code]

Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition
Jielin Qiu, William Han, Winfred Wang, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Christos Faloutsos, Lei Li, Lijuan Wang
Under Review
[paper]

Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment
Jielin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin
ACL 2023 Findings
[paper] [press]

Can Brain Signals Reveal Inner Alignment with Human Languages?
William Han*, Jielin Qiu*, Jiacheng Zhu, Mengdi Xu, Douglas Weber,
Bo Li, Ding Zhao
EMNLP 2023 Findings
[paper] [code]

Automated Cardiovascular Record Retrieval by Multimodal Learning between Electrocardiogram and Clinical Report
Jielin Qiu*, Jiacheng Zhu*, Shiqi Liu, William Han, Jingqi Zhang, Chaojing Duan, Michael Rosenberg, Emerson Liu, Douglas Weber, Ding Zhao
PMLR Proceedings of Machine Learning for Health 2023
[paper] [code]

Multimodal Representation Learning of Cardiovascular Magnetic Resonance Imaging
Jielin Qiu*, Peide Huang*, Makiya Nakashima, Jaehyun Lee, Jiacheng Zhu, Wilson Tang, Pohao Chen, Christopher Nguyen, Byung-Hak Kim,
Debbie Kwon, Douglas Weber, Ding Zhao, David Chen
ICML 2023 Workshop on Machine Learning for Multimodal Healthcare Data
[paper]

Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
Jielin Qiu*, William Han*, Jiacheng Zhu, Mengdi Xu, Michael Rosenberg, Emerson Liu, Douglas Weber, Ding Zhao
EACL 2023 Findings
[paper] [code]

Cardiac Disease Diagnosis on Imbalanced Electrocardiography Data Through Optimal Transport Augmentation
Jielin Qiu*, Jiacheng Zhu*, Mengdi Xu, Peide Huang, Michael Rosenberg, Douglas Weber, Emerson Liu, Ding Zhao
ICASSP 2023
[paper]

LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos
Jielin Qiu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Ding Zhao, Hailin Jin
WACV 2023
[paper] [press]

Interpolation for Robust Learning: Data Augmentation on Geodesics
Jiacheng Zhu, Jielin Qiu, Aritra Guha, Zhuolin Yang, XuanLong Nguyen,
Bo Li, Ding Zhao
ICML 2023
[paper]

Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Bo He, Jun Wang, Jielin Qiu, Abhinav Shrivastava, Trung Bui, Zhaowen Wang
CVPR 2023
[paper] [code]

Benchmarking Robustness under Distribution Shift of Multimodal Image-Text Models
Jielin Qiu, Yi Zhu, Xingjian Shi, Zhiqiang Tang, Ding Zhao, Bo Li, Mu Li
NeurIPS 2022 Workshop on Distribution Shifts
[paper] [press] [code]

GeoECG: Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction
Jiacheng Zhu*, Jielin Qiu*, Zhuolin Yang, Douglas Weber,
Michael Rosenberg, Emerson Liu, Bo Li, Ding Zhao
MLHC 2022
[paper]

Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables
Mengdi Xu, Peide Huang, Yaru Niu, Visak Kumar, Jielin Qiu, Chao Fang, Kuan-Hui Lee, Xuewei Qi, Henry Lam, Bo Li, Ding Zhao
AISTATS 2023
[paper] [code]

Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction
Jiacheng Zhu*, Jielin Qiu*, Zhuolin Yang, Michael Rosenberg, Emerson Liu, Bo Li, Ding Zhao
ICLR 2022 Workshop on Socially Responsible Machine Learning (SRML)
[paper]

Comparing Recognition Performance and Robustness of Multimodal Deep Learning Models for Multimodal Emotion Recognition
Wei Liu, Jielin Qiu, Wei-Long Zheng, Bao-Liang Lu
IEEE Transactions on Cognitive and Developmental Systems 2021
[paper] [code]

Visual Sequence Learning in Hierarchical Prediction Networks and Primate Visual Cortex
Jielin Qiu, Ge Huang, Tai Sing Lee
NeurIPS 2019
[paper]

Investigating Sex Differences in Classification of Five Emotions from EEG and Eye Movement Signals
Lan-Qing Bao, Jielin Qiu, Hao Tang, Wei-Long Zheng, Bao-Liang Lu
EMBC 2019
[paper] [code]

Approximation Gradient Error Variance Reduced Optimization
Weiye Zhao, Yang Liu, Xiaoming Zhao, Jielin Qiu, Jian Peng
AAAI-RLG 2019
[paper]

Multi-view Emotion Recognition Using Deep Canonical Correlation Analysis
Jielin Qiu, Wei Liu, Bao-Liang Lu
ICONIP 2018
[paper] [code]

Service

Reviewer: ICML 2021-2024, CVPR 2022-2024, WACV 2023-2024, ICLR 2023-2024, ICASSP 2023-2024, ECCV 2022-2024, ACL Rolling Review (ARR) 2024, ACCV 2024, CHIL 2022-2024, NeurIPS 2022-2023, ICCV 2023, KDD 2023, EACL 2023, MICCAI 2023, AISTATS 2023, MLHC 2022-2023, ACM MM 2022.

PC Member: AAAI 2021-2024, ACL 2023, EMNLP 2022-2023.

Journal Reviewer: Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Transactions on Machine Learning Research (TMLR), Journal of Data-centric Machine Learning Research (DMLR), IEEE Transactions on Neural Networks and Learning Systems.

Committee: NeurIPS 2022 virtual deep-dive session chair, CMU RISS Committee.

Teaching

Teaching Assistant for CMU 16-824 Visual Learning and Recognition, Instructor: Prof. Jun-Yan Zhu, Fall 2021

Teaching Assistant for CMU 11-777 Multimodal Machine Learning, Instructor: Prof. Yonatan Bisk, Spring 2021

Website template from Jon Barron and Mengdi Xu.