Jielin Qiu

I am a final-year Ph.D. candidate in the Computer Science Department at the School of Computer Science, Carnegie Mellon University. I am fortunate to be advised by Prof. Lei Li and Prof. Christos Faloutsos. Before that, I received my B.Eng. from Shanghai Jiao Tong University, where I was advised by Prof. Bao-Liang Lu. I've worked as a research intern at Google, Meta, Microsoft, Amazon Web Services, and Adobe.

My research interests lie in Multimodal Machine Learning. The central goal of my research is to design scalable inference and learning algorithms that connect language, perception, and control for robust multimodal learning. I strive to achieve this by learning the unique modality equivalence through abstract multimodal representations. My current research focuses on the foundations of multimodal learning, with applications in multimedia, computer vision, natural language processing, healthcare, and embodied AI. My research is generously supported by CMU CSD fellowships and funding from DARPA, NSF, Adobe, Allegheny Health Network, and Cleveland Clinic.

Email  /  Google Scholar  /  GitHub  /  LinkedIn

Update


I'm on the job market, looking for a Research Scientist or Applied Scientist position starting in 2024.
Please feel free to contact me if you have an opening or see a suitable match!

News


  • [2024-02] One paper accepted by CVPR 2024.
  • [2024-01] One paper accepted by Journal of Data-centric Machine Learning Research (DMLR) 2024.
  • [2023-11] One paper accepted by PMLR ML4H 2023.
  • [2023-10] Started a research internship at Google.
  • [2023-10] One paper accepted by EMNLP Findings 2023.
  • [2023-06] One paper accepted by ICML 2023 Workshop on Interactive Learning with Implicit Human Feedback (spotlight).
  • [2023-06] Two papers accepted by ICML 2023 Workshop on Machine Learning for Multimodal Healthcare Data.
  • [2023-05] Started a research internship at Meta.
  • [2023-05] One paper accepted by ACL Findings 2023.
  • [2023-04] One paper accepted by ICML 2023.
  • [2023-04] Invited talk at Microsoft Research Cambridge.
  • [2023-02] One paper accepted by CVPR 2023.
  • [2023-02] One paper accepted by ICASSP 2023.
  • [2023-01] Started a research internship at Microsoft.
  • [2023-01] One paper accepted by EACL Findings 2023.
  • [2023-01] One paper accepted by AISTATS 2023.
  • [2022-10] One paper accepted by WACV 2023.
  • [2022-10] One paper accepted by NeurIPS 2022 Workshop on Distribution Shifts.
  • [2022-10] Named a Top Reviewer for NeurIPS 2022.
  • [2022-06] One paper accepted by MLHC 2022.
  • [2022-05] Started a research internship at AWS AI.
  • [2022-05] One paper accepted by ICML 2022 Workshop on Principles of Distribution Shift.
  • [2022-04] One paper accepted by ICLR 2022 Workshop on Socially Responsible Machine Learning.
  • [2021-09] Received gift funding from Adobe. Thanks, Adobe!
  • [2021-05] Started a research internship at Adobe Research.

Work Experience


  • [2023/10 - Now] - Google. Student Researcher.
    Working on multimodal watermarking.
  • [2023/05 - 2023/08] - Meta. Research Scientist Intern.
    Worked on entity-centric VQA and retrieval-augmented multimodal LLMs.
  • [2023/01 - 2023/04] - Microsoft. Research Intern.
    Worked on a multimodal video summarization dataset and an entity recognition image dataset.
  • [2022/05 - 2022/12] - Amazon Web Services. Applied Scientist Intern.
    Worked on a robustness study of multimodal image-text models under distribution shifts.
  • [2021/05 - 2021/12] - Adobe. Research Intern.
    Worked on multimodal livestream video segmentation and summarization.

Selected Publications


    * denotes equal contribution

    SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM
    Jielin Qiu, Andrea Madotto, Zhaojiang Lin, Paul Crook, Ethan Xu, Luna Dong, Christos Faloutsos, Lei Li, Babak Damavandi, Seungwhan Moon
    Under Review

    MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
    Jielin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Ding Zhao, Bo Li, Lijuan Wang
    CVPR 2024
    [paper] [website] [dataset] [code]

    Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift
    Jielin Qiu, Yi Zhu, Xingjian Shi, Florian Wenzel, Zhiqiang Tang, Ding Zhao, Bo Li, Mu Li
    Journal of Data-centric Machine Learning Research (DMLR) 2024
    [paper] [website] [code]

    Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition
    Jielin Qiu, William Han, Winfred Wang, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Christos Faloutsos, Lei Li, Lijuan Wang
    Under Review

    Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment
    Jielin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin
    ACL 2023 Findings
    [paper] [press]

    Embodied Executable Policy Learning with Language-based Scene Summarization
    Jielin Qiu*, Mengdi Xu*, William Han*, Seungwhan Moon, Ding Zhao
    ICML 2023 Workshop on Interactive Learning with Implicit Human Feedback (spotlight)
    [paper]

    Can Brain Signals Reveal Inner Alignment with Human Languages?
    William Han*, Jielin Qiu*, Jiacheng Zhu, Mengdi Xu, Douglas Weber, Bo Li, Ding Zhao
    EMNLP 2023 Findings
    [paper] [code]

    Automated Cardiovascular Record Retrieval by Multimodal Learning between Electrocardiogram and Clinical Report
    Jielin Qiu*, Jiacheng Zhu*, Shiqi Liu, William Han, Jingqi Zhang, Chaojing Duan, Michael Rosenberg, Emerson Liu, Douglas Weber, Ding Zhao
    PMLR Proceedings of Machine Learning for Health 2023
    [paper] [code]

    Multimodal Representation Learning of Cardiovascular Magnetic Resonance Imaging
    Jielin Qiu*, Peide Huang*, Makiya Nakashima, Jaehyun Lee, Jiacheng Zhu, Wilson Tang, Pohao Chen, Christopher Nguyen, Byung-Hak Kim, Debbie Kwon, Douglas Weber, Ding Zhao, David Chen
    ICML 2023 Workshop on Machine Learning for Multimodal Healthcare Data
    [paper]

    Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
    Jielin Qiu*, William Han*, Jiacheng Zhu, Mengdi Xu, Michael Rosenberg, Emerson Liu, Douglas Weber, Ding Zhao
    EACL 2023 Findings
    [paper] [code]

    Cardiac Disease Diagnosis on Imbalanced Electrocardiography Data Through Optimal Transport Augmentation
    Jielin Qiu*, Jiacheng Zhu*, Mengdi Xu, Peide Huang, Michael Rosenberg, Douglas Weber, Emerson Liu, Ding Zhao
    ICASSP 2023
    [paper]

    LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos
    Jielin Qiu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Ding Zhao, Hailin Jin
    WACV 2023
    [paper] [press]

    Interpolation for Robust Learning: Data Augmentation on Geodesics
    Jiacheng Zhu, Jielin Qiu, Aritra Guha, Zhuolin Yang, XuanLong Nguyen, Bo Li, Ding Zhao
    ICML 2023
    [paper]

    Align and Attend: Multimodal Summarization with Dual Contrastive Losses
    Bo He, Jun Wang, Jielin Qiu, Abhinav Shrivastava, Trung Bui, Zhaowen Wang
    CVPR 2023
    [paper] [code]

    Benchmarking Robustness under Distribution Shift of Multimodal Image-Text Models
    Jielin Qiu, Yi Zhu, Xingjian Shi, Zhiqiang Tang, Ding Zhao, Bo Li, Mu Li
    NeurIPS 2022 Workshop on Distribution Shifts
    [paper] [press] [code]

    GeoECG: Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction
    Jiacheng Zhu*, Jielin Qiu*, Zhuolin Yang, Douglas Weber, Michael Rosenberg, Emerson Liu, Bo Li, Ding Zhao
    MLHC 2022
    [paper]

    Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables
    Mengdi Xu, Peide Huang, Yaru Niu, Visak Kumar, Jielin Qiu, Chao Fang, Kuan-Hui Lee, Xuewei Qi, Henry Lam, Bo Li, Ding Zhao
    AISTATS 2023
    [paper] [code]

    Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction
    Jiacheng Zhu*, Jielin Qiu*, Zhuolin Yang, Michael Rosenberg, Emerson Liu, Bo Li, Ding Zhao
    ICLR 2022 Workshop on Socially Responsible Machine Learning (SRML)
    [paper]

    Comparing Recognition Performance and Robustness of Multimodal Deep Learning Models for Multimodal Emotion Recognition
    Wei Liu, Jielin Qiu, Wei-Long Zheng, Bao-Liang Lu
    IEEE Transactions on Cognitive and Developmental Systems 2021
    [paper] [code]

    Visual Sequence Learning in Hierarchical Prediction Networks and Primate Visual Cortex
    Jielin Qiu, Ge Huang, Tai Sing Lee
    NeurIPS 2019
    [paper]

    Investigating Sex Differences in Classification of Five Emotions from EEG and Eye Movement Signals
    Lan-Qing Bao, Jielin Qiu, Hao Tang, Wei-Long Zheng, Bao-Liang Lu
    EMBC 2019
    [paper] [code]

    Approximation Gradient Error Variance Reduced Optimization
    Weiye Zhao, Yang Liu, Xiaoming Zhao, Jielin Qiu, Jian Peng
    AAAI-RLG 2019
    [paper]

    Multi-view Emotion Recognition Using Deep Canonical Correlation Analysis
    Jielin Qiu, Wei Liu, Bao-Liang Lu
    ICONIP 2018
    [paper] [code]

Services


  • Reviewer: ICML 2021-2024, CVPR 2022-2024, WACV 2023-2024, ICLR 2023-2024, ICASSP 2023-2024, ECCV 2022-2024, ACL Rolling Review (ARR) 2024, ACCV 2024, CHIL 2022-2024, NeurIPS 2022-2023, ICCV 2023, KDD 2023, EACL 2023, MICCAI 2023, AISTATS 2023, MLHC 2022-2023, ACM MM 2022.
  • PC Member: AAAI 2021-2024, ACL 2023, EMNLP 2022-2023.
  • Journal Reviewer: Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Journal of Data-centric Machine Learning Research (DMLR), IEEE Transactions on Neural Networks and Learning Systems.
  • Committee: NeurIPS 2022 virtual deep-dive session chair, CMU RISS Committee.

Teaching


  • Teaching Assistant for CMU 16-824 Visual Learning and Recognition, Instructor: Prof. Jun-Yan Zhu, Fall 2021
  • Teaching Assistant for CMU 11-777 Multimodal Machine Learning, Instructor: Prof. Yonatan Bisk, Spring 2021

Website template from Jon Barron and Mengdi Xu.