CX Research Group

Research

2025

Aligning Web Query Generation with Ranking Objectives via Direct Preference Optimization
João Coelho, Bruno Martins, João Magalhães, Chenyan Xiong
Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval  ·  13 Jul 2025  ·  doi:10.1145/3726302.3730162
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models
Hao Kang, Zichun Yu, Chenyan Xiong
arXiv  ·  26 May 2025  ·  doi:10.48550/arXiv.2505.20225
Group-Level Data Selection for Efficient Pretraining
Zichun Yu, Fei Peng, Jie Lei, Arnold Overwijk, Wen-tau Yih, Chenyan Xiong
arXiv  ·  20 Feb 2025  ·  doi:10.48550/arXiv.2502.14709
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li, Zichun Yu, Chenyan Xiong
ICLR 2025  ·  22 Jan 2025  ·  doi:10.48550/arXiv.2410.14208

2024

MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models
Zichun Yu, Spandan Das, Chenyan Xiong
Neural Information Processing Systems (NeurIPS)  ·  10 Jun 2024  ·  doi:10.48550/arXiv.2406.06046
Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval
João Coelho, Bruno Martins, João Magalhães, Jamie Callan, Chenyan Xiong
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  ·  Aug 2024  ·  doi:10.18653/v1/2024.acl-short.35