Research
2025
Aligning Web Query Generation with Ranking Objectives via Direct Preference Optimization
Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (Association for Computing Machinery, Inc.)
·
13 Jul 2025
·
doi:10.1145/3726302.3730162
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models
arXiv
·
26 May 2025
·
doi:10.48550/arXiv.2505.20225
Group-Level Data Selection for Efficient Pretraining
arXiv
·
20 Feb 2025
·
doi:10.48550/arXiv.2502.14709
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
ICLR 2025
·
22 Jan 2025
·
doi:10.48550/arXiv.2410.14208
2024
Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
·
Aug 2024
·
doi:10.18653/v1/2024.acl-short.35
MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models
Neural Information Processing Systems (NeurIPS)
·
10 Jun 2024
·
doi:10.48550/arXiv.2406.06046