Colin White
About
I design algorithms that create advanced AI algorithms.
I am Head of Research at Abacus.AI. I graduated from Carnegie Mellon University with a Ph.D. in computer science, advised by Nina Balcan and supported by the NDSEG Fellowship. I received my undergraduate degree from Amherst College.
For more, see my CV (last updated Jan. 2024).
I specialize in both fundamental and applied research in machine learning, including research in large language models, automated machine learning, debiasing neural networks, and machine learning for science. My goal is to create more efficient and fairer AI systems. Much of my work involves designing innovative methods by drawing insights from large-scale studies and designing tools for better benchmarking of machine learning techniques.
News
- Feb 21, 2024: Smaug-72B is the first open-source LLM to surpass 80% on the HuggingFace Open LLM Leaderboard! Read the paper here.
- Sep 21, 2023: I was a Senior Area Chair for the NeurIPS Track on Datasets and Benchmarks 2023.
- Sep 12-15, 2023: I was a Program Chair for AutoML 2023. Mark your calendars for Sep 9-12, AutoML'24 in Paris!
- Jan 22, 2023: New survey on neural architecture search. Email me if you have comments!
- Jul 25, 2022: I gave a tutorial on neural architecture search with Debadeepta Dey, at AutoML 2022. View the talk here, and the slides here.
Preprints
- TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks Benjamin Feuer, Robin Tibor Schirrmeister, Valeriia Cherepanova, Chinmay Hegde, Frank Hutter, Micah Goldblum, Niv Cohen*, Colin White* Preprint [paper] [code]
- Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive Arka Pal, Deep Karkhanis, Samuel Dooley, Manley Roberts, Siddartha Naidu, Colin White Preprint [paper] [code]
- Physics-Informed Neural Operators with Exact Differentiation on Arbitrary Geometries Colin White*, Julius Berner*, Jean Kossaifi, Mogab Elleithy, David Pitt, Daniel Leibovici, Zongyi Li, Kamyar Azizzadenesheli, Anima Anandkumar Deep Learning & Differential Equations Workshop at NeurIPS 2023 [paper]
- Neural Architecture Search: Insights from 1000 Papers Colin White, Mahmoud Safari, Rhea Sukthanker, Binxin Ru, Thomas Elsken, Arber Zela, Debadeepta Dey, Frank Hutter Preprint [paper]
Publications
- To the Cutoff... and Beyond? A Longitudinal Perspective on LLM Data Contamination Manley Roberts, Himanshu Thakur, Christine Herlihy, Colin White, Samuel Dooley Selected for a contributed talk at the ICBINB@NeurIPS Workshop 2023 International Conference on Learning Representations (ICLR) 2024 [paper] [code]
- Guaranteed Approximation Bounds for Mixed-Precision Neural Operators Renbo Tu*, Colin White*, Jean Kossaifi, Boris Bonev, Gennady Pekhimenko, Kamyar Azizzadenesheli, Anima Anandkumar International Conference on Learning Representations (ICLR) 2024 [paper] [code]
- Rethinking Bias Mitigation: Fairer Architectures Make for Fairer Face Recognition Samuel Dooley*, Rhea Sukthanker*, John P. Dickerson, Colin White, Frank Hutter, Micah Goldblum Selected for oral presentation Neural Information Processing Systems (NeurIPS) 2023 [paper] [code]
- ForecastPFN: Synthetically-Trained Zero-Shot Forecasting Samuel Dooley, Gurnoor Singh Khurana, Chirag Mohapatra, Siddartha Naidu, Colin White Neural Information Processing Systems (NeurIPS) 2023 [paper] [code]
- When Do Neural Nets Outperform Boosted Trees on Tabular Data? Duncan McElfresh, Sujay Khandagale, Jonathan Valverde, Vishak Prasad C, Ganesh Ramakrishnan, Micah Goldblum, Colin White Neural Information Processing Systems Datasets Track (NeurIPS Datasets Track) 2023 [paper] [code]
- NAS-Bench-Suite-Zero: Accelerating Research on Zero Cost Proxies Arjun Krishnakumar*, Colin White*, Arber Zela*, Renbo Tu*, Mahmoud Safari, Frank Hutter Neural Information Processing Systems Datasets Track (NeurIPS Datasets Track) 2022 [paper] [code]
- On the Generalizability and Predictability of Recommender Systems Duncan McElfresh*, Sujay Khandagale*, Jonathan Valverde*, John P. Dickerson, Colin White Neural Information Processing Systems (NeurIPS) 2022 [paper] [code] [1 min video] [5 min video]
- AutoML for Climate Change: A Call to Action Renbo Tu, Nicholas Roberts, Vishak Prasad C, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White Tackling Climate Change with Machine Learning Workshop at NeurIPS 2022 [paper] [code]
- Speeding up NAS with Adaptive Subset Selection Vishak Prasad C, Colin White, Paarth Jain, Sibasis Nayak, Rishabh Iyer, Ganesh Ramakrishnan Workshop at AutoML 2022 [paper] [1 min video] [5 min video]
- A Deeper Look at Zero-Cost Proxies for Lightweight NAS Colin White, Mikhail Khodak, Renbo Tu, Shital Shah, Sébastien Bubeck, Debadeepta Dey International Conference on Learning Representations Blog Post Track (ICLR Blog Post Track) 2022 [blog post]
- NAS-Bench-Suite: NAS Evaluation is (Now) Surprisingly Easy Yash Mehta*, Colin White*, Arber Zela, Arjun Krishnakumar, Guri Zabergja, Shakiba Moradian, Mahmoud Safari, Kaicheng Yu, Frank Hutter International Conference on Learning Representations (ICLR) 2022 [paper] [code] [slides]
- Synthetic Benchmarks for Scientific Research in Explainable Machine Learning Yang Liu*, Sujay Khandagale*, Colin White, Willie Neiswanger Neural Information Processing Systems Datasets Track (NeurIPS Datasets Track) 2021 [paper] [code] [6 min video]
- How Powerful are Performance Predictors in Neural Architecture Search? Colin White, Arber Zela, Binxin Ru, Yang Liu, Frank Hutter Selected for a contributed talk at the NAS@ICLR Workshop 2021 Neural Information Processing Systems (NeurIPS) 2021 [paper] [code] [slides] [2 min video] [15 min video]
- NAS-Bench-x11 and the Power of Learning Curves Shen Yan*, Colin White*, Yash Savani, Frank Hutter Neural Information Processing Systems (NeurIPS) 2021 [paper] [code] [slides] [15 min video]
- Exploring the Loss Landscape in Neural Architecture Search Colin White, Sam Nolen, Yash Savani Uncertainty in Artificial Intelligence (UAI) 2021 [paper] [code] [blog post] [slides] [8 min video]
- BANANAS: Bayesian Optimization with Neural Architectures for Neural Architecture Search Colin White, Willie Neiswanger, Yash Savani AAAI Conference on Artificial Intelligence (AAAI) 2021 [paper] [code] [blog post] [slides] [18 min video]
- A Study on Encodings for Neural Architecture Search Colin White, Willie Neiswanger, Sam Nolen, Yash Savani Selected for spotlight presentation Neural Information Processing Systems (NeurIPS) 2020 [paper] [code] [3 min video] [10 min video]
- Intra-Processing Methods for Debiasing Neural Networks Yash Savani, Colin White, Naveen Govindarajulu Neural Information Processing Systems (NeurIPS) 2020 [paper] [code] [blog post] [3 min video]
- k-center Clustering under Perturbation Resilience With Maria-Florina Balcan and Nika Haghtalab Transactions on Algorithms Journal (TALG) 2020 Extends results from ICALP 2016 and this arXiv preprint [paper]
- Robust Communication-Optimal Distributed Clustering Algorithms With Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, and David Woodruff International Colloquium on Automata, Languages, and Programming (ICALP) 2019 [paper]
- New Aspects of Beyond Worst-Case Analysis Colin White Ph.D. Thesis, Carnegie Mellon University, 2018 [paper]
- Data-Driven Clustering via Parameterized Lloyd's Families With Maria-Florina Balcan and Travis Dick Selected for spotlight presentation Neural Information Processing Systems (NeurIPS) 2018 [paper]
- Learning-Theoretic Foundations of Algorithm Configuration for Combinatorial Partitioning Problems With Maria-Florina Balcan, Vaishnavh Nagarajan, and Ellen Vitercik Conference on Learning Theory (COLT) 2017 [paper] [10 min video]
- Data Driven Resource Allocation for Distributed Learning With Travis Dick, Mu Li, Krishna Pillutla, Maria-Florina Balcan, and Alex Smola International Conference on Artificial Intelligence and Statistics (AISTATS) 2017 [paper]
- Learning Combinatorial Functions from Pairwise Comparisons With Maria-Florina Balcan and Ellen Vitercik Conference on Learning Theory (COLT) 2016 [paper] [10 min video]
- Lower Bounds in the Preprocessing and Query Phases of Routing Algorithms Colin White European Symposium on Algorithms (ESA) 2015 [paper]
- Small dynamical heights for quadratic polynomials and rational functions With Rob Benedetto, Ruqian Chen, Trevor Hyde, and Yordanka Kovacheva Journal of Experimental Mathematics, 2014 [paper]
Professional Service
- Senior Area Chair for NeurIPS Track on Datasets and Benchmarks 2023
- Program Chair for AutoML 2023
- Local Chair and Area Chair for AutoML 2022
- Co-organizer for the 8th AutoML Workshop at ICML 2021
- Top 10% of reviewers at NeurIPS 2022
- Top 10% of reviewers at ICML 2022
- Top 10% of reviewers at ICLR 2022
- Top 10% of reviewers at NeurIPS 2021
- Top 10% of reviewers at ICML 2021
- Top 10% of reviewers at NeurIPS 2020