Simon Shaolei Du 杜少雷
About Me
I am an assistant professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. My research interests are broadly in machine learning, including deep learning, representation learning, and reinforcement learning. Before joining the faculty, I was a postdoc at the Institute for Advanced Study in Princeton, hosted by Sanjeev Arora. I completed my Ph.D. in Machine Learning at Carnegie Mellon University, co-advised by Aarti Singh and Barnabás Póczos. Before that, I studied EECS and EMS at UC Berkeley. I have also spent time at the Simons Institute and the research labs of Facebook, Google, and Microsoft.
Students and Visitors
My current focus is on the theoretical foundations of deep learning, representation learning, and (multi-agent) reinforcement learning.
Research Focus and Selected Publications
Representation Learning Theory
We studied when pretraining provably improves performance on downstream tasks. Based on our theory, we developed an active learning algorithm that selects the most relevant pretraining data.
Optimization and Generalization in Over-Parameterized Neural Networks
We proved the first set of global optimization and generalization guarantees for over-parameterized neural networks in the neural tangent kernel regime [Wikipedia]. Also see our [Blog] for a quick summary. Recently, we found that over-parameterization can exponentially slow down convergence.
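For readers unfamiliar with the term, the neural tangent kernel of a network \(f(x;\theta)\) is the kernel induced by its gradient features at initialization. This is the standard textbook definition, sketched here for orientation rather than taken from any particular paper above:

```latex
% Neural tangent kernel at initialization \theta_0:
% the inner product of parameter gradients at two inputs.
K(x, x') \;=\; \left\langle \nabla_\theta f(x;\theta_0),\; \nabla_\theta f(x';\theta_0) \right\rangle
```

In the heavily over-parameterized regime this kernel stays nearly constant during training, so gradient descent on the network behaves like kernel regression with \(K\).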
Reinforcement Learning with Function Approximation
We studied the necessary and sufficient conditions that permit sample-efficient reinforcement learning in problems with large state spaces.
Multi-Agent Reinforcement Learning (MARL)
We initiated the study of what datasets permit solving offline reinforcement learning problems. We also study MARL with function approximation, avoiding an exponential dependence on the number of agents.
Fundamental Limits of Reinforcement Learning
We developed algorithms with optimal sample complexity guarantees for reinforcement learning. In particular, we showed that the sample complexity can be independent of the planning horizon.
Acknowledgement: National Science Foundation (Awards 2212261, 2143493, 2134106, 2019844, 2110170, 2229881), NEC, Tencent, UW eScience.