Simon Shaolei Du

Office: GHC 8005

Email: ssdu [at] cs (dot) cmu (dot) edu

Social Media: LinkedIn Facebook 知乎 (Zhihu) WeChat

I am a third-year PhD student in the Machine Learning Department at Carnegie Mellon University, co-advised by Aarti Singh and Barnabás Póczos. Previously, I studied EECS and EMS at UC Berkeley, where I worked with Ming Gu, Lei Li, Michael Mahoney, and Stuart Russell. I also spent a semester in the Department of Electronic Engineering at Tsinghua University.

My research interests broadly include topics in theoretical machine learning and statistics, such as matrix factorization, convex/non-convex optimization, transfer learning, reinforcement learning, non-parametric statistics, and robust statistics. On the application side, I am interested in applying machine learning techniques to precision agriculture.

I was born in Sydney and grew up in Beijing. I spent six wonderful years at SDSZ.


Publications

Preprints
  1. Stochastic Zeroth-order Optimization in High Dimensions,
    Yining Wang, Simon S. Du, Sivaraman Balakrishnan, Aarti Singh.
    [PDF] [arXiv]
  2. When is a Convolutional Filter Easy to Learn?
    Simon S. Du, Jason D. Lee, Yuandong Tian.
    [PDF] [arXiv]

Conference Papers

  1. Gradient Descent Can Take Exponential Time to Escape Saddle Points,
    Simon S. Du, Chi Jin, Jason D. Lee, Michael I. Jordan, Barnabás Póczos, Aarti Singh,
    To appear in Conference on Neural Information Processing Systems (NIPS) 2017 (Spotlight).
    [PDF] [arXiv]
  2. On the Power of Truncated SVD for General High-rank Matrix Estimation Problems,
    Simon S. Du, Yining Wang, Aarti Singh,
    To appear in Conference on Neural Information Processing Systems (NIPS) 2017.
    [PDF] [arXiv]
  3. Hypothesis Transfer Learning via Transformation Functions,
    Simon S. Du, Jayanth Koushik, Aarti Singh, Barnabás Póczos,
    To appear in Conference on Neural Information Processing Systems (NIPS) 2017.
    [PDF] [arXiv] [Poster]
  4. High-throughput Robotic Phenotyping of Energy Sorghum Crops,
    Srinivasan Vijayarangan, Paloma Sodhi, Prathamesh Kini, James Bourne, Simon S. Du, Hanqi Sun, Barnabás Póczos, Dimitrios Apostolopoulos, and David Wettergreen,
    Conference on Field and Service Robotics (FSR) 2017.
  5. Stochastic Variance Reduction Methods for Policy Evaluation,
    Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou,
    International Conference on Machine Learning (ICML) 2017.
    [PDF] [arXiv] [Lihong's Talk at Simons Institute] [Poster]
  6. Computationally Efficient Robust Estimation of Sparse Functionals,
    Simon S. Du, Sivaraman Balakrishnan, Aarti Singh,
    Conference on Learning Theory (COLT) 2017.
    [PDF] [arXiv] [Slides] [Poster]
    Merged with this paper.
  7. Efficient Nonparametric Smoothness Estimation,
    Shashank Singh, Simon S. Du, Barnabás Póczos,
    Conference on Neural Information Processing Systems (NIPS) 2016.
    [PDF] [arXiv]
  8. An Improved Gap-Dependency Analysis of the Noisy Power Method,
    Maria-Florina Balcan*, Simon S. Du*, Yining Wang*, Adams Wei Yu*,
    Conference on Learning Theory (COLT) 2016.
    [PDF] [arXiv] [Slides] [Talk]
  9. Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method,
    David G. Anderson*, Simon S. Du*, Michael W. Mahoney*, Christopher Melgaard*, Kunming Wu*, Ming Gu,
    International Conference on Artificial Intelligence and Statistics (AISTATS) 2015.
    [PDF] [Supplement] [Code]

Workshop Papers

  1. Novel Quantization Strategies for Linear Prediction with Guarantees,
    Simon S. Du**, Yichong Xu**, Yuan Li, Hongyang Zhang, Aarti Singh, Pulkit Grover,
    International Conference on Machine Learning (ICML) 2016, On-Device Intelligence (ONDI) workshop.
    [PDF] [Slides]
  2. Maxios: Large Scale Nonnegative Matrix Factorization for Collaborative Filtering,
    Simon S. Du, Yilin Liu, Boyi Chen, Lei Li,
    Conference on Neural Information Processing Systems (NIPS) 2014, workshop on Distributed Machine Learning and Matrix Computations.
    [PDF] [Poster]

**: Equal contribution. *: Authors listed in alphabetical order, following the convention in mathematics and theoretical computer science.