• Understanding Optimization and Generalization in Deep Learning: A Trajectory-based Analysis
    CMU AI Lunch, Feb 2019, Pittsburgh, USA
    Harbin Institute of Technology, Jan 2019, Harbin, China
    Peking University, Dec 2018, Beijing, China
    Institute for Interdisciplinary Information Sciences, Tsinghua University, Dec 2018, Beijing, China

  • Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
    ICML 2018, Jul 2018, Stockholm, Sweden

  • On the Power of Over-parametrization in Neural Networks with Quadratic Activation
    ICML 2018, Jul 2018, Stockholm, Sweden

  • On the Power of Randomly Initialized Gradient Descent for Learning Convolutional Neural Networks
    CMU AI Lunch, Mar 2018, Pittsburgh, USA
    Peking University, Jan 2018, Beijing, China
    Microsoft Research Asia, Jan 2018, Beijing, China
    Institute of Computing Technology, Chinese Academy of Sciences, Jan 2018, Beijing, China
    Tencent AI Lab, Jan 2018, Shenzhen, China
    Kuaishou, Jan 2018, Beijing, China
    FLAIR (Future Leaders of AI Retreat), Dec 2017, Shanghai, China

  • Gradient Descent Can Take Exponential Time to Escape Saddle Points
    NIPS 2017, Dec 2017, Long Beach, USA

  • When is a Convolutional Filter Easy to Learn?
    AT&T Labs Graduate Student Symposium, Dec 2017, New York City, USA
    CMU Deep Learning Reading Group, Sep 2017, Pittsburgh, USA
    Facebook AI Research, Aug 2017, Menlo Park, USA

  • Stochastic Variance Reduction Methods for Policy Evaluation
    ICML 2017, Aug 2017, Sydney, Australia
    Microsoft Research Machine Learning Lunch, Aug 2016, Redmond, USA

  • Computationally Efficient Robust Sparse Estimation in High Dimensions
    CMU Statistical Machine Learning Seminar, Oct 2017, Pittsburgh, USA
    COLT 2017, Jul 2017, Amsterdam, the Netherlands (joint with Jerry Li)

  • An Improved Gap-Dependency Analysis of the Noisy Power Method
    COLT 2016, Jun 2016, New York City, USA