Lin Ma

I am a Ph.D. candidate in the Computer Science Department at Carnegie Mellon University, where I am fortunate to be advised by Andy Pavlo. My research interests are database systems, data management, and machine learning. My current research focus is the architecture design for Self-Driving Database Systems, implemented in a new in-memory relational DBMS from CMU.

Before starting my graduate studies at CMU, I received my Bachelor’s degree from Peking University, majoring in Computer Science and Technology. At PKU, I worked on data and information management with Prof. Bin Cui.

Download CV

Publications


  • MB2: Decomposed Behavior Modeling for Self-Driving Database Management Systems
    Lin Ma, William Zhang, Jie Jiao, Wuwen Wang, Matthew Butrovich, Wan Shen Lim, Prashanth Menon, Andrew Pavlo
    SIGMOD 2021 [pdf] [short video] [long video] [code]

  • Everything is a Transaction: Unifying Logical Concurrency Control and Physical Data Structure Maintenance in Database Management Systems
    Ling Zhang, Matthew Butrovich, Tianyu Li, Andrew Pavlo, Yash Nannapaneni, John Rollinson, Huanchen Zhang, Ambarish Balakumar, Daniel Biales, Ziqi Dong, Emmanuel J Eppinger, Jordi E Gonzalez, Wan Shen Lim, Jianqiao Liu, Lin Ma, Prashanth Menon, Soumil Mukherjee, Tanuj Nayak, Amadou Ngom, Dong Niu, Deepayan Patra, Poojita Raj, Stephanie Wang, Wuwen Wang, Yao Yu, William Zhang
    CIDR 2021 [pdf]

  • Permutable Compiled Queries: Dynamically Adapting Compiled Queries without Recompiling
    Prashanth Menon, Amadou Ngom, Lin Ma, Todd C. Mowry, Andrew Pavlo
    VLDB 2020 [pdf]

  • Active Learning for ML Enhanced Database Systems
    Lin Ma, Bailu Ding, Sudipto Das, Adith Swaminathan
    SIGMOD 2020 [pdf] [slides] [poster] [video]

  • External vs. Internal: An Essay on Machine Learning Agents for Autonomous Database Management Systems
    Andrew Pavlo, Matthew Butrovich, Ananya Joshi, Lin Ma, Prashanth Menon, Dana Van Aken, Lisa Lee, Ruslan Salakhutdinov
    TCDE Bulletin 2019 [pdf]

  • Query-based Workload Forecasting for Self-Driving Database Management Systems
    Lin Ma, Dana Van Aken, Ahmed Hefny, Gustavo Mezerhane, Andrew Pavlo, Geoffrey J. Gordon
    SIGMOD 2018 [pdf] [slides] [code] [poster] [video]

  • Self-Driving Database Management Systems
    Andrew Pavlo, Gustavo Angulo, Joy Arulraj, Haibin Lin, Jiexi Lin, Lin Ma, Prashanth Menon, Todd C Mowry, Matthew Perron, Ian Quah, et al.
    CIDR 2017 [pdf]

  • Larger-than-Memory Data Management on Modern Storage Hardware for In-Memory OLTP Database Systems
    Lin Ma, Joy Arulraj, Sam Zhao, Andrew Pavlo, Subramanya R. Dulloor, Michael J. Giardino, Jeff Parkhurst, Jason L. Gardner, Kshitij Doshi, Stanley Zdonik
    DAMON 2016 [pdf] [slides] [code]

  • Reducing the Storage Overhead of Main-Memory OLTP Databases with Hybrid Indexes
    Huanchen Zhang, David G Andersen, Andrew Pavlo, Michael Kaminsky, Lin Ma, Rui Shen
    SIGMOD 2016 [pdf]

  • PAGE: A Partition Aware Engine for Parallel Graph Computation
    Yingxia Shao, Bin Cui, Lin Ma
    TKDE 2015 [pdf]

  • Parallel Subgraph Listing in a Large-Scale Graph
    Yingxia Shao, Bin Cui, Lei Chen, Lin Ma, Junjie Yao, Ning Xu
    SIGMOD 2014 [pdf]

  • PAGE: A Partition Aware Graph Computation Engine
    Yingxia Shao, Junjie Yao, Bin Cui, Lin Ma
    CIKM 2013 [pdf]

Experience

Working


  • Microsoft Research
    Research Intern in the Data Management, Exploration and Mining (DMX) Group, Summer 2018

Teaching


  • CMU 15-721 Advanced Database Systems
    Head Teaching Assistant [Spring 2019]

  • CMU 15-445 Database Systems
    Teaching Assistant [Fall 2018]

Professional Service


Talks


  • NoisePage: The Self-Driving Database Management System
    Facebook, June 4, 2021
    Harvard University, May 28, 2021
    Columbia University, April 13, 2021
    Stanford University (MLSys Seminar), April 8, 2021 [video]
    Oracle, April 6, 2021
    Carnegie Mellon University, March 22, 2021 [video]
    Centrum Wiskunde & Informatica, March 19, 2021
    The University of Chicago, March 17, 2021
    University of Washington, March 3, 2021
    University of California, Berkeley, February 23, 2021
    University of California, Santa Cruz (CSE 215), February 19, 2021
    Technical University of Munich, February 18, 2021
    Brown University, January 27, 2021

  • Active Learning for ML Enhanced Database Systems
    SIGMOD, June 2020

  • Self-Driving Databases: It All Starts with Workload Forecasting
    Percona Live, May 2019

  • Efficiently Leveraging B-Instances for Query Plan Predictions
    Microsoft Research, August 2018

  • Query-based Workload Forecasting for Self-Driving DBMSs
    SIGMOD, June 2018
    Microsoft Research, May 2018
    PDL Retreat, October 2017

  • Larger-than-Memory Data Management on Modern Storage Hardware for In-Memory OLTP Database Systems
    SIGMOD, June 2016

  • The Self-Driving DBMS
    PDL Retreat, October 2016

  • Multi-Level Anti-Caching for NVM+SSD in H-Store
    PDL Retreat, October 2015

  • Finalist Presentation of Programming Contest
    SIGMOD, June 2014

  • Using Less to Do More With Anti-Caching in OLTP Database Systems
    Carnegie Mellon University, August 2014