The two primary areas of my current research agenda are non-volatile memory and self-driving DBMSs.

Non-Volatile Memory Database Management Systems

The central theme of my research is the development of new database management systems that leverage the characteristics of emerging hardware technologies to meet the requirements of modern data-intensive applications. In particular, my current research focuses on a new class of non-volatile memory (NVM) technologies that blur the line between volatile memory and durable storage. NVM supports low-latency, byte-addressable accesses similar to DRAM, but all writes are persistent, as on SSDs. Several aspects of NVM make existing DBMS architectures inappropriate for it. My research investigates how to rearchitect the DBMS from the ground up to take advantage of NVM. I redesign the fundamental algorithms and data structures employed in traditional DBMSs to leverage the persistence and performance characteristics of NVM. This enables the DBMS to support low-latency transactions, instantaneous recovery from system failures, and cost-effective data management. My work shows that NVM’s impact spans all the layers of the DBMS, including logging and recovery, storage management, indexing, and query execution. I designed and built the Peloton non-volatile memory database management system.

Self-Driving Database Management Systems

Peloton started out as a research platform for exploring the implications of NVM for DBMSs. It is now also being used to pursue a new research direction: designing a self-driving DBMS. Tuning modern DBMSs for a particular workload is a laborious and error-prone task due to the long and growing list of knobs that these systems expose. If the DBMS could automatically tune itself, it would remove many of the complications and costs involved in its deployment. My research focuses on designing new algorithms that allow Peloton to tune itself. I apply techniques from machine learning to tune the physical design of the database to accelerate query processing. I developed new algorithms for tuning two key components of the database's physical design: storage layout and index configuration. Our evaluation in Peloton demonstrated that the self-driving module inside the DBMS achieves a near-optimal physical design for an arbitrary workload without requiring any manual tuning.