Next-Generation Storage Technologies
For the first time in 25 years, a new class of non-volatile memory
is being created that is expected to be more than 100 times faster than current durable storage devices.
This technology blurs the line between memory and storage.
My research focuses on understanding the changes required in database systems to leverage the unique characteristics of these non-volatile memory technologies.
I designed and built a database management system for these next-generation storage technologies.
We have developed a new logging and recovery protocol, called write-behind logging,
that enables a database system to recover nearly instantaneously from system failures.
Our evaluation in the Peloton DBMS showed that, compared to the ubiquitous write-ahead logging protocol,
write-behind logging reduces the recovery time of the DBMS by more than two orders of magnitude,
shrinks its storage footprint, and, surprisingly, also improves its runtime performance.
- Write-Behind Logging, VLDB 2017
- How to Build a Non-Volatile Memory Database Management System, SIGMOD 2017 (Tutorial)
- Larger-than-Memory Data Management on Modern Storage Hardware for In-Memory OLTP Database Systems, DAMON 2016
- Let’s Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems, SIGMOD 2015
- A Prolegomenon on OLTP Database Systems for Non-Volatile Memory, ADMS 2014
Self-Driving Database Management Systems
I am also excited about self-driving data management systems with domain-specific AI that enables them to automatically adapt to evolving real-world workloads. I design self-driving online algorithms for incrementally morphing the storage layout, access methods, and data placement policy employed inside the data management system in tandem with workload shifts. We have designed techniques to continuously and automatically evolve the database’s physical storage layout by predicting future query workload trends. Our evaluation in Peloton demonstrated that this self-driving mechanism allows the DBMS to achieve a near-optimal storage layout for an arbitrary workload without requiring any manual tuning.