Papers
NULLS!: Revisiting Null Representation in Modern Columnar Formats
Zeng, Xinyu and Meng, Ruijun and Pavlo, Andrew and McKinney, Wes and Zhang, Huanchen
Proceedings of the 20th International Workshop on Data Management on New Hardware, 2024. DOI
Dear User-Defined Functions, Inlining isn't working out so great for us. Let's try batching to make our relationship work. Sincerely, SQL
Franz, Kai and Arch, Samuel I and Hirn, Denis and Grust, Torsten and Mowry, Todd and Pavlo, Andrew
CIDR 2024, Conference on Innovative Data Systems Research, 2024.
An Empirical Evaluation of Columnar Storage Formats
Zeng, Xinyu and Hui, Yulong and Shen, Jiahong and Pavlo, Andrew and McKinney, Wes and Zhang, Huanchen
Proc. VLDB Endow., pages. 148–161, 2023.
Tigger: A Database Proxy That Bounces With User-Bypass
Butrovich, Matthew and Ramanathan, Karthik and Rollinson, John and Lim, Wan Shen and Zhang, William and Sherry, Justine and Pavlo, Andrew
Proc. VLDB Endow., pages. 3335–3348, 2023. DOI
Database Gyms
Lim, Wan Shen and Butrovich, Matthew and Zhang, William and Crotty, Andrew and Ma, Lin and Xu, Peijing and Gehrke, Johannes and Pavlo, Andrew
CIDR 2023, Conference on Innovative Data Systems Research, 2023.
Litmus: Towards a Practical Database Management System with Verifiable ACID Properties and Transaction Correctness
Xia, Yu and Yu, Xiangyao and Butrovich, Matthew and Pavlo, Andrew and Devadas, Srinivas
Proceedings of the 2022 International Conference on Management of Data, pages. 1478–1492, 2022. DOI
Are You Sure You Want to Use MMAP in Your Database Management System?
Andrew Crotty and Viktor Leis and Andrew Pavlo
CIDR 2022, Conference on Innovative Data Systems Research, 2022. INFO
MB2: Decomposed Behavior Modeling for Self-Driving Database Management Systems
Ma, Lin and Zhang, William and Jiao, Jie and Wang, Wuwen and Butrovich, Matthew and Lim, Wan Shen and Menon, Prashanth and Pavlo, Andrew
Proceedings of the 2021 International Conference on Management of Data, pages. 1248–1261, 2021. DOI
Filter Representation in Vectorized Query Execution
Ngom, Amadou and Menon, Prashanth and Butrovich, Matthew and Ma, Lin and Lim, Wan Shen and Mowry, Todd C. and Pavlo, Andrew
Proceedings of the 17th International Workshop on Data Management on New Hardware (DaMoN 2021), pages. 6:1–6:7, 2021. DOI
An Inquiry into Machine Learning-based Automatic Configuration Tuning Services on Real-World Database Management Systems
Dana Van Aken and Dongsheng Yang and Sebastien Brillard and Ari Fiorino and Bohan Zhang and Christian Billian and Andrew Pavlo
Proc. VLDB Endow., pages. 1241–1253, 2021.
Spitfire: A Three-Tier Buffer Manager for Volatile and Non-Volatile Memory
Zhou, Xinjing and Arulraj, Joy and Pavlo, Andrew and Cohen, David
Proceedings of the 2021 International Conference on Management of Data, pages. 2195–2207, 2021. DOI
Everything is a Transaction: Unifying Logical Concurrency Control and Physical Data Structure Maintenance in Database Management Systems
Ling Zhang and Matthew Butrovich and Tianyu Li and Andrew Pavlo and Yash Nannapaneni and John Rollinson and Huanchen Zhang and Ambarish Balakumar and Daniel Biales and Ziqi Dong and Emmanuel J Eppinger and Jordi E Gonzalez and Wan Shen Lim and Jianqiao Liu and Lin Ma and Prashanth Menon and Soumil Mukherjee and Tanuj Nayak and Amadou Ngom and Dong Niu and Deepayan Patra and Poojita Raj and Stephanie Wang and Wuwen Wang and Yao Yu and William Zhang
CIDR 2021, Conference on Innovative Data Systems Research, 2021.
Mainlining Databases: Supporting Fast Transactional Workloads on Universal Columnar Data File Formats
Tianyu Li and Matthew Butrovich and Amadou Ngom and Wan Shen Lim and Wes McKinney and Andrew Pavlo
Proc. VLDB Endow., pages. 534–546, 2020.
Taurus: Lightweight Parallel Logging for In-Memory Database Management Systems
Yu Xia and Xiangyao Yu and Andrew Pavlo and Srinivas Devadas
Proc. VLDB Endow., pages. 189–201, 2020. CODE
Permutable Compiled Queries: Dynamically Adapting Compiled Queries without Recompiling
Prashanth Menon and Amadou Ngom and Lin Ma and Todd C. Mowry and Andrew Pavlo
Proc. VLDB Endow., pages. 101–113, 2020.
Order-Preserving Key Compression for In-Memory Search Trees
Huanchen Zhang and Xiaoxuan Liu and David G. Andersen and Michael Kaminsky and Kimberly Keeton and Andrew Pavlo
Proceedings of the 2020 International Conference on Management of Data, pages. 1601–1615, 2020. CODE
On Supporting Efficient Snapshot Isolation for Hybrid Workloads with Multi-Versioned Indexes
Yihan Sun and Guy E. Blelloch and Wan Shen Lim and Andrew Pavlo
Proc. VLDB Endow., pages. 221–225, 2019.
Scheduling OLTP Transactions via Learned Abort Prediction
Yangjun Sheng and Anthony Tomasic and Tieying Zhang and Andrew Pavlo
Proceedings of the Second International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM@SIGMOD 2019, pages. 1:1–1:8, 2019.
Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask
Timo Kersten and Viktor Leis and Alfons Kemper and Thomas Neumann and Andrew Pavlo and Peter Boncz
Proc. VLDB Endow., pages. 2209–2222, 2018.
Sundial: Harmonizing Concurrency Control and Caching in a Distributed OLTP Database Management System
Xiangyao Yu and Yu Xia and Andrew Pavlo and Daniel Sanchez and Larry Rudolph and Srinivas Devadas
Proc. VLDB Endow., pages. 1289–1302, 2018. CODE
Building a Bw-Tree Takes More Than Just Buzz Words
Ziqi Wang and Andrew Pavlo and Hyeontaek Lim and Viktor Leis and Huanchen Zhang and Michael Kaminsky and David G. Andersen
Proceedings of the 2018 ACM International Conference on Management of Data, pages. 473–488, 2018. CODE
Query-based Workload Forecasting for Self-Driving Database Management Systems
Lin Ma and Dana Van Aken and Ahmed Hefny and Gustavo Mezerhane and Andrew Pavlo and Geoffrey J. Gordon
Proceedings of the 2018 ACM International Conference on Management of Data, pages. 631–645, 2018. DOI CODE
SuRF: Practical Range Query Filtering with Fast Succinct Tries
Huanchen Zhang and David G. Andersen and Michael Kaminsky and Andrew Pavlo and Hyeontaek Lim and Viktor Leis and Kimberly Keeton
Proceedings of the 2018 ACM International Conference on Management of Data, pages. 323–336, 2018. CODE
Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last
Prashanth Menon and Todd C. Mowry and Andrew Pavlo
Proc. VLDB Endow., pages. 1–13, 2017.
Automatic Database Management System Tuning Through Large-scale Machine Learning
Van Aken, Dana and Pavlo, Andrew and Gordon, Geoffrey J. and Zhang, Bohan
Proceedings of the 2017 ACM International Conference on Management of Data, pages. 1009–1024, 2017. DOI CODE
An Empirical Evaluation of In-Memory Multi-Version Concurrency Control
Yingjun Wu and Joy Arulraj and Jiexi Lin and Ran Xian and Andrew Pavlo
Proc. VLDB Endow., pages. 781–792, 2017. CODE
Online Deduplication for Databases
Xu, Lianghong and Pavlo, Andrew and Sengupta, Sudipta and Ganger, Gregory R.
Proceedings of the 2017 ACM International Conference on Management of Data, pages. 1355–1368, 2017.
An Evaluation of Distributed Concurrency Control
Rachael Harding and Dana Van Aken and Andrew Pavlo and Michael Stonebraker
Proc. VLDB Endow., pages. 553–564, 2017. CODE
Self-Driving Database Management Systems
Andrew Pavlo and Gustavo Angulo and Joy Arulraj and Haibin Lin and Jiexi Lin and Lin Ma and Prashanth Menon and Todd Mowry and Matthew Perron and Ian Quah and Siddharth Santurkar and Anthony Tomasic and Skye Toor and Dana Van Aken and Ziqi Wang and Yingjun Wu and Ran Xian and Tieying Zhang
CIDR 2017, Conference on Innovative Data Systems Research, 2017.
Write-Behind Logging
Arulraj, Joy and Perron, Matthew and Pavlo, Andrew
Proc. VLDB Endow., pages. 337–348, 2016.
Clay: Fine-Grained Adaptive Partitioning for General Database Schemas
Serafini, Marco and Taft, Rebecca and Elmore, Aaron J. and Pavlo, Andrew and Aboulnaga, Ashraf and Stonebraker, Michael
Proc. VLDB Endow., pages. 445–456, 2016.
Larger-than-memory Data Management on Modern Storage Hardware for In-memory OLTP Database Systems
Ma, Lin and Arulraj, Joy and Zhao, Sam and Pavlo, Andrew and Dulloor, Subramanya R. and Giardino, Michael J. and Parkhurst, Jeff and Gardner, Jason L. and Doshi, Kshitij and Zdonik, Stanley
Proceedings of the 12th International Workshop on Data Management on New Hardware, pages. 9:1–9:7, 2016.
Bridging the Archipelago Between Row-Stores and Column-Stores for Hybrid Workloads
Arulraj, Joy and Pavlo, Andrew and Menon, Prashanth
Proceedings of the 2016 International Conference on Management of Data, pages. 583–598, 2016. DOI
Reducing the Storage Overhead of Main-Memory OLTP Databases with Hybrid Indexes
Zhang, Huanchen and Andersen, David G. and Pavlo, Andrew and Kaminsky, Michael and Ma, Lin and Shen, Rui
Proceedings of the 2016 International Conference on Management of Data, pages. 1567–1581, 2016.
TicToc: Time Traveling Optimistic Concurrency Control
Yu, Xiangyao and Pavlo, Andrew and Sanchez, Daniel and Devadas, Srinivas
Proceedings of the 2016 International Conference on Management of Data, pages. 1629–1642, 2016. DOI
S-Store: Streaming Meets Transaction Processing
John Meehan and Nesime Tatbul and Stan Zdonik and Cansu Aslantas and Ugur Çetintemel and Jiang Du and Samuel Madden and David Maier and Andrew Pavlo and Michael Stonebraker and Kristin Tufte and Hao Wang
PVLDB, pages. 2134–2145, 2015.
Reducing Replication Bandwidth for Distributed Document Databases
Xu, Lianghong and Pavlo, Andrew and Sengupta, Sudipta and Li, Jin and Ganger, Gregory R.
Proceedings of the Sixth ACM Symposium on Cloud Computing, pages. 222–235, 2015.
Squall: Fine-Grained Live Reconfiguration for Partitioned Main Memory Databases
Aaron J. Elmore and Vaibhav Arora and Rebecca Taft and Andrew Pavlo and Divyakant Agrawal and Amr El Abbadi
Proceedings of SIGMOD, pages. 299–313, 2015. CODE
Let's Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems
Joy Arulraj and Andrew Pavlo and Subramanya Dulloor
Proceedings of the 2015 International Conference on Management of Data, pages. 707–722, 2015. CODE
Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores
Xiangyao Yu and George Bezerra and Andrew Pavlo and Srinivas Devadas and Michael Stonebraker
PVLDB, pages. 209–220, 2014. CODE
E-Store: Fine-Grained Elastic Partitioning for Distributed Transaction Processing
Rebecca Taft and Essam Mansour and Marco Serafini and Jennie Duggan and Aaron J. Elmore and Ashraf Aboulnaga and Andrew Pavlo and Michael Stonebraker
PVLDB, pages. 245–256, 2014.
A Prolegomenon on OLTP Database Systems for Non-Volatile Memory
Justin DeBrabant and Joy Arulraj and Andrew Pavlo and Michael Stonebraker and Stanley B. Zdonik and Subramanya Dulloor
ADMS @ VLDB, pages. 57–63, 2014.
Anti-Caching: A New Approach to Database Management System Architecture
Justin DeBrabant and Andrew Pavlo and Stephen Tu and Michael Stonebraker and Stanley B. Zdonik
PVLDB, pages. 1942–1953, 2013.
OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases
Djellel Eddine Difallah and Andrew Pavlo and Carlo Curino and Philippe Cudré-Mauroux
PVLDB, pages. 277–288, 2013. CODE
Benchmarking OLTP/Web Databases in the Cloud: The OLTP-Bench Framework
Curino, Carlo A. and Difallah, Djellel E. and Pavlo, Andrew and Cudré-Mauroux, Philippe
CloudDB, pages. 17–20, 2012. CODE
On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems
Pavlo, Andrew and Jones, Evan P.C. and Zdonik, Stan
Proc. VLDB Endow., pages. 85–96, 2011.
Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems
Pavlo, Andrew and Curino, Carlo and Zdonik, Stanley
SIGMOD, pages. 61–72, 2012.
A Comparison of Approaches to Large-Scale Data Analysis
Pavlo, Andrew and Paulson, Erik and Rasin, Alexander and Abadi, Daniel J. and DeWitt, David J. and Madden, Samuel and Stonebraker, Michael
Proceedings of SIGMOD, pages. 165–178, 2009.
The NMI Build & Test Laboratory: Continuous Integration Framework for Distributed Computing Software
Andrew Pavlo and Peter Couvares and Rebekah Gietzel and Anatoly Karp and Ian D. Alderman and Miron Livny and Charles Bacon
LISA, pages. 263–273, 2006.
Pegasus and DAGMan From Concept to Execution: Mapping Scientific Workflows onto Today's Cyberinfrastructure
Ewa Deelman and Miron Livny and Gaurang Mehta and Andy Pavlo and Gurmeet Singh and Mei-Hui Su and Karan Vahi and R. Kent Wenger
HPC and Grids in Action, pages. 56–74, 2008.