Chasing the Dragon of OLTP Databases // Blog // Andy Pavlo

TL;DR

I am going through a major change in my life where OLTP throughput is no longer the be-all and end-all of database research.

When I was younger there was this "rave" kid (yes, this was a while ago) that lived in my apartment complex and he had a big drug problem. He would take a lot of pills and then prance around with glow sticks in the parking lot. His signature "move" to impress the ladies was to stick a vibrator inside of a teddy bear, turn it on, and then throw it at girls he liked. He was a mess.

One day this kid disappeared. His friends later told me that he went into rehab to get clean. When he came back he was much more mellow and pleasant to be around. No more pills. No more cargo pants with zippers. And especially no more of that stupid vibrating bear.

I thought that I was better than him and that I would never have the same kind of problems in my life. But it's now time for me to come clean about my own addiction from the last eight years. I am a bit ashamed to have to admit this. I've talked it over with my sponsor and he said that this was for the best.

The Beginning

You never forget the first time that you are able to beat MySQL's performance with your new database system. For me it was the first incarnation of H-Store in 2008. John Hugg and I presented a demo at VLDB'08 in New Zealand^[1].

Our early system could do about 5000 TPC-C txn/sec on a single laptop, whereas MySQL's throughput was around 400 txn/sec. Granted, MySQL was doing logging and other things that our system was not, but previous research showed that this only accounted for about 30% of its slowdown. It gave me this great rush and feeling of accomplishment. I wanted more.

A few years later I published the next H-Store paper where I could get about 11,000 txn/sec. This 2011 version of H-Store was a bit of a freakshow because it was using Evan Jones' external transaction coordination framework in a way that it wasn't intended to be used. Furthermore, these results were running on David DeWitt's old cluster at Wisconsin. But it got the job done. We were able to get even better results in 2012 when H-Store could do 50,000 TPC-C txn/sec. I thought I was invincible. I thought that making transactions go faster was the reason why I was put on this planet. I loved the way that these throughput numbers ran through my veins. But it never gave me the same feeling that I got with the first paper. No matter what I tried, it just wasn't the same.

My entire graduate school career was focused on trying to get better OLTP throughput numbers. Sure we had to handle recovery in the system and we dabbled with OLAP workloads for a bit. But those things never made me feel the same way that serializable transactions did. I didn't care about anything else during this time. I ignored my family for years. I started skipping meals. I sold my bone marrow to pay my Amazon AWS bills^[2]. I got banned from the Providence Mall Apple Store because I tried to use their laptops to get H-Store to build on OSX. I started carrying a gun because I thought that the Shore-MT people were going to come after me and steal my ideas.

When Things Got Bad

H-Store was designed at a time when off-heap memory and other optimizations with the JVM were not common. Thus, it had some architectural issues that prevented it from scaling up to higher core counts on a single node. Scaling out didn't always work if there were multi-partition transactions^[3].

Then two new systems came out: Silo and Hekaton. These systems were able to execute over a million txn/sec on a single machine. H-Store would never be able to do that because of the JVM. But I still wanted to get my high.

By now I was at CMU, so I could have students get me my throughput fix. The first system was the DBx1000 testbed written by an MIT PhD student that I co-advise (Xiangyao Yu^[4]). The second was a prototype DBMS, called N-Store, written by my PhD student Joy Arulraj^[5]. N-Store was designed for non-volatile memory and in our SIGMOD'15 paper we were able to get 50,000 txn/sec with full durability on eight cores. DBx1000 is super optimized for many-core CPUs and could do over 4 million txn/sec on a subset of TPC-C (no durability) with 40 cores.

In hindsight these are amazing performance numbers. Any database professor would be pleased to have their students write systems from scratch that can do this, especially as a new assistant professor that was just starting out. But I was a monster. I yelled at them that it wasn't good enough. I wanted higher throughput numbers. I wanted to get that same rush I did with my first H-Store experiment. I made them spend long hours to try to squeeze out the last bit of performance. I would sit down with them and make them explain to me the purpose of every spin lock. Things got so bad that I ended up beating one of my students (I won't say which one) with the IV pole that I had from when I was selling my urine to people that needed to pass drug screening tests at their job.

This is when I realized I had a problem.

Moving On

After going through court-mandated rehabilitation and treatment, I now see the error of my ways. Foremost is that I recognize that physically beating my students was a bad idea. I also acknowledge that doing research for the sole purpose of achieving higher OLTP throughput is ultimately an unfulfilling endeavor. You can work on this problem forever but there is no end to it. There is always going to be some trick, optimization, or technique that will allow you to get better throughput. But it never will feel like the first time. I feel like I have reached the point where I am comfortable with what we can achieve in terms of throughput. Please note that I am not claiming that there will never be applications that need millions of transactions per second. And there is still plenty of work to do on achieving this kind of OLTP performance while also executing OLAP operations on the same database. What I am saying is that I think that we have reached a good stable point in this research area for today's hardware and that it is time to work on other problems.

And so what is the next thing? I strongly believe one research area will be database administration automation. The idea that we pay human DBAs to babysit software should go away. Microsoft did amazing work in this area a decade ago, but they never went all the way with a completely autonomous system. I will have more to say about this in the months to come.

Footnotes

[1] Although John and I presented the demo, there was a large team that helped us this first year.
[2] Selling your bone marrow is legal in Nevada.
[3] This problem with H-Store's concurrency control protocol is well known and documented.
[4] Xiangyao will be on the job market at the end of 2016. Yes, that is four months from now. He is awesome.
[5] Joy will be on the job market at the end of 2017. If you don't get Xiangyao, then you are going to want to hire Joy.