James Bornholt is a fourth-year PhD student in the PLSE group at the University of Washington, advised by Emina Torlak, Dan Grossman, and Luis Ceze. He works on programming languages and formal methods, focusing on automated program synthesis and verification. His recent work has shown how to synthesize memory consistency models, formalize and prove file system crash safety, and (on a different note) store digital data in DNA.
JY: Tell us about yourself. How did you get here?
JB: I grew up in Australia, and got my undergraduate degree from the Australian National University. While there, I got the chance to work with Steve Blackburn, who kindly took me on and let me loose hacking on Jikes RVM (a metacircular Java virtual machine), which was a blast. He also introduced me to Kathryn McKinley, who invited me to do an internship at Microsoft Research that turned into my first paper, at ASPLOS, about Uncertain<T>, a type system for propagating uncertainty (from sensors, machine learning, etc.) through computations. Through this work I realized I really like digging into programming language internals, and Steve and Kathryn convinced me I might enjoy grad school (they were right!).
JY: Tell us about the paper you had in POPL 2016, about optimizing synthesis with metasketches. How did that work come about?
JB: A lot of my work deals with program synthesis, the problem of automatically generating a program given a specification of what it should do. Solving a synthesis problem is hard: most approaches explicitly search the (extremely large!) space of possible programs for one that satisfies the specification. One standard way of specifying a synthesis problem is as a “sketch” describing the space of programs to search, but this relies on the synthesis engine to infer the search space’s structure. Metasketches, which we introduced in our paper, layer structure onto the search space explicitly by defining it as a family of sketches. The individual sketches can be solved separately (or even in parallel) and more efficiently. This added structure is especially important when looking for an optimal solution, since the engine can prune the search space by removing expensive sketches from the family entirely.
When I first started at UW, I worked on approximate computing, the idea that programs can be much more efficient if allowed to be slightly incorrect. We wanted to use program synthesis to generate approximate programs automatically. I naively assumed this would be easier than synthesizing exact programs—approximate programs are allowed to be wrong!—but it’s actually quite difficult. We needed new techniques to make this synthesis scale, and it turned out that what we came up with—metasketches—wasn’t specific to approximation at all.
One special feature of this paper is that it’s the first POPL paper (to my knowledge) to feature a photo of a cat. :-)
JY: What is your thesis topic? How did you come to work on it?
JB: I’m incredibly excited about the effectiveness of recent solver-aided tools—tools that reduce programs to satisfiability queries that can be automatically checked—for verifying and synthesizing applications. The fact that someone can pick up a language like Racket or Haskell and get automatic verification and synthesis almost for free is a real triumph of formal methods and of programming language design. But victories in building solver-aided tools atop these languages are still hard-won; anyone who’s built one (and I’ve built a few) has tales of the magic incantations needed to make them scalable. The core problem is that scaling these tools requires understanding and tuning a complex software stack, including symbolic evaluation, a host language, an SMT solver, etc. Often the best solution is to modify these layers to expose the level of control necessary for your particular problem.
My thesis work is on better abstractions for use in building solver-aided tools, so that programmers don’t have to keep diving down into this complex software stack to make things work. Metasketches were the first part of this effort. Since then, we’ve been working on abstractions for symbolic evaluation itself. At the core of a solver-aided tool is a symbolic evaluation engine that encodes a program’s semantics as logical constraints. But these engines aren’t silver bullets: they require careful use to build a tool that works in practice. So how should a symbolic evaluation engine explain its performance to the programmer? How should it expose control over symbolic evaluation to allow tuning? These questions grew out of my own problems building solver-aided tools and teaching others to do the same.
JY: You've worked on a lot of different topics with a lot of people. How do the different projects come about? What is the unifying theme for you in all this work?
JB: When I first started at UW I wasn’t sure whether to work on programming languages or on systems and architecture. In my first quarter I took a new class on automated reasoning from my now-advisor Emina Torlak (who had also just joined the school), and really enjoyed the material. We kept working together on what became metasketches, and I learned a lot about building efficient automation for domain-specific languages. Most of my current work revolves around that same idea.
That expertise has also been really valuable in helping me find new projects: people ask me for suggestions about how to automate a PL problem they’re having, and some of those conversations have turned into fruitful collaborations in PL, systems, and architecture. I like being able to bring this automation toolkit to bear on systems and architecture problems. For example, our most recent paper shows how to automatically synthesize memory consistency models, which you need for verification of multiprocessor code. That’s solving an architecture problem (“what does my chip do?”), but along the way we needed new PL machinery to make it scale.
JY: What problems are you most excited to see the community solve in the next five to ten years?
JB: It’s a pretty common knock against formal verification to point out that we shouldn’t trust the specification any more than the implementation. One of the most interesting papers I read recently that responds to this objection is “Who guards the guards? Formal Validation of the ARM v8-M Architecture Specification”, by Alastair Reid at ARM. This work shows that we can improve confidence in specifications by proving cross-cutting properties that capture developer intent. But these properties tend to be the most difficult to prove, especially automatically. So if we want formal verification technology to be mainstream, we have more work to do on building specifications people can understand and use—and I’m excited to see the community continue to tackle this problem.
JY: Some of your work is quite applied. What do you see as the relationship between your more theoretical programming languages work and that other work?
JB: I’m lucky that most of my work falls into the intersection of POPL and more applied venues, so would find a comfortable home in either category. Even when that’s not true, there are often PL problems lurking that could make the jump. For example, I was lucky to be involved in a project here at UW to build a data storage system using DNA. I jumped in because it sounded cool, not because I necessarily expected any PL results. It led to a nice ASPLOS paper, but since then, other students here have been working on problems that I think would make great POPL papers: how do you program a molecular system like this one, where all operators are stochastic and whose efficiency varies depending on their inputs?
JY: What do you like most about the POPL community?
JB: I’m really happy with how seriously our community takes mentoring of new researchers. Attending your first conference, or submitting your first paper, or receiving your first rejection can be really intimidating. I was lucky to have great mentors help me through these steps, and though I still consider myself pretty new, I’ve had the chance to help bring newer folks into the fold too. These days we even have mentoring workshops—I learned a lot from them both as an attendee and a presenter, and I really wish the other communities I work in would follow our lead here.