Nikhil Swamy is a Senior Researcher in the RiSE Group at Microsoft Research Redmond. In this interview we talk about his work on using dependent types for writing verifiably secure programs, including the origins and evolution of the F* programming language. Fun fact: I was Nik's intern when he first started working on F*, in 2010. Even without this connection, I would still think that F* is one of the hottest research languages right now. Its type system is really pushing the boundary of what it is possible to prove in an automated way.
JY: Tell us about yourself.
NS: I grew up in India. I lived there until I was 16 and then I spent a couple of years in the UK and then I came to the US. I was an undergraduate at Hampshire College and then I worked for a couple years as a software engineer in the Boston area.
And then I was a PhD student at the University of Maryland working with Mike Hicks. Working with Mike it quickly became clear to me one place in which automated reasoning could work well is in proving many simple theorems about complicated programs. That was a big moment for me. And so I did a PhD with Mike on designing systems that let you prove simple security theorems about programs. I was interested then in simple forms of dependent typing to prove information flow properties and access control properties.
I came to Microsoft Research in 2008. I'm still working in very much the same space.
JY: I was hoping you could talk about how F* came about and how it's evolved.
NS: It started as an outcropping of my thesis. My thesis was about how could you add just the slightest amount of dependent typing that you could get away with into a full-fledged programming language that had effects, while still being able to keep the complexity low and still being able to use it to prove interesting properties. It had what people call a value-dependent type system. There was no computation in the types or anything. It was simple enough that it was easy to formalize and easy to implement, but also quite useful. When I came to MSR there was a group at MSR Cambridge working a language called F7 which had quite some similarities to this language I was developing in my thesis, which I called Fable. F7's main distinctive feature was that it used an SMT solver, whereas in Fable you had write out all these proofs very painfully by hand.
The first steps towards F* was putting these languages together and making SMT work really well in a value-dependent type system. Initially my goals were certainly very much along the lines of maximizing your bang for your buck. Can you add as little as possible in terms of fancy typing features while still getting the ability to prove a bunch of things? We started by trying to use this thing to prove simple properties using a relatively simple type system. And of course we got interested in more and more interesting properties. And so we wanted a more and more powerful type system. And now we're at a point where we, it's very close in expressiveness to something like Coq or Agda or Lean or any other full fledged dependent type system. But it's still got SMT very much at its core. And it's still got effects very much at its core.
JY: What's your vision for who would be the ultimate users of F*?
NS: It's for people who are motivated to build and have the resources to build very high assurance systems. It’s still a very small domain and a very small group of experts who can actually do this effectively. The POPL community is very much a part of this crowd. But the systems community is coming along and the systems conferences these days tend to have one or two verification tracks. The crypto community is coming along quite nicely.
In a way our ambition is to bridge the gap between programming and proving in as seamless a way as possible. But we're still quite far from saying, "Take your arbitrary whatever program you've been writing and start to incrementally add verification to it."
JY: How has the reception for this kind of work changed over the years?
NS: As the tooling has improved, people from other communities have become more receptive to this kind of work. Notably the systems and the crypto communities. And that's given this line of work an external motivation. And that means for the people providing the logical foundations and the tooling to get the stuff done, there are some goal posts now. That's nice.
On the F* side that driving force has been "Project Everest". We're trying to verify stuff related to security protocols and crypto algorithms in the HTTPS ecosystem. And that's meant getting F* to work in a scenario where you're writing performance- and security-critical networking protocols. Where you’re also doing efficient low level arithmetic on custom representations of bignums. At the same time you're also getting into modeling game-based cryptography where you're trying to prove equivalences between different effectful programs. All while trying to extract efficient programs that execute in C or assembly. So, being committed to these kinds of applications, we’re forced to make the language work well in several related, but still quite diverse, settings. It's great to have that kind of application driving the design of the language.
JY: How would you say your views on what makes a good problem has changed over the years?
NS: I was at some point of the mindset that you need to be able to get your tools in the hands of people who are working everyday programmers. There's a place for that kind of research but for me, that kind of research was not necessarily compatible with doing the most far out stuff. I've become more comfortable with doing work that's catered towards experts. If you empower a small number of people to do something that could then influence a very large number of people, that's still okay.
For instance, so right now, we're maybe 20, 25 people in Project Everest using F* very heavily, and then some others who are tinkering with it here and there. It's still a very small community of users. But the thing that we're hoping to build as a result of this stuff are protocols that could be used by by anybody browsing the internet. They don't have to know what's inside these components or how the components were produced. The way in which you amplify your impact is at one level of indirection.
JY: Do you see POPL as having a core identity?
NS: POPL has evolved more in recent years to be about proving things. It used to be more about clean language designs and principles about programs in general.
Ultimately programming is a very applied discipline and programming is changing all the time. But people are programming all kinds of new things all the time. We should expect an applied conference like POPL to reflect the kind of evolving interpretations of what programming actually means. It could do that much better.
I'm a good example of this. I mean, pick this area that you like, that you find fascinating and work in it. Do roughly the same thing for 10 years or whatever, right?
Meanwhile, the thing that I learned about this last weekend was about twitch.tv. Twitch is apparently a top 20 website in the US where you can watch other people playing video games. Apparently half of millennial males in the US spend on average two hours a day on this website. First of all, to spend two hours a day on any website is insane. But I'm also thinking of it from a programming perspective. This website is streaming high def content on their machines to millions of other people doing this in real time with live feeds and chats and streams going in every which direction. And the whole thing is driven around yet another programming experience, which is the actual building of the games. That's some whole kind of other programming. That's apparently super pervasive.
I hope that POPL continues to evolve and that it's this kind of broad tent where there'll be people thinking about how to do streaming websites. I guess people are thinking about this in the POPL community but I've not really seen a paper on programming for streaming rich multimedia content. I'm not sure that's a big topic at POPL. Maybe it should be.
JY: What do you think are the most important or interesting problems that the POPL community can work on in the next few years?
NS: Of all the languages that make it somehow, the ones that are popular are very rarely designed by POPL like people. They're usually designed by people who are working in some domain and have a real itch to scratch and they build something. And it turns out that's a widely felt need and a bunch of people jump on it and start to use it. And then POPL finds out about it after it's caught on. And then we're faced with trying to make sense of it. Now how can we, as a community, inject ourselves earlier into that process?