Research


Uniquely Represented Data Structures, with Applications

My Ph.D. Thesis: read the abstract or download the whole dissertation (PDF, 2.3 MB).

My thesis work provides the theoretical foundation for efficient uniquely represented data structures. These data structures ensure that the memory layout of the data structure is a function of the logical state it is supposed to encode, rather than a function of the sequence of operations used to create it. Thus each logical state is encoded in a unique machine state. This opens up the possibility, for the first time, of efficient systems that store exactly the information that we ask them to store, and no more. Such systems would eliminate the potential for privacy and security violations enabled by the undesirable storage of extraneous information.

For example, my work provides the theoretical foundation for:

  1. A filesystem where deleting a file provably leaves no trace whatsoever that it ever existed.
  2. A voting machine that stores the set of voters who participated in an election, while provably storing no information about the order in which they voted.
  3. File formats that allow efficient updates while guaranteeing secure redaction of content and removal of history (to avoid embarrasing leaks, as discussed e.g., here, here, and here. )

These applications and more make use of the so-called strong history-independence property of uniquely represented data structures. There are also other applications I am currently looking into, that do not pertain directly to history-independence per se, for example in certifying that certain computations were run.

Online Algorithms, Approximation Algorithms, and Provably Good Ways to Deal with Uncertainty

Most of the research questions I like to work on are algorithmic. In fact, my interest in uniquely represented data structures began with a fundamentally algorithmic question. Thus much of my research concerns the development of approximation algorithms for NP-hard problems, online algorithms, and the combination of the two. Typically this work focuses on resource allocation problems, often in the presence of uncertainty. Please see my publications page for more information.

Recently I have been doing some work with Matthew Streeter fusing approximation algorithms for NP-hard problems with no-regret algorithms for online decision problems. I am very excited about this work. To see why, have a look at some of the resulting papers here, here, and here.

Older Work

Below is a description of my research written in my first year of grad school. It may be a bit outdated, but it will give you a fair idea of my background and interests as a researcher.

My research so far mainly centers around approximation algorithms, often for problems with a graph theoretic feel. Here are examples, drawn from two broad classes of problems:

  1. Allocation Problems: In this class of problems, one must allocate a known set of resources to maximize a known function. I am considering the problem of fairly allocating K discrete (i.e indivisible) good to N agents, where K > N. If the agent's utility functions are known up-front, it is easy to maximize the total utility of all the agents -- it is less clear how to efficiently compute an allocation that is fair, for example, one that maximizes the minimum utility of any agent under that allocation. Solutions to these problems would find clear application in increasing efficiency of markets (e.g. as they relate to finding Pareto improvements) and intra-organization allocation of goods and services.

  2. Stochastic Decision Problems: In this class of problems, the algorithm makes decisions in stages. At the end of each stage, the environment changes according to some stochastic process. The algorithm must intelligently handle the changes in its environment, and perform well in expectation. Essentially, these problems are about uncertainty in the world, and involve searching for provably good ways to mitigate the cost of such uncertainty, as well as determining the its true cost (i.e. how much would knowing the future allow us to save with better planning?). An example problem in this class is that of building a path in a network subject to failures. In this problem, you must have a path from A to B by the end of tommorow. You can buy edges cheaply today (but they may fail) or you can buy edges at a much greater cost tommorow (in which case they won't fail). Your objective is to minimize cost. Why is this problem important? One could imagine a business needing a certain capability tomorrow, provided by a complex system. This system can be modelled as in realibility theory as a network of components, each with a failure probability. The system fails if and only if there is no A to B path in it. Thus, if components are less expensive today then tomorrow, one variant of the stochastic path problem essentially answers the problem "Which components and how many spare components of each type should you buy today to minimize your expected costs?"