CSD

Concurrency and randomization are two programming features that are notoriously difficult to use correctly. This is because programs that use them no longer behave deterministically, so programmers must take into account the set of all possible interactions and random choices that may occur. A common approach to reasoning about complex programs is to use relational or refinement reasoning: to understand a complex program, we first prove a correspondence between its behavior and that of some simpler program. Then, we can analyze or verify properties of this simpler program and draw conclusions about the original program.

Although logics have been developed for relational reasoning for concurrent programs and randomized programs, no logics exist for reasoning about programs that are both concurrent and randomized. I propose developing a program logic that supports reasoning about both of these features. Moreover, I argue that such a logic is more than just an ad-hoc combination of features: instead, the ideas of separation and resources, which are already central to modern concurrency logics, also provide a foundation for probabilistic reasoning.

Thesis Committee:
Robert Harper (Chair)
Jan Hoffmann
Jeremy Avigad
Derek Dreyer (MPI-SWS)

Copy of Thesis Summary

Riemannian geometry provides a vast toolbox for the numerical treatment of nonlinear optimization problems. Alas, some infinite dimensional problems, e.g., those related to curvature energies of immersed surfaces, can hardly be put into a Riemannian context. Still, many techniques from Riemannian optimization perform very well even in a suitable Banach manifold setting–at least experimentally.

In this talk, we demonstrate the efficiency of the method of gradient descent for the minimization of the Willmore energy. Instead of L2–bilinear forms (which lead to the parabolic Willmore flow), we utilize a H2–bilinear form to define gradients. Although the Willmore energy is not defined for general surfaces of class H2, we show that in the space of immersions of class W2,p with p > 2, the gradient vector field is well-defined and induces an ordinary differential equation. In case of equality constraints, we also provide conditions for the existence of the projected gradient flow.

Host: Keenan Crane

The automatic generation of programs from non-executable specifications would have a tremendous impact on software engineering, and in particular would help to tackle security problems. However, with program synthesis methods being limited to a few lines of code at a time, we are still far away from usefulness. In this talk I will discuss fundamental problems with existing program synthesis algorithms, such as enumerative program synthesis and CEGIS, and suggest a new algorithmic principle to overcome these problems. For that we will resort to a simplified setting, quantified Boolean formulas (QBFs), where the suggested algorithm already significantly improves over the state of the art in terms of the number of solved instances, solving time, and the size of certificates. In the last part of the talk I will present some initial, promising experiments for program synthesis.

Markus N. Rabe is a postdoc at UC Berkeley, where he works with Prof. Sanjit Seshia. In his research he focuses on the fundamental aspects of program synthesis and security. His recent works include a completely new approach to quantifier elimination that tackles the scalability problem of program synthesis and an approximate probabilistic optimization algorithm to find inputs to programs that cause a worst-case leakage. In his PhD at Saarland University under the supervision of Prof. Bernd Finkbeiner he developed HyperLTL, a temporal logic capable of expressing a wide range of properties from information-flow control, and new model checking algorithms to automatically check information-flow properties for complex hardware systems.

Faculty Host: Jean Yang

In an era of big data, the rapid growth of data that many companies and organizations produce and manage continues to drive efforts to improve the scalability of storage systems. The number of objects presented in storage systems continue to grow, making metadata management critical to the overall performance of file systems. Many modern parallel applications are shifting toward shorter durations and larger degree of parallelism. Such trends continue to make storage systems to experience more diverse metadata intensive workloads.

The goal of this dissertation is to improve metadata management in both local and distributed file systems. The dissertation focuses on two aspects. One is to improve the out-of-core representation of file system metadata, by exploring the use of log-structured multi-level approaches to provide a unified and efficient representation for different types of secondary storage devices (e.g., traditional hard disk and solid state disk). We have designed and implemented TableFS and its improved version SlimFS, which shows 50% to 10x faster than traditional Linux file systems. The other aspect is to demonstrate that such representation also can be flexibly integrated with many namespace distribution mechanisms to scale metadata performance of distribution file systems, and provide better support for a variety of big data applications in data center environment. Our distributed metadata middleware IndexFS can help improve metadata performance for PVFS, Lustre and HDFS by scaling to as many as 128 metadata servers.

Thesis Committee:
Garth Gibson (Chair)
Gregory R. Ganger
David G. Andersen
Brent B. Welch (Google, Inc.)

An important feature of functional programs is that they are parallel by default. Implementing an efficient parallel functional language, however, is a major challenge, in part because the high rate of allocation and freeing associated with functional programs requires an efficient and scalable memory manager.

In this talk, we present a technique for parallel memory management for strict functional languages with nested parallelism. At the highest level of abstraction, the approach consists of a technique to organize memory as a hierarchy of heaps, and an algorithm for performing automatic memory reclamation by taking advantage of a disentanglement property of parallel functional programs. More specifically, the idea is to assign to each parallel task its own heap in memory and organize the heaps in a hierarchy/tree that mirrors the hierarchy of tasks.

We present a nested-parallel calculus that specifies hierarchical heaps and prove in this calculus a "disentanglement" property, which prohibits a task from accessing objects allocated by another task that might execute in parallel. Leveraging the disentanglement property, we present a garbage collection technique that can operate on any subtree in the memory hierarchy concurrently as other tasks (and/or other collections) proceed in parallel. We prove the safety of this collector by  formalizing it in the context of our parallel calculus. In addition, we describe how the proposed techniques can be implemented on modern shared-memory machines and present a prototype implementation as an extension to MLton, a high-performance compiler for the Standard ML language. Finally, we evaluate the performance of this implementation on a number of parallel benchmarks.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Recent progresses on a number of combinatorial and numerical problems benefited from combining ideas and techniques from both fields to design faster and more powerful algorithms. A prime example is the field of spectral graph theory, which involves the interplay between combinatorial graph algorithms with numerical linear algebra, and this led to the first nearly linear time solvers for graph Laplacians and symmetric and diagonally dominant (SDD) linear systems.

In this thesis we present several combinatorial algorithms that allow us to tap into spectral properties of graphs. In particular we present:

- Improved parallel algorithm for low diameter decomposition via exponential shifts.
- Parallel algorithm for graph spanners with near optimal stretch trade-offs and its application to spectral graph sparsification.
- Improved low stretch tree embeddings that are suitable for nearly linear time graph Laplacian solvers.
- Work efficient parallel algorithms for hopset and approximate shortest path.

A common goal we strive for in the design of these algorithms is to have complexity nearly linear in the input size and to be work efficient, in order to be scalable to the evergrowing amount of data and problem sizes in this day and age.

Thesis Committee:
Gary Miller (Chair)
Bernhard Haeupler
Daniel Sleator
Noel J. Walkington (Math)
Ioannis Koutis (University of Puerto Rico)

The Cloud Computing model has incentivized companies to outsource services to third-party providers. Service owners can use third-party computational, storage and network resources while avoiding the cost of acquiring an IT infrastructure. However, they have to rely on the trustworthiness of the third-party providers, who ultimately need to guarantee that the services run as intended.

This thesis shows how to secure the execution of large-scale services. From the perspective of a client that sends a request and receives a response, trust can be established by verifying a small proof of correct execution that is attached to the result. On the remote provider’s platform, a small trusted computing base enables the secure execution of generic services composed of a large source code and/or working on large data sets, using an abstraction layer that is implementable on diverse trusted hardware architectures.

Our small TCB implements three orthogonal techniques that are the core contributions of this thesis. The first one targets the identification (and the execution) of only the part of code that is necessary to fulfill a client’s request. This allows an increase both in security and efficiency by leaving any code that is not required to run the service outside the execution environment. The second contribution enables terabyte-scale data processing by means of a secure in-memory data handling mechanism. This allows a service to retrieve data that is validated on access and before use. Notably, data I/O is performed using virtual memory mechanisms that do not require any system call from the trusted execution environment, thereby reducing the attack surface. The third contribution is a novel fully-passive secure replication scheme that is tolerant to software attacks. Fault-tolerance delivers availability guarantees to clients, while passive replication allows for computationally efficient processing. Interestingly, all of our techniques are based on the same abstraction layer of the trusted hardware. In addition, our implementation and experimental evaluation demonstrate the practicality of these approaches.

Thesis Committee:
Peter Steenkiste (Co-Chair)
Nuno Neves (Co-Chair, University of Lisbon)
Anupam Datta
Vyas Sekar
Antonia Lopes (University of Lisbon)

Pages

Subscribe to CSD