Matthew Wachs, Ph.D.

E-mail: Look at the URL of this page. Take everything between the tilde (~) and the following slash (/), and append ".com" to it. Then prepend "misc@" to it.



Research

I worked on performance insulation for shared storage servers. Shared storage servers are an appealing alternative to per-application, dedicated storage systems. However, it is essential that applications sharing a server receive good performance, fairness, and efficiency. Unfortunately, interference between workloads can reduce all three. With a combination of three techniques (timeslicing, amortization, and cache partitioning), we've been able to approach the goal of providing each of n clients 1/n of its standalone throughput, while keeping average response times reasonable [ read more | web site ]. These techniques have been implemented in the Argon storage server. We've also demonstrated how to extend these techniques to a workload that uses multiple servers to store its data [ read more ].
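As a rough illustration of why timeslicing and amortization work together, here is a toy model in Python. It is only a sketch under simplifying assumptions, not Argon's actual policy: it assumes every timeslice begins with a fixed switching cost (for example, seeking to the workload's data), that the rest of the slice runs at standalone speed, and that all n clients get equal slices. The function names and the millisecond units are made up for this example.

```python
def min_timeslice(switch_cost_ms: float, target_efficiency: float) -> float:
    """Shortest timeslice that keeps the useful part of each slice at or
    above target_efficiency, given a fixed (assumed) switching cost."""
    if not 0.0 < target_efficiency < 1.0:
        raise ValueError("target_efficiency must be strictly between 0 and 1")
    # useful / slice >= target  <=>  slice >= cost / (1 - target)
    return switch_cost_ms / (1.0 - target_efficiency)


def share_of_standalone(n_clients: int, timeslice_ms: float,
                        switch_cost_ms: float) -> float:
    """Fraction of its standalone throughput each client receives when
    n_clients take equal turns (toy model)."""
    useful_ms = max(timeslice_ms - switch_cost_ms, 0.0)
    return (useful_ms / timeslice_ms) / n_clients


def worst_case_wait(n_clients: int, timeslice_ms: float) -> float:
    """Longest time a client waits for its next turn in this model."""
    return (n_clients - 1) * timeslice_ms
```

In this model, longer timeslices push each client's share closer to 1/n, but the wait for the next turn grows with the slice length, which is the tension between efficiency and response time mentioned above.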

Our latest work in this area, Cesium, shows how to provide specific bandwidth guarantees to workloads while building on the high efficiency of Argon. A new timeslicing-based scheduler grows or shrinks timeslices depending on the access patterns of workloads to provide them with their specified bandwidth requirements. When a guarantee cannot be met, we are able to differentiate between fundamental violations (those where the workload's access pattern is temporarily too demanding for its guarantee to be met) and avoidable violations. Our scheduler is able to prevent nearly all of the avoidable violations, whereas other approaches that do not explicitly manage efficiency suffer from many avoidable violations when the workloads are complex [ read more ].
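The distinction between fundamental and avoidable violations can be sketched in a few lines of Python. This is one plausible reading of the idea, not Cesium's algorithm: it assumes each workload has a cap on the share of server time it may use (`max_fraction`, a hypothetical parameter) and that we can estimate the rate its current access pattern would achieve with the server to itself (`standalone_mbps`, also hypothetical). All names here are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    guarantee_mbps: float   # bandwidth the workload was promised
    standalone_mbps: float  # assumed rate of its current access pattern when run alone
    max_fraction: float     # hypothetical cap on its share of server time


def needed_fraction(w: Workload) -> float:
    """Share of server time the workload needs to reach its guarantee."""
    return w.guarantee_mbps / w.standalone_mbps


def plan_timeslices(workloads, round_ms: float) -> dict:
    """Grow or shrink each workload's slice to match its current need,
    never exceeding its cap."""
    return {w.name: min(needed_fraction(w), w.max_fraction) * round_ms
            for w in workloads}


def classify_violation(w: Workload, delivered_mbps: float) -> str:
    """Label one measurement interval for one workload."""
    if delivered_mbps >= w.guarantee_mbps:
        return "met"
    if needed_fraction(w) > w.max_fraction:
        # The access pattern is too demanding for the share it may use.
        return "fundamental"
    # The guarantee was reachable within its share; the scheduler fell short.
    return "avoidable"
```

For example, a sequential workload that needs a quarter of the server's time but misses its guarantee would be counted as an avoidable violation, while a random-access workload that would need more than its cap would be counted as a fundamental one.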

My thesis on these topics may be viewed here.

I've also worked on a number of other topics. We explored making it possible to use a file system implementation in one operating system from within another. Not all file systems are available on all operating systems, and porting file systems can be a significant burden for implementers. One type of "porting" is merely maintaining compatibility with newer versions of a kernel; even minor kernel revisions often change file system interfaces enough to require significant effort from developers. While file systems can be exported from one operating system to another using network file systems like NFS, the semantics of these protocols often differ dramatically from those of the file system of interest. If NFS is used, the semantics become the "lowest common denominator." Our solution, which preserves semantics, is File System Virtual Appliances (FSVAs). An FSVA is a virtual machine that hosts a file system, running the operating system the file system's developers prefer. Other virtual machines on the same physical machine can then access the file system as if it were local to them. This is accomplished by installing a relatively simple kernel module in the operating systems of both virtual machines. The module performs VFS forwarding (redirecting kernel file system API calls) between the virtual machines [ read more | web site ].

I've also worked on parallel application I/O tracing for benchmarking. The best benchmark for a real application is the real application itself, or trace replay based on traces from that application. Unfortunately, running the real application against a new or different storage system can be difficult, or even impossible if the application or data set is classified, confidential, or sensitive. Trace replay can be significantly more straightforward and can be done with 'dummy' data. For parallel applications, however, accurate trace replay requires respecting the dependencies between multiple nodes. Thus, it is necessary to discover these dependencies during the trace extraction process. We've proposed and implemented a black-box technique to do this by running a parallel application, slowing down nodes, and observing how other nodes react [ read more | web site ].
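The slow-a-node-and-watch idea can be illustrated with a small Python sketch. It is a simplification for illustration only, not the published technique: it assumes we already have one finish time per node from an unperturbed run and from runs in which a single node was slowed, and it treats any node delayed by more than a chosen threshold as depending on the slowed node. The function name, the data layout, and the threshold are all assumptions made for this example.

```python
def infer_dependencies(baseline, throttled_runs, threshold):
    """Guess which nodes depend on which, from timing shifts.

    baseline:       {node: finish_time} from an unperturbed run
    throttled_runs: {slowed_node: {node: finish_time}} with one run per
                    node that was deliberately slowed
    threshold:      minimum delay (same units) counted as a reaction
    Returns {slowed_node: sorted list of nodes that were delayed}.
    """
    dependents = {}
    for slowed, run in throttled_runs.items():
        dependents[slowed] = sorted(
            node for node, finish in run.items()
            if node != slowed and finish - baseline[node] > threshold
        )
    return dependents


# Made-up timings: slowing node "a" delays "b" but not "c".
baseline = {"a": 10.0, "b": 12.0, "c": 11.0}
throttled_runs = {
    "a": {"a": 15.0, "b": 17.0, "c": 11.1},
    "c": {"a": 10.0, "b": 12.1, "c": 16.0},
}
print(infer_dependencies(baseline, throttled_runs, threshold=1.0))
```

A real tracer would work at the granularity of individual I/O operations rather than whole-run finish times, so that the replayer knows which operation on one node must wait for which operation on another.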



Publications



Support

I appreciate the support I received as a graduate student from a National Defense Science and Engineering Graduate (NDSEG) Fellowship, thanks to the Air Force Office of Scientific Research (AFOSR).



Education

I received my Ph.D. from Carnegie Mellon University. I was a member of the Computer Science Department in the School of Computer Science.

I double-majored in Computer Science and Math in the College of Arts and Sciences at Cornell University.


While I was a student, I enjoyed being a part of a number of interesting courses:

I was a teaching assistant for 15-212 (Fall 2009), Carnegie Mellon's course on functional programming (ML). It was taught by Professor Steven Brookes.

I was a teaching assistant for 15-213 (Fall 2007), Carnegie Mellon's course on computer architecture from a programmer's perspective (such as representation of ints and floats, understanding assembly language, and buffer overflows). It was taught by Professor Todd Mowry and Professor Greg Ganger.

I was a teaching assistant for CS 482 in Spring 2004 with Professor Jon Kleinberg. CS 482 is Cornell's required CS theory course covering algorithms topics such as greedy algorithms, dynamic programming, network flow, and NP-completeness.

I was a teaching assistant for CS 381 in Fall 2003 with Professor John Hopcroft. CS 381 is Cornell's required CS theory course covering finite automata, context-free languages, and Turing machines.



Last Modified: September 2014