Matthew Wachs
Fourth-Year Ph.D. Student (as of Fall 2007)
Computer Science Department
School of Computer Science
Carnegie Mellon University
Parallel Data Laboratory
E-mail: Look at the URL of this
page. Take everything between the tilde (~) and the following slash
(/), append "+web" to it, then append "@cs.cmu.edu" to it.
Alternatively, look for my email address at or near the top of this list.
Current Research
I'm working on performance insulation
for shared storage servers. Shared storage servers are an
appealing alternative to per-application, dedicated storage systems.
However, it is essential that applications sharing a server receive
good performance, fairness, and efficiency. Unfortunately, interference
between workloads may reduce all three of these. With a combination of
three techniques, we've been able to approach the goal of providing
each of n clients 1/n of their standalone throughput,
while keeping response times reasonable [read
more | web site].
The next step is
to provide similar guarantees to a workload using multiple servers to
store its data (using RAID or erasure coding).
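As a rough illustration of the 1/n goal described above, the following sketch computes each workload's target throughput from its standalone throughput and reports how close a shared-run measurement comes to that target. The function names, workload names, and numbers are hypothetical; none of them come from the actual system or its measurements.

```python
def insulation_targets(standalone):
    """Return the per-workload target: 1/n of its standalone throughput."""
    n = len(standalone)
    if n == 0:
        raise ValueError("need at least one workload")
    return {name: tput / n for name, tput in standalone.items()}


def fraction_of_target(standalone, shared):
    """Return shared-run throughput divided by the 1/n target, per workload."""
    targets = insulation_targets(standalone)
    return {name: shared[name] / targets[name] for name in standalone}


# Hypothetical throughputs in MB/s, for illustration only.
standalone = {"scan": 60.0, "oltp": 4.0}
shared = {"scan": 27.0, "oltp": 1.8}

targets = insulation_targets(standalone)
ratios = fraction_of_target(standalone, shared)
for name in standalone:
    print(f"{name}: target {targets[name]:.1f} MB/s, "
          f"achieved {ratios[name]:.0%} of target")
```

A ratio near 1.0 for every workload would mean the sharing overhead is small; a ratio well below 1.0 for some workload would indicate interference.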
I'm also working on parallel application I/O tracing for benchmarking.
The best benchmark for a real application is the application itself, or
replay of traces captured from that application. Unfortunately, running the real
application against a new or different storage system can be difficult,
or even impossible if the application or data set is classified,
confidential, or sensitive. Trace replay can be significantly more
straightforward and can be done with 'dummy' data.
For parallel applications, however,
accurate trace replay requires respecting the dependencies between
multiple nodes. Thus, it is necessary to discover these dependencies
during the trace extraction process. We've proposed and implemented a
black-box technique to do this by running a parallel application,
slowing down nodes, and observing how other nodes react [read more | web site].
The next step is to design a technique that involves less tracing time
and can handle applications whose request patterns change when the
timing of nodes is altered.
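The throttling idea above can be illustrated with a toy sketch: given per-node completion times from a baseline run and from runs in which one node was deliberately slowed, flag the nodes whose completion times shift noticeably as likely dependents of the slowed node. This is a simplified illustration under assumed inputs, not the implementation of the published technique; the function name, the threshold, and all timings are hypothetical.

```python
def infer_dependents(baseline, throttled_runs, threshold=0.10):
    """Map each throttled node to the set of other nodes that slowed down.

    baseline:       {node: completion_time} from an unmodified run
    throttled_runs: {slowed_node: {node: completion_time}}
    threshold:      relative slowdown treated as a reaction (assumed value)
    """
    dependents = {}
    for slowed, times in throttled_runs.items():
        reacted = set()
        for node, t in times.items():
            if node == slowed:
                continue  # the throttled node is slow by construction
            if (t - baseline[node]) / baseline[node] > threshold:
                reacted.add(node)
        dependents[slowed] = reacted
    return dependents


# Hypothetical completion times in seconds for a three-node run.
baseline = {"n0": 10.0, "n1": 10.0, "n2": 10.0}
throttled_runs = {
    "n0": {"n0": 15.0, "n1": 14.8, "n2": 10.1},  # n1 follows n0; n2 does not
    "n1": {"n0": 10.0, "n1": 15.0, "n2": 10.2},  # no other node waits on n1
}
deps = infer_dependents(baseline, throttled_runs)
```

Note that a comparison this simple needs one extra run per throttled node, which hints at why reducing tracing time is a natural next step.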
My advisor is Professor
Greg Ganger in the Parallel Data
Laboratory at CMU.
Publications
- Modeling the relative fitness of storage. Michael Mesnier,
Matthew Wachs, Raja R. Sambasivan, Alice X. Zheng, Gregory R. Ganger.
Proceedings of the Joint
International Conference on Measurement and Modeling of Computer
Systems (SIGMETRICS'07). June 12th–16th 2007, San Diego, CA.
Awarded Best Paper
Abstract / PDF [235K]
- Argon: Performance Insulation
for Shared Storage Servers.
Matthew Wachs, Michael Abd-El-Malek,
Eno Thereska, Gregory R. Ganger. Proceedings of the 5th USENIX
Conference on File and Storage Technologies (FAST '07),
February 13–16, 2007, San Jose, CA. Supersedes Carnegie Mellon
University Parallel Data Lab Technical Report
CMU-PDL-06-106, May 2006.
Abstract
/ PDF [167K]
- //TRACE: Parallel Trace
Replay with Approximate Causal
Events.
Michael Mesnier, Matthew Wachs, Raja R. Sambasivan, Julio Lopez, James
Hendricks, Gregory R. Ganger, David O'Hallaron.
Proceedings of the 5th USENIX Conference on File and Storage
Technologies (FAST '07),
February 13–16, 2007, San Jose, CA. Supersedes Carnegie Mellon
University Parallel Data Lab Technical Report
CMU-PDL-06-108, September 2006.
Abstract
/ PDF [187K]
- Early Experiences on the Journey Towards Self-* Storage.
Michael
Abd-El-Malek, William V. Courtright II, Chuck
Cranor, Gregory R. Ganger, James Hendricks, Andrew J. Klosterman,
Michael Mesnier, Manish Prasad,
Brandon Salmon, Raja R. Sambasivan, Shafeeq Sinnamohideen, John D.
Strunk, Eno Thereska, Matthew Wachs, Jay J. Wylie.
Bulletin of the IEEE Computer Society Technical Committee on Data
Engineering, September 2006.
Abstract
/ PDF [113K] / Postscript
[745K]
- Stardust: Tracking Activity in a Distributed Storage System.
Eno Thereska, Brandon Salmon, John Strunk, Matthew Wachs, Michael
Abd-El-Malek, Julio Lopez, Gregory R. Ganger. Proceedings of the Joint
International Conference on Measurement and Modeling of Computer
Systems (SIGMETRICS'06). June 26th–30th 2006, Saint-Malo, France.
Abstract / PDF [578K]
- Relative fitness models for storage. Michael Mesnier,
Matthew Wachs, Brandon Salmon, Gregory R. Ganger. SIGMETRICS
Performance Evaluation Review (Vol 33, No 4, pg 23-38). March, 2006.
- Ursa Minor: Versatile Cluster-based Storage.
Michael Abd-El-Malek, William V. Courtright II, Chuck Cranor, Gregory
R. Ganger, James Hendricks, Andrew J. Klosterman, Michael Mesnier,
Manish Prasad, Brandon Salmon, Raja R. Sambasivan, Shafeeq
Sinnamohideen, John D. Strunk, Eno Thereska, Matthew Wachs, Jay J.
Wylie. Proceedings of the 4th USENIX Conference on File and Storage
Technologies (FAST '05). San Francisco, CA. December 13–16, 2005.
Supersedes Carnegie Mellon University Parallel Data Lab Technical
Report CMU-PDL-05-104, April, 2005.
Awarded Best Paper
Abstract
/ PDF
[490K]
Please visit the PDL web site if the
above links do not work.
Support
I appreciate the support of an
NDSEG
(National Defense Science and Engineering) Graduate Fellowship,
thanks to
the Air Force Office of Scientific
Research
(AFOSR).
Education
I double-majored in Computer
Science and Math in the
College of Arts and Sciences
at Cornell University. I
graduated in May, 2004.
Teaching
I was a teaching assistant for
15-213 (Fall 2007), Carnegie Mellon's course on
computer architecture from a programmer's perspective (such as representation
of ints and floats, understanding assembly language, and buffer overflows). It
was taught by Professor Todd Mowry
and Professor Greg Ganger.
I was a teaching assistant for CS 381 in Fall 2003 with Professor John Hopcroft.
CS 381 is Cornell's required CS theory course covering finite automata,
context-free languages, and Turing machines.
I was a teaching assistant for CS 482 in
Spring 2004 with Professor
Jon Kleinberg. CS 482 is Cornell's required CS theory course
covering algorithms topics such as greedy algorithms, dynamic
programming, network flow, and NP-completeness.
Last Modified: February 15, 2008