CS Thesis Proposal

HYEONTAEK LIM
Computer Science Department
Carnegie Mellon University
Data-Intensive System Designs for High Performance and Capacity
Monday, March 24, 2014 - 2:30pm to 4:30pm
7101 Gates & Hillman Centers
Abstract:

The key contributions of my thesis are algorithms and data structures for storing and processing large amounts of fine-grained data on modern hardware, and software architectures that combine and tune these technologies to build data-intensive systems that achieve high performance and use memory efficiently. This thesis will describe SILT and MICA, key-value stores that provide a hash table-like interface, as examples of how these algorithms, data structures, and software architectures apply to data-intensive system designs. SILT, which is based on Entropy-Coded Tries that index items sorted by the hash of their keys, requires only 0.7 bytes of memory per item while serving requests at flash drive speeds of tens to hundreds of thousands of items per second. MICA, which uses lossy and lossless hash indexes, circular logs, and client-assisted, hardware-based request direction, handles 65.6 million remote operations per second per server node for items stored in memory. My proposed work will apply an additional set of algorithms and data structures to strengthen the benefits of SILT and MICA and improve their robustness on diverse hardware.
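To make the combination of a circular log and a lossy hash index more concrete, here is a minimal toy sketch in Python. It is not MICA's implementation: the class name `ToyCircularLogStore`, the one-entry-per-bucket index, and the slot-based log are all assumptions made for illustration. It only shows the general idea that new items overwrite the oldest log entries, and that the index may silently lose an entry when two keys map to the same bucket.

```python
class ToyCircularLogStore:
    """Toy cache: a fixed-size circular log plus a lossy, one-entry-per-bucket index.

    Hypothetical illustration only; not the design or API of any real system.
    """

    def __init__(self, log_slots, index_buckets):
        self.log = [None] * log_slots        # each slot holds a (key, value) pair
        self.index = [None] * index_buckets  # each bucket holds one log offset
        self.tail = 0                        # monotonically increasing append offset

    def put(self, key, value):
        offset = self.tail
        # Appending wraps around and overwrites the oldest slot once the log is full.
        self.log[offset % len(self.log)] = (key, value)
        # Lossy index: a colliding key simply replaces the bucket's previous entry.
        self.index[hash(key) % len(self.index)] = offset
        self.tail += 1

    def get(self, key):
        offset = self.index[hash(key) % len(self.index)]
        if offset is None or offset < self.tail - len(self.log):
            return None  # never indexed, or the log entry has been overwritten
        stored_key, value = self.log[offset % len(self.log)]
        # The bucket may point at a different key's item after a collision.
        return value if stored_key == key else None
```

In this sketch a lookup can miss even though the item was stored earlier, either because the log wrapped around or because another key took over the index bucket. That behavior is acceptable for a cache, which is the kind of trade-off a lossy index makes in exchange for simple, fast writes.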

Thesis Committee:
David G. Andersen (Chair)
Andrew Pavlo
Michael Kaminsky
Eddie Kohler (Harvard University)

For More Information, Please Contact:

deb [atsymbol] cs ~replace-with-a-dot~ cmu ~replace-with-a-dot~ edu