In an era of big data, the rapid growth of data that many companies and organizations produce and manage continues to drive efforts to improve the scalability of storage systems. The number of objects stored in these systems continues to grow, making metadata management critical to the overall performance of file systems. At the same time, many modern parallel applications are shifting toward shorter durations and larger degrees of parallelism. These trends expose storage systems to increasingly diverse metadata-intensive workloads.
The goal of this dissertation is to improve metadata management in both local and distributed file systems. The dissertation focuses on two aspects. The first is to improve the out-of-core representation of file system metadata by exploring log-structured multi-level approaches that provide a unified and efficient representation across a variety of secondary storage devices (e.g., traditional hard disks, shingled disks, and solid-state disks). The second is to demonstrate that such a representation can also be flexibly integrated with many namespace distribution mechanisms to scale the metadata performance of distributed file systems and to provide better support for big data analytics applications in data center environments.
Garth Gibson (Chair)
Gregory R. Ganger
David G. Andersen
Brent B. Welch (Google, Inc.)