Cluster storage systems gotta have HeART: improving storage efficiency by exploiting disk-reliability heterogeneity
DARE: Disk-Adaptive Redundancy
Established the case for disk-adaptive redundancy (DARE), in which data redundancy is tailored to the observed heterogeneity in disk failure rates. Based on an analysis of over 5.3 million disks spanning 65 makes/models from production environments at Google, NetApp and Backblaze, we designed the first two DARE systems, HeART and Pacemaker, which provide approximately 15–20% space savings in large-scale storage clusters without ever compromising reliability.
Geriatrix: File system aging suite
Designed and developed an efficient and reproducible file system aging tool to encourage realistic, fair and responsible benchmarking. Geriatrix takes as input the file age, file size and directory depth distributions from already aged file system images and performs a controlled sequence of file creations and deletions to age the target file system until it mimics the characteristics of the reference image. Geriatrix is open source and ships with 8 built-in aging profiles.
Packing in cloud file systems
Augmented a cloud file system’s write-back cache with a packing and indexing layer that coalesces small files, transforming arbitrary user workloads into a write pattern better suited to cloud storage in terms of transfer sizes, number of objects and price. The result is a >60x improvement in performance and a >25000x reduction in cloud storage price.
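The packing idea can be illustrated with a minimal sketch. It is not the project's implementation: the `Packer` class, its in-memory blob and the file names are assumptions made for illustration. Small files are appended to one large object, and an index records the offset and length of each.

```python
from dataclasses import dataclass, field


@dataclass
class Packer:
    """Hypothetical packing layer: many small files become one large blob."""
    blob: bytearray = field(default_factory=bytearray)
    index: dict = field(default_factory=dict)  # file name -> (offset, length)

    def add(self, name: str, data: bytes) -> None:
        # Record where the file starts inside the blob, then append it.
        self.index[name] = (len(self.blob), len(data))
        self.blob += data

    def read(self, name: str) -> bytes:
        # Serve a small file by slicing the packed blob via the index.
        offset, length = self.index[name]
        return bytes(self.blob[offset:offset + length])


packer = Packer()
packer.add("a.txt", b"hello")
packer.add("b.txt", b"world!")
# A single upload of packer.blob now replaces two separate small-object writes.
```

Uploading one packed blob instead of many small objects is what reduces the object count and raises the transfer size; the index is what keeps individual files addressable afterwards.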
Conducted under the guidance of Prof. Garth Gibson, this research explored ways to minimize the size of unshingled partitions (typically used for frequently updated data such as metadata and small files) on shingled disks. It also involved the analysis and partial implementation, on SMRfs, of two cleaning algorithms originating from log-structured file systems.
Designed and implemented a flash translation layer (FTL) with block-mapping, garbage collection (with four policies) and wear-leveling in FlashSim (an FTL simulator). This project was enhanced and released as a course project for a graduate-level storage systems course (15-746) with 70+ students at CMU.
Space Maps in Ext4
Designed and developed an extent-based free-space management technique for the Ext4 file system, called Space Maps, along with an allocator that uses Space Maps for disk-space allocation. Consisting of a red-black tree and a log, Space Maps improved allocation speed by 30% and deallocation speed by 80%, and helped reduce file and free-space fragmentation.
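The sketch below illustrates extent-based free-space tracking in general terms and is not the Space Maps code. It substitutes a sorted Python list with `bisect` for the red-black tree and omits the log entirely; the class name `ExtentFreeSpace` and the first-fit policy are assumptions.

```python
import bisect


class ExtentFreeSpace:
    """Hypothetical free-space map holding (start, length) extents sorted by start."""

    def __init__(self, total_blocks):
        self.extents = [(0, total_blocks)]

    def allocate(self, count):
        # First-fit: carve the request out of the first extent large enough.
        for i, (start, length) in enumerate(self.extents):
            if length >= count:
                if length == count:
                    del self.extents[i]
                else:
                    self.extents[i] = (start + count, length - count)
                return start
        return None  # no single extent can satisfy the request

    def free(self, start, count):
        i = bisect.bisect_left(self.extents, (start, 0))
        # Merge with the preceding extent if it ends where this one starts.
        if i > 0 and sum(self.extents[i - 1]) == start:
            i -= 1
            start, count = self.extents[i][0], self.extents[i][1] + count
            del self.extents[i]
        # Merge with the following extent if it starts where this one ends.
        if i < len(self.extents) and start + count == self.extents[i][0]:
            count += self.extents[i][1]
            del self.extents[i]
        self.extents.insert(i, (start, count))


fs = ExtentFreeSpace(100)
a = fs.allocate(10)   # blocks 0..9
b = fs.allocate(20)   # blocks 10..29
fs.free(a, 10)
fs.free(b, 20)        # coalesces back into a single extent
```

Tracking free space as extents rather than per-block bits means one entry can describe a long contiguous run, and merging neighbours on free is what counters free-space fragmentation.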
SSD over Infiniband
This study compared the performance of a locally connected SSD with that of a remotely connected SSD (over InfiniBand). Using the lightweight SCSI RDMA protocol (SRP) for communication, we analyzed the costs of accessing remote SSDs and gained insights into enhancing the software architectures of next-gen data centers from the storage perspective.
Price of Ext4
Under the guidance of Prof. Remzi Arpaci-Dusseau, this study measured the software overhead of the Ext4 file system in light of the advent of storage devices with microsecond latencies. We shed light on how bottlenecks shift across the various submodules of Ext4 and suggested optimizations to make it future-proof.
Database Garbage Collection
Designed, developed and evaluated a co-operative (i.e. not stop-the-world), multi-threaded, lock-free, epoch-based garbage collection mechanism for Peloton, a hybrid in-memory database system. Explored tradeoffs between optimizing for average latency versus tail latency that arise from the absence of a dedicated garbage collection thread.
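The core rule of epoch-based reclamation can be shown with a single-threaded sketch. It is not Peloton's mechanism and is not lock-free; the `EpochManager` class and its method names are assumptions. An object retired in epoch e may be reclaimed only once every active transaction began in a later epoch, and the transaction that finishes does the collection itself instead of handing it to a dedicated thread.

```python
class EpochManager:
    """Hypothetical single-threaded model of co-operative epoch-based GC."""

    def __init__(self):
        self.global_epoch = 0
        self.active = {}      # transaction id -> epoch in which it began
        self.garbage = []     # (epoch in which the object was retired, object)
        self.reclaimed = []

    def begin(self, txn_id):
        self.active[txn_id] = self.global_epoch

    def retire(self, obj):
        # The object is unlinked now but may still be visible to older readers.
        self.garbage.append((self.global_epoch, obj))

    def advance(self):
        self.global_epoch += 1

    def end(self, txn_id):
        del self.active[txn_id]
        self.collect()  # co-operative: the finishing worker collects garbage

    def collect(self):
        # Conservative bound: with no active transactions, use the current epoch.
        safe = min(self.active.values(), default=self.global_epoch)
        keep = []
        for epoch, obj in self.garbage:
            if epoch < safe:
                self.reclaimed.append(obj)
            else:
                keep.append((epoch, obj))
        self.garbage = keep


gc = EpochManager()
gc.begin("t1")          # t1 starts in epoch 0
gc.retire("old_v1")     # retired in epoch 0, t1 may still read it
gc.advance()
gc.begin("t2")          # t2 starts in epoch 1
gc.retire("old_v2")     # retired in epoch 1
gc.end("t2")            # t1 is still active, so nothing is reclaimed
gc.end("t1")            # old_v1 is now safe; old_v2 waits for a later epoch
```

Because collection runs inside `end`, its cost lands on whichever transaction happens to finish, which is one way the average-versus-tail latency tradeoff mentioned above can arise.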
Studied the hazy nature of compression algorithms used in checkpoint/restore systems, and suggested possible enhancements and future directions in library-level checkpoint compression for faster, more efficient checkpointing with a reduced disk footprint.
Implemented a proof-of-concept of decentralized active databases on top of Kademlia, a distributed hash table on a decentralized peer-to-peer network. Active databases are event-driven databases that follow event-condition-action (ECA) rules.
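An ECA rule can be sketched as a triple that is checked on every write. The example below is only an illustration: a local dictionary stands in for the Kademlia DHT, and the `ActiveStore` class, the `"put"` event name and the keys are assumptions.

```python
class ActiveStore:
    """Hypothetical key-value store that fires event-condition-action rules."""

    def __init__(self):
        self.data = {}    # local dict standing in for the distributed hash table
        self.rules = []   # list of (event, condition, action) triples

    def on(self, event, condition, action):
        self.rules.append((event, condition, action))

    def put(self, key, value):
        self.data[key] = value
        self._fire("put", key, value)

    def _fire(self, event, key, value):
        # Event: the operation that occurred. Condition: a predicate on the
        # key and value. Action: a callback run only if the condition holds.
        for rule_event, condition, action in self.rules:
            if rule_event == event and condition(key, value):
                action(self, key, value)


alerts = []
store = ActiveStore()
store.on(
    "put",
    lambda key, value: key.startswith("temp/") and value > 100,
    lambda s, key, value: alerts.append(key),
)
store.put("temp/room1", 120)   # condition holds, so the action runs
store.put("temp/room2", 20)    # condition fails, so no action runs
```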
Designed and developed a UDP-based VM migration module in Palacios, an OS-independent embeddable VM monitor. It supported multiple-source, multiple-destination migrations aimed specifically at distributed applications in HPC environments (e.g. supercomputers), exploiting page-sharing among participating nodes to increase parallelism during migration.
Explored a run-length-based preprocessing scheme exploiting the power-law behavior of genomic data to reveal possibilities for Markovian compression and variable-length encoding algorithms that achieve a higher compression ratio than existing dictionary-based compression algorithms.
NIC of Time
Designed and developed a tool for exploring the state space of all possible combinations of offloaded functionalities on the NIC vs their presence in the kernel. The tool performs extensive analysis of throughput and CPU utilization to suggest one or a group of features that should be offloaded to the NIC depending on the particular workload under consideration.
PhD in Computer Science, Carnegie Mellon University, 2014 - present
MS in Computer Science, Northwestern University, 2012 - 2013
BE in Computer Science, Pune Institute of Computer Technology, 2005 - 2009