ABSTRACT
Parity Declustering for Continuous Operation in Redundant Disk ArraysMark Holland and Garth A. GibsonSchool of Computer ScienceCarnegie Mellon University Pittsburgh, PA 15213 garth.gibson@cmu.edu AbstractWe describe and evaluate a strategy for declustering the parity encoding in a redundant disk array. This declustered parity organization balances cost against data reliability and performance during failure recovery. It is targeted at highly-available parity-based arrays for use in continuous- operation systems. It improves on standard parity organizations by reducing the additional load on surviving disks during the reconstruction of a failed disk's contents. This yields higher user throughput during recovery, and/or shorter recovery time.We first address the generalized parity layout problem, basing our solution on balanced incomplete and complete block designs. A software implementation of declustering is then evaluated using a disk array simulator under a highly concurrent workload comprised of small user accesses. We show that declustered parity penalizes user response time while a disk is being repaired (before and during its recovery) less than comparable non-declustered (RAID5) organizations without any penalty to user response time in the fault-free state. We then show that previously proposed modifications to a simple, single-sweep reconstruction algorithm further decrease user response times during recovery, but, contrary to previous suggestions, the inclusion of these modifications may, for many configurations, also slow the reconstruction process. This result arises from the simple model of disk access performance used in previous work, which did not consider throughput variations due to positioning delays. Click here for the full paper in pdf or postscript.
|