Professor
Computer Science Dept and
Department of Electrical and Computer Engineering
Carnegie Mellon University
Gates-Hillman Center 9111
5000 Forbes Avenue, Pittsburgh, PA 15213-3891
Phone: 412-268-5890, FAX: 412-268-5576
E-mail: garth dot gibson @ cs dot cmu dot edu

Co-Founder and Chief Scientist
Panasas, Inc. www.panasas.com
1501 Reedsdale Street, Pittsburgh PA 15233
Phone: 412-323-3500, FAX: 412-323-3511
E-mail: garth dot gibson @ panasas dot com

Administrative Contact
Jennifer Landefeld
Gates-Hillman Center 9006
412-268-4740
E-mail: jennsbl @ cs dot cmu dot edu

Typical Courses Taught:
15-410 Operating Systems Design and Implementation
15-712 Advanced and Distributed Operating Systems
15-719 Advanced Cloud Computing
15-746 Advanced Storage Systems

Broadly, I am interested in large-scale parallelism in computer systems and its implications for application performance, operating system design, fault tolerance and big data in cloud computing. My particular interests focus on secondary memory system technologies and optimization, scalable file and key-value storage systems, scalable machine learning and systematic testing for large-scale systems. I have a strong interest in shepherding technological advances from blackboard through standards and to commercial reality.

 

    I joined the faculty of CMU's Computer Science Department in 1991. Previously I received a Ph.D. and an M.Sc. in Computer Science in 1991 and 1987, respectively, from the University of California at Berkeley. Prior to Berkeley, I received a Bachelor of Mathematics in Computer Science and Applied Mathematics in 1983 from the University of Waterloo in Ontario, Canada.

    In 1993 I founded CMU's Parallel Data Laboratory (PDL) and led it until April 1999. Today the PDL is led by Greg Ganger. The PDL is a community that typically comprises 3 to 6 faculty, 1 to 2 dozen students and 4 to 10 staff. It receives support and guidance from the Parallel Data Consortium, a group of 10 to 20 companies with interests in storage systems. This community holds biannual retreats and workshops to exchange technology ideas, analysis and future directions. The publications of the PDL are available for your inspection.

    The principal contributions of my first twenty years of research are Redundant Arrays of Inexpensive Disks (RAID), Informed Prefetching and Caching (TIP) and Network-Attached Secure Disks (NASD), the last now an ANSI SCSI command set standard (OSD). All three have stimulated derivative research and development in academia and industry. RAID, in particular, is now the organizing concept of a 10+ billion-dollar marketplace (more on RAID in my 1995 RAID tutorial).

    In 1999 I started Panasas Inc., a scalable storage cluster company using an object storage architecture and providing 100s of TB of high-performance storage in a single management domain for national laboratory, energy sector, auto/aero-design, life sciences, financial modeling, digital animation, and engineering design markets.

    In 2006 I founded the Petascale Data Storage Institute (PDSI) for the Department of Energy's Scientific Discovery through Advanced Computing (SciDAC) program. Led by CMU, with partners at Los Alamos, Sandia, Oak Ridge, Pacific Northwest and Lawrence Berkeley National Labs, the University of California, Santa Cruz and the University of Michigan, Ann Arbor, this Institute gathers together leading experts in leadership-class supercomputing storage systems to address the challenges involved in moving from today's terascale computers to the petascale computers of the next decade.

    In 2008 I turned to Data Intensive Scalable Computing, Clouds, and Scalable Analytics, participating in the design and installation of 2 TF, 2 TB, 1/2 PB of computing in an OpenCirrus and an OpenCloud cluster.

     

 

 

  • PLFS (see SC09 paper below and institutes.lanl.gov/plfs) has been released on Sourceforge (sourceforge.net/projects/plfs) under a BSD license. It is available through MPI-IO libraries or a FUSE user-level file system reflector. It is being put into production HPC use at Los Alamos.
  • pNFS, or Parallel NFS, is a subset of the new features in NFS version 4.1 (www.pnfs.com, RFC 5661-5664, tools.ietf.org/html/rfc566x for x=1,2,3,4). pNFS hails from a workshop in Dec 2003 when Garth Gibson and Peter Honeyman asked "what is next for NFS" and Gibson/CMU/Panasas answered: delegation of file layouts (direct and indirect pointers in an inode, sort of). Based on the NASD file system work (ASPLOS98 below) that inspired Panasas (FAST08 below), pNFS allows client machines to request a (revocable) map of the locations of data in a storage area network (seen as SCSI blocks, SCSI objects or NFS files) which the client can use to directly access file data (without access being proxied by the NFSv4.1 server).
    • an implementation of pNFS built on top of NFS files was taken into Linux in 2.6.39
    • an implementation of pNFS built on top of SCSI objects was taken into Linux in 2.6.40, renamed as 3.0
    • an implementation of pNFS built on top of SCSI blocks was taken into Linux 3.1
    • extension of pNFS built on SCSI objects with RAID 1 and RAID 5 over objects on different data servers, taken into Linux 3.2
  • Past member of the technical council of the Storage Networking Industry Association (SNIA), an international organization of about 100 networking and storage companies formed in July 1999.
  • Founder and chair, National Storage Industry Consortium (NSIC) working group on Network-Attached Storage Devices (NASD), 1996-1999. Program chair for eleven NSIC/NASD sponsored public workshops (75 presentations and 500 attendees). The result of these efforts was presented in "Object Based Storage Devices: A Command Set Proposal," (http://www.nsic.org/nasd/final.pdf, November 1999) which was written to launch an ANSI standards effort in the X3/T10 (SCSI) committee.
  • Release of NASD Scalable Storage Systems Prototype code, version 1.1 in July 1999, and version 1.3 in May 2000. This code implements CMU's view of the next generation storage interface (SCSI-4?) and a simple set of changes for a distributed file system to exploit it.
  • Release of RAIDframe Rapid Prototyping Tool for RAID Systems code, August 1996. This code, suitably debugged and adapted, appears as the RAID device driver in the current release of the NetBSD operating system.

 

ACM Digital Library Publication List  •  Google Scholar Publication List

  • More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server. Ho, Qirong, James Cipar, Henggang Cui, Seunghak Lee, Jin Kyu Kim, Phil B. Gibbons, Garth A. Gibson, Gregory R. Ganger, Eric P. Xing. 2013 Neural Information Processing Systems (NIPS 2013), Dec 5-10, Lake Tahoe, NV. PDF
  • PARROT: A Practical Runtime for Deterministic, Stable and Reliable Threads. Heming Cui, Jiri Simsa, Yi-Hong Lin, Hao Li, Ben Blum, Xinan Xu, Junfeng Yang, Garth A. Gibson. 24th ACM Symposium on Operating Systems Principles (SOSP'13), Nov 4-6, 2013, Farmington, PA. PDF - coming soon.
  • TABLEFS: Enhancing Metadata Efficiency in the Local File System. Kai Ren, Garth Gibson. 2013 USENIX Annual Technical Conference, June 26-28, 2013, San Jose, CA. PDF
  • Solving the Straggler Problem with Bounded Staleness. James Cipar, Qirong Ho, Jin Kyu Kim, Seunghak Lee, Gregory R. Ganger, Garth Gibson, Kimberly Keeton, Eric Xing. 14th USENIX Workshop on Hot Topics in Operating Systems, May 13-15, 2013, Santa Ana Pueblo, NM. PDF
  • Shingled Magnetic Recording: Areal Density Increase Requires New Data Management. Tim Feldman, Garth Gibson. USENIX ;login:, v 38, n 3, June 2013. PDF
  • I/O Acceleration with Pattern Detection. Jun He, John Bent, Aaron Torres, Gary Grider, Garth Gibson, Carlos Maltzahn, Xian-He Sun. The 22nd Int. ACM Symposium on High Performance Parallel and Distributed Computing (HPDC'13), New York City, June 17-21, 2013. PDF
  • Discovering Structure in Unstructured I/O. Jun He, John Bent, Aaron Torres, Gary Grider, Garth Gibson, Carlos Maltzahn, Xian-He Sun. Proc. of the Seventh Parallel Data Storage Workshop (PDSW12), co-located with the Int. Conference for High Performance Computing, Networking, Storage and Analysis (SC12), Salt Lake City, UT, November 2012. PDF
  • A Case for Scaling HPC Metadata Performance through De-specialization. Swapnil Patil, Kai Ren, Garth Gibson. Proc. of the Seventh Parallel Data Storage Workshop (PDSW12), co-located with the Int. Conference for High Performance Computing, Networking, Storage and Analysis (SC12), Salt Lake City, UT, November 2012. PDF
  • TABLEFS: Embedding a NoSQL Database inside the Local File System. Kai Ren, Garth Gibson. 1st Storage System, Hard Disk and Solid State Technologies Summit, IEEE Asia-Pacific Magnetic Recording Conference (APMRC), November 2012, Singapore. PDF
  • Scalable Dynamic Partial Order Reduction. Jiri Simsa, Randal Bryant, Garth Gibson, Jason Hickey. Third Int. Conf. on Runtime Verification (RV2012), 25-28 September 2012, Istanbul, Turkey. PDF
  • The Power and Challenges of Transformative I/O. Adam Manzanares, Meghan McClelland, John Bent, Garth Gibson. 2012 IEEE Int. Conf. on Cluster Computing (CLUSTER12), 24-28 September 2012, Beijing, China. PDF - coming soon
  • Indexing a large-scale database of astronomical objects. Bin Fu, Eugene Fink, Garth Gibson, and Jaime Carbonell. Proceedings of the Fourth Workshop on Interfaces and Architecture for Scientific Data Storage (IASDS), September 2012, Beijing, China. PDF
  • Exact and Approximate Computation of a Histogram of Pairwise Distances between Astronomical Objects. Bin Fu, Eugene Fink, Garth Gibson, Jaime Carbonell. 1st Workshop on High Performance Computing in Astronomy (AstroHPC'12), June 2012, Delft, Netherlands. PDF
  • File System Virtual Appliances: Portable File System Implementations. Michael Abd-El-Malek, Matthew Wachs, James Cipar, Karan Sanghi, Gregory R. Ganger, Garth A. Gibson, Michael K. Reiter. ACM Transactions on Storage, Vol. 8, No. 3, Article 39, May 2012. PDF
  • YCSB++: Benchmarking and Performance Debugging Advanced Features in Scalable Table Stores. Swapnil Patil, Milo Polte, Kai Ren, Wittawat Tantisiriroj, Lin Xiao, Julio Lopez, Garth Gibson, Adam Fuchs, Billie Rinaldi. Proc. of the 2nd ACM Symposium on Cloud Computing (SOCC '11), October 27–28, 2011, Cascais, Portugal. Supersedes Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-11-111, August 2011. PDF
  • On the Duality of Data-intensive File System Design: Reconciling HDFS and PVFS. Wittawat Tantisiriroj, Swapnil Patil, Garth Gibson, Seung Woo Son, Samuel J. Lang, Robert B. Ross. SC11, November 12-18, 2011, Seattle, Washington USA. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-11-108, April 2011. PDF
  • Recipes for Baking Black Forest Databases: Building and Querying Black Hole Merger Trees from Cosmological Simulations. Lopez, J., C. Degraf, T. DiMatteo, B. Fu, E. Fink, G. Gibson. 23rd Scientific and Statistical Database Management Conference (SSDBM'11), July 2011. PDF
  • Six Degrees of Scientific Data: Reading Patterns for Extreme Scale Science IO. Lofstead, Jay, Milo Polte, Garth Gibson, Scott A. Klasky, Karsten Schwan, Ron Oldfield, Matthew Wolf, Qing Liu. 20th ACM Int. Symp. On High-Performance Parallel and Distributed Computing (HPDC'11), June 2011. PDF
  • Otus: Resource Attribution and Metrics Correlation in Data-Intensive Clusters. Kai Ren, Julio Lopez, Garth Gibson. The 2nd International Workshop on MapReduce and its Applications (MapReduce'11), June, 2011. PDF
  • Scale and Concurrency of GIGA+: File System Directories with Millions of Files. Patil, S., G. Gibson. Proc 9th USENIX Conf. on File and Storage Technologies (FAST11), February, 2011. PDF
  • dBug: Systematic Evaluation of Distributed Systems. Jiri Simsa, Randy Bryant, Garth Gibson. 5th Int. Workshop on Systems Software Verification (SSV’10), co-located with 9th USENIX Symp. On Operating Systems Design and Implementation (OSDI’10), Vancouver BC, October 2010. PDF
  • pWalrus: Towards Better Integration of Parallel File Systems into Cloud Storage. Yoshihisa Abe, Garth Gibson. Workshop on Interfaces and Abstractions for Scientific Data Storage (IASDS10), co-located with IEEE Int. Conference on Cluster Computing 2010 (Cluster10), Heraklion, Greece, September 2010. PDF
  • DiscFinder: A Data-Intensive Scalable Cluster Finder for Astrophysics. Fu, B., K. Ren, J. Lopez, E. Fink, G. Gibson. ACM Int. Symp. On High Performance Distributed Computing (HPDC), June 2010. PDF
  • PLFS: A Checkpoint Filesystem for Parallel Applications. Bent, J., G. Gibson, G. Grider, B. McClelland, P. Nowoczynski, J. Nunez, M. Polte, M. Wingate. Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC2009), Nov. 2009. PDF
  • Understanding and Maturing the Data-Intensive Scalable Computing Storage Substrate. Gibson, G., B. Fan, S. Patil, M. Polte, W. Tantisiriroj, L. Xiao. 2009 Microsoft eScience Workshop, Pittsburgh, PA, October, 2009. PDF
  • Directions for TDMR System Architecture: Synergies with SSDs. Gibson, G.A., Milo Polte. Proc. of the I.E.E.E. International Symposium on Magnetics (INTERMAG09), Sacramento CA, March 2009. PDF, TALK
  • Safe and Effective fine-grain TCP Retransmissions for Datacenter Incast Communication. Vasudevan, V., A. Phanishayee, H. Shah, E. Krevat, D.G. Andersen, G.R. Ganger, G.A. Gibson, B. Mueller, S. Seshan. SIGCOMM'09, August 16-21, 2009, Barcelona, Spain. PDF
  • In Search of an API for Scalable File Systems: Under the table or above it? Patil, S., G.A. Gibson, G.R. Ganger, J. Lopez, M. Polte, W. Tantisiriroj, L. Xiao. HotCloud'09, June 15, 2009, San Diego, CA. PDF
  • Enabling Enterprise Solid State Disks Performance. Polte, M., J. Simsa, G. Gibson. 1st Workshop on Integrating Solid-state Memory into the Storage Hierarchy, March 7, 2009, Washington DC. PDF
  • Scalable Performance of the Panasas Parallel File System. Welch, B., M. Unangst, Z. Abbasi, G. Gibson, B. Mueller, J. Small, J. Zelenka, B. Zhou. 6th USENIX Conf on File and Storage Technologies (FAST'08), Feb. 2008, San Francisco, CA. PDF
  • Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems. Phanishayee, A., E. Krevat, V. Vasudevan, D.G. Andersen, G.R. Ganger, G.A. Gibson, S. Seshan. 6th USENIX Conf on File and Storage Technologies (FAST'08), Feb. 2008, San Francisco, CA. PDF
  • Understanding failure in petascale computers. Schroeder, B., Gibson, G.A. SciDAC 2007. Journal of Physics: Conf. Ser. 78. PDF
  • Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You? Schroeder, B., Gibson, G.A. Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST '07), February 13–16, 2007, San Jose, CA. PDF
  • A Large-scale Study of Failures in High-performance-computing Systems. Schroeder, B., Gibson, G.A. Proceedings of the International Conference on Dependable Systems and Networks (DSN2006), Philadelphia, PA, USA, June 25-28, 2006. PDF
  • Scheduling Speculative Tasks in a Compute Farm. Petrou, D., G. Ganger, G.A. Gibson. High Performance Computing, Networking, and Storage Conference (SC2005), Seattle, WA., November 2005. PDF
  • Active Disks for Large-Scale Data Mining. Riedel, E., C. Faloutsos, G.A. Gibson, D. Nagle. IEEE Computer Magazine, June 2001.
  • Network Attached Storage Architecture. Gibson, G.A., R. Van Meter. Comm. of the ACM, Vol. 43, No 11, November, 2000. PDF
  • Dynamic Function Placement for Data-Intensive Cluster Computing. Amiri, K., D. Petrou, G. Ganger, G.A. Gibson. USENIX Technical Conference, San Diego, June 2000. PDF
  • Highly Concurrent Shared Storage. Amiri, K., G.A. Gibson, R. Golding. Int. Conf. on Distributed Computing Systems (ICDCS2000), April 2000. PDF
  • Automatic I/O Hint Generation through Speculative Execution. Chang, F., G.A. Gibson. Proceedings of the Third USENIX Symposium on Operating Systems Design and Implementation (OSDI), February 1999. PDF
  • Active Storage for Large-scale Data Mining and Multimedia Applications. E. Riedel, G. A. Gibson, C. Faloutsos. Proceedings of the 1998 Very Large Data Bases conference (VLDB), August 1998. PDF
  • A Cost-Effective High-Bandwidth Storage Architecture. Gibson, G.A., et al. Int. Conf. on Architectural Support for Programming Languages and Operating Systems, ASPLOS, October, 1998. PDF
  • Report of the Working Group on Storage I/O Issues in Large-Scale Computing. G. A. Gibson, J. S. Vitter, J. Wilkes, eds. ACM Workshop on Strategic Directions in Computing Research. ACM Computing Surveys, 28, 1, Dec. 1996. PDF
  • Informed Prefetching and Caching. R. H. Patterson, G. A. Gibson, E. Ginting, D. Stodolsky, and J. Zelenka. Proc. of the 15th Symposium on Operating Systems Principles, December 3-6, 1995. PDF
  • Architectures and Algorithms for On-line Failure Recovery in Redundant Disk Arrays. M. Holland, G. A. Gibson, D. P. Siewiorek. J. of Distributed and Parallel Databases, Vol. 2, No. 3, July, 1994. PDF
  • A Case for Redundant Arrays of Inexpensive Disks (RAID). D. A. Patterson, G. A. Gibson, R. H. Katz. Proceedings of the International Conference on Management of Data (SIGMOD), June 1988. PDF



(last updated 16-Dec-2013)
© 2012