

    Bianca Schroeder

    Assistant professor

    Computer Science Department
    University of Toronto
    Bahen Centre for Information Technology
    40 St. George Street
    Toronto, ON M5S 2E4

    E-mail: first-name@cs.toronto.edu
    Office: Bahen 5236

     

    About me

    I'm an assistant professor in the computer science department and a member of the SysLab and the systems group. Before joining UofT, I spent 2 years as a post-doc at CMU working with Garth Gibson. I finished my PhD in August 2005 at CMU under the guidance of Mor Harchol-Balter.

    My research focuses on the design, implementation and performance evaluation of computer systems. The methods I am using in my work are inspired by a broad array of disciplines, including performance modeling and analysis, workload and fault characterization, machine learning, and scheduling and queueing theory. My work spans a number of different areas in computer systems, including high-performance computing systems, web servers, computer networks, database systems and storage systems.

    For more information, you can take a look at my CV and my research pages.

     

    Prospective students

    I am actively looking for graduate students to work with. If you are interested in working with me, please check out my research pages. The line of research I'm currently most interested in is "empirical system reliability".

     


     

    News

    • Feb 2008: Our FAST'08 paper wins the best student paper award.

    • December 2007: I've been invited to join the program committee of the 16th Conference on Measurement, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2008).

    • July 2007: I have accepted a tenure-track faculty position in the computer science department at the University of Toronto, starting in January 2008!

    • April 2007: I've been invited to join the program committee of the 38th Annual International Conference on Dependable Systems and Networks (DSN 2008) and the program committee of the 17th International World Wide Web conference (WWW'08).

    • Feb 2007: Our FAST'07 paper is being featured in an article on slashdot, which so far has received more than 75,000 unique hits! It has also been featured in articles at other news sites such as Computerworld, StorageMojo, eWEEK and PCWorld.

    • Feb 2007: Our FAST'07 paper wins a best paper award.

    • Oct 2006: I'm a co-PI of the Petascale Data Storage Institute that has been awarded to a collaboration of researchers at Carnegie Mellon University, National Energy Research Scientific Computing Center, Pacific Northwest National Laboratory, Oak Ridge National Laboratory, Sandia National Laboratory, Los Alamos National Laboratory, University of Michigan, and the University of California at Santa Cruz.

    • Sept 2006: I've been invited to join the program committee of the 16th International World Wide Web conference (WWW'07).

    • Aug 2006: We have started to collaborate with Usenix on setting up a public failure data repository to make some of the data we have been using in our recent work publicly available and to encourage others to share their data as well. More updates to follow, once the repository is set up.

     

     


    Research

    My research focuses on the design and implementation of computer systems. The methods I am using in my work are inspired by a broad array of disciplines, including performance modeling and analysis, workload and fault characterization, machine learning, and scheduling and queueing theory. My work spans a number of different areas in computer systems, including high-performance computing systems, web servers, computer networks, database systems and storage systems.

    My PhD thesis work focused on scheduling to improve the performance of web servers and databases and to provide differentiated Quality of Service.

    Currently, I am very interested in "empirical system reliability". This new line of research is motivated by the fact that, with the ever-growing component count in large-scale IT systems, component failures are quickly becoming the norm rather than the exception. Yet virtually no data on failures in real systems is publicly available, forcing researchers to base their work on anecdotes and back-of-the-envelope calculations rather than empirical data. The goal of my work is to collect and analyze failure data from real, large-scale production systems and to exploit the results for better system design and management.

    For a brief overview of some of the projects I have worked on, check out the following project web pages:

    • The SYNC project: Schedule Your Network Connections.

    • Scheduling supercomputers: The case for load Unbalancing. [Under construction].

    • QoS for databases.

    • Workload modeling and impact on system design. [Under construction].

    • Failures in the real world: Empirical system reliability.

     

     

     

    Publications

    Conferences and journals

    • L. Bairavasundaram, G. Goodson, B. Schroeder, A. Arpaci-Dusseau, R. Arpaci-Dusseau. "An analysis of data corruption in the storage stack." 6th Usenix Conference on File and Storage Technologies (FAST 2008). pdf.

    • Bianca Schroeder, Garth Gibson. "Understanding failure in petascale computers." Presented at the SciDAC 2007 conference. Journal of Physics: Conf. Ser. 78. pdf.

    • Bianca Schroeder, Garth Gibson. "The computer failure data repository." Invited contribution to the Workshop on Reliability Analysis of System Failure Data (RAF'07) to be held at MSR Cambridge, UK. pdf.

    • Bianca Schroeder, Garth Gibson. "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?" 5th Usenix Conference on File and Storage Technologies (FAST 2007). Winner of best paper award. pdf

      The above paper has also been featured in an article on slashdot, which so far has received more than 75,000 hits!

    • Ernst Biersack, Bianca Schroeder. "Scheduling in Practice." Invited to special issue of the ACM Sigmetrics PER (Performance Evaluation Review) on "New Perspectives in Scheduling". To appear in 2007. pdf

    • Bianca Schroeder, Garth Gibson. "A large scale study of failures in high-performance-computing systems." International Symposium on Dependable Systems and Networks (DSN 2006). pdf

      As one of the best DSN'06 papers invited to IEEE Transactions on Dependable and Secure Computing (TDSC).

    • Bianca Schroeder, Adam Wierman and Mor Harchol-Balter. "Open vs closed: a cautionary tale." 3rd Symposium on Networked System Design and Implementation (NSDI 2006). pdf

    • Bianca Schroeder, Arun Iyengar and Erich Nahum. "Web traffic analysis for capacity planning." In preparation.

    • Bianca Schroeder, Mor Harchol-Balter, Arun Iyengar, Erich Nahum. "Achieving class-based QoS for transactional workloads." Poster paper in 22nd International Conference on Data Engineering (ICDE 2006). pdf

    • Bianca Schroeder, Mor Harchol-Balter, Arun Iyengar, Erich Nahum and Adam Wierman. "How to determine a good multi-programming level for external scheduling." 22nd International Conference on Data Engineering (ICDE 2006). pdf

    • David T. McWherter, Bianca Schroeder, Anastassia Ailamaki and Mor Harchol-Balter. "Improving Preemptive Prioritization via Statistical Characterization of OLTP Locking." 21st International Conference on Data Engineering (ICDE 2005). pdf

    • David T. McWherter, Bianca Schroeder, Anastassia Ailamaki and Mor Harchol-Balter. "Priority Mechanisms for OLTP and Transactional Web Applications." 20th International Conference on Data Engineering (ICDE 2004). pdf

    • Bianca Schroeder and Mor Harchol-Balter. "Web servers under overload: How scheduling can help." 18th International Teletraffic Congress (ITC 2003). Winner of student paper award. (Original Tech report Number CMU-CS-02-143, pdf).

      Extended version in ACM Transactions on Internet Technologies (TOIT 2006), vol. 6, no.1, February, 2006. pdf

    • A. Nucci, B. Schroeder, S. Bhattacharyya, N. Taft, C. Diot. "IS-IS Link Weight Assignment for Transient Link Failures." 18th International Teletraffic Congress (ITC 2003).

    • Mor Harchol-Balter, Bianca Schroeder, Nikhil Bansal, Mukesh Agrawal. "Size-based Scheduling to Improve Web Performance." Transactions on Computer Systems (TOCS 2003). postscript / pdf

    • Mor Harchol-Balter, Nikhil Bansal, and Bianca Schroeder. "Implementation of SRPT Scheduling in Web Servers," Technical report Number CMU-CS-00-170. Postscript.

      Short version appeared as "SRPT Scheduling for Web Servers" in JSSPP 2001, 7th International Workshop, Cambridge, MA.

    • Bianca Schroeder and Mor Harchol-Balter. "Evaluation of Task Assignment Policies for Supercomputing Servers: The Case for Load Unbalancing and Fairness," 9th IEEE Symposium on High Performance Distributed Computing (HPDC 2000), 2000.

      As one of the best HPDC'00 papers invited to Cluster Computing 7(2): 151-161 (2004). Postscript / pdf

    • S. Albers and B. Schroeder. "An experimental study of online scheduling algorithms." 4th Workshop on Algorithm Engineering (WAE 2000).

      As one of the best WAE'00 papers invited to ACM Journal of Experimental Algorithms 7: 3 (2002).

    • Bianca Schroeder. "Upper and Lower bounds for online scheduling," Masters Thesis at the Max-Planck-Institute, Saarbruecken, Germany, December 1998.


    Book chapters

    • Arun Iyengar, Lakshmish Ramaswamy, and Bianca Schroeder. "Techniques for efficiently serving and caching dynamic web content." In "Recent Advances on Web Data Delivery" by S. Chanson, X. Tang, J. Xu. Kluwer Academic Publisher, 2005.

    • Anastassia Ailamaki, Sailesh Krishnamurthy, Spiros Papadimitriou, and Bianca Schroeder. "The PostgreSQL Open Source DBMS." In "Database System Concepts" by Abraham Silberschatz, Henry F. Korth, S. Sudarshan, 5th Edition. McGraw-Hill Book Company, 2005.


    Patents

    • A. Iyengar, E. Nahum, and B. Schroeder. "Method for Dynamically Scheduling Requests". Filed in March 2004.

    • S. Bhattacharyya, A. Nucci, N. Taft, B. Schroeder and C. Diot. "Method for Assigning Link Weights in a Communications Network". Sprint Docket Number 1917/SPRI.98254. Filed in February 2003.

     


     

    Professional Service

    Program committee member

    • 16th Conference on Measurement, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2008)

    • 17th International World Wide Web Conference (WWW'08)

    • 38th Annual International Conference on Dependable Systems and Networks (DSN'08)

    • 16th International World Wide Web conference (WWW'07)


     

    Talks

    Conference talks

    March 2007 Workshop on Reliability Analysis of System Failure Data (RAF'07) to be held at MSR Cambridge, UK.
    "The computer failure data repository."

    February 2007 5th Usenix Conference on File and Storage Technologies (FAST 2007)
    "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?".

    November 2006 Workshop on Petascale Data Storage, International Conference
    for High Performance Computing, Networking, Storage and Analysis (SC 2006)
    "Learning to live with our failures".

    November 2006 The 7th Usenix Symposium on Operating Systems Design and Implementation (OSDI 2006)
    Work-in-Progress session. "Failures in the real world."

    June 2006 The International Conference on Dependable Systems and Networks (DSN 2006)
    "A Large-Scale Study of Failures in High-Performance-Computing Systems."

    May 2006 3rd Symposium on Networked System Design and Implementation (NSDI 2006)
    "Closed versus open system models: Understanding their impact on performance evaluation
    and system design".


    April 2006 22nd International Conference on Data Engineering (ICDE 2006)
    "How to determine a good multi-programming level for external scheduling."

    November 2005 Workshop on Dependability Benchmarking at the 16th IEEE International Symposium
    on Software Reliability Engineering (ISSRE 2005)
    "Analyzing failure data from large HPC clusters".

    October 2004 Grace Hopper conference for women in computing (GHC 2004)
    "Improving the performance of static and dynamic requests at a busy web server."

    May 2004 CORS/INFORMS joint conference 2004
    "Scheduling web servers."

    April 2004 20th International Conference on Data Engineering (ICDE 2004)
    "Priority Mechanisms for OLTP and Transactional Web Applications."

    September 2003 18th International Teletraffic Congress (ITC 2003)
    "Web servers under overload: How scheduling can help."

    September 2000 4th International Workshop on Algorithm Engineering (WAE 2000)
    "An Experimental Study of Online Scheduling Algorithms."

    August 2000 9th IEEE Symposium on High Performance Distributed Computing (HPDC 2000)
    "Evaluation of Task Assignment Policies for Supercomputing Servers:
    The Case for Load Unbalancing and Fairness."

    Invited talks

    November 2006 University of Washington. Host: Hank Levy.
    "Failures in the real world".

    November 2006 Invited talk in the ASC booth at SC'06.
    (ASC is a program of the DOE's National Nuclear Security Administration (NNSA)).
    "Failures in the real world: Collecting, sharing, and analyzing failure data".

    October 2006 Google Inc., Mountain View, CA. Hosts: Luiz Barroso, Eduardo Pinheiro.
    "Failures in the real world: Collecting, sharing, and analyzing failure data".

    October 2006 IBM Almaden, San Jose, CA. Host: Frank Schmuck.
    "Failures in the real world: Collecting, sharing, and analyzing failure data".

    August 2006 HEC-IWG File Systems and I/O R&D Workshop, Washington D.C. 2006
    "The failure data usage project". Host: Gary Grider

    June 2006 University of California, Berkeley. Host: Armando Fox.
    "Understanding failure at scale".

    June 2006 Hewlett Packard Laboratories, Palo Alto, CA. Host: Kim Keeton.
    "Understanding failure at scale".

    June 2006 Microsoft Research, Mountain View, CA. Host: Chandu Thekkath.
    "Understanding failure at scale".

    May 2004 University of Calgary. Host: Carey Williamson.
    "Scheduling web servers: Theory and practice".

    April 2004 Selected as one of two PhD students to give research presentation at
    CMU open house for prospective students. "QoS for databases".

    March 2004 Boston University, Networks seminar.
    "QoS for online shopping".

    August 2003 IBM TJ Watson Research Center. Host: Arun Iyengar.
    "Priority Mechanisms for OLTP and Transactional Web Applications".

    July 2001 Sprint Advanced Technology Laboratories, Burlingame, CA. Host: Christophe Diot.
    "Improving Performance of Web Servers under Overload."

    June 2001 Stanford University. Hosts: Nick McKeown and Balaji Prabhakar.
    "Size-based Scheduling to Improve Web Performance".