Mor Harchol-Balter

Bruce J. Nelson Professor
of Computer Science

Dept. of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

Office: Gates Hall 7207
Phone: (412) 268-7893
Fax: (412) 268-5576
harchol@cs.cmu.edu

Exec. Assistant:
Patricia Loring (GHC 7025)

 

 

   

   


SHORT BIO

CV (in pdf)

RESEARCH PAPERS ONLINE


Introduction for students: Link to my upcoming IC talk slides 2021

I am interested in the performance analysis and design of computer systems, particularly distributed systems. I use analytical models to capture the important characteristics of a computer system, and then I prove theorems about these models that allow me to redesign the system to improve its performance. Here "performance" might denote response time, energy use, throughput, capacity, etc. Most of my research involves inventing new analytical techniques in the area of performance analysis, as well as new algorithms for resource allocation.

Unlike many theoretical computer scientists, my analysis is based on stochastic (probabilistic) models of computer systems. There is no "adversary" sending us worst-case inputs. By contrast, there is a stream of requests, whose arrival times and service demands come from empirically fitted distributions. These distributions might be correlated, and they often exhibit heavy tailed service demands and high variability.

I believe that many conventional wisdoms on which we base computer system designs are not well understood and sometimes false, leading to inferior designs. My research revisits very classic questions in system design. Here are a few examples of commonly-held beliefs that my research challenges:


Research Specifics:

I work on the design and performance analysis of computer systems including both theory and implementation:

Algorithmic Work: Designing and analyzing algorithms for: resource allocation, task/job scheduling, load sharing, routing, cycle stealing, replication, power-management, multi-class/multi-server scheduling, multi-core parallel scheduling, fairness. Correlated arrival processes, and analysis under high-variability workloads. ``All-Can-Win Theorem'', demonstrating that scheduling policies which are biased towards favoring small jobs can also be preferable to large jobs.

Stochastic Analysis & Queueing Techniques: Developing new methods for stochastic analysis. Examples include: (i) Recursive Dimensionality Reduction (RDR), a technique to reduce a multi-dimensional Markov chain to a 1-dimensional chain, by using busy period transitions; (ii) Recursive Renewal Reward (RRR) and Clearing Analysis on Phases (CAP), techniques that allow one to obtain closed-form solutions for many one-dimensional infinite repeating Markov chains, including the M/M/k with setup chain; (iii) Exact analysis of Replication Systems, the first exact solution (in product-form) for replication systems, involving any number of servers, and number of classes, and any degree of job replication; (iv) SOAP Analysis of Scheduling Policies, the first exact response time analysis of a huge class of scheduling policies with no prior analysis, including the Gittins Index; (v) Optimal Scheduling for Multi-server systems, the first algorithms and analysis for optimal scheduling in the M/G/k, both with known and unknown job sizes.

Computer systems implementation: Resource management/scheduling in data centers and supercomputing centers. Implementation of QoS tail latency guarantees for flows in networked storage systems (PriorityMeister and SNC-Meister). Online scheduling of jobs with heterogeneous preferences (TetriSched). Dynamic power management in multi-tier data centers (AutoScale), adopted by Facebook. Scheduling of Memory Controllers (ATLAS). Pricing and queueing optimizations for cloud services. Kernel-level SRPT connection scheduling in Web servers (SyNC). Scheduling to cope with transient overload in Web servers.

Modeling and Workload characterization: Known for the discovery of Pareto heavy-tailed distribution of UNIX process CPU lifetimes. Statistical characterization of workloads including UNIX processes, web, OLTP, supercomputing, memory workloads, datacenter workloads, and parallel multi-core jobs.


My TENURE RESEARCH STATEMENT from May 2007:


SOME TALKS


Graduated PhD STUDENTS


Current PhD STUDENTS


KEYNOTE PRESENTATIONS

QTECQS 2023, ITC 2022, MAM 2022, UEMCON 2021, DDQC 2021, QTNA 2019, YEQT 2018, 40th Dutch Queueing Colloquium 2018, MIT LIDS Student Conference 2017, CanQueue 2016, ACM SIGMETRICS 2016, ICDCS 2015, GreenMetrics 2014, MASCOTS 2008, SIPEW 2008, Lunteren Conference 2007, ORSIS 2007.

ANNUAL TALK on Applying to Ph.D. Programs in Computer Science (pdf) .


TEACHING