15-829: Performance Modeling Tools for Computer Systems Researchers
Meets: FRIDAYS 10:10 a.m. - 1:10 p.m., Room: GHC 4301
12 Units
CLASS STARTS SEPTEMBER 6, 2022
www.cs.cmu.edu/~harchol/Tools/class.html
Office Hours:
- Tuesday 4:00 p.m. - 5:00 p.m. in GHC 7207
- Wednesday 4:00 p.m. - 5:00 p.m. in GHC 7207
DESCRIPTION:
This class is aimed at computer systems PhD students who are already
involved in doing systems research, where the goal is to improve the
performance of the system. Improving performance could involve
reducing response times, providing class-based response time
differentiation, improving tail behavior, scheduling to favor certain
jobs, reducing loss/drop rate, increasing throughput, increasing
revenue, reducing power or other costs, load balancing, etc.
Improving systems performance involves queue management and resource
allocation, both major topics in queueing theory. While queueing
theory classes traditionally involve heavy mathematics, the goal of
this class is to teach systems students performance modeling and
queueing theory in a super intuitive manner, without covering
proofs, and without requiring a probability background.
The focus of the class will be on learning how to translate computer
systems performance problems into the appropriate queueing network
framework.
Each class is divided into two parts. The first half presents a lesson in
queueing theory, modeling, simulation, or workload characterization.
The second half is devoted to having a student in the class present
their own computer systems performance research problem. Together, we
will figure out in real time how to model this research problem as a
queueing network and solve it.
In between the two halves, we will share pizza!
Prerequisites:
There are no formal prerequisites, but you should be a PhD student who is already working on computer systems research in which you are looking at improving the *performance* of your system.
Tentative Syllabus of Queueing Topics: (approximately 2 weeks each)
- Vocabulary: Speaking like a queueing theorist
- Single-server system
- Queueing network
- Response time
- Load (Utilization)
- Throughput
- Closed systems versus Open systems
- Kendall notation
- Distributions
- Job size distributions
- Conditional distributions
- Heavy tails and high variability distributions
- Squared coefficient of variation
- Arrival processes
- Generating workloads for simulation
- PASTA
- Inspection Paradox
- Easy (operational) back-of-envelope analysis
- Operational Laws
- Load (utilization) for more complex systems
- "What If" analysis for closed systems
- The single-server queue
- Modeling via Discrete-time Markov chains
- Modeling via Continuous-time Markov chains
- M/G/1
- Setup times
- Multi-server queue
- Capacity provisioning
- Load balancing
- Networks of queues
- Jackson probabilistic networks
- Classed Jackson networks
- Scheduling Theory
- Non-preemptive, non-size-based scheduling policies: FCFS, LCFS, Random
- Preemptive, non-size-based scheduling policies: PS, FB, P-LCFS
- Non-preemptive, size-based: priority queues, SJF
- Preemptive, size-based: priority queues, PSJF, SRPT
- Scheduling when job sizes are unknown: SERPT, Gittins
- Scheduling to optimize the tail of response time
- Scheduling to maximize value
Some Potential Application Areas:
- Meeting QoS Service Level Objectives
- Capacity provisioning, work-stealing
- Load balancing algorithms
- Dynamic power management
- Network routing
- Scheduling of parallelizable jobs with different speedup functions
- Admission control for database systems
- Caching to minimize response time
- Managing Supercomputing Centers
GRADING:
- Short weekly homeworks -- worth 30%. Homeworks will involve simulations, rather than proofs (learning by observing).
- In-class presentation -- worth 40%.
- Participation in presentations of others -- worth 30%.
- Standard grading scale: 90% - 100% is A; 80% - 89% is B; 70% - 79% is C; and so on, typically with a curve at the end.
- No quizzes or tests. Your goal in this class is to improve your own research and that of others in the class.
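To give a flavor of "learning by observing", here is a minimal sketch of the kind of simulation a homework might involve. It is only an illustration, not an actual assignment. It simulates a single-server FCFS queue with Poisson arrivals and exponentially distributed job sizes (an M/M/1 queue) and compares the simulated mean response time with the standard M/M/1 formula E[T] = 1/(mu - lambda). The function names, parameter values, and seed are assumptions chosen for this example.

```python
import random


def simulate_fcfs_mean_response_time(arrival_rate, service_rate, num_jobs, seed=0):
    """Simulate a single-server FCFS queue and return the mean response time.

    Assumptions (illustrative only): Poisson arrivals with rate `arrival_rate`
    and exponential job sizes with rate `service_rate`.
    """
    rng = random.Random(seed)
    arrival_time = 0.0        # arrival time of the current job
    prev_departure = 0.0      # departure time of the previous job
    total_response = 0.0

    for _ in range(num_jobs):
        # Next arrival: exponential interarrival times give a Poisson process.
        arrival_time += rng.expovariate(arrival_rate)
        job_size = rng.expovariate(service_rate)

        # Under FCFS, a job starts service once it has arrived AND the
        # previous job has departed.
        start_service = max(arrival_time, prev_departure)
        prev_departure = start_service + job_size

        # Response time = departure time minus arrival time.
        total_response += prev_departure - arrival_time

    return total_response / num_jobs


def mm1_mean_response_time(arrival_rate, service_rate):
    """Closed-form M/M/1 mean response time, valid only when load < 1."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival_rate must be < service_rate")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    lam, mu = 0.5, 1.0  # hypothetical rates, giving load lam / mu = 0.5
    simulated = simulate_fcfs_mean_response_time(lam, mu, num_jobs=200_000)
    predicted = mm1_mean_response_time(lam, mu)
    print(f"simulated E[T] = {simulated:.3f}, M/M/1 formula E[T] = {predicted:.3f}")
```

Raising the hypothetical arrival rate toward the service rate and re-running the sketch shows how sharply the mean response time grows as load approaches 1.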