Carnegie Mellon
SCS logo
Computer Science Department

15-410 Project 4


This semester we will try something a little different for Project 4: profiling. You will instrument your kernel, measure its run-time behavior, and evaluate an improvement.

Project 4 is due Wednesday, April 27th, at 23:59. When planning your work, keep in mind that the book report and the final homework assignment will also be due on the last day of classes.

Note that P4 grades will probably not be returned before the final exam; in the other direction, the exam will not test you on P4 material as such (for that reason and because not all groups will work on P4).


Profiling is the craft of figuring out where the time goes when a system is running. Frequently, profiling a system in even a straightforward way reveals at least one performance surprise. This means that attempts to improve performance which aren't based on data are likely to waste effort speeding up parts of the program which don't use much time. Experience over the years has given rise to the following words of wisdom:

Rules of Optimization

  • Rule 1. Don't do it.
  • Rule 2. (for experts only) Don't do it yet.

--"Principles of Program Design", M.A. Jackson


  1. BEFORE YOU BEGIN: Fill out the p4guess form. Save your guesses somewhere for your final report.

  2. Modify your timer interrupt handler to lprintf() information about what was running before the interrupt arrived. The most straightforward piece of information to record is the program counter (%eip) value. But you may end up recording more information than that, or recording some abstraction of the program counter value in some cases.

  3. Use the nm program to obtain a symbol table listing of your kernel.

  4. Write a small program (in C, C++, Perl, Python, Java, ML, Ruby, forth, SNOBOL, Scheme, or some other reasonable language) to post-process your kernel.out file, extracting the program counter values, comparing them to the symbol table, and emitting a table of how many times each kernel function was interrupted by a clock interrupt. Also compute what percentage of execution time is spent in kernel mode versus user mode.

  5. Profile your kernel running various tests. We recommend at least cho, cho2, make_crash, and fork_wait_bomb, though we encourage you to experiment to see if other tests produce "interesting" results.

  6. Based on your data, suggest a modification to your kernel which would improve its performance noticeably without being too difficult to implement.

  7. Implement the change, re-run at least the most relevant tests, present your data, and comment.


  1. BEFORE YOU BEGIN: Fill out the p4guess form. Save your guesses somewhere for your final report.

  2. Your modified kernel tree, in p4

  3. diff -rc output showing the changes you made to your kernel to instrument it, called p4/report/instrument.diff

  4. source to your post-processing program, stored appropriately in p4/report (e.g., p4/report/analyze.4th)

  5. diff -cr of your performance boost, in p4/report/performance.diff

  6. Brief writeup, p4/report/discussion.pdf, containing

    • brief explanation of your profiling setup (e.g., sampling rate, any special data collected),
    • table of performance results,
    • brief discussion of results (Did you guess right? Is time being taken up by the right things?)
    • table of new performance results
    • brief discussion of new results
    Aside from tables, we expect your discussion to occupy one to two pages of text. If feel you need more space, that's fine, though you should probably touch base with us if you feel you need more than eight pages.


This project will not be graded in terms of who achieves the highest percentage performance improvement. Instead, we will be looking for ways in which you demonstrate prediction based on reasoning, measurement design, and analysis of your findings. Explain why you think the code change you make should be done, why and how much you think it will affect performance, and then discuss what happens when you try it.

Possible Extensions

Examine data more closely
Look over your raw data a bit. Perhaps the program-counter-to-function abstraction is hiding something interesting...
Collect more information
What more could you learn if you collected not only the program counter of the function interrupted by the timer, but also the return address in the enclosing function...?

Additional Guidance

Tell us about interesting failures.
If you try an optimization and it doesn't work out, it may well make sense for you to briefly report on what you tried and why you suspect it didn't work. Negative results are often instructive.
This isn't a stealth COW project.
If the most bang for the buck would clearly come from implementing COW, and you don't feel you have the time to do that and get it stable, don't. Pick something which isn't the most bang for the buck and try it instead, and/or strengthen your analysis a bit. There is at least one COW-subset thing you should consider.
Consider other measurements.
PC-sampling won't bring to light all inefficiencies. Might your kernel contain a structural shortcut which might be invisibly slowing down lots of instructions?

[Last modified Wednesday April 20, 2005]