Carnegie Mellon
Computer Science Department |
|
|
|
|
|
|
|
|
|
|
15-410 Project 4
Overview
This semester we will try something a little different
for Project 4: profiling. You will instrument your kernel,
measure its run-time behavior, and evaluate an improvement.
Project 4 is due Wednesday, April 27th, at 23:59.
When planning your work, keep in mind that the book
report and the final homework assignment will also
be due on the last day of classes.
Note that P4 grades will probably not be returned before
the final exam; in the other direction, the exam will
not test you on P4 material as such (for that reason
and because not all groups will work on P4).
Profiling
Profiling is the craft of figuring out where the time
goes when a system is running. Frequently, profiling a
system in even a straightforward way reveals at least
one performance surprise. This means that attempts
to improve performance which aren't based on data
are likely to waste effort speeding up parts of the
program which don't use much time.
Experience over the years
has given rise to the
following words of wisdom:
Rules of Optimization
- Rule 1. Don't do it.
- Rule 2. (for experts only) Don't do it yet.
--"Principles of Program Design", M.A. Jackson
Approach
BEFORE YOU BEGIN: Fill out the
p4guess form. Save your
guesses somewhere for your final report.
Modify your timer interrupt handler to lprintf()
information about what was running before the interrupt arrived.
The most straightforward piece of information to record is the
program counter (%eip ) value. But you may end up
recording more information than that, or recording some
abstraction of the
program counter value in some cases.
Use the nm program to obtain a symbol table
listing of your kernel.
Write a small program (in C, C++, Perl, Python, Java, ML,
Ruby, forth, SNOBOL, Scheme,
or some other reasonable language)
to post-process your kernel.out
file, extracting the program counter values, comparing them
to the symbol table, and emitting a table of how many times
each kernel function was interrupted by a clock interrupt.
Also compute what percentage of execution time is spent in
kernel mode versus user mode.
Profile your kernel running various tests. We recommend
at least
cho,
cho2,
make_crash,
and
fork_wait_bomb,
though we encourage you to experiment to see if other tests
produce "interesting" results.
Based on your data, suggest a modification to
your kernel which would improve its performance noticeably
without being too difficult to implement.
Implement the change, re-run at least the most relevant
tests, present your data, and comment.
Deliverables
BEFORE YOU BEGIN: Fill out the
p4guess form. Save your
guesses somewhere for your final report.
Your modified kernel tree, in p4
diff -rc output showing the changes you made
to your kernel to instrument it, called p4/report/instrument.diff
source to your post-processing program, stored appropriately in
p4/report (e.g., p4/report/analyze.4th)
diff -cr of your performance boost, in
p4/report/performance.diff
Brief writeup, p4/report/discussion.pdf, containing
- brief explanation of your profiling setup (e.g.,
sampling rate, any special data collected),
- table of performance results,
- brief discussion of results (Did you guess right?
Is time being taken up by the right things?)
- table of new performance results
- brief discussion of new results
Aside from tables, we expect your discussion to occupy
one to two pages of text. If feel you need more space,
that's fine, though you should probably touch base with
us if you feel you need more than eight pages.
Guidance
This project will not be graded in terms of who
achieves the highest percentage performance improvement.
Instead, we will be looking for ways in which you
demonstrate prediction based on reasoning,
measurement design, and analysis of your findings.
Explain why you think the code change you make
should be done, why and how much you think it will
affect performance,
and then discuss what happens when you try it.
Possible Extensions
- Examine data more closely
- Look over your raw data a bit. Perhaps the
program-counter-to-function abstraction is hiding
something interesting...
- Collect more information
- What more could you learn if you collected
not only the program counter of the function
interrupted by the timer,
but also the return address in the enclosing
function...?
Additional Guidance
- Tell us about interesting failures.
- If you try an optimization and it doesn't work out,
it may well make sense for you to briefly report on
what you tried and why you suspect it didn't work.
Negative results are often instructive.
- This isn't a stealth COW project.
- If the most bang for the buck would clearly
come from implementing COW, and you don't feel you
have the time to do that and get it stable, don't.
Pick something which isn't the most bang for the
buck and try it instead, and/or strengthen your
analysis a bit.
There is at least one COW-subset thing
you should consider.
- Consider other measurements.
- PC-sampling won't bring to light all inefficiencies.
Might your kernel contain
a structural shortcut which might
be invisibly slowing down lots of instructions?
|