CMU 15-418 (Spring 2012)
Final Project Information
Your 15-418 final project gives you the opportunity to dive deeply
into a fun parallel systems problem of your choosing. One way to do
this is to design a parallel solution to a problem in an application
area that is interesting to you. Projects you have done in other
classes are a good source of ideas. For example, projects in machine
learning, AI, graphics, computational photography, and computer vision
often stand to benefit greatly from parallelization.
Other project ideas focus on system design or workload evaluation. For
example, a project might compare the performance of CPU and GPU
implementations of a similar problem, and describe which platform is
better suited for the job. You could simulate the behavior of code on
machines with different SIMD widths, add a feature to the ISPC
compiler (its implementation is open source), or develop a parallel
debugging tool that helps visualize bottlenecks and performance in
Requirements and Deadlines
Proposal (due Monday, April 2nd)
Create a web page for your project. The purpose of the proposal is
two-fold: (1) it forces you to organize your thoughts about your
project (2) it gives 15-418 course staff the ability to verify your
plans are of the right scope given our expectations (it also gives us
the ability to offer suggestions and help).
A template just your project web page is
provided here. You do not need
to use the template your your project page should contain the same
sections and same content.
Your proposal should include the following parts:
SUMMARY. Summarize your project in
a few sentences. Describe what you plan to do and what parallel
systems you will be working with. Example one-liners include (you
should add a bit more detail):
- We are going to implement an optimized Smoothed Particle
Hydrodynamics fluid solver on the NVIDIA GPUs in the lab.
- We are going port the Go runtime to Blacklight.
- We are going to create optimized implementations of
sparse-matrix multiplication on both GPU and multi-core CPU
platforms, and perform a detailed analysis of both systems'
- We are going to back-engineer the unpublished machine
specifications of the GPU in the tablet my partner just
- BACKGROUND. If your project
involves accelerating a compute-intensive application, describe
the application or piece of the application you are going to
implement in more detail. This description should be 1-2
paragraphs. It might be helpful to include a block diagram or
pseudocode of the basic idea. What aspects of the problem might
benefit from parallelism?
- THE CHALLENGE. Describe why the
problem is challenging. What aspects of the problem might make
it difficult to parallelize?
- Describe the workload: what are the
dependencies, what are its memory access characteristics? (is there
locality? is there a high communication to computation ratio?), is
there divergent execution.
- Describe constraints: What are the properties of the
system that make mapping the workload to it challenging.
- RESOURCES. Describe the resources
you will use. What code base will you start from? Are you
starting from scratch or using an existing piece of code? Are
there any other resources you need, but haven't figured out how to
GOALS/DELIVERABLES. Describe the
deliverables or goals of your project. Separate your goals
into what you PLAN TO ACHIEVE (what you believe you must get done to
have a successful project) and an extra goal or two that you HOPE TO
ACHIEVE if the project goes really well and you get ahead of
schedule. It may not be possible to state precise performance goals
at this time, but we encourage you be as precise as possible. If
you do state a goal, give some justification of why you think you
can achieve it.
- If applicable, describe the demo you plan to show at the
parallelism computation (will it be an interactive demo on a
certain problem size, will you show an output of the program
that is really neat, will you show speedup graphs reaching a
certain performance level?)
- If your project is an analysis project, what are you hoping to
learn about the workload or system being studied? What
question(s) do you plan to answer?
Systems project proposals should describe what the system will
be capable of and what performance is hoped to be achieved.
- PLATFORM CHOICE. Describe why the
platform (computer and/or language) you have chosen is a good one for
your needs. Why does it make sense to use this parallel system
for the workload you have chosen?
- SCHEDULE. Produce a schedule for
your project by filling out the following table. List what you
plan to get done each week from now until May 10th in order to meet
your project goals. Keep in mind that due to other classes, you'll
have more time to work some weeks than others (work that into the
schedule). You will need to re-evaluate your progress at the end of
each week and update this schedule accordingly. Note the intermediate
checkpoint deadline is April 20th.
In your schedule we encourage you to be precise as precise as
possible. It's often helpful to work backward from your deliverables
and goals, writing down all the little things you'll need to do
(establish the dependencies!).
||What We Plan To Do
||What We Actually Did
|Apr 1-7|| || |
|Apr 8-14|| || |
|Apr 15-21|| || |
|Apr 22-28|| || |
|Apr 29-May 5|| || |
|May 6-11|| || |
Checkpoint (due Friday, April 20)
The following are suggestions for information to include in your
checkpoint write-up. Your goal in the writeup is to assure the course
staff that your project is proceeding as you said it would in your
proposal. If it is not, your checkpoint writeup should emphasize what
has been causing you problems, and provide an adjust schedule and
goals. As projects differ, not all items in the list below are
relevant to all projects.
- Make sure your project schedule on your main project page is up
to date with work completed so far, and well as with a revised plan
of work for the coming weeks. As by this time you should have a good
understanding of what is required to complete your project, I want
to see a very
detailed schedule for the coming weeks. I suggest breaking
time down into half-week increments. Each increment should have
multiple tasks, and for each task put a person's name on it. An
example skeleton schedule is below.
- In a couple of paragraphs, summarize the work that you have
completed so far. (This should be easy if you have been maintaining
this information on your project page.)
- Describe how you are doing with respect to the goals and
deliverables stated in your proposal. Do you still believe you will
be able to produce all your deliverables? If not, why? What about
the "nice to haves"? In your checkpoint writeup we want a new list
of goals that you plan to hit for May 10th.
- What do you plan to show at the parallelism competition? Will it
be a demo? Will it be a graph?
- Do you have preliminary results at this time? If so, it would
be great to included them in your checkpoint write-up.
- List the issues that concern you the most. Are there any
remaining unknowns (things you simply don't know how to solve, or
resource you don't know how to get) or is it just a matter of coding
and doing the work? If you do not wish to put this information on
a public web site you are welcome to email the staff directly.
- [Optionally] schedule a meeting with the course staff to discuss
Example Checkpoint 2 Schedule
||What We Plan To Do
||What We Actually Did
|Mon 4/23-Thu 4/26|| || |
|Fri 4/27-Sun 4/29|| || |
|Mon 4/30-Thu 5/3|| || |
|Fri 5/4-Sun 5/6|| || |
|Mon 5/7-Thu 5/10|| || |
Presentation (Thursday May 10, 8:30-11:30AM,
Scaife Hall Room 125)
Suggestions and expectations for project presentations were
discussed in class. Please see the posted slides
from Lecture 26.
Final Write Up (due Friday May 11, 11:59PM)
Your proposal should include the following basic sections. Not all
the sub-bullets apply to all projects, but they are given as
examples/suggestions of issues to address. You are also encouraged to
provide more detail if you wish. Note that some of the information in
your final writeup can be pulled directly from your proposal if it is
- SUMMARY. A two to three
sentence project summary. The summary should list your project
deliverables and what machines they ran on.
- Example: We implemented smooth particle hydrodynamics in
CUDA on the GPU and in ISPC on the CPU and compared the performance
of the two implementations.
- Example: We parallelized a chess bot. Our 64 core
implementation on Blacklight achieves a 40x speedup and won
several games on an internet chess server.
- Example: We accelerated image processing operations using the
GPU. Given the speed of our implementation, we demonstrate that a
brute-force approach to breaking CAPTCHAS is effective.
- BACKGROUND. Describe the
algorithm, application, or system you parallelized in computer science
terms. (Recall our discussion from class.) A figure would be really
- What are the key data structures?
- What are the key operations on these data structures?
- What are the algorithm's inputs and outputs?
- What is the part that computationally expensive and could benefit from parallelization?
- Break down the workload. Where are the dependencies in the
program? How much parallelism is there? Is it data-parallel?
Where is the locality? Is it amenable to SIMD execution?
- APPROACH. Tell us how your
implementation works. Your description should be sufficiently
detailed to provide the course staff a basic understanding of your
approach. Again, it might be very useful to include a figure here
illustrating components of the system and/or their mapping to parallel
- Describe the technologies used. What language/APIs? What
machines did you target?
- Describe how you mapped the problem to your target parallel
machine(s). IMPORTANT: How do the data structures and
operations you described in part 2 map to machine concepts like
cores and threads. (or warps, thread blocks, gangs, etc.)
- Did you change the original serial algorithm to enable better
mapping to a parallel machine?
- If your project involved many iterations of optimization,
please describe this process as well. What did you try that did
not work? How did you arrive at your solution? The notes
you've been writing throughout your project should be helpful
here. Convince us you worked hard to arrive at a good
- If you started with an existing piece of code, please mention
it (and where it came from) here.
- RESULTS. How successful
were you at achieving your goals? We expect results sections to
differ from project to project, but we expect your evaluation to be very
thorough (your project evaluation is a great way to demonstrate
you understood topics from this course). Here are a few ideas:
REFERENCES. Please provide
a list of references used in the project.
LIST OF WORK BY EACH
STUDENT. If your project is a team project, please list the
work performed by each partner. If you do not feel comfortable
placing this information on a public web page, you may email the
course staff this information directly.
- If your project was optimizing an algorithm, please define
how you measured performance. Is it wall-clock time? Speedup?
An application specific rate? (e.g., moves per second,
- Please also describe your experimental setup. What were the
size of the inputs? How were requests generated?
- Provide graphs of speed up. Please precisely define the
configurations being compared. Is your baseline single-threaded
CPU code? It is an optimized parallel implementation for a
- Recall the importance of problem size. Is it important to
report results for different problem sizes for your project? Do
different workloads exhibit different execution behavior?
- IMPORTANT: What limited your speedup? Is it a lack of
parallelism? (dependencies) Communication or synchronization
overhead? Data transfer (memory-bound or bus transfer
bound). Poor SIMD utilization due to divergence? As you try and
answer these questions, we strongly prefer that you provide data
and measurements to support your conclusions. If you are merely
speculating, please state this explicitly. Performing a solid
analysis of your implementation is a good way to pick up credit
even if your optimization efforts did not yield the performance
you were hoping for.
- Deeper analysis: Can you break execution time of your
algorithm into a number of distinct components. What percentage
of time is spent in each region? Where is there room to
- Was your choice of machine target sound? (If you chose a GPU,
would a CPU have been a better choice? Or vice versa.)
Project Ideas / Brainstorming
- Game playing: Chess, Go, etc.
Graphics: extend assignment 2 to achieve higher performance under real
workloads (render real triangle meshes and more complex shading
functions); implement a parallel ray tracer
- Physical simulation: fluid simulation, rigid body solver, cloth
simulation (for those that have taken Prof. Treuille's class)
Annotate the compiled ISPC code with calls to CPU performance monitor
instructions and then gather interesting statistics about program
execution: cache hits/misses, IPC (and scalar IPC and vector IPC
separately). Create a visualization tool for the results. (from Matt Pharr)
For the really brave: Add polymorphic functions to ISPC: implement a
function template mechanism in the compiler since it gets to be
painful to write multiple versions of functions with both uniform and
varying parameter types. (from Matt Pharr).
- Image processing (for those that have taken Efros' computational
- Vision: detect features then train an SVM classifier used for
high-quality image search. (note this is a general description but
there's a specific algorithm vision researchers here at CMU have in
mind, they'd love to have someone try to accelerate it... starter code
Vision: image/video tagging (cast as an integer programming problem).
Another research project looking for a faster solution. MATLAB starter
- Study a workload's amenability to SIMD execution. Simulate
behavior given multiple SIMD widths.
- Implement a sparse matrix equation solver.
Projects from previous years:
Note many of the project pages have disappeared since students have
graduated, but the titles should give you a good indication of the
flavor of the projects.