CMU 15-418 (Spring 2012) Final Project:
Parallel Smoothed Particle Hydrodynamics
Christian Bruggeman, Robbie McElrath
Reports

Project Proposal

Checkpoint Report

Final Report

Working Schedule

Week | What We Plan To Do | What We Actually Did
Apr 1-7 | Familiarize with the existing codebase; get the base system building on the Gates machines; set up initial CUDA requirements | Fixed the makefile to build on most Linux machines; relearned code structure and organization
Apr 8-14 | Implement Goswami et al. algorithm using CUDA | Built CUDA parts of makefile; launched a CUDA kernel and have two renderers; implemented benchmarking and comparison code for deep analysis; restructured simulation code to allow for CUDA and sequential renderer
Apr 15-21 | Bugfixing of CUDA implementation; implement faster neighbor finding for sequential code | Implemented bucket creation and implemented the C part of the CUDA launch
Mon 4/23-Thu 4/26 | Implement the kernel generation and density computation functions - Robbie | Kernel generation implemented; spent time troubleshooting bucket generation
Fri 4/27-Sun 4/29 | Implement the force computation function and have a fully working CUDA implementation - Robbie | Implemented force and density functions; CUDA code complete but still with
Mon 4/30-Thu 5/3 | Implement an O(log n) neighbor-finding algorithm for sequential implementation - Christian | At the suggestion of Kayvon, implemented an ISPC version of the solver; began writeup
Fri 5/4-Sun 5/6 | Bugfix all prior implementations; begin writing analysis, describing work - Robbie | Worked on writeup and timings; generated better demonstration starting conditions
Mon 5/7-Thu 5/10 | Gather timings and finish writing results and analysis - Christian | Fixed all code bugs and refactored the implementation to improve performance for all three implementations; finished writing the paper and analysis

Working Log
April 14, 2012
Did quite a bit of work today:

Got a CUDA file to build. For now, it is just the CUDA file from the circle renderer.

We are still investigating how to dump frames for benchmarking without spawning a new window. We need this to measure and compare runtimes accurately and efficiently.

Most of the time went to re-architecting the base code to allow for multiple renderers and neighbor-finding algorithms. It is now possible to support arbitrary renderers and simulators and to select them at runtime from the command line.
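As a rough illustration of selecting a simulator at runtime, here is a minimal C++ sketch. It is not the project's actual code: the names Simulator, SequentialSimulator, CudaSimulator, make_simulator, parse_simulator_flag, and the --sim= flag are all hypothetical, and the step() bodies are placeholders. A renderer could be chosen the same way.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical common interface; the project's real class names are not given in this log.
class Simulator {
public:
    virtual ~Simulator() = default;
    virtual std::string name() const = 0;
    virtual void step(float dt) = 0;
};

class SequentialSimulator : public Simulator {
public:
    std::string name() const override { return "sequential"; }
    void step(float) override { /* placeholder: CPU reference path */ }
};

class CudaSimulator : public Simulator {
public:
    std::string name() const override { return "cuda"; }
    void step(float) override { /* placeholder: would launch CUDA kernels */ }
};

// Maps a command-line key to a factory, so adding a backend is one new table entry.
std::unique_ptr<Simulator> make_simulator(const std::string& key) {
    static const std::map<std::string,
                          std::function<std::unique_ptr<Simulator>()>> registry = {
        {"sequential", [] { return std::unique_ptr<Simulator>(new SequentialSimulator()); }},
        {"cuda",       [] { return std::unique_ptr<Simulator>(new CudaSimulator()); }},
    };
    auto it = registry.find(key);
    if (it == registry.end()) {
        throw std::invalid_argument("unknown simulator: " + key);
    }
    return it->second();
}

// Reads a hypothetical "--sim=<name>" flag; falls back to "sequential" if absent.
std::string parse_simulator_flag(int argc, const char* const* argv) {
    const std::string prefix = "--sim=";
    for (int i = 1; i < argc; ++i) {
        const std::string arg(argv[i]);
        if (arg.compare(0, prefix.size(), prefix) == 0) {
            return arg.substr(prefix.size());
        }
    }
    return "sequential";
}
```

With this layout, main would call make_simulator(parse_simulator_flag(argc, argv)) once and then drive the returned object through the shared interface, so the benchmarking code does not need to know which backend is running.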



April 15, 2012

Made significant progress in understanding the algorithm we plan to implement. We plan to start implementing the CUDA render path in the next few days.