Assistant Professor of Computer Science
Carnegie Mellon University
412-268-1234 (Smith Hall 225)
I architect new visual computing systems that enable increasingly immersive and more intelligent visual computing applications. In pursuit of these goals, my recent research efforts can be categorized into two main themes:
1. Designing new programming systems that facilitate rapid, or even automatic, generation of efficient renderer implementations for specific hardware configurations or application needs. Given knowledge of a virtual scene, a specific rendering task, and a parallel hardware platform, is it possible to generate an optimized graphics system specialized for this context?
2. Exploring visual computing platforms in the context of extreme mobility, cloud computing, and ubiquitous image sensors and displays. How do we build platforms that take graphics applications from one user on a single GPU to 10,000 machines and one million users in the cloud? What architectures will enable sophisticated image and video analysis on future devices?
I am currently working on (1) the renderer compiler, a system for exploring the wide space of algorithmic choices that go into a parallel renderer implementation, (2) exploring the use of large-scale cloud computing to realize high-fidelity interactive environments (a collaboration with Adrien Treuille and James O'Brien), and (3) the creation of an "infinite-capacity camera" to serve as a platform for real-time visual data analysis (a collaboration with Alexei Efros). More details to come!
I will be teaching 15-462/662: Computer Graphics with Keenan Crane in Fall 2015.
15-418/15-618: Parallel Computer Architecture and Programming (Spring 2012, 2013, 2014, 2015)
15-869: Visual Computing Systems (Fall 2013, Fall 2014)
15-869: Graphics and Imaging Architectures (Fall 2011)
Here are a few tips on how to give clear research talks (or class project talks).
Former students:
A System for Rapid, Automatic Shader Level-of-Detail
Yong He, Tim Foley, Natalya Tatarchuk, Kayvon Fatahalian
SIGGRAPH Asia 2015
Aggregate G-Buffer Anti-Aliasing
Cyril Crassin, Morgan McGuire, Kayvon Fatahalian, Aaron Lefohn
I3D 2015
Extending the Graphics Pipeline with Adaptive, Multi-Rate Shading
Yong He, Yan Gu, Kayvon Fatahalian
Self-Refining Games using Player Analytics
Matt Stanton, Ben Humberston, Brandon Kase, James O'Brien, Kayvon Fatahalian, Adrien Treuille
Near-exhaustive Precomputation of Secondary Cloth Effects
Doyub Kim, Woojong Koh, Rahul Narain, Kayvon Fatahalian, Adrien Treuille, James O'Brien
Efficient BVH Construction via Approximate Agglomerative Clustering
Yan Gu, Yong He, Kayvon Fatahalian, Guy Blelloch
High Performance Graphics 2013
SRDH: Specializing BVH Construction and Traversal Order Using Representative Shadow Ray Sets
Nicolas Feltman, Minjae Lee, Kayvon Fatahalian
High Performance Graphics 2012
Evolving the Real-Time Graphics Pipeline for Micropolygon Rendering
Kayvon Fatahalian, Stanford University Ph.D. Dissertation, 2011
Reducing Shading on GPUs using Quad-Fragment Merging
Kayvon Fatahalian, Solomon Boulos, James Hegarty, Kurt Akeley, William R. Mark, Henry Moreton, Pat Hanrahan
Space-Time Hierarchical Occlusion Culling for Micropolygon Rendering with Motion Blur
Solomon Boulos, Edward Luong, Kayvon Fatahalian, Henry Moreton, Pat Hanrahan
High Performance Graphics 2010
Hardware Implementation of Micropolygon Rasterization with Motion and Defocus Blur
John S. Brunhaver, Kayvon Fatahalian, Pat Hanrahan
High Performance Graphics 2010
A Lazy Object-Space Shading Architecture With Decoupled Sampling
Christopher A. Burns, Kayvon Fatahalian, William R. Mark
High Performance Graphics 2010
DiagSplit: Parallel, Crack-Free, Adaptive Tessellation for Micropolygon Rendering
Matthew Fisher, Kayvon Fatahalian, Solomon Boulos, Kurt Akeley, William R. Mark, Pat Hanrahan
SIGGRAPH Asia 2009
Data-Parallel Rasterization of Micropolygons with Defocus and Motion Blur
Kayvon Fatahalian, Edward Luong, Solomon Boulos, Kurt Akeley, William R. Mark, Pat Hanrahan
High Performance Graphics 2009
GRAMPS: A Programming Model for Graphics Pipelines
Jeremy Sugerman, Kayvon Fatahalian, Solomon Boulos, Kurt Akeley, Pat Hanrahan
Transactions on Graphics (TOG) January 2009
A Closer Look at GPUs
Kayvon Fatahalian and Mike Houston
Communications of the ACM. Vol. 51, No. 10 (October 2008)
(also published as "GPUs: A Closer Look", ACM Queue, March/April 2008)
A Portable Runtime Interface for Multi-level Memory Hierarchies
Mike Houston, Ji Young Park, Manman Ren, Timothy J. Knight, Kayvon Fatahalian, Alex Aiken, William J. Dally, Pat Hanrahan
PPOPP 2008
Compilation for Explicitly Managed Memory Hierarchies
Timothy J. Knight, Ji Young Park, Manman Ren, Mike Houston, Mattan Erez, Kayvon Fatahalian, Alex Aiken, William J. Dally, Pat Hanrahan
PPOPP 2007
Sequoia: Programming the Memory Hierarchy
Kayvon Fatahalian, Timothy J. Knight, Mike Houston, Mattan Erez, Daniel R Horn, Larkhoon Leem, Ji Young Park, Manman Ren, Alex Aiken, William J. Dally, Pat Hanrahan
Supercomputing 2006
Understanding the Efficiency of GPU Algorithms for Matrix-Matrix Multiplication
Kayvon Fatahalian, Jeremy Sugerman, Pat Hanrahan
Graphics Hardware 2004
Brook for GPUs: Stream Computing on Graphics Hardware
Ian Buck, Tim Foley, Daniel Horn, Jeremy Sugerman, Kayvon Fatahalian, Mike Houston, Pat Hanrahan
Precomputing Interactive Dynamic Deformable Scenes
Doug L. James and Kayvon Fatahalian
Real-Time Global Illumination of Deformable Objects
Undergraduate Senior Research Thesis (Carnegie Mellon University). 2003.
Advised by Doug James
A Real-Time Micropolygon Rendering Pipeline
In a few years, GPUs will have the compute horsepower to render scenes containing cinematic-quality surfaces in real time. Unfortunately, if they render these subpixel polygons (micropolygons) using the same techniques they use for large triangles today, GPUs will perform extremely inefficiently. Instead of trying to parallelize Pixar's Reyes micropolygon rendering system, we're taking a hard look at how the existing Direct3D 11 rendering pipeline, and GPU hardware implementations, must evolve to render micropolygon workloads efficiently in a high-throughput system. Changes to software interfaces, algorithms, and HW design are fair game! Slides describing what we've learned can be found in this SIGGRAPH course talk or in my dissertation: Evolving the Real-Time Graphics Pipeline for Micropolygon Rendering.
The Sequoia Programming Language ("Programming the Memory Hierarchy")
Sequoia is a hierarchical stream programming language that arose from the observation that expressing locality, not parallelism, is the most important responsibility of parallel application programmers in scientific/numerical domains. Sequoia presents a parallel machine as an abstract hierarchy of memories and gives the programmer explicit control over data locality and communication through this hierarchy using first-class language constructs (basically, Sequoia supports nested kernels and streams of streams). Sequoia programs have run on a variety of exposed-communication architectures such as clusters, the CELL processor, GPUs, and even supercomputing clusters at Los Alamos. The best way to learn about Sequoia is to read our SC06 paper. You can also learn more at the Sequoia project page.
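To give a rough feel for the "hierarchy of memories" idea, here is a minimal sketch in Python. It is not Sequoia syntax and not how Sequoia is implemented; the function name `hier_sum` and the `capacities` list (a made-up description of how many elements each memory level can hold, outermost first) are assumptions for illustration only. Each level splits its working set into blocks sized for the level below and explicitly copies each block down before recursing.

```python
def hier_sum(data, capacities):
    """Sum `data` by recursively blocking it for each level of a memory hierarchy.

    `capacities` is a hypothetical machine description: the number of elements
    each successive (smaller, closer) memory level can hold, outermost first.
    """
    if not capacities:
        # Innermost case: the working set is assumed resident in the smallest
        # memory level, so we just compute on it directly.
        return sum(data)

    block = capacities[0]
    total = 0
    for start in range(0, len(data), block):
        # The slice stands in for an explicit copy of one block into the
        # child memory level; communication happens only at this boundary.
        local = data[start:start + block]
        total += hier_sum(local, capacities[1:])
    return total


# Hypothetical two-level machine: 1024-element blocks, then 64-element blocks.
values = list(range(5000))
total = hier_sum(values, [1024, 64])
```

The point of the sketch is that the decomposition is written once and the block sizes are a property of the machine description, so the same code can be retargeted by changing `capacities`.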
GRAMPS (a framework for heterogeneous parallel programming)
There are two ways to think about GRAMPS. Graphics folks should think of GRAMPS as a system for building custom graphics pipelines. We simply gave up on adding more and more configurable knobs to existing pipelines like OpenGL/Direct3D and instead allow the programmer to programmatically define a custom pipeline with an arbitrary number of stages connected by queues. To non-graphics folks, GRAMPS is a stream programming system that embraces heterogeneity in the underlying architecture and anticipates streaming workloads that exhibit both regular and irregular (dynamic) behavior. The GRAMPS runtime dynamically schedules GRAMPS programs onto architectures containing a mixture of compute-optimized cores, generic CPU cores, and fixed-function processing units.
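The "stages connected by queues" model can be sketched in a few lines of Python. This is a toy illustration, not the GRAMPS API: the `Stage` class, the `run_pipeline` scheduler, and the two example stages are all hypothetical names, and the scheduler is a naive single-threaded loop rather than anything resembling the real runtime. It does show the two ingredients described above: a programmer-defined graph of stages and queues, and stages whose output volume depends on the data (irregular behavior).

```python
from collections import deque


class Stage:
    """Hypothetical pipeline stage: consumes from one queue, produces into another."""

    def __init__(self, name, fn, inq, outq):
        self.name = name
        self.fn = fn      # maps one input item to a list of zero or more outputs
        self.inq = inq
        self.outq = outq

    def runnable(self):
        return len(self.inq) > 0

    def step(self):
        item = self.inq.popleft()
        self.outq.extend(self.fn(item))


def run_pipeline(stages):
    """Naive scheduler: run any stage with pending input until every queue drains."""
    progress = True
    while progress:
        progress = False
        for stage in stages:
            if stage.runnable():
                stage.step()
                progress = True


# Three queues connect two stages: source -> split -> middle -> keep_even -> sink.
source, middle, sink = deque(range(4)), deque(), deque()

# Irregular stage: input x produces x copies of itself (0 produces nothing).
split = Stage("split", lambda x: [x] * x, source, middle)
# Filtering stage: odd items are dropped entirely.
keep_even = Stage("keep_even", lambda x: [x] if x % 2 == 0 else [], middle, sink)

run_pipeline([split, keep_even])
result = list(sink)
```

Because the amount of work flowing through `middle` is only known at run time, a real scheduler has to decide dynamically which stage to run where, which is the problem the runtime described above addresses.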
While at Stanford, I helped out with the BrookGPU (abstracting the GPU as a stream processor for numerical computing) and Merrimac Streaming Supercomputer projects.
My work is supported by the National Science Foundation and by Intel, NVIDIA, Qualcomm, Google, and Apple.