Special Seminar

  • Gates Hillman Centers
  • ASA Conference Room 6115

Jonathan Ragan-Kelley

  • Assistant Professor of Computer Science
  • Department of Electrical Engineering & Computer Science
  • University of California at Berkeley

Organizing Computation for High-Performance Visual Computing

Future visual computing applications—from photorealistic real-time rendering, to 4D light field cameras, to pervasive sensing and computer vision—demand orders of magnitude more computation than we currently have. From data centers to mobile devices, performance and energy scaling are limited by locality (how far data must move, whether from nearby caches, from distant main memory, or across networks) and by parallelism. Because of this, I argue that we should think of the performance and efficiency of an application as determined not just by the algorithm and the hardware on which it runs, but critically also by the organization of its computations and data. For algorithms with the same complexity—even the exact same set of arithmetic operations and data—the order and granularity of execution and the placement of data can easily change performance by an order of magnitude on the same hardware, because of locality and parallelism. To extract the full potential of our machines, we must treat the organization of computation as a first-class concern, while working across all levels, from algorithms and data structures, to compilers, to hardware.
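The claim above—that the same arithmetic, reorganized, can run very differently—can be sketched with a toy example (illustrative only, not from the talk): summing a 2D grid row-by-row visits memory contiguously, while summing column-by-column strides across rows. Both perform exactly the same additions, yet on large arrays the traversal order alone can change running time substantially on cache-based hardware.

```python
def make_grid(rows, cols):
    """A rows x cols grid stored as nested lists (row-major layout)."""
    return [[r * cols + c for c in range(cols)] for r in range(rows)]

def sum_row_major(grid):
    total = 0
    for row in grid:            # walk each row contiguously: good locality
        for x in row:
            total += x
    return total

def sum_col_major(grid):
    total = 0
    rows, cols = len(grid), len(grid[0])
    for c in range(cols):       # stride across rows: same ops, worse locality
        for r in range(rows):
            total += grid[r][c]
    return total

grid = make_grid(512, 512)
# Identical operations and data, identical result; only the order differs.
assert sum_row_major(grid) == sum_col_major(grid)
```

How large the gap is depends on the hardware and working-set size; the point is only that the organization, not the operation count, is what changed.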

This talk will present facets of this philosophy in systems I have built for visual computing applications spanning image processing and vision, 3D rendering, simulation, optimization, and machine learning. I will show that, for the data-parallel pipelines common in graphics, imaging, and other data-intensive applications, the organization of computations and data for a given algorithm is constrained by a fundamental tension between parallelism, locality, and redundant computation of shared values. I will focus particularly on the Halide language and compiler, which explicitly separates the computations that define an algorithm from the choices of organization that determine parallelism, locality, and synchronization. I will show how this approach can enable much simpler programs to deliver performance often many times faster than the best prior implementations, while scaling across radically different architectures, from ARM phones to massively parallel GPUs, FPGAs, and custom ASICs.
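The tension between locality and redundant computation of shared values can be made concrete with a small sketch (hypothetical Python, not the Halide API): a two-stage 1D pipeline (a blur of a blur) is written once as the algorithm, while two different organizations compute it—breadth-first, which materializes the whole intermediate stage (no redundant work, but a large working set), and fused, which recomputes intermediate values per output point (better locality, some redundant arithmetic). Both produce identical results.

```python
def inp(x):
    return float(x * x % 17)            # arbitrary input signal

def blur1(x):                           # stage 1 of the algorithm
    return (inp(x - 1) + inp(x) + inp(x + 1)) / 3

def blur2(x):                           # stage 2, defined in terms of stage 1
    return (blur1(x - 1) + blur1(x) + blur1(x + 1)) / 3

def pipeline_breadth_first(n):
    # Materialize all of stage 1 first: no recomputation, poor locality.
    tmp = [blur1(x) for x in range(-1, n + 1)]   # tmp[i] == blur1(i - 1)
    return [(tmp[x] + tmp[x + 1] + tmp[x + 2]) / 3 for x in range(n)]

def pipeline_fused(n):
    # Inline stage 1 into stage 2: each blur1 value is recomputed up to
    # three times, but nothing is stored between stages.
    return [blur2(x) for x in range(n)]

n = 32
assert pipeline_breadth_first(n) == pipeline_fused(n)  # same algorithm
```

In Halide itself this choice is a one-line scheduling decision, made without touching the algorithm; here the two organizations had to be written out by hand, which is exactly the coupling the language removes.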

Jonathan Ragan-Kelley is an assistant professor of Computer Science at UC Berkeley. He works on high-efficiency visual computing, including systems, compilers, and architectures for image processing, vision, 3D rendering, simulation, and machine learning. He was previously a visiting researcher at Google, a postdoc in Computer Science at Stanford, and earned his PhD in Computer Science from MIT in 2014, where he built the Halide language for high-performance image processing. Halide is used throughout industry to process billions of images every day, from data centers to billions of smartphones. Jonathan previously built the Lightspeed preview system, which was used on over a dozen films at Industrial Light & Magic and was a finalist for an Academy technical achievement award, and he worked in GPU architecture, compilers, and research at NVIDIA, Intel, and ATI.
