MIME-Version: 1.0 Server: CERN/3.0 Date: Wednesday, 15-Jan-97 01:49:26 GMT Content-Type: text/html Content-Length: 3925 Last-Modified: Wednesday, 11-Sep-96 17:42:46 GMT
Alan Heirich and James Arvo
California Institute of Technology
Problem Statement. Photorealistic image synthesis requires a detailed simulation of photon transport through geometric environments. Monte Carlo methods have been applied to this problem in order to reduce the amount of computation (1). The resulting algorithms are concurrent and well suited to implementation on parallel computers. An effective implementation requires proper latency hiding and dynamic load balancing.
Diffusion has been discussed as a metaphor for concurrent computation since the early 1980's (2). In recent years numerous diffusion algorithms have been proposed to solve the problem of dynamic load balancing on parallel computers (3,4). Very recently a diffusion algorithm has been proposed to solve the closely related mapping problem (5). This proposal has antecedents in relaxation methods applied to problems in circuit layout.
In the current research we have investigated the utility of diffusion algorithms to the problems of dynamic load balancing and partitioning in Monte Carlo path tracing. We have implemented a Monte Carlo algorithm as a message driven concurrent pipeline, and have employed a diffusion algorithm to perform dynamic load balancing. We have designed a diffusion algorithm to partition complex geometric models among the processors of a parallel computer and have performed initial simulations to validate this approach.
Research Results. Initial results have shown better than 90% scaling efficiencies and we anticipate that these figures will improve (6). The implementation has been tested on up to 128 processors, and on platforms that include IBM SP1 and SP2 systems, networks of workstations, and uniprocessors. We have used the facilities of the Argonne HPCRF to measure scalability on up to 64 processors, and have used other SP installations for larger benchmarks. In these benchmarks we have discovered that the implementation does not require large amounts of network bandwidth, and that it quickly becomes compute bound for complex models. For example, the message traffic generated by 64 nodes of the ANL HPCRF was within the bandwidth provided by Ethernet. Although an experiment on 128 nodes at the Cornell Theory Center became bandwidth limited on Ethernet, when higher sampling rates were employed this phenomenon subsided.
Future Plans. We plan to continue to develop this implementation as well as to explore complementary rendering techniques (e.g., radiosity, finite element methods) that are amenable to parallel implementation. We intend to benchmark the code on a Beowulf cluster of 200 Mhz Pentium Pro computers under construction at the Caltech Center for Advanced Computing Research.
Funding. This work has been funded by the Cornell Program of Computer Graphics, the Caltech Center for Advanced Computing Research, and the NSF Center for Research in Parallel Computation.
References.
(1) Kajiya, J.T. The Rendering Equation. Proc. SIGGRAPH (1986).
(2) Dijkstra, E.W. & Scholten, C.S. Termination Detection for Diffusing Computations. Inf. Proc. Lett. (1980).
(3) Cybenko, G. Dynamic Load Balancing for Distributed Memory Multiprocessors. J. Par. Dist. Comp. (1989).
(4) Heirich, A. & Taylor, S. A Parabolic Load Balancing Method. Proc. Intl. Conf. Par. Proc. (1995).
(5) Heirich, A. A Scalable Diffusion Algorithm for Dynamic Mapping and Load Balancing on Networks of Arbitrary Topology. Intl. J. Found. Comp. Sci., to appear (1997).
(6) Heirich, A. & Arvo, J. Scalable Photorealistic Rendering of Complex Scenes. Proc. 1st Eurographics Workshop on Parallel Graphics and Visualization (1996).