Space-Efficient Scheduling of Parallelism

SCANDAL Project

Many of today's high level parallel languages support dynamic, fine-grained parallelism. The language implementation typically requires an efficient scheduling algorithm to assign computations to processors at runtime. However, in an attempt to expose a sufficient degree of parallelism to keep all processors busy, schedulers often create many more parallel threads than necessary, leading to excessive memory usage. Further, the order in which the threads are scheduled can greatly affect the total size of the live data at any instance during the parallel execution, and unless the threads are scheduled carefully, the parallel execution of a program may require much more memory than its serial execution.

This research involves finding theoretically space- and time-efficient scheduling algorithms for multithreaded programs, including programs with nested parallelism and synchronization variables. In addition to proving the space and time bounds of the resulting parallel schedules, we demonstrate that they are efficient in practice. We have implemented a runtime system that uses our algorithm to schedule threads in computations with nested parallelism.

Recently, we added this space-efficient scheduling technique to the native Pthreads library on Solaris, and studied the performance of 7 dynamic or irregular parallel benchmarks written using Pthreads We show how simple, yet provably good modifications to the library can result in significantly improved space and time performance. (click here for the paper). With the modified Pthreads library, each of the parallel benchmarks performs as well as its coarse-grained, hand-partitioned counterpart. These results demonstrate that, provided we use a good scheduler, the rich functionality and standard API of Pthreads can be combined with the advantages of dynamic, lightweight threads to result in high performance.

Publications

PThreads for Dynamic and Irregular Parallelism
SC98: High Performance Networking and Computing, November 1998. CMU-CS-98-114 is a longer version.
Space-Efficient Scheduling of Parallelism with Synchronization Variables
ACM Symposium on Parallel Algorithms and Architectures, June 1997.
Space-Efficient Implementation of Nested Parallelism
Proceedings of the 6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, June 1997.
A Framework for Space and Time Efficient Scheduling of Parallelism
CMU-CS-96-197, December 1996.
A Provable Time and Space Efficient Implementation of NESL
ACM International Conference on Functional Programming, May 1996.
Provably Efficient Scheduling for Languages with Fine-Grained Parallelism
ACM Symposium on Parallel Algorithms and Architectures, July 1995.

Girija Narlikar

Last modified: Mon Aug 31 11:36:19 EDT 1998