Parallelism in Sequential Functional Languages
Guy Blelloch and John Greiner
Proceedings of the Symposium on Functional Programming and Computer
Architecture. June, 1995.
This paper formally studies the question of how much parallelism is
available in call-by-value functional languages with no parallel
extensions (i.e., the functional subsets of ML or Scheme). In
particular we are interested in placing bounds on how much parallelism
is available for various problems. To do this we introduce a
complexity model, the PAL, based on the call-by-value lambda-calculus.
The model is defined in terms of a profiling semantics and measures
complexity in terms of the total work and the parallel depth of a
computation. We describe a simulation of the A-PAL (the PAL extended
with arithmetic operations) on various parallel machine models,
including the butterfly, hypercube, and PRAM models and prove
simulation bounds. In particular the simulations are work-efficient
(the processor-time product on the machines is within a constant
factor of the work on the A-PAL), and for P processors the slowdown
(time on the machines divided by depth on the A-PAL) is proportional
to at most O(log P). We also prove bounds for simulating the PRAM on
the A-PAL.
Based on the model, we describe and analyze tree-based versions of
quicksort and mergesort. We show that for an input of size n these
algorithms run on the A-PAL model with O(n log n) work and O(log^2 n)
depth (expected case for quicksort).