From Programming Parallel Algorithms.
Communications of the ACM, 39(3), March, 1996.



Nested Data-Parallelism and NESL 

Many constructs have been suggested for expressing parallelism in programming languages, including fork-and-join constructs, data-parallel constructs, and futures, among others. The question is: which of these are most useful for specifying parallel algorithms? If we look at the parallel algorithms described in the literature and their pseudocode, we find that nearly all are described as parallel operations over collections of values. For example, ``in parallel for each vertex in a graph, find its minimum neighbor'', or ``in parallel for each row in a matrix, sum the row''. Of course the algorithms are not this simple--they usually consist of many such parallel calls interleaved with operations that rearrange the order of a collection, and they can be called recursively in parallel, as in Quicksort. This ability to operate in parallel over sets of data is often referred to as data-parallelism [15], and languages based on it are often referred to as data-parallel languages or collection-oriented languages [24]. We note that many parallel languages have data-parallel features in conjunction with other forms of parallelism [10, 3, 12, 18].
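The two quoted operations can be sketched as follows. This is a hedged illustration, not NESL code: Python comprehensions stand in for the ``apply to each element'' semantics, and the function names (min_neighbor, row_sums) and the adjacency-list graph representation are our own assumptions, not from the article. In a data-parallel language each iteration could execute in parallel.

```python
def min_neighbor(graph):
    # graph: dict mapping each vertex to a nonempty list of its neighbors
    # (an illustrative representation, not one prescribed by the article).
    # "In parallel for each vertex in a graph, find its minimum neighbor."
    return {v: min(nbrs) for v, nbrs in graph.items()}

def row_sums(matrix):
    # "In parallel for each row in a matrix, sum the row."
    # Each row's sum is independent of the others.
    return [sum(row) for row in matrix]
```

Both functions are flat data-parallel in spirit: a single level of parallel application of a sequential body over a collection.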

Before we come to the rash conclusion that data-parallel languages are the panacea for programming parallel algorithms, we make a distinction between flat and nested data-parallel languages. In flat data-parallel languages a function can be applied in parallel over a set of values, but the function itself must be sequential. In nested data-parallel languages [4] any function can be applied over a set of values, including parallel functions. For example, the summation of each row of the matrix mentioned above could itself execute in parallel using a tree sum. We claim that the ability to nest parallel calls is critical for expressing algorithms in a way that matches our high-level intuition of how they work. In particular, nested parallelism can be used to implement nested loops and divide-and-conquer algorithms in parallel (five out of the seven algorithms described in this article use nesting in a crucial way). The importance of allowing nesting in data-parallel languages has also been observed by others [13]. However, most existing data-parallel languages, such as High Performance Fortran (HPF) [14] or C* [21], do not have direct support for such nesting.
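The row-sum example of nesting can be sketched as follows. Again this is a hedged Python model of the semantics rather than NESL: the names tree_sum and nested_row_sums are our own, and sequential recursion stands in for parallel execution. The point is the structure: the outer call is parallel over rows, and each row's sum is itself a parallel (tree-structured) computation.

```python
def tree_sum(xs):
    # Sum a nonempty sequence as a balanced tree: sum adjacent pairs,
    # then recurse on the half-length result. All pairwise additions at
    # one level are independent, so a parallel machine could perform
    # them simultaneously, giving O(log n) parallel steps.
    if len(xs) == 1:
        return xs[0]
    pairs = [xs[i] + xs[i + 1] for i in range(0, len(xs) - 1, 2)]
    if len(xs) % 2 == 1:
        pairs.append(xs[-1])  # carry the odd element to the next level
    return tree_sum(pairs)

def nested_row_sums(matrix):
    # Nested data-parallelism: a parallel function (tree_sum) applied
    # in parallel over a collection of rows. A flat data-parallel
    # language would force the inner sum to be sequential.
    return [tree_sum(row) for row in matrix]
```

The same nesting pattern appears in divide-and-conquer algorithms such as Quicksort, where the two recursive calls are parallel functions that themselves run in parallel.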








Guy Blelloch, blelloch@cs.cmu.edu