Communications of the ACM, 39(3), March, 1996.

This article uses NESL [5] as an example of a nested data-parallel language. This section gives an overview of the language and Section 3 gives several examples of parallel algorithms described and analyzed with NESL. NESL was designed to express nested parallelism in a simple way with a minimum set of structures and was therefore designed as a language on its own rather than as an extension of an existing sequential language. The ideas, however, can clearly be used in other languages. NESL is loosely based on ML [19], a language with a powerful type system, and on SETL [22], a language designed for concisely expressing sequential algorithms. As with ML, NESL is mostly functional (has only limited forms of side effects), but this feature is tangential to the points made in this article.

NESL supports data-parallelism by means of operations on sequences--one
dimensional arrays. All elements of a sequence must be of the same
type, and sequence indices are zero based (`a[0]` extracts the
first element of the sequence `a`). The main data-parallel
construct is *apply-to-each*, which uses a set-like notation.
For example, the expression

{a * a : a in [3, -4, -9, 5]};

squares each elements of the sequence `[3, -4, -9, 5]`
returning the sequence `[9, 16, 81, 25]`. This can be read:
``in parallel, for each `a` in the sequence `a`''. The apply-to-each can be used over multiple
sequences. The expression

{a + b : a in [3, -4, -9, 5]; b in [1, 2, 3, 4]};

adds the two sequences elementwise returning `[4, -2, -6, 9]`.
The apply-to-each construct also provides the ability to subselect
elements of a sequence based on a filter. For example

{a * a : a in [3, -4, -9, 5] | a > 0};

can be read: ``in parallel, for each `a` in the
sequence `a` is greater than `0`,
square `a`''. It returns the sequence `[9, 25]`. The
elements that remain maintain their relative order. This filtering
was used in the Quicksort example.

Any function, whether primitive or user defined, may be applied to each element of a sequence. So, for example, we could define

function factorial(n) = if (n == 1) then 1 else n*factorial(n-1);

and then apply it over the elements of a sequence as in

{factorial(i) : i in [3, 1, 7]};which returns the sequence

In addition to the parallelism supplied by apply-to-each, NESL
provides a set of functions on sequences, each of which can be
implemented in parallel. For example the function `sum` adds the
elements of a sequence, and the function `reverse` reverses the
elements of a sequence. Perhaps the most important function on
sequences is `write`, which supplies the only mechanism to modify
multiple values of a sequence in parallel. `write` takes two
arguments: the first is the sequence to modify and the second
is a sequence of integer-value pairs that specify what to modify.
For each pair `(i,v)` the value `v` is inserted into position
`i` of the destination sequence. For example

write([0, 0, 0, 0, 0, 0, 0, 0], [(4,-2),(2,5),(5,9)]);

inserts the `-2`, `5` and `9` into the
sequence at locations 4, 2 and 5, respectively, returning

[0, 0, 5, 0, -2, 9, 0, 0] .If an index is repeated, then one value is written non-deterministically. For readers familiar with the variants of the PRAM model, we note that the

Nested parallelism is supplied in NESL by allowing sequences to be
nested and allowing parallel functions to be used in an apply-to-each.
For example, we could apply the `sum` function in parallel over a
nested sequence, as in

{sum(a) : a in [[2,3], [8,3,9], [7]]};which would return

Guy Blelloch, blelloch@cs.cmu.edu