The primary research interest of the Scandal
group is the development of a portable, interactive environment for
programming a wide range of supercomputers, such as the Connection
Machine CM-5, Cray Y-MP C90, Cray T3D, and Intel Paragon. In
particular, the goals of such an environment are:
- To supply a very high-level programming language.
- To be well suited for computationally intensive applications,
especially those with irregular data structures.
- To allow the same source code to be used on supercomputers with
significantly different architectures.
- To generate target code for these machines that is close to
the efficiency of hand-optimized code designed for a specific machine.
The intent is to make a dramatic reduction in the programmer time and
effort required to develop efficient, portable parallel programs.
Toward these goals, our research has taken two primary directions: the
design of a portable parallel language, and the optimized implementation
of parallel algorithms.
We have designed and fully implemented an applicative parallel language
called NESL. NESL is intended to be
used as a portable interface for programming a variety of parallel and
vector supercomputers, and as a basis for designing parallel algorithms.
Parallelism is supplied through a set of data-parallel
constructs that manipulate collections of values. These constructs
supply parallelism through:
- The ability to apply any function concurrently over each element of
a collection.
- A set of parallel functions that operate on collections, such as
summing the elements, reordering the elements, or subselecting the
elements of a collection.
We have found that this sort of parallelism is very well suited for
describing parallel algorithms.
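To make these two kinds of constructs concrete, here is a minimal sequential sketch in Python; the function names below are illustrative stand-ins for the operations described above, not NESL's own syntax (NESL expresses the first as an apply-to-each comprehension):

```python
def apply_to_each(f, xs):
    # First kind of construct: apply a function over each element of a
    # collection (done concurrently in the data-parallel setting).
    return [f(x) for x in xs]

def plus_scan(xs):
    # A typical parallel collection function: exclusive prefix sum.
    out, acc = [], 0
    for x in xs:
        out.append(acc)
        acc += x
    return out

def pack(xs, flags):
    # Another collection function: subselect the elements of xs whose
    # corresponding flag is true.
    return [x for x, keep in zip(xs, flags) if keep]

xs = [3, 1, 4, 1, 5]
print(apply_to_each(lambda x: x * x, xs))          # [9, 1, 16, 1, 25]
print(sum(xs))                                     # summing: 14
print(plus_scan(xs))                               # [0, 3, 4, 8, 9]
print(pack(xs, [True, False, True, False, True]))  # [3, 4, 5]
```

Each of these operations takes work proportional to the collection's length but, on a parallel machine, can be applied across all elements at once.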
NESL compiles to an intermediate machine-independent language called
VCODE, and we have implemented
highly optimized versions of VCODE for the Cray Y-MP C90, Connection
Machine CM-2, and Encore Multimax, and prototype implementations for the
Connection Machine CM-5 and clusters of workstations. We are currently
working on a follow-up to NESL, which is based on a more rigorous type
system and includes some support for higher-order functions. In addition
to language design, we are very interested in compiler techniques,
especially techniques that optimize the representation and layout of
data on parallel computers.
Recently, we have been working on space-efficient scheduling for parallel
languages, including NESL, and for multithreaded systems such as POSIX
threads (Pthreads).
Optimized Implementation of Parallel Algorithms
We have implemented and carefully compared various algorithms on
different parallel machines, with the aim of identifying which aspects of
algorithms are important for practical implementations. For example, we
have studied the sorting problem extensively, and have designed and
implemented the fastest existing sorting algorithm for the Connection
Machines CM-2 and CM-5 and the Cray Y-MP C90.
We have found that although these machines have significantly different
architectures, similar algorithms can be used, and many of the same
issues arise when trying to optimize the algorithms. One such algorithm
is a parallel version of radix sort,
which is the fastest sort for the CM-2, CM-5, and Cray Y-MP C90 in many
situations (it can depend on the number and size of the keys). The
Connection Machine versions of this sort have been installed as the
system library sort delivered by Thinking Machines.
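To illustrate the structure of this algorithm, here is a sequential sketch of least-significant-digit radix sort in Python; in the parallel versions, the per-digit counts and offsets become histogram and scan (prefix-sum) operations across processors. The key width and digit size below are illustrative choices, not those of the actual implementations:

```python
def radix_sort(keys, bits=16, radix_bits=4):
    # Sort non-negative integer keys of at most `bits` bits by processing
    # `radix_bits` bits per pass, least-significant digit first. Each pass
    # is a stable counting sort; the counts and offsets are exactly the
    # quantities computed with parallel histograms and scans.
    buckets = 1 << radix_bits
    for shift in range(0, bits, radix_bits):
        counts = [0] * buckets
        for k in keys:
            counts[(k >> shift) & (buckets - 1)] += 1
        # Exclusive prefix sum over the counts gives each bucket's
        # starting offset in the output.
        offsets, acc = [], 0
        for c in counts:
            offsets.append(acc)
            acc += c
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & (buckets - 1)
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
    return keys
```

Because each pass is stable, the passes compose: after the final pass the keys are fully sorted, and the total work is linear in the number of keys for fixed-width keys.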
Other algorithms we have studied include algorithms for finding the
convex hull of a set of points, for finding the connected components
of a graph, for finding the union, intersection, and difference of
ordered sets, and for solving irregular linear systems, such as those
required in modeling many physical phenomena. In addition to algorithm
design, we are very interested in studying existing theoretical
algorithms and determining which ones can be mapped well onto existing
parallel machines and communication topologies, and what aspects are
important in getting efficient implementations.
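As one concrete example of the ordered-set operations mentioned above, here is a sequential merge-based sketch in Python; the parallel versions would be built from merging primitives, and the function names here are illustrative:

```python
def ordered_union(a, b):
    # Union of two sorted lists (no duplicates within each), by merging.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i]); i += 1
        elif b[j] < a[i]:
            out.append(b[j]); j += 1
        else:
            out.append(a[i]); i += 1; j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def ordered_intersection(a, b):
    # Elements common to both sorted lists.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif b[j] < a[i]:
            j += 1
        else:
            out.append(a[i]); i += 1; j += 1
    return out

def ordered_difference(a, b):
    # Elements of sorted list a that do not appear in sorted list b.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i]); i += 1
        elif b[j] < a[i]:
            j += 1
        else:
            i += 1; j += 1
    out.extend(a[i:])
    return out

print(ordered_union([1, 3, 5], [2, 3, 6]))         # [1, 2, 3, 5, 6]
print(ordered_intersection([1, 3, 5], [2, 3, 6]))  # [3]
print(ordered_difference([1, 3, 5], [2, 3, 6]))    # [1, 5]
```

All three operations run in time linear in the combined input length, and their single merging loop is the part that a parallel implementation replaces with a parallel merge.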