Scandal Overview

The primary research interest of the Scandal group is the development of a portable, interactive environment for programming a wide range of supercomputers, such as the Connection Machine CM-5, Cray Y-MP C90, Cray T3D, and Intel Paragon. In particular, the goals of such an environment are: The intent is to make a dramatic reduction in the programmer time and effort required to develop efficient, portable parallel programs. Toward these goals, our research has taken two primary directions:

Parallel Languages

We have designed and fully implemented an applicative parallel language called NESL. NESL is intended to be used as a portable interface for programming a variety of parallel and vector supercomputers, and as a basis for designing parallel algorithms. Parallelism is supplied through a set of data-parallel constructs that manipulate collections of values. These constructs supply parallelism through:
  1. The ability to apply any function concurrently over each element of a collection.
  2. A set of parallel functions that operate on collections, such as summing the elements, reordering the elements, or subselecting the elements.
We have found that this sort of parallelism is very well suited for describing parallel algorithms.

NESL compiles to an intermediate machine-independent language called VCODE, and we have implemented highly optimized versions of VCODE for the Cray Y-MP C90, Connection Machine CM-2, and Encore Multimax, and prototype implementations for the Connection Machine CM-5 and clusters of workstations. We are currently working on a follow-up of NESL, which is based on a more rigorous type system, and includes some support for higher-order functions. As well as language design, we are very interested in compiler techniques, especially techniques that optimize the representation and layout of data on parallel computers.

Recently, we've been working on space-efficient scheduling for parallel languages, including Nesl and multithreaded systems like Posix threads (Pthreads).

Optimized Implementation of Parallel Algorithms

We have carefully compared and implemented various algorithms on different parallel machines. The idea is to look at what aspects of algorithms are important for practical implementations. For example we have studied the sorting problem extensively, and have designed and implemented the fastest existing sorting algorithm for the Connection Machines CM-2 and CM-5, Cray Y-MP C90, and iWarp. We have found that although these machines have significantly different architectures, similar algorithms can be used, and many of the same issues arise when trying to optimize the algorithms. One such algorithm is a parallel version of radix sort, which is the fastest sort for the CM-2, CM-5, and Cray Y-MP C90 in many situations (it can depend on the number and size of the keys). The Connection Machine versions of this sort have been installed as the system library sort delivered by Thinking Machines.

Other algorithms we have studied include algorithms for finding the convex-hull of a set of points, for finding the connected-components of a graph, for finding the union, intersection, and difference of ordered sets, and for solving irregular linear systems, such as required in modeling many physical phenomena. As well as algorithm design, we are very interested in studying existing theoretical algorithms and determining which ones can be mapped well onto existing parallel machines and communication topologies, and what aspects are important in getting efficient implementations.


Guy Blelloch.