
ZPL Overview


ZPL is a new array programming language designed for engineering and scientific programs that would previously have been written in Fortran. Because its design goals were machine independence and high performance, ZPL programs run fast on both sequential and parallel computers. Because it is "implicitly parallel," i.e. the programmer does NOT express the parallelism, ZPL programs are simple and easy to write.

On this page, ZPL is described at a high level by answering the "obvious" questions about it and its implementation. The page concludes with a ZPL fact sheet.


What is an array language? In scalar languages like Fortran, C, Pascal and Ada, operations apply only to single values, so a + b expresses the addition of two numbers. In such languages adding two arrays requires indexing and looping:

   DO 10 I = 1, N                        for (i = 0; i < n; i++) {
      DO 10 J = 1, N                         for (j = 0; j < n; j++) {
10       A(I,J) = A(I,J) + B(I,J)               a[i][j] = a[i][j] + b[i][j];
                                             }
                                         }

            FORTRAN 77                                  C

This need to loop and index to perform operations on arrays is both tedious and error prone.

In ZPL, operations are generalized to apply to both scalars and arrays. Thus, a + b expresses the sum of two scalars if a and b are declared as scalars, or the sum of two arrays if they are declared as arrays. When applied to arrays, the operations act on corresponding elements, as illustrated in the loops above. Indeed, when the ZPL compiler encounters the statement

	A := A + B;
and A and B are two-dimensional arrays, it generates code that is effectively the same as the C loops shown above. An array language, therefore, simplifies programming.

Why create a new array language? Programming languages are created all the time in computer science, but the widespread adoption of a new programming language is an extremely rare event. The recent adoption of object-oriented languages shows that it does occur from time to time. Nevertheless, the cardinal rule is, "If you want your research to be applied in practice, do not invent a new language." It is much better to extend or enhance an existing language that has an established user base.

Parallelism, in the opinion of ZPL's designers, is a phenomenon that cannot be fully exploited through the medium of existing programming languages, even existing array languages, such as Fortran 90 and APL [Greenlaw90]. Case after case demonstrates that effective parallel computations are typically accomplished through a "paradigm shift" away from sequential solutions. This shift, which is more frequently discontinuous than the term "shift" implies, is inhibited by languages designed for sequential computers.

A new language can avoid these problems and facilitate the paradigm shift. Further, by choosing its primitives carefully, new research in parallel compilation can apply. So, ZPL has been designed from first principles (see below) to realize these goals.

Isn't programming with arrays hard? Programmers unfamiliar with them may find array languages a little different initially, but technical programmers, meaning scientists, engineers, mathematicians, statisticians, etc., will generally find them natural. Indeed, many science and engineering problems are formulated in a way that is perhaps closer to array languages than scalar languages.

As a trivial example, consider the computation of the mean and standard deviation of a Sample of n items x1, x2, ..., xn. The textbook definitions of these quantities are:

    mu    = (x1 + x2 + ... + xn) / n

    sigma = sqrt( ((x1 - mu)^2 + ... + (xn - mu)^2) / n )

In ZPL, mu and sigma are computed by single statements which mirror the definitions:

    mu    := +<<Sample/n;                   -- Mean

    sigma := sqrt((+<<sqr(Sample-mu))/n);   -- Std deviation

The array Sample contains the items, and the operator +<< sums them, so the statement for mu is a direct translation of the definition. The computation of sigma is analogous and illustrates several features of ZPL: the scalar mu is subtracted from each item of Sample (this is called promoting a scalar to an array), and the scalar function sqr is promoted to an array function by applying it to each item of the array. These properties of ZPL, e.g. promotion, are simply programming language terminology for natural concepts that technically educated people already know.
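For readers who prefer to see the loops spelled out, here is a minimal C sketch of the scalar code that the two ZPL statements replace. The function and variable names (mean_and_stddev, sample, n) are illustrative only; they are not part of ZPL or of its generated code.

    #include <math.h>

    /* Hand-written scalar equivalent of the two ZPL statements above.
     * The ZPL reduction +<< replaces the first loop; promotion of the
     * scalar mu and of the function sqr replaces the second. */
    double mean_and_stddev(const double sample[], int n, double *sigma)
    {
        double mu = 0.0, ss = 0.0;
        int i;

        for (i = 0; i < n; i++)         /* mu := +<<Sample/n;                     */
            mu += sample[i];
        mu /= n;

        for (i = 0; i < n; i++) {       /* sigma := sqrt((+<<sqr(Sample-mu))/n);  */
            double d = sample[i] - mu;  /* mu promoted to every element of Sample */
            ss += d * d;                /* sqr applied elementwise                */
        }
        *sigma = sqrt(ss / n);
        return mu;
    }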

After using ZPL for a few months, a graduate student research assistant in civil engineering rebelled when told to program in C again.

High performance is widely claimed; why believe it for ZPL? For some languages "high performance" is part of the name. For ZPL, "high performance" is part of the description, backed by experimental evidence [Dikaiakos, Lewis, Lin95]. This evidence takes different forms, but is always relative to other means of achieving good performance. For example, Fortran 77 programs run sequentially, or C programs customized to a parallel platform (with user-specified communication), are regarded as reasonable baselines, since these usually represent the best alternatives for achieving good performance.

The evidence of ZPL's high performance derives from several types of experiments [Lin94, Lin95], one of which is mentioned here. SIMPLE is a fluid dynamics program developed at Lawrence Livermore National Laboratory to benchmark new computers and compilers, and the computation has been widely used in the study of parallel computing. A parallel version of the original 2400-line Fortran 77 program was developed by Gannon and Panetta [Gannon86]. A high quality, variable-grain version written in C requires approximately 5000 lines [Lee92]. This C program, customized to the Intel Paragon and the Kendall Square Research KSR-2, was compared to the 520-line ZPL program for SIMPLE [Lin94]. Speedups were measured in experiments involving 10 iterations on a 256 x 256 problem.

The experiments indicate that the high-level ZPL program performs as well as the low-level C program on these two machines. Other information suggests that similar behavior can be expected on any MIMD parallel computer [Lin90].

Why does ZPL have such good performance? ZPL does not have parallel "directives" or other forms of explicit parallelism. Instead, it exploits the fact that when programmers describe computations in terms of arrays, many scalar operations must be performed to implement the array operations. This "implied" computation can be parceled out to different processors to get concurrency. Thus, the parallelism comes simply from the semantics of the array operations.
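As an illustration of how that parceling out might look, here is a hedged C sketch in which each of P processors applies A := A + B only to the contiguous block of rows it owns. The block-of-rows layout and the names lo, hi, p and P are assumptions made for this sketch; they are not the ZPL compiler's actual data decomposition.

    /* Illustration only: the scalar work implied by the ZPL statement
     * A := A + B split among P processors.  Processor p loops only over
     * its own block of rows; no processor touches the whole array. */
    void local_add(double **a, double **b, int n, int p, int P)
    {
        int lo = (n * p) / P;          /* first row owned by processor p */
        int hi = (n * (p + 1)) / P;    /* one past the last row it owns  */
        int i, j;

        for (i = lo; i < hi; i++)
            for (j = 0; j < n; j++)
                a[i][j] = a[i][j] + b[i][j];
    }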

What does 'ZPL was designed from first principles' mean? ZPL is actually the data-parallel sublanguage of a more powerful parallel programming language, called Advanced ZPL [Lin94] or A-ZPL. A-ZPL is a fully general parallel language for "power" users, i.e. programmers with extreme performance requirements and the sophistication to use more demanding technology. A-ZPL is not yet implemented, and its completion is not expected for at least two years.

Advanced ZPL (previously known as Orca C) implements a programming model called Phase Abstractions [Griswold90]. The Phase Abstractions model can express task parallelism, pipelined parallelism and other parallel programming paradigms, not just data parallelism. It is built on, and generalizes, a parallel machine model called the CTA [Snyder86], which abstracts the family of multiple-instruction-multiple-data (MIMD) computers. The two models are the mechanism by which the benefits and costs of parallel computation are succinctly conveyed to A-ZPL programmers [Snyder94]. They balance two conflicting requirements: to write efficient code, the programmer needs to make decisions based on how the program will be executed, but to remain machine independent (portable), the programmer must avoid reliance on any particular machine's facilities. The relationship of this approach to others has been described [Snyder95], and feasibility studies indicate that the approach works [Lin92].

The Phase Abstractions programming model recognizes three different "programming levels":

    X, the process level  -- the sequential code executed by a single processor;
    Y, the phase level    -- ensembles of processes cooperating to perform one parallel phase of a computation;
    Z, the problem level  -- the composition of phases that solves the overall problem.

The letter name of this highest, problem-solving programming level motivates the language's name.

Learning the Language. A simple introduction of some of the basic ZPL concepts is available online as a Walk-through of a ZPL program. A tutorial introduction to programming in ZPL is available as the ZPL Programmer's Guide. The ZPL Language Manual defines the language specifics. Sample programs and scientific research papers are also available.

Writing Your First Program. Perhaps the simplest way to write and run a ZPL program on YOUR Unix machine is to use the Web Compiler. You paste a ZPL program (your own or one of ours) into a window, and click "compile." The program is packaged up and sent to a machine at the UW CSE Department. It is compiled into ANSI C and returned to you. A "make" of this file will result in an executable that can be run on your sequential computer.

Parallel Use of ZPL. To run on a parallel computer, ZPL must first be targeted to that machine. This operation is typically NOT performed by ZPL applications programmers, but it is straightforward for parallel computer systems administrators. The present platforms on which ZPL runs are

For information on targeting to a new platform, click here.

Once ZPL has been targeted to your type of parallel computer and the libraries have been installed at your facility, you are ready to use ZPL in parallel. It is NOT necessary to install the ZPL compiler on your workstation, because for the near term all ZPL compilation will be performed at the University of Washington. This assists in rapidly disseminating compiler improvements to the user community. (You can't have a stale version of the compiler!) However, we provide some software that you WILL want to install on your workstation; it simplifies this remote compilation and gives you convenience similar to compiling locally.

To learn more about running ZPL in parallel, click here.


ZPL Fact Sheet

Name. ZPL is short for the Z (level) Programming Language; see discussion of programming model, above.

Origin. ZPL was designed and implemented by the Orca Project of the Computer Science and Engineering Department at the University of Washington.

Type. ZPL uses the array abstraction to implement a data-parallel programming model; it is a standalone subset of Advanced ZPL.

History. Implementation of the ZPL compiler began in March 1993. It generated code approximately 15 months later. ZPL will be officially "released" during the fourth quarter, 1995.

Approach. ZPL is translated into a conventional abstract syntax tree representation on which program analysis and optimizations are performed. ANSI C code is generated as the object code. This C program is machine independent, and implements certain operations in abstract form. This code is compiled using the native C compiler on the target machine with custom libraries. In this second compilation the abstract operations are customized to the specific platform.
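The following fragment sketches the idea behind that second compilation step: the generated ANSI C calls an abstract operation, and the library linked on the target machine determines what the call does. The name ZPL_SumReduce and the trivial sequential definition are hypothetical stand-ins used only to keep the sketch self-contained; they are not the compiler's actual interface.

    #include <stdio.h>

    /* Hypothetical abstract operation.  In the real system its definition
     * would come from the platform-specific library supplied at the second
     * (native) compilation; a trivial sequential version is given here so
     * the sketch compiles and runs on its own. */
    static double ZPL_SumReduce(const double *v, int n)
    {
        double s = 0.0;
        int i;
        for (i = 0; i < n; i++)     /* a parallel library would instead   */
            s += v[i];              /* combine per-processor partial sums */
        return s;
    }

    int main(void)
    {
        double sample[4] = {1.0, 2.0, 3.0, 4.0};

        /* The machine-independent generated code looks the same on every
         * platform; only the library behind the abstract call changes. */
        printf("sum = %g\n", ZPL_SumReduce(sample, 4));
        return 0;
    }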

Team. The creators of ZPL are: Brad Chamberlain, Sung-Eun Choi, Marios Dikaiakos, George Forman, E Christopher Lewis, Calvin Lin, Larry Snyder, and W. Derrick Weathersby with assistance from Ruth Anderson and Kurt Partridge.

Funding. The foundational research for the ZPL compiler was funded in part by the Office of Naval Research under grant N00014-89-J-1368. The compiler itself was funded in part by the Advanced Research Projects Agency under grant N00014-92-J-1824.


References

[Dikaiakos] M.D. Dikaiakos, C. Lin, D. Manoussaki and D. Woodward. "The Portable Parallel Implementation of Two Novel Mathematical Biology Algorithms in ZPL," the 9th Int'l Conference on Supercomputing, pp. 365-374, 1995.

[Gannon86] D. Gannon and J. Panetta, "Restructuring Simple for the CHiP Architecture," Parallel Computing, 3:305-326, 1986.

[Greenlaw90] R. Greenlaw and L. Snyder, "Achieving Speedups for APL on an SIMD Distributed Memory Machine," Int'l J. of Parallel Programming, 19(2):111-117, 1990.

[Griswold90] W. G. Griswold, G. A. Harrison, D. Notkin, and L. Snyder, "Scalable Abstractions for Parallel Programming," Proc. 5th Distributed Memory Computer Conference, IEEE, pp. 1008-1016, 1990.

[Lee92] J. Lee, C. Lin and L. Snyder, "Programming SIMPLE for Parallel Portability," Languages and Compilers for Parallel Computing, Utpal Banerjee, David Gelernter, Alexandru Nicolau and David Padua, eds., pp. 84-98, 1992.

[Lewis] E. Lewis, C. Lin, L. Snyder and G. Turkiyyah, "A Portable Parallel N-Body Solver," the 7th SIAM Conference on Parallel Processing for Scientific Computing, pp. 331-336, 1995.

[Lin92] C. Lin, The portability of parallel programs across MIMD computers, Ph.D. Dissertation, University of Washington, 1992.

[Lin90] C. Lin and L. Snyder, "A comparison of programming models for shared memory multiprocessors," Proceedings of the International Conference on Parallel Processing, pp. II 163-180, 1990.

[Lin94] C. Lin and L. Snyder, "SIMPLE Performance Results in ZPL," Languages and Compilers for Parallel Computing, K. Pingali, U. Banerjee, D. Gelernter, A. Nicolau, and D. Padua, eds, pp. 361-375, 1994.

[Lin95] C. Lin, L. Snyder, R. Anderson, B. Chamberlain, S. Choi, G. Forman, E. Lewis, and W. D. Weathersby. ZPL vs. HPF: A Comparison of Performance and Programming Style, CSE Technical Report, University of Washington, 1994.

[Snyder94] L. Snyder, "Foundations of Practical Parallel Programming Languages," in J. Ferrante and A. J. G. Hey (eds.), Portability and Performance for Parallel Processing, John Wiley and Sons, Ltd., pp. 1-19, 1994.

[Snyder86] L. Snyder, "Type Architecture, Shared Memory and the Corollary of Modest Potential," Annual Review of Computer Science, 1:289-317, 1986.

[Snyder95] L. Snyder, "Experimental Validation of Models of Parallel Computation," in A. Hofmann and J. van Leeuwen (eds.), Lecture Notes in Computer Science, Vol. 1000, Springer-Verlag, 1995.


zpl-info@cs.washington.edu