Standard ML at Carnegie Mellon

Peter Lee

School of Computer Science

Carnegie Mellon University

Last Revised: August 24, 1998

This is a guide to editing and executing Standard ML (SML) programs at Carnegie Mellon University. In this document you can find information about the local installation of SML, including a few tips to help you cope with some of the quirks of the locally supported systems. This document was written by Peter Lee, with extensive contributions by Robert Harper, Iliano Cervesato, Carsten Shurmann, Frank Pfenning, and Herb Derby.

This is not a reference manual for the Standard ML language. If you need a reference manual or a tutorial, you can find several sources of information, both on-line and in hard copy. For textbooks, I recommend either ML for the Working Programmer, by Lawrence Paulson (Second Edition, Cambridge University Press, 1997), or Elements of ML Programming, 2nd Edition (ML97), by Jeffrey Ullman (Prentice Hall, 1994). Paulson's book is more thorough and technically sharper, though I have found that many newcomers who come to functional programming from the C language (or something similar) have an easier time with Ullman's gentler introduction. Brief introductions to the language may also be found on-line at the Fox Project's web site. In particular, the Introduction to SML, by Robert Harper, and the Four Lectures on Standard ML, by Mads Tofte, are useful sources of examples and exercises. Scores of ML programmers have gotten their start with Harper's Introduction.

Besides the introductory section that you are now reading, this document consists of two separate parts, each suitable both for on-line reading and for printing. One part is for users of the Standard ML of New Jersey system, and the other is for users of the MLWorks system. These are the two major implementations of SML that are currently supported at CMU. Reading these documents on-line will allow you to take advantage of the numerous links to related on-line documentation, but an attempt has been made to make them useful even in printed form. If you are undecided as to which system to use, continue reading below.

SML Systems at CMU

There are two major dialects of the ML language, Standard ML (SML) and CAML. Both dialects are supported by several implementations. At CMU, SML is the dominant dialect, and two full-fledged implementations of SML are installed and maintained for local use. One is Standard ML of New Jersey (SML/NJ), a freely distributed system developed originally by Andrew Appel and David MacQueen and developed now at Lucent Bell Laboratories. The other locally supported SML system is MLWorks, a commercial implementation developed by Harlequin. Other implementations of SML include Moscow ML, the ML Kit, and Poly ML. For CAML, there are also several implementations, including the popular CAML-Light system.

SML is a practical programming language with a large number of modern features. These include polymorphic type inference, first-class functions, continuations, exceptions, parameterized modules, algebraic datatypes with pattern matching, and automatic garbage collection. For such a complex and relatively large language, SML is unique in having a complete formal definition, and in fact this definition is the subject of a considerable amount of current research in mathematical semantics, type theory, language design, and compiler design. The complete formal definition of the language can be found in the book The Definition of Standard ML (Revised), by Robin Milner, Mads Tofte, Robert Harper, and David MacQueen (The MIT Press, 1997). The "Revised" in the title refers to the fact that the SML language definition has recently gone through an extensive revision, resulting in the current language, which is sometimes called "SML97" to distinguish it from the original version of the language, which was fairly stable from 1985 until 1996. Although this revision is comprehensive, the bulk of most existing SML programs remains unchanged in practice. Still, there are enough fundamental changes that any serious application will require some modification. At the time of this writing, the vast majority of SML programs have been written in the original version of SML, but since all of the major implementations now support SML97, it is likely that all new programs will be written in this new version. I therefore recommend that your programs be written in SML97.
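To give a rough feel for several of the features listed above, here is a minimal sketch in SML97. It is not taken from any of the references; every name in it (tree, size, mapTree, EmptyTree, root, rootOr, ORDERED, Insert, IntOrd, IntInsert) is made up for illustration, and the only library function used is foldl from the standard Basis. Continuations are left out, since they are typically provided by implementation-specific libraries rather than by the language definition itself.

```sml
(* Algebraic datatype with a type parameter: a polymorphic binary tree. *)
datatype 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

(* Pattern matching; the type 'a tree -> int is inferred, not declared. *)
fun size Leaf = 0
  | size (Node (l, _, r)) = size l + 1 + size r

(* First-class functions: f is passed like any other argument. *)
fun mapTree f Leaf = Leaf
  | mapTree f (Node (l, x, r)) = Node (mapTree f l, f x, mapTree f r)

(* Exceptions: declared, raised, and handled. *)
exception EmptyTree

fun root Leaf = raise EmptyTree
  | root (Node (_, x, _)) = x

fun rootOr default t = root t handle EmptyTree => default

(* Parameterized module: a functor that takes any ordered type. *)
signature ORDERED =
sig
  type t
  val less : t * t -> bool
end

functor Insert (Ord : ORDERED) =
struct
  (* Insert x into a binary search tree, ignoring duplicates. *)
  fun insert (x, Leaf) = Node (Leaf, x, Leaf)
    | insert (x, t as Node (l, y, r)) =
        if Ord.less (x, y) then Node (insert (x, l), y, r)
        else if Ord.less (y, x) then Node (l, y, insert (x, r))
        else t
end

structure IntOrd : ORDERED =
struct
  type t = int
  fun less (x : int, y) = x < y
end

structure IntInsert = Insert (IntOrd)

(* Build a small search tree of integers from a list. *)
val t = foldl IntInsert.insert Leaf [3, 1, 2]
```

Entering these declarations at the top level of either system should cause the inferred type of each one to be reported, which is a convenient way to see polymorphic type inference at work. Storage for the tree nodes is never freed explicitly; that is left to the garbage collector.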

Of the two major implementations of SML, the SML/NJ system is the more thoroughly developed and tested, as it has been under continuous development since 1987. It is also available on most of the popular architectures and operating systems. One of the primary motivations for the development of SML/NJ has been to explore optimization techniques for advanced programming languages; therefore, its developers have put considerable effort into getting the highest-quality target code out of the compiler, as well as into extending the tools and libraries for concurrency, X-windows programming, and compiler construction. On the other hand, relatively little effort has been put into the programming environment. This is in contrast to MLWorks, which provides a comprehensive suite of program-development tools, including an integrated debugger, profiler, and structure browser. For teaching purposes, the MLWorks system may be preferable, whereas the SML/NJ system is perhaps the slightly better choice for research and large-scale software development.

The SML/NJ System

Locally, the SML/NJ system is maintained for all of the major processor/OS combinations, including machines with MIPS, SPARC, DEC Alpha, and Intel x86 processors. The Unix, Windows 95, and Windows NT operating systems are also supported. Version 0.93 of SML/NJ is the major release that supports the original version of the SML language. However, users are strongly encouraged to use version 110 of the SML/NJ system, which supports the SML97 revision of the language, so that their programs will conform to the revised definition. The Motorola 680x0 processor and the Macintosh System 7 operating system are also supported, though only for version 0.93. (On the Macintosh, therefore, you do not have the choice, since only the original SML language is supported.) The SML/NJ system is continually maintained and updated. If you wish to install the system yourself, the latest releases and developments can be found at the SML/NJ web site. Up-to-date information on local support in the CS domain can be obtained from the local maintainer, David Swasey.

SML/NJ version 110 is currently available for most Unix systems supported by Andrew and the Computer Science Department. On Suns in an Andrew cluster, the binary is available as:


On most Andrew Unix systems, SML/NJ may soon become available in /usr/local/bin. For now, you can run the CS installation with:


In the Computer Science Department, SML/NJ is available as /usr/local/bin/sml.

To use SML/NJ at home under Windows, you can retrieve this self-installing .EXE (6.7MB). For other platforms, follow the instructions at

For more information see the document on how to use the SML/NJ system.

The MLWorks System

The MLWorks system by Harlequin Incorporated is an integrated programming environment providing an interpreter, access to various external editors, a debugger, and management tools for modular programming, including a browser. The major source of documentation about MLWorks is MLWorks, User's Guide to the Environment, available at:

A PostScript version of the User's Guide is also available on the Andrew file system at:


MLWorks is currently available only from the Andrew network, and only for the Sun4 architecture (SPARCstations) so far. Individual student licenses for other architectures may also become available in the near future. The binary for the Solaris operating system can be found on the Andrew network under the directory:


MLWorks is best used through its X-windows user interface, but it can also be used from a terminal interface by invoking it with the -tty option.

For more information, see the document on how to use the MLWorks system.