Sample Code for this Chapter

Lists are one example of the notion of a recursive datatype.  ML provides a general mechanism, the datatype declaration, for introducing recursive types.   Earlier we introduced type declarations as an abbreviation mechanism. Types are given names as documentation and as a convenience to the programmer, but doing so is semantically inconsequential --- one could replace all uses of the type name by its definition and not effect the behavior of the program.  In contrast the datatype declaration provides a means of introducing a new type that is distinct from all other types and that does not merely stand for some other type.   It is the means by which the ML type system may be extended by the programmer.

The datatype declaration in ML has a number of facets.  A datatype declaration introduces

  1. One or more "new" type constructors.  The type constructors introduced may, nor may not, be (mutually) recursive.
  2. One or more "new" value constructors for each of the type constructors introduced by the declaration.

The type constructor may take zero or more arguments; a zero-argument, or nullary, type constructor is just a type.  Each value constructor may also take zero or more arguments; a nullary value constructor is just a constant.  The type and value constructors introduced by the declaration are "new" in the sense that they are distinct from all other type and value constructors previously introduced; if a datatype re-defines an "old" type or value constructor, then the old definition is shadowed by the new one, rendering the old ones inaccessible in the scope of the new definition.

Here's a simple example of a nullary type constructor with four nullary value constructors.

datatype suit = Spades | Hearts | Diamonds | Clubs

This declaration introduces a new type suit with four nullary value constructors, Spades, Hearts, Diamonds, and Clubs.   This declaration may be read as introducing a type suit such that a value of type suit is either Spades, or Hearts, or Diamonds, or Clubs.  There is no significance to the ordering of the constructors in the declaration; we could just as well have written

datatype suit = Hearts | Diamonds | Spades | Clubs

(or any other ordering, for that matter).  It is conventional to capitalize the names of value constructors, but this is not required by the language.

Given the declaration of the type suit, we may define functions on it by case analysis on the value constructors using a clausal function definition.  For example, we may define the suit ordering in Bridge by the function

fun outranks (Spades, Spades) = false
  | outranks (Spades, _) = true
  | outranks (Hearts, Spades) = false
  | outranks (Hearts, Hearts) = false
  | outranks (Hearts, _) = true
  | outranks (Diamonds, Clubs) = true
  | outranks (Diamonds, _) = false
  | outranks (Clubs, _) = false

This defines a function of type

suit * suit -> bool

that determines whether or not the first suit outranks the second.

Data types may also be parameterized by another type.  For example,

datatype 'a option = NONE | SOME of 'a

introduces the unary type constructor 'a option.  The values of type typ option are:

  1. The constant NONE, and
  2. Values of the form SOME val, where val is a value of type typ.

For example, some values of type string option are NONE, SOME "abc", and SOME "def".

The option type constructor is pre-defined in Standard ML.  One common use of option types is to handle functions with an optional argument.  For example, here is a function to compute the base-b exponential function for natural number exponents that defaults to base 2:

fun expt (NONE, n) = expt (SOME 2, n)
  | expt (SOME b, 0) = 1
  | expt (SOME b, n) =
    if n mod 2 = 0 then expt (SOME b*b, n div 2) else b * expt (SOME b, n-1)

The advantage of the option type in this sort of situation is that it avoids the need to make a special case of a particular argument, e.g., using 0 as first argument to mean "use the default exponent".

A related use of option types is in aggregate data structures.  For example, an address book entry might have a record type with fields for various bits of data about a person.  But not all data is relevant to all people.  For example, someone may not have a spouse, but they all have a name.  For this we might use a type definition of the form

type entry = { name:string, spouse:string option, ... }

so that one would create an entry for an unmarried person with a spouse field of NONE.

The next level of generality is the recursive type definition.  For example, one may define a type typ tree of binary trees with values of type typ at the nodes using the following declaration:

datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

This declaration corresponds directly to the informal definition of binary trees with values of type typ at the nodes:

  1. The empty tree Empty is a binary tree.
  2. If tree1 and tree2 are binary trees, and val is a value of type typ, then Node (tree1, val, tree2) is a binary tree.
  3. Nothing else is a binary tree.

The distinguishing feature of this definition is that it is recursive in the sense that binary trees are constructed out of other binary trees, with the empty tree serving as the base case.

Incidentally, a leaf in a binary tree is here represented as a node both of whose children are the empty tree.  Our definition of binary trees is analogous to starting the natural numbers with zero, rather than one.  In fact you can think of the children of a node in a binary tree as the "predecessors" of that node, the only difference compared to the usual definition of predecessor being that a node has two, rather than one, predecessors.

To compute with a recursive type one ordinarily defines recursive functions.  For example, here is the function to compute the height of a binary tree:

fun height Empty = 0
  | height (Node (lft, _, rht)) = 1 + max (height lft, height rht)

Notice that height is called recursively on the children of a node, and is defined outright on the empty tree.  This pattern of definition is called structural induction.  The function height is said to be defined by induction on the structure of its argument, a tree.  The general idea is to define the function directly for the base cases of the recursive type (i.e., value constructors with no arguments or whose arguments do not involve values of the type being defined), and to define it for non-base cases in terms of its definitions for the constituent values of that type.  We will see numerous examples of this as we go along.

Here's another example.  The size of a binary tree is the number of nodes occurring in it.  Here's a straightforward definition in ML:

fun size Empty = 0
  | size (Node (lft, _, rht)) = 1 + size lft + size rht

The function size is defined by structural induction on trees.

A word of warning.  One reason to capitalize value constructors is to avoid a pitfall in the ML syntax.  Suppose we gave the following definition of size:

fun size empty = 0
  | size (Node (lft, _, rht)) = 1 + size lft + size rht

What happens?  The compiler will warn us that the second clause of the definition is redundant!  Why?  Because empty, spelled with a lower-case "e", is a variable, not a constructor, and hence matches any tree whatsoever.  Consequently the second clause never applies.   By capitalizing constructors we can hope to make mistakes such as these more evident, but in practice you are bound to run into this sort of mistake.

The tree datatype is appropriate for binary trees: those for which every node has exactly two children.  (Of course, either or both children might be the empty tree, so we may consider this to define the type of trees with at most two children; it's a matter of terminology which interpretation you prefer.)  It should be obvious how to define the type of ternary trees, whose nodes have at most three children, and so on for other fixed arities.  But what if we wished to define a type of trees with a variable number of children?  In a so-called variadic tree some nodes might have three children, some might have two, and so on.  This can be achieved in at least two ways.  One way combines lists and trees, as follows:

datatype 'a tree = Empty | Node of 'a * 'a tree list

Each node has a list of children, so that distinct nodes may have different numbers of children.  Notice that the empty tree is distinct from the tree with one node and no children because there is no data associated with the empty tree, whereas there is a value of type 'a at each node.

Another approach is to simultaneously define a variadic tree to be either empty, or a node collecting together a forest to form a tree, and a forest to be either empty or a variadic tree together with another forest.  This leads to the following definition:

datatype 'a tree = Empty | Node of 'a * 'a forest
and      'a forest = Nil | Cons of 'a tree * 'a forest

This example illustrates the introduction of two mutually recursive datatypes, which is why we present it here.  Mutually recursive datatypes beget mutually recursive functions defined on them.  Here's a definition of the size (number of nodes) of a variadic tree:

fun size_tree Empty = 0
  | size_tree (Node (_, f)) = 1 + size_forest f
and size_forest Nil = 0
  | size_forest (Cons (t, f')) = size_tree t + size_forest f'

Notice that we define the size of a tree in terms of the size of a forest, and vice versa, just as the type of trees is defined in terms of the type of forests.

Many other variations are possible.  Suppose we wish to define a notion of binary tree in which data items are associated with branches, rather than nodes.  Here's a datatype declaration for such trees:

datatype 'a tree = Empty | Node of 'a branch * 'a branch
and      'a branch = Branch of 'a * 'a tree

Notice that in contrast to our first definition of binary trees in which the branches from a node to its children were implicit, these branches are now explicit since they are labelled with data items.  For example, we can collect up into a list the data items labelling the branches of such a tree using the following code:

fun collect Empty = nil
  | collect (Node (Branch (ld, lt), Branch (rd, rt))) =
    ld :: rd :: (collect lt) @ (collect rt)

Returning to the original definition of binary trees (with data items at the nodes), observe that the type of the data items at the nodes must be the same for every node of the tree.  For example, a value of type int tree has an integer at every node, and a value of type string tree has a string at every node.  Therefore it makes no sense to evaluate the expression

Node (Empty, 43, Node (Empty, "43", Empty))

since the result, if it were to be accepted, would be a "heterogeneous" tree with integers at some nodes and strings at others.  Such structures are ruled out in ML as type-incorrect.

In 95% of the cases this apparent restriction is no restriction at all; it is quite rare to encounter heterogeneous data structures in real programs.  For example, a dictionary with strings as keys might be represented as a binary search tree with strings at the nodes; there is no need for heterogeneity to represent such a data structure.  But what about the other 5%?  What if one really wanted to have a tree with integers at some nodes and strings at others?  How would one represent such a thing in ML?  To see the answer, first think about how one might manipulate such a data structure.  When accessing a node, we would need to check at run-time whether the data item is an integer or a string; otherwise we would not know whether to, say, add 1 to it, or concatenate "1" to the end of it.  This suggests that the data item must be labelled with sufficient information so that we may determine the type of the item at run-time.  We must also be able to recover the underlying data item itself so that familiar operations (such as addition or string concatenation) may be applied to it.  This is neatly achieved using a datatype declaration.  Suppose we wish to represent the type of integer-or-string trees.  First, we define the type of values to be integers or strings, marked with a constructor indicating which:

datatype int_or_string = Int of int | String of string

Then we define the type of interest as follows:

type int_or_string_tree = int_or_string tree

Voila!  Perfectly natural and easy --- heterogeneity is really a special case of homogeneity!

Datatype declarations and pattern matching are extremely useful for defining and manipulating the abstract syntax of a language.  For example, we may define a small language of arithmetic expressions using the following declaration:

datatype expr = Numeral of int | Plus of expr * expr | Times of expr * expr

This definition has only three clauses, but one could readily imagine adding others.   Here is the definition of a function to evaluate expressions of the language of arithmetic expressions written using pattern matching:

fun eval (Numeral n) = Numeral n
  | eval (Plus (e1, e2)) =
    let
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
    in
        Numeral (n1+n2)
    end
  | eval (Times (e1, e2)) =
    let
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
    in
        Numeral (n1*n2)
    end

The combination of datatype declarations and pattern matching contributes enormously to the readability of programs written in ML.  A less obvious, but more important, benefit is the error checking that the compiler can perform for you if you use these mechanisms in tandem.  As an example, suppose that we extend the type expr with a new component for the reciprocal of a number, yielding the following revised definition:

datatype expr =
  Numeral of int | Plus of expr * expr | Times of expr * expr | Recip of expr

First, observe that the "old" definition of eval is no longer applicable to values of type expr!  For example, the expression

eval (Plus (Numeral 1, Numeral 2))

is ill-typed, even though it doesn't use the Recip constructor.  The reason is that the re-declaration of expr introduces a "new" type that just happens to have the same name as the "old" type, but is in fact distinct from it.  This is a boon because it reminds us to recompile the old code relative to the new definition of the expr type.

Second, upon recompiling the definition of eval we encounter an inexhaustive match warning: the old code no longer applies to every value of type expr according to its new definition!  We are of course lacking a case for Recip, which we may provide as follows:

fun eval (Numeral n) = Numeral n
  | eval (Plus (e1, e2)) = ... as before ...
  | eval (Times (e1, e2)) = ... as before ...
  | eval (Recip e) =
    let val Numeral n = eval e in Numeral (1 div n) end

The value of the checks provided by the compiler in such cases cannot be overestimated.   When recompiling a large program after making a change to a datatype declaration the compiler will automatically point out every line of code that must be changed to conform to the new definition; it is impossible to forget to attend to even a single case.  This is a tremendous help to the developer, especially if she is not the original author of the code being modified and is another reason why the static type discipline of ML is a positive benefit, rather than a hindrance, to programmers.

Sample Code for this Chapter