Sample Code for this Chapter

We have already noted that aggregate data structures are especially easy to handle in ML.  Our first examples were tuple and record types.  The list types provide another example of an aggregate data structure in ML.  Informally, the values of type typ list are the finite lists of values of type typ.   But what is a list?  The values of type typ list are defined as follows:

  1. nil is a value of type typ list.
  2. if h is a value of type typ, and t is a value of type typ list, then h::t is a value of type typ list.
  3. Nothing else is a value of type typ list.

The type expression typ list is a postfix notation for the application of the type constructor list to the argument typ.   Thus list is a kind of "function" mapping types to types: given a type typ, we may apply list to it to get another type, written typ list.  The forms nil and :: are the value constructors of type typ list.  The nullary (no argument) constructor nil may be thought of as the empty list.   The binary (two argument) constructor :: constructs a non-empty list from a value h of type typ and another value t of type typ list; the resulting value, h::t, of type typ list is pronounced "h cons t" (for historical reasons).  We say that "h is cons'd onto t", that h is the "head" of the list, and that t is its "tail".

The definition of the values of type  typ list given above is an example of an inductive definition.   The type  is said to be recursive because this definition is "self-referential" in the sense that the values of type typ list are defined in terms of (other) values of the same type.  This is especially clear if we examine the types of the value constructors for the type typ list:

nil : typ list
op ::
: typ * typ list -> typ list

(The notation op :: is used to refer to the "cons" operator as a function, rather than to use it to form a list, which requires infix notation.)  Two things are notable here:

  1. The "cons" operation takes an argument of type typ list, and yields a result of type typ list.  This reflects the "recursive" nature of the type typ list.
  2. Both operations are polymorphic in the type of the underlying elements of the list.  Thus nil is the empty list of type typ list for any element type typ, and op :: constructs a non-empty list independently of the type of the elements of that list.

A consequence of the inductive definition of the list type is that values of type typ list have the form

h1 ::( h2 :: ... ::( hn :: nil)...)

for some n>=0.  (When n is zero, this is, by convention, the empty list, nil.)  The operator :: is right-associative, so we may omit the parentheses and just write

h1 :: h2 :: ... :: hn :: nil.

As a further convenience this list may be abbreviated using list notation:

[ h1 , h2 , ... , hn ]

This notation emphasizes the interpretation of lists as finite sequences of values, but it obscures the fundamental inductive character of lists as being built up from nil using the :: operation.

How do we compute with values of list type?  Since the values are defined inductively, it is natural that functions on lists be defined recursively, using a clausal definition that analyzes the structure of a list.  Here's a definition of the function length that computes the number of elements of a list:

fun length nil = 0
  | length (_::t) = 1 + length t

The definition is given by induction on the structure of the list argument.  The base case is the empty list, nil.  The inductive step is the non-empty list _::t (notice that we do not need to give a name to the head).   Its definition is given in terms of the tail of the list t, which is "smaller" than the list _::t.  The type of length  is 'a list -> int; it is defined for lists of values of any type whatsoever.

We may define other functions following a similar pattern.  Here's the function to append two lists:

fun append (nil, l) = l
  | append (h::t, l) = h :: append (t, l)

This function is built into ML; it is written using infix notation as exp1 @ exp2The running time of append is proportional to the length of the first list, as should be obvious from its definition.

Here's a function to reverse a list.

fun rev nil = nil
  | rev (h::t) = rev t @ [h]

It is not tail recursive.  In fact, its time complexity is O(n2), where n is the length of the argument list.  This can be demonstrated by writing down a recurrence that defines the running time T(n) on a list of length n.

T(0) = O(1)
T(n+1) = T(n) + O(n)

Solving the recurrence we obtain the result T(n)=O(n2).

Can we do better?  Oddly, we can take advantage of the non-associativity of :: to give a tail-recursive definition of rev.

      fun rev_helper (nil, a) = a
        | rev_helper (h::t, a) = rev_helper (t, h::a)
      fun rev l = rev_helper (l, nil)

The pattern is the same as before, except that by re-associating the uses of :: we reverse the list!  The helper function reverses its first argument and prepends it to its second argument.  That is, rev_helper (l, a) evaluates to (rev l) @ a, where we assume here an independent definition of rev for the sake of the specification.  Notice that rev_helper runs in time proportional to the length of its first argument, and hence rev runs in time proportional to the length of the list.

The correctness of functions defined on lists is established using the principle of structural induction.  We illustrate this by establishing that the function rev_helper satisfies the following specification:

for every l and a of type typ list, rev_helper (l, a) evaluates to the result of appending a to the reversal of l.

The proof is by structural induction on the list l.  If l is nil, then rev_helper (l, a) evaluates to a, which is as required.  If l is h::t, then by the inductive hypothesis rev_helper (l, a) evaluates to the result of appending h::a to the reversal of t, which is easily seen to be the result of appending a to the reversal of h::t.

The form of this argument may be summarized as follows:

  1. Establish the correctness of the function for the empty list, nil.
  2. Assuming the correctness of the function for t, establish its correctness for h::t.

It follows that the function is correct for all lists l, by the inductive definition of the list type.  This is called the principle of structural induction on lists.  We will soon generalize this to other inductively-defined types.

Sample Code for this Chapter