Sample Code for this Chapter

Evaluation of an expression may terminate with a value and may along the way engender an effect upon its environment.  Our first example of an effect was the possibility of raising an exception, which we explored in detail in the preceding chapter.  The next important example of an effect is a storage effect, the allocation or mutation of storage during evaluation.  The introduction of storage effects has profound consequences, not all of which are desirable.  Indeed, storage effects are sometimes denigrated by referring to them as side effects, by analogy with the unintended effects of some medications.  While it is surely excessive to dismiss storage effects as completely undesirable, it is advantageous to minimize the use of storage effects except where clearly appropriate to the task.  We will explore some of the basic techniques for using storage effects later in this chapter, but first we introduce the mechanisms for supporting mutable storage in ML.

To support mutable storage the execution model of programs is modified to include an implicit memory consisting of a finite set of mutable cells containing data items of a fixed type.  A mutable cell may be thought of as a kind of container in which a data value is stored.  During the course of evaluation the content of a cell may be retrieved or may be replaced by any other value of the same type.   Mutation introduces a strongly temporal aspect to evaluation: we speak of the current contents of a cell as the value most recently assigned to it.  This is to be contrasted with the bindings of values to variables, which never change once made and hence have a permanent quality; the binding of a variable is a uniquely-determined value that does not change during evaluation.  Since cells are used by issuing "commands" to modify and retrieve their contents, programming with cells is sometimes called imperative programming.

Since cells may have their contents changed during evaluation it is imperative that we take careful account of the identity of cells. When are two cells the same?   When are they different?  The guiding principle is that two cells (of the same type) are distinct if there is a program that can tell them apart; otherwise they are equal.  How can we tell cells apart?  By doing the only things we can ever do with cells: retrieve their contents or set their contents to specified values.  Given two integer cells, we can determine whether they are the same cell or not by first checking if they have distinct contents.  If so, then they are distinct cells.   If not, we must distinguish between two "copies" of a single cell, or two cells that happen to have the same content.  To do this, bind the current contents of one cell to a variable, and set that cell's value to an integer different from the saved contents.  If the other cell's value is now the newly-assigned value, then the two cells are the same, otherwise they are different.

This principle of equality is called identity of indiscernables: two things are equal if we cannot tell them apart.  The test we just outlined extends to cells of other types, but is a rather roundabout way to test for cell identity.  In practice we work with a slightly conservative approximation to cell identity, called reference (or pointer) equality --- two cells are equal iff they occupy the same address in memory.  This test is conservative in that it may distinguish two cells that are in fact indiscernable: any two unit-valued cells are indiscernable because there is only one value of unit type, yet pointer equality would distinguish them.  To avoid such anomalies we use pointer equality to determine cell identity, relying on the representation of cells as references to memory.  For this reason mutable cells in ML are called reference cells, or references.

Reference cells containing values of type typ are themselves values of type typ ref.  They are "first-class" values in the sense that reference cells may be passed as arguments, returned as results, and even stored in other reference cells.  Reference cells are created, or allocated, by the function ref of type typ -> typ ref.  When applied to a value val of type typ, ref allocates a new cell, initializes its content to val, and returns a reference to the cell.   By a "new cell" we mean a cell that is distinct from all other cells previously allocated; it does not share storage with any of them.  The content of a cell of type typ is retrieved using the function ! of type typ ref -> typ.  Applying ! to a (reference to a) cell returns the current content of that cell.  The content of a cell is modified by the operation op := of type typ * typ ref -> unit; it is written using infix syntax with the reference cell as left-hand argument and the new contents as right-hand argument.  When applied to a cell and a value, it replaces the content of that cell with that value, and yields the null-tuple as result.  Cells may be compared for equality using the equality operation, =, which has type typ ref * typ ref -> bool.

Here are some examples:

val r = ref 0
val s = ref 0
val a = r=s
val _ = r := 3
val x = !s + !r
val t = r
val b = s=t
val c = r=t
val _ = t := 5
val y = !s + !r
val z = !t + !r

Afterwards, a is bound to false, b to false, c to true, x to 3, y to 5, and z to 10.  Be sure you understand exactly why in each case!

The above examples illustrate the problem of aliasing.  The variables t and r are both bound to the same cell, whereas s is bound to a different cell.  We say that t and r are aliases for the same cell because the one cell is known by two different names.  Aliasing is a serious source of bugs in programs since assigning a value to one destroys the contents of the other.  Avoiding these kinds of problems requires careful reasoning about the possibility of two variables being bound to the same reference cell.  A classic example is a program to "rotate" the contents of three cells: given reference cells a, b, and c, with initial contents x, y, and z, set their contents to y, z, and x, respectively.  Here's a candidate implementation:

fun rot3 (a, b, c) =
    let
        val t = !a
    in
        a := !b; b := !c; c := t
    end

This code works fine if a, b, and c are distinct reference cells.  But suppose that a and c are the same cell.   Afterwards the contents of a, b, and c are y, y, and x!  A correct implementation must work even in the presence of aliasing.   Here's a solution that works correctly in all cases:

fun rot3 (a, b, c) =
    let
        val (x, y, z) = (!a, !b, !c)
    in
        a := y; b := z; c := x
    end

Notice that we use immutable variables to temporarily hold the initial contents of the cells while their values are being updated.

This example illustrates the use of the semicolon to sequence evaluation of expressions purely for their effect.  The expression

exp1 ; exp2

is shorthand for

let val _ = exp1 in exp2 end

The expression exp1 is evaluated only for its effect; its return value is thrown away by the wildcard binding.  The value of the entire expression is the value of exp2 after evaluation of exp1 for effect.  The cumulative effect of the sequential composition is the effect of evaluating exp1 followed by the effect of evaluating exp2.

It is a common mistake to omit the exclamation point when referring to the content of a reference, especially when that cell is bound to a variable.  In more familiar languages such as C or Pascal all variables are implicitly bound to reference cells, and they are implicitly de-referenced whenever they are used so that a variable always stands for its current contents.  This is both a boon and a bane.  It is obviously helpful in many common cases since it alleviates the burden of having to explicitly dereference variables whenever their content is required.  However, it shifts the burden to the programmer in the case that the address, and not the content, is intended.  In C one writes &x for the address of (the cell bound to) x; in Pascal one must use reference parameters to achieve a similar effect.   Which is preferable is largely a matter of taste.  The burden of explicit de-referencing is not nearly so onerous in ML as it might be in other languages simply because reference cells are used only infrequently in ML programs, whereas they are the sole means of binding variables in more familiar languages.

An alternative to explicitly de-referencing cells is to use ref patterns.   A pattern of the form ref pat matches a reference cell whose content matches the pattern pat.  This means that the cell's contents are implicitly retrieved during pattern matching, and may be subsequently used without explicit de-referencing.  For example, the second implementation of rot3 above might be written using ref patterns as follows:

fun rot3 (a, b, c) =
    let
        val (ref x, ref y, ref z) = (a, b, c)
    in
        a := y; b := z; c := x
    end

In practice it is common to use both explicit de-referencing and ref patterns, depending on the situation.

Using references it is possible to mimic the style of programming used in imperative languages such as C or C++ or Java.  For example, we might define the factorial function as follows:

fun imperative_fact (n:int) =
    let
        val result = ref 1
        val i = ref 0
        fun loop () =
            if !i = n then
               ()
            else
               (i := !i + 1; result := !result * !i; loop ())
    in
        loop (); !result
    end

Notice that the function loop is essentially just a while loop; it repeatedly executes its body until the contents of the cell bound to i reaches n.  The tail call to loop is essentially just a goto statement since its argument is always the null-tuple.

It is bad style to program in this fashion.  The purpose of the function imperative_fact is to compute a simple function on the natural numbers.  There is nothing about its definition that suggests that state must be maintained, and so it is senseless to allocate and modify storage to compute it.  The definition we gave earlier is shorter, simpler, more efficient, and hence more suitable to the task.  This is not to suggest, however, that there are no good uses of references.  We will now discuss some important uses of state in ML.

The first example is the use of higher-order functions to manage shared private state.   This programming style is closely related to the use of objects to manage state in object-oriented programming languages.  Here's an example to frame the discussion:

local
      val counter = ref 0
in
      fun tick () = (counter := !counter + 1; !counter)
      fun reset () = (counter := 0)
end

This declaration introduces two functions, tick of type unit -> int and reset of type unit -> unit.  Their definitions share a private variable counter that is bound to a mutable cell containing the current value of a shared counter.  The tick operation increments the counter and returns its new value, and the reset operation resets its value to zero.  The types of the operations suggest that implicit state is involved.   In the absence of exceptions and implicit state, there is only one useful function of type unit->unit, namely the function that always returns its argument (and it's debatable whether this is really useful!).

The declaration above defines two functions, tick and reset, that share a single private counter.  Suppose now that we wish to have several different instances of a counter --- different pairs of functions tick and reset that share different state.  We can achieve this by defining a counter generator (or constructor) as follows:

fun new_counter () =
    let
        val counter = ref 0
        fun tick () = (counter := !counter + 1; !counter)
        fun reset () = (counter := 0)
    in
        { tick = tick, reset = reset }
    end

The type of new_counter is unit -> { tick : unit->int, reset : unit->unit }.  We've packaged the two operations into a record containing two functions that share private state.  There is an obvious analogy with class-based object-oriented programming.  The function new_counter may be thought of as a constructor for a class of counter objects.  Each object has a private instance variable counter that is shared between the methods tick and reset of the object represented as a record with two fields.

Here's how we use counters.

val c1 = new_counter ()
val c2 = new_counter ()
#tick c1 ();                 (* 1 *)
#tick c1 ();                 (* 2 *)
#tick c2 ();                 (* 1 *)
#reset c1 ();
#tick c1 ();                 (* 1 *)
#tick c2 ();                 (* 2 *)

Notice that c1 and c2 are distinct counters that increment and reset independently of one another.

A second important use of references is to build mutable data structures.  The data structures (such as lists and trees) we've considered so far are immutable in the sense that it is impossible to change the structure of the list or tree without building a modified copy of that structure.  This is both a benefit and a drawback.  The principle benefit is that immutable data structures are persistent in that operations performed on them do not destroy the original structure --- in ML we can eat our cake and have it too.  For example, we can simultaneously maintain a dictionary both before and after insertion of a given word.  The principle drawback is that if we aren't really relying on persistence, then it is wasteful to make a copy of a structure if the original is going to be discarded anyway.  What we'd like in this case is to have an "update in place" operation to build an ephemeral (opposite of persistent) data structure.  To do this in ML we make use of references.

A simple example is the type of possibly circular lists, or pcls.   Informally, a pcl is a finite graph in which every node has at most one neighbor, called its predecessor, in the graph.  In contrast to ordinary lists the predecessor relation is not necessarily well-founded: there may be an infinite sequence of nodes arranged in descending order of predecession.  Since the graph is finite, this can only happen if there is a cycle in the graph: some node has an ancestor as predecessor.  How can such a structure ever come into existence?  If the predecessors of a cell are needed to construct a cell, then the ancestor that is to serve as predecessor in the cyclic case can never be created!  The "trick" is to employ backpatching: the predecessor is initialized to Nil, so that the node and its ancestors can be constructed, then it is reset to the appropriate ancestor to create the cycle.

This can be achieved in ML using the following datatype declaration:

datatype 'a pcl = Nil | Cons of 'a * 'a pcl ref

The "tail" of a Cons node is a reference cell so that we may assign to it to implement backpatching.  Here's an example:

fun hd (Cons (h, _)) = h          (* auxiliary functions *)
fun tl (Cons (_, t)) = t

val ones = Cons (1, ref Nil)     (* create a preliminary acyclic structure *)

val _ = (tl ones) := ones        (* backpatch to form the cycle *)

Initially the variable ones is bound to the acyclic pcl with one node whose head element is 1.   We then assign that very node to the predecessor (tail) of that node, resulting in a circular pcl with one node.  Observe that hd ones, hd !(tl ones), hd !(tl !(tl ones)), etc all evaluate to 1.  Notice that we must explicitly refer to the contents of the tail of each node since it is a reference cell!

Let us define the length of a pcl to be the number of distinct nodes occurring in it.  An interesting exercise is to define a length function for pcls that makes no use of auxiliary storage (i.e., no list of previously-encountered nodes) and runs in time proportional to the number of cells in the pcl.  Hint: think of the fable of the tortoise and the hare.  If they run a long race on an oval track, what is sure to happen, and when?  Does this suggest an algorithm?

In addition to reference cells, ML also provides mutable arrays as a primitive data structure.  The type typ array is the type of arrays carrying values of type typ.  The basic operations on arrays are these:

array : int * 'a -> 'a array create array of given size with given initial value for each element
size : 'a array -> int number of elements in a given array
sub : 'a array * int -> 'a access element; raises Subscript exception if out of bounds access is attempted
update : 'a array * int * 'a -> unit change the contents of a given array element; raises Subscript for out of bounds access

These are just the basic operations on arrays; consult the Basis Library document for further details.  Immutable arrays are also available.  The type 'a vector is similar to the type 'a array, except that vectors are immutable, whereas arrays are mutable.

One simple use of arrays is for memoization.  Here's a function to compute the nth Catalan number, which may be thought of as the number of distinct ways to parenthesize an arithmetic expression consisting of a sequence of n consecutive multiplication's.  It makes use of an auxiliary summation function that you can easily define for yourself.  (Applying sum to f and n computes the sum  of f 0 + ... + f n.)

fun C 1 = 1
  | C n = sum (fn k => (C k) * (C (n-k))) (n-1)

This definition of C is hugely inefficient because a given computation may be repeated exponentially many times.  For example, to compute C 10 we must compute C 1, C2, ..., C 9, and the computation of C i engenders the computation of C 1, ..., C (i-1) for each 1<=i<=9.  We can do better by caching previously-computed results in an array, leading to an enormous improvement in execution speed.  Here's the code:

local
       val limit : int = 100
       val memopad : int option array = Array.array (limit, NONE)
in
       fun C' 1 = 1
         | C' n = sum (fn k => (C k) * (C (n-k))) (n-1)
       and C n =
           if n < limit then
              case Array.sub (memopad, n)
                of SOME r => r
                 | NONE =>
                   let
                       val r = C' n
                   in
                       Array.update (memopad, n, SOME r);
                       r
                   end
            else
               C' n
end

Note carefully the structure of the solution.  The function C is a memoized version of the Catalan number function.  When called it consults the memopad to determine whether or not the required result has already been computed.  If so, the answer is simply retrieved from the memopad, otherwise the result is computed, stored in the cache, and returned.  The function C' looks superficially similar to the earlier definition of C, with the important difference that the recursive calls are to C, rather than C' itself.  This ensures that sub-computations are properly cached and that the cache is consulted whenever possible.

The main weakness of this solution is that we must fix an upper bound on the size of the cache.  This can be alleviated by implementing a more sophisticated cache management scheme that dynamically adjusts the size of the cache based on the calls made to it.

Sample Code for this Chapter