Sample Code for this Chapter

It's time to return to function definitions.  So far we've only considered very simple functions (such as the reciprocal function) whose value is computed more or less directly using the primitives of the language.  You may well be wondering at this stage how to define functions that require some form of iteration to compute.   In familiar imperative languages iteration is accomplished using while and for loops; in ML it is accomplished using recursion.

Informally, a function defined by recursion is one that computes the result of a call by "calling itself".  To accomplish this, the function must be given a name by which it can refer to itself.  This is achieved using a recursive value binding.  Recursive value bindings have almost the same form as ordinary, non-recursive value bindings, except that the binding is qualified with the adjective "rec" by writing val rec pat = exp.   Here's an example:

val rec factorial : int->int = fn 0 => 1 | n:int => n * factorial (n-1)

This is a recursive definition of the factorial function, which is ordinarily defined in textbooks by the recursion equations

0! = 1
n! = n*(n-1)!  (n>0)

Using fun notation we may write the definition of factorial much more clearly and concisely as follows:

fun factorial 0 = 1
  | factorial (n:int) = n * factorial (n-1)

There is clearly a close correspondence between the ML notation and the mathematical notation for the factorial function.

How are recursive value bindings type-checked?  The answer may appear, at first reading, to be paradoxical: assume that the function has the type specified, then check that the definition is consistent with this assumption.  In the case of factorial we assume that factorial has type int->int, then check that its definition

fn 0 => 1 | n:int => n * factorial (n-1)

has type int->int.  To do so we must check that each pattern has type int, and that each corresponding expression has type int.   This is clearly true for the first clause of the definition.  For the second, we assume that n has type int, then check that n * factorial (n-1) has type int.  This is so because of the rules for the primitive arithmetic operations and because of our assumption that factorial has type int->int.  (Be certain that you understand this reasoning!   It is essential for what follows.)

How are applications of recursive value bindings evaluated?  The rules are almost the same as before.  We need only observe that the binding for the function may have to be retrieved many times during evaluation (once for each recursive call).  For example, to evaluate factorial 3, we retrieve the definition of factorial, then pattern match the argument against the pattern of each clause.  Clearly 3 does not match 0, but it does match n:int, binding n to 3 in the process.  We then evaluate n * factorial (n-1) relative to this binding for n.  To do so we retrieve the binding for factorial a second time, and to apply it to 2.  Once again we consider each clause in turn, failing to match 0, but succeeding to match n:int.   This introduces a new binding for n that shadows the previous binding so that n now evaluates to 2.  We then proceed once again to evaluate n * factorial (n-1), this time with n bound to 2.  Once again we retrieve the binding for factorial, then bind n to 1, shadowing the two previous bindings, then evaluating n * factorial (n-1) with this binding for n.  We retrieve the binding for factorial one last time, then apply it to 0.   This time we match the pattern 0 and yield 1.  We then (in four steps) compute the result, 6, by completing the pending multiplications.

The factorial function illustrates an important point about recursive function definitions.  Notice that the recursive call in the definition of factorial occurs as the argument of a multiplication.  This means that in order for the multiplication to complete, we must first complete the calculation of the recursive call to factorial.  In rough outline the computation of factorial 3 proceeds as follows:

  1. factorial 3
  2. 3 * factorial 2
  3. 3 * 2 * factorial 1
  4. 3 * 2 * 1 * factorial 0
  5. 3 * 2 * 1 * 1
  6. 3 * 2 * 1
  7. 3 * 2
  8. 6

(The chains of multiplications are implicitly right-associated.)  Notice that the size of the expression first grows (in direct proportion to the argument), then shrinks as the pending multiplications are completed.  This growth in expression size corresponds directly to a growth in run-time storage required to record the state of the pending computation.

The foregoing definition of factorial should be contrasted with the following definition:

fun fact_helper (0,r:int) = r
  | fact_helper (n:int,r:int) = fact_helper (n-1,n*r)

fun factorial n:int = fact_helper (n, 1)

We define factorial using a helper function fact_helper that takes an additional parameter, called an accumulator, that records the running partial result of the computation.  This corresponds to reducing the prefix of the pending computations in the trace given above by "left-associating" the multiplications.  (In fact the technique is only applicable to associative binary operations for precisely this reason.)

The important thing to observe about fact_helper is that it is tail recursive, meaning that the recursive call is the last step of evaluation of an application of it to an argument.  The following evaluation trace illustrates the point:

  1. factorial 3
  2. fact_helper (3, 1)
  3. fact_helper (2, 3)
  4. fact_helper (1, 6)
  5. fact_helper (0, 6)
  6. 6

Notice that there is no growth in the size of the expression because there are no pending computations to be resumed upon completion of the recursive call.   Consequently, there is no growth in the space required for an application, in contrast to the first definition given above.  In this sense tail recursive definitions are analogous to loops in imperative languages: they merely iterate a computation, and do not require allocation of storage during execution.  For this reason tail recursive procedures are sometimes called iterative.

Time and space usage are important, but what is more important is that the function compute the intended result.  The key to the correctness of a recursive function is an inductive argument establishing its correctness.  The critical ingredients are these:

  1. A specification of the result of the function stated in terms of its arguments.   This specification will usually involve assumptions about the arguments that are sufficient to establish that the function behaves correctly.
  2. An induction principle that justifies the correctness of the recursive function based on the pattern of its recursive calls.  In simple cases this is ordinary mathematical induction, but in more complicated situations a more sophisticated approach is often required.

These ideas may be illustrated by considering the first definition of factorial given above.  A reasonable specification for factorial is as follows:

if n>=0 then factorial n evaluates to n!

Notice that the specification expresses the assumption that the argument, n, is non-negative, and asserts that the application of factorial to n terminates with the expected answer.

To check that  satisfies this specification, we apply the principle of mathematical induction on the argument n.  Recall that this means we are to establish the specification for the case n=0, and, assuming it to hold for n>=0, show that it holds for n+1.  The base case, n=0, is trivial: by definition factorial n evaluates to 1, which is 0!.   Now suppose that n=m+1 for some m>=0.  By the inductive hypothesis we have that factorial m evaluates to m!, and so by the definition factorial n evaluates to the value of n*m! = (m+1)*m! = (m+1)! = n!, as required.  This completes the proof.

That was easy.  What about the second definition of factorial?  We focus on the behavior of fact_helper.  A suitable specification is

if n>=0 then fact_helper (n,r) evaluates to n!*r

Once again we proceed by mathematical induction on n; you can easily check that fact_helper satisfies this specification.  It follows that the second definition of factorial in terms of fact_helper satisfies the specification of factorial given above, since we may take r=1 in the specification of fact_helper.

As a matter of programming style, it is usual to conceal the definitions of helper functions using a local declaration.  In practice we would make the following definition of the iterative version of factorial:

      fun fact_helper (0,r:int) = r
        | fact_helper (n:int,r:int) = fact_helper (n-1,n*r)
      fun factorial (n:int) = fact_helper (n,1)

This way the helper function is not visible, only the function of interest is "exported" by the declaration.

Here's an example of a function defined by complete induction, the Fibonacci function, defined on integers n>=0:

(* for n>=0, fib n evaluates to the nth Fibonacci number *)
fun fib 0 = 1
  | fib 1 = 1
  | fib (n:int) = fib (n-1) + fib (n-2)

The recursive calls are made not only on n-1, but also n-2, which is why we must appeal to complete induction to justify the definition.  This definition of fib is very inefficient because it performs many redundant computations: to compute fib n requires that we compute fib (n-1) and fib (n-2).  To compute fib (n-1) requires that we compute fib (n-2) a second time, and fib (n-3).   Computing fib (n-2) requires computing fib (n-3) again, and fib (n-4).  As you can see, there is considerable redundancy here.  It can be shown that the running time fib of  is exponential in its argument, which is clearly awful for such a simple function.

Here's a better solution: for each n>=0 compute not only the nth Fibonacci number, but also the (n-1)st as well.  (For n=0 we define the "-1"st Fibonacci number to be zero).  That way we can avoid redundant recomputation, resulting in a linear-time algorithm.  Here's the code:

(* for n>=0, fibb n evaluates to (a, b), where a is the nth Fibonacci number and b is the (n-1)st *)
fun fibb 0 = (1, 0)
  | fibb 1 = (1, 1)
  | fibb (n:int) =
        val (a:int, b:int) = fibb (n-1)
        (a+b, a)

You might feel satisfied with this solution since it runs in time linear in n.   But in fact there's a constant-time algorithm to compute the nth Fibonacci number!  In other words the recurrence

F0 = 1
F1 = 1
Fn = Fn-1 + Fn-2

has a closed-form solution.  (See Knuth's Concrete Mathematics (Addison-Wesley 1989) for a derivation.)  However, this is an unusual case.  In most instances recursively-defined functions have no known closed-form solution, so that some form of iteration is inevitable.

It is often useful to define two functions simultaneously, each of which calls itself and/or the other to compute its result.  Such fnctions are said to be mutually recursive.  Here's a simple example to illustrate the point, namely testing whether a natural number is odd or even.  The most obvious approach is to test whether the number is congruent to 0 mod 2, and indeed this is what one would do in practice.  But to illustrate the idea of mutual recursion we instead use the following inductive characterization: 0 is even, and not odd; n>0 is even iff n-1 is odd; n>0 is odd iff n-1 is even.  This may be coded up using two mutually-recursive procedures as follows:

fun even 0 = true
  | even n = odd (n-1)
and odd 0 = false
  | odd n = even (n-1)

Notice that even calls odd and odd calls even, so they are not definable separately from one another.  We join their definitions using the keyword and to indicate that they are defined simultaneously by mutual recursion.  Later in these notes we will see more compelling examples of mutually-recursive functions.

Sample Code for this Chapter