
It's time to return to function definitions.  So far we've only considered very simple functions (such as the reciprocal function) whose value is computed more or less directly using the primitives of the language.  You may well be wondering at this stage how to define functions that require some form of iteration to compute.   In familiar imperative languages iteration is accomplished using `while` and `for` loops; in ML it is accomplished using recursion.

Informally, a function defined by recursion is one that computes the result of a call by "calling itself".  To accomplish this, the function must be given a name by which it can refer to itself.  This is achieved using a recursive value binding.  Recursive value bindings have almost the same form as ordinary, non-recursive value bindings, except that the binding is qualified with the adjective "`rec`" by writing `val rec pat = exp`.   Here's an example:

`val rec factorial : int->int = fn 0 => 1 | n:int => n * factorial (n-1)`

This is a recursive definition of the factorial function, which is ordinarily defined in textbooks by the recursion equations

0! = 1
n! = n*(n-1)!  (n>0)

Using `fun` notation we may write the definition of factorial much more clearly and concisely as follows:

```
fun factorial 0 = 1
  | factorial (n:int) = n * factorial (n-1)
```

There is clearly a close correspondence between the ML notation and the mathematical notation for the factorial function.
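To see the definition in action, it may help to apply it to a few arguments in the interactive system (the value bindings below are illustrative, not part of the original definition):

```
fun factorial 0 = 1
  | factorial (n:int) = n * factorial (n-1)

val six = factorial 3          (* evaluates to 6 *)
val onetwenty = factorial 5    (* evaluates to 120 *)
```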

How are recursive value bindings type-checked?  The answer may appear, at first reading, to be paradoxical: assume that the function has the type specified, then check that the definition is consistent with this assumption.  In the case of `factorial` we assume that `factorial` has type `int->int`, then check that its definition

`fn 0 => 1 | n:int => n * factorial (n-1)`

has type `int->int`.  To do so we must check that each pattern has type `int`, and that each corresponding expression has type `int`.   This is clearly true for the first clause of the definition.  For the second, we assume that `n` has type `int`, then check that ```n * factorial (n-1)``` has type `int`.  This is so because of the rules for the primitive arithmetic operations and because of our assumption that `factorial` has type `int->int`.  (Be certain that you understand this reasoning!   It is essential for what follows.)

How are applications of recursive value bindings evaluated?  The rules are almost the same as before.  We need only observe that the binding for the function may have to be retrieved many times during evaluation (once for each recursive call).  For example, to evaluate `factorial 3`, we retrieve the definition of `factorial`, then pattern match the argument against the pattern of each clause.  Clearly `3` does not match `0`, but it does match `n:int`, binding `n` to `3` in the process.  We then evaluate `n * factorial (n-1)` relative to this binding for `n`.  To do so we retrieve the binding for `factorial` a second time and apply it to `2`.  Once again we consider each clause in turn, failing to match `0`, but succeeding in matching `n:int`.   This introduces a new binding for `n` that shadows the previous binding, so that `n` now evaluates to `2`.  We then proceed once again to evaluate `n * factorial (n-1)`, this time with `n` bound to `2`.  Once again we retrieve the binding for `factorial`, then bind `n` to `1`, shadowing the two previous bindings, and evaluate `n * factorial (n-1)` with this binding for `n`.  We retrieve the binding for `factorial` one last time and apply it to `0`.   This time we match the pattern `0` and yield `1`.  We then (in three steps) compute the result, `6`, by completing the pending multiplications.

The `factorial` function illustrates an important point about recursive function definitions.  Notice that the recursive call in the definition of `factorial` occurs as the argument of a multiplication.  This means that in order for the multiplication to complete, we must first complete the calculation of the recursive call to `factorial`.  In rough outline the computation of `factorial 3` proceeds as follows:

1. `factorial 3`
2. `3 * factorial 2`
3. `3 * 2 * factorial 1`
4. `3 * 2 * 1 * factorial 0`
5. `3 * 2 * 1 * 1`
6. `3 * 2 * 1`
7. `3 * 2`
8. `6`

(The chains of multiplications are implicitly right-associated.)  Notice that the size of the expression first grows (in direct proportion to the argument), then shrinks as the pending multiplications are completed.  This growth in expression size corresponds directly to a growth in run-time storage required to record the state of the pending computation.

The foregoing definition of `factorial` should be contrasted with the following definition:

```
fun fact_helper (0, r:int) = r
  | fact_helper (n:int, r:int) = fact_helper (n-1, n*r)

fun factorial (n:int) = fact_helper (n, 1)
```

We define `factorial` using a helper function `fact_helper` that takes an additional parameter, called an accumulator, that records the running partial result of the computation.  This corresponds to reducing the prefix of the pending computations in the trace given above by "left-associating" the multiplications.  (In fact the technique is only applicable to associative binary operations for precisely this reason.)

The important thing to observe about `fact_helper` is that it is tail recursive, meaning that the recursive call is the last step of evaluation of an application of it to an argument.  The following evaluation trace illustrates the point:

1. `factorial 3`
2. `fact_helper (3, 1)`
3. `fact_helper (2, 3)`
4. `fact_helper (1, 6)`
5. `fact_helper (0, 6)`
6. `6`

Notice that there is no growth in the size of the expression because there are no pending computations to be resumed upon completion of the recursive call.   Consequently, there is no growth in the space required for an application, in contrast to the first definition given above.  In this sense tail recursive definitions are analogous to loops in imperative languages: they merely iterate a computation, and do not require allocation of storage during execution.  For this reason tail recursive procedures are sometimes called iterative.
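The loop analogy can be made concrete with another accumulator-based definition.  The following sketch (the names `sum_helper` and `sum_to` are ours, chosen for illustration) computes 1 + 2 + ... + n in constant space, exactly as an imperative loop with a running-total variable would:

```
(* for n>=0, sum_helper (n, r) evaluates to (1 + 2 + ... + n) + r *)
fun sum_helper (0, r:int) = r
  | sum_helper (n:int, r:int) = sum_helper (n-1, n+r)

fun sum_to (n:int) = sum_helper (n, 0)   (* sum_to 4 evaluates to 10 *)
```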

Time and space usage are important, but what is more important is that the function compute the intended result.  The key to the correctness of a recursive function is an inductive argument establishing its correctness.  The critical ingredients are these:

1. A specification of the result of the function stated in terms of its arguments.   This specification will usually involve assumptions about the arguments that are sufficient to establish that the function behaves correctly.
2. An induction principle that justifies the correctness of the recursive function based on the pattern of its recursive calls.  In simple cases this is ordinary mathematical induction, but in more complicated situations a more sophisticated approach is often required.

These ideas may be illustrated by considering the first definition of `factorial` given above.  A reasonable specification for `factorial` is as follows:

if n>=0 then `factorial` n evaluates to n!

Notice that the specification expresses the assumption that the argument, n, is non-negative, and asserts that the application of `factorial` to n terminates with the expected answer.

To check that `factorial` satisfies this specification, we apply the principle of mathematical induction on the argument n.  Recall that this means we are to establish the specification for the case n=0, and, assuming it to hold for n>=0, show that it holds for n+1.  The base case, n=0, is trivial: by definition `factorial 0` evaluates to 1, which is 0!.   Now suppose that n=m+1 for some m>=0.  By the inductive hypothesis we have that `factorial` m evaluates to m!, and so by the definition `factorial` n evaluates to the value of n*m! = (m+1)*m! = (m+1)! = n!, as required.  This completes the proof.

That was easy.  What about the second definition of `factorial`?  We focus on the behavior of `fact_helper`.  A suitable specification is

if n>=0 then `fact_helper (n, r)` evaluates to n!*r

Once again we proceed by mathematical induction on n; you can easily check that `fact_helper` satisfies this specification.  It follows that the second definition of `factorial` in terms of `fact_helper` satisfies the specification of `factorial` given above, since we may take r=1 in the specification of `fact_helper`.

As a matter of programming style, it is usual to conceal the definitions of helper functions using a `local` declaration.  In practice we would make the following definition of the iterative version of `factorial`:

```
local
  fun fact_helper (0, r:int) = r
    | fact_helper (n:int, r:int) = fact_helper (n-1, n*r)
in
  fun factorial (n:int) = fact_helper (n, 1)
end
```

This way the helper function is not visible; only the function of interest is "exported" by the declaration.

Here's an example of a function defined by complete induction, the Fibonacci function, defined on integers n>=0:

```
(* for n>=0, fib n evaluates to the nth Fibonacci number *)
fun fib 0 = 1
  | fib 1 = 1
  | fib (n:int) = fib (n-1) + fib (n-2)
```

The recursive calls are made not only on `n-1`, but also on `n-2`, which is why we must appeal to complete induction to justify the definition.  This definition of `fib` is very inefficient because it performs many redundant computations: to compute `fib n` requires that we compute `fib (n-1)` and `fib (n-2)`.  To compute `fib (n-1)` requires that we compute `fib (n-2)` a second time, and `fib (n-3)`.   Computing `fib (n-2)` requires computing `fib (n-3)` again, and `fib (n-4)`.  As you can see, there is considerable redundancy here.  It can be shown that the running time of `fib` is exponential in its argument, which is clearly awful for such a simple function.

Here's a better solution: for each n>=0 compute not only the nth Fibonacci number, but also the (n-1)st as well.  (For n=0 we define the "-1"st Fibonacci number to be zero).  That way we can avoid redundant recomputation, resulting in a linear-time algorithm.  Here's the code:

```
(* for n>=0, fibb n evaluates to (a, b), where a is the nth Fibonacci
   number and b is the (n-1)st *)
fun fibb 0 = (1, 0)
  | fibb 1 = (1, 1)
  | fibb (n:int) =
    let
      val (a:int, b:int) = fibb (n-1)
    in
      (a+b, a)
    end
```
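Following the stylistic advice given earlier, the helper `fibb` could itself be concealed with a `local` declaration.  A possible packaging (our sketch, not part of the original text) is:

```
local
  fun fibb 0 = (1, 0)
    | fibb 1 = (1, 1)
    | fibb (n:int) =
      let
        val (a:int, b:int) = fibb (n-1)
      in
        (a+b, a)
      end
in
  (* select the first component of the pair computed by fibb *)
  fun fib (n:int) = #1 (fibb n)
end
```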

You might feel satisfied with this solution since it runs in time linear in n.   But in fact there's a constant-time algorithm to compute the nth Fibonacci number!  In other words the recurrence

F(0) = 1
F(1) = 1
F(n) = F(n-1) + F(n-2)   (n >= 2)

has a closed-form solution.  (See Graham, Knuth, and Patashnik's Concrete Mathematics (Addison-Wesley, 1989) for a derivation.)  However, this is an unusual case.  In most instances recursively-defined functions have no known closed-form solution, so that some form of iteration is inevitable.
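For the curious, the closed form (Binet's formula) can be transcribed into ML using the `Math` structure from the Standard Basis.  This is only a sketch: real arithmetic is approximate, so the rounding trick below is reliable only for modest arguments (roughly n < 70 with typical double-precision reals):

```
(* With F(0) = F(1) = 1, F(n) is the (n+1)st classical Fibonacci number,
   so F(n) = (phi^(n+1) - psi^(n+1)) / sqrt 5, where phi = (1 + sqrt 5)/2.
   The psi term is less than 1/2 in magnitude, so rounding recovers F(n). *)
val phi = (1.0 + Math.sqrt 5.0) / 2.0

fun fib_closed (n:int) : int =
    Real.round (Math.pow (phi, real (n+1)) / Math.sqrt 5.0)
```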

It is often useful to define two functions simultaneously, each of which calls itself and/or the other to compute its result.  Such functions are said to be mutually recursive.  Here's a simple example to illustrate the point, namely testing whether a natural number is odd or even.  The most obvious approach is to test whether the number is congruent to 0 mod 2, and indeed this is what one would do in practice.  But to illustrate the idea of mutual recursion we instead use the following inductive characterization: 0 is even, and not odd; n>0 is even iff n-1 is odd; n>0 is odd iff n-1 is even.  This may be coded up using two mutually-recursive procedures as follows:

```
fun even 0 = true
  | even n = odd (n-1)
and odd 0 = false
  | odd n = even (n-1)
```

Notice that `even` calls `odd` and `odd` calls `even`, so they are not definable separately from one another.  We join their definitions using the keyword `and` to indicate that they are defined simultaneously by mutual recursion.  Later in these notes we will see more compelling examples of mutually-recursive functions.
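For comparison, the mod-2 test mentioned above, which is what one would do in practice, requires no recursion at all:

```
fun even (n:int) = n mod 2 = 0
fun odd  (n:int) = n mod 2 = 1
```

(Recall that in ML, `mod` yields a result with the sign of the divisor, so `n mod 2` is always `0` or `1` here and these definitions behave sensibly even on negative arguments.)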