We have already noted that aggregate data structures are especially easy to handle in
ML. Our first examples were tuple and record types. The *list *types
provide another example of an aggregate data structure in ML. Informally, the values
of type *typ *`list`

are the finite lists of values of type *typ*.
But what is a list? The values of type *typ *`list`

are
defined as follows:

`nil`

is a value of type*typ*`list`

.- if
*h*is a value of type*typ*, and*t*is a value of type*typ*`list`

, then*h*`::`

*t*is a value of type*typ*`list`

. - Nothing else is a value of type
*typ*`list`

.

The type expression *typ *`list`

is a postfix notation for the
application of the *type constructor *`list`

to the argument *typ*.
Thus `list`

is a kind of "function" mapping types to types:
given a type *typ*, we may apply `list`

to it to get another type,
written *typ *`list`

. The forms `nil`

and `::`

are the *value constructors* of type *typ *`list`

. The
nullary (no argument) constructor `nil`

may be thought of as the empty list.
The binary (two argument) constructor `::`

constructs a non-empty list
from a value *h* of type *typ* and another value *t* of type *typ *`list`

;
the resulting value, *h*`::`

*t*, of type *typ* `list`

is pronounced "*h* cons* t*" (for historical reasons). We
say that *"h* is cons'd onto *t*", that *h* is the
"head" of the list, and that *t* is its "tail".

The definition of the values of type *typ *`list`

given above
is an example of an *inductive* *definition*. The type is said
to be *recursive* because this definition is "self-referential" in the
sense that the values of type *typ *`list`

are defined in terms of
(other) values of the same type. This is especially clear if we examine the types of
the value constructors for the type *typ *`list`

:

`nil`

:

typ`list`

:

op ::

typ`*`

typ`list ->`

typ`list`

(The notation `op ::`

is used to *refer* to the "cons"
operator as a function, rather than to *use* it to form a list, which requires
infix notation.) Two things are notable here:

- The "cons" operation takes an argument of type
*typ*`list`

, and yields a result of type*typ*`list`

. This reflects the "recursive" nature of the type*typ*`list`

. - Both operations are
*polymorphic*in the type of the underlying elements of the list. Thus`nil`

is the empty list of type*typ*`list`

for any element type*typ*, and`op ::`

constructs a non-empty list independently of the type of the elements of that list.

A consequence of the inductive definition of the list type is that values of type *typ*```
list
```

have the form

h_{1}`::(`

h_{2}`::`

...`::(`

h_{n}`::`

`nil)`

...`)`

for some *n>=0*. (When *n* is zero, this is, by convention, the
empty list,

.) The operator *nil*`::`

is *right-associative*,
so we may omit the parentheses and just write

h_{1}`::`

h_{2}`::`

...`::`

h_{n}`::`

`nil`

.

As a further convenience this list may be abbreviated using *list notation*:

`[`

h_{1}`,`

h_{2}`,`

...`,`

h_{n}`]`

This notation emphasizes the interpretation of lists as finite sequences of values, but
it obscures the fundamental inductive character of lists as being built up from `nil`

using the `::`

operation.

How do we compute with values of list type? Since the values are defined
inductively, it is natural that functions on lists be defined recursively, using a clausal
definition that analyzes the structure of a list. Here's a definition of the
function `length`

that computes the number of elements of a list:

`fun length nil = 0`

| length (_::t) = 1 + length t

The definition is given by induction on the structure of the list argument. The
base case is the empty list, `nil`

. The inductive step is the non-empty
list `_::t`

(notice that we do not need to give a name to the head). Its
definition is given in terms of the tail of the list `t`

, which is
"smaller" than the list `_::t`

. The type of `length`

is `'a list -> int`

; it is defined for lists of values of any type
whatsoever.

We may define other functions following a similar pattern. Here's the function to append two lists:

`fun append (nil, l) = l`

| append (h::t, l) = h :: append (t, l)

This function is built into ML; it is written using infix notation as *exp _{1}
*

`@`

`append`

is proportional to the length of the first list, as should be obvious from its definition.Here's a function to reverse a list.

`fun rev nil = nil`

| rev (h::t) = rev t @ [h]

It is not tail recursive. In fact, its time complexity is *O(n ^{2})*,
where

T(0) = O(1)

T(n+1) = T(n) + O(n)

Solving the recurrence we obtain the result *T(n)=O(n ^{2}).*

Can we do better? Oddly, we can take advantage of the *non-associativity*
of `::`

to give a tail-recursive definition of `rev`

.

`local`

fun rev_helper (nil, a) = a

| rev_helper (h::t, a) = rev_helper (t, h::a)

in

fun rev l = rev_helper (l, nil)

end

The pattern is the same as before, except that by re-associating the uses of `::`

we reverse the list! The helper function reverses its first argument and prepends it
to its second argument. That is, `rev_helper (l, a)`

evaluates to ```
(rev
l) @ a
```

, where we assume here an independent definition of `rev`

for the
sake of the specification. Notice that `rev_helper`

runs in time
proportional to the length of its first argument, and hence `rev`

runs in time
proportional to the length of the list.

The correctness of functions defined on lists is established using the principle of *structural
induction*. We illustrate this by establishing that the function `rev_helper`

satisfies the following specification:

for every l and a of type typ`list`

,`rev_helper (`

l`,`

a`)`

evaluates to the result of appending a to the reversal of l.

The proof is by structural induction on the list *l*. If *l* is `nil`

,
then `rev_helper (`

*l*`, `

*a*`)`

evaluates
to *a*, which is as required. If *l* is *h*`::`

*t*,
then by the inductive hypothesis `rev_helper (`

*l*`, `

*a*`)`

evaluates to the result of appending *h*`::`

*a* to the reversal
of *t*, which is easily seen to be the result of appending *a* to the
reversal of *h*`::`

*t*.

The form of this argument may be summarized as follows:

- Establish the correctness of the function for the empty list,
`nil`

. - Assuming the correctness of the function for
*t*, establish its correctness for*h*`::`

*t*.

It follows that the function is correct for all lists *l*, by the inductive
definition of the list type. This is called the principle of *structural
induction on lists*. We will soon generalize this to other inductively-defined
types.