In the preceding chapter we considered the following signature of dictionaries with an arbitrary key type:

signature DICT = sig

type key

val lt : key * key -> bool

val eq : key * key -> bool

type 'a dict

val empty : 'a dict

val insert : 'a dict * key * 'a -> 'a dict

val lookup : 'a dict * key -> 'a

end

The signatures of dictionaries with particular choices of key type were defined using
the "`where type`

" construct. For example, the signature
declarations

signature STRING_DICT = DICT where type key=string

signature INT_DICT = DICT where type key=int

define the signatures of dictionaries with string and integer keys, respectively.
The motivation for introducing these specialized instances of the `DICT`

signature is that we typically wish to hold the implementation type, `'a dict`

,
of dictionaries abstract, but leave the type of keys concrete, as described earlier.

The signature `DICT`

is a bit unsatisfactory because it mixes two different
notions in one interface, namely the type, `key`

, of keys and its associated
comparison operations, `lt`

and `eq`

, and the type `'a dict`

of dictionaries and its associated operations `empty`

, `insert`

, and
`lookup`

. It would be cleaner to separate these two aspects of the
interface, especially since we shall soon consider the key component to be
"generic", with the rest being "specific", to the abstraction.
The way to do this in ML is with a *substructure*, as follows:

signature DICT = sig

structure Key : ORDERED

type 'a dict

val empty : 'a dict

val insert : 'a dict * Key.t * 'a -> 'a dict

val lookup : 'a dict * Key.t -> 'a

end

The type of keys and the operation on it are segregated into a *substructure* of
the dictionary structure, a component of a structure that is itself a structure.
Correspondingly, uses of the type `key`

are replaced by references to the `t`

component of the substructure `Key`

. This leads to a *hierarchical*
organization in which we consider the key structure to be subservient to the dictionary
operations.

Specialized versions of the signature DICT are build essentially as before, except that we use a long identifier to specify the type of keys:

signature STRING_DICT = DICT where type Key.t=string

signature INT_DICT = DICT where type Key.t=int

Specific implementations of these specialized instances may be defined as follows:

structure StringDict :> STRING_DICT = struct

structure Key : ORDERED = StringLT

type 'a dict = Key.t BinTree.tree

val empty = BinTree.empty

val insert = ...insert into a BST using Key.lt and Key.eq...

val lookup = ...lookup in a BST using Key.lt and Key.eq...

end

structure IntDict :> INT_DICT = struct

structure Key : ORDERED = IntLT

type 'a dict = Key.t BinTree.tree

val empty = BinTree.empty

val insert = ...insert into a BST using Key.lt and Key.eq...

val lookup = ...lookup in a BST using Key.lt and Key.eq...

end

The difficulty, of course, is that we are repeating the code for dictionaries in each
implementation; the elided parts of both structures would be *identical*. The
*only* difference between the two dictionary structures lies in the implementation
of keys; in one case we choose string operations and in the other we choose integer
operations. Since the bulk of the code is the same, it is a pity to have to repeat
it for each choice of key type.

Fortunately, ML provides a convenient means of avoiding such redundancy, called a *functor*.
A functor is a *parameterized module*, or a *generic structure*, that
is defined in terms of zero or more argument structures with a specified signature.
A functor may be *applied*, or *instaniated*, with any structures matching
the argument signatures. A functor is therefore a kind of function taking zero or
more structures as arguments and yielding a structure as result.

In the case of dictionaries we may define a generic implementation that is parameterized by the type of keys and associated comparison operations. This is achieved by introducing a functor.

functor Dict (structure K : ORDERED) :> DICT where type Key.t=K.t =

struct

structure Key : ORDERED = K

type 'a dict = Key.t BinTree.tree

val empty = BinTree.empty

val insert = ...insert into a BST using Key.lt and Key.eq...

val lookup = ...lookup in a BST using Key.lt and Key.eq...

end

This declaration introduces a functor named `Dict`

that takes as argument
any structure implementing the signature `ORDERED`

, and yields a structure
implementing the instance of the signature `DICT`

determined by taking the key
type of the dictionary to be the type component of its argument, leaving the type of
dictionaries abstract. The type checker ensures that the body of the functor matches
the specified result signature, under the assumption that the argument has the stated
signature. In the case of the `Dict`

functor the type checker ensures
that the principal signature of the body of the functor (the part between `struct`

and `end`

) matches the signature

DICT where type Key.t=K.t,

assuming that the structure `K`

has signature `ORDERED`

.

The `Dict`

functor encapsulates the implementation of dictionaries as a
generic structure that is independent of the specific choice of keys. One advantage
of this encapsulation is that should we wish to modify the implementation of dictionaries,
say to fix an error or to improve performance, we need only modify the `Dict`

functor, rather than change every occurrence of the dictionary code spread throughout a
large system. This is obviously advantageous for both the original author of the
code, and anyone who must maintain it in the future. In fact common data structures
such as dictionaries are typically provided as part of a "shrink wrapped"
library, and hence are shared among many different programs, thereby increasing code reuse
and reducing redundancy.

The `Dict`

functor provides a generic implementation of dictionaries.
Dictionaries with specific key types may be built by instantiating the `Dict`

functor as follows:

structure IntDict = Dict (structure K = IntLT)

structure StringDict = Dict (structure K = StringLT)

Notice that functor application uses keyword parameter passing --- the parameter is explicitly bound to a structure using a structure binding. In practice the right-hand sides of such bindings are always (long) identifiers; if not, the compiler implicitly inserts bindings to ensure that this is the case. In our discussions we will tacitly assume that the right-hand side of all such bindings are (long) identifiers.

What are the signatures of the structure variables `IntDict`

and `StringDict`

?
Since no signature is ascribed to these bindings, the principal signature of the
corresponding right-hand side of the binding is assigned to each variable, in keeping with
our previous policies. Since the right-hand side in these examples is a functor
application, we must answer the question: what is the principal signature of a functor
application? If --- as here --- the result signature of the functor is opaque, the
principal signature is precisely the asribed signature of the functor, but with the
structure parameter replaced by its binding (which must be, by our assumption, another
structure identifier). Thus the signature assigned to `IntDict`

is

DICT where type Key.t=IntLT.t

which is equivalent to the signature

DICT where type Key.t=int

since `IntLt.t`

= `int`

. Similarly, the signature assigned
to `StringDict`

is

DICT where type Key.t=StringLT.t

which is equivalent to the signature

DICT where type Key.t=string

What if the functor has no result signature, or its result signature is transparently ascribed? In that case we assign the intermediate signature of the match as the result signature of the functor, and use that signature as the implied result signature of the functor.

Dictionaries illustrate the use of the ML module system to build generic
implementations of abstract types. A generic implementation of priority queues
(which support a `remove_min`

operation that dequeues the "least"
element of the queue relative to a specified ordering) may be built in an exactly
analogous manner. Here's a suitable signature of priority queues:

signature PRIO_QUEUE = sig

structure Elt : ORDERED

type prio_queue

exception Empty

val empty : prio_queue

val insert : Elt.t * prio_queue -> prio_queue

val remove : prio_queue -> Elt.t * prio_queue

end

Notice that `prio_queue`

is a type, and not a type constructor, as it was in
the case of "plain" queues. This is a reflection of the fact that the
operations on a priority queue are not independent of the type of elements (as they are
with plain queues), but rely on the comparison operations that are provided with the `Elt`

structure.

A generic implementation of priority queues is a functor taking as argument a structure containing the element type together with its associated operations:

functor PrioQueue

(structure E : ORDERED) :> PRIO_QUEUE where type Elt.t=E.t =

struct

structure Elt : ORDERED = E

type prio_queue = ...a heap based on the ordering Elt.lt...

exception Empty

val empty = ...the empty heap...

val insert = ...sift a new element into the heap...

val remove = ...remove the least element and adjust the heap...

end

Specific instances of priority queues may be built as follows:

structure IntPQ = PrioQueue (structure E = IntLT)

structure StringPQ = PrioQueue (structure E = StringLT)

with signatures

PRIO_QUEUE where type Elt.t=int

and

PRIO_QUEUE where type Elt.t=string

respectively.

The situation becomes more interesting when we wish to combine two or more abstract types to form a third. Suppose we are to implement a (hypothetical) abstract type that employs an ordered type of values that occur both as keys of a dictionary and elements of a priority queue. The signature of this abstract type might look like this

signature ADT = sig

structure Val : ORDERED

type adt

...operations...

end

The implementation should be generic in the type of values, and also in the
implementation of dictionaries and priority queues; we don't want to build the
implementation of these auxiliary data structures into the implementation of `ADT`

's.
There are two approaches to building an `Adt`

functor, each with its
advantages and disadvantages. Here's the first approach:

functor Adt

(structure V : ORDERED) :> ADT where type Val.t=V.t =

struct

structure Val : ORDERED = V

structure D = Dict (structure K = V)

structure Q = PrioQueue (structure E = V)

type adt = ...

...

end

The functor `Adt`

instantiates the `Dict`

and `PrioQueue`

functors to the structure of values specified as argument to the `Adt`

functor.
This ensures that the type equation

D.Key.t = Q.Elt.t = V.t

holds inside the body of the functor, so that expressions such as

D.insert (Q.remove_min ..., ...)

are well-typed. (The structures `D`

and `Q`

are not visible
outside of the functor since they do not appear in the result signature; they are local
auxiliaries used within the functor.)

This approach works well, but if the `Dict`

or `PrioQueue`

functors are changed, the `Adt`

functor must be recompiled to pick up the new
versions. An alternative, which avoids this dependency of the implementation of `Adt`

on the implementations of the `Dict`

and `PrioQueue`

functors, is to
treat the dictionary and priority queue structures as additional parameters to the `Adt`

functor. This leads to the following setup:

functor Adt'

(structure V : ORDERED and D : DICT and Q : PRIO_QUEUE) :>

ADT where type Val.t=V.t =

struct

structure Val = V

type adt = ...implementation type...

...implementation of operations...

end

To build an instance of the `Adt'`

functor we must first built appropriate
instances of the `Dict`

and `PrioQueue`

functors and pass these to `Adt'`

:

structure IntDict = Dict (structure K=IntLT)

structure IntPQ = Dict (structure K=IntLT)

structure A = Adt' (structure V=IntLt and D=IntDict and Q=IntPQ)

There is a problem, however, with this setup: the functor `Adt'`

is
ill-typed! It is no longer true within the body of `Adt`

that the type
equation

D.Key.t = Q.Elt.t = V.t

holds in the body of `Adt'`

, even though the equation

IntDict.Key.t = IntPQ.Elt.t = IntLT.t = int

does hold of the arguments, for we might well choose arguments for which the required equation is invalid. In short, the functor is "too generic", and consequently the body is not type correct.

What to do? The solution is to restrict the parameters to the `Adt'`

functor so that the only possible instances are those that satisfy the required equation.
There are two methods for doing this, both equivalent. The first is to
explicitly require that the dictionary and priority queue arguments agree on the value
type passed as parameter:

functor Adt'

(structure V : ORDERED

and D : DICT where type Key.t=V.t

and Q : PRIO_QUEUE where type Elt.t=V.t) :>

ADT where type Val.t=V.t =

struct

...

end

The body of `Adt'`

is now type correct since the required type equations
hold as a result of our additional assumptions on the arguments.

An alternative is to impose the equational requirement on types in a *post hoc*
manner using a *sharing specification*:

functor Adt'

(structure V : ORDERED and D : DICT and Q : PRIO_QUEUE

sharing type D.Key.t = Q.Elt.t = V.t) :>

ADT where type Val.t=V.t =

struct

...

end

The sharing specification stipulates that the given equation must hold of any instance
of this functor. Any attempt to instantiate `Adt'`

with structures `V`

,
`D`

, and `Q`

not satisfying the sharing specification is rejected as
ill-formed.

An advantage of sharing specifications is that they provide a direct, symmetric
specification of the required type equation without forcing the programmer to explicitly
"thread" the common type through the various signatures. In fact sharing
specifications encourage concision since they do not require that the common component be
"factored out" as it is in the foregoing example. Here is a more concise
formulation of the `Adt'`

functor in which we drop the first argument entirely,
relying only on a `sharing`

specification to constraint the dictionary and
priority queue structures appropriately.

functor Adt'

(structure D : DICT and Q : PRIO_QUEUE

sharing type D.Key.t = Q.Elt.t) :>

ADT where type Val.t=D.Key.t =

struct

...

end

Notice that the result signature changes slightly to extract the common type from one
of the parameters, the choice of which being arbitrary in the presence of the `sharing`

specification.