Sample Code for This Chapter
This chapter is concerned with persistent and ephemeral abstract types. The distinction is best explained in terms of the logical future of a value. Whenever a value of an abstract type is created it may be subsequently acted upon by the operations of the type (and, since the type is abstract, by no other operations). Each of these operations may yield (other) values of that abstract type, which may themselves be handed off to further operations of the type. Ultimately a value of some other type, say a string or an integer, is obtained as an observable outcome of the succession of operations on the abstract value. The sequence of operations performed on a value of an abstract type constitutes a logical future of that type --- a computation that starts with that value and ends with a value of some observable type. We say that a type is ephemeral iff every value of that type has at most one logical future, which is to say that it is handed off from one operation of the type to another until an observable value is obtained from it. This is the normal case in familiar imperative programming languages because in such languages the operations of an abstract type destructively modify the value upon which they operate; its original state is irretrievably lost by the performance of an operation. It is therefore inherent in the imperative programming model that a value have at most one logical future. In contrast, values of an abstract type in functional languages such as ML may have many different logical futures, precisely because the operations do not "destroy" the value upon which they operate, but rather create fresh values of that type to yield as results. Such values are said to be persistent because they persist after application of an operation of the type, and in fact may serve as arguments to further operations of that type.
Some examples will help to clarify the distinction. The primitive list types of ML are persistent because the performance of an operation such as cons'ing, appending, or reversing a list does not destroy the original list. This leads naturally to the idea of multiple logical futures for a given value, as illustrated by the following code sequence:
val l = [1,2,3] (* original list *)
val m1 = tl l (* first future of l *)
val n1 = rev m1 (* [3,2] *)
val m2 = l @ [4,5,6] (* second future of l *)
Notice that the original list value, [1,2,3], has two distinct logical futures, one in which we remove its head, then reverse the tail, and the other in which we append the list [4,5,6] to it. The ability to easily handle multiple logical futures for a data structure is a tremendous source of flexibility and expressive power, alleviating the need to perform tedious bookkeeping to manage "versions" or "copies" of a data structure to be passed to different operations.
The prototypical ephemeral data structure in ML is the reference cell. Performing an assignment operation on a reference cell changes it irrevocably; the original contents of the cell are lost, even if we keep a handle on it.
val r = ref 0 (* original cell *)
val s = r
val _ = (s := 1)
val x = !r (* 1! *)
Notice that the contents of (the cell bound to) r changes as a result of performing an assignment to the underlying cell. There is only one future for this cell; a reference to its original binding does not yield its original contents.
More elaborate forms of ephemeral data structures are certainly possible. For example, the following declaration defines a type of lists whose tails are mutable. It is therefore a singly-linked list, one whose predecessor relation may be changed dynamically by assignment:
datatype 'a mutable_list = Nil | Cons of 'a * 'a mutable_list ref
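For instance, assigning to the tail reference of a cell changes every list that shares that cell. A small illustrative sketch (the bindings below are ours, chosen for illustration):

```sml
val cell = ref Nil
val l = Cons (1, cell)              (* l is the one-element list 1 *)
val () = cell := Cons (2, ref Nil)
(* l is now the two-element list 1, 2: its tail was updated in place *)
```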
Values of this type are ephemeral in the sense that some operations on values of this type are destructive, and hence are irreversible (so to speak!). For example, here's an implementation of a destructive reversal of a mutable list. Given a mutable list l, this function reverses the links in the cells so that the elements occur in reverse order of their occurrence in l.
fun ipr (Nil, a) = a
  | ipr (this as Cons (_, r as ref next), a) =
      ipr (next, (r := a; this))
(* destructively reverse a list *)
fun inplace_reverse l = ipr (l, Nil)
As you can see, the code is quite tricky to understand! The idea is the same as the iterative reverse function for pure lists, except that we re-use the nodes of the original list, rather than generate new ones, when moving elements onto the accumulator argument.
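To see the destructive behavior concretely, here is a small usage sketch (the bindings are illustrative):

```sml
val l = Cons (1, ref (Cons (2, ref (Cons (3, ref Nil)))))
val r = inplace_reverse l
(* r is the list 3, 2, 1, built from the very same cells as l;
   l itself is now the one-element list 1, because the reversal
   redirected its tail reference to Nil *)
```

Observe that the original binding l no longer denotes the list we started with: its single logical future has been consumed by the reversal.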
The distinction between ephemeral and persistent data structures is essentially the distinction between functional (effect-free) and imperative (effect-ful) programming --- functional data structures are persistent; imperative data structures are ephemeral. However, this characterization is oversimplified in two respects. First, it is possible to implement a persistent data structure that exploits mutable storage. Such a use of mutation is an example of what is called a benign effect because for all practical purposes the data structure is "purely functional" (i.e., persistent), but is in fact implemented using mutable storage. As we will see later the exploitation of benign effects is crucial for building efficient implementations of persistent data structures. Second, it is possible for a persistent data type to be used in such a way that persistence is not exploited --- rather, every value of the type has at most one future in the program. Such a type is said to be single-threaded, reflecting the linear, as opposed to branching, structure of the future uses of values of that type. The significance of a single-threaded type is that it may as well have been implemented as an ephemeral data structure (e.g., by having observable effects on values) without changing the behavior of the program.
Here is a signature of persistent queues:
signature QUEUE = sig
  type 'a queue
  exception Empty
  val empty : 'a queue
  val insert : 'a * 'a queue -> 'a queue
  val remove : 'a queue -> 'a * 'a queue
end
This signature describes a structure providing a representation type for queues, together with operations to create an empty queue, insert an element onto the back of the queue, and to remove an element from the front of the queue. It also provides an exception that is raised in response to an attempt to remove an element from the empty queue. Notice that removing an element from a queue yields both the element at the front of the queue, and the queue resulting from removing that element. This is a direct reflection of the persistence of queues implemented by this signature; the original queue remains available as an argument to further queue operations.
By a sequence of queue operations we shall mean a succession of uses of insert and remove operations in such a way that the queue argument of one operation is obtained as a result of the immediately preceding queue operation. Thus a sequence of queue operations represents a single-threaded time-line in the life of a queue value. Here is an example of a sequence of queue operations:
val q0 : int queue = empty
val q1 = insert (1, q0)
val q2 = insert (2, q1)
val (h1, q3) = remove q2 (* h1 = 1, q3 contains only 2 *)
val (h2, q4) = remove q3 (* h2 = 2, q4 is empty *)
By contrast the following operations do not form a single thread, but rather a branching development of the queue's lifetime:
val q0 : int queue = empty
val q1 = insert (1, q0)
val q2 = insert (2, q0) (* NB: q0, not q1! *)
val (h1, q3) = remove q1 (* h1 = 1, q3 = q0 *)
val (h2, q4) = remove q3 (* raises Empty *)
val (h2, q4) = remove q2 (* h2 = 2, q4 is empty *)
In the remainder of this chapter we will be concerned with single-threaded sequences of queue operations.
How might we implement the signature QUEUE? The most obvious approach is to represent the queue as a list with, say, the head element of the list representing the "back" (most recently enqueued element) of the queue. With this representation enqueueing is a constant-time operation, but dequeuing requires time proportional to the number of elements in the queue. Thus in the worst case a sequence of n enqueue and dequeue operations will take time O(n²), which is clearly excessive. We can make dequeue simpler, at the expense of enqueue, by regarding the head of the list as the "front" of the queue, but the time bound for n operations remains the same in the worst case.
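To fix ideas, here is a sketch of the naive representation (the structure name NaiveQueue is ours, chosen for illustration). Insertion is constant-time, but removal must traverse the entire list to reach the front element:

```sml
(* Naive implementation: the head of the list is the back of the queue. *)
structure NaiveQueue :> QUEUE = struct
  type 'a queue = 'a list
  exception Empty
  val empty = nil
  fun insert (x, q) = x :: q            (* constant time *)
  fun remove nil = raise Empty
    | remove [x] = (x, nil)             (* front element found *)
    | remove (x :: xs) =                (* O(n) traversal to the front *)
        let val (y, q) = remove xs in (y, x :: q) end
end
```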
Can we do better? A well-known "trick" achieves an O(n) worst-case performance for any sequence of n operations, which means that each operation takes O(1) steps if we amortize the cost over the entire sequence. Notice that this is a worst-case bound for the sequence, yielding an amortized bound for each operation of the sequence. This means that some operations may be relatively expensive, but, in compensation, many operations will be cheap.
How is this achieved? By combining the two naive solutions sketched above. The idea is to represent the queue by two lists, one for the back "half" consisting of recently inserted elements in the order of arrival, and one for the front "half" consisting of soon-to-be-removed elements in reverse order of arrival (i.e., in order of removal). We put "half" in quotes because we will not, in general, maintain an even split of elements between the front and the back lists. Rather, we will arrange things so that the following representation invariant holds true:
1. The elements of the queue listed in order of removal are the elements of the front followed by the elements of the back in reverse order.
2. The front is empty only if the back is empty.
This invariant is maintained by using a "smart constructor" that creates a queue from two lists representing the back and front parts of the queue. This constructor ensures that the representation invariant holds by ensuring that condition (2) is always true of the resulting queue. The constructor proceeds by a case analysis on the back and front parts of the queue. If the front list is non-empty, or both the front and back are empty, the resulting queue consists of the back and front parts as given. If the front is empty and the back is non-empty, the queue constructor yields the queue consisting of an empty back part and a front part equal to the reversal of the given back part. Observe that this is sufficient to ensure that the representation invariant holds of the resulting queue in all cases. Observe also that the smart constructor either runs in constant time, or in time proportional to the length of the back part, according to whether the front part is empty or not.
Insertion of an element into a queue is achieved by cons'ing the element onto the back of the queue, then calling the queue constructor to ensure that the result is in conformance with the representation invariant. Thus an insert can either take constant time, or time proportional to the size of the back of the queue, depending on whether the front part is empty. Removal of an element from a queue requires a case analysis. If the front is empty, then by condition (2) the queue is empty, so we raise an exception. If the front is non-empty, we simply return the head element together with the queue created from the original back part and the front part with the head element removed. Here again the time required is either constant or proportional to the size of the back of the queue, according to whether the front part becomes empty after the removal. Notice that if an insertion or removal requires a reversal of k elements, then the next k operations are constant-time. This is the fundamental insight as to why we achieve O(n) time complexity over any sequence of n operations. (We will give a more rigorous analysis shortly.)
Here's the implementation of this idea in ML:
structure Queue :> QUEUE = struct
  type 'a queue = 'a list * 'a list
  exception Empty
  fun make_queue (q as (nil, nil)) = q
    | make_queue (bs, nil) = (nil, rev bs)
    | make_queue (q as (bs, fs)) = q
  val empty = (nil, nil)
  fun insert (x, (back, front)) = make_queue (x::back, front)
  fun remove (_, nil) = raise Empty
    | remove (bs, f::fs) = (f, make_queue (bs, fs))
end
Notice that we call the "smart constructor" make_queue whenever we wish to return a queue, to ensure that the representation invariant holds.
Consequently, some queue operations are more expensive than others, according to whether or not the queue needs to be reorganized to satisfy the representation invariant. However, each such reorganization makes a corresponding number of subsequent queue operations "cheap" (constant-time), so the overall effort required evens out in the end to constant-time per operation. More precisely, the running time of a sequence of n queue operations is now O(n), rather than O(n²), as it was in the naive implementation. Consequently, each operation takes O(1) (constant) time "on average", i.e., when the total effort is evenly apportioned among each of the operations in the sequence. Note that this is a worst-case time bound for each operation, amortized over the entire sequence, not an average-case time bound based on assumptions about the distribution of the operations.
How can we prove this claim? First we give an informal argument, then we tighten it up with a more rigorous analysis. We are to account for the total work performed by a sequence of n operations by showing that any sequence of n operations can be executed in cn steps for some constant c. Dividing by n, we obtain the result that each operation takes c steps when amortized over the entire sequence. The key is to observe first that the work required to execute a sequence of queue operations may be apportioned to the elements themselves, then that only a constant amount of work is expended on each element. The "life" of a queue element may be divided into three stages: its arrival in the queue, its transit time in the queue, and its departure from the queue. In the worst case each element passes through each of these stages (but may "die young", never participating in the second or third stage). Arrival requires constant time to add the element to the back of the queue. Transit consists of being moved from the back to the front by a reversal, which takes constant time per element on the back. Departure takes constant time to pattern match and extract the element. Thus at worst we require three steps per element to account for the entire effort expended to perform a sequence of queue operations. This is in fact a conservative upper bound in the sense that we may need fewer than 3n steps for the sequence, but asymptotically the bound is optimal --- we cannot do better than constant time per operation! (You might reasonably wonder whether there is a worst-case, non-amortized constant-time implementation of persistent queues. The answer is "yes", but the code is far more complicated than the simple implementation we are sketching here.)
This argument can be made rigorous as follows. The general idea is to introduce the notion of a charge scheme that provides an upper bound on the actual cost of executing a sequence of operations. An upper bound on the charge will then provide an upper bound on the actual cost. Let T(n) be the cumulative time required (in the worst case) to execute a sequence of n queue operations. We will introduce a charge function, C(n), representing the cumulative charge for executing a sequence of n operations and show that T(n)<=C(n)=O(n). It is convenient to express this in terms of a function R(n) = C(n)-T(n) representing the cumulative residual, or overcharge, which is the amount that the charge for n operations exceeds the actual cost of executing them. We will arrange things so that R(n)>=0 and that C(n)=O(n), from which the result follows immediately.
Down to specifics. By charging 2 for each insert operation and 1 for each remove, it follows that C(n)<=2n for any sequence of n inserts and removes. Thus C(n)=O(n). After any sequence of n>=0 operations has been performed, the queue contains 0<=b<=n elements on the back "half" and 0<=f<=n elements on the front "half". We claim that for every n>=0, R(n)=b. We prove this by induction on n>=0. The condition clearly holds after performing 0 operations, since T(0)=0, C(0)=0, and hence R(0)=C(0)-T(0)=0. Consider the n+1st operation. If it is an insert, and f>0, then T(n+1)=T(n)+1, C(n+1)=C(n)+2, and hence R(n+1)=R(n)+1=b+1. This is correct because an insert operation adds one element to the back of the queue. If, on the other hand, f=0, then T(n+1)=T(n)+b+2 (charging one for the cons and one for creating the new pair of lists), C(n+1)=C(n)+2, so R(n+1)=R(n)+2-b-2=b+2-b-2=0. This is correct because the back is now empty; we have used the residual overcharge to pay for the cost of the reversal. If the n+1st operation is a remove, and f>0, then T(n+1)=T(n)+1 and C(n+1)=C(n)+1 and hence R(n+1)=R(n)=b. This is correct because the remove doesn't disturb the back in this case. Finally, if we are performing a remove with f=0, then T(n+1)=T(n)+b+1, C(n+1)=C(n)+1, and hence R(n+1)=R(n)-b=b-b=0. Here again we use the residual overcharge to pay for the reversal of the back to the front. The result follows immediately since R(n)=b>=0, and hence C(n)>=T(n).
It is instructive to examine where this solution breaks down in the multi-threaded case (i.e., where persistence is fully exploited). Suppose that we perform a sequence of n insert operations on the empty queue, resulting in a queue with n elements on the back and none on the front. Call this queue q. Let us suppose that we have n independent "futures" for q, each of which removes an element from it, for a total of 2n operations. How much time do these 2n operations take? Since each independent future must reverse all n elements onto the front of the queue before performing the removal, the entire collection of 2n operations takes n + n² steps, or O(n) steps per operation, breaking the amortized constant-time bound just derived for a single-threaded sequence of queue operations. Can we recover a constant-time amortized cost in the persistent case? We can, provided that we share the cost of the reversal among all futures of q --- as soon as one performs the reversal, they all enjoy the benefit of its having been done. This may be achieved by using a benign side effect to cache the result of the reversal in a reference cell that is shared among all uses of the queue. We will return to this once we introduce memoization and lazy evaluation.
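Although the details must await the treatment of memoization, the flavor of the benign effect can be sketched now. The sketch below is ours, not the final development: the two halves of the queue live in reference cells shared among all futures of a given queue value, so the first future to pay for a reversal makes its result available to every other.

```sml
(* Illustrative sketch of a benign effect (shadows the earlier queue type):
   the back and front lists are stored in reference cells shared by all
   futures of a queue value, so a reversal performed by one future is
   seen by every other future of that same value. *)
type 'a queue = 'a list ref * 'a list ref

fun remove (q as (b, f)) =
  (case !f of
      x :: xs => (x, (ref (!b), ref xs))
    | nil =>
        (case !b of
            nil => raise Empty
          | _ => (f := rev (!b); b := nil; remove q)))
```

The assignments to b and f are benign: they change the representation, but every future observes the same abstract queue before and after the reversal.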