In this section we present a conservative ``recursive pairing'' algorithm, Algorithm LC, that can perform many of the same functions on lists as recursive doubling. The idea is to contract an input list by repeatedly pairing and merging adjacent elements of the list until only a single element remains. The merges are recorded as internal nodes of a binary contraction tree whose leaves are the elements in the input list. After building the contraction tree, operations such as broadcasting from the root or parallel prefix can be performed in a conservative fashion. Algorithm LC is a randomized algorithm, and with high probability, the height of the contraction tree and the number of steps on a DRAM are both $O(\lg n)$, where $n$ is the number of elements in the input list. A deterministic variant based on deterministic coin tossing [5] runs in $O(\lg n \lg^* m)$ steps, where $m$ is the number of processors in the DRAM, and produces a contraction tree of height $O(\lg n)$.
The recursive pairing strategy is illustrated in Figure 3 for a list $\langle A, B, C, D, E \rangle$. In the first step, elements B and C pair and merge, as do elements D and E. The merges are shown as contours in the figure. A new contracted list $\langle A, BC, DE \rangle$ is formed from the unpaired element A and the two compound elements BC and DE. After the second step of the algorithm, the contracted list consists of the elements ABC and DE. The third and final step reduces the list to the single element ABCDE.
In Algorithm LC, the contours of Figure 3 are represented in a data structure called a contraction tree. The leaves of the contraction tree are the list elements, and the internal nodes are the contours. To maintain the contraction-tree data structure, the algorithm requires constant extra space for each element in the input list. Each processor contains two elements: an element in the input list, and a spare element that will act as an internal node in the contraction tree. We call the two elements in the same processor mates. Each element holds a pointer to an unused internal node, which for each list element initially points to its mate. The use of spare nodes allows the algorithm to distribute the space for the internal nodes of the contraction tree uniformly over the elements in the list. (Spare internal nodes are used in [1] and [17] for similar reasons, but in a different context.)
We now describe the operation of Algorithm LC, which is illustrated in Figure 4 for the example of Figure 3. (A description in pseudocode can be found in [19].) In the first step, each element of the input list randomly picks either its left or right neighbor. An element at the left or right end of the list always picks its only neighbor. If two elements pick each other, then they merge. The merge is recorded by making the spare of the left element of the pair be the root of a new contraction tree. The spare of the right element becomes the spare for the root, and the elements themselves become the children of the root. The roots of the new contraction trees and the unpaired list elements now form themselves into a new list representing the contracted list, upon which the algorithm operates recursively.
At each step of the algorithm, any given element of the contracted list is a set of consecutive elements in the input list--a contour in Figure 3. The set is represented by a contraction-tree data structure whose leaves are the elements of the set and whose internal nodes record the merges. When the entire input list has been contracted to a single node, the algorithm terminates and a single contraction tree records all of the merges.
Figure 3: The recursive pairing strategy operating on a list $\langle A, B, C, D, E \rangle$. Merged nodes are shown as contours, and the nesting of contours gives the structure of the contraction tree.
Figure 4: The operation of Algorithm LC on the example of Figure 3. The input list is $\langle A, B, C, D, E \rangle$, and the corresponding spares are in lower case. When elements B and C pair and merge in the first step of the algorithm, the spare b becomes the root of a contraction tree with leaves B and C to represent the compound node BC. The spare for b is c. At the end of the first step, the list consisting of the elements A, b, and d represents the contracted list $\langle A, BC, DE \rangle$. After two more contraction steps of Algorithm LC, the input list is contracted to a single element ABCDE, which is represented by a contraction tree whose root is c and whose leaves are the elements of the input list.
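To make the bookkeeping concrete, here is a minimal Python sketch of the recursive pairing step. It is our own illustration, not the paper's DRAM program: the names `Node` and `contract_list` are ours, and the sequential loop merely simulates what the DRAM processors would do in parallel in each iteration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    """A list element (leaf) or an internal node of the contraction tree."""
    label: str
    left: Optional["Node"] = None   # children in the contraction tree
    right: Optional["Node"] = None
    spare: Optional["Node"] = None  # pointer to an unused internal node

def contract_list(labels: List[str]) -> Node:
    """Repeatedly pair and merge adjacent elements until one remains;
    return the root of the resulting contraction tree."""
    # Each list element starts with its own spare (its mate), named in
    # lower case as in Figure 4.
    current = []
    for lab in labels:
        leaf = Node(lab)
        leaf.spare = Node(lab.lower())
        current.append(leaf)

    while len(current) > 1:
        n = len(current)
        # Each element picks a neighbour at random; end elements always
        # pick their only neighbour.
        picks = [i + random.choice((-1, 1)) if 0 < i < n - 1
                 else (1 if i == 0 else n - 2) for i in range(n)]
        contracted, i = [], 0
        while i < n:
            if i + 1 < n and picks[i] == i + 1 and picks[i + 1] == i:
                left, right = current[i], current[i + 1]
                root = left.spare                     # spare of the left element
                root.left, root.right = left, right   # becomes the root of the merge
                root.spare = right.spare              # spare of the right element
                contracted.append(root)               # becomes the root's spare
                i += 2
            else:
                contracted.append(current[i])
                i += 1
        current = contracted        # recurse on the contracted list
    return current[0]

if __name__ == "__main__":
    root = contract_list(list("ABCDE"))
    print(root.label)   # the spare that ended up as the root (a lower-case letter)
```

In each iteration every element does a constant amount of work, and the spare pointers keep the space for internal nodes distributed one per processor, mirroring the description above.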
To describe the efficiency of randomized algorithms such as Algorithm LC, we shall sometimes say that an algorithm runs in $O(f(n))$ steps ``with high probability,'' by which we shall mean that for any constant $k > 0$, there are constants $c_1$ and $c_2$ such that with probability at least $1 - c_1/n^k$, the algorithm terminates in at most $c_2 f(n)$ steps.
Lemma: With high probability, Algorithm LC terminates after $O(\lg n)$ iterations, where $n$ is the number of elements in the input list.
Proof:
We show that the algorithm terminates after $O(\lg n)$ iterations with probability at least $1 - 1/n^k$. We use an accounting scheme involving ``tokens'' to analyze the algorithm. Initially, a unique token resides between each pair of adjacent elements in the input list. Whenever two list elements pick each other, we destroy the token between them. For each token destroyed, the length of the list decreases by one, and the algorithm terminates when no token remains. In any iteration, an existing token has probability at least $1/4$ of being destroyed, since each of the two elements adjacent to it independently picks the other with probability at least $1/2$. Thus, after $m$ iterations, a token has probability at most $(3/4)^m$ of remaining in existence. Let $T_i$ be the event that token $i$ exists after $m$ iterations, and let $T$ be the event that any token remains after $m$ iterations. Then the probability that any token remains after $m$ iterations is given by
\[
\Pr\{T\} = \Pr\{T_1 \cup T_2 \cup \cdots \cup T_{n-1}\}
         \le \sum_{i=1}^{n-1} \Pr\{T_i\}
         \le (n-1)\left(\frac{3}{4}\right)^{m}.
\]
For $m = (k+1)\log_{4/3} n$ iterations, we have
\[
\Pr\{T\} \le (n-1)\left(\frac{3}{4}\right)^{(k+1)\log_{4/3} n}
         = (n-1)\,n^{-(k+1)}
         < 1/n^{k}.
\]
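As an informal sanity check of this bound (our own experiment, not part of the proof), the following sketch simulates the pairing process on lists of various lengths and reports the average number of rounds, which should grow roughly like $\log_{4/3} n$.

```python
import math
import random

def pairing_iterations(n: int) -> int:
    """Count random pairing rounds needed to contract a list of n elements
    down to a single element (only the length matters for this check)."""
    rounds = 0
    while n > 1:
        # End elements pick their only neighbour; interior elements pick
        # left or right uniformly at random.
        picks = [1] + [i + random.choice((-1, 1)) for i in range(1, n - 1)] + [n - 2]
        mutual = sum(1 for i in range(n - 1)
                     if picks[i] == i + 1 and picks[i + 1] == i)
        n -= mutual        # each mutual pick destroys one token
        rounds += 1
    return rounds

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        avg = sum(pairing_iterations(n) for _ in range(100)) / 100
        print(f"n={n:6d}  avg rounds={avg:5.1f}  log_4/3(n)={math.log(n, 4/3):5.1f}")
```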
Lemma: With high probability, the contraction tree produced by Algorithm LC has height $O(\lg n)$.
Proof: The height of the contraction tree is no greater than the number of iterations of Algorithm LC, which by the preceding lemma is $O(\lg n)$ with high probability.
We now prove that Algorithm LC is conservative.
Proof: By convention, let the mate of an element in the input list lie in the order between that element and its right neighbor. The key idea is that the order of the list elements and their spares is preserved by the merging operation, and consequently, after each contraction step, the pointers in the contracted list correspond to disjoint paths in the original list, and the pointers between elements and their spares also correspond to disjoint paths. By the Shortcut Lemma these two sets of pointers are each conservative with respect to the input list, and since each set of memory accesses in a contraction step of the algorithm is a subset of one of these two sets, the algorithm is conservative.
Although a contraction tree itself is not conservative with respect to an input list, it can be used as a data structure in conservative algorithms. For example, contraction trees can be used to efficiently broadcast a value to all of the elements of a list and to accumulate values stored in each element of a list.
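For instance, a broadcast from the root can proceed one tree level per step. The fragment below is our own illustration (reusing the hypothetical `Node` and `contract_list` from the earlier sketch); it writes a value into every node of the contraction tree, and hence into every list element at the leaves.

```python
def broadcast(root, value):
    """Pass `value` from the root of a contraction tree to all of its nodes,
    visiting one tree level per step, so each step touches pointers that are
    all the same distance from the root."""
    frontier = [root]
    while frontier:
        for node in frontier:
            node.received = value          # attribute added for illustration
        frontier = [child for node in frontier
                    for child in (node.left, node.right) if child is not None]

tree = contract_list(list("ABCDE"))
broadcast(tree, 42)      # every element of the list (every leaf) now holds 42
```

Accumulating values stored in the leaves is the mirror image: values are combined level by level from the leaves up to the root, exactly the bottom-up phase of the prefix computation described next.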
More generally, contraction trees are useful for performing prefix computations in a conservative fashion. Let $D$ be a domain with a binary associative operation $\otimes$ and an identity element. A prefix computation [3, 7, 23] on a list with elements $x_1, x_2, \ldots, x_n$ in $D$ puts the value $x_1 \otimes x_2 \otimes \cdots \otimes x_i$ in element $i$ for each $i = 1, 2, \ldots, n$.
A prefix computation on a list can be performed by a conservative, two-phase algorithm on the contraction tree. The leaves of the contraction tree from left to right are the elements in the list from $x_1$ to $x_n$. The first phase proceeds bottom up on the tree. Each leaf passes its $x$ value to its parent. When an internal node receives a value $y_\ell$ from its left child and a value $y_r$ from its right child, the node saves the value $y_\ell$ and passes $y_\ell \otimes y_r$ to its parent. When the root receives values from its children, the second, top-down phase begins. The root passes the identity to its left child and its saved value $y_\ell$ to its right child. When an internal node receives a value $z$ from its parent, it passes $z$ to its left child, and passes $z \otimes y_\ell$ to its right child, where $y_\ell$ is the value the node saved during the first phase. When a leaf receives a value $z$, it computes $z \otimes x$, which is the prefix value for that element.
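A compact way to see the two phases is the following sketch, which uses our own representation of a contraction tree and integer addition for $\otimes$ with identity 0; the names `Leaf`, `Inner`, `upsweep`, and `downsweep` are ours, not the paper's.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    x: int            # the list element's value
    prefix: int = 0   # will receive x_1 (x) ... (x) x_i

@dataclass
class Inner:
    left: "Tree"
    right: "Tree"
    saved: int = 0    # value received from the left child in phase one

Tree = Union[Leaf, Inner]

def upsweep(t: Tree) -> int:
    """Bottom-up phase: return the combined value of t's leaves, saving at
    each internal node the value received from its left child."""
    if isinstance(t, Leaf):
        return t.x
    y_left, y_right = upsweep(t.left), upsweep(t.right)
    t.saved = y_left
    return y_left + y_right        # '+' plays the role of the operation

def downsweep(t: Tree, z: int = 0) -> None:
    """Top-down phase: z is the combined value of everything to the left of
    t's leaves; the root starts with the identity 0."""
    if isinstance(t, Leaf):
        t.prefix = z + t.x
        return
    downsweep(t.left, z)
    downsweep(t.right, z + t.saved)

if __name__ == "__main__":
    # A contraction tree with the shape of Figure 3 for the list <1, 2, 3, 4, 5>.
    t = Inner(Inner(Leaf(1), Inner(Leaf(2), Leaf(3))), Inner(Leaf(4), Leaf(5)))
    upsweep(t)
    downsweep(t)       # the leaves now hold the prefix sums 1, 3, 6, 10, 15
```

On a DRAM the same computation is carried out iteratively, one tree level per step, rather than by recursion.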
The number of steps required by the prefix computation is proportional to the height of the tree, which with high probability is $O(\lg n)$.
At each step, the algorithm communicates across a set of pointers in
the contraction tree, all of which are the same distance from the
leaves in the first phase, and the same distance from the root in the
second. That this computation is performed in a conservative fashion
is a consequence of the following lemma.
Lemma: Let CT be a contraction tree for a list L, and let P be a set of pointers in CT such that no pointer in P is an ancestor of another pointer in P. Then the pointers in P are conservative with respect to L.
Proof: An inorder traversal of CT alternately visits list elements (leaves) and their mates (internal nodes) in the same order that the list elements and mates appear in L. Thus, if no pointer in P is an ancestor of another pointer in P, the pointers in P correspond to disjoint paths in L. By the Shortcut Lemma, any set of pointers that correspond to disjoint paths in the list L are conservative with respect to L.
Algorithm LC, which constructs a contraction tree in $O(\lg n)$ steps with high probability, is a randomized algorithm. By using the ``deterministic coin tossing'' technique of Cole and Vishkin [5], the algorithm can be performed nearly as well deterministically. Specifically, the randomized pairing step can be performed deterministically in $O(\lg^* m)$ steps on a DRAM with $m$ processors, where $\lg^* m$ is the number of times the logarithm function must be successively applied to reduce $m$ to a value at most 1. The overall running time for list contraction is thus $O(\lg n \lg^* m)$.
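The heart of deterministic coin tossing is a label-reduction step, sketched below in Python as our own illustration of the Cole--Vishkin technique, not the paper's DRAM program. Each element compares its label with its successor's label, locates the lowest-order bit in which they differ, and adopts that bit position together with its own bit value as its new label. Starting from distinct $O(\lg m)$-bit processor identifiers, $O(\lg^* m)$ such rounds shrink the labels to a constant-size range while keeping adjacent labels distinct, after which, for example, elements whose label is a local maximum can pair with a neighbour deterministically.

```python
def coin_toss_round(label: int, successor_label: int) -> int:
    """One round of label reduction: if i is the index of the lowest-order bit
    in which the two labels differ, the new label is 2*i + (bit i of `label`).
    Adjacent elements are guaranteed to receive different new labels."""
    diff = label ^ successor_label
    i = (diff & -diff).bit_length() - 1     # index of the lowest set bit of the XOR
    return 2 * i + ((label >> i) & 1)

# Example: distinct initial labels (say, processor IDs) along a list.
labels = [12, 5, 9, 3, 14]
reduced = [coin_toss_round(labels[i], labels[i + 1]) for i in range(len(labels) - 1)]
# (The last element has no successor and can be handled as a special case.)
assert all(reduced[i] != reduced[i + 1] for i in range(len(reduced) - 1))
```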
As a final comment, we observe that with minor modifications, Algorithm LC can be used to contract circular lists with the same complexity bounds as for linear lists.