To construct the paths in bit steps we modify the first
algorithm as follows. Given a set of at most paths that
need to be extended at an *M*-input splitter, the algorithm does not
wait time for every path to be extended before it
begins the extension at the next level. Instead, it waits only
steps, in which time the number of unextended paths falls to a
fraction of its original value. We will choose to be
less than . Now the path extension process can start at the next
level. The only danger here is that the fraction of paths left
behind may find themselves blocked by the time they reach the next
level, and so we need to ensure that this won't happen. Therefore,
stalled paths send out *placeholders* to all of their neighbors at
the next level, and henceforth the neighbors with placeholders
participate in the path extension process at the next level, as if
they were paths. Thus, a placeholder not only reserves a spot that
may be used by a path at a future time, but also helps to chart out
the path by continuing to extend ahead. Since a placeholder doesn't
know which path will ultimately use it, a node holding a placeholder
must extend paths into both the upper and lower output portions of its
splitter. A placeholder that first extends a path into the upper
output portion of its splitter continues to attempt to extend a path
into the lower portion, and vice versa. We will call a path from the
inputs of the network to the inputs of any splitter in the network a
*real path* if it contains no placeholders. The goal of the
algorithm, of course, is to extend real paths all the way through the
network. Any path that contains at least one placeholder is called a
*placeholder path*.

Since each stalled path generates up to *2d* placeholders at the next
level, and these placeholders might later become stalled themselves,
there is a risk that the network will become clogged with
placeholders. In particular, if the fraction of inputs in a splitter
that are trying to extend rises above , the path extension
algorithm ceases to work. Thus, in order to prevent placeholders from
clogging the system, whenever a stalled path, either real or a
placeholder, gets extended into either the upper or lower output
portion of a splitter, it sends a *cancellation signal* to each
of the nodes in that portion of the splitter that are holding
placeholders for it. When a placeholder is replaced by a real path,
one of the two directions (up or down) into which the placeholder has
been attempting to extend becomes unnecessary. If the
placeholder has already extended its path in that direction, a single
cancellation is sent along the edge that the path uses. Otherwise, a
cancellation is sent to each of the *d* placeholding neighbors in that
direction. When a placeholding node gets cancellations from all of the
nodes that had requested it to hold their places, it ceases its
attempts to extend. It also sends cancellations to any nodes ahead of
it that may be holding a place for it. Note that a placeholding node
that has received cancellations from all but one of the nodes that had
requested it to hold their places continues to try to extend into both
the upper and lower output portions of the splitter. As we shall see,
this scheme of cancellations prevents placeholders from getting too
numerous.

The -step algorithm for routing paths proceeds in *
phases*. Each path is restricted to extend forward by at most one
level during each phase. We refer to the first wave of paths and
placeholders to arrive at a level as the *wavefront*. The
wavefront moves forward by one level during each phase. A phase
consists of the following three parts:

Note that for constant *T* and *C*, each phase can be performed in
bit steps. We will assume that so that cancellation
signals have a chance to catch up with the wavefront, and that .

The key to our analysis of the algorithm is to focus on the number of
stalled paths (corresponding to real paths *or* placeholders) at
the inputs of each splitter. In phase *t* of the algorithm, where the
first phase is phase *0*, the wavefront advances from level *t* to
level *t+1*. Let denote the maximum fraction of inputs
containing wavefront paths (real and placeholder) in a level *i*
splitter that wish to extend to the upper (or similarly, to the lower)
outputs at the end of phase *i-1*, i.e., when the wavefront arrives at
level *i*, and let denote the maximum fraction of inputs that
contain stalled paths that wish to extend to the upper (or similarly,
to the lower) outputs of any splitter at level *i* at the end of phase
*t*. Note that for *t < i*, since there are no paths to
extend at level *i* before phase *i*. Also, note that .

The following lemmas will be useful in proving that every path is extended to completion in phases provided that and .

The following lemma bounds the size of the wavefront in terms of the number of stalled paths behind it.

The next lemma, Lemma 4.5, presents a weaker bound on
. The difference between this lemma and the previous lemma is
that in Lemma 4.5 we assume that a cancellation signal
must reach level *i* rather than *i-1* before the start of the path
extension part of phase *i-1* in order for it to have an effect on the
size of the wave propagating from level *i-1* to level *i*. The
reason for this assumption is that we will later speed up the
algorithm by overlapping the cancellation passing and path extension
parts of each phase.

The following lemma shows that for the right choices of *L*, ,
*d*, and *C*, no splitter ever receives too many paths (real or
placeholders) that want to extend to the upper outputs (and similarly,
to the lower outputs).

From Lemma 4.6, it is clear that no splitter ever has
more than an fraction of its inputs containing paths to be
extended to the upper (or lower) outputs. Therefore the path-extension
algorithm is never swamped by placeholders and always works as planned
at each level, cutting down the number of stalled paths by a factor of
during each phase. Hence,
phases after the wavefront arrives at a splitter of size *M*, all
paths are extended. Since the wavefront arrives at level *i* during
phase *i-1*, the algorithm establishes all real paths to level (recall that the last levels have been removed) by
phase

phases, since a path that is last stalled at level *i* extends to
level *i+1* by phase , and if the
wavefront reaches level before its cancellation signals
do, then these signals arrive phases later.
Otherwise, if the cancellation signals catch up to the wavefront (but
the path is never again stalled), then the path extends to level
by phase For and ,
this expression takes on a maximum value of . At first, this result seems too good to be true,
but stalled real paths catch up to the wavefront very quickly once
they get through, and they get through at a very high rate. Hence,
all real paths get through to the final level along with the
wavefront!

Since the number of phases required is basically , the
overall time for the algorithm depends mainly on the parameters *C*
and *T*. By propagating the cancellations at the same time that paths
are extended, a single phase can be implemented in
steps. As long as , the algorithm will work for . Since , and Lemma 4.2 gives
us , and we need ,
*T* must be at least *2*. In general, in order to make *T* small, we
need to be large. In order to achieve large , we
need to be close to *d*, which requires to be small
(and consequently *L* to be large) and *d* to be large. By using good
splitters ( ), small, *d* large, *C=5*, and
*T=2*, and replacing each edge with a small constant number of edges,
we can obtain a -step algorithm for routing all
the paths. Unfortunately, *d* and *L* need to be quite large to
achieve this bound. For more reasonable values of *d* (less than
*10*) and *L* (less than *150*), we can achieve provable routing times
of about . Fortunately, the algorithms appear to run faster
in simulations [19].

It is worth noting that each node only needs to keep track of a few
bits of information to make its decisions. This is because only the
*i*th bit of the destination is needed to make a switching decision at
level *i*, and therefore a node at that level looks at this bit,
strips it off, and passes the rest of the destination address onward.
The path as a whole snakes forward through the network. If it ever
gets blocked, the entire snake halts behind it. The implementation
details for this scheme are straightforward. Previously, only the AKS
sorting circuit was known to achieve this performance for
bounded-degree networks, but at a much greater cost in complexity and
constant factors. Recently, Leighton and Plaxton have also developed
a randomized algorithm for sorting on the butterfly in
bit steps [20].

Mon Jul 22 21:19:59 EDT 1996