Routing is more complicated in the weak switch model. A problem arises when several packets arrive on the same input bundle at the same step. Since a switch can examine only one edge in the bundle, it can forward only one of them, while the others must wait. The routing algorithm must guarantee that the total delay experienced by each packet is at most bit steps.

Our routing algorithm is a variant of Ranade's algorithm for routing on the butterfly [18, 19, 23]. In Ranade's algorithm, each packet is initially assigned an -bit random rank, which it carries in front of its destination header. The most important invariant maintained by the algorithm is that the stream of packets leaving each switch is arranged in order of non-decreasing rank. In order to keep the stream sorted, a switch never sends a packet until it is sure that no other packets with smaller rank will arrive later.

The problem with implementing Ranade's algorithm in the bit model is
that it takes too long to compare the ranks of two packets at the same
switch. In our algorithm, each packet is also assigned a random rank
initially, and that rank does not change throughout the course of the
routing. However, a packet does not carry its rank along with it.
Instead, we introduce a new type of packet called a *token*. A
token carries no information other than its type, and as a consequence
consists of only bits. The algorithm uses these tokens to
represent the ranks of the other packets. It maintains the invariant
that a packet of rank *r* passes through a bundle only after exactly
*r* tokens have passed through the bundle. This invariant guarantees
that the packets pass through the bundle in order sorted by rank. A
token can be considered to have a rank as well; a token of rank *r*
passes through a bundle only after exactly *r* tokens have passed
through the bundle. Note that the tokens are not the same as the
ghost packets used in Ranade's algorithm, which carry a rank. Our
algorithm does not use ghost packets.
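The rank-counting invariant can be illustrated with a small checker, a sketch that is ours and not part of the algorithm itself: it scans a bundle's stream, where each item is either a token or a packet represented here simply by its rank, and verifies that each packet of rank *r* is preceded by exactly *r* tokens.

```python
TOKEN = "token"

def obeys_invariant(stream):
    """Check that every packet of rank r in the stream is preceded by
    exactly r tokens.  Items are either TOKEN or an integer rank
    standing in for a packet of that rank."""
    tokens_passed = 0
    for item in stream:
        if item == TOKEN:
            tokens_passed += 1
        elif item != tokens_passed:  # a packet whose rank mismatches
            return False
    return True

# a valid stream: ranks 0, 1, 1, 3 with matching token counts
ok = obeys_invariant([0, TOKEN, 1, 1, TOKEN, TOKEN, 3])
# an invalid stream: a rank-1 packet appears before any token
bad = obeys_invariant([1, TOKEN, 0])
```

Because the token count never decreases along the stream, any stream passing this check is automatically sorted in nondecreasing order of rank.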

The mechanics of the algorithm are fairly simple. A switch maintains
a pointer to an edge in each of its two input and two output bundles.
We call these edges the *current* input and output edges,
respectively. The behavior of the switch depends on whether the current
input queues contain packets or tokens. There are four cases. If
both of the current input edges have tokens in their queues, then the
switch sends tokens on both of the current output edges and advances
all four pointers. If one of the current input edges has a token in
its queue and the other a packet, then the switch strips off the first
bit of the packet's header, shunts the packet to the current edge of
the output bundle specified by that bit, and advances the pointers for
the input and output bundles used by the packet. If both of the
current input edges have packets in their queues, then the switch
forwards one of them, and advances the pointers for the bundles used
by that one. If either of the current input edges has an empty queue,
then the switch does nothing.
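The four cases can be sketched in Python. This is a simplification of ours, not the paper's: each current input edge is modeled as a queue, "advancing a pointer" as consuming or emitting an item, and a packet header as a bit string; when both inputs hold packets, the choice of which to forward is left arbitrary here.

```python
from collections import deque

TOKEN = "token"  # a token carries no information other than its type

def switch_step(in0, in1, out0, out1):
    """One step of a weak switch.  in0/in1 hold the items waiting on the
    two current input edges; out0/out1 collect the two output bundles.
    A packet is a dict whose 'header' is a string of destination bits."""
    if not in0 or not in1:
        return  # a current input queue is empty: do nothing
    if in0[0] == TOKEN and in1[0] == TOKEN:
        # both inputs hold tokens: send a token on both outputs
        in0.popleft(); in1.popleft()
        out0.append(TOKEN); out1.append(TOKEN)
    elif in0[0] == TOKEN or in1[0] == TOKEN:
        # one token, one packet: route the packet by its first header
        # bit; only the packet's input and output pointers advance
        src = in1 if in0[0] == TOKEN else in0
        pkt = src.popleft()
        bit, pkt["header"] = pkt["header"][0], pkt["header"][1:]
        (out0 if bit == "0" else out1).append(pkt)
    else:
        # both inputs hold packets: forward one of them (here, in0's)
        pkt = in0.popleft()
        bit, pkt["header"] = pkt["header"][0], pkt["header"][1:]
        (out0 if bit == "0" else out1).append(pkt)

# demo: a token and a packet meet at a switch
in0, in1 = deque([TOKEN]), deque([{"header": "10"}])
out0, out1 = [], []
switch_step(in0, in1, out0, out1)
```

In the demo the packet's leading bit 1 sends it to the second output bundle with that bit stripped, while the token stays queued on the other input.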

Initially, each packet *p* is assigned a random rank in the
range [0, *w*-1]; the value of *w* will be specified
later. The input switches on level *0* send out a stream of packets
interspersed with tokens so that *r* tokens precede each packet with
rank *r*. A total of *w* tokens are sent on each bundle. The tokens
with rank *w-1* follow all of the other packets and tokens in the
network. A routing phase terminates when every output has received a
rank *w-1* token on both of its input bundles. Rank *w-1* tokens can be
specially marked so that the outputs do not have to count the number
of tokens that they have received.
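The stream sent by an input switch can be sketched as follows; this is an illustration under our own simplifying assumption that the packets assigned to one bundle are given as (rank, packet) pairs.

```python
TOKEN = "token"

def input_stream(packets, w):
    """Emit the packets of one bundle in rank order, interleaved with
    tokens so that a packet of rank r is preceded by exactly r tokens;
    exactly w tokens are sent in total."""
    stream, sent = [], 0
    for rank, pkt in sorted(packets):
        while sent < rank:        # bring the token count up to r
            stream.append(TOKEN)
            sent += 1
        stream.append(pkt)
    while sent < w:               # trailing tokens, up to w in all
        stream.append(TOKEN)
        sent += 1
    return stream

# demo: two packets of ranks 0 and 2, with w = 4
demo = input_stream([(2, "p"), (0, "q")], w=4)
```

Since all ranks are below *w*, the last of the *w* tokens follows every packet on the bundle, matching the role of the specially marked final token.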

During the course of the algorithm, each edge is used by either one
packet or one token, or not at all. Because *w* tokens pass through
each bundle, we further dilate the butterfly by adding *w*
edges to each bundle. By using the same edges to send both packets
and tokens, it is possible to remove these extra *w* edges. However,
the algorithm and its analysis become slightly more complicated
because several tokens may be sent across the same edge, and each one
must wait to be sent until a space frees up in the queue at the end of
the edge.

The proof that the heads of the packets quickly reach their
destinations uses a delay sequence argument like that in
[18, 19, 23]. A *k-delay
sequence* consists of four components: a path backwards from an
output to an input, a sequence of (not necessarily distinct) switches
appearing on the path in that order, a sequence
of distinct packets, and a sequence of non-increasing ranks.

The following lemma is the crux of the delay sequence argument. It shows that if some packet is delayed, then a delay sequence must have occurred. For simplicity we analyze the first routing phase only.

**Proof:**
We will construct a delay sequence by spending *lag*
points. The lag of a packet on level *l* at time *t* is *t - l*. A
packet that arrives at its destination late has a correspondingly
large lag, and we use these lag points to build the sequence. For each
point spent, we will either insert a packet into the sequence, or
decrease the rank of the next packet in the sequence. Since there can
be at most *w* decreases in the rank, the sequence must contain at
least *k - w* packets.

The construction of the delay sequence begins with the last packet to arrive at its destination. Suppose this packet arrives at its destination at time *t*. First, we insert the packet, its destination switch, and its rank into the delay sequence. Then we follow the packet back in time until it was last delayed. The lag remains the same at each step on the path, until the step at which the packet failed to move forward, where the lag drops by one.

If the packet we are following was delayed at a switch because another packet *q* was sent instead, then *q*'s rank is at most that of the followed packet, and we insert *q*, the switch, and *q*'s rank into the sequence, and follow *q* back in time. In this case, we have inserted one packet into the sequence at a cost of one lag point. In the butterfly network, once the paths of two packets diverge, they do not meet again. Thus, if *q* delays another packet, then *q* remains ahead of that packet as long as their paths intersect. As a consequence, no packet is inserted into the sequence more than once.

If the packet we are following was delayed because a token was sent instead, then the token's rank is strictly smaller than the packet's. In this case we do not insert anything into the sequence, but instead follow the token back in time until it was last delayed. We have lost one point in lag, but have also decreased the rank of the next packet in the sequence by at least one point.

The third possibility is that the packet was delayed because the current
edge on the other input bundle, *c*, had nothing to send. In this
case we do not insert anything into the sequence, but go back through
*c* to a switch *s* at the previous level at the previous time step.
At the previous time step, switch *s* must not have sent anything
through bundle *c*. This can happen for one of two reasons. Either
*s* sent a packet on its other output bundle, or one of the input
bundles to *s* had nothing to send. In the former case, we
insert that packet, the switch *s*, and the packet's rank into the
sequence, and follow the packet back until it was last delayed. In the
latter case, we go back through the empty input bundle to a switch at
the previous level at the previous time step and continue. Continuing
in this fashion, we must eventually reach a switch that sent a packet.

Now that we have analyzed the ways in which a packet can be delayed,
we must consider the ways in which a token can be delayed. Suppose we
are following a token back in time until it was last delayed at
a switch *s*.

If the token was delayed because *s* sent a packet *p* instead, then we
insert *p*, *s*, and *p*'s rank into the sequence.

Otherwise, if the token was delayed because *s* sent another token
instead, then the other token's rank is strictly smaller, and we follow
that token back until it was last delayed.

The last possibility is that the token was delayed because one of the
input bundles *c* to *s* had nothing to send. In this case, we go
back through *c* just as if a packet had been delayed for the same
reason.

The preceding case analysis shows that for each point of lag spent, either a packet is inserted into the delay sequence, or the rank of the next packet in the sequence decreases. Thus, the sequence must contain at least *k - w* packets.


**Proof:**
To bound the probability that some packet is delayed for
steps, we now need only to bound the probability that some -delay
sequence occurs. To do so, we enumerate all possible delay sequences,
and sum the probabilities that each occurs. The argument is similar
to the ones in [18, 19, 23]. There are
ways to choose the path, since there are choices for
the output at which it starts, choices for the input at which it
ends, and only one path between them. Once the path is fixed, there
are at most ways of choosing
switches on the path. Let denote
the level of switch . Then there are at most
ways to choose . Finally, there are at most ways of choosing a sequence of non-increasing ranks in the range .
A delay sequence occurs only if passes through and
, for . For the delay
sequence constructed thus far, the probabilities that these two events
occur are , and
, respectively. Thus, the probability that any delay
sequence occurs is at most

Using the inequality for , it is straightforward to show that for any , there are constants and , such that for , and , , the product is at most .


The analysis for the second phase of routing is similar. If we require the first phase to be completed before the second phase begins, then each takes bit steps, and the total time is bit steps. In fact, it is possible to pipeline the two phases so that some packets begin the second phase before the first is finished. The resulting algorithm is simpler and still requires only bit steps. The analysis is a little more complicated because a packet *p* can delay another packet *q* in the first phase, while *q* can delay *p* in the second.

Once the head of a packet reaches its true destination, the *M* bits
of data following behind it trickle into the destination in bit
steps. Since, with high probability, all of the heads reach their
destinations in bit steps, the total time is also with high probability.
