Next: 4.4 Routing many paths Up: 4 Establishing many paths Previous: 4.2 A level-by-level algorithm

## 4.3 A faster algorithm

To construct the paths in bit steps we modify the first algorithm as follows. Given a set of at most paths that need to be extended at an M-input splitter, the algorithm does not wait time for every path to be extended before it begins the extension at the next level. Instead, it waits only steps, in which time the number of unextended paths falls to a fraction of its original value. We will choose to be less than . Now the path extension process can start at the next level. The only danger here is that the fraction of paths left behind may find themselves blocked by the time they reach the next level, and so we need to ensure that this won't happen. Therefore, stalled paths send out placeholders to all of their neighbors at the next level, and henceforth the neighbors with placeholders participate in the path extension process at the next level, as if they were paths. Thus, a placeholder not only reserves a spot that may be used by a path at a future time, but also helps to chart out the path by continuing to extend ahead. Since a placeholder doesn't know which path will ultimately use it, a node holding a placeholder must extend paths into both the upper and lower output portions of its splitter. A placeholder that first extends a path into the upper output portion of its splitter continues to attempt to extend a path into the lower portion, and vice versa. We will call a path from the inputs of the network to the inputs of any splitter in the network a real path if it contains no placeholders. The goal of the algorithm, of course, is to extend real paths all the way through the network. Any path that contains at least one placeholder is called a placeholder path.

Since each stalled path generates up to 2d placeholders at the next level, and these placeholders might later become stalled themselves, there is a risk that the network will become clogged with placeholders. In particular, if the fraction of inputs in a splitter that are trying to extend rises above , the path extension algorithm ceases to work. Thus, in order to prevent placeholders from clogging the system, whenever a stalled path, either real or a placeholder, gets extended into either the upper or lower output portion of a splitter, it sends a cancellation signal to each of the nodes in that portion of the splitter that are holding placeholders for it. When a placeholder is replaced by a real path, one of the two directions (up or down) into which the placeholder has been attempting to extend becomes unnecessary. If the placeholder has already extended its path in that direction, a single cancellation is sent along the edge that the path uses. Otherwise, a cancellation is sent to each of the d placeholding neighbors in that direction. When a placeholding node gets cancellations from all of the nodes that had requested it to hold their places, it ceases its attempts to extend. It also sends cancellations to any nodes ahead of it that may be holding a place for it. Note that a placeholding node that has received cancellations from all but one of the nodes that had requested it to hold their places continues to try to extend into both the upper and lower output portions of the splitter. As we shall see, this scheme of cancellations prevents placeholders from getting too numerous.

The -step algorithm for routing paths proceeds in phases. Each path is restricted to extend forward by at most one level during each phase. We refer to the first wave of paths and placeholders to arrive at a level as the wavefront. The wavefront moves forward by one level during each phase. A phase consists of the following three parts:

Note that for constant T and C, each phase can be performed in bit steps. We will assume that so that cancellation signals have a chance to catch up with the wavefront, and that .

The key to our analysis of the algorithm is to focus on the number of stalled paths (corresponding to real paths or placeholders) at the inputs of each splitter. In phase t of the algorithm, where the first phase is phase 0, the wavefront advances from level t to level t+1. Let denote the maximum fraction of inputs containing wavefront paths (real and placeholder) in a level i splitter that wish to extend to the upper (or similarly, to the lower) outputs at the end of phase i-1, i.e., when the wavefront arrives at level i, and let denote the maximum fraction of inputs that contain stalled paths that wish to extend to the upper (or similarly, to the lower) outputs of any splitter at level i at the end of phase t. Note that for t < i, since there are no paths to extend at level i before phase i. Also, note that .

The following lemmas will be useful in proving that every path is extended to completion in phases provided that and .

The following lemma bounds the size of the wavefront in terms of the number of stalled paths behind it.

The next lemma, Lemma 4.5, presents a weaker bound on . The difference between this lemma and the previous lemma is that in Lemma 4.5 we assume that a cancellation signal must reach level i rather than i-1 before the start of the path extension part of phase i-1 in order for it to have an effect on the size of the wave propagating from level i-1 to level i. The reason for this assumption is that we will later speed up the algorithm by overlapping the cancellation passing and path extension parts of each phase.

The following lemma shows that for the right choices of L, , d, and C, no splitter ever receives too many paths (real or placeholders) that want to extend to the upper outputs (and similarly, to the lower outputs).

From Lemma 4.6, it is clear that no splitter ever has more than an fraction of its inputs containing paths to be extended to the upper (or lower) outputs. Therefore the path-extension algorithm is never swamped by placeholders and always works as planned at each level, cutting down the number of stalled paths by a factor of during each phase. Hence, phases after the wavefront arrives at a splitter of size M, all paths are extended. Since the wavefront arrives at level i during phase i-1, the algorithm establishes all real paths to level (recall that the last levels have been removed) by phase

phases, since a path that is last stalled at level i extends to level i+1 by phase , and if the wavefront reaches level before its cancellation signals do, then these signals arrive phases later. Otherwise, if the cancellation signals catch up to the wavefront (but the path is never again stalled), then the path extends to level by phase For and , this expression takes on a maximum value of . At first, this result seems too good to be true, but stalled real paths catch up to the wavefront very quickly once they get through, and they get through at a very high rate. Hence, all real paths get through to the final level along with the wavefront!

Since the number of phases required is basically , the overall time for the algorithm depends mainly on the parameters C and T. By propagating the cancellations at the same time that paths are extended, a single phase can be implemented in steps. As long as , the algorithm will work for . Since , and Lemma 4.2 gives us , and we need , T must be at least 2. In general, in order to make T small, we need to be large. In order to achieve large , we need to be close to d, which requires to be small (and consequently L to be large) and d to be large. By using good splitters ( ), small, d large, C=5, and T=2, and replacing each edge with a small constant number of edges, we can obtain a -step algorithm for routing all the paths. Unfortunately, d and L need to be quite large to achieve this bound. For more reasonable values of d (less than 10) and L (less than 150), we can achieve provable routing times of about . Fortunately, the algorithms appear to run faster in simulations [19].

It is worth noting that each node only needs to keep track of a few bits of information to make its decisions. This is because only the ith bit of the destination is needed to make a switching decision at level i, and therefore a node at that level looks at this bit, strips it off, and passes the rest of the destination address onward. The path as a whole snakes forward through the network. If it ever gets blocked, the entire snake halts behind it. The implementation details for this scheme are straightforward. Previously, only the AKS sorting circuit was known to achieve this performance for bounded-degree networks, but at a much greater cost in complexity and constant factors. Recently, Leighton and Plaxton have also developed a randomized algorithm for sorting on the butterfly in bit steps [20].

Next: 4.4 Routing many paths Up: 4 Establishing many paths Previous: 4.2 A level-by-level algorithm

Bruce Maggs
Mon Jul 22 21:19:59 EDT 1996