The basic outline of the algorithm is the same as that of Flashsort.
The first step is to randomly select a small set of *splitters*
from among the packets that are to be sorted. Next the splitters are
sorted deterministically. The splitters partition the packets into
*intervals*. The *i*th interval consists of those packets whose
keys are larger than the key of the st largest splitter, and
smaller than the key of the *i*th largest splitter. (We assume
without loss of generality that all of the keys are distinct.) Using
the splitters as guides, each interval of packets is routed to a
different subbutterfly, where it is sorted recursively.

We begin by describing a recursive algorithm for sorting
packets in time on an -node butterfly, where
is some fixed constant greater than one. The butterfly is
``lightly loaded'' by this factor of to ensure
that, with high probability, at the lower levels of the recursion the
number of packets to be sorted by each subbutterfly does not exceed
the number of inputs to that subbutterfly. When the algorithm is invoked,
each packet must reside at a distinct input. As we shall see, this
algorithm can be combined with Leighton's *Columnsort* algorithm
[16] to sort all packets in time.

The steps taken by a subbutterfly with *M* inputs are presented in
some detail in Figure 6.

**Figure 6:** The steps
performed by an *M*-input subbutterfly in the recursive algorithm for sorting
packets in time on an -node
butterfly using constant-size queues.

The first step in the algorithm is to count the number of packets entering the subbutterfly. Since the packets reside in distinct inputs, the total number of packets can be computed via a parallel prefix computation. The prefix computation can be performed in time deterministically.

Next each packet independently chooses to be a splitter candidate with probability . As we shall see, with high probability the number of candidates is between and . This step requires only constant time.

The candidates are then sorted in time using a simple deterministic algorithm based on counting due to Nassimi and Sahni [25].

After the candidates are sorted, every th one in the sorted
order is chosen to be a splitter. This *oversampling*
technique, due to Reif, ensures that each of the intervals contains
approximately the same number of splitters, with high probability.
Note that we oversample by a factor of , where *N* is the
number of inputs in the entire network, independent of the number of
inputs, *M*, of the subbutterfly on which the algorithm is invoked.
Since with high probability there are at least
splitters, the subbutterflies at the next level of recursion should
have at most inputs.

Next the splitters are distributed throughout the *M*-input
subbutterfly so that they can direct each interval of packets to the
appropriate smaller subbutterfly. We distribute a copy of the median
splitter to each node in level *0* of the *M*-input subbutterfly. Then we
divide the splitters into upper and lower halves. We distribute a
copy of the median splitter from the upper half to each node in the
upper half of level *1*. Similarly, we distribute a copy of the
median splitter from the lower half to each node in the lower half of
level *1*. The process continues in this fashion until all of the
splitters are used up. At this point, every node in the first
levels of the butterfly has a copy of
a splitter. This step can be performed deterministically in time.

After the splitters are positioned, each packet is routed to a random
row of the *M*-input subbutterfly. The packets are scheduled using
the algorithm for routing on leveled networks.

Each interval of packets is then routed to a different smaller
subbutterfly. This step is called *splitter-directed routing*
[32]. The paths of the packets are determined as
follows. At level *0*, each packet compares itself to the median
splitter. If it is larger, it moves to the upper half of the second
level, otherwise it moves to the lower half. The process is repeated
at the level *1*, with each packet being directed to the appropriate
quarter of level *2*, and so on. The packets are scheduled using the
algorithm for routing on leveled networks. When all of the packets have
been routed along in the butterfly as deeply as the splitters are
assigned, each subbutterfly at that level picks new splitters and
proceeds recursively.

The last step before the recursive call is to position the packets in
each smaller subbutterfly in distinct inputs. On an *M*-input
butterfly where at most *c* packets enter each input, *M* packets can
be distributed to distinct inputs deterministically in
time.

The recursion continues until either the number of inputs, *M*, is
smaller than , or the number of packets, *n*, is
smaller than . In the first case, the sort is completed
using Batcher's odd-even merge sort. An *M*-input butterfly can sort
*M* packets in time using odd-even merge sort. For
, the time is . In the second case,
the packets can be sorted deterministically in time using
Nassimi and Sahni's sorting algorithm, as in step four.

We can now make a rough estimate of the running time of this algorithm. Steps 1 and 2 are performed deterministically in time. Assuming that there are candidates, Steps 3, 4, and 5 also require time. As we shall see, the expected time for Steps 6, 7 and 8 is . Although these steps sometimes take longer than expected, let us assume for now that they do not. In this case, the running time is given by the recurrence

which has solution .

Mon Jul 22 22:57:44 EDT 1996