The basic outline of the algorithm is the same as that of Flashsort.
The first step is to randomly select a small set of splitters
from among the packets that are to be sorted. Next the splitters are
sorted deterministically. The splitters partition the packets into
intervals. The ith interval consists of those packets whose
keys are larger than the key of the
st largest splitter, and
smaller than the key of the ith largest splitter. (We assume
without loss of generality that all of the keys are distinct.) Using
the splitters as guides, each interval of packets is routed to a
different subbutterfly, where it is sorted recursively.
We begin by describing a recursive algorithm for sorting
packets in
time on an
-node butterfly, where
is some fixed constant greater than one. The butterfly is
``lightly loaded'' by this factor of
to ensure
that, with high probability, at the lower levels of the recursion the
number of packets to be sorted by each subbutterfly does not exceed
the number of inputs to that subbutterfly. When the algorithm is invoked,
each packet must reside at a distinct input. As we shall see, this
algorithm can be combined with Leighton's Columnsort algorithm
[16] to sort all
packets in
time.
The steps taken by a subbutterfly with M inputs are presented in some detail in Figure 6.
Figure 6: The steps
performed by an M-input subbutterfly in the recursive algorithm for sorting
packets in
time on an
-node
butterfly using constant-size queues.
The first step in the algorithm is to count the number of packets
entering the subbutterfly. Since the packets reside in distinct
inputs, the total number of packets can be computed via a parallel
prefix computation. The prefix computation can be performed in
time deterministically.
Next each packet independently chooses to be a splitter candidate with
probability
. As we shall see, with high probability the
number of candidates is between
and
. This
step requires only constant time.
The candidates are then sorted in
time using a simple
deterministic algorithm based on counting due to Nassimi and Sahni
[25].
After the candidates are sorted, every
th one in the sorted
order is chosen to be a splitter. This oversampling
technique, due to Reif, ensures that each of the intervals contains
approximately the same number of splitters, with high probability.
Note that we oversample by a factor of
, where N is the
number of inputs in the entire network, independent of the number of
inputs, M, of the subbutterfly on which the algorithm is invoked.
Since with high probability there are at least
splitters, the subbutterflies at the next level of recursion should
have at most
inputs.
Next the splitters are distributed throughout the M-input
subbutterfly so that they can direct each interval of packets to the
appropriate smaller subbutterfly. We distribute a copy of the median
splitter to each node in level 0 of the M-input subbutterfly. Then we
divide the splitters into upper and lower halves. We distribute a
copy of the median splitter from the upper half to each node in the
upper half of level 1. Similarly, we distribute a copy of the
median splitter from the lower half to each node in the lower half of
level 1. The process continues in this fashion until all of the
splitters are used up. At this point, every node in the first
levels of the butterfly has a copy of
a splitter. This step can be performed deterministically in
time.
After the splitters are positioned, each packet is routed to a random row of the M-input subbutterfly. The packets are scheduled using the algorithm for routing on leveled networks.
Each interval of packets is then routed to a different smaller subbutterfly. This step is called splitter-directed routing [32]. The paths of the packets are determined as follows. At level 0, each packet compares itself to the median splitter. If it is larger, it moves to the upper half of the second level, otherwise it moves to the lower half. The process is repeated at the level 1, with each packet being directed to the appropriate quarter of level 2, and so on. The packets are scheduled using the algorithm for routing on leveled networks. When all of the packets have been routed along in the butterfly as deeply as the splitters are assigned, each subbutterfly at that level picks new splitters and proceeds recursively.
The last step before the recursive call is to position the packets in
each smaller subbutterfly in distinct inputs. On an M-input
butterfly where at most c packets enter each input, M packets can
be distributed to distinct inputs deterministically in
time.
The recursion continues until either the number of inputs, M, is
smaller than
, or the number of packets, n, is
smaller than
. In the first case, the sort is completed
using Batcher's odd-even merge sort. An M-input butterfly can sort
M packets in
time using odd-even merge sort. For
, the time is
. In the second case,
the packets can be sorted deterministically in
time using
Nassimi and Sahni's sorting algorithm, as in step four.
We can now make a rough estimate of the running time of this
algorithm. Steps 1 and
2 are performed deterministically in
time. Assuming that there are
candidates,
Steps 3, 4, and
5 also require
time. As we
shall see, the expected time for Steps 6,
7 and 8 is
. Although these steps sometimes take longer than
expected, let us assume for now that they do not. In this case, the
running time is given by the recurrence
which has solution
.