next up previous
Next: 8.2 Analysis Up: 8 Sorting on butterflies Previous: 8 Sorting on butterflies

8.1 The algorithm

The basic outline of the algorithm is the same as that of Flashsort. The first step is to randomly select a small set of splitters from among the packets that are to be sorted. Next the splitters are sorted deterministically. The splitters partition the packets into intervals. The ith interval consists of those packets whose keys are larger than the key of the tex2html_wrap_inline3819 st largest splitter, and smaller than the key of the ith largest splitter. (We assume without loss of generality that all of the keys are distinct.) Using the splitters as guides, each interval of packets is routed to a different subbutterfly, where it is sorted recursively.

We begin by describing a recursive algorithm for sorting tex2html_wrap_inline3823 packets in tex2html_wrap_inline3825 time on an tex2html_wrap_inline3827 -node butterfly, where tex2html_wrap_inline3829 is some fixed constant greater than one. The butterfly is ``lightly loaded'' by this factor of tex2html_wrap_inline3831 to ensure that, with high probability, at the lower levels of the recursion the number of packets to be sorted by each subbutterfly does not exceed the number of inputs to that subbutterfly. When the algorithm is invoked, each packet must reside at a distinct input. As we shall see, this algorithm can be combined with Leighton's Columnsort algorithm [16] to sort all tex2html_wrap_inline3833 packets in tex2html_wrap_inline3835 time.

The steps taken by a subbutterfly with M inputs are presented in some detail in Figure 6.

            figure844
Figure 6: The steps performed by an M-input subbutterfly in the recursive algorithm for sorting tex2html_wrap_inline3851 packets in tex2html_wrap_inline3853 time on an tex2html_wrap_inline3855 -node butterfly using constant-size queues.

The first step in the algorithm is to count the number of packets entering the subbutterfly. Since the packets reside in distinct inputs, the total number of packets can be computed via a parallel prefix computation. The prefix computation can be performed in tex2html_wrap_inline3857 time deterministically.

Next each packet independently chooses to be a splitter candidate with probability tex2html_wrap_inline3859 . As we shall see, with high probability the number of candidates is between tex2html_wrap_inline3861 and tex2html_wrap_inline3863 . This step requires only constant time.

The candidates are then sorted in tex2html_wrap_inline3865 time using a simple deterministic algorithm based on counting due to Nassimi and Sahni [25].

After the candidates are sorted, every tex2html_wrap_inline3867 th one in the sorted order is chosen to be a splitter. This oversampling technique, due to Reif, ensures that each of the intervals contains approximately the same number of splitters, with high probability. Note that we oversample by a factor of tex2html_wrap_inline3869 , where N is the number of inputs in the entire network, independent of the number of inputs, M, of the subbutterfly on which the algorithm is invoked. Since with high probability there are at least tex2html_wrap_inline3875 splitters, the subbutterflies at the next level of recursion should have at most tex2html_wrap_inline3877 inputs.

Next the splitters are distributed throughout the M-input subbutterfly so that they can direct each interval of packets to the appropriate smaller subbutterfly. We distribute a copy of the median splitter to each node in level 0 of the M-input subbutterfly. Then we divide the splitters into upper and lower halves. We distribute a copy of the median splitter from the upper half to each node in the upper half of level 1. Similarly, we distribute a copy of the median splitter from the lower half to each node in the lower half of level 1. The process continues in this fashion until all of the splitters are used up. At this point, every node in the first tex2html_wrap_inline3889 levels of the butterfly has a copy of a splitter. This step can be performed deterministically in tex2html_wrap_inline3891 time.

After the splitters are positioned, each packet is routed to a random row of the M-input subbutterfly. The packets are scheduled using the algorithm for routing on leveled networks.

Each interval of packets is then routed to a different smaller subbutterfly. This step is called splitter-directed routing [32]. The paths of the packets are determined as follows. At level 0, each packet compares itself to the median splitter. If it is larger, it moves to the upper half of the second level, otherwise it moves to the lower half. The process is repeated at the level 1, with each packet being directed to the appropriate quarter of level 2, and so on. The packets are scheduled using the algorithm for routing on leveled networks. When all of the packets have been routed along in the butterfly as deeply as the splitters are assigned, each subbutterfly at that level picks new splitters and proceeds recursively.

The last step before the recursive call is to position the packets in each smaller subbutterfly in distinct inputs. On an M-input butterfly where at most c packets enter each input, M packets can be distributed to distinct inputs deterministically in tex2html_wrap_inline3907 time.

The recursion continues until either the number of inputs, M, is smaller than tex2html_wrap_inline3911 , or the number of packets, n, is smaller than tex2html_wrap_inline3915 . In the first case, the sort is completed using Batcher's odd-even merge sort. An M-input butterfly can sort M packets in tex2html_wrap_inline3921 time using odd-even merge sort. For tex2html_wrap_inline3923 , the time is tex2html_wrap_inline3925 . In the second case, the packets can be sorted deterministically in tex2html_wrap_inline3927 time using Nassimi and Sahni's sorting algorithm, as in step four.

We can now make a rough estimate of the running time of this algorithm. Steps 1 and 2 are performed deterministically in tex2html_wrap_inline3929 time. Assuming that there are tex2html_wrap_inline3931 candidates, Steps 3, 4, and 5 also require tex2html_wrap_inline3933 time. As we shall see, the expected time for Steps 6, 7 and 8 is tex2html_wrap_inline3935 . Although these steps sometimes take longer than expected, let us assume for now that they do not. In this case, the running time is given by the recurrence

eqnarray882

which has solution tex2html_wrap_inline3937 .



next up previous
Next: 8.2 Analysis Up: 8 Sorting on butterflies Previous: 8 Sorting on butterflies



Bruce Maggs
Mon Jul 22 22:57:44 EDT 1996