The first step in the analysis is to show that, with high probability, the number of splitter candidates chosen by each butterfly is within a constant factor of the expectation. We say that an M-input butterfly is well-partitioned if the number of splitter candidates chosen is between and . The upper bound ensures that the candidates can be sorted deterministically by the butterfly in time and the lower bound implies that the subbutterflies at the next level of recursion will have at most inputs. If all of the butterflies are well-partitioned, then the algorithm terminates after levels of recursion. (The choice of and as the coefficients of are not particularly important. Other constants would serve equally well.)
Proof: We begin by considering a single M-input butterfly that is to sort n packets. Since each packet chooses independently to be a candidate, the number of candidates has a binomial distribution. Let S be the number of successes in r independent Bernoulli trials where each trial has probability p of success. Then we have . We estimate the area under the tails of this binomial distribution using a Chernoff-type bound . Following Angluin and Valiant  we have
for and . In our application r = n, , , and . For any fixed constant , there is a constant such that the right-hand sides of the two inequalities sum to at most for .
To bound the probability that any butterfly is not well-partitioned, we sum the probabilities for all of the individual butterflies. Over the course of the algorithm, the algorithm is invoked on at most individual butterflies. Thus, the sum is at most . For any , there is a such that this sum is at most .
The next lemma shows that, with high probability, the number of packets in each interval is at most a constant factor times its expectation. We say that an M-input butterfly that is assigned n packets to sort is -split if every interval has size at most . As we shall see, if every butterfly is -split and there are levels of recursion, then by lightly loading the butterfly we can ensure that no butterfly is assigned too many packets to sort.
Proof: We begin by examining a single packet in a single M-input butterfly that is to sort n-packets. To show that a packet lies in an interval of size at most it is sufficient to show that both following and preceding it in the sorted order at least of the next packets are candidates.
First we consider the packets that follow in the sorted order. The number of candidates in a sequence of packets has a binomial distribution. For , , , , and , we have . For any we can make the right-hand side smaller than by choosing large enough.
The calculation for the packets that precede in the sorted order are identical. The probability that fewer of the preceding packets are candidates is at most . Thus, the probability that an individual packet lies in an interval of size greater than is at most .
To bound the probability that any interval in the butterfly is too large we sum the probabilities that each individual packet lies in an interval that is too large. Since there are at most packets, this sum is at most .
To bound the probability that any butterfly is not -split, we sum the probabilities that each individual butterfly is not. Over the course of the algorithm, the algorithm is invoked on at most butterflies. The sum of the probabilities is at most . For any constant , we can make this sum at most by making large enough.
The remainder of the analysis is conditioned on the event that every subbutterfly is well-partitioned and -split, which occurs with high probability. Two technical points bear mentioning. First, Lemma 8.1 requires that the number of inputs to every subbutterfly be at least , where is some constant. Since the recursion terminates when the number of inputs is , N must be large enough that . Second, both Lemmas 8.1 and 8.2 hold independent of the number of packets to be sorted by each subbutterfly. Thus, as the following lemmas show, we can adjust the load on the butterfly in order to ensure that each M-input subbutterfly receives at most M packets to sort.
Proof: At each level of recursion the number of inputs drops from M to at most , until the number of inputs reaches .
Proof: Since the ratio of packets to inputs is at the top level of the recursion, and increases by at most a constant factor at each of levels, it is possible to choose such that at the bottom level it will be at most one.