next up previous
Next: 4 List contraction Up: Communication-Efficient Parallel Algorithms Previous: 2 The DRAM model

3 Conservative algorithms

 

This section introduces the notion of a conservative algorithm. In the DRAM model, a conservative algorithm is communication efficient in the sense that it never produces more congestion across cuts of the DRAM than is implicit in the input data structure. We give an important lemma that shows how pointers in a data structure can be ``shortcut'' without introducing congestion.

A conservative algorithm is a DRAM algorithm in which the load factor of memory accesses in any step is bounded by the load factor of the input data structure, independent of the cut capacities of the DRAM on which the algorithm is executed. To be precise, we define a set M of memory accesses to be conservative with respect to another set tex2html_wrap_inline1163 of memory accesses if for all cuts S of a DRAM, we have tex2html_wrap_inline1167 . By implication, whatever the cut capacities of the DRAM, we have tex2html_wrap_inline1169 . We make the natural extension of the term conservative to sets of pointers and data structures. A conservative algorithm is thus one all of whose memory accesses are conservative with respect to the input data structure. Thus, if a conservative algorithm runs for T steps on an input data structure with load factor tex2html_wrap_inline1173 , then the total time for the algorithm is at most tex2html_wrap_inline1175 .

If at every step, the memory accesses of an algorithm correspond to a subset of pointers in the input data structure, then the algorithm is certainly conservative since if M is a subset of tex2html_wrap_inline1179 , then we have tex2html_wrap_inline1181 . For example, synchronous distributed algorithms, such as the network flow algorithms of Goldberg and Tarjan [8, 9], are conservative for this reason. We do not wish to restrict our attention to this limited class of conservative algorithms because synchronous distributed algorithms cannot efficiently solve certain problems on graphs with high diameter. For example, the problem considered earlier of determining the distance of each element to the end of the list cannot be solved in less than linear time with a synchronous distributed algorithm. A PRAM algorithm, however, can perform such the computation in logarithmic time, for example, by recursive doubling, but recursive doubling is not conservative.

We would like to know conditions under which processors in a DRAM can communicate directly with distant locations in a data structure without increasing communication requirements as measured by the load factor. The following simple, but important, lemma provides conditions that are sufficient for any DRAM.

lemma226

Proof: We show only the first part of the lemma since the second part follows immediately by induction. We shall show that tex2html_wrap_inline1201 for any cut S of the underlying network. Consider the eight ways in which a, b, and c can be assigned to sides of the partition induced by a cut S. Half the cases can be eliminated by symmetry if we assume that a is on the left side. In each of the four remaining cases, the load across the cut is either unchanged or diminished when tex2html_wrap_inline1215 and tex2html_wrap_inline1217 are replaced with tex2html_wrap_inline1219 , as is shown in Figure 2.


to .667emto .667em

   figure231
Figure 2: The Shortcut Lemma. In each of the four cases illustrated, the load factor across the cut is either unchanged or diminished by replacing tex2html_wrap_inline1221 and tex2html_wrap_inline1223 with tex2html_wrap_inline1225 .

In summary, this section has introduced the notion of a conservative algorithm. An upper bound on the time required by a conservative algorithm can be determined solely from the embedding of an input data structure on the DRAM. If the number of steps of the conservative algorithm is T and the load factor of the input data structure is tex2html_wrap_inline1229 , then the total time is at most tex2html_wrap_inline1231 . A user of a conservative algorithm therefore need only minimize the congestion of pointers in the input data structure across cuts of the DRAM to minimize the time required by the algorithm. If the embedding of the data structure is good, that is, its load factor is small, then a conservative algorithm that uses a small number of steps runs fast.



next up previous
Next: 4 List contraction Up: Communication-Efficient Parallel Algorithms Previous: 2 The DRAM model



Bruce Maggs
Thu Jul 25 19:12:36 EDT 1996