   Next: 8 Sorting on butterflies Up: Randomized Routing and Sorting Previous: 6 Routing on multidimensional

# 7 Construction of area and volume-universal networks

In this section we construct a class of networks that are area-universal in the sense that a network in the class with N nodes has area and can, with high probability, simulate in steps each step of any network of area . The networks are based on the fat-trees of Greenberg and Leiserson  and the simulation uses the packet routing algorithm from Section 2.

Leiserson was the first to display a class of networks that could efficiently simulate any other network of the same area or volume. In  he showed that a fat-tree of area can simulate in bit-steps each bit-step of any network of area . (In the bit model an edge can transmit a single bit in each time step. All of the algorithms in this paper are described in terms of the packet model, in which an edge can transmit a packet of at least bits in one step.) Leiserson's simulation used an off-line routing algorithm for fat-trees. On-line routing algorithms were later developed by Greenberg and Leiserson .

A fat-tree network is shown in Figure 5. Its underlying structure is a complete 4-ary tree. Each edge in the 4-ary tree corresponds to a pair of oppositely directed groups of edges called channels. The channel directed from the leaves to the root is called an up channel; the other is called a down channel. The capacity of a channel C, , is the number of edges In the channel. We call the tree ``fat'' because the capacities of the channels grow by a factor of 2 at every level. A fat-tree of height m has nodes at the root and leaves. Figure 5: A fat-tree.

It will prove useful to label the nodes at the top and bottom of each channel. Let the level of a node be its distance from the leaves. Suppose a channel C connects nodes at level l with nodes at level l+1. Give the nodes at level l labels 0 through and the nodes at level l+1 labels 0 through . Then node k at level l is connected to nodes k and at level l+1. The following lemma relates the labels of the nodes on a packet's path from a leaf to the root. The simplest way to route packets in a leveled fashion on a fat-tree is to route every packet all of the way up to the root before routing it down to its destination. The problem with this strategy is that it may cause unnecessary congestion at the root. For example, suppose that every leaf wants to send a single packet to its sibling. If these packets are sent to the root, then packets will pass through each channel at the root. On the other hand, if each packet goes up just one level before turning around, then at most one packet will pass through any channel in the entire network. Thus, we will route every packet from its origin to its destination along a path that passes through as few channels as possible.

For a set Q of packets to be delivered between the leaves of the fat-tree, we define the load of Q on a channel C, , to be the number of different destinations of packets in Q for which at least one packet must pass through C. (A packet must pass through C only if every path from the packet's origin to its destination passes through the C.) Note that even if many packets with the same destination must pass through a channel, that destination contributes at most one to the load of the channel. The routing algorithm will combine all packets with the same destination that attempt to pass through the channel. We define the load factor of Q on C, , to be the ratio of the load of Q on C to the capacity of C, The load factor on the entire network is simply the maximum load factor on any channel The load factor is a lower bound on the the number of steps required to deliver Q. We shall sometimes write to denote when the set of packets to be delivered is clear from the context.

In a leveled fat-tree a node at the top of an up channel at level l is connected to itself at the top of the corresponding down channel by a linear chain of nodes of length . A packet may only make a transition from an up channel to a down channel by traversing a chain. Thus all shortest paths between leaves in a leveled fat-tree have length 2m. Note that the load of a set of packets on a channel of the leveled fat-tree is identical to the load on the corresponding channel in the fat-tree.

The path that a packet for destination x takes through a leveled fat-tree is determined by the m-universal hash function where P is a prime number larger than the number of possible different destinations, and the are chosen at random off-line. A packet with destination x follows up channels along the unique shortest path to the node labeled at the root until it can reach x without using any more up channels. It then crosses over to a down channel via a chain, and follows down channels to x. Note that a packet only passes through a channel if all paths from its origin to its destination pass through that channel. Also, all packets with destination x that pass through channel C pass through node at the top of C and through node at the bottom of C.

The following lemma shows that we can use the scheduling algorithm from Section 2 to route packets in a fat-tree. Proof: The paths of the packets are first randomized using the universal hash function . With high probability, the resulting congestion is . Each packet travels a distance of . The packets are then scheduled using the algorithm from Section 2.

to .667emto .667em

Let us now consider the VLSI area requirements  of fat-trees. A fat-tree with root capacity M and nodes has a layout with area that is obtained by embedding the fat-tree in the tree of meshes . The nodes of the tree of meshes in this layout are separated by a distance of in both the horizontal and vertical directions. Thus, the space for the chain associated with each node in the leveled fat-tree can be allocated without increasing the asymptotic area of the layout. (In fact, it is possible to attach a chain of size to each fat-tree node without increasing the area by more than a constant factor.) The leaves of the fat-tree are separated in the layout from each other by a distance of in each direction. We can improve the density of nodes without increasing the asymptotic area of the layout by connecting a mesh of nodes to each leaf. The resulting network has nodes and area . The N-node network in this class has root capacity , leaves, and area .

The following theorem shows that this class of networks is area-universal. Proof: The nodes of network B are mapped to the nodes of the area-universal network U off-line using a recursive decomposition technique as in . In each step, an edge of B is simulated by routing packets between the nodes that it connects. At each level of the recursion at most edges connect the nodes mapped below a channel C with the rest of the network. This property of the mapping ensures that the load factor of each set of packets used in the simulation of B is at most . At the bottom of the decomposition tree, a region of the layout of B is mapped to each leaf of the fat-tree. The mesh connected to the leaf in U simulates this region of B with slowdown using a mesh routing algorithm such as the one in Section 3.

to .667emto .667em

The study of fat-tree routing algorithms that perform combining was motivated in part by an abstraction of the volume and area-universal networks called the distributed random-access machine (DRAM). A host of conservative algorithms for tree and graph problems for the exclusive-read exclusive-write (EREW) DRAM are presented in . Recently we discovered conservative concurrent-read concurrent-write (CRCW) algorithms that require fewer steps for some of these problems . Until now, however, no efficient fat-tree routing algorithms that perform combining were known. The step routing algorithm presented here fills the void.

Only slight modifications to the area-universal fat-tree are necessary to make it volume universal . The underlying structure of the volume-universal fat-tree is a complete 8-ary tree. Instead of doubling at each level, the channel capacities increase by a factor of 4. The tree has m levels, root capacity , and leaves. The nodes at the top of a channel at level l are labeled 0 through . Node k at level l is connected to nodes k, , , and at level l+1. A layout with volume for the fat-tree can be obtained by embedding it in the three-dimensional tree of meshes. As before, a chain of size can be attached to each node of the fat-tree without increasing the asymptotic layout area and the density of nodes can be improved by connecting a mesh to each leaf.

The simulation scheme in this section can also be used to simulate shared-bus networks. In a shared-bus network, an edge is allowed to connect more than two nodes. If several nodes attempt to send packets on the same edge in the same step, then the packets are combined using some simple rule to form a single packet. Combining is assumed to require a single step, regardless of the number of packets combined or the rule used. Because the fat-tree routing algorithm is capable of combining, it is no more difficult for the fat-tree to simulate shared-bus networks than to simulate point-to-point networks (i.e., networks in which each edge connects a pair of nodes). The simulation is optimal because a point-to-point network may require steps to simulate one step of a shared-bus network. Previously, no fat-tree routing algorithms were capable of combining packets to the same destination. As a consequence, no scheme for simulating shared-bus networks was known. A network that can simulate in steps each step of any shared-bus network area of equal area was presented in . However, the connections in that network are not fixed, but instead nodes communicate via reconfigurable buses.   Next: 8 Sorting on butterflies Up: Randomized Routing and Sorting Previous: 6 Routing on multidimensional

Bruce Maggs
Mon Jul 22 22:57:44 EDT 1996