Control Flow, 1/24/01 ---------------------------------------------------------------- Basic Blocks Basic Block Optimizations Control Flow Graphs Dominators Loops Basic Blocks ------------ A basic block is a sequence of instructions with one entry point and one exit point. Advantages of converting into basic blocks? Creating basic blocks is easy: - Scan instructions for LEADERS A leader is: - the first instruction - a label - an instruction that follows a branch - Each leader starts a basic block which ends at the instruction before the next leader There are cases when the definition of a leader is different. For example, in the above definition calls do NOT end a BB. (Thought setjmp and longjmp seem to be cases for this usually they aren't.) Delay slots may cause the BB to be reformulated. Optimizations on Basic Blocks: (History: Most optimizations we cover in this course are global optimizations. They often started out as local optimizations on basic blocks) Local CSE (13.1.1) is an example: Scan instructions maintaining a table of 5-tuples: At each instruction, c <- a OP b - we see if there is a 5-tuple of the form: or If there is: We have found a CSE. - We see if there is a tuple of the form or If there is remove the tuple If we find a CSE: and var is in tuple, then rewrite c <- a op b with c <- var otherwise, create new temp and put in var rewrite instruction at pos to be: ... <- var insert before pos, var <- a op b rewrite current instruction with c <- var This can be quite fast if we hash on a,op,b (and make hash function commutative in a and b) Control Flow Graphs Once we have converted sequence of instructions into basic blocks we can create a control flow graph. A CFG is a directed graph where - Each basic block becomes a node in the CFG. - Usually the CFG is augmented with two distinguished nodes: Entry and Exit. - There is an edge between two nodes, B1 and B2, B2 can follow B1 in some execution of the program, i.e., - B2 is the target of a branch/jump at the end of B1 - B2 follows B1 and B1 does NOT end in an unconditional jump - There is an edge from Entry to the block which contains the first instr - There is an edge from any block which can exit the procedure to Exit If there is an edge from B1 to B2, then - B1 is a predecessor of B2 - B2 is a successor of B1 If a node has multiple predecessors, then that node is called a JOIN node. Before we look at how to use the CFG let me just mention that there are other kinds of basic blocks: Hyperblocks, Extended basic blocks. Hyperblocks (used in Impact) can have multiple entries and exits. Often used for trace scheduling. Extended basic blocks have 1 entry and multiple exits. More formally, an EBB is a maximal sequence of instructions starting with a leader and containing no join nodes except possibly for the leader. Now we have a CFG. So what? - Useful for data flow analysis, Loop transformations, etc. etc. Some Examples: - Unreachable Code Elimination Removes all basic blocks that can NEVER be reached in ANY execution of the program sweep graph marking nodes as you go (depth first, breadth first) remove all unmarked nodes This is really not so much an optimization as it is a graph conditioner. - Straightening Enlarges basic blocks when - B1->B2 - B1 is B2's only pred - B2 is B1's only succ B1 | \|/ -> B1/2 B2 / \ / \ Have to be careful of how you fuse the two blocks B1: ... goto B2 B3: ... goto B7 B2: ... if (cond) goto B5 B4: ... B1: ... goto B2 <- zap zap->B2: ... if (cond) goto B5 [need to insert a goto here: goto B4] B3: ... goto B7 B4: ... more = false repeat Foreach basic block, b, if (|(succ(b)| == 1) && (|pred(succ(b))| == 1) dest = succ(b) more = true // fuse if (b.lastInstruction == ubr) eliminate it if (dest.firstInstruction == label) eliminate it append dest instruction to b if (b.lastInstruction != ubr) make new label, l fall_thru = block dest can fall_thru to append "goto l" to b prepend "l:" to fall_thru succ(b) = succ(dest) while more Loops ----- There is more leverage in optimizing loops than anything else. So we need an easy way to find loops. Furthermore, we need to be able to find inner loops. To find loops we use dominators. First some definitions: A loop in a cfg is a set of nodes, L such that - one node is distinguished as the header, h - For all nodes in L, there is a path to h - There is a path from h to all nodes in L - the predecessors of every node in L are in L except for h which can have predecessors outside of L --------------------Ex. A ----------------- (Note: this excludes some things you might think are loops) E.g., 1->2, 1->3 2->3 3->2 Finding loops: Use dominitors Dominators: A node d dominates a node n if every path from Entry to n must go through d. Every node dominates itself. -------------------------Ex. B ------------------ Example: 1 dominates 1,2,3,... 2 dominates 2,3 4 dominates 4,5,6,... 12 dominates 12 We say that a node, i, IMMEDIATELY DOMINATES another node, n (i = idom(n)) if - idom(n) != n - idom(n) dominates n - idom(n) doesn't dominate a dominator of n Thm: Every node (except entry) has exactly 1 immediate dominator Proof: Use lemma we are about to prove Lemma: If d dominates n AND e dominates n then either d dom e or e dom d Proof: Assume !(d dom e) AND !(e dom d) Then there exists a path: Entry->e not through d Thus, EVERY path from e->n must go through d (otherwise d wouldn't dominate n) But the same must be true of d->n then there must be an infinite loop d->e->d->... QED Proof of above thm: - For every node, n, except Entry there exists a dominator, d, of n such that d != n. - Assume d = idom(n) and e = idom(n) and e != d By lemma, (wlog) e dom d, but then e != idom(n) We can construct a DOMINATOR TREE by connecting idom(n) to n. It is a tree by thm. How do we find loops: We call an edge (n,h) where h dom n a BACK EDGE. Every back edge (n,h) defines a loop (NATURAL LOOP) in the CFG with header h. The natural loop defined by (n,h) is the set of nodes x such that - h dom x - there exists a path x->n that does NOT contain h Eg. (10,5) {5,8,9,10} (9,8) {8,9} (3,2) {2,3} (4,2) {2,4} Notice that a loop can contain another loop. I.e., (10,5) contains (9,8). [We do NOT say that (3,2) or (4,2) contain each other]. Constructing the loop: Find all nodes in loop defined by n->d. loop <- { d } stack.push(n) while stack not empty m <- stack.pop() foreach predecessor of m, p if (NOT p in loop) stack.push(p) loop = loop UNION {p} We can find INNER LOOPS as follows: if there are two loops A and B with headers a and b resp and a != b and a dom b, then B is NESTED inside A. Or, B is an inner loop and A is the outer loop. To find the loop nests: - Compute dominators - Construct the dominator tree - Find all the natural loops (and headers) - merge all loops with a common header - Construct a tree such that if h1 dom h2, then h1 is above h2 in the tree. ---------------------- Ex. C ------------------ For convience we can say that the Entry node is a "loop", then we have a single root. depth in tree indicates how deeply the loop is nested. OK, finally, lets compute dominators (simple way, a better way will be done later when we do SSA) remember definition of dominators: A node d dominates a node n if every path from Entry to n must go through d. Every node dominates itself. If a node, n, dominates all pred(m), then n dom m. Using this we can define a set of equations to determine the dominators of n. Lets say that D[n] are all the nodes that dominate n. Then, D[Entry] = { Entry } D[n] = {n} UNION ( INTERSECTION D[p] ) n != Entry p in pred(n) The idea is to iterate over the graph until no changes are made to D for any n. We know it terminates since the computed D[n] is always a subset of the original one. So, this becomes: D[entry] = {entry} for all n, n != Entry: D[n] = {all nodes} repeat changed = false foreach node n (n != Entry) D'[n] = {n} UNION ( INTERSECTION D[p] ) p in pred(n) if (D'[n] != D[n]) changed = true D[n] = D'[n] while changed 1 {1} 2 {2,3,4,5,6,7,8,9,10,11,12} 3 {2,3,4,5,6,7,8,9,10,11,12} 4 {2,3,4,5,6,7,8,9,10,11,12} 5 {2,3,4,5,6,7,8,9,10,11,12} 6 {2,3,4,5,6,7,8,9,10,11,12} 7 {2,3,4,5,6,7,8,9,10,11,12} 8 {2,3,4,5,6,7,8,9,10,11,12} 9 {2,3,4,5,6,7,8,9,10,11,12} 10 {2,3,4,5,6,7,8,9,10,11,12} 11 {2,3,4,5,6,7,8,9,10,11,12} 12 {2,3,4,5,6,7,8,9,10,11,12} 1 {1} 2 {2,1} 3 {2,3} 4 {2,4} 5 {4,5} 6 {4,6} 7 {4,7} 8 {5,8} 9 {9,8} 10 {9,10} 11 {7,11} 12 {12} Very fast looking. We will discuss why this is fast and how to make sure it happens later. Also, discuss ways to implement this efficiently later too. Other dominator relationships: POST-DOMINATION: p pdom n iff every path from n to Exit goes through p