### Towards a More Principled Compiler: Progressive Backend Compiler Optimization David Koes 8/28/2006 Carnegie Mellon ## Outline I. Motivation II. Related Work III. Completed Work IV. Proposed Work V. Contributions & Timeline | Method | Expressive | Fast | Optimal | |-------------------------------------------|------------|----------|---------| | Linear Scan | × | VV | × | | Graph Coloring | × | <b>V</b> | × | | Integer Linear<br>Programming | ~ | × | ~ | | Partitioned Boolean Quadratic Programming | V | VIX | ×/V | | Method | DAG<br>Tiling | Register<br>Allocation Aware | Fast | Optimal | |-----------------------------------------|---------------|------------------------------|------|----------| | Dynamic Programming | × | × | ~ | ~ | | Binate Covering | ~ | × | × | ~ | | Peephole Based<br>Instruction Selection | ~ | × | ~ | × | | AVIV Code Generator | ~ | ~ | X | × | | Exhaustive Search | × | <b>V</b> | × | <b>/</b> | ### **Outline** - I. Motivation - II. Related Work - III. Completed Work - IV. Proposed Work - V. Contributions & Timeline 3 School of Computer Science | A More Prin | cipled Register Allocator | |----------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------| | reg<br>alloc<br>machine<br>description | - fully utilize machine description • explicit and expressive model of costs of allocation for given architecture - optimal solutions | | 14 | Carregie Mellon | # Evaluation Implemented in gcc 3.4.4 targeting x86 Optimize for code size - perfect static evaluation - important metric in its own right MediaBench, MiBench, Spec95, Spec2000 - over 10,000 functions ### **Outline** - I. Motivation - II. Related Work - III. Completed Work - IV. Proposed Work - V. Contributions & Timeline 33 Carnegie Mellon ### A Better Better Register Allocator ### Solver Improvements - Improve initial solution - Improve quality as prices converge - Hope to prove approximation bounds ### Model Improvements - Improve accuracy of model - Model simplification - Represent uniform register sets efficiently 34 Carnegie Mellon School of Computer Science ### Implementing RA<sup>2</sup>ISE Add side-constraints to Global MCNF model - implement inter-variable preferences and constraints - "if x allocated to r<sub>1</sub> and y allocated to r<sub>2</sub>, then save three bytes" - "x and y must be allocated to the same register" ### Implement x86 RAATs - RAAT tables created manually - GMCNF RAAT representation automatically generated from RAAT table with minimum use of side constraints ### Algorithms for tiling RAATs - leverage existing algorithms - exploit feedback between passes 39 Carnegie Mellon PROPOSED WORK ### PROPOSED WOR ### **Evaluation** Implement in production quality compiler (gcc) Evaluate code size and simple code speed metric Evaluate on three different architectures - x86 (8 registers) - 68k/ColdFire (16 registers) - PPC (32 registers) 41 Carnegie Mellon ### I. Motivation II. Related Work III. Completed Work IV. Proposed Work V. Contributions & Timeline ### **Contributions** ### RA2ISE - register allocation aware tiles (RAATs) explicitly encode effect of register allocation on instruction sequence - algorithms for tiling RAATs - expressive model of register allocation that operates on RAATs and explicitly represents all important components of register allocation - progressive solver for this model that can quickly find decent solution and approaches optimality as more time is allowed for compilation Comprehensive evaluation of RA2ISE 43 Carnegie Mellon School of Computer Science ### **Thesis Statement** RA<sup>2</sup>ISE is a principled and effective system for performing instruction selection and register allocation. Carnegie Mellon | imelin | 9 | |-------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Fall 2006 | add simple speed metric option to model begin model simplification work improve model accuracy and solver performance | | Winter 2006 | finish model simplification work add side-constraints to model implement existing gcc tiles as RAATs improve model accuracy and solver performance | | Spring 2007 | finish implementation of side-constraints and gcc RAATs<br>begin work on RA*ISE infrastructure<br>create gcc-independent set of RAATs for x86<br>improve model accuracy and solver performance | | Summer 2007 | finish work on RA2ISE<br>investigate and develop tiling algorithms<br>improve model accuracy and solver performance | | Fall 2007 | add 68k/ColdFire and PowerPC targets<br>investigate uniform register set simplifications<br>improve model accuracy and solver performance | | Winter 2007 | begin writing thesis work on improving compile time performance | | Spring 2008 | finish writing thesis |