Next: Locality Analysis Up: Core Compiler Algorithm Previous: Key Concepts

Overview of Algorithm

We have developed a compiler algorithm that selectively prefetches only those references that are likely to suffer cache misses [62]. Our algorithm consists of the following three major steps:

  1. For each static affine array reference, use locality analysis to determine which dynamic accesses are likely to be cache misses and therefore should be prefetched.

  2. Isolate the predicted dynamic miss instances using loop splitting techniques such as peeling and unrolling. This avoids the overhead of adding conditional statements to the loop bodies.

  3. Schedule prefetches the proper amount of time in advance using software pipelining, where the computation of one iteration is overlapped with prefetches for a future iteration.

The first step in our algorithm comprises the analysis phase, thus answering the question of what we should prefetch. The details of this locality analysis algorithm are presented in Section . The second and third steps constitute the scheduling phase of our algorithm, and are described in Section . We will tie all of these components together in Section by showing the code generated for a running example that is introduced in Section and used throughout the remainder of this chapter.

Sat Jun 25 15:13:04 PDT 1994