This section describes the first step in our algorithm, which is to use locality analysis to determine which references are likely to cause cache misses.
Before we begin our discussion, we must define and distinguish two fundamental concepts: reuse and locality. We say that reuse occurs whenever the same data item is referenced multiple times. Understanding data reuse is essential to predicting cache behavior, since a data item will normally be found in the cache only if its line was referenced sometime in the past. However, reuse may not result in a cache hit if intervening references flush the data item from the cache between uses. If the reused data does remain in the cache, we say that the reference that enjoys the cache hit has locality. Reuse, therefore, does not necessarily result in locality; rather, the references with data locality are a subset of the references with data reuse.
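To make this distinction concrete, the following toy direct-mapped cache model (the `simulate` helper and all sizes are hypothetical, chosen purely for illustration) shows the same reuse of an array either becoming or failing to become locality, depending on whether an intervening sweep over a second array flushes the first one:

```python
# Toy direct-mapped cache model (hypothetical sizes) illustrating that
# reuse of a data item yields a hit (locality) only when no intervening
# references evict its cache line.

LINE_SIZE = 4      # elements per cache line (assumed)
NUM_LINES = 8      # total lines in this toy cache (assumed)

def simulate(refs):
    """Return the number of cache hits for a sequence of element addresses."""
    cache = {}                       # set index -> tag currently resident
    hits = 0
    for addr in refs:
        line = addr // LINE_SIZE
        index, tag = line % NUM_LINES, line // NUM_LINES
        if cache.get(index) == tag:
            hits += 1                # reuse that became locality
        else:
            cache[index] = tag       # miss: fetch the line, evicting the old tag
    return hits

a = list(range(32))                  # array A: exactly fills the toy cache
b = list(range(1000, 1032))          # array B: maps onto the same cache sets

# Reuse WITH locality: two back-to-back passes over A.
# Pass 1: 8 misses + 24 spatial hits; pass 2: all 32 references hit.
assert simulate(a + a) == 24 + 32

# Reuse WITHOUT locality: sweeping B between the passes evicts A's lines,
# so the second pass over A gets only its spatial hits (24), not 32.
assert simulate(a + b + a) == 24 + 24 + 24
```

The second assertion is the key point: the temporal reuse of A is identical in both runs, but only in the first run does it turn into locality.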
Given this relationship between reuse and locality, our locality analysis algorithm is broken down into the following three substeps:

1. discover the intrinsic data reuses within each loop nest;
2. determine the set of reuses that actually result in locality, given the localized iteration space; and
3. translate that locality into prefetch predicates for the individual references.
The first two substeps have been adapted from Wolf and Lam's data locality optimizing algorithm , while the third substep is unique to prefetching. Although analyzing data locality is essential to both locality optimizations and prefetching, the differing goals of the two techniques lead to differences in how the analysis is applied. For locality optimizations, the goal is to quantify the data locality of the various permutations of a loop nest so that the transformation yielding the maximum locality can be selected. For prefetching, the goal is to identify, rather than quantify, the references with data locality, and the algorithm does not consider permuting the given loop nest. There are other differences as well, such as how the localized iteration space is computed. Wolf and Lam's algorithm assumes that all loop bounds are large, and therefore the localized iteration space contains loops other than the innermost loop only when transformations such as ``cache blocking'' are actively applied. In contrast, since prefetching is interested in discovering rather than enhancing data locality, it is more important to determine the given localized iteration space accurately, so our algorithm attempts to reason about both constant and symbolic loop bounds. A final difference is that prefetching attempts to analyze locality in all loop nests, rather than only the well-behaved loop nests where it is considered safe to reorder the computation.
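As a rough illustration of why reasoning about loop bounds matters, the following simplified sketch (not the actual analysis; `localized_loops`, the footprint model, and the cache size are all assumptions made here for illustration) walks a loop nest from the innermost loop outward and keeps a loop in the localized iteration space only while the accumulated data footprint is known to fit in the cache. A symbolic (unknown) bound is conservatively treated as large:

```python
# Hedged sketch of localized-iteration-space reasoning with constant and
# symbolic loop bounds. The footprint model is deliberately crude: it
# multiplies trip counts as if every iteration touched fresh data,
# ignoring reuse, which a real analysis would account for.

CACHE_SIZE = 1024          # cache capacity in data elements (assumed)

def localized_loops(trip_counts, elems_per_inner_iter):
    """trip_counts: bounds of each loop, outermost first; None = symbolic.
    Returns the indices of the loops inside the localized iteration space."""
    localized = []
    footprint = elems_per_inner_iter
    for i in reversed(range(len(trip_counts))):   # innermost loop outward
        n = trip_counts[i]
        if n is None:          # symbolic bound: conservatively assume large
            break
        footprint *= n         # data touched by one full execution of loop i
        if footprint > CACHE_SIZE:
            break              # this loop's data no longer fits in the cache
        localized.append(i)
    return localized

# A symbolic outer bound stops the analysis, so only the innermost
# loop (trip count 64) is localized.
assert localized_loops([None, 100, 64], 1) == [2]

# With small constant bounds, the analysis can prove that the two
# inner loops both belong to the localized iteration space.
assert localized_loops([1000, 8, 64], 1) == [2, 1]
```

The second case is exactly the situation the text describes: constant bounds let the analysis discover locality carried by an outer loop that a large-bounds assumption would miss.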
The result of the first two substeps of this algorithm is a mathematical description of the data locality in a vector space representation. While this description is precise, a more convenient representation for our purposes is the prefetch predicates, which are constructed during the third substep of our locality analysis algorithm. Details of these three substeps will be presented in Sections , , and , respectively. Before going into these details, however, we first introduce the following example.