Although non-binding prefetching allows the compiler to ignore, from a correctness standpoint, the fact that it is compiling for a multiprocessor, there are performance reasons why it should take multiprocessing into account. In this subsection, we discuss the first of these reasons: communication between processors can increase the miss rate by causing additional coherence misses (e.g., misses due to invalidations when using an invalidation-based cache coherence protocol).
As an illustration of how communication affects the miss rate, consider the example in Figure . Two processors both access location A, and both initially have copies of A in their caches in a ``shared'' state. Processor 1 loads A twice. Assume that during the interval between these loads, Processor 1 accesses no other locations that would interfere with A in the cache. If this access pattern occurred on a uniprocessor, it would be reasonable to expect the second load of A to hit in the cache, since A has not been replaced by other accesses since it was first loaded. In the multiprocessor scenario in Figure , however, Processor 2 stores to location A during this interval. The store invalidates A in Processor 1's cache, so the second load of A by Processor 1 misses. The compiler should take such coherence misses into account during its analysis phase, when it predicts which references to prefetch.
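The scenario above can be reproduced with a toy model. The following Python sketch is illustrative only: the three-state invalidation protocol and the names `State`, `Cache`, `System`, and `run_scenario` are assumptions made for this example, and are not part of the compiler algorithm or of any particular machine. It models only the state of location A in each cache and ignores capacity and conflict replacements, matching the assumption that nothing else interferes with A.

```python
from enum import Enum


class State(Enum):
    """Per-line cache states in a simplified invalidation-based protocol."""
    INVALID = 0
    SHARED = 1
    EXCLUSIVE = 2  # a single cache holds the (possibly modified) line


class Cache:
    """Hypothetical per-processor cache that tracks only coherence state."""

    def __init__(self, name):
        self.name = name
        self.lines = {}  # address -> State

    def state(self, addr):
        return self.lines.get(addr, State.INVALID)


class System:
    """Hypothetical set of caches kept coherent by invalidation."""

    def __init__(self, caches):
        self.caches = caches

    def load(self, cache, addr):
        """Perform a load and report whether it hit or missed."""
        if cache.state(addr) is not State.INVALID:
            return "hit"
        # Miss: any exclusive owner is downgraded, then we take a shared copy.
        for other in self.caches:
            if other is not cache and other.state(addr) is State.EXCLUSIVE:
                other.lines[addr] = State.SHARED
        cache.lines[addr] = State.SHARED
        return "miss"

    def store(self, cache, addr):
        """Perform a store, invalidating every other cached copy."""
        for other in self.caches:
            if other is not cache:
                other.lines[addr] = State.INVALID
        cache.lines[addr] = State.EXCLUSIVE


def run_scenario(remote_store):
    """Processor 1 loads A twice; Processor 2 optionally stores in between."""
    p1, p2 = Cache("P1"), Cache("P2")
    system = System([p1, p2])
    # Both processors start with A in the shared state, as in the text.
    p1.lines["A"] = State.SHARED
    p2.lines["A"] = State.SHARED

    first = system.load(p1, "A")
    if remote_store:
        system.store(p2, "A")  # invalidates P1's copy of A
    second = system.load(p1, "A")
    return first, second


if __name__ == "__main__":
    print("no intervening store:", run_scenario(remote_store=False))
    print("intervening store:   ", run_scenario(remote_store=True))
```

Without the intervening store, both loads by Processor 1 hit, as a uniprocessor analysis would predict. With the store, the second load misses even though Processor 1 accessed nothing that could have displaced A, which is exactly the kind of miss that a locality analysis based only on a single processor's reference stream would overlook.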