 
  
  
  
 
Hardware-controlled prefetching primarily offers two advantages over 
software-controlled prefetching. First, old code does not need to be 
recompiled to take advantage of prefetching. However, this dissertation has 
demonstrated that the compiler technology for automatically inserting 
prefetches can be quite successful and is straightforward to implement. 
Therefore, since prefetching compilers should be readily available in the 
future, this does not appear to be a compelling argument. In particular, 
scientific programmers usually care enough about performance that they are 
willing to recompile their code. The second advantage of 
hardware-controlled prefetching is that it adds no instruction overhead. 
However, as we have already seen in earlier chapters, the instruction overhead of 
software-controlled prefetching is typically quite small, so this also 
appears not to be much of an advantage.

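
To give a rough sense of why the instruction overhead can be small for a constant-stride loop, the following is a minimal sketch, not the code our compiler emits. It issues one prefetch per cache line rather than one per element, so the cost of the prefetch is amortized over several useful operations. The names `sum_with_prefetch`, `LINE_DOUBLES`, and `PF_DIST` are hypothetical, the 32-byte line size and the prefetch distance are assumed values chosen only for illustration, and `__builtin_prefetch` is the GCC/Clang builtin rather than anything specific to this dissertation.

```c
#include <stddef.h>

/* Hypothetical tuning parameters, assumed for illustration only. */
#define LINE_DOUBLES 4  /* assumes 32-byte cache lines and 8-byte doubles */
#define PF_DIST      8  /* assumed prefetch distance, in elements */

/* Sum a[0..n-1], issuing one prefetch per cache line of a[]. */
double sum_with_prefetch(const double *a, size_t n)
{
    double sum = 0.0;
    size_t i = 0;

    /* Main loop: each trip covers one (assumed) cache line of a[]. */
    for (; i + LINE_DOUBLES <= n; i += LINE_DOUBLES) {
        /* The guard keeps the prefetched address inside the array. */
        if (i + PF_DIST < n)
            __builtin_prefetch(&a[i + PF_DIST], 0 /* read */, 3 /* keep cached */);

        /* LINE_DOUBLES additions share the single prefetch above. */
        for (size_t k = 0; k < LINE_DOUBLES; k++)
            sum += a[i + k];
    }

    /* Leftover elements that do not fill a whole line. */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

Under these assumptions the loop executes at most one prefetch (plus its guard) for every four additions. A compiler could remove the guard as well by splitting the loop, at the cost of some code expansion.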
Hardware-controlled prefetching has some important disadvantages compared 
to software-controlled prefetching. First, it is limited to 
constant-stride access patterns, and therefore cannot prefetch the indirect 
references that our compiler can handle (as demonstrated earlier in this 
dissertation). Since the compiler is also quite successful at 
prefetching the constant-stride cases (as we have demonstrated), 
software-controlled prefetching is likely to offer better coverage 
than hardware-controlled prefetching. We would expect this trend to 
continue in the future as the compiler becomes more sophisticated.
Second, although hardware-based schemes have no software cost, they 
may have a significant hardware cost, consuming chip area and 
possibly affecting cycle time. Therefore, since software-controlled 
prefetching has been shown to be quite effective, offers broader coverage 
of misses, and is much simpler to implement in the processor, it 
appears to be a better solution than hardware-controlled prefetching.
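
The difference in coverage for indirect references can be illustrated with a small sketch. In the loop below, the addresses touched in `data[]` depend on the contents of `index[]`, so they follow no constant stride and a stride-based hardware scheme has nothing to detect. Software can simply read the index value a few iterations early and prefetch the address it names. This is a hedged illustration only: the names `gather_sum` and `GATHER_PF_DIST` are hypothetical, the prefetch distance is an assumed value, and `__builtin_prefetch` is the GCC/Clang builtin.

```c
#include <stddef.h>

/* Assumed prefetch distance, in loop iterations (illustrative value). */
#define GATHER_PF_DIST 8

/* Sum data[index[i]] for i in [0, n).
 * Every index[i] must be a valid subscript into data[]. */
double gather_sum(const double *data, const int *index, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (i + GATHER_PF_DIST < n) {
            /* index[] itself is accessed sequentially, so reading it
             * ahead is cheap; the value read gives the future data[]
             * address, which is then prefetched. */
            __builtin_prefetch(&data[index[i + GATHER_PF_DIST]], 0, 3);
        }
        sum += data[index[i]];
    }

    return sum;
}
```

The prefetch here costs one extra load of `index[]` plus the address computation per iteration, which is the kind of overhead that must be weighed against the misses it hides.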