To reclaim some of the lost performance potential illustrated in the earlier figures, techniques for coping with memory latency are essential. These techniques fall broadly into two categories: those that reduce latency, and those that tolerate latency. Techniques for reducing latency include caching data and making the best use of those caches through locality optimizations. Techniques for tolerating latency include buffering and pipelining references, prefetching, and multithreading. We briefly discuss each of these techniques in this subsection, both to show how prefetching fits into the overall approach to coping with latency and to motivate why prefetching itself is worth studying.