A lockup-free cache is essential for obtaining the benefits of prefetching. For the architecture and applications we studied, most of the performance benefit was captured by allowing up to four outstanding misses. In some cases, buffering additional prefetch requests beyond these four misses provided a further performance advantage. Finally, combining prefetches and writes in the same buffer showed no significant loss in performance for our multiprocessor architecture.
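
To make the role of the outstanding-miss limit concrete, the following is a minimal toy model, not the simulator used in this study. It assumes a fixed miss latency and that the processor stalls whenever a new request finds every miss slot occupied. The function name `stall_cycles` and its parameters are hypothetical and chosen only for illustration.

```python
import heapq


def stall_cycles(issue_times, miss_latency, max_outstanding):
    """Toy model of a lockup-free cache with a bounded number of misses in flight.

    Assumptions (illustrative only): every request misses, each miss holds
    one slot for `miss_latency` cycles, and the processor stalls when all
    `max_outstanding` slots are busy. Returns the total stall cycles.
    """
    in_flight = []  # min-heap of completion times of outstanding misses
    delay = 0       # stall cycles accumulated so far

    for t in issue_times:
        now = t + delay  # earlier stalls push every later request back

        # Free the slots of misses that have already completed.
        while in_flight and in_flight[0] <= now:
            heapq.heappop(in_flight)

        # All slots busy: stall until the earliest miss completes.
        if len(in_flight) >= max_outstanding:
            free_at = heapq.heappop(in_flight)
            delay += free_at - now
            now = free_at

        heapq.heappush(in_flight, now + miss_latency)

    return delay


if __name__ == "__main__":
    requests = list(range(8))  # one request per cycle, hypothetical workload
    for slots in (1, 2, 4, 8):
        print(slots, stall_cycles(requests, miss_latency=100, max_outstanding=slots))
```

With a single slot the model behaves like a blocking cache, and each request waits for the previous one to finish. Adding slots lets misses overlap, so stalls shrink until the slot count covers the burst of requests. The numbers it produces depend entirely on the assumed workload and latency and say nothing about the measured results reported here.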