Solving Linear Recurrences with Loop Raking

Guy E. Blelloch, Siddhartha Chatterjee, and Marco Zagha.
Journal of Parallel and Distributed Computing, February 1995.

73k compressed postscript

Abstract: We present a variation of the partition method for solving linear recurrences that is well-suited to vector multiprocessors. The algorithm fully utilizes both vector and multiprocessor capabilities, and reduces the number of memory accesses and temporary memory requirements as compared to the more commonly used version of the partition method. Our variation uses a general loop restructuring technique called loop raking.

We describe an implementation of this technique on the CRAY, and present performance results for first- and second-order linear recurrences. On a single processor of the C90 our implementations are up to 7.3 times faster than the corresponding optimized library routines in SCILIB, an optimized mathematical library supplied by Cray Research. On 4 processors, we gain an additional speedup of at least 3.7.

	author = "Guy~E. Blelloch and Siddhartha Chatterjee and Marco Zagha",
	title = "Solving Linear Recurrences with Loop Raking",
	journal = "Journal of Parallel and Distributed Computing",
        month = feb,  
        volume = 25,
        number = 1,
       	year = 1995,
        pages = "91--97"}