15-213 Lab #4 Q&A


Saturday, March 9, 2002

Question:

When blocking, should block size be dependent of the size of matrices in lab 4 Part I? I found myself quite unable to explain the trend that block size of B. While block size of 32 will improve performance matrices of size N = 128, the rest remained constant. When changing B to 16, the next level (N = 256) shows dramatic performance gain (miss score approx. = 0.4) while score for N = 512 and 1024 remained equal to that of naive rotate.

I'm perhaps misunderstanding blocking, but does block size of B = 32 implies that we're breaking each block into 32 x 32 matrices, and therefore should not be effected by increasing size of the input?

Answer:

Well, keep in mind that increasing sizes of the data set changes the cache alignment of various elements. Hence, you have to think of situations where blocking really does help the cache alignment of elements that are problems for you when not blocking.

One thing about blocking is that it will help keep rows of items in the cache, whereas otherwise they would have been kicked out by further accesses. For certain sizes of matrices, that may or may not be a problem - all depending on the alignment with respect to the cache.


Tuesday, March 5, 2002

Question:

Is it true that a "while" loop is potentially faster than a "for" loop?

Answer:

The final problem on Exam 1 should be example enough: a for loop, when compiled, looks the same as the equivalent while loop.