Double Sum combine1: Maximum use of data abstraction: 21.25 cycles/element
Double Sum combine2: Take vec_length() out of loop: 15.21 cycles/element
Double Sum combine3: Array reference to vector data: 3.39 cycles/element
Double Sum combine4: Array reference, accumulate in temporary: 3.33 cycles/element
Double Sum combine4p: Pointer reference, accumulate in temporary: 3.33 cycles/element
Double Sum Array code, unrolled by 2: 3.00 cycles/element
Double Sum combine5p: Pointer code, unrolled by 3, for loop: 3.00 cycles/element
Function Array code, unrolled by 3, while loop, Should be 18, Got 17
Double Sum Array code, unrolled by 3, while loop: 3.12 cycles/element
Double Sum Array code, unrolled by 4: 3.08 cycles/element
Double Sum Array code, unrolled by 8: 3.11 cycles/element
Double Sum Array code, unrolled by 16: 3.00 cycles/element
Double Sum Pointer code, unrolled by 2: 3.00 cycles/element
Double Sum Pointer code, unrolled by 3: 3.00 cycles/element
Double Sum Pointer code, unrolled by 4: 3.08 cycles/element
Double Sum Pointer code, unrolled by 8: 3.11 cycles/element
Double Sum Pointer code, unrolled by 16: 3.02 cycles/element
Double Sum combine6: Array code, unrolled by 2, Superscalar x2: 1.81 cycles/element
Double Sum Array code, unrolled by 4, Superscalar x2: 1.64 cycles/element
Double Sum Array code, unrolled by 8, Superscalar x2: 1.58 cycles/element
Double Sum Array code, unrolled by 3, Superscalar x3: 1.67 cycles/element
Double Sum Array code, unrolled by 4, Superscalar x4: 1.50 cycles/element
Double Sum Array code, unrolled by 8, Superscalar x4: 1.45 cycles/element
Double Sum Array code, unrolled by 6, Superscalar x6: 1.38 cycles/element
Double Sum Array code, unrolled by 8, Superscalar x8: 1.65 cycles/element
Double Sum Array code, unrolled by 10, Superscalar x10: 1.60 cycles/element
Double Sum Array code, unrolled by 12, Superscalar x6: 1.48 cycles/element
Double Sum Array code, unrolled by 12, Superscalar x12: 1.44 cycles/element
Double Sum Pointer code, unrolled by 8, Superscalar x2: 1.57 cycles/element
Double Sum Pointer code, unrolled by 8, Superscalar x4: 1.38 cycles/element
Double Sum Pointer code, unrolled by 8, Superscalar x8: 1.40 cycles/element
Double Sum Pointer code, unrolled by 9, Superscalar x3: 1.34 cycles/element
Double Sum Array code, Unroll x2, Superscalar x2, noninterleaved: 1.80 cycles/element
Double Sum Array code, unrolled by 2, different associativity: 1.83 cycles/element
Double Sum Array code, unrolled by 3, Different Associativity: 1.55 cycles/element
Double Sum Array code, unrolled by 4, Different Associativity: 1.50 cycles/element
Double Sum Array code, unrolled by 6, Different Associativity: 1.43 cycles/element
Double Sum Array code, unrolled by 8, Different Associativity: 1.33 cycles/element
Double Sum SSE code, 1*VSIZE-way parallelism: 1.80 cycles/element
Double Sum SSE code, 2*VSIZE-way parallelism: 1.50 cycles/element
Double Sum SSE code, 4*VSIZE-way parallelism: 1.49 cycles/element
Double Sum SSE code, 8*VSIZE-way parallelism: 1.31 cycles/element
Double Sum SSE code, 12*VSIZE-way parallelism: 1.43 cycles/element
Double Sum SSE code, 2*VSIZE-way parallelism, reassociate: 1.66 cycles/element
Double Sum SSE code, 4*VSIZE-way parallelism, reassociate: 1.42 cycles/element
Double Sum SSE code, 8*VSIZE-way parallelism, reassociate: 1.36 cycles/element
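
The labels in the log are terse, so a rough sketch of what three of the variants might look like may help when reading the numbers. This is an illustration only, not the benchmark's actual source: the function names are hypothetical, a plain `double` array stands in for the vector type, and it assumes that "Superscalar xK" means K independent accumulators and that "different associativity" means regrouping the additions inside the unrolled loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline (hypothetical name): one accumulator, one element per
 * iteration. Every add depends on the previous one. */
double sum_simple(const double *v, size_t n) {
    assert(v != NULL || n == 0);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Assumed shape of "unrolled by 2, Superscalar x2": two accumulators
 * form two independent dependency chains that can overlap in the
 * floating-point pipeline. */
double sum_unroll2_acc2(const double *v, size_t n) {
    assert(v != NULL || n == 0);
    double acc0 = 0.0, acc1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += v[i];       /* even-indexed elements */
        acc1 += v[i + 1];   /* odd-indexed elements  */
    }
    for (; i < n; i++)      /* leftover element when n is odd */
        acc0 += v[i];
    return acc0 + acc1;
}

/* Assumed shape of "unrolled by 2, different associativity": a single
 * accumulator, but the two elements are added to each other first, so
 * only one add per iteration sits on the accumulator's critical path. */
double sum_unroll2_reassoc(const double *v, size_t n) {
    assert(v != NULL || n == 0);
    double acc = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2)
        acc = acc + (v[i] + v[i + 1]);
    for (; i < n; i++)      /* leftover element when n is odd */
        acc += v[i];
    return acc;
}
```

Both transformed versions change the order in which floating-point additions are performed, so their results can differ from the baseline in the last bits for general inputs; they agree exactly only when every intermediate sum is exactly representable.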