13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual

CODING FOR SIMD ARCHITECTURES

pattern in Figure 4-3. The two-dimensional array A is referenced in the J (column) direction and then referenced in the I (row) direction (column-major order); whereas array B is referenced in the opposite manner (row-major order). Assume the memory layout is in column-major order; therefore, the access strides of array A and B for the code in Example 4-18 would be 1 and MAX, respectively.

Example 4-18. Loop Blocking

A. Original Loop

    float A[MAX, MAX], B[MAX, MAX]
    for (i=0; i< MAX; i++) {
        for (j=0; j< MAX; j++) {
            A[i,j] = A[i,j] + B[j, i];
        }
    }

B. Transformed Loop after Blocking

    float A[MAX, MAX], B[MAX, MAX];
    for (i=0; i< MAX; i+=block_size) {
        for (j=0; j< MAX; j+=block_size) {
            for (ii=i; ii
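The listing for Example 4-18 is cut off in this copy, so the following is only a minimal sketch of the loop-blocking idea in C, not a reconstruction of the manual's code. Several things here are assumptions made for illustration: the function names add_transpose_naive and add_transpose_blocked, the size parameter n (standing in for MAX), the flat n*n buffers, and the i_end/j_end clamping that handles sizes not divisible by block_size. C stores arrays in row-major order, so in this sketch the A accesses are unit stride and the B accesses have a stride of n elements, which corresponds to the strides of 1 and MAX quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Unblocked loop: a[i][j] += b[j][i] over an n x n matrix stored as a
   flat row-major buffer. The inner loop walks a with stride 1 and b
   with stride n. */
void add_transpose_naive(float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            a[i * n + j] += b[j * n + i];
        }
    }
}

/* Blocked loop: the same arithmetic, visited one block_size x block_size
   tile at a time, so the rows of b touched by a tile are reused while
   they are still likely to be in cache. */
void add_transpose_blocked(float *a, const float *b, size_t n,
                           size_t block_size)
{
    assert(block_size > 0);

    for (size_t i = 0; i < n; i += block_size) {
        /* Clamp the tile edge so n need not be a multiple of block_size
           (an addition of this sketch, not taken from the manual). */
        size_t i_end = (i + block_size < n) ? i + block_size : n;

        for (size_t j = 0; j < n; j += block_size) {
            size_t j_end = (j + block_size < n) ? j + block_size : n;

            for (size_t ii = i; ii < i_end; ii++) {
                for (size_t jj = j; jj < j_end; jj++) {
                    a[ii * n + jj] += b[jj * n + ii];
                }
            }
        }
    }
}
```

Both functions perform exactly one update per element, so they should produce identical results; only the order of the memory accesses differs. A suitable block_size depends on the target cache and is best found by measurement.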
