12.07.2015 Views

Non-linear memory layout transformations and data prefetching ...



List of Figures

2.1 The Memory Hierarchy pyramid
2.2 Virtual Memory
3.1 Row-major indexing of a 4×4 matrix, analogous Morton indexing and Morton indexing of the order-4 quadtree
3.2 The 4-dimensional array in 4D layout
3.3 The tiled array and indexing according to Morton layout
3.4 The 4-dimensional array in Morton layout in its actual storage order: padding elements are not canonically placed on the borders of the array, but mixed with useful elements
3.5 ZZ-transformation
3.6 NZ-transformation
3.7 NN-transformation
3.8 ZN-transformation
3.9 A 2-dimensional array converted to 1-dimensional array in ZZ-transformation
3.10 Conversion of the linear values of row and column indices to dilated ones through the use of masks. This is an 8 × 8 array with 4 × 4 element tiles
3.11 Matrix multiplication: C[i, j] += A[i, k] * B[k, j]: Oval boxes show the form of row and column binary masks. Shifting from element to element inside a tile takes place by changing the four least significant digits of the binary representation of the element position. Switching from tile to tile takes place by changing the 2 most significant digits of the binary representation of the element position.
3.12 Flow Chart of the proposed optimization algorithm: it guides optimal nested loop ordering, which is being matched with respective storage transformation
4.1 Reuse of array elements in the matrix multiplication code
4.2 Alignment of arrays A, B, C, when N² ≤ C_L1
4.3 Alignment of arrays A, B, C, when C_L1 = N²
