29.11.2012 Views

Compile-time Loop Splitting for Distributed Memory ... - Stanford AI Lab



aref (aref (A, div-a-i), rem-a-i);

aref (aref (B, div-b-i + (i-n * div-b-j)),
      rem-b-i + (i-spread * rem-b-j));

aref (aref (C, div-c-i + i-proc * (div-c-j + (j-proc * div-c-k))),
      rem-c-i + i-spread * (rem-c-j + j-spread * rem-c-k));

Figure 2-8: The same references with all the divisions replaced by interval invariants and all modulos replaced by an incrementing counter.

aref (aref (B, pid-B-ij), rem-b-i + (i-spread * rem-b-j));

aref (aref (C, pid-C-ijk), rem-c-i + i-spread * (rem-c-j + j-spread * rem-c-k));

Figure 2-9: The 2-D and 3-D reference expressions after further compiler optimization.

each reference.
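The replacement shown in Figure 2-8 can be sketched for the 1-D case as follows. This is a minimal illustration, not the thesis's implementation: it assumes a block distribution in which each processor holds `spread` consecutive elements, and the Python names `div_a_i` and `rem_a_i` merely mirror `div-a-i` and `rem-a-i` from the figure. The functions `naive_refs` and `interval_refs` are hypothetical helpers introduced only for this example.

```python
def naive_refs(lo, hi, spread):
    """Compute (processor, offset) with one division and one modulo per reference."""
    return [(i // spread, i % spread) for i in range(lo, hi)]


def interval_refs(lo, hi, spread):
    """Compute the same pairs for an interval [lo, hi) that stays on one processor.

    Assumption: the interval does not cross a block boundary, so the quotient
    is an interval invariant and the remainder is an incrementing counter.
    """
    if lo >= hi:
        return []
    assert lo // spread == (hi - 1) // spread, "interval spans several processors"
    div_a_i = lo // spread  # interval invariant, computed once before the loop
    rem_a_i = lo % spread   # starting value of the counter
    refs = []
    for _ in range(lo, hi):
        refs.append((div_a_i, rem_a_i))
        rem_a_i += 1        # the modulo is replaced by an increment
    return refs
```

Inside the loop body the second version performs no division or modulo at all, which is the effect the figure describes for each reference.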

An optimizing compiler could reduce the calculation even further by replacing the multiplication by i-spread in the second reference (j-spread in the third reference) with an addition of i-spread (j-spread) on each iteration. This type of optimization, called strength reduction, is described in Section 3.3.2.
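As a rough illustration of strength reduction on the 2-D offset expression `rem-b-i + (i-spread * rem-b-j)` from Figure 2-9, the sketch below compares a per-iteration multiplication with a running sum. It assumes that `rem-b-j` advances by one on each iteration of the loop in question; the function names and the `j_start` and `count` parameters are hypothetical and exist only for this example.

```python
def offsets_multiply(rem_b_i, i_spread, j_start, count):
    """Offsets computed with one multiplication per iteration."""
    return [rem_b_i + i_spread * rem_b_j
            for rem_b_j in range(j_start, j_start + count)]


def offsets_strength_reduced(rem_b_i, i_spread, j_start, count):
    """Same offsets, with the multiplication replaced by a repeated addition."""
    offset = rem_b_i + i_spread * j_start  # single multiplication, hoisted out of the loop
    offsets = []
    for _ in range(count):
        offsets.append(offset)
        offset += i_spread                 # add the stride instead of multiplying
    return offsets
```

Both functions return the same sequence; the second trades `count` multiplications for one multiplication plus `count` additions.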

In the context of rectangular partitioning, “sufficiently small” interval values are those that keep the loop nest occupied with a single processor. If this condition is met, the processor ID is constant, and the offset into the processor’s memory can be determined with monotonically increasing or decreasing counters. Thus, appropriate intervals can

Number of Operations Per Array Reference

             Before Optimization      After Optimization
Reference    mod    ÷    ×    +       mod    ÷    ×    +
1-D           1     1    0    0        0     0    0    0
2-D           2     2    2    2        0     0    1    1
3-D           3     3    4    4        0     0    2    2

Table 2.3: Reduction in operations for array references by an optimizing compiler.
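The “sufficiently small” interval condition described above can be illustrated in one dimension by splitting a loop range at block boundaries, so that every resulting interval touches a single processor. This is only a sketch under stated assumptions: a block distribution with `spread` elements per processor, a non-negative lower bound, and a hypothetical helper name `split_into_intervals` that does not come from the source.

```python
def split_into_intervals(lo, hi, spread):
    """Split [lo, hi) at multiples of `spread`.

    Each returned (start, end) pair lies within one block, so the processor ID
    start // spread is constant over the whole interval.
    Assumes lo >= 0 and spread > 0.
    """
    intervals = []
    start = lo
    while start < hi:
        # End of the block containing `start`, clipped to the loop bound.
        end = min((start // spread + 1) * spread, hi)
        intervals.append((start, end))
        start = end
    return intervals
```

For example, with `spread = 4` the range `[3, 14)` is cut at 4, 8, and 12, and within each piece the offset into the processor's memory can be tracked with a simple counter.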
