29.11.2012 Views

Compile-time Loop Splitting for Distributed Memory ... - Stanford AI Lab


For example, in the context of the previous 2-D loop nest, a task partition with
blocking (Figure 2-3b) and a data partition with striping by rows (Figure 2-4b) have poor
alignment. Even with optimal placement (virtual processors 0, 1, 2, 3 map to real
processors 3, 2, 1, 0, respectively), only half of a processor's data footprint is in its local
memory. Processor Zero (with virtual PID 3) would require data in both its local memory and
the memory of Processor One (with virtual PID 2). For Processor Zero to reference the
data, it must specify not only the address of an array cell but also the processor whose
memory holds that cell.
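The "only half" figure can be checked with a small sketch. It is only an illustration under stated assumptions: an 8 x 8 array on four processors, each iteration (i, j) touching only cell (i, j), 2 x 2 blocking for tasks, one stripe of two rows per processor for data, and processor p holding stripe p. The function names and the numbering of blocks and stripes are hypothetical and need not match Figures 2-3b and 2-4b.

```python
def block_owner(i, j, n, grid):
    """Task partition by blocking: processor that runs iteration (i, j)."""
    side = n // grid
    return (i // side) * grid + (j // side)


def stripe_owner(i, n, procs):
    """Data partition striped by rows: processor whose memory holds row i."""
    return i // (n // procs)


def local_fraction(n=8, grid=2):
    """Fraction of each processor's data footprint found in its own memory."""
    procs = grid * grid
    touched = [0] * procs
    local = [0] * procs
    for i in range(n):
        for j in range(n):
            p = block_owner(i, j, n, grid)
            touched[p] += 1
            if stripe_owner(i, n, procs) == p:
                local[p] += 1
    return [local[p] / touched[p] for p in range(procs)]


print(local_fraction())
```

Under these assumptions every block spans two row stripes, so half of its cells are local and the other half sit in exactly one other processor's memory.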

Thus, the very definition of a distributed memory multiprocessor leads to
complications in referencing the program's dispersed data. The complication appears in the
form of involved reference expressions, which adversely affect the execution time of a
program. The next section describes these expressions and illustrates the possibilities for
simplification.

2.5 Array Referencing<br />

Methods of array referencing on distributed memory multiprocessors vary; the method
described here is used by the Alewife effort ([A et al.91]) at MIT. This thesis concerns itself
with rectangular task and data partitioning and barrier synchronization of parallel loop
nests, both of which are provided by the Alewife compiler.

Array referencing in Alewife is implemented in software. The general expression to
access an array cell is

    aref (aref (array, pid), offset),

where aref is the array reference procedure, array is the name of the array, pid is the
unique ID of the processor whose memory contains the cell, and offset is the offset into that
memory. The reference procedure aref takes two arguments, a list structure and an offset
into that structure, and returns the element located at that offset. In the
general expression above, the inner aref determines the memory segment in which the cell
resides from both array's dope vector (a list of pointers to memory segments) and the
processor ID. The outer aref uses the memory segment and offset to obtain the actual
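
The two-level lookup described in this section can be sketched as follows. This is a minimal illustration, not Alewife's implementation: Python lists stand in for the dope vector and the memory segments, and the identifiers `aref`, `pid`, `offset`, `dope_vector`, and `read_cell`, as well as the 4 x 4 row-striped layout, are assumptions introduced for the example.

```python
def aref(structure, offset):
    """Return the element of `structure` located at `offset`."""
    return structure[offset]


# Hypothetical layout: a 4 x 4 array striped by rows over 4 processors,
# so each processor's memory segment holds one row of 4 cells.
# Cell values encode their location as 10 * pid + offset.
segments = [[10 * pid + k for k in range(4)] for pid in range(4)]

# The dope vector is modeled as a list of references to the segments.
dope_vector = segments


def read_cell(array, pid, offset):
    """General reference expression: aref(aref(array, pid), offset)."""
    segment = aref(array, pid)    # inner aref: dope vector + processor ID
    return aref(segment, offset)  # outer aref: memory segment + offset


print(read_cell(dope_vector, 2, 3))  # prints 23
```

Every access pays for two lookups plus the computation of the processor ID and the offset, which is the cost that the involved reference expressions add to execution time.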
