The Hierarchically Tiled Arrays Programming Approach

Figure 2: Bottom up tiling.

an assignment to be legal is that once it takes place, the adjacent tiles in the resulting HTA continue to have the same size along each dimension of adjacency, that is, the resulting HTA continues to fulfill the properties of an HTA.

2.4 Execution Model

The machine model for an HTA program is that of a client, which runs the main thread of execution and is connected to a distributed memory machine with an array of processors, called servers, onto which the top-level tiles of the HTAs are mapped. Whenever an operation in the code that the client executes involves the distributed tiles of an HTA, the operation is broadcast from the client to the servers so that they execute it in parallel. When the operation only involves tiles that the server owns, the server performs the computation locally. If, however, the computation requires tiles that the server does not own, it first requests them from the owner servers, and then it performs the computation. Thus, in a program using HTAs the parallelism and the communication are encapsulated in the statements that operate on tiles of one or more HTAs.

While this is the execution model from the point of view of the programmer, and it is the way our current implementation works, HTA programs could also be translated by a compiler into tasks that execute in the nodes of the array of processors, synchronizing and exchanging data when required. This is a perfectly feasible approach that would improve the scalability of this programming approach.

2.5 Construction of HTAs

The simplest way to build an HTA is by providing a source array and a series of delimiters in each dimension where the array should be cut into tiles. For example, if M is a 1000 × 1000 matrix, an HTA resulting from its partitioning in tiles of 100 × 250 elements would be created by the statement:

    A = hta(M, {1:100:1000,1:250:1000});

The triplets within the curly brackets are the partition vectors for each dimension of the source array.
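The effect of the partition vectors in this statement can be sketched in plain MATLAB, without the HTA toolbox. The snippet is only an illustration under stated assumptions: it reads each element of a partition vector as the first index of a tile, and it uses the built-in mat2cell as a stand-in for the tiling. The variable names rowStarts, colStarts, rowSizes, colSizes, and tiles are hypothetical and are not part of the toolbox.

```matlab
% Plain-MATLAB illustration of the tiling described by the partition
% vectors {1:100:1000, 1:250:1000}. This does NOT use the HTA toolbox;
% a cell array stands in for the tiled structure.
M = zeros(1000, 1000);          % source matrix

rowStarts = 1:100:1000;         % first row of each tile: 1, 101, ..., 901
colStarts = 1:250:1000;         % first column of each tile: 1, 251, 501, 751

% Tile extents are the gaps between consecutive start indices; the last
% tile runs to the end of the matrix.
rowSizes = diff([rowStarts, size(M, 1) + 1]);
colSizes = diff([colStarts, size(M, 2) + 1]);

% Cut M into a cell array with one cell per tile.
tiles = mat2cell(M, rowSizes, colSizes);

disp(size(tiles));              % number of tiles per dimension
disp(size(tiles{1, 1}));        % size of a single tile
```

Under these assumptions the matrix is cut into 10 × 4 tiles of 100 × 250 elements each, which agrees with the tile size given in the text.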
The elements in each partition vector specify the hyperplanes that cut the input matrix along the corresponding dimension to distribute it in tiles. The elements in the partition vector mark the beginning of each sub-tile. This constructor can also be used to create HTAs with different levels of tiling using a bottom-up approach. For example, given a 10 × 12 matrix D, the statements

    C = hta(D, {[1,3,5,7,9],[1,4,7,10]});
    B = hta(C, {[1,4],[1,2,3,4]});
    A = hta(B, {[1,2],[1,2]});

will generate the three HTAs shown in Fig. 2. Notice that using this bottom-up approach matrix A can also be created using a single statement

    A = hta(D, {[1,3,5,7,9],[1,4,7,10]}, ...
            {[1,4],[1,2,3,4]}, ...
            {[1,2],[1,2]});

where the ... just means the continuation of the command in the following line.

Finally, it is also possible to build empty HTAs whose tiles are later filled in. To build one, the HTA constructor must be called with the number of desired tiles per dimension. For example, F = hta(3, 3) would generate an empty 3 × 3 HTA F.

Figure 3: Mapping of tiles to processors. The statement F = hta(a, {1:2:6, 1:2:6}, [2,2]) turns matrix a into the distributed HTA F, whose 3 × 3 top-level tiles are assigned to the mesh of processors as P1 P2 P1 / P3 P4 P3 / P1 P2 P1.

The examples discussed above generate non-distributed HTAs, which are located only in the client. Nevertheless, most of the time we will be interested in generating HTAs whose contents are distributed on a mesh of processors, so that we can operate in parallel on their tiles. Our toolbox currently supports a single form of distribution. Namely, it can distribute the top-level tiles of an HTA cyclically on a mesh of processors. This corresponds to a block cyclic distribution of the matrix contained in the HTA, with the blocks defined by the top-level partition. In order to achieve this, the constructor of the HTA needs a last parameter that specifies the dimensions of the mesh by means of a vector. Fig. 3 shows an example where a
