11.07.2015

A Compiler for Parallel Execution of Numerical Python Programs on ...

on the texture unit to bring the data first into a register. Second, the texture units run in parallel to the ALUs. Therefore, to predict performance, the number of cycles required to execute the texture instructions and the number of cycles required to execute ALU instructions are calculated separately. The larger of the two values represents the bottleneck in the computation.

Consider matrix multiplication of matrices of size N*N. The syntax used in the examples is very similar to code accepted by the compiler library, calseum, that I wrote. The syntax corresponds roughly to C-like code. First, consider a naive code. This code launches N*N threads each computing one scalar value. Consider the following pseudo-C code that represents the code being executed by one thread.

float[] i0;
float[] i1;
float o0;
float sum = 0.0;
for(int i=0;i
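
The two ideas above, taking the larger of the texture and ALU cycle counts as the bottleneck and having each of the N*N threads compute one scalar, can be illustrated with a minimal plain-C sketch. It is not calseum syntax and not a reconstruction of the truncated listing: the function names `bottleneck_cycles`, `naive_thread`, and `naive_matmul`, the row-major layout, and the CPU loop standing in for the thread launch are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: texture units and ALUs run in parallel, so the
   estimated cost is the larger of the two separately computed counts. */
long bottleneck_cycles(long texture_cycles, long alu_cycles)
{
    return texture_cycles > alu_cycles ? texture_cycles : alu_cycles;
}

/* Work done by one "thread" of a naive N*N multiply: the dot product of
   row `row` of a with column `col` of b. Row-major storage is assumed. */
float naive_thread(const float *a, const float *b, size_t n,
                   size_t row, size_t col)
{
    assert(row < n && col < n);
    float sum = 0.0f;
    for (size_t k = 0; k < n; ++k) {
        /* Two loads and one multiply-add per iteration. */
        sum += a[row * n + k] * b[k * n + col];
    }
    return sum;
}

/* CPU stand-in for launching N*N threads: one call per output element. */
void naive_matmul(const float *a, const float *b, float *c, size_t n)
{
    for (size_t row = 0; row < n; ++row) {
        for (size_t col = 0; col < n; ++col) {
            c[row * n + col] = naive_thread(a, b, n, row, col);
        }
    }
}
```

In this sketch each thread performs 2N loads and N multiply-adds, which is why the load (texture) side and the arithmetic (ALU) side are counted separately before taking the maximum.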
