11.07.2015 Views

A Compiler for Parallel Execution of Numerical Python Programs on ...


Chapter 9

Conclusions

This thesis introduced a new programming model for more efficient programming of numerical programs in Python for execution on GPUs. The thesis also described the design and implementation of a compiling system to convert numerical Python programs annotated with type and parallel loop annotations to multi-cores and GPUs. In this new programming model, a programmer writes code for a simple shared-memory abstraction and the compiler automatically converts the program to use a GPU as an accelerator. The program remains portable to multicores and GPUs with no code changes.

The compiler system consists of unPython, an ahead-of-time compiler and jit4GPU, a just-in-time compiler. Jit4GPU implements a new algorithm to analyze the regions of memory accessed by an array reference in a loop nest. The algorithm is restricted to a class of affine accesses termed as RCSLMADs. Jit4GPU automatically transfers the required data for the computation between the CPU and the GPU based on the results of the array access algorithms. Jit4GPU is not a general-purpose JIT compiler and only works on numerical programs represented as parallel loop nests with array accesses representable as RCSLMADs. Jit4GPU generates GPU code from a typed abstract-syntax-tree (AST) representation of the Python program generated by unPython. Jit4GPU also performs several loop optimizations such as loop unrolling and memory load coalescing.

The performance evaluation used several numerical kernels. On some kernels, Jit4GPU performs over 100 times faster than OpenMP code generated by unPython. Jit4GPU also delivers better performance than some highly tuned CPU libraries, such as ATLAS, without requiring the programmer to do any optimizations such as unrolling or tiling in the original Python source. Compilers, such as Jit4GPU, allow the programmer to easily utilize the computational power of modern GPUs for general purpose computation.
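
The idea of analyzing the region of memory touched by an affine array reference in a loop nest can be illustrated with a small sketch. This is not the RCSLMAD algorithm from the thesis; it is a generic bounding-interval computation for an access of the form base + sum(stride_k * i_k), and the function names `affine_bounds` and `enumerate_offsets` are hypothetical, introduced only for this illustration. Under these assumptions, the interval tells a compiler which contiguous range of elements would have to be copied to the accelerator before the loop nest runs.

```python
from itertools import product


def affine_bounds(base, strides, extents):
    """Inclusive (lo, hi) element offsets touched by base + sum(s_k * i_k).

    Each loop index i_k is assumed to run over range(extents[k]).
    Hypothetical helper for illustration, not part of unPython or jit4GPU.
    """
    lo = hi = base
    for stride, extent in zip(strides, extents):
        if extent <= 0:
            raise ValueError("loop extents must be positive")
        # Largest displacement this loop can contribute.
        span = stride * (extent - 1)
        if span >= 0:
            hi += span
        else:
            lo += span
    return lo, hi


def enumerate_offsets(base, strides, extents):
    """Brute-force set of touched offsets, used only to check the bounds."""
    return {
        base + sum(s * i for s, i in zip(strides, idx))
        for idx in product(*(range(n) for n in extents))
    }


# Access A[i + 1][j] on a row-major array with 8 columns,
# for i in range(3) and j in range(4): offset = 8 + 8*i + 1*j.
lo, hi = affine_bounds(8, (8, 1), (3, 4))
touched = enumerate_offsets(8, (8, 1), (3, 4))
```

In this example the interval covers 20 element offsets while only 12 are actually touched, which shows why a more precise descriptor than a single bounding interval can reduce the amount of data transferred.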
