05.08.2012 Views

Par4all: Auto-Parallelizing C and Fortran for the CUDA Architecture

Par4all: Auto-Parallelizing C and Fortran for the CUDA Architecture

Par4all: Auto-Parallelizing C and Fortran for the CUDA Architecture

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

•Conclusion ◮<br />

(I)<br />

• BaC99k to C99imple C99ings<br />

• Avoiding using C99umbersome old C C99onstruC99ts C99an lead<br />

to C99leaner C99ode, more effiC99ient <strong>and</strong> more parallelizable<br />

C99ode with Par4All<br />

• C99 adds C99ympaC99etiC99 features that are un<strong>for</strong>tunately not<br />

well known:<br />

◮ MultidimenC99ional arrays with non-statiC99c size<br />

� Avoid malloc() <strong>and</strong> useless pointer C 99onstruC 99ions<br />

� You C 99an have arrays with a dynamiC 99 size in funC 99tion<br />

parameters (as in <strong>Fortran</strong>)<br />

� Avoid useless linearizing C 99omputaC 99ions (a[i][j] instead of<br />

a[i+n*j]...)<br />

� Avoid most of alloca()<br />

◮ TLS (Thread-Local Storage) ExtenC99ion C99an express<br />

independent storage<br />

◮ C99omplex numbers <strong>and</strong> booleans avoid fur<strong>the</strong>r struC99tures or<br />

enums<br />

�Par4All in <strong>CUDA</strong> — GPU conference 10/1/2009<br />

HPC Project, Mines ParisTech, TÉLÉCOM Bretagne, RPI Ronan KERYELL et al. 39 / 46

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!