Par4all: Auto-Parallelizing C and Fortran for the CUDA Architecture

More documents

Recommendations

Info

•CUDA generation ◮ (II) 1 i n t main( i n t argc, char *argv[]) { [...] 3 float_t (*p4a_var_space)[SIZE][SIZE]; P4A_ACCEL_MALLOC(&p4a_var_space , s i z e o f(space)); 5 P4A_COPY_TO_ACCEL(space, p4a_var_space , s i z e o f(space)); 7 float_t (*p4a_var_save)[SIZE][SIZE]; P4A_ACCEL_MALLOC(&p4a_var_save , s i z e o f(save)); 9 P4A_COPY_TO_ACCEL(save, p4a_var_save , s i z e o f(save)); 11 P4A_ACCEL_TIMER_START; 13 f o r(t = 0; t < T; t++) compute(*p4a_var_space , *p4a_var_save); 15 double execution_time = P4A_ACCEL_TIMER_STOP_AND_FLOAT_MEASURE () 17 P4A_COPY_FROM_ACCEL(space, p4a_var_space , s i z e o f(space)); 19 P4A_ACCEL_FREE(p4a_var_space); P4A_ACCEL_FREE(p4a_var_save); �Par4All in CUDA — GPU conference 10/1/2009 HPC Project, Mines ParisTech, TÉLÉCOM Bretagne, RPI Ronan KERYELL et al. 29 / 46
•CUDA generation ◮ (III) 21 [...] } 23 void compute(float_t space[SIZE][SIZE], float_t save[SIZE][SIZE]) { 25 [...] p4a_kernel_launcher_0(space, save); 27 [...] } 29 void p4a_kernel_launcher_0(float_t space[SIZE][SIZE], float_t save[SIZE][SIZE]) { 31 P4A_CALL_ACCEL_KERNEL_2D(p4a_kernel_wrapper_0 , SIZE, SIZE, space, save); 33 } P4A_ACCEL_KERNEL_WRAPPER void 35 p4a_kernel_wrapper_0(float_t space[SIZE][SIZE], float_t save[SIZE][SIZE]) { 37 i n t j; i n t i; 39 i = P4A_VP_0; j = P4A_VP_1; �Par4All in CUDA — GPU conference 10/1/2009 HPC Project, Mines ParisTech, TÉLÉCOM Bretagne, RPI Ronan KERYELL et al. 30 / 46
Page 1 and 2: • ◮ Par4All: Auto-Parallelizing
Page 3 and 4: •Introduction ◮ • Parallelize
Page 5 and 6: •Par4All ◮ 1 Par4All 2 CUDA gen
Page 7 and 8: •Par4All ◮ (I) • PIPS (Interp
Page 9 and 10: •Par4All ◮ • Automatic parall
Page 11 and 12: •CUDA generation ◮ 1 Par4All 2
Page 13 and 14: •CUDA generation ◮ • Find par
Page 15 and 16: •CUDA generation ◮ (I) Parallel
Page 17 and 18: •CUDA generation ◮ (I) • Memo
Page 19 and 20: •CUDA generation ◮ (I) Conserva
Page 21 and 22: •CUDA generation ◮ • Hardware
Page 23 and 24: •CUDA generation ◮ (II) • So
Page 25 and 26: •CUDA generation ◮ • Reductio
Page 27 and 28: •CUDA generation ◮ (II) • For
Page 29: •CUDA generation ◮ (I) • CUDA
Page 33 and 34: •Results ◮ 1 Par4All 2 CUDA gen
Page 35 and 36: •Results ◮ (I) Average time in
Page 37 and 38: •Results ◮ (I) Average time in
Page 39 and 40: •Conclusion ◮ 1 Par4All 2 CUDA
Page 41 and 42: •Conclusion ◮ Why open source?
Page 43 and 44: •Conclusion ◮ (II) • We have
Page 45 and 46: •Conclusion ◮ (II) • Capitali
Page 47 and 48: •Conclusion ◮ • HPC Project

Par4all: Auto-Parallelizing C and Fortran for the CUDA Architecture

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?