Par4all: Auto-Parallelizing C and Fortran for the CUDA Architecture

More documents

Recommendations

Info

•Conclusion ◮ (I) • GPU (and other heterogeneous accelerators): impressive peak performances and memory bandwidth, power efficient • Looks like to the 80’s : many accelerators, vectors, parallel computers ◮ Need to port the programs to new architectures ◮ Many start-ups • ...but some hypothesis changed, hardware constraints � we can no longer escape parallelism! No longer a niche market! • Programming is quite more complex... � • Non-functional specifications (speed, time, power, parallelism...) not taken into account by main-stream programming environment and teaching (high-level object programming...) • And programs are quite bigger now! � • Application source code evolves slowly compared to hardware �Par4All in CUDA — GPU conference 10/1/2009 HPC Project, Mines ParisTech, TÉLÉCOM Bretagne, RPI Ronan KERYELL et al. 43 / 46
•Conclusion ◮ (II) • Capitalize on this with a source-to-source tool! • Before addressing parallelism, may be interesting to optimize the sequential program (algorithmic, coding, right tools...) • Take a positive attitude...Parallelization is a good opportunity for deep cleaning (refactoring, modernization...) � improve also the original code • A cleaner code is often easier to parallelize automatically or manually • Do not forget simple precision exist too... � • Entry cost • Exit cost! � • Par4All initiative aims at minimizing entry cost and exit cost ◮ Use source-to-source compilers to surf on the best back-ends (CUDA for nVidia GPU) ◮ Avoid manual work with preprocessing, tools, code generators... �Par4All in CUDA — GPU conference 10/1/2009 HPC Project, Mines ParisTech, TÉLÉCOM Bretagne, RPI Ronan KERYELL et al. 44 / 46
Page 1 and 2: • ◮ Par4All: Auto-Parallelizing
Page 3 and 4: •Introduction ◮ • Parallelize
Page 5 and 6: •Par4All ◮ 1 Par4All 2 CUDA gen
Page 7 and 8: •Par4All ◮ (I) • PIPS (Interp
Page 9 and 10: •Par4All ◮ • Automatic parall
Page 11 and 12: •CUDA generation ◮ 1 Par4All 2
Page 13 and 14: •CUDA generation ◮ • Find par
Page 15 and 16: •CUDA generation ◮ (I) Parallel
Page 17 and 18: •CUDA generation ◮ (I) • Memo
Page 19 and 20: •CUDA generation ◮ (I) Conserva
Page 21 and 22: •CUDA generation ◮ • Hardware
Page 23 and 24: •CUDA generation ◮ (II) • So
Page 25 and 26: •CUDA generation ◮ • Reductio
Page 27 and 28: •CUDA generation ◮ (II) • For
Page 29 and 30: •CUDA generation ◮ (I) • CUDA
Page 31 and 32: •CUDA generation ◮ (III) 21 [..
Page 33 and 34: •Results ◮ 1 Par4All 2 CUDA gen
Page 35 and 36: •Results ◮ (I) Average time in
Page 37 and 38: •Results ◮ (I) Average time in
Page 39 and 40: •Conclusion ◮ 1 Par4All 2 CUDA
Page 41 and 42: •Conclusion ◮ Why open source?
Page 43: •Conclusion ◮ (II) • We have
Page 47 and 48: •Conclusion ◮ • HPC Project

Par4all: Auto-Parallelizing C and Fortran for the CUDA Architecture

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?