11.07.2015 Views

A Compiler for Parallel Execution of Numerical Python Programs on ...


Table 7.3: Execution time for matrix multiplication benchmark for 64-bit floating point (seconds)

Problem Size                   1024     2048     3072     4096
OpenMP 4 threads    min        8.55    109.5   274.85  1406.82
                    max        8.68    110.4    275.8   1409.5
                    mean       8.59    109.8    275.1   1408.0
ATLAS BLAS          min       0.244     1.34     4.38    11.62
                    max       0.247     1.36     4.47    11.70
                    mean      0.245     1.35     4.41    11.64
GPU Total (Opt)     min       0.147    0.679     3.71     5.34
                    max        0.16    0.695     3.82     5.53
                    mean       0.15    0.685     3.78     5.39
GPU Only (Opt)      min       0.078    0.513     2.34     3.84
                    max        0.09    0.543     2.44     4.06
                    mean      0.081    0.523     2.38     3.91
GPU Total (No opt)  min       0.227     1.42     6.21    11.26
                    max        0.24     1.53     6.52    11.73
                    mean      0.232     1.47     6.33     11.4
GPU Only (No opt)   min       0.164    1.249     4.87    10.13
                    max       0.177     1.33    5.182    10.66
                    mean      0.168     1.27     4.96    10.31

Table 7.4: Speedups for matrix multiplication using GPU for 64-bit floating point over ATLAS

Problem Size   Speedup (No Opt)   Speedup (Opt)
1024                       1.07           1.659
2048                       0.95            1.97
3072                        0.7            1.18
4096                       1.03            2.17
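The relationship between the two tables can be sketched in a few lines of Python. The excerpt does not say which rows of Table 7.3 were used to derive Table 7.4, so this sketch assumes the speedup is the ATLAS minimum time divided by the "GPU Total" minimum time. Under that assumption the ratios come close to the Table 7.4 figures, though not every entry matches to the last digit. The names `ATLAS_MIN`, `GPU_TOTAL_OPT_MIN`, `GPU_TOTAL_NO_OPT_MIN` and `speedups` are illustrative and do not come from the source.

```python
# Minimum execution times in seconds, copied from Table 7.3.
SIZES = [1024, 2048, 3072, 4096]
ATLAS_MIN = [0.244, 1.34, 4.38, 11.62]
GPU_TOTAL_OPT_MIN = [0.147, 0.679, 3.71, 5.34]
GPU_TOTAL_NO_OPT_MIN = [0.227, 1.42, 6.21, 11.26]


def speedups(baseline, candidate):
    """Return the baseline/candidate time ratio for each problem size.

    A value above 1 means the candidate is faster than the baseline.
    """
    return [b / c for b, c in zip(baseline, candidate)]


# Assumption: Table 7.4 compares ATLAS against the "GPU Total" rows,
# which include more than the kernel time reported as "GPU Only".
no_opt = speedups(ATLAS_MIN, GPU_TOTAL_NO_OPT_MIN)
opt = speedups(ATLAS_MIN, GPU_TOTAL_OPT_MIN)

for n, s_no, s_opt in zip(SIZES, no_opt, opt):
    print(f"{n:>5}  no-opt {s_no:5.3f}  opt {s_opt:5.3f}")
```

Swapping in the mean rows, or the "GPU Only" rows, gives a quick way to check which combination best reproduces the published speedups.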
