29.06.2013 Views


A GPU/CUDA Based Monte Carlo Code for Proton Transport: Preliminary Results of Proton Depth Dose in Water

Introduction and Innovation

Monte Carlo (MC) methods provide the most accurate calculation of radiation dose absorbed in human organs or tissues. However, simulations traditionally performed on the central processing unit (CPU) are usually very time-consuming, which creates difficulties for routine clinical applications. The graphics processing unit (GPU), on the other hand, has recently emerged as a powerful and affordable tool for high-performance parallel computing. GPUs provide extensive thread-level parallelism and high energy efficiency. Compute Unified Device Architecture (CUDA) [11] is a parallel computing architecture developed by NVIDIA that greatly facilitates GPU programming. Recently, a number of GPU/CUDA-based Monte Carlo codes for medical physics applications have been reported, including those of Badal et al. for x-ray radiography [1] [2], and Hissoiny et al. [3] and Jia et al. [4] [5] for photon treatment planning. As reviewed by Pratx and Xing [6], there has been limited or no reported effort to develop such a code for proton radiotherapy, which is rapidly gaining popularity. The goal of this project was to develop a 3D proton transport code for dose calculations. This paper describes the methods and preliminary results, including a comparison of timing on the CPU and GPU platforms. The innovation is that this is one of the first attempts to use GPUs to accelerate Monte Carlo simulation for proton dose calculations.

Methods

In this study, protons were transported in a 3D homogeneous material (water) and the energy deposition was tallied at different depths. The energy-loss fluctuations due to Coulomb collisions with atomic electrons and the multiple-scattering deflections due to elastic scattering by atoms were included. Nuclear reactions, which contribute less than 10% of the total dose, were not yet incorporated. Secondary electrons were assumed to deposit their energy locally. The condensed history method [7] was used in the proton transport simulation. In this method, the effect of the extremely large number of individual interactions is grouped into a small number of condensed steps. The angular deflection in one step was sampled from the Moliere distribution [8], and the energy loss was sampled from the Vavilov distribution [9]. The position change of a proton after one step follows the expressions given by Berger [7].
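To make the bookkeeping of a single condensed step concrete, the following C++ fragment is a minimal sketch and not the code used in this study. The `Proton` struct, the function names, and the depth-bin layout are assumptions introduced for illustration. The step length, energy loss, and deflection angles are plain inputs here, whereas in the actual code they would come from the pre-calculated step data and the sampled distributions. The sketch also moves the proton along a straight line and omits the position corrections of Berger [7].

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical proton state; the field names are illustrative only.
struct Proton {
    double x, y, z;   // position (cm)
    double u, v, w;   // unit direction cosines
    double energy;    // kinetic energy (MeV)
};

// Rotate the flight direction by polar angle theta and azimuth phi,
// both measured relative to the current direction.
void deflect(Proton& p, double theta, double phi) {
    const double ct = std::cos(theta), st = std::sin(theta);
    const double cp = std::cos(phi), sp = std::sin(phi);
    const double s2 = 1.0 - p.w * p.w;
    if (s2 > 1e-12) {
        const double s = std::sqrt(s2);
        const double u = p.u * ct + st * (p.u * p.w * cp - p.v * sp) / s;
        const double v = p.v * ct + st * (p.v * p.w * cp + p.u * sp) / s;
        const double w = p.w * ct - s * st * cp;
        p.u = u; p.v = v; p.w = w;
    } else {
        // Direction is (anti)parallel to the z axis: use the limiting form.
        const double sign = (p.w > 0.0) ? 1.0 : -1.0;
        p.u = st * cp; p.v = st * sp; p.w = sign * ct;
    }
}

// One condensed step: move along the current direction, deposit the energy
// loss locally in the depth bin containing the step midpoint, then deflect.
void condensedStep(Proton& p, double stepLength, double energyLoss,
                   double theta, double phi,
                   std::vector<double>& depthDose, double binWidth) {
    assert(binWidth > 0.0);
    const double zMid = p.z + 0.5 * stepLength * p.w;
    p.x += stepLength * p.u;
    p.y += stepLength * p.v;
    p.z += stepLength * p.w;
    const double loss = std::fmin(energyLoss, p.energy);  // never below zero
    p.energy -= loss;
    if (zMid >= 0.0) {
        const std::size_t bin = static_cast<std::size_t>(zMid / binWidth);
        if (bin < depthDose.size()) depthDose[bin] += loss;
    }
    deflect(p, theta, phi);
}
```

Scoring the loss at the step midpoint is one simple choice for a depth tally; the deposit could equally be spread over the bins that the step crosses.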

The path length of each condensed step was pre-calculated using the continuous slowing down approximation (CSDA). The expected energy loss in each step was chosen as either a constant value or a fraction (several percent) of the initial energy, whichever is smaller. The Moliere and Vavilov distributions and their parameters were evaluated in advance for each step. Cubic spline interpolation was used to obtain parameters from the pre-calculated tables. Sampling from the distributions was based on the alias sampling method [10].
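For readers unfamiliar with alias sampling, the C++ fragment below is a minimal sketch of the technique for a discrete (tabulated) distribution. It uses the common stack-based construction of the alias table, which may differ in detail from the procedure in [10] and from the implementation used in this study. The names `AliasTable`, `buildAliasTable`, and `sampleAlias` are assumptions introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Alias table for a discrete distribution with n outcomes.
struct AliasTable {
    std::vector<double> prob;        // probability of keeping column i
    std::vector<std::size_t> alias;  // outcome returned otherwise
};

// Build the table from non-negative weights (need not be normalized).
AliasTable buildAliasTable(const std::vector<double>& weights) {
    const std::size_t n = weights.size();
    assert(n > 0);
    double total = 0.0;
    for (double w : weights) { assert(w >= 0.0); total += w; }
    assert(total > 0.0);

    AliasTable t{std::vector<double>(n, 1.0), std::vector<std::size_t>(n)};
    std::vector<double> scaled(n);
    std::vector<std::size_t> small, large;
    for (std::size_t i = 0; i < n; ++i) {
        t.alias[i] = i;
        scaled[i] = weights[i] * n / total;  // mean of scaled values is 1
        (scaled[i] < 1.0 ? small : large).push_back(i);
    }
    // Pair each under-full column with an over-full one.
    while (!small.empty() && !large.empty()) {
        const std::size_t s = small.back(); small.pop_back();
        const std::size_t l = large.back(); large.pop_back();
        t.prob[s] = scaled[s];
        t.alias[s] = l;
        scaled[l] -= 1.0 - scaled[s];
        (scaled[l] < 1.0 ? small : large).push_back(l);
    }
    // Columns left over (including round-off cases) keep prob = 1.
    return t;
}

// Draw one outcome in constant time using a single uniform random number.
template <class Rng>
std::size_t sampleAlias(const AliasTable& t, Rng& rng) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    const double r = unif(rng) * t.prob.size();
    std::size_t column = static_cast<std::size_t>(r);
    if (column >= t.prob.size()) column = t.prob.size() - 1;  // guard r == n
    const double frac = r - column;
    return frac < t.prob[column] ? column : t.alias[column];
}
```

The appeal for a GPU code is that each sample costs one table lookup and one comparison regardless of the table size, so threads sampling from the same table do similar amounts of work.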

The GPU used in this study is an NVIDIA Tesla M2090, and the CPU is an Intel Xeon X5660 at 2.80 GHz with 9.00 GB of RAM. The CPU code was written in C++ and the GPU code was developed in CUDA C 4.0 [11], both in the Microsoft Visual Studio 2010 IDE. The CUDA scalable programming model has a thread hierarchy for parallel computation. The lowest level is the thread; in our code, each thread was designed to simulate a fixed number of protons. In our case, 512 threads were grouped together to form a block. Within a block, threads are scheduled in warps of 32 threads, and the threads of a warp execute one common instruction at a time in parallel, making the calculation faster than a typical serial CPU program. Thread blocks are executed independently, which allows the work to scale across a very large number of CUDA cores.
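The arithmetic behind this thread hierarchy can be sketched in host-side C++ as follows. This is an illustrative sketch, not the launch code used in this study. The block size of 512 and the warp size of 32 come from the text, while the function names and the number of protons per thread are assumptions. The flat thread index mirrors the usual CUDA expression `blockIdx.x * blockDim.x + threadIdx.x`.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kThreadsPerBlock = 512;  // block size quoted in the text
constexpr std::size_t kWarpSize = 32;          // warp size quoted in the text

struct LaunchConfig {
    std::size_t blocks;           // number of thread blocks to launch
    std::size_t threadsPerBlock;  // threads in each block
    std::size_t warpsPerBlock;    // warps scheduled within each block
};

// Number of blocks needed so that every history is assigned to some thread.
// protonsPerThread is a hypothetical tuning parameter.
LaunchConfig planLaunch(std::size_t histories, std::size_t protonsPerThread) {
    assert(protonsPerThread > 0);
    const std::size_t perBlock = kThreadsPerBlock * protonsPerThread;
    const std::size_t blocks = (histories + perBlock - 1) / perBlock;  // ceil
    return {blocks, kThreadsPerBlock, kThreadsPerBlock / kWarpSize};
}

// Half-open range [first, last) of history indices handled by one thread.
struct HistoryRange { std::size_t first, last; };

HistoryRange historiesForThread(std::size_t block, std::size_t thread,
                                std::size_t protonsPerThread,
                                std::size_t histories) {
    const std::size_t globalThread = block * kThreadsPerBlock + thread;
    std::size_t first = globalThread * protonsPerThread;
    std::size_t last = first + protonsPerThread;
    if (first > histories) first = histories;  // thread has no work
    if (last > histories) last = histories;    // trim the final partial range
    return {first, last};
}
```

For example, with 1 million histories and an assumed 100 protons per thread, `planLaunch` gives 20 blocks of 512 threads (16 warps each), and the threads at the end of the last block receive an empty range.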

Results

The proton dose distributions in water were calculated for energies of 40 MeV, 80 MeV, 100 MeV, 160 MeV, and 200 MeV. Our CPU and GPU codes yield almost identical results (less than 0.1% difference). The relative error is within 1% for 98% of the tally depths. The results were compared with the extensively benchmarked general-purpose Monte Carlo codes MCNPX 2.5.0 [12] and GEANT4.8.2 [13]. Figure 1 compares the 200 MeV proton pencil-beam depth dose distribution in water between our code and GEANT4.8.2 (nuclear interactions turned off). The speedup factor, defined here as the ratio of the number of particles simulated per second by the GPU to that by the CPU, is about 57. For the simulation of 1 million proton transport histories in water, our GPU code is ~620 times faster than MCNPX 2.5.0 (nuclear interactions on).
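The speedup definition above can be written as a formula. The symbols are introduced here for illustration only: N is the number of proton histories simulated and t is the wall-clock time on the respective platform.

```latex
\[
S \;=\; \frac{N_{\mathrm{GPU}} / t_{\mathrm{GPU}}}{N_{\mathrm{CPU}} / t_{\mathrm{CPU}}} \;\approx\; 57
\]
```

When both platforms simulate the same number of histories, this reduces to the ratio of run times, S = t_CPU / t_GPU.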

Conclusion

The goal of this project was to demonstrate that GPUs can be used to accelerate Monte Carlo calculations for protons. A proton transport code was developed to calculate the dose distribution in a water phantom. The code was first developed for the CPU and then ported to the GPU/CUDA platform for timing comparison. The dose results were benchmarked against those from the Geant4 code. A speedup of 57 was observed for the GPU/CUDA version in comparison with the identical CPU code. To the best of our knowledge, this is one of the first efforts to successfully demonstrate a GPU/CUDA-based proton transport MC code. It does not yet include nuclear interactions, and the protons were only transported in a homogeneous water phantom. Despite these limitations, we were able to demonstrate a significant reduction in computing time, which suggests a promising future for using such extremely fast Monte Carlo tools in routine treatment planning and verification.

Figure 1: 200 MeV pencil-beam proton depth dose in water; our GPU result compared with Geant4 (nuclear interactions turned off).

References

[1] Badal, A. and Badano, A., “Accelerating Monte Carlo simulations of photon transport in a voxelized geometry using a massively parallel GPU”, Med. Phys. 36, pp. 4878-4880 (2009)
[2] Badal, A. and Badano, A., “Fast and accurate estimation of organ doses in medical imaging using a GPU-accelerated Monte Carlo simulation code”, AAPM 2011 annual meeting
[3] Hissoiny, S. and Ozell, B., “GPUMCD: A new GPU-oriented Monte Carlo dose calculation platform”, Med. Phys. 38, 754 (2011)
[4] Jia, X. et al., “Development of a GPU-based Monte Carlo dose calculation code for coupled electron–photon transport”, Phys. Med. Biol. 55 (2010) 3077
[5] Jia, X. et al., “GPU-based fast Monte Carlo simulation for radiotherapy dose calculation”, Phys. Med. Biol. 56 (2011) 7017–7031
[6] Pratx, G. and Xing, L., “GPU computing in medical physics: a review”, Med. Phys. 38(5):2685-97 (2011)
[7] Berger, M. J., “Monte Carlo Calculation of the Penetration and Diffusion of Fast Charged Particles”, in: Methods in Computational Physics, Vol. I, Academic Press, New York, 1963, p. 135
[8] Bethe, H. A., “Moliere's Theory of Multiple Scattering”, Phys. Rev. 89 (1953), pp. 1256-1266
[9] Vavilov, P. V., “Ionizational Losses of High Energy Heavy Particle”, J. of Phys USSR, 32 (1957), 4, pp. 920-923
[10] Kronmal, R. A. and Peterson, Jr., A. V., “On the alias method for generating random variables”, The American Statistician 33 (1979), 214
[11] NVIDIA, “NVIDIA CUDA C Programming Guide”, V4.0, 2011
[12] Pelowitz, D. B., MCNPX User’s Manual Version 2.5.0, Los Alamos National Laboratory Report LA-CP-05–0369 (2005)
[13] Agostinelli, S. et al., “Geant4 - a simulation toolkit”, Nuclear Instruments and Methods in Physics Research A 506 (2003) 250-303
