2000 PROGRESS REPORT - ENEA - Fusione

1. Magnetic Confinement

When distributing the workload among the processors of a shared-memory node, the choice between particle and domain decomposition does not correspond to a choice between high-level and low-level languages. Indeed, even in a domain decomposition approach, particle migration from one processor to another does not require any communication at all, and a high-level parallel programming language, such as OpenMP, can still be used. The choice between the two alternatives can then be made on the basis of other considerations.

Here we present the results of our experience in porting a specific PIC code to high-level languages, namely the HMGC code developed in Frascati, which includes all the main features of the PIC codes used to investigate magnetically confined plasmas in toroidal devices. While a particle decomposition approach (implemented in HPF) was fixed for the distributed-memory decomposition (among different nodes), both particle and domain decomposition techniques (implemented in OpenMP) were tested for the shared-memory (intra-node) decomposition. Comparing the two approaches, a trade-off emerges between low memory requirements and high parallelization efficiency, which makes one method preferable to the other depending on the practical constraints the user has to face.

The first strategy used for intra-node workload decomposition is particle decomposition, which distributes the iterations of the loop over the particles among the processors of the node. Several particles, assigned to different processors, can contribute to the pressure at the same grid point, thus giving rise to a race condition. OpenMP allows such sections to be protected by declaring them critical sections, but in practice this amounts to a serial execution of those portions of the code.
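The serializing effect of protecting the grid update can be illustrated with a minimal sketch. It is not taken from HMGC: the function names, the one-dimensional grid and the use of a Python lock as a stand-in for an OpenMP critical section are all assumptions made for illustration.

```python
import threading


def deposit_serial(cells, weights, n_cells):
    """Reference: accumulate each particle's weight on its grid cell."""
    grid = [0.0] * n_cells
    for c, w in zip(cells, weights):
        grid[c] += w
    return grid


def deposit_with_lock(cells, weights, n_cells, n_workers):
    """Particle decomposition (hypothetical sketch): each worker handles a
    contiguous slice of the particle list and updates one shared grid.
    The lock plays the role of a critical section."""
    grid = [0.0] * n_cells
    lock = threading.Lock()
    n = len(cells)
    # Chunk boundaries for the particle loop, one chunk per worker.
    bounds = [n * w // n_workers for w in range(n_workers + 1)]

    def work(start, stop):
        for p in range(start, stop):
            # Particles of different workers may hit the same cell, so the
            # read-modify-write is protected; only one worker at a time
            # can execute it, which serializes the deposition.
            with lock:
                grid[cells[p]] += weights[p]

    threads = [
        threading.Thread(target=work, args=(bounds[w], bounds[w + 1]))
        for w in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return grid
```

Because every grid update sits inside the protected region, the workers take turns on the part of the loop that dominates the cost, which is the behaviour described above for critical sections.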
The introduction of an auxiliary (private) grid array, owned by each processor and representing a partial, local pressure to be summed over all the processors at the end of the particle loop, can overcome the memory conflicts at the expense of an increase in the memory occupation of the code. Figure 1.44 shows the speed-up (namely, the ratio between the execution time of the serial code and that of the parallel code) of the most demanding section of the PIC code, viz. the pressure-updating phase, vs the number of processors n_proc at a fixed number, n_node = 2, of 8-processor nodes for this version of the code (ν1), which uses particle decomposition for both the inter-node and the intra-node workload decomposition. Four different values of the average number of particles per cell (from N_ppc = 4 to N_ppc = 256) have been considered, with N_cell = 4096 cells. Figure 1.44 shows that the speed-up values depart from linear scaling with n_proc only for n_proc greater than a certain value, which increases with the average number of particles per cell, N_ppc.

Fig. 1.44 - Speed-up of the pressure-updating phase vs the number of processors, at fixed number of (8-processor) nodes, n_node = 2, for the particle decomposition version, ν1. Four different values of the average number of particles per cell (from N_ppc = 4 to N_ppc = 256) have been considered

The second version considered here (ν2a) couples the inter-node particle decomposition with the intra-node domain decomposition. It does not require the auxiliary grid array introduced in version ν1, but it implies a heavier restructuring of the code and, possibly, addressing load-balancing problems. It consists of reordering the particle population according to the portion of the domain in which each particle resides, and assigning a different portion to each processor. Such a reordering gives rise, once again, to the risk of race conditions (the particles belonging to a certain domain portion have to be counted within a particle loop, and the updating of the counter is a critical operation). Once the particles have been assigned to the processors, however, no further race condition occurs in updating the pressure array elements, as loop iterations that could, in principle, contribute to the updating of the same element are executed by the same processor.

Figure 1.45 shows the scaling of the speed-up with n_proc, at a fixed number (n_node = 2) of 8-processor nodes, obtained with this version.

Fig. 1.45 - Speed-up vs the number of processors, at fixed number of (8-processor) nodes, n_node = 2, for the domain decomposition version, ν2a

The speed-up values obtained with version ν2a do not appear to be very satisfactory. In fact, several operations that are absent in the intra-node particle decomposition version ν1 have to be performed in any case: namely, the identification of the domain portion into which each particle falls, a loop to balance the particles over the processors, and a reordering loop.
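The two intra-node strategies discussed so far can be contrasted in a short sketch. It is only an illustration under stated assumptions: the function names, the one-dimensional grid, the equal-size domain portions and the serial reordering step are hypothetical and are not taken from HMGC. In CPython the threads do not give a real speed-up, so the sketch shows data ownership and memory use, not performance.

```python
from concurrent.futures import ThreadPoolExecutor


def split(n, n_workers):
    """Boundaries that cut range(n) into n_workers contiguous chunks."""
    return [n * w // n_workers for w in range(n_workers + 1)]


def deposit_private_grids(cells, weights, n_cells, n_workers):
    """Particle decomposition with one private grid per worker.
    No conflicts, but n_workers * n_cells extra grid entries are needed."""
    b = split(len(cells), n_workers)

    def work(w):
        private = [0.0] * n_cells          # auxiliary grid owned by worker w
        for p in range(b[w], b[w + 1]):
            private[cells[p]] += weights[p]
        return private

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(work, range(n_workers)))
    # Final reduction: sum the partial grids cell by cell.
    return [sum(col) for col in zip(*partials)]


def deposit_domain_sorted(cells, weights, n_cells, n_workers):
    """Domain decomposition: particles are grouped by the grid portion they
    reside in, and each worker updates only its own portion of one grid."""
    b = split(n_cells, n_workers)
    owner = [0] * n_cells
    for w in range(n_workers):
        for c in range(b[w], b[w + 1]):
            owner[c] = w
    # Reordering phase, done serially here. In a threaded version the
    # per-portion counters would be the shared data needing protection.
    # Equal-size portions do not guarantee balanced particle counts.
    buckets = [[] for _ in range(n_workers)]
    for p, c in enumerate(cells):
        buckets[owner[c]].append(p)

    grid = [0.0] * n_cells                 # single shared grid, no copies

    def work(w):
        for p in buckets[w]:
            grid[cells[p]] += weights[p]   # cells of portion w only

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(work, range(n_workers)))
    return grid
```

Both functions return the same grid. The first pays with one grid copy per worker, the second with the extra pass that identifies the owner of each particle and regroups the particle list, which corresponds to the overhead listed above for version ν2a.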
However, for specific (but rather common) applications characterized by limited particle migration per time step from one portion of the domain to another, a significant efficiency improvement can be obtained by restricting the reordering phase (and hence the critical computation) to those particles that have changed domain portion in the last step. Their number can be very low indeed if the domain can be decomposed along a slowly varying coordinate. Figure 1.46 shows a comparison between the results obtained with version ν2a and a companion version (ν2b), which implements such a selective reordering. The results of the particle decomposition implementation, ν1, are also shown for reference. The case N_ppc = 256 is considered as an example. It can be concluded that, at least for the specific application considered here, this mixed “particle-domain decomposition” strategy represents an interesting compromise between the two competing targets, namely high speed-up and low memory requirements.

1.4.5 Non-linear zonal dynamics of drift and drift-Alfvén turbulences in tokamak plasmas

(In collaboration with University of California at Irvine and Princeton University Plasma Physics Laboratory)

In recent years, increasing attention has been devoted to exploring the non-linear dynamics of zonal flow [1.27] associated with electrostatic drift-type turbulence [1.28-1.30]. On the other hand, though it is well known how electrostatic drift modes couple to the electromagnetic shear Alfvén wave as the plasma β = 8πP/B^2 increases [1.31-1.33], little effort has been devoted so far to