Parallelism in Cosmological Simulations

Physics, Algorithms, Techniques

S.R. Knollmann, UAM

Overview

– Introduction
    – What is cosmology
    – Physics
– Methods
    – Quick overview of the basic steps of a cosmological simulation
    – Numerical methods to solve the relevant equations
– Parallelism
    – Parallel strategies
    – Domain decomposition
    – Load balancing
– Summary

Cosmology

Physical Cosmology

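Basis is General Relativity:

$$G_{ab} + \Lambda g_{ab} = 8\pi G\, T_{ab}$$

Assuming homogeneity and isotropy:

$$ds^2 = -dt^2 + a^2(t)\left[\frac{dr^2}{1 - k r^2} + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right]$$

Friedmann equation(s):

$$\left(\frac{\dot{a}}{a}\right)^2 \equiv H^2 = H_0^2\left(\frac{\Omega_r}{a^4} + \frac{\Omega_m}{a^3} + \frac{\Omega_k}{a^2} + \Omega_\Lambda\right)$$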
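As a rough illustration of how the Friedmann equation fixes the expansion history, the sketch below evaluates H(a) and integrates dt = da / (a H). The parameter values (H0 = 70, Ω_m = 0.3, Ω_Λ = 0.7) and the function names `hubble` and `cosmic_time` are assumptions chosen for the example, not values taken from the slides.

```python
import math

def hubble(a, h0=70.0, omega_r=0.0, omega_m=0.3, omega_k=0.0, omega_l=0.7):
    """Hubble parameter H(a) from the Friedmann equation, in the units of h0."""
    e2 = omega_r / a**4 + omega_m / a**3 + omega_k / a**2 + omega_l
    return h0 * math.sqrt(e2)

def cosmic_time(a_end, steps=20000, **cosmology):
    """t(a_end) = integral from 0 to a_end of da / (a H(a)), midpoint rule.

    The result is in units of 1/h0 (whatever units h0 is given in).
    """
    da = a_end / steps
    return sum(da / (a * hubble(a, **cosmology))
               for a in ((i + 0.5) * da for i in range(steps)))

if __name__ == "__main__":
    print("H(a=1)   =", hubble(1.0))
    print("H(a=0.5) =", hubble(0.5))
    # Matter-only (Einstein-de Sitter) check: t0 * H0 should be close to 2/3.
    print("EdS age  =", cosmic_time(1.0, h0=1.0, omega_m=1.0, omega_l=0.0))
```

The Einstein-de Sitter case is a convenient sanity check because the integral has the closed form t0 H0 = 2/3.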

Are we done then?

Physical Cosmology: Homogeneity & Isotropy

Distribution of galaxies (2dF survey)

CMB: Cosmic Microwave Background (tiny temperature fluctuations, 1 part in 100,000)

Physical Cosmology

– On large scales, the Universe is homogeneous and isotropic
    → use General Relativity (GRT) to describe the expansion history
– Structure forms on small scales
    → use numerical methods to follow the structure formation

Physical Cosmology: Our Universe

Numerical Cosmology

means: simulating the structure formation in the Universe

In principle, we would want to solve the Einstein equations for an inhomogeneous and anisotropic energy-momentum tensor.

Too hard!

However:

– on large scales, the Universe is homogeneous and isotropic
– only on very small scales are the gravitational fields strong enough to warrant the use of General Relativity (GRT); in general, the Newtonian limit is a very good approximation

Numerical Cosmology

→ Using standard physics in an expanding coordinate system.

Main ingredients of the Universe:

– Dark Energy → expansion, a(t)
– Dark Matter → collisionless fluid, interacting via gravity
– Baryonic Matter ('gas') → (magneto)hydrodynamics, self-gravity

That is actually not enough: we need to include sub-resolution physics (cooling, star formation, feedback processes, …), and we would like to have radiative transport as well (next Christmas then).

Numerical Cosmology: the equations

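Gravitation:

$$\frac{d\mathbf{x}}{dt} = \mathbf{v}, \qquad \frac{d\mathbf{v}}{dt} = -\nabla\Phi, \qquad \Delta\Phi = 4\pi G\, a(t)\left(\rho_{\mathrm{tot}} - \bar{\rho}_{\mathrm{tot}}\right)$$

MHD:

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0$$

$$\frac{\partial(\rho\mathbf{v})}{\partial t} + \nabla\cdot\left[\rho\mathbf{v}\mathbf{v} + \left(p + \frac{B^2}{2\mu}\right)\mathbf{1} - \frac{1}{\mu}\mathbf{B}\mathbf{B}\right] = -\rho\nabla\Phi$$

$$\frac{\partial(\rho E)}{\partial t} + \nabla\cdot\left[\mathbf{v}\left(\rho E + p + \frac{B^2}{2\mu}\right) - \frac{1}{\mu}\mathbf{B}\,(\mathbf{v}\cdot\mathbf{B})\right] = -\rho\,\mathbf{v}\cdot\nabla\Phi + \frac{H B^2}{2\mu}$$

$$\frac{\partial\mathbf{B}}{\partial t} + \nabla\times(-\mathbf{v}\times\mathbf{B}) = \frac{1}{2} H \mathbf{B}, \qquad \nabla\cdot\mathbf{B} = 0$$

+ recipes for subgrid physics

and generally B = 0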

Why parallel?

– very large memory requirements
– high computational cost (when including more than gravity)

Methods: Overview

Main steps
– What is happening in a cosmological simulation

Initial Conditions
– How to set up initial conditions

Simulating Gravity
– Algorithms for solving the N-body problem

Simulating Gas
– How to deal with gas dynamics

Halo Finding
– How to find objects, i.e. how to analyse the simulation

Main steps

Physical parameters:
– cosmology
– included physics

Technical parameters:
– box size L
– number of particles N

Get initial conditions.

Simulate.

Analyze.
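The technical parameters L and N, together with the cosmology, fix the mass resolution of a run. The sketch below makes that explicit for equal-mass particles. The class name `SimulationSetup`, its fields, and the example numbers are illustrative assumptions; the critical density value is the standard 2.775e11 h² M_sun / Mpc³.

```python
from dataclasses import dataclass

RHO_CRIT = 2.775e11  # critical density today in h^2 M_sun / Mpc^3

@dataclass
class SimulationSetup:
    omega_m: float     # matter density parameter (physical parameter)
    box_size: float    # box size L in Mpc/h (technical parameter)
    n_particles: int   # total number of particles N (technical parameter)

    def particle_mass(self) -> float:
        """Mass of one particle in M_sun/h, assuming equal-mass particles."""
        return self.omega_m * RHO_CRIT * self.box_size**3 / self.n_particles

    def mean_separation(self) -> float:
        """Mean inter-particle separation in Mpc/h."""
        return self.box_size / self.n_particles ** (1.0 / 3.0)

if __name__ == "__main__":
    setup = SimulationSetup(omega_m=0.25, box_size=500.0, n_particles=2160**3)
    print(f"particle mass   : {setup.particle_mass():.3e} M_sun/h")
    print(f"mean separation : {setup.mean_separation():.4f} Mpc/h")
```

Doubling N per dimension at fixed L lowers the particle mass by a factor of eight, which is one way to see why memory requirements grow so quickly.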

Initial conditions

– The (density) fluctuations of the very early Universe can be assumed to be Gaussian.
– Initial conditions (for Gaussian perturbations) are completely specified by the power spectrum P(k) of density fluctuations. A realisation is obtained by:
    – either sampling the Fourier components of a Gaussian random field on a Cartesian lattice, or
    – using a convolution with white noise

Initial conditions: Zel'dovich approximation

– Use linear theory to quickly run through the early times and then apply the Zel'dovich approximation (Zel'dovich 1970):

● x = q + D(t) S(q)
    – x: position of the particle
    – q: unperturbed lattice position
    – D(t): growth factor
    – S: displacement field

● v = a H f D(t) S
    – v: velocity
    – a: expansion factor
    – H: Hubble parameter
    – f = d ln D / d ln a: logarithmic growth rate

● S = −∇ψ; Δψ = δ / D(t)
    – δ: density contrast
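The three relations above can be tried out on a single plane-wave mode in one dimension, where everything is analytic. For δ / D = A cos(kq) one gets ψ = −A cos(kq) / k² and therefore S = −A sin(kq) / k. The function name `zeldovich_1d` and all parameter values are assumptions for this sketch.

```python
import math

def zeldovich_1d(n, box, amplitude, growth, a, hubble, f):
    """Displace n lattice particles for the mode delta/D = amplitude * cos(k q)."""
    k = 2.0 * math.pi / box
    positions, velocities = [], []
    for i in range(n):
        q = (i + 0.5) * box / n                       # unperturbed lattice position
        s = -amplitude * math.sin(k * q) / k          # S = -d(psi)/dq
        positions.append((q + growth * s) % box)      # x = q + D S (periodic box)
        velocities.append(a * hubble * f * growth * s)  # v = a H f D S
    return positions, velocities

if __name__ == "__main__":
    pos, vel = zeldovich_1d(n=8, box=1.0, amplitude=0.1, growth=1.0,
                            a=1.0, hubble=1.0, f=1.0)
    for x, v in zip(pos, vel):
        print(f"x = {x:.4f}  v = {v:+.5f}")
```

Particles drift towards the density maximum at q = 0 (equivalently q = L in the periodic box), as expected for an overdensity.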

Simulating gravity

Basic Methods:

– Direct Summation (PP)
    – only feasible for small N
    – scaling is N²
    – keep in mind that we want collisionless interactions
– Tree code
    – Barnes-Hut tree algorithm
    – combines distant particles
    – scaling is N log N (can be optimised to N)
– Particle Mesh (PM)
    – using a regular grid to solve the Poisson equation
    – principal algorithm: Fast Fourier Transform
    – scaling is N log N
    – bad resolution

PP: $$\mathbf{F}_i = -\sum_{j \neq i} G\, m_i m_j\, \frac{\mathbf{x}_i - \mathbf{x}_j}{\left|\mathbf{x}_i - \mathbf{x}_j\right|^3}$$

PM: $$\mathbf{F}_i = -m\, \nabla\Phi(\mathbf{x}_i)$$
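A minimal sketch of the direct summation (PP) force, written to show where the N² scaling comes from. The `softening` parameter (a Plummer-type softening length) is an assumption added here as one common way to keep the interactions collisionless; the slides do not specify a softening scheme.

```python
import math

def direct_forces(positions, masses, G=1.0, softening=0.0):
    """Pairwise gravitational forces: F_i = -sum_j G m_i m_j (x_i - x_j) / |x_i - x_j|^3."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):                      # N(N-1)/2 pairs: this loop is the N^2 cost
        for j in range(i + 1, n):
            d = [positions[i][c] - positions[j][c] for c in range(3)]
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + softening ** 2
            fac = G * masses[i] * masses[j] / (r2 * math.sqrt(r2))
            for c in range(3):
                forces[i][c] -= fac * d[c]  # force on i points towards j
                forces[j][c] += fac * d[c]  # Newton's third law
    return forces

if __name__ == "__main__":
    f = direct_forces([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])
    print(f)
```

Using Newton's third law halves the work but does not change the scaling, which is why tree and mesh methods are needed for large N.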

Simulating gravity

Hybrid Methods (splitting the force into different ranges):

– Direct Summation + PM (P3M)
    – like PM but using direct summation to increase resolution
    – bad scaling under clustering (PP dominates)
– Direct Summation + Adaptive PM (AP3M)
    – like P3M but using multiple grid levels
    – alleviates scaling issue of P3M
– Adaptive Mesh Refinement (AMR)
    – like AP3M but completely removes PP part
– Tree + PM (TreePM)
    – PM part for long range forces
    – Tree part for short range forces
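To make the idea of a force split concrete, the sketch below uses a Gaussian split: the mesh potential is damped by exp(−k² r_s²) in Fourier space, and the tree part carries the complementary short-range factor in real space. This particular split and the names `short_range_factor`, `long_range_filter` and `r_split` are assumptions for illustration; it is one widely used choice, not something the slides prescribe.

```python
import math

def short_range_factor(r, r_split):
    """Fraction of the Newtonian force handled by the short-range (tree) part."""
    u = r / (2.0 * r_split)
    return math.erfc(u) + 2.0 * u / math.sqrt(math.pi) * math.exp(-u * u)

def long_range_filter(k, r_split):
    """Fourier-space damping applied to the mesh (PM) potential."""
    return math.exp(-(k * r_split) ** 2)

if __name__ == "__main__":
    r_split = 1.0
    for r in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
        print(f"r = {r:5.1f}  short-range fraction = {short_range_factor(r, r_split):.3e}")
```

Because the short-range factor drops to essentially zero after a few split lengths, the tree walk only needs to visit nearby particles, while the smooth remainder is cheap to solve with an FFT.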

Simulating gas

Two ways of doing gas dynamics:

Eulerian
– discretize space
– Main advantages: high accuracy, low numerical viscosity
– Easy to include in mesh codes

Lagrangian
– discretize mass
– Main advantage: resolution adjusts to the flow (focus lies in high-density regions)
– Easy to include in Tree codes

Halo Finding

Objective:
– finding bound structures
– dealing with substructures

Halo Finding: Methods

General steps:
– locating density peaks
– collecting particles around those peaks

Friends-of-Friends:
– linking particles that are closer than a certain distance (the linking length)

More later...
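The Friends-of-Friends idea can be sketched in a few lines with a union-find structure. This is a brute-force illustration (all pairs are tested, and the box is not treated as periodic), so it only suits small particle numbers; the function name and the example points are assumptions.

```python
def friends_of_friends(positions, linking_length):
    """Group particles: two particles are friends if closer than linking_length."""
    n = len(positions)
    parent = list(range(n))

    def find(i):
        # Find the group root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    ll2 = linking_length ** 2
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            if d2 <= ll2:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri   # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

if __name__ == "__main__":
    points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
    print(friends_of_friends(points, linking_length=0.6))
```

Note that particles 0 and 2 end up in the same group although they are farther apart than the linking length: they are linked through their common friend, particle 1.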

Parallelism: Overview

Initial conditions
– Example code: ginnungagap
– FFT in slab and pencil decomposition

Simulation
– Example code: Gadget 2 (3)
– Domain decomposition along a space-filling curve
– Scaling results

Halo Finding
– Example code: AHF
– Data chunking
– Scaling results

Initial Conditions: Algorithmic requirements

– (Gaussian) random numbers
– Fast Fourier Transform
– Gradient (should be done in Fourier space)

While there are many codes available for initial conditions, the following discussion is based on ginnungagap (Knollmann et al., in prep).

Initial Conditions: Basic steps

For the density field:

– Generate a Gaussian random field (white noise) in real space
– Go to Fourier space
– Convolve with the power spectrum (in Fourier space this is a multiplication of each mode by √P(k))
– Go back to real space to yield the density field
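The four steps can be sketched in one dimension with the standard library only. The naive O(N²) `dft` below is a stand-in for a real FFT library, the overall amplitude normalisation is ignored, and the names `density_field_1d` and `power` as well as the example spectrum are assumptions made for this illustration.

```python
import cmath
import math
import random

def dft(values, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT library)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for idx, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * idx * k / n)
        out.append(acc / n if inverse else acc)
    return out

def density_field_1d(n, box, power, seed=42):
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]   # 1. white noise in real space
    modes = dft(noise)                                 # 2. go to Fourier space
    for i in range(n):
        m = i if i <= n // 2 else i - n                # signed mode number
        k = 2.0 * math.pi * abs(m) / box
        # 3. impose the spectrum; the k = 0 mode is zeroed so the mean vanishes
        modes[i] *= math.sqrt(power(k)) if m != 0 else 0.0
    return [z.real for z in dft(modes, inverse=True)]  # 4. back to real space

if __name__ == "__main__":
    field = density_field_1d(32, 1.0, lambda k: 1.0 / (1.0 + k * k))
    print(" ".join(f"{d:+.3f}" for d in field[:8]))
```

With a flat spectrum, P(k) = 1, the filter does nothing except remove the mean, which gives a simple consistency check of the round trip.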

Initial Conditions: Basic steps

For the velocity field:

– Generate a Gaussian random field (white noise) in real space
– Go to Fourier space
– Convolve with the power spectrum
– Differentiate in k-space for x (y, z)
– Go back to real space to yield the velocity field component for x (y, z)
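Differentiating in k-space amounts to multiplying each mode by i k. The one-dimensional sketch below turns a density contrast into the Zel'dovich displacement, using S = −∇ψ and Δψ = δ / D, i.e. S_k = i k δ_k / (D k²); the velocity then follows as v = a H f D S. The naive `dft` again stands in for an FFT library, and all function names are assumptions. In three dimensions the same operation is repeated with k_x, k_y and k_z.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT library)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for idx, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * idx * k / n)
        out.append(acc / n if inverse else acc)
    return out

def displacement_from_density(delta, box, growth=1.0):
    """S(q) from delta(q) via S_k = i k delta_k / (D k^2)."""
    n = len(delta)
    modes = dft(delta)
    for i in range(n):
        m = i if i < n / 2 else i - n          # signed mode number
        if m == 0 or 2 * abs(m) == n:
            modes[i] = 0j                      # no mean shift; drop the Nyquist mode
        else:
            k = 2.0 * math.pi * m / box
            modes[i] *= 1j * k / (k * k * growth)
    return [z.real for z in dft(modes, inverse=True)]

def velocity_from_displacement(displacement, a, hubble, f, growth):
    """v = a H f D S."""
    return [a * hubble * f * growth * s for s in displacement]

if __name__ == "__main__":
    n, box = 16, 1.0
    delta = [math.cos(2.0 * math.pi * i / n) for i in range(n)]
    s = displacement_from_density(delta, box)
    print(" ".join(f"{x:+.4f}" for x in s[:8]))
```

For δ = cos(kq) the result should be S = −sin(kq) / k, which is easy to verify by eye or in a test.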

Initial Conditions: Random numbers

Using the SPRNG library:

– Modified Lagged Fibonacci generator
– provides 2^39648 streams
– sequence length 2^1310

Initial Conditions: FFT

The standard library for FFTs (FFTW2) supports MPI with slab decomposition (FFTW3 does not officially support MPI).

Is a slab decomposition the best way?

Initial Conditions: FFT

Speed concerns:

– slab:
    – 2D FFTs
    – transpose
    – 1D FFTs
– pencil:
    – 1D FFTs
    – transpose
    – 1D FFTs
    – transpose
    – 1D FFTs

Memory requirements
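The slab sequence (2D transforms, transpose, 1D transforms) can be emulated serially to see that it reproduces a full 3D transform. In the sketch below each x-plane plays the role of one process's slab, the "transpose" is just an index shuffle where an MPI code would do an all-to-all, and a naive DFT stands in for FFTW. All function names are assumptions.

```python
import cmath
import math

def dft1d(v):
    """Naive 1D discrete Fourier transform."""
    n = len(v)
    return [sum(v[idx] * cmath.exp(-2j * math.pi * idx * k / n) for idx in range(n))
            for k in range(n)]

def fft2d_slab(plane):
    """2D transform of one x-plane; purely local to the owner of that slab."""
    rows = [dft1d(row) for row in plane]               # transform along z
    cols = [dft1d(list(col)) for col in zip(*rows)]    # transform along y
    return [list(row) for row in zip(*cols)]           # back to [y][z] layout

def fft3d_slab(cube):
    n = len(cube)
    slabs = [fft2d_slab(cube[x]) for x in range(n)]    # step 1: 2D FFTs, no communication
    out = [[[0j] * n for _ in range(n)] for _ in range(n)]
    for y in range(n):
        for z in range(n):
            # step 2: gather the x-line (the transpose / all-to-all in MPI)
            line = dft1d([slabs[x][y][z] for x in range(n)])  # step 3: 1D FFT along x
            for x in range(n):
                out[x][y][z] = line[x]
    return out

def dft3d_direct(cube):
    """Reference: direct evaluation of the 3D transform."""
    n = len(cube)
    out = [[[0j] * n for _ in range(n)] for _ in range(n)]
    for kx in range(n):
        for ky in range(n):
            for kz in range(n):
                acc = 0j
                for x in range(n):
                    for y in range(n):
                        for z in range(n):
                            phase = -2j * math.pi * (kx * x + ky * y + kz * z) / n
                            acc += cube[x][y][z] * cmath.exp(phase)
                out[kx][ky][kz] = acc
    return out
```

A slab decomposition can use at most N processes for an N³ grid, since each process needs at least one plane; a pencil decomposition lifts that limit to N² at the price of a second transpose.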

Initial Conditions: Limitations

Main limit is IO...

→ 4096³ corresponds to 1536 GB of particle data (position, velocity)
→ 8192³ corresponds to 12288 GB
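The quoted sizes can be reproduced with a short calculation, assuming six single-precision values per particle (three position and three velocity components, 4 bytes each) and binary gigabytes (2^30 bytes). Those two assumptions are inferred from the numbers rather than stated on the slide.

```python
def particle_data_gib(n_per_dim, values_per_particle=6, bytes_per_value=4):
    """Size of the particle data in GiB for n_per_dim^3 particles."""
    n_particles = n_per_dim ** 3
    return n_particles * values_per_particle * bytes_per_value / 2**30

if __name__ == "__main__":
    for n in (1024, 2048, 4096, 8192):
        print(f"{n}^3 particles: {particle_data_gib(n):8.0f} GiB")
```

Each doubling of the particle number per dimension multiplies the data volume by eight, so the step from 4096³ to 8192³ alone adds more than 10 TiB.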

Simulation

The main challenge lies in the domain decomposition and the load balance.

Need to distinguish between:

– homogeneous particle distribution (large simulation boxes, not too many particles per object)

– a simple Cartesian decomposition is sufficient

– easy communication patterns for the FFT solver (inherently slab or pencil decomposed)

– highly clustered particle distribution (small boxes, zoomed simulations, large number of particles in one object)

– the general approach nowadays is a domain decomposition along a space-filling curve (mostly a Hilbert curve)

– for hybrid solvers that use an FFT part, this requires more complicated communication patterns
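
A minimal sketch of the Cartesian case shows why it works for homogeneous distributions and fails for clustered ones. The function names and the 2x2x2 process grid are chosen for this example only.

```python
from collections import Counter


def cartesian_rank(position, box_size, procs_per_dim):
    """Map a 3D position to the rank owning its cell of the process grid."""
    px, py, pz = procs_per_dim
    index = []
    for coord, p in zip(position, (px, py, pz)):
        cell = int((coord % box_size) / box_size * p)
        index.append(min(cell, p - 1))  # guard against rounding up to p
    ix, iy, iz = index
    return (ix * py + iy) * pz + iz


def particles_per_rank(positions, box_size, procs_per_dim):
    """Count how many particles each rank would own."""
    counts = Counter(cartesian_rank(p, box_size, procs_per_dim) for p in positions)
    n_ranks = procs_per_dim[0] * procs_per_dim[1] * procs_per_dim[2]
    return [counts.get(rank, 0) for rank in range(n_ranks)]


# Homogeneous case: a regular 8^3 lattice spreads evenly over 2x2x2 ranks.
lattice = [(x + 0.5, y + 0.5, z + 0.5) for x in range(8) for y in range(8) for z in range(8)]
uniform_counts = particles_per_rank(lattice, 8.0, (2, 2, 2))

# Clustered case: squeeze all particles into one corner (one massive object).
clustered = [(0.1 * x, 0.1 * y, 0.1 * z) for x, y, z in lattice]
clustered_counts = particles_per_rank(clustered, 8.0, (2, 2, 2))
```

With the lattice every rank owns 64 particles; after squeezing, rank 0 owns all 512 and the other seven ranks are idle.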

Simulation: Hilbert curve decomposition

– All particles are ordered according to a 1D index

– Each process deals with the same number of particles or the same workload
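
Both points can be sketched in two dimensions (production codes use the 3D curve). `hilbert_index` follows the standard bitwise construction of the Hilbert curve; `decompose` and its midpoint rule for cutting the curve are assumptions made for this example, not taken from any particular code.

```python
def hilbert_index(order, x, y):
    """Position of cell (x, y) along a 2D Hilbert curve on a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/flip the quadrant so that the sub-curves join up
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def decompose(cells, weights, order, n_proc):
    """Sort items along the curve and cut it into n_proc segments of similar total weight."""
    ordered = sorted(zip(cells, weights), key=lambda cw: hilbert_index(order, *cw[0]))
    total = sum(weights)
    assignment, running = [], 0.0
    for cell, weight in ordered:
        # The midpoint of this item's span of cumulative work selects its rank.
        rank = min(n_proc - 1, int((running + 0.5 * weight) * n_proc / total))
        assignment.append((cell, rank))
        running += weight
    return assignment
```

With all weights equal to one, every process receives the same number of particles; passing a per-particle cost estimate as the weight balances the workload instead.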

Simulation: Scalability, Gadget-2

Springel 2005

Simulation: Improved load balance in Gadget-3

– To further improve the load balance (and to allow for extremely zoomed simulations), the Hilbert curve can be subdivided into more chunks than available processes (multi-domains).

– This made it possible to deal with more than 10^9 particles in one object.

– It also allows using a large number of CPUs.
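
Why more chunks than processes help can be seen in a small sketch. The greedy rule used here (hand the most expensive remaining chunk to the least loaded process) is an assumption chosen for illustration and not necessarily the scheme Gadget-3 implements; the chunk costs are made up.

```python
import heapq


def assign_chunks(chunk_costs, n_proc):
    """Greedy assignment: the most expensive remaining chunk goes to the least loaded process."""
    heap = [(0.0, rank) for rank in range(n_proc)]  # entries are (current load, rank)
    heapq.heapify(heap)
    owner = [None] * len(chunk_costs)
    by_cost = sorted(range(len(chunk_costs)), key=lambda c: -chunk_costs[c])
    for chunk in by_cost:
        load, rank = heapq.heappop(heap)
        owner[chunk] = rank
        heapq.heappush(heap, (load + chunk_costs[chunk], rank))
    loads = [0.0] * n_proc
    for chunk, rank in enumerate(owner):
        loads[rank] += chunk_costs[chunk]
    return owner, loads


# Eight chunks of the curve with uneven work, shared among four processes.
costs = [8, 1, 1, 6, 2, 3, 5, 2]
owner, loads = assign_chunks(costs, 4)
print(loads)  # [8.0, 7.0, 7.0, 6.0]
```

Giving each process two neighbouring chunks instead would yield loads of 9, 7, 5 and 7 for the same costs, so the freedom to combine non-adjacent chunks lowers the maximum load from 9 to 8 in this example.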

Halo Finding

Example: AHF (Knollmann & Knebe, 2009)

– uses an adaptive mesh hierarchy to identify density peaks

– collects particles around those peaks, taking into account the hierarchical structure of the object (naturally dealing with substructures)

– uses MPI to split the simulation data into subchunks (including buffer zones, eliminating further communication)

– uses OpenMP to parallelize local work
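
The buffer-zone idea can be sketched in one dimension. This is a hypothetical illustration, not AHF's code: the function name, the slab layout and the periodic wrap-around are assumptions, and the sketch requires the buffer to be narrower than half a slab.

```python
def split_with_buffers(positions, box_size, n_chunks, buffer_width):
    """Assign 1D positions to slabs; particles near a slab edge are copied to the neighbour."""
    slab = box_size / n_chunks
    if n_chunks < 2 or buffer_width >= 0.5 * slab:
        raise ValueError("need at least two chunks and a buffer narrower than half a slab")
    chunks = [[] for _ in range(n_chunks)]
    for index, x in enumerate(positions):
        x = x % box_size
        home = min(int(x / slab), n_chunks - 1)
        chunks[home].append(index)
        offset = x - home * slab
        if offset < buffer_width:          # close to the lower edge
            chunks[(home - 1) % n_chunks].append(index)
        if slab - offset < buffer_width:   # close to the upper edge
            chunks[(home + 1) % n_chunks].append(index)
    return chunks


# Four slabs of width 2.5 in a periodic box of size 10, with a buffer of 0.2.
chunks = split_with_buffers([0.5, 2.4, 2.6, 9.9], 10.0, 4, 0.2)
print(chunks)  # [[0, 1, 2, 3], [1, 2], [], [3]]
```

Because every chunk already holds a copy of the particles just outside its boundary, objects straddling the boundary can be analysed locally without further messages; the price is that they may be found twice and must be deduplicated afterwards.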

Halo Finding: MPI splitting

Halo Finding: Scaling

Summary

– cosmological simulations require massively parallel systems

– various methods exist to solve the relevant equations effectively on modern supercomputers

– some free implementations exist (RAMSES, Enzo, Gadget-2, ...)

– analysis tools and supporting software are slowly catching up to the simulation software

Outlook

– the main challenge for simulation software is the inclusion of additional physics

– convergence of subgrid models is unclear (however, the more resolution is available, the more physical processes need to be included)

– from a programmer's perspective: load balance is the hardest part to tackle