# Parallelism in Cosmological Simulations: Physics, Algorithms, Techniques

S.R. Knollmann, UAM

Introduction

Methods Parallelism Summary

Overview Cosmology Numerical Cosmology Why parallel

Overview

– Introduction

– What is cosmology

Physics

– Methods

– Quick overview of the basic steps of a cosmological simulation

– numerical methods to solve the relevant equations

Parallelism

– Parallel strategies

– Domain decomposition

– Summary

Cosmology

Physical Cosmology

Basis is General Relativity

$$G_{ab} + \Lambda g_{ab} = 8\pi G\, T_{ab}$$

Assuming homogeneity and isotropy

Friedmann equation(s)

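$$ds^2 = -dt^2 + a^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right]$$

$$\left(\frac{\dot a}{a}\right)^2 \equiv H^2 = H_0^2\left[\frac{\Omega_r}{a^4} + \frac{\Omega_m}{a^3} + \frac{\Omega_k}{a^2} + \Omega_\Lambda\right]$$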
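
As a rough illustration of the Friedmann equation, the expansion rate H(a) can be evaluated directly from the density parameters. The sketch below assumes the usual form H² = H₀² [Ω_r/a⁴ + Ω_m/a³ + Ω_k/a² + Ω_Λ] with Ω_k fixed by the other three; the function name and the default parameter values are illustrative assumptions, not values taken from the talk.

```python
import math


def hubble_rate(a, H0=70.0, omega_r=0.0, omega_m=0.3, omega_lambda=0.7):
    """Return H(a) in the same units as H0 (illustrative default parameters)."""
    # Curvature density follows from requiring the terms to sum to 1 at a = 1.
    omega_k = 1.0 - omega_r - omega_m - omega_lambda
    e_squared = (omega_r / a**4 + omega_m / a**3
                 + omega_k / a**2 + omega_lambda)
    return H0 * math.sqrt(e_squared)


if __name__ == "__main__":
    for a in (0.1, 0.5, 1.0):
        print(a, hubble_rate(a))
```

For a matter-only (Einstein-de Sitter) universe this reduces to H = H₀ a^(-3/2), which is a convenient sanity check.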

Are we done then?

Physical Cosmology: Homogeneity & Isotropy

Distribution of galaxies (2dF Galaxy Redshift Survey)

CMB (Cosmic Microwave Background): tiny temperature fluctuations, about 1 part in 100,000

Physical Cosmology

– on large scales, the Universe is homogeneous and isotropic

→ use general relativity (GRT) to describe the expansion history

– structure forms on small scales

→ use numerical methods to follow structure formation

Physical Cosmology: Our Universe

Numerical Cosmology

Numerical cosmology means simulating structure formation in the Universe.

Basically, we would want to solve the Einstein equation for an inhomogeneous and anisotropic energy-momentum tensor.

Too hard!

However:

– on large scales, the Universe is homogeneous and isotropic

– only on very small scales are the gravitational fields strong enough to warrant the use of GRT; in general, the Newtonian limit is a very good approximation

Numerical Cosmology

→ Using standard physics in an expanding coordinate system.

Main ingredients of the Universe:

– Dark Energy → expansion, a(t)

– Dark Matter → collisionless fluid, interacting via gravity

– Baryonic Matter ('gas') → (magneto)hydrodynamics, self-gravity

That is actually not enough: we need to include sub-resolution physics (cooling, star formation, feedback processes, …), and we would like to have radiative transport as well (next Christmas then).

Numerical Cosmology: the equations

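Gravitation

$$\frac{d\mathbf{x}}{dt} = \mathbf{v}, \qquad \frac{d\mathbf{v}}{dt} = -\nabla\Phi, \qquad \Delta\Phi = 4\pi G\left(\rho_\mathrm{tot} - \bar{\rho}_\mathrm{tot}\right) a(t)$$

MHD

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0$$

$$\frac{\partial(\rho\mathbf{v})}{\partial t} + \nabla\cdot\left[\rho\mathbf{v}\mathbf{v} + \left(p + \frac{B^2}{2\mu}\right)\mathbb{1} - \frac{1}{\mu}\mathbf{B}\mathbf{B}\right] = -\rho\nabla\Phi$$

$$\frac{\partial E}{\partial t} + \nabla\cdot\left[\left(E + p + \frac{B^2}{2\mu}\right)\mathbf{v} - \frac{1}{\mu}\mathbf{B}\,(\mathbf{v}\cdot\mathbf{B})\right] = -\rho\,\mathbf{v}\cdot\nabla\Phi + H\frac{B^2}{2\mu}$$

$$\frac{\partial\mathbf{B}}{\partial t} + \nabla\times(-\mathbf{v}\times\mathbf{B}) = \frac{1}{2}H\mathbf{B}$$

$$\nabla\cdot\mathbf{B} = 0$$

+ recipes for subgrid physics

and generally B = 0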

Why parallel

– very large memory requirements

– high computational cost (when including more than gravity)

Overview

Main steps

– What is happening in a cosmological simulation

Initial Conditions

– How to set up initial conditions

Simulating Gravity

– Algorithms for solving the N-body problem

Simulating Gas

– How to deal with gas dynamics

Halo Finding

– How to find objects, i.e. how to analyse the simulation

Main steps

Physical parameters:

– cosmology

– included physics

Technical parameters:

– boxsize L

– number of particles N

Get initial conditions.

Simulate.

Analyze.

Initial conditions

– The (density) fluctuations of the very early universe can be assumed to be Gaussian.

– Initial conditions (for Gaussian perturbations) are completely specified by the power spectrum P(k) of density fluctuations. A realization is generated by:

– either sampling the Fourier components of a Gaussian random field on a Cartesian lattice, or

– convolving white noise with a kernel set by P(k)

Initial conditions: Zel'dovich approximation

– Use linear theory to move quickly through the early times, then apply the Zel'dovich approximation (Zel'dovich 1970):

● x = q + D(t) S(q)

– x: position of the particle

– q: unperturbed lattice position

– D(t): growth factor

– S: displacement field

● v = a H f D(t) S

– v: velocity

– a: expansion factor

– H: Hubble parameter

– f = d ln D / d ln a: logarithmic growth rate

● S = - ∇ψ; Δψ = δ / D(t)

– δ: density contrast
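
To make the two Zel'dovich relations concrete, the following minimal sketch applies x = q + D S(q) and v = a H f D S to particles on a one-dimensional periodic lattice. It assumes the displacement field S has already been computed; the function name `zeldovich_displace` and the periodic wrapping into a box of length `box` are illustrative choices, not part of the talk.

```python
def zeldovich_displace(q, S, D, a, H, f, box):
    """Move lattice particles with the Zel'dovich approximation (1D sketch).

    q: unperturbed lattice positions, S: displacement field sampled at q,
    D: growth factor, a: expansion factor, H: Hubble parameter,
    f: logarithmic growth rate d ln D / d ln a, box: periodic box length.
    """
    # x = q + D * S(q), wrapped back into the periodic box
    x = [(qi + D * si) % box for qi, si in zip(q, S)]
    # v = a * H * f * D * S
    v = [a * H * f * D * si for si in S]
    return x, v


if __name__ == "__main__":
    positions, velocities = zeldovich_displace(
        q=[0.0, 1.0, 2.0, 3.0], S=[0.5, -0.5, 0.0, 2.0],
        D=0.5, a=0.5, H=2.0, f=1.0, box=4.0)
    print(positions, velocities)
```

In three dimensions the same two lines are applied per coordinate, with S a vector field.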

Simulating gravity

Basic Methods:

– Direct Summation (PP)

– only feasible for small N

– scaling is N²

– keep in mind that we want collisionless interactions

– Tree code

– Barnes-Hut tree algorithm

– Combines distant particles

– scaling is N log N (can be optimised to N)

– Particle Mesh (PM)

– using a regular grid to solve the Poisson equation

– principal algorithm: Fast Fourier Transform

– scaling is N log N

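Direct summation:

$$\mathbf{F}_i = -\sum_{j\neq i} G\, m_i m_j\, \frac{\mathbf{x}_i - \mathbf{x}_j}{\left|\mathbf{x}_i - \mathbf{x}_j\right|^3}$$

Particle mesh:

$$\mathbf{F}_i = -m\,\nabla\Phi(\mathbf{x}_i)$$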
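
The direct-summation (PP) force, F_i = −Σ_{j≠i} G m_i m_j (x_i − x_j)/|x_i − x_j|³, can be sketched in a few lines, which also makes the N² scaling visible as a double loop. The optional Plummer-type softening length `eps` is an assumption added here because collisionless simulations usually soften close encounters; it is not taken from the slides, and `eps=0` recovers the plain Newtonian sum.

```python
import math


def direct_forces(pos, mass, G=1.0, eps=0.0):
    """Pairwise gravitational forces by direct summation, O(N^2).

    pos: list of (x, y, z) tuples, mass: list of masses,
    eps: softening length (assumption; 0 gives the unsoftened force).
    """
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[i][c] - pos[j][c] for c in range(3)]
            r2 = sum(x * x for x in d) + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for c in range(3):
                f = -G * mass[i] * mass[j] * d[c] * inv_r3
                forces[i][c] += f   # force on i due to j
                forces[j][c] -= f   # Newton's third law
    return forces


if __name__ == "__main__":
    print(direct_forces([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0]))
```

Tree and mesh methods replace the inner loop with an approximation, which is where the N log N scaling comes from.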

Simulating gravity

Hybrid Methods (splitting force in different ranges):

– Direct Summation + PM (P3M)

– like PM but using direct summation to increase resolution

– bad scaling under clustering (PP dominates)

– Direct Summation + Adaptive PM (AP3M)

– like P3M but using multiple grid levels

– alleviates scaling issue of P3M

– like AP3M but completely removes PP part

– Tree + PM (TreePM)

– PM part for long range forces

– Tree part for short range forces

Simulating gas

Two ways of doing gas dynamics:

Eulerian

– discretize space

– high accuracy

– low numerical viscosity

– easy to include in mesh codes

Lagrangian

– discretize mass

– resolution adjusts to the flow (focus lies in high-density regions)

– easy to include in tree codes

Halo Finding

Objective:

– finding bound structures

– dealing with substructures

Halo Finding: Methods

General steps:

– locating density peaks

– collecting particles around those peaks

Friends-of-Friends:

– linking particles that are closer together than a certain distance (the linking length)

More later...
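
The Friends-of-Friends idea can be sketched with a union-find structure: every pair of particles closer than the linking length is merged into the same group. This is a deliberately naive O(N²) version without periodic boundaries; production halo finders use trees or grids for the neighbour search. The function name and the return format are illustrative assumptions.

```python
def friends_of_friends(pos, linking_length):
    """Group particles whose separation is below linking_length (naive sketch)."""
    n = len(pos)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    b2 = linking_length ** 2
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(pos[i], pos[j]))
            if d2 < b2:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri   # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    points = [(0, 0, 0), (0.5, 0, 0), (1.0, 0, 0), (5, 0, 0), (5.3, 0, 0), (9, 9, 9)]
    print(friends_of_friends(points, 0.6))
```

Note that particles 0 and 2 in the example end up in the same group although they are further apart than the linking length, because both are friends of particle 1.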

Overview

Initial conditions

– Example code ginnungagap

– FFT in slab and pencil-decomposition

Simulation

– Example code Gadget 2 (3)

– Domain decomposition along a space filling curve

– Scaling results

Halo Finding

– Example code AHF

– Data chunking

– Scaling results

Initial Conditions: Algorithmic requirements

– (Gaussian) Random Numbers

– Fast Fourier Transform

– Gradient (should be done in Fourier space)

While there are many codes available for initial conditions, the following discussion is based on ginnungagap (Knollmann et al., in prep.)

Initial Conditions: Basic steps

For density field:

– Generate a Gaussian random field (white noise) in real space

– Go to Fourier space

– Convolve with the power spectrum (in Fourier space, a multiplication by √P(k))

– Go back to real space to obtain the density field
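
The four steps for the density field can be sketched in one dimension with a naive discrete Fourier transform. This is only meant to show the data flow (white noise, transform, scale by √P(k), transform back); the overall normalization is not tuned to any physical convention, a real code would use a parallel FFT instead of the O(N²) `dft` helper, and all function names here are illustrative assumptions rather than the ginnungagap API.

```python
import cmath
import math
import random


def dft(values, sign):
    """Naive O(N^2) discrete Fourier transform; sign=-1 forward, +1 backward."""
    n = len(values)
    return [sum(values[idx] * cmath.exp(sign * 2j * math.pi * idx * m / n)
                for idx in range(n))
            for m in range(n)]


def gaussian_field(n, box, power, seed=0):
    """1D density field with power spectrum power(k), built from white noise."""
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]      # step 1: real space
    white_k = dft(white, -1)                             # step 2: Fourier space
    field_k = []
    for m, w in enumerate(white_k):                      # step 3: scale modes
        m_signed = m if m <= n // 2 else m - n
        k = 2.0 * math.pi * abs(m_signed) / box
        amplitude = math.sqrt(power(k)) if k > 0 else 0.0  # drop the mean mode
        field_k.append(w * amplitude)
    field = dft(field_k, +1)                             # step 4: back to real
    return [c.real / n for c in field]


if __name__ == "__main__":
    delta = gaussian_field(32, 1.0, lambda k: 1.0 / k, seed=1)
    print(delta[:4])
```

Setting the k = 0 amplitude to zero removes the mean, so the result is a density contrast that averages to zero over the box.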

For velocity field:

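– Generate a Gaussian random field (white noise) in real space

– Go to Fourier space

– Convolve with the power spectrum (in Fourier space, a multiplication by √P(k))

– Differentiate in k-space for the x (y, z) component

– Go back to real space to obtain the x (y, z) component of the velocity field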
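
Differentiation in k-space amounts to multiplying each Fourier mode by ik. As a hedged one-dimensional sketch, the code below solves Δψ = δ/D and S = −∇ψ in Fourier space, which gives S_k = i δ_k / (D k); the velocity then follows from v = a H f D S. The naive `dft` helper, the function name, and the choice to zero the mean and Nyquist modes are assumptions made for this example.

```python
import cmath
import math


def dft(values, sign):
    """Naive O(N^2) discrete Fourier transform; sign=-1 forward, +1 backward."""
    n = len(values)
    return [sum(values[idx] * cmath.exp(sign * 2j * math.pi * idx * m / n)
                for idx in range(n))
            for m in range(n)]


def displacement_from_density(delta, box, D=1.0):
    """Solve Laplace(psi) = delta / D and S = -grad(psi) spectrally (1D)."""
    n = len(delta)
    delta_k = dft(delta, -1)
    S_k = []
    for m, dk in enumerate(delta_k):
        if m == 0 or 2 * m == n:
            S_k.append(0j)          # mean and Nyquist modes carry no gradient
            continue
        m_signed = m if m < n // 2 else m - n
        k = 2.0 * math.pi * m_signed / box
        # psi_k = -delta_k / (D k^2);  S_k = -(i k) psi_k = i delta_k / (D k)
        S_k.append(1j * dk / (D * k))
    return [c.real / n for c in dft(S_k, +1)]


if __name__ == "__main__":
    n, box = 16, 2.0 * math.pi
    q = [box * j / n for j in range(n)]
    S = displacement_from_density([math.cos(x) for x in q], box)
    print(S[:4])
```

For δ = cos(q) and D = 1 the analytic result is S = −sin(q), which the example reproduces to rounding error.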

Initial Conditions: Random numbers

Using the SPRNG library.

– Modified Lagged Fibonacci generator

– provides 2^39648 streams

– sequence length 2^1310

Initial Conditions: FFT

The standard library for FFTs (FFTW2) supports MPI with a slab decomposition (FFTW3 does not officially support MPI).

Is a slab decomposition the best way?

Speed concerns:

– slab:

– 2D FFTs

– transpose

– 1D FFTs

– pencil:

– 1D FFTs

– transpose

– 1D FFTs

– transpose

– 1D FFTs

Memory requirements
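
One way to compare the two layouts is to look at how many processes each can use and how much grid memory each process then holds. The sketch below assumes an N³ grid of double-precision complex values (16 bytes per cell), an even split, and one slab or pencil per process at the limit; the function names are hypothetical.

```python
def max_processes(n, decomposition):
    """Largest useful process count for an n^3 grid."""
    if decomposition == "slab":
        return n          # the grid is split along one axis only
    if decomposition == "pencil":
        return n * n      # the grid is split along two axes
    raise ValueError("decomposition must be 'slab' or 'pencil'")


def local_grid_bytes(n, processes, decomposition, bytes_per_cell=16):
    """Grid memory per process, assuming an even split (idealized)."""
    if processes > max_processes(n, decomposition):
        raise ValueError("more processes than slabs/pencils available")
    return n**3 * bytes_per_cell // processes


if __name__ == "__main__":
    n = 4096
    print(max_processes(n, "slab"), max_processes(n, "pencil"))
    print(local_grid_bytes(n, 4096, "slab") / 2**20, "MiB per process")
```

The pencil layout pays for its higher process limit with a second transpose, as listed in the speed comparison.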

Initial Conditions: Limitations

Main limit is I/O...

→ 4096³ particles correspond to 1536 GB of particle data (position, velocity)

→ 8192³ particles correspond to 12288 GB
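
These numbers can be reproduced with a short calculation if one assumes six single-precision values per particle (three position and three velocity components, 4 bytes each) and reads GB as 2³⁰ bytes. Both assumptions are inferred from the quoted sizes rather than stated in the talk.

```python
def particle_data_gib(n_per_dim, values_per_particle=6, bytes_per_value=4):
    """Size of the particle data in units of 2**30 bytes."""
    n_particles = n_per_dim ** 3
    return n_particles * values_per_particle * bytes_per_value / 2**30


if __name__ == "__main__":
    for n in (4096, 8192):
        print(n, particle_data_gib(n))
```

Doubling the particle number per dimension multiplies the data volume by eight, which is why I/O quickly becomes the limiting factor.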

Simulation

The main challenge lies in the domain decomposition and the load balance.

We need to distinguish between:

– homogeneous particle distributions (large simulation boxes, not too many particles per object)

  – a simple Cartesian decomposition is sufficient

  – easy communication patterns for the FFT solver (inherently slab or pencil decomposed)

– highly clustered particle distributions (small boxes, zoomed simulations, large number of particles in one object)

  – the general approach nowadays is a domain decomposition along a space-filling curve (mostly a Hilbert curve)

  – for hybrid solvers that use an FFT part, this requires more complicated communication patterns

Simulation: Hilbert curve decomposition

– All particles are ordered according to a 1D index

– Each process deals with the same number of particles or the same work-load
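
A two-dimensional sketch of this idea: each particle's grid cell is mapped to its position along a Hilbert curve, the particles are sorted by that 1D index, and the sorted list is cut into contiguous pieces of (nearly) equal particle count. The index function follows the widely used iterative construction for a 2D Hilbert curve on an n × n grid (n a power of two); simulation codes work in 3D and may balance by work-load instead of particle count, so this is an illustration rather than any code's actual implementation. Function names are assumptions.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a 2D Hilbert curve on an n x n grid."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def decompose_along_curve(cells, n, n_proc):
    """Split particles (given by their grid cells) into n_proc contiguous chunks."""
    order = sorted(range(len(cells)),
                   key=lambda i: hilbert_index(n, cells[i][0], cells[i][1]))
    base, extra = divmod(len(order), n_proc)
    chunks, start = [], 0
    for p in range(n_proc):
        size = base + (1 if p < extra else 0)
        chunks.append(order[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    grid = [(x, y) for x in range(4) for y in range(4)]
    print(decompose_along_curve(grid, 4, 3))
```

Because consecutive curve indices belong to neighbouring cells, each chunk is a spatially compact region, which keeps the surface (and hence the communication) between domains small.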

Springel 2005

– To further improve load balance (and to allow for extremely zoomed simulations), the Hilbert curve can be subdivided into more chunks than there are processes (multi-domains).

– This made it possible to deal with more than 10⁹ particles in one object.

– It also allows using a large number of CPUs.

Halo Finding

Example: AHF (Knollmann & Knebe, 2009)

– uses an adaptive mesh hierarchy to identify density peaks

– collects particles around those peaks, taking into account the hierarchical structure of the object (naturally dealing with substructures)

– uses MPI for a subchunking of the simulation data (including buffer zones, eliminating further communication)

– uses OpenMP to parallelize local work
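
The chunking with buffer zones can be illustrated in one dimension: each chunk owns the particles inside its interval and additionally receives copies of the particles within a buffer distance of its edges, so that objects near a boundary can be analysed without further communication. This is a simplified sketch under stated assumptions (1D, periodic box, equal-width chunks, chunk width plus twice the buffer no larger than the box); it is not AHF's actual implementation, and the function name is hypothetical.

```python
def chunk_with_buffers(xs, box, n_chunks, buffer):
    """Return (owned, ghost) particle index lists for each chunk (1D, periodic)."""
    width = box / n_chunks
    chunks = []
    for c in range(n_chunks):
        lo = c * width
        owned, ghosts = [], []
        for i, x in enumerate(xs):
            if (x - lo) % box < width:
                owned.append(i)                      # inside the chunk proper
            elif (x - (lo - buffer)) % box < width + 2 * buffer:
                ghosts.append(i)                     # inside the buffer zone
        chunks.append((owned, ghosts))
    return chunks


if __name__ == "__main__":
    print(chunk_with_buffers([0.5, 2.5, 3.9, 9.8], box=10.0, n_chunks=2, buffer=1.0))
```

The buffer width has to be at least as large as the biggest object expected at a chunk boundary; the price is that particles in the buffers are stored on more than one process.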

Halo Finding: MPI splitting

Halo Finding: Scaling

Summary

– cosmological simulations require massively parallel systems

– various methods exist to solve the relevant equations efficiently on modern supercomputers

– some free implementations exist (RAMSES, Enzo, Gadget-2, ...)

– analysis tools and supporting software are slowly catching up with the simulation software

Outlook

– the main challenge for simulation software is the inclusion of additional physics

– the convergence of subgrid models is unclear (however, the more resolution is available, the more physical processes need to be included)

– from a programmer's perspective: load balance is the hardest part to tackle
