Visualising the Input Space of a Galaxy Formation Simulation

Ian Vernon∗

Department of Mathematical Sciences, Durham University

∗ Work done in collaboration with Michael Goldstein (Dept. of Mathematical Sciences), and with Richard Bower and Carlos Frenk's group at the Institute for Computational Cosmology, Durham University. Funding: MUCM and EPSRC.




Overview

• We are going to History Match a galaxy formation simulation known as Galform.

• This involves learning about acceptable inputs x to the Galform model, using observed data z.

• We use emulators and implausibility measures to cut out input space iteratively.

• We will discuss the relevant uncertainties: model discrepancy, observational errors, function uncertainty, etc.

• Finally, we will consider various visualisation issues: even when we can identify the acceptable input space, visualisation is difficult.

• The approach described is completely general, and can be used for any model that is relatively slow to run and has many inputs.




Why History Match?

• Often the most pressing question for a modeller is: are there any inputs x to the model that produce outputs f(x) that match the observed data z?

• History Matching is an efficient technique that seeks to identify the set X of all such inputs.

• Often X occupies only a tiny fraction of the original input space.

• This set X may be empty: we do not presuppose that any such inputs exist.

• This is the main difference between History Matching and the related technique of Probabilistic Calibration.

• The latter is a useful technique, but assumes a single 'best input' and gives its posterior distribution.


Andromeda Galaxy and Hubble Deep Field View

• Andromeda Galaxy: the closest large galaxy to our own Milky Way; contains roughly 1 trillion stars.

• Hubble Deep Field: one of the deepest images yet taken. It covers 2 millionths of the sky but contains over 3000 galaxies.




The Galform Model

• The cosmologists at the ICC are interested in modelling galaxy formation in the presence of Dark Matter.

• First a Dark Matter simulation is performed over a volume of (1.63 billion light years)³. This takes 3 months on a supercomputer.

• Galform takes the results of this simulation, includes the more realistic physics of 'normal matter', and models the evolution and attributes of approximately 1 million galaxies.

• Galform requires the specification of 17 unknown inputs in order to run.

• It takes approximately 1 day to complete 1 run (using a single processor).

• The Galform model produces many outputs, some of which can be compared to observed data from the real Universe.




Galform: Which Inputs to Use?

• PROBLEM: We want to identify the set of all inputs X that lead to acceptable matches between model outputs and observed data, given all relevant uncertainties.

• A 17-dimensional input space is large! Even the simplest grid-based search (setting each input to its max or min) would require 2¹⁷ runs.

• This would take approximately 360 years to complete (on one processor)!

• We would really want a finer resolution, so would want say 10¹⁷ runs... This would take far longer than the current age of the Universe.

• SOLUTION: Construct an Emulator: a stochastic function that approximates the Galform model and is fast to evaluate.

• Use the Emulator to find the acceptable inputs.


The Dark Matter Simulation

[figure]


The Galform Model

[figure]


Galform Outputs: The Luminosity Functions

• Galform provides multiple output data sets.

• Initially we analyse the luminosity functions, which give the number of galaxies per unit volume at each luminosity.

• bj Luminosity: corresponds to the density of young (blue) galaxies.

• K Luminosity: corresponds to the density of old (red) galaxies.


Input Parameters

• To perform one run, we need to specify numbers for each of the following 17 inputs (ranges shown):

    vhotdisk:     100 to 550        VCUT:       20 to 50
    aReheat:      0.2 to 1.2        ZCUT:       6 to 9
    alphacool:    0.2 to 1.2        alphastar:  -3.2 to -0.3
    vhotburst:    100 to 550        tau0mrg:    0.8 to 2.7
    epsilonStar:  0.001 to 0.1      fellip:     0.1 to 0.35
    stabledisk:   0.65 to 0.95      fburst:     0.01 to 0.15
    alphahot:     2 to 3.7          FSMBH:      0.001 to 0.01
    yield:        0.02 to 0.05      eSMBH:      0.004 to 0.05
    tdisk:        0 to 1

• What input values should we choose to get 'acceptable' outputs?


Galform Outputs: The Luminosity Functions

• The basic problem is that we pick inputs:

• vhotdisk = 290.5, aReheat = 1.15, alphacool = 0.31, ...

• And find, after 1 day of runtime:

• The 1st run is rubbish.


Galform Outputs: The Luminosity Functions

• The basic problem is that we pick inputs:

• vhotdisk = 223.3, aReheat = 0.49, alphacool = 1.12, ...

• And find, after 2 days of runtime:

• The 2nd run is rubbish.


Galform Outputs: The Luminosity Functions

• The basic problem is that we pick inputs:

• vhotdisk = 349.7, aReheat = 0.21, alphacool = 1.08, ...

• And find, after 3 days of runtime:

• The 3rd run is rubbish.


Galform Outputs: The Luminosity Functions

• Pick 20 inputs and find, after 20 days of runtime:

• All runs are rubbish.


Galform Outputs: The Luminosity Functions

• Pick 40 inputs and find, after 40 days of runtime:

• All runs are rubbish.


Galform Outputs: The Luminosity Functions

• Pick 60 inputs and find, after 60 days of runtime:

• All runs are rubbish.




11 Outputs Chosen

• We do 1000 runs using carefully chosen inputs (a space-filling maximin Latin hypercube design; see the sketch below).

• (Again, all runs are found to be unacceptable.)

• We choose 11 outputs that are representative of the luminosity functions, and emulate the functions fi(x).
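A minimal Python sketch of how such a space-filling design might be generated over the input ranges listed earlier. Note that scipy's qmc module does not expose a maximin criterion directly, so a discrepancy-based optimisation is used here as a stand-in; the sample size and all settings are illustrative, not the design actually used.

```python
# Sketch: a space-filling Latin hypercube over the 17 Galform inputs,
# using the ranges from the Input Parameters table.
import numpy as np
from scipy.stats import qmc

ranges = {
    "vhotdisk": (100, 550),      "VCUT": (20, 50),
    "aReheat": (0.2, 1.2),       "ZCUT": (6, 9),
    "alphacool": (0.2, 1.2),     "alphastar": (-3.2, -0.3),
    "vhotburst": (100, 550),     "tau0mrg": (0.8, 2.7),
    "epsilonStar": (0.001, 0.1), "fellip": (0.1, 0.35),
    "stabledisk": (0.65, 0.95),  "fburst": (0.01, 0.15),
    "alphahot": (2, 3.7),        "FSMBH": (0.001, 0.01),
    "yield": (0.02, 0.05),       "eSMBH": (0.004, 0.05),
    "tdisk": (0, 1),
}
lower, upper = map(np.array, zip(*ranges.values()))

# "random-cd" optimises a discrepancy criterion (a stand-in for maximin).
sampler = qmc.LatinHypercube(d=17, optimization="random-cd", seed=1)
design = qmc.scale(sampler.random(n=1000), lower, upper)  # 1000 x 17 design
```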




Linking Model to Reality

• We represent the model (Galform) as a function which maps the vector of 17 inputs x to the vector of 11 outputs f(x).

• We use the "Best Input Approach" to link the model f(x) to the real system y (i.e. the real Universe) via:

    y = f(x⁺) + d

  where we define d to be the model discrepancy, and assume that d is independent of f and x⁺.

• Finally, we relate the true system y to the observational data z by

    z = y + e

  where e represents the observational errors.

• We will use the Bayes Linear methodology, which involves only expectations, variances and covariances.




Galform: Emulation

• For each of the 11 outputs we pick active variables xA, then emulate univariately (at first) using:

    fi(x) = Σj βij gij(xA) + ui(xA) + δi(x)

• The term Σj βij gij(xA) is a 3rd-order polynomial in the active inputs.

• ui(xA) is a Gaussian process.

• The nugget δi(x) models the effects of the inactive variables as random noise.

• The ui(xA) have covariance structure given by:

    Cov(ui(xA1), ui(xA2)) = σi² exp[ −|xA1 − xA2|² / θi² ]

• The emulators give the expectation E[fi(x)] and variance Var[fi(x)] at any point x, for each output i = 1, ..., 11, and are fast to evaluate (a sketch follows).
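A minimal univariate sketch of an emulator of this form. For brevity it uses a 1D cubic basis and a plug-in GP fit with fixed hyperparameters, rather than the Bayes Linear update actually used in the talk; all hyperparameter values are illustrative.

```python
# Sketch of a univariate emulator of the form
#   f_i(x) = sum_j beta_ij g_ij(x_A) + u_i(x_A) + delta_i(x).
import numpy as np

def poly_basis(xA):
    """Cubic polynomial basis g_ij in the active input (1D for brevity)."""
    return np.column_stack([np.ones_like(xA), xA, xA**2, xA**3])

def sq_exp_cov(xA1, xA2, sigma2, theta):
    """Cov(u(xA1), u(xA2)) = sigma^2 exp(-|xA1 - xA2|^2 / theta^2)."""
    d2 = (xA1[:, None] - xA2[None, :]) ** 2
    return sigma2 * np.exp(-d2 / theta**2)

def emulate(x_train, f_train, x_new, sigma2=1.0, theta=0.3, nugget=1e-2):
    """Return E[f(x_new)] and Var[f(x_new)] given the training runs."""
    G, Gs = poly_basis(x_train), poly_basis(x_new)
    beta, *_ = np.linalg.lstsq(G, f_train, rcond=None)  # fit polynomial part
    resid = f_train - G @ beta                          # GP models the residual
    K = sq_exp_cov(x_train, x_train, sigma2, theta) + nugget * np.eye(len(x_train))
    Ks = sq_exp_cov(x_new, x_train, sigma2, theta)
    alpha = np.linalg.solve(K, resid)
    mean = Gs @ beta + Ks @ alpha
    var = sigma2 + nugget - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, var
```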


Emulation: a 1D Example

[sequence of figures]




Model Discrepancy

Before calculating the implausibility we need to assess the Model Discrepancy and the Measurement Error:

Model Discrepancy: Var[d] = Φ40 + Φ9 + ΦE

• Φ40: discrepancy term due to choosing the first 40 sub-volumes from the full 512 sub-volumes. We assess this by repeating 100 runs, now choosing 40 random regions. (More advanced treatment: the exchangeable models paper.)

• Φ9: as we initially neglected 9 parameters (on expert advice), we need to assess the effect of this (by running a Latin hypercube design across all 17 parameters).

• ΦE: expert assessment of the model discrepancy of the full model, with all 17 parameters and using all 512 sub-volumes.

It is straightforward to find the multivariate expressions for Φ40 and Φ9, but ΦE requires more careful thought.


Model Discrepancy: Subjective ΦE

• The experts assert that there are clear ways in which the model could be defective:

• The model predicts too many (or too few) galaxies. This would lead to a highly correlated model discrepancy across all outputs.

• The model systematically gets the colours of galaxies wrong: too few (too many) blue galaxies and too many (too few) red galaxies. This gives a negatively correlated model discrepancy between outputs from the differently coloured (bj and K) luminosity graphs.

• We therefore assume the model discrepancy term ΦE has the block form (within-block correlation b, between-block correlation c):

              ( 1  b  ..  c  ..  c )
              ( b  1  ..  c  ..  c )
    ΦE  =  a  ( :  :       :     : )
              ( c  ..  c   1  b  ..)
              ( c  ..  c   b  1  ..)
              ( :      :   :  :    )

• We obtain values for a, b and c from expert assessment (a construction sketch follows).
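A sketch of how this block matrix might be constructed in code. The 6/5 split of the 11 outputs between the bj and K luminosity functions is an assumption for illustration only, as are the values of a, b and c, which in practice come from the elicitation described next.

```python
# Sketch: building the block-structured expert discrepancy matrix Phi_E.
import numpy as np

def phi_E(a, b, c, n_bj=6, n_K=5):
    n = n_bj + n_K
    block = np.full((n, n), c)            # between-block correlation c
    for lo, hi in [(0, n_bj), (n_bj, n)]:
        block[lo:hi, lo:hi] = b           # within-block correlation b
    np.fill_diagonal(block, 1.0)          # unit diagonal
    return a * block

Phi_E = phi_E(a=0.05, b=0.6, c=-0.4)      # 11 x 11; values illustrative
```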


Expert Assessment of ΦE: Elicitation Tool

• We obtain expert assessments of a, b and c using an elicitation tool.



Observational Errors

The observational errors Var[e] are composed of 4 parts:

• Normalisation Error: correlated vertical error on all luminosity output points.

• Luminosity Zero Point Error: correlated horizontal error on all luminosity points.

• k + e Correction Error: outputs have to be corrected for the fact that galaxies are moving away from us at different speeds (their light is red-shifted), and for the fact that galaxies are seen in the past (as their light takes millions of years to reach us).

• Galaxy Production Error: a Poisson process is assumed to describe galaxy production.

The multivariate form for each of these quantities is straightforward(!) to calculate.




Implausibility Measures (Univariate)

We can now calculate the implausibility I(i)(x) at any input parameter point x, for each of the i = 1, ..., 11 outputs. This is given by:

    I²(i)(x) = |E[fi(x)] − zi|² / ( Var[fi(x)] + Var[di] + Var[ei] )

• E[fi(x)] and Var[fi(x)] are the emulator expectation and variance.

• zi are the observed data, and Var[di] and Var[ei] are the (univariate) Model Discrepancy and Observational Error variances.

• Large values of I(i)(x) imply that we are highly unlikely to obtain acceptable matches between model output and observed data at input x.




Implausibility Measures (Univariate)

• We can combine the univariate implausibilities across the 11 outputs by maximising over outputs:

    IM(x) = maxi I(i)(x)

• We can then impose a cutoff IM(x) < cM in order to discard regions of input parameter space that we now deem implausible.

• The choice of cutoff cM is often motivated by Pukelsheim's 3-sigma rule.

• We may simultaneously employ other choices of implausibility measure, e.g. multivariate, second maximum, etc. (a code sketch of these measures follows).
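The two measures transcribe directly into code. A minimal sketch, with the emulator outputs, the data and the variance terms standing in for the quantities defined above:

```python
# Sketch: univariate and maximised implausibility, following the two
# formulas above; all arguments are length-11 arrays.
import numpy as np

def implausibility(E_f, Var_f, z, Var_d, Var_e):
    """I_(i)(x) for each output i."""
    return np.abs(E_f - z) / np.sqrt(Var_f + Var_d + Var_e)

def max_implausibility(E_f, Var_f, z, Var_d, Var_e):
    """I_M(x) = max_i I_(i)(x)."""
    return implausibility(E_f, Var_f, z, Var_d, Var_e).max()

# A point x is retained if I_M(x) < c_M (e.g. c_M = 3, after Pukelsheim).
```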


Multivariate Implausibility Measure

• As we have constructed a multivariate model discrepancy, we can define a multivariate implausibility measure:

    I²(x) = (E[f(x)] − z)ᵀ Var[f(x) − z]⁻¹ (E[f(x)] − z),

  which becomes:

    I²(x) = (E[f(x)] − z)ᵀ (Var[f(x)] + Var[d] + Var[e])⁻¹ (E[f(x)] − z)

• Here Var[f(x)], Var[d] and Var[e] are now the multivariate emulator variance, multivariate model discrepancy and multivariate observational errors respectively (all 11 × 11 matrices).

• We now have two implausibility measures, IM(x) and I(x), that we can use to reduce the input space.

• We impose suitable cutoffs on each measure to define a smaller set of non-implausible inputs (sketch below).
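A corresponding sketch for the multivariate measure, using a Cholesky solve rather than forming an explicit matrix inverse:

```python
# Sketch: multivariate implausibility I(x), with Var_f, Var_d, Var_e as
# 11 x 11 covariance matrices and E_f, z as length-11 vectors.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def multivariate_implausibility(E_f, Var_f, z, Var_d, Var_e):
    resid = E_f - z
    V = Var_f + Var_d + Var_e              # total 11 x 11 variance
    I2 = resid @ cho_solve(cho_factor(V), resid)
    return np.sqrt(I2)
```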


History Matching via Implausibility: a 1D Example

[sequence of figures]




Iterative Refocussing Strategy for Reducing Input Space

We use an iterative strategy to reduce the input parameter space. Denoting the current non-implausible volume by Xj, at each stage or wave we:

1. Design a set of runs over the non-implausible input region Xj.

2. Construct new emulators for f(x), valid only over this region Xj.

3. Evaluate the new implausibility function IM(x) over Xj.

4. Define a new (reduced) non-implausible region Xj+1 by IM(x) < cM, which should satisfy X ⊂ Xj+1 ⊂ Xj.

This algorithm continues until (a) we run out of computational resources, or (b) the emulators are found to be sufficiently accurate compared to the other uncertainties present (model discrepancy and observational errors). A sketch of the loop is given below.
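A pseudocode-style Python sketch of the loop. Here design_runs, run_model, build_emulators, the subset method and the stopping check are hypothetical stand-ins for the wave-by-wave steps above, not real library calls:

```python
# Sketch of the iterative refocussing loop (all helpers hypothetical).
def history_match(initial_space, z, c_M, max_waves=5):
    X_j = initial_space
    for wave in range(max_waves):
        runs = design_runs(X_j)                     # 1. design over X_j
        outputs = [run_model(x) for x in runs]      #    (the expensive step)
        emulators = build_emulators(runs, outputs)  # 2. emulate over X_j only
        def I_M(x):                                 # 3. implausibility over X_j
            return max(em.implausibility(x, z) for em in emulators)
        X_j = X_j.subset(lambda x: I_M(x) < c_M)    # 4. refocus: X_{j+1}
        if emulator_variance_small(emulators):      # stop when accurate enough
            break
    return X_j
```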


History Matching via Implausibility: 1D Example - WAVE 1

[sequence of figures]


History Matching via Implausibility: 1D Example - WAVE 2

[figure]




Why Does Iterative Refocussing Work?

Why do we reduce the space in waves? Why not attempt to do it all at once?

Because that would require an accurate emulator valid over the whole input space.

• In contrast, the iterative approach is far more efficient.

• At each wave the emulators are found to be significantly more accurate (in that Var[f(x)] becomes smaller). This is expected because:

1. We have 'zoomed in' on a smaller part of the function, which will be smoother and most likely easier to fit with low-order polynomials.

2. We have a much higher density of runs in the new volume, and hence the Gaussian process part of the emulator will do more work.

3. We can identify more active variables, leading to more detailed polynomial and Gaussian process parts of the emulator, as the previously dominant variables are now somewhat suppressed.

• This is a major strength of the History Matching approach.




Galform: Visualizing the Non-implausible Volumes Xj

• Plotting the non-implausible region Xj in 1 dimension is trivial, but how do we view a 17-dimensional Xj?

• Even though our emulators are very fast to evaluate, we still cannot cover the 17-dimensional space with a simple grid design.

• We instead use efficient emulator designs, developed specifically for constructing lower-dimensional projections, e.g.:

• 2-Dimensional Projection: for each pair of active variables we evaluate the emulator on a large (2D grid) × (15D Latin hypercube) design; see the sketch below.

• This is a very fast emulator design, as we can take several algebraic shortcuts due to symmetries of the emulator update: approximately 5 times faster.

• Using this, we can now produce 2D Minimised Implausibility Projections and Optical Depth Plots.
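A minimal sketch of the (2D grid) × (15D Latin hypercube) projection design, on the unit cube for brevity (in practice scaled to the input ranges listed earlier); the grid and hypercube sizes are illustrative:

```python
# Sketch: cross a regular 2D grid over one pair of inputs with a Latin
# hypercube over the remaining 15 inputs.
import numpy as np
from scipy.stats import qmc

def projection_design(pair, n_grid=20, n_lhc=200, d=17, seed=0):
    pair = list(pair)
    g = np.linspace(0, 1, n_grid)
    grid = np.array([(a, b) for a in g for b in g])   # 2D grid of points
    lhc = qmc.LatinHypercube(d=d - 2, seed=seed).random(n_lhc)
    rest = [i for i in range(d) if i not in pair]
    X = np.empty((n_grid**2 * n_lhc, d))
    X[:, pair] = np.repeat(grid, n_lhc, axis=0)       # each grid point...
    X[:, rest] = np.tile(lhc, (n_grid**2, 1))         # ...crossed with the LHC
    return X
```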




2D Minimised Implausibility Projections: Wave 1

• Minimised Implausibility Projections: at each 2D grid point, minimise the implausibility IM(x) over the 15D hypercube.

• If a point on these plots is implausible (coloured red), then it will be implausible for any choice of the 15 other inputs.

• If a point is green, it may or may not prove to be an acceptable input.


2D Min Implausibility Projections: Wave 1 to Wave 4 (0.12%)

[sequence of figures, wave 1 through wave 4]




2D Optical Depth Plots: Wave 2

• Optical Depth Plots: at each 2D grid point, plot the proportion of the 15D Latin hypercube points that survive the cutoff IM(x) < cM.

• These plots show the 'depth' of the non-implausible volume Xj for wave j at each grid point.

• They show where the majority of non-implausible points can be found, but not necessarily where the best matches are (computation sketched below).
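Given IM evaluated over the projection design from the earlier sketch, both plot types reduce to simple summaries at each grid point. A minimal sketch:

```python
# Sketch: minimised implausibility and optical depth at each 2D grid
# point, from I_M values ordered as (grid points x LHC points).
import numpy as np

def projection_summaries(I_vals, n_grid, n_lhc, c_M=3.0):
    I = I_vals.reshape(n_grid**2, n_lhc)
    min_proj = I.min(axis=1)                 # minimised implausibility
    optical_depth = (I < c_M).mean(axis=1)   # fraction surviving the cutoff
    return (min_proj.reshape(n_grid, n_grid),
            optical_depth.reshape(n_grid, n_grid))
```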


2D Optical Depth Plots: Wave 1 to Wave 4

[sequence of figures, wave 1 through wave 4]


2D Implausibility Projections: Stage 4 (0.12%)

[figure]




Visualisation: Fast Approximate Emulators

The emulation approach we have taken, that of building substantial structure into the polynomial part of the emulator, allows the construction of fast approximations to the emulators.

• If we take only the regression part of the wave 4 emulator:

    fi(x) = Σj βij gij(xA) + wi,

  and assume Var[wi] = α² (σui² + σδi²) for some conservative choice of α > 1, we can then use this fast approximate emulator to screen all candidate input points (sketch below).

• Using this method we were able to reduce the input space to 1.2% of its original volume.

• We then only had to use the full, slower wave 1 to 4 emulators on the surviving points. This was used to generate the higher-dimensional figures.
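A sketch of the screening step, reusing the poly_basis and fitted beta from the earlier emulator sketch; the values of alpha and the cutoff are illustrative:

```python
# Sketch: screening with the regression-only fast approximation.
import numpy as np

def fast_screen(x_cand, beta, z_i, Var_d, Var_e,
                sigma2_u, sigma2_d, alpha=1.5, c=3.0):
    """Return a mask of candidate points that survive the fast emulator."""
    E_f = poly_basis(x_cand) @ beta           # regression part only
    Var_w = alpha**2 * (sigma2_u + sigma2_d)  # inflated residual variance
    I = np.abs(E_f - z_i) / np.sqrt(Var_w + Var_d + Var_e)
    return I < c                              # pass survivors to full emulators
```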


3D Minimised Implausibility and Optical Depth Plots

• 3D projections created using the Fast Approximate Emulator approach.



4-Dimensional Implausibility Plots: Anyone?

[figures]




Wave 5 runs

[figure]


bj Luminosity Output of Waves 1, 2, 3 and 5

[sequence of figures]




Conclusions and Further Issues

• History Matching using iterative refocussing is a very efficient technique for learning about the set of acceptable inputs X.

• We have developed novel methods to visualise X that exploit the structure of the emulators.

• We now have a large set of acceptable (Wave 5) runs that can be analysed by the cosmologists, and used to explore other features of Galform.

• Future work: beginning to explore the more advanced Galform 2 model.

Bower, R., Vernon, I., Goldstein, M., et al. (2009), "The Parameter Space of Galaxy Formation", to appear in MNRAS; MUCM Technical Report 10/02.

Vernon, I., Goldstein, M. and Bower, R. (2010), "Galaxy Formation: a Bayesian Uncertainty Analysis", invited discussion paper for Bayesian Analysis; MUCM Technical Report 10/03.

— History Matching is now available in the MUCM toolkit: Release 6 —


References

Craig, P. S., Goldstein, M., Seheult, A. H. and Smith, J. A. (1997). Pressure matching for hydrocarbon reservoirs: a case study in the use of Bayes linear strategies for large computer experiments (with discussion). In Case Studies in Bayesian Statistics, vol. III, eds. C. Gatsonis et al., 37-93. Springer-Verlag.

Goldstein, M. and Rougier, J. C. (2008). Reified Bayesian modelling and inference for physical systems (with discussion). Journal of Statistical Planning and Inference, to appear.

Kennedy, M. C. and O'Hagan, A. (2001). Bayesian calibration of computer models (with discussion). Journal of the Royal Statistical Society, Series B, 63, 425-464.

Santner, T., Williams, B. and Notz, W. (2003). The Design and Analysis of Computer Experiments. Springer-Verlag: New York.

Bower, R. G., Benson, A. J., et al. (2006). The broken hierarchy of galaxy formation. Monthly Notices of the Royal Astronomical Society, 370, 645-655.
