Arabidopsis - MUCM

Emulation, Efficient History Matching and Design for Systems Biology Models

Ian Vernon ∗

Department of Mathematical Sciences

Durham University.

∗ Work in collaboration with Michael Goldstein (Dept of Mathematical Sciences), Junli Liu and Keith Lindsey (Dept of Biological Sciences, Durham University), with funding from an EPSRC Impact award and **MUCM**.

Overview

• Systems Biology: deterministic and stochastic models used

• **Arabidopsis** plant hormone model

• Is the model consistent with observations: Iterative History Matching

• Designing future experiments using implausibility

• Generalisations to stochastic systems biology models

Systems Biology: **Arabidopsis**

• Small flowering plant related to cabbage and mustard.

• One of the model organisms used for studying plant biology, and the first plant to have its entire genome sequenced.

• Changes in the plant are easily observed, which makes it a very useful model organism.

Hormonal Crosstalk in **Arabidopsis**

• There is much interest in understanding the processes involved in root growth in **Arabidopsis**.

• The long-term aim is to improve root growth, leading to sturdier, taller plants with greater yields.

• Junli Liu proposed a more complex model of Hormonal Crosstalk in **Arabidopsis**, involving hormones thought to be responsible for root growth.

• Although the model is published, it is at an early stage of development.

• Fundamental scientific questions remain about whether the model exhibits certain modes of behaviour observed in the real system.

• These questions can be framed in terms of history matching.

• We have funding for future experiments, so we discuss designing experiments using history-matching-based criteria.

Slides describing model inputs and outputs.

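• The model has 12 outputs (chemical concentrations) and 32 inputs (reaction rate parameters), including 1 control input.

| Chemical output | Initial concentration | Measurable |
|---|---|---|
| Auxin | 0.1 | Yes |
| X | 0.1 | |
| PLSp | 0.1 | Yes |
| Ra | 0 | |
| Ra star | 1 | |
| CK | 0.1 | Yes |
| ET | 0.1 | Yes |
| PLSm | 0.1 | |
| Re | 0 | |
| Re star | 0.3 | |
| CTR1 | 0 | |
| CTR1 star | 0.3 | |
| IAA | 0 | |
| cytokinin | 0 | |
| ACC | 0 | |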

Reaction Network Model

32 Reaction Rates: 32 Input parameters

• k6 is the control input: k6 = 0.3 corresponds to the wild type, k6 = 0 to the mutant (gene removed) and k6 = 0.45 to the super mutant.

| Input | min | max | Input | min | max |
|---|---|---|---|---|---|
| k1 | 0.01 | 100 | k1a | 0.01 | 100 |
| k2 | 0.002 | 20 | k2a | 0.028 | 280 |
| k2b | 0.01 | 100 | k2c | 1×10⁻⁴ | 1 |
| k3 | 0.02 | 200 | k3a | 0.0045 | 45 |
| k4 | 0.01 | 100 | k5 | 0.01 | 100 |
| k6 | 0.3 | 0.3 | k6a | 0.002 | 20 |
| k7 | 0.01 | 100 | k8 | 0.01 | 100 |
| k9 | 0.01 | 100 | k10 | 3×10⁻⁶ | 0.03 |
| k10a | 0.005 | 50 | k11 | 0.05 | 500 |
| k12 | 0.001 | 10 | k12a | 0.001 | 10 |
| k13 | 0.01 | 100 | k14 | 0.03 | 300 |
| k15 | 8.5×10⁻⁴ | 8.5 | k16 | 0.003 | 30 |
| k16a | 0.01 | 100 | k17 | 0.001 | 10 |
| k18 | 0.001 | 10 | k18a | 0.01 | 100 |
| k19 | 0.01 | 100 | k1vauxin | 0.01 | 100 |
| k1vCK | 0.01 | 100 | k1veth | 0.01 | 100 |

Measurements of root hormone level.

Early stage Model

• Fundamental scientific questions remain about whether the model exhibits certain modes of behaviour observed in the real system.

• Plant biologists can adjust the real system in 4 ways:

1. Remove a gene: reduce input k6 to zero.

2. Feed the plant auxin: increase initial condition IAA.

3. Feed the plant cytokinin: increase initial condition cytokinin.

4. Feed the plant ethylene: increase initial condition ACC.

• We can measure changes in Auxin, ET, CK and PLSp in response to these adjustments.

Modes of Behaviour: Observed Trends

• Knock out gene: observe auxin (going down to 0.14um), ethylene (almost unchanged, within 10%) and CK (going up by approx. 42%).

• Feed plant auxin: observe auxin up, ethylene up, cytokinin down, PLSp up (all by approx. 200 to 400%).

• Feed plant cytokinin: auxin up, ethylene up, cytokinin down, PLSp down.

• Feed plant ethylene: auxin down, ethylene up, cytokinin up, PLSp down.

• All these modes of behaviour can be represented within implausibility measures.

Plots of outputs

Fundamental Scientific Questions

Fundamental scientific questions:

1. Are there any choices of rate parameters consistent with observed trends?

2. Can we identify the set X of all such input or rate parameters?

3. What design of future experiments will reduce this set X, and hence resolve uncertainty about the rate parameters?

• Questions 1 and 2 are answered using an iterative history match.

• Question 3 requires a design based on history matching (space cut out) criteria.

Linking Model to Reality

• We represent the model as a function that maps the vector of 32 inputs $x$ to the vector of 20 outputs $f(x)$.

• We use the “Best Input Approach” to link the model $f(x)$ to the real system $y$ (i.e. the real plant) via:

$$y = f(x^*) + d$$

where $d$ is the model discrepancy, assumed to be independent of $f$ and $x^*$.

• Finally, we relate the true system $y$ to the observational data $z$ by

$$z = y + e$$

where $e$ represents the observational errors.

• We use the Bayes linear methodology, which only involves expectations, variances and covariances.

**Arabidopsis**: Emulation

• For each of the 20 outputs we pick active variables $x_A$, then emulate univariately (at first) using:

$$f_i(x) = \sum_j \beta_{ij}\, g_{ij}(x_A) + u_i(x_A) + \delta_i(x)$$

• The term $\sum_j \beta_{ij}\, g_{ij}(x_A)$ is a 3rd order polynomial in the active inputs.

• $u_i(x_A)$ is a Gaussian process.

• The nugget $\delta_i(x)$ models the effects of inactive variables as random noise.

• The $u_i(x_A)$ have covariance structure given by:

$$\mathrm{Cov}\big(u_i(x_{A,1}),\, u_i(x_{A,2})\big) = \sigma_i^2 \exp\!\big[-|x_{A,1} - x_{A,2}|^2 / \theta_i^2\big]$$

• The emulators give the expectation $\mathrm{E}[f_i(x)]$ and variance $\mathrm{Var}[f_i(x)]$ at point $x$ for each output $i = 1, \ldots, 20$, and are fast to evaluate.
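A minimal sketch of an emulator of this form for a single output and a single active input may help fix ideas. It is an illustration under simplifying assumptions, not the emulator used in the talk: the regression coefficients are fitted by ordinary least squares and then treated as known, and the values of `sigma2`, `theta` and `nugget` are supplied by hand rather than estimated. All function names are hypothetical.

```python
import numpy as np

def sq_exp_cov(xa1, xa2, sigma2, theta):
    """Cov(u(xa1), u(xa2)) = sigma2 * exp(-|xa1 - xa2|^2 / theta^2)."""
    d2 = np.sum((xa1[:, None, :] - xa2[None, :, :]) ** 2, axis=-1)
    return sigma2 * np.exp(-d2 / theta**2)

def cubic_basis(xa):
    """Regression terms g_j(x_A) = 1, x, x^2, x^3 for one active input."""
    x = xa[:, 0]
    return np.column_stack([np.ones_like(x), x, x**2, x**3])

def fit_emulator(xa_train, f_train, sigma2, theta, nugget):
    # Assumption: plug-in least-squares estimate of the beta coefficients.
    G = cubic_basis(xa_train)
    beta, *_ = np.linalg.lstsq(G, f_train, rcond=None)
    # Covariance of the training runs: correlated part u plus nugget delta.
    K = sq_exp_cov(xa_train, xa_train, sigma2, theta) + nugget * np.eye(len(f_train))
    alpha = np.linalg.solve(K, f_train - G @ beta)
    return {"xa": xa_train, "beta": beta, "K": K, "alpha": alpha,
            "sigma2": sigma2, "theta": theta, "nugget": nugget}

def predict(em, xa_new):
    """Return the adjusted expectation and variance of f at the new points."""
    k = sq_exp_cov(xa_new, em["xa"], em["sigma2"], em["theta"])
    mean = cubic_basis(xa_new) @ em["beta"] + k @ em["alpha"]
    reduction = np.sum(k * np.linalg.solve(em["K"], k.T).T, axis=1)
    var = em["sigma2"] + em["nugget"] - reduction
    return mean, np.maximum(var, 0.0)
```

Close to the training runs the variance shrinks towards the nugget, while far from them it returns to `sigma2 + nugget`, which is the behaviour the implausibility measure relies on.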

Implausibility Measures (Univariate)

We can now calculate the implausibility $I_{(i)}(x)$ at any input parameter point $x$ for each of the $i = 1, \ldots, 20$ outputs. This is given by:

$$I_{(i)}^2(x) = \frac{|\mathrm{E}[f_i(x)] - z_i|^2}{\mathrm{Var}[f_i(x)] + \mathrm{Var}[d_i] + \mathrm{Var}[e_i]}$$

• $\mathrm{E}[f_i(x)]$ and $\mathrm{Var}[f_i(x)]$ are the emulator expectation and variance.

• $z_i$ are the observed data, and $\mathrm{Var}[d_i]$ and $\mathrm{Var}[e_i]$ are the (univariate) model discrepancy and observational error variances.

• Large values of $I_{(i)}(x)$ imply that we are highly unlikely to obtain acceptable matches between model output and observed data at input $x$.

• Small values of $I_{(i)}(x)$ do not imply that $x$ is good!
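As a small numerical illustration, the univariate implausibility can be written as a one-line function. The function name and the numbers in the example are made up for this sketch.

```python
import numpy as np

def implausibility(em_mean, em_var, z, var_disc, var_obs):
    """Univariate implausibility: |E[f(x)] - z| scaled by the total standard deviation."""
    return np.abs(em_mean - z) / np.sqrt(em_var + var_disc + var_obs)

# Emulator expectation 1.0, observation 2.0, total variance 0.04 + 0.03 + 0.02 = 0.09,
# so the distance of 1.0 is measured in units of a standard deviation of 0.3.
I_example = implausibility(em_mean=1.0, em_var=0.04, z=2.0, var_disc=0.03, var_obs=0.02)
```

Note that a small value can arise either because the emulator expectation is close to the data or because the emulator variance is large, which is why a small implausibility does not certify that the input is good.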

Implausibility Measures (Univariate)

• We can combine the univariate implausibilities across the 20 outputs by maximizing over outputs:

$$I_M(x) = \max_i I_{(i)}(x)$$

• We can then impose a cutoff $I_M(x) < c_M$ in order to discard regions of input parameter space that we now deem to be implausible.

• The choice of cutoff $c_M$ is often motivated by Pukelsheim’s 3-sigma rule.

• We may simultaneously employ other choices of implausibility measure: e.g. multivariate, second/third maximum etc.
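The maximum and second-maximum measures can be sketched as follows, assuming the univariate implausibilities have already been collected in an array with one row per input point and one column per output. The array values and the cutoff of 3 are illustrative only.

```python
import numpy as np

def max_implausibility(I, order=1):
    """I has shape (n_points, n_outputs); order=1 is the maximum, order=2 the second maximum."""
    return np.sort(I, axis=1)[:, -order]

I = np.array([[0.5, 2.0, 3.5],
              [1.0, 2.5, 2.9]])
c_M = 3.0
I_M = max_implausibility(I)             # largest implausibility per input point
I_2M = max_implausibility(I, order=2)   # second largest, less sensitive to one poor emulator
non_implausible = I_M < c_M             # the first point is discarded, the second is kept
```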

Iterative Refocussing Strategy for Reducing Input Space.

We use an iterative strategy to reduce the input parameter space. Denoting the current non-implausible volume by $X_j$, at each stage or wave we:

1. Design a set of runs over the non-implausible input region $X_j$.

2. Construct new emulators for $f(x)$ only over this region $X_j$.

3. Evaluate the new implausibility function $I_M(x)$ over $X_j$.

4. Define a new (reduced) non-implausible region $X_{j+1}$ by $I_M(x) < c_M$, which should satisfy $X \subset X_{j+1} \subset X_j$.

This algorithm is continued until a) we run out of computational resources, or b) the emulators are found to be of sufficient accuracy compared to the other uncertainties present (model discrepancy and observational errors).
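The four steps can be sketched on a toy problem. Everything here is an assumption made for illustration: `toy_model` is a made-up two-input function standing in for the simulator, the "emulator" is a plain quadratic least-squares fit whose mean squared residual plays the role of the emulator variance, and the region $X_j$ is sampled by simple rejection.

```python
import numpy as np

def toy_model(x):
    # Hypothetical stand-in for the simulator f(x): two inputs in [0, 1]^2, one output.
    return np.sin(3.0 * x[:, 0]) + x[:, 1] ** 2

def basis(x):
    # Quadratic regression terms used by the stand-in emulator.
    return np.column_stack([np.ones(len(x)), x[:, 0], x[:, 1],
                            x[:, 0] ** 2, x[:, 1] ** 2, x[:, 0] * x[:, 1]])

def history_match(z, var_disc, var_obs, c_M=3.0, n_waves=3, n_runs=200, seed=0):
    rng = np.random.default_rng(seed)
    waves = []  # (beta, emulator variance) for each completed wave

    def non_implausible(x):
        # A point is kept only if it passes the cutoff of every wave so far.
        keep = np.ones(len(x), dtype=bool)
        for beta, em_var in waves:
            I = np.abs(basis(x) @ beta - z) / np.sqrt(em_var + var_disc + var_obs)
            keep &= I < c_M
        return keep

    def sample_region(n):
        # Rejection sampling from the current non-implausible region X_j.
        pts = np.empty((0, 2))
        while len(pts) < n:
            cand = rng.uniform(size=(20 * n, 2))
            pts = np.vstack([pts, cand[non_implausible(cand)]])
        return pts[:n]

    test_points = rng.uniform(size=(20000, 2))
    fractions = []  # estimated volume of X_{j+1} relative to the original space
    for _ in range(n_waves):
        x = sample_region(n_runs)                              # step 1: design over X_j
        f = toy_model(x)
        beta, *_ = np.linalg.lstsq(basis(x), f, rcond=None)    # step 2: new emulator
        em_var = np.mean((f - basis(x) @ beta) ** 2)
        waves.append((beta, em_var))                           # steps 3 and 4: new cutoff
        fractions.append(non_implausible(test_points).mean())
    return fractions

z_obs = toy_model(np.array([[0.3, 0.6]]))[0]
fractions = history_match(z_obs, var_disc=1e-4, var_obs=1e-4)
```

Because each wave only adds a further cutoff, the retained fraction can never increase from one wave to the next, mirroring the nesting $X_{j+1} \subset X_j$.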

History Matching to Observed Trends: some details

• The model is a set of differential equations, so it is fast to run: we generated 2000 runs per wave using a maximin Latin hypercube design.

• Emulate the log of the measurable outputs Auxin, ET, CK and PLSp (concentrations > 0).

• Use third-order polynomials in the inverse of the inputs (rate parameters > 0).

• Construct implausibility measures that capture the above modes of behaviour and include Junli’s subjective judgements in the model discrepancy.

• We performed 3 waves and, after the final wave, performed a large set of runs to obtain a set of 200 acceptable runs.
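One simple way to approximate a maximin Latin hypercube design is to generate many random Latin hypercubes and keep the one with the largest minimum pairwise distance. The sketch below does this on the unit cube. It is an assumed construction for illustration, not necessarily the one used for the 2000-run designs, and the dense distance matrix makes it suitable only for small designs. The function names are hypothetical, and the mapping from the unit cube to the rate parameter ranges is left out.

```python
import numpy as np

def latin_hypercube(n, d, rng):
    # One point per stratum in every dimension, jittered within its cell.
    perms = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (perms + rng.uniform(size=(n, d))) / n

def min_pairwise_distance(x):
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(x), k=1)].min()

def maximin_lhs(n, d, n_candidates=50, seed=0):
    """Best of n_candidates random Latin hypercubes under the maximin criterion."""
    rng = np.random.default_rng(seed)
    candidates = [latin_hypercube(n, d, rng) for _ in range(n_candidates)]
    return max(candidates, key=min_pairwise_distance)

design = maximin_lhs(n=20, d=3)  # 20 runs in 3 inputs, each column stratified into 20 cells
```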

History Matching Plots

Designing New Experiments

• We have now found several runs that belong to the set X, consistent with observed trends.

• We have funding for 4 additional experiments: we want to choose these to maximise space reduction (to reduce the size of X).

• New experiments are formed from a combination of plant type (wild type, mutant or super mutant), chemical measured (Auxin, Ethylene, Cytokinin or PLSp) and feeding regime (no feed, feed Auxin, feed Ethylene, feed Cytokinin, or any feeding combination).

• This totals a list of (no. plant types) x (no. chemicals) x (no. feeding regimes) - 20 = 3 x 4 x 8 - 20 = 76 new experiments.

• We will select these based on an expected space reduction criterion, using implausibility measures.
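The count of candidate experiments can be checked by enumerating the combinations directly. The labels below are taken from the slide. The 8 feeding regimes are taken to be all subsets of the three feeds, and the subtraction of 20 is copied from the slide's arithmetic without identifying which 20 combinations are excluded.

```python
from itertools import combinations, product

plants = ["wild type", "mutant", "super mutant"]
chemicals = ["Auxin", "Ethylene", "Cytokinin", "PLSp"]
feeds = ["Auxin", "Ethylene", "Cytokinin"]

# All subsets of the three feeds, from "no feed" to all three together: 2^3 = 8 regimes.
regimes = [combo for r in range(len(feeds) + 1) for combo in combinations(feeds, r)]

candidates = list(product(plants, chemicals, regimes))
n_all = len(candidates)   # 3 * 4 * 8 combinations in total
n_new = n_all - 20        # the slide removes 20 of these from the candidate list
```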

Space Cut Out Criteria

• Consider the implausibility measure for a future measurement $z_i$:

$$I_{(i)}^2(x) = \frac{|\mathrm{E}[f_i(x)] - z_i|^2}{\mathrm{Var}[f_i(x)] + \mathrm{Var}[d_i] + \mathrm{Var}[e_i]}$$

• We will cut out $x$ from further analysis if $I_{(i)}(x) > c_M$ as before, but now $z_i$ is a random quantity.

• Given $z_i$, define the indicator function $I_i(x, z_i)$ such that

$$I_i(x, z_i) = \begin{cases} 1 & \text{if } I_{(i)}(x) > c_M \quad (x \text{ cut out}) \\ 0 & \text{if } I_{(i)}(x) < c_M \quad (x \text{ not cut out}) \end{cases}$$

• For given $z_i$, the fraction of space cut out $S_i$ due to output $i$ is:

$$S_i(z_i) = \frac{1}{V_X} \int_{x \in X} I_i(x, z_i)\, dx$$
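For a fixed value of the future measurement, the fraction of space cut out can be estimated by averaging the indicator over a sample of points from X. The sketch below assumes the emulator expectation and variance have already been evaluated at such a sample; the function name and the numbers are illustrative.

```python
import numpy as np

def space_cut_out(em_mean, em_var, z, var_disc, var_obs, c_M=3.0):
    """Fraction of sampled points x in X with implausibility above the cutoff for this z."""
    I = np.abs(em_mean - z) / np.sqrt(em_var + var_disc + var_obs)
    return np.mean(I > c_M)

# Five sample points with emulator expectations 0..4 and total variance 1:
# against z = 0 the implausibilities are 0, 1, 2, 3, 4, so only the last point exceeds 3.
S_example = space_cut_out(np.arange(5.0), em_var=0.0, z=0.0, var_disc=0.5, var_obs=0.5)
```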

Space Cut Out Criteria

• Given the best input $x^*$, and distributional assumptions for $z_i$, we have that:

$$z_i \mid x^* \sim N\big(\mu_i(x^*),\ \sigma_i^2(x^*) + \mathrm{Var}[d_i] + \mathrm{Var}[e_i]\big)$$

with $\mu_i(x^*) = \mathrm{E}[f_i(x = x^*)]$ and $\sigma_i^2(x^*) = \mathrm{Var}[f_i(x = x^*)]$.

• Therefore the expected space cut out $S_i$ given $x^*$ is

$$\mathrm{E}[S_i \mid x^*] = \frac{1}{V_X} \int_{z_i} \int_{x \in X} I_i(x, z_i)\, \pi(z_i \mid x^*)\, dx\, dz_i$$

• and the expected space cut out $S_i$ for new output $i$ is

$$\mathrm{E}[S_i] = \frac{1}{V_X^2} \int_{x^* \in X} \int_{z_i} \int_{x \in X} I_i(x, z_i)\, \pi(z_i \mid x^*)\, dx\, dz_i\, dx^*$$

• We choose output $i$ to maximise $\mathrm{E}[S_i]$.

• In fact we want to choose 4 outputs $i, j, k, l$ such that the analogous expected space cut out $\mathrm{E}[S_{i,j,k,l}]$ is maximised.

Approximate Space Cut Out Criteria

• Integrals are expensive, so we use the set of n_a acceptable runs x_j, j = 1, ..., n_a, where x_j ∈ X, to approximate the integrals.

• In which case E[S_i] becomes

\[ E[S_i] \approx \frac{1}{n_a^2 \, n_{\mathrm{sim}}} \sum_{k=1}^{n_a} \sum_{a=1}^{n_{\mathrm{sim}}} \sum_{j=1}^{n_a} I_i(x_j, z_i^a) \]

• where we approximate the z_i integral by simulating n_sim draws of z_i from π(z_i | x*_k) for each x*_k. Can do analytically in some cases.

• Should really do this using emulators, but for this calculation the runs may

be sufficient.

• This is because the runs would inform the most important parts of the

integrals.

• Again, we are interested in the analogous multivariate quantity E[S_{i,j,k,l}].

36 / 49
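The triple sum for E[S_i] can be sketched in Python as below. This is only an illustration under stated assumptions, not the calculation used in the talk: the function name `expected_space_cut_out` is hypothetical, the draws of z_i are assumed to be normal around the run output with a single combined standard deviation `sigma` (model discrepancy plus observation error), and the indicator I_i is assumed to equal 1 when the implausibility of x_j exceeds a cutoff of 3.

```python
import numpy as np

def expected_space_cut_out(f_runs, sigma, n_sim=200, cutoff=3.0, rng=None):
    """Monte Carlo estimate of E[S_i] for one candidate output i.

    f_runs : shape (n_a,), output i evaluated at the n_a acceptable runs x_j.
    sigma  : ASSUMED combined sd of model discrepancy and observation error.
    cutoff : ASSUMED implausibility threshold above which a run is cut out.
    """
    rng = np.random.default_rng() if rng is None else rng
    f_runs = np.asarray(f_runs, dtype=float)
    n_a = f_runs.size
    # z[k, a]: a-th simulated observation, treating run x_k as the true input x*_k.
    # ASSUMPTION: pi(z_i | x*_k) is normal with mean f_i(x_k) and sd sigma.
    z = f_runs[:, None] + sigma * rng.standard_normal((n_a, n_sim))
    # implaus[k, a, j] = |z[k, a] - f_i(x_j)| / sigma
    implaus = np.abs(z[:, :, None] - f_runs[None, None, :]) / sigma
    # Indicator I_i(x_j, z_i^a): True where run x_j is ruled out by z[k, a].
    cut = implaus > cutoff
    # Mean over all n_a * n_sim * n_a entries, i.e. the 1 / (n_a^2 n_sim) factor.
    return cut.mean()
```

An output whose values barely vary across the acceptable runs gives an estimate near zero, since no simulated measurement can rule out many runs; an output that varies strongly relative to `sigma` gives an estimate near one.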

History Matching Plots Plus New Outputs

37 / 49

Predictions for New Outputs

38 / 49

Space Cut Out Criteria for New Outputs

39 / 49

Predictions for New Outputs

47 / 49

Results

• Selected outputs by stepping up to 8 outputs, then back down to 4.

• Sensitivity analysis: repeated the calculation with high and low model discrepancy and observational errors; the same outputs were chosen in both cases.

• Experiments chosen, with total space reduction in brackets:

1. Super mutant, feeding auxin + ethylene, measuring PLSp (56%)

2. Super mutant, feeding auxin + cytokinin, measuring PLSp (82%)

3. Super mutant, feeding ethylene, measuring Auxin (91%)

4. Super mutant, feeding auxin + ethylene, measuring Cytokinin (94%)

• Seeds and equipment ordered, experiments are starting soon.

• The plan is to complete the experiments by early August, in time to analyse the results and present them at the ICSB 2012 conference.

48 / 49
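The slides do not say exactly how the stepping up to 8 outputs and back down to 4 was carried out. One plausible reading is greedy forward selection followed by greedy backward elimination on the joint space cut out, sketched below. Everything here is an assumption made for illustration: the function names, the boolean array `cut` (indexed by output, simulated draw and acceptable run), and the rule that a run is cut out if at least one selected output rules it out.

```python
import numpy as np

def joint_cut_out(cut, subset):
    """Fraction of (draw, run) pairs ruled out by at least one output in subset.

    cut : boolean array, shape (n_outputs, n_draws, n_runs); hypothetical input
          holding the indicator for each candidate output.
    """
    return cut[list(subset)].any(axis=0).mean()

def step_up_then_down(cut, n_up=8, n_keep=4):
    """Greedy forward selection to n_up outputs, then backward elimination to n_keep."""
    n_outputs = cut.shape[0]
    n_up = min(n_up, n_outputs)
    chosen = []
    # Step up: add the output that most increases the joint space cut out.
    while len(chosen) < n_up:
        remaining = [i for i in range(n_outputs) if i not in chosen]
        best = max(remaining, key=lambda i: joint_cut_out(cut, chosen + [i]))
        chosen.append(best)
    # Step down: drop the output whose removal loses the least space cut out.
    while len(chosen) > n_keep:
        drop = max(
            chosen,
            key=lambda i: joint_cut_out(cut, [c for c in chosen if c != i]),
        )
        chosen.remove(drop)
    return chosen
```

Stepping past the target size and then back down gives the search a chance to discard an early pick that later additions have made redundant, which a purely forward search cannot do.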

Concluding Comments

• All these calculations are designed to be efficient: approximations used are

very beneficial.

• We are seeking a good set of new experiments, not necessarily the

theoretical best (which we wouldn’t believe anyway).

• We have chosen experiments to learn more about the rate parameters

given certain tolerances on the model discrepancy.

• It is possible but unlikely that the model will be challenged by these measurements. To challenge the model, which we hope to do, we would need a different design criterion.

• All these techniques directly generalise to the case of stochastic models.

49 / 49