f(x) - MUCM


Metamodeling and sensitivity analysis for spatial outputs of computer codes

Bertrand Iooss

EDF R&D

In collaboration with Amandine Marrel (IFP),

Béatrice Laurent (INSA Toulouse)

and Elena Volkova (Kurchatov Institute)

2010

13/07/2010


EDF = Électricité de France

Research & Development – Department of Industrial Risk Management

Research group:

Reliability components and uncertainty quantification of systems

B. Iooss – UCM Conference - 14/07/10 2



• Large database of component failures: CLASSICAL STATISTICS

• No observed failures, physical model: SIMULATION UNCERTAINTY

• Small database of system failures, expert judgment: BAYESIAN STATISTICS

Examples:

… Nuclear reactor accident scenario

… Dam failure

… Radioactive waste repository



Initial motivation: uncertainty in hydrogeological modeling

Computation of the spatio-temporal evolution of 90Sr concentration in an old radwaste disposal site between 2002 and 2010

Goal: Estimate the contamination impact on the environment

Identify the inputs that most influence the predicted outputs

Numerical modeling:

Hydrogeological transport scenario (Darcy's law) of 90Sr with the MARTHE software

p = 20 uncertain inputs X: permeability, dispersivity, Kd, infiltration intensity, …

Output of interest: concentration values Y

30 min / run

N = 300 Monte Carlo runs are available from a previous study [ Volkova et al. 2008 ]




Four main problems in sensitivity analysis practice

The model: Y = f(X_1, …, X_p)

[Diagram: X (D_X) → f → Y (D_Y)]

Sensitivity analysis studies how perturbations of an input X_i generate perturbations of Y

P1) f(.) is a complex function: irregular phenomena, interactions, variations that are neither linear nor monotonic, threshold effects, …

P2) f(.) is a costly simulator (from several minutes to several days per evaluation)

P3) p is large: p > 10, p > 100, …

P4) Y is not a single scalar variable but a high-dimensional vector or a function


Scalar outputs: sensitivity analysis using Sobol' indices

Functional ANOVA [ Efron & Stein 1981 ] (hypothesis: the X_i are independent):

$\mathrm{Var}(Y) = \sum_{i=1}^{p} V_i(Y) + \sum_{i<j}^{p} V_{ij}(Y) + \dots + V_{12 \dots p}(Y)$, with $V_i = \mathrm{Var}_{X_i}[E(Y \mid X_i)]$

Sobol' indices:

$S_i = \frac{V_i}{\mathrm{Var}(Y)}$ (1st order) ; $S_{ij} = \frac{V_{ij}}{\mathrm{Var}(Y)}$ (2nd order) ; ... ; $S_{Ti} = S_i + \sum_{j} S_{ij} + \sum_{j,k} S_{ijk} + \dots$ (total index)

Estimation of S_i and S_Ti by

– Monte Carlo [Sobol 1993, Saltelli 2002] (high CPU cost), quasi-Monte Carlo, FAST [Saltelli et al. 1999]: cost N > 100 p

– Approximation of the computer code by a metamodel (building cost N ~ 10 p). Then: analytical formulations [using Gaussian process: Oakley & O'Hagan 2004] or Monte Carlo applied on the metamodel

Problems addressed: P1), P2)
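The Monte Carlo route can be sketched as follows. This is a minimal illustration, assuming a "pick-freeze" scheme with two independent input samples A and B; the estimator formulas are one common choice and are not necessarily those used in the talk. The function `toy_model`, the uniform inputs on [0, 1] and all names are hypothetical.

```python
import numpy as np

def sobol_indices(model, p, n, rng):
    """Pick-freeze estimates of first-order and total Sobol' indices.

    model: maps an (n, p) array of independent U[0, 1] inputs to an (n,) array.
    Cost: n * (p + 2) model evaluations.
    """
    A = rng.random((n, p))
    B = rng.random((n, p))
    yA, yB = model(A), model(B)
    var_y = np.var(np.concatenate([yA, yB]))
    S, ST = np.empty(p), np.empty(p)
    for i in range(p):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                      # A with column i taken from B
        yABi = model(ABi)
        S[i] = np.mean(yB * (yABi - yA)) / var_y         # estimates V_i / Var(Y)
        ST[i] = 0.5 * np.mean((yA - yABi) ** 2) / var_y  # estimates the total index
    return S, ST

def toy_model(x):
    # Hypothetical additive test function: exact S_i = a_i^2 / sum(a^2) = 0.2, 0.8, 0
    return 1.0 * x[:, 0] + 2.0 * x[:, 1] + 0.0 * x[:, 2]

S, ST = sobol_indices(toy_model, p=3, n=100_000, rng=np.random.default_rng(0))
```

For an additive function such as `toy_model`, first-order and total indices coincide, which gives a simple sanity check of the estimator. Applied directly to a simulator that takes 30 min per run, this sample size would be out of reach, which is the motivation for the metamodel route.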


Gaussian process (Gp) metamodel

Kriging principles applied to computer codes [ Matheron, 1963; Sacks et al., 1989 ]

Definition: $Y(x) = f(x) + Z(x)$

Regression part (polynomial of degree 1): $f(x) = \beta F(x)$

Stochastic part: stationary Gaussian process $Z$ (variance $\sigma^2$) with correlation function

$R(u - v) = \exp\left( -\sum_{i=1}^{p} \theta_i \, |u_i - v_i|^{q_i} \right)$

Learning sample of N code simulations: $X_{LS} = (x^{(1)}, \dots, x^{(N)})$, $Y_{LS} = f(X_{LS})$

Gp metamodel:

$\hat{Y}(x) = \beta F(x) + r(x)^t R_{LS}^{-1} [Y_{LS} - \beta F_{LS}]$ ; $r(x) = [R(x^{(1)}, x), \dots, R(x^{(N)}, x)]^t$

$\mathrm{Cov}(Y(u), Y(v) \mid X_{LS}, Y_{LS}) = \sigma^2 \left( R(u, v) - r(u)^t R_{LS}^{-1} r(v) \right) \Rightarrow \mathrm{MSE}[\hat{Y}(x)]$

Hyperparameters $(\theta_i, q_i)_{i=1 \dots p}$: estimation by likelihood maximization
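The predictor can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the hyperparameters (theta, q) are given rather than estimated by likelihood maximization, beta is estimated by generalized least squares (the slides do not say how beta is obtained), and a tiny `nugget` is added for numerical stability. All function and variable names are hypothetical.

```python
import numpy as np

def corr(U, V, theta, q):
    """R(u - v) = exp(-sum_i theta_i |u_i - v_i|^q_i) for all pairs of rows of U and V."""
    d = np.abs(U[:, None, :] - V[None, :, :])
    return np.exp(-np.sum(theta * d ** q, axis=2))

def gp_fit_predict(X_ls, y_ls, X_new, theta, q, nugget=1e-12):
    """Gp predictor with a degree-1 polynomial regression part, for given hyperparameters."""
    N = len(X_ls)
    F_ls = np.hstack([np.ones((N, 1)), X_ls])              # regression functions on X_LS
    F_new = np.hstack([np.ones((len(X_new), 1)), X_new])
    R_ls = corr(X_ls, X_ls, theta, q) + nugget * np.eye(N)
    # Generalized least squares estimate of beta (an assumption of this sketch)
    Ri_F = np.linalg.solve(R_ls, F_ls)
    Ri_y = np.linalg.solve(R_ls, y_ls)
    beta = np.linalg.solve(F_ls.T @ Ri_F, F_ls.T @ Ri_y)
    resid = y_ls - F_ls @ beta                             # Y_LS - beta F_LS
    r = corr(X_new, X_ls, theta, q)                        # each row is r(x)^t
    return F_new @ beta + r @ np.linalg.solve(R_ls, resid)

rng = np.random.default_rng(0)
X_ls = rng.random((20, 2))                                 # learning sample, N = 20, p = 2
y_ls = np.sin(3 * X_ls[:, 0]) + X_ls[:, 1] ** 2            # hypothetical cheap "code"
theta, q = np.array([5.0, 5.0]), np.array([1.5, 1.5])
y_hat_ls = gp_fit_predict(X_ls, y_ls, X_ls, theta, q)      # predictions at the learning points
```

Evaluated at the learning points, the predictor reproduces the learning outputs up to the nugget, which is the interpolation property expected from kriging without measurement noise.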


Algorithm for hyperparameter estimation for large p

[ Marrel et al., 2008 - Inspired by Welch et al., 1992 ]

Goal: sequential estimation and suppression of inactive variables

Progressive introduction of input variables in the covariance function

    Progressive introduction of input variables in the regression part

    Selection of the optimal regression model (min AICC)

    Q2 (predictivity coefficient) computation

Selection of the optimal covariance model (max Q2)

Final validation of the metamodel on a test basis (Q2 criterion)

$Q_2(Y, \hat{Y}) = 1 - \frac{\sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{N} (\bar{Y} - Y_i)^2}$

Example: [Figure: Q2 versus the number of inputs introduced in the covariance]

Drawback: costly process (made once for each output)

P3)
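The predictivity coefficient is straightforward to compute once predictions are available on a test basis. A minimal sketch, with hypothetical names and toy values:

```python
import numpy as np

def q2(y_true, y_pred):
    """Predictivity coefficient: 1 - sum (Y_i - Yhat_i)^2 / sum (Ybar - Y_i)^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true.mean() - y_true) ** 2)

y_test = np.array([1.0, 2.0, 3.0, 4.0])
q2_perfect = q2(y_test, y_test)                     # 1.0: perfect prediction
q2_mean = q2(y_test, np.full(4, y_test.mean()))     # 0.0: no better than the mean
```

A value close to 1 indicates a predictive metamodel; a value at or below 0 means the metamodel does no better than predicting the mean of the test outputs.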


Sensitivity analysis results for one scalar output « p104 »

Gaussian process metamodel: Q2 = 93% - Linear regression: Q2 = 68%

Sobol indices estimation + confidence intervals [ Marrel et al., 2009 ]

(in %)   SRC_i² (linear regression)   μ̃_i = E_Ω[S̃_i] (Gaussian process)   90% CI of S̃_i (Gaussian process)
per1     52                           8                                     [ 5 ; 11 ]
kd1      13                           69                                    [ 56 ; 83 ]
i3       2                            13                                    [ 10 ; 17 ]

This can be done for several outputs …


Initial study: sensitivity analysis for 20 scalar outputs

… but the results are difficult to synthesize, and therefore to interpret

Main influential inputs

Group 1: kd1 (distribution coef. of layer 1)

Group 2: kd2 (distribution coef. of layer 2)

Group 3: i3 (infiltration intensity)

[Figure: spatial location map of the outputs, labeled by group (group 1, group 2, group 3)]


Considering functional (spatial) output

We have simulated N = 300 maps (Monte Carlo runs) with p = 20 random inputs

[Figure: 9 simulated output maps (4096 pixels each): 90Sr concentration predicted in 2010]

Discretized spatial output can be considered as a functional 2D output

P4)


Sensitivity analysis for spatial outputs: methodology [ Marrel et al., submitted ]

• Computer code f(.):

Input: X = (X_1, …, X_p) random vector

Output for input x*: y = f(x*, z), z ∈ D_z ⊂ R²

In practice, D_z is discretized in n_z points (here: 64 x 64 = 4096 points)

Remark: it is intractable to fit several hundred (here: 4096!) metamodels

• Decomposition in an orthogonal function basis

For example, a wavelet basis is well suited if there are discontinuities

• Modeling of each coefficient of the basis by a Gp metamodel

[ Bayarri et al., 2007, Higdon et al., 2008 ]

• Prediction: x* => coefficients prediction => spatial output map reconstruction

Sensitivity analysis:

Spatial maps of sensitivity indices


A toy function: Campbell2D function

$g(X, \theta, \phi) = X_1 \exp\left[ -\frac{(0.8\theta + 0.2\phi - 10 X_2)^2}{60 X_1^2} \right] + (X_2 + X_4) \exp\left[ \frac{(0.5\theta + 0.5\phi) X_1}{500} \right] + X_5 (X_3 - 2) \exp\left[ -\frac{(0.4\theta + 0.6\phi - 20 X_6)^2}{40 X_5^2} \right] + (X_6 + X_8) \exp\left[ \frac{(0.3\theta + 0.7\phi) X_7}{250} \right]$

p = 8

X_i ~ U[-1,5] for i = 1,…,8

z = (θ, φ)

D_z = [-90,90]²

n_z = 64 x 64 = 4096 points

[Figure: 6 different realizations]
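A minimal sketch of the toy function, useful for generating test maps. It assumes the Campbell2D expression g(X, θ, φ) = X1 exp[-(0.8θ + 0.2φ - 10 X2)² / (60 X1²)] + (X2 + X4) exp[(0.5θ + 0.5φ) X1 / 500] + X5 (X3 - 2) exp[-(0.4θ + 0.6φ - 20 X6)² / (40 X5²)] + (X6 + X8) exp[(0.3θ + 0.7φ) X7 / 250], evaluated on a regular grid of D_z; the function name `campbell2d` and the grid layout are choices made for this illustration.

```python
import numpy as np

def campbell2d(X, n_grid=64):
    """Evaluate the Campbell2D toy function on an n_grid x n_grid map of [-90, 90]^2.

    X: array of shape (n, 8) or (8,); returns an array of shape (n, n_grid, n_grid).
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    ang = np.linspace(-90.0, 90.0, n_grid)
    theta, phi = np.meshgrid(ang, ang, indexing="ij")
    x = [X[:, i, None, None] for i in range(8)]     # x[0] is X_1, ..., x[7] is X_8
    t1 = x[0] * np.exp(-(0.8 * theta + 0.2 * phi - 10 * x[1]) ** 2 / (60 * x[0] ** 2))
    t2 = (x[1] + x[3]) * np.exp((0.5 * theta + 0.5 * phi) * x[0] / 500)
    t3 = x[4] * (x[2] - 2) * np.exp(-(0.4 * theta + 0.6 * phi - 20 * x[5]) ** 2 / (40 * x[4] ** 2))
    t4 = (x[5] + x[7]) * np.exp((0.3 * theta + 0.7 * phi) * x[6] / 250)
    return t1 + t2 + t3 + t4

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 5.0, size=(6, 8))    # 6 realizations with X_i ~ U[-1, 5]
maps = campbell2d(X)                       # shape (6, 64, 64)
```

The two Gaussian-shaped terms produce sharp, localized features whose position depends on X2 and X6, which makes the function a demanding test case for a spatial metamodel.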




Sobol indices for Campbell2D function: 1st order & total

X_i ~ U[-1,5], i = 1…8

[Figure: maps of first order indices (exact values) and of total indices (Monte Carlo estimations)]


Step 1: spatial decomposition & coefficients sorting

Spatial decomposition of N simulated maps

Maps centering (empirical mean of N maps): $\mu(z) = E_X[Y(X, z)]$

Decomposition on Daubechies wavelet basis (K = 4096):

$Y_K(X, z) = \mu(z) + \sum_{j=1}^{K} \alpha_j(X) \, \phi_j(z)$

Problem: cost of the metamodel building step (because p > 10): P3)

Selection of a small number of coefficients to be Gp-modeled

Sorting of coefficients by decreasing variability (variance with respect to X):

{ α_1, …, α_K } → { α_(1), …, α_(K) }
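The centering, decomposition and sorting steps can be sketched as below. Note the substitution: to stay self-contained, this sketch uses an orthonormal 2D Haar basis built by hand instead of the Daubechies basis named on the slide; any orthonormal basis fits the same pattern. All function names are hypothetical.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar transform matrix of size n x n (n must be a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        m = H.shape[0]
        H = np.vstack([np.kron(H, [1.0, 1.0]),
                       np.kron(np.eye(m), [1.0, -1.0])]) / np.sqrt(2.0)
    return H

def decompose(maps):
    """Center the maps, project them on the 2D Haar basis, sort coefficients by variance.

    maps: (N, n, n) array. Returns mu (n, n), alpha (N, K) with columns sorted by
    decreasing variance over the N runs, and the permutation `order` (K,).
    """
    N, n, _ = maps.shape
    H = haar_matrix(n)
    mu = maps.mean(axis=0)                       # empirical mean map mu(z)
    coef = H @ (maps - mu) @ H.T                 # 2D transform of each centered map
    alpha = coef.reshape(N, n * n)
    order = np.argsort(alpha.var(axis=0))[::-1]  # decreasing variability
    return mu, alpha[:, order], order

def reconstruct(mu, alpha_sorted, order, k=None):
    """Rebuild maps from the k highest-variance coefficients (all of them if k is None)."""
    n = mu.shape[0]
    H = haar_matrix(n)
    keep = order if k is None else order[:k]
    alpha = np.zeros_like(alpha_sorted)
    alpha[:, keep] = alpha_sorted[:, : len(keep)]
    return mu + H.T @ alpha.reshape(-1, n, n) @ H

rng = np.random.default_rng(0)
maps = rng.normal(size=(50, 8, 8)).cumsum(axis=2)   # hypothetical N = 50 maps of 8 x 8 pixels
mu, alpha, order = decompose(maps)
maps_full = reconstruct(mu, alpha, order)           # exact reconstruction with all K coefficients
maps_k = reconstruct(mu, alpha, order, k=10)        # approximation with the first 10 coefficients
```

Because the basis is orthonormal, keeping all K coefficients reproduces the maps exactly, and the squared error of a truncated reconstruction equals the sum of squares of the discarded coefficients.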


Step 2: coefficients modeling

Proposed model: modeling the coefficients α_j(X) as functions of X

    Gaussian process for the first k coefficients

    Linear model for the following k' coefficients

    Mean value for the others

Validation: predictive quality measures

$\mathrm{MSE}(X) = \int_{D_z} \left[ Y_K(X, z) - \hat{Y}_{K,k}(X, z) \right]^2 dz$, with $\hat{Y}_{K,k}(X, z)$ the approximation of $Y_K(X, z)$

$\mathrm{MSE} = E_X[\mathrm{MSE}(X)]$ ; $Q_2 = 1 - \frac{\mathrm{MSE}}{E_z\{ \mathrm{Var}_X[Y(X, z)] \}}$
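The two quality measures can be computed on a test sample of discretized maps. A minimal sketch, assuming a regular grid so that the integral over D_z is replaced by an average over the pixels (a normalization choice made here so that numerator and denominator of Q2 are on the same scale); names are hypothetical.

```python
import numpy as np

def functional_q2(Y_true, Y_pred):
    """Q2 for functional outputs.

    Y_true, Y_pred: (n_test, n_z) arrays of flattened maps on a regular grid.
    """
    mse_x = np.mean((Y_true - Y_pred) ** 2, axis=1)   # MSE(X), averaged over z
    mse = mse_x.mean()                                # E_X[MSE(X)]
    mean_var = Y_true.var(axis=0).mean()              # E_z{ Var_X[Y(X, z)] }
    return 1.0 - mse / mean_var

rng = np.random.default_rng(0)
Y_test = rng.normal(size=(200, 16))                   # hypothetical test maps
q2_perfect = functional_q2(Y_test, Y_test)
q2_mean_map = functional_q2(Y_test, np.tile(Y_test.mean(axis=0), (200, 1)))
```

As in the scalar case, predicting the mean map for every input gives Q2 = 0, and a perfect metamodel gives Q2 = 1.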


Step 3: selection of k (number of Gp-modeled coefficients)

Campbell2D function: MSE as a function of k and N

[Figure: MSE on a test sample (size = 1000). Legend: first k coef. = linear models; first k coef. = Gp models; proposed model (k Gp models and k' = 500 linear models)]

We take k = 30 (stabilization of MSE), and k' = 500

Q2 = 93% for N = 200 ; Q2 = 97% for N = 500


Sensitivity analysis results for Campbell2D function

Estimation of first order and total Sobol' indices by Monte Carlo applied on the functional metamodel

[Figure: first order index maps for X_2 and X_6, exact values vs. functional metamodel]

Relative mean absolute errors on first order Sobol' indices:

X_1    X_2    X_3    X_4    X_6    X_7    X_8
9%     16%    16%    13%    13%    12%    10%


Application on our test case (hydrogeological pollution)

N = 300 simulations

p = 20 random input variables

K = 4096 pixels

k = 100 coef. modeled by Gp

Mean predictivity: Q2 = 72%

Estimation of first order and total Sobol' indices maps by Monte Carlo

(22000 runs with the functional metamodel)

20 maps of sensitivity indices
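Sensitivity maps are obtained by computing the indices pixel by pixel from the predicted maps. The sketch below applies a pick-freeze first-order estimator to a vector output; it is one possible scheme, not necessarily the design used in the study, and `toy_metamodel` is a hypothetical cheap stand-in for the functional metamodel. With this scheme the cost is n (p + 2) metamodel runs, for example 22000 runs for n = 1000 and p = 20.

```python
import numpy as np

def first_order_maps(model, p, n, rng):
    """Pick-freeze first-order Sobol' indices for every pixel.

    model: maps an (n, p) array of U[0, 1] inputs to an (n, n_z) array of flattened maps.
    Returns an array of shape (p, n_z): one sensitivity map per input.
    """
    A, B = rng.random((n, p)), rng.random((n, p))
    yA, yB = model(A), model(B)
    var_z = np.var(np.vstack([yA, yB]), axis=0)      # output variance at each pixel
    S = np.empty((p, yA.shape[1]))
    for i in range(p):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                          # A with column i taken from B
        S[i] = np.mean(yB * (model(ABi) - yA), axis=0) / var_z
    return S

z = np.linspace(0.0, 1.0, 16)                        # 16 "pixels" along one axis

def toy_metamodel(x):
    # Hypothetical stand-in: the influence shifts from X_1 to X_2 as z goes from 0 to 1
    return x[:, [0]] * (1.0 - z) + x[:, [1]] * z

S_maps = first_order_maps(toy_metamodel, p=2, n=100_000, rng=np.random.default_rng(1))
exact_S1 = (1 - z) ** 2 / ((1 - z) ** 2 + z ** 2)    # analytical map for X_1
```

Each row of `S_maps` can be reshaped to the pixel grid and displayed as a map, which shows where in space a given input drives the output.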


Spatial output: results of sensitivity analysis

Spatial maps of first order Sobol sensitivity indices, for 6 inputs

[Figure panels: Permeability layer 1; Permeability layer 2; Permeability layer 3; Kd layer 1; Kd layer 2; Strong infiltration intensity]


Conclusions

Technique to build a functional metamodel

Useful for: complex behaviour P1), CPU-time-consuming code P2), large p P3), functional output P4)

    Wavelet decomposition

    Coefficients modeled by Gaussian processes

Production of sensitivity maps

    Global and local interpretation

Prospects

    Spatio-temporal output, spatio-temporal dynamic simulators

References:

• Volkova, Iooss & Van Dorpe, Stoch. Env. Res. Risk Assess., 2008

• Marrel, Iooss, Van Dorpe & Volkova, Comput. Stat. Data Analysis, 2008

• Marrel, Iooss, Jullien, Laurent & Volkova, Global sensitivity analysis for spatially dependent outputs, Environmetrics, submitted


MANY THANKS!!!


GdR MASCOT-NUM: a French research group on « computer experiments »

Conferences, publications, benchmarks, …: http://www.gdr-mascotnum.fr

Call for papers on « Computer experiments »

Before August 31st for « Statistics & Computing » (Eds: Antoniadis & Pasanisi)

Next year (2011) for « Les Annales de la Faculté de Sciences de Toulouse » (Eds: Azaïs, Gamboa & Iooss): call for mathematical & potentially long papers

Uncertainty software

OpenTurns (EDF-EADS-Phimeca): http://trac.openturns.org/ (C++ library functions, Python interpreter, graphical interface)


THANKS AGAIN!!!
