Spatial Process Estimates - IMAGe

Spatial Process Estimates 

SAMSI Summer School August, 2009 

Douglas Nychka, National Center for Atmospheric Research

Outline 

• A spatial model and Kriging 

• Kriging = Penalized least squares = splines 

• Robust Kriging

• Identifying a covariance function 

• Inference 

Supported by the National Science Foundation 


The additive model 

Given n pairs of observations $(x_i, y_i)$, $i = 1, \ldots, n$,

$y_i = g(x_i) + \epsilon_i$

where the $\epsilon_i$'s are random errors.

Assume that g is a realization of a Gaussian process and that the $\epsilon$ are $MN(0, \sigma^2 I)$.

Formulating a statistical model for g makes a very big difference in how we solve the problem.


A Normal World 

We assume that g(x) is a Gaussian process,

$\rho k(x, x') = \mathrm{COV}(g(x), g(x'))$

For the moment assume that $E(g(x)) = 0$.

(A Gaussian process ≡ any subset of the field locations has a multivariate normal distribution.)

We know what we need to do! 

If we know k we know how to make a prediction at x! 

ĝ(x) = E[g(x)|data] 

i.e., just use the conditional multivariate normal distribution.


A review of the conditional normal 

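$u \sim N(0, \Sigma)$

with

$u = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}$ and $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$

$[u_2 | u_1] = N(\Sigma_{21} \Sigma_{11}^{-1} u_1, \; \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12})$

(distribution of $u_2$ given $u_1$)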
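The conditional normal formula can be checked numerically. The following is a minimal sketch, assuming NumPy; the function name `conditional_normal` and the 3 x 3 covariance matrix are made up for illustration and are not from the slides.

```python
import numpy as np

def conditional_normal(Sigma, n1, u1):
    """Mean and covariance of [u2 | u1] for u ~ N(0, Sigma).

    The first n1 coordinates of u form u1; the remaining ones form u2.
    """
    S11 = Sigma[:n1, :n1]
    S12 = Sigma[:n1, n1:]
    S21 = Sigma[n1:, :n1]
    S22 = Sigma[n1:, n1:]
    # Solve linear systems instead of forming the inverse explicitly.
    mean = S21 @ np.linalg.solve(S11, u1)
    cov = S22 - S21 @ np.linalg.solve(S11, S12)
    return mean, cov

# Hypothetical covariance: two observed coordinates, one to predict.
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.6],
                  [0.3, 0.6, 1.0]])
u1 = np.array([0.2, -1.0])
cond_mean, cond_cov = conditional_normal(Sigma, 2, u1)
```

For this particular matrix the conditional mean works out to -0.6 and the conditional variance to 0.64, which is smaller than the unconditional variance of 1.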


Our application is

$u_1 = y$ (the data)

and

$u_2 = \{g(x_1), \ldots, g(x_N)\}$,

a vector of function values where we would like to predict.


The Kriging weights 

Conditional distribution of g given the data y is Gaussian.

Conditional mean

$\hat{g} = \mathrm{COV}(g, y) [\mathrm{COV}(y)]^{-1} y = Ay$

Rows of A are the Kriging weights.

Conditional variance

$\mathrm{COV}(g, g) - \mathrm{COV}(g, y) [\mathrm{COV}(y)]^{-1} \mathrm{COV}(y, g)$

These two pieces characterize the entire conditional distribution.
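As an illustration of the two pieces, here is a small sketch in one dimension. It assumes NumPy, an exponential covariance with unit variance, and made-up locations, data, and error variance; none of these choices come from the slides.

```python
import numpy as np

def exp_cov(a, b, theta=1.0):
    """Exponential covariance exp(-|a - b| / theta) between 1-D locations."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / theta)

x_obs = np.array([0.0, 0.3, 0.7, 1.0])   # observation locations (made up)
x_new = np.array([0.5, 0.9])             # prediction locations (made up)
y = np.array([1.0, 0.4, -0.2, 0.5])      # observations (made up)
sigma2 = 0.1                             # error variance (assumed)

cov_gy = exp_cov(x_new, x_obs)                       # COV(g, y)
cov_y = exp_cov(x_obs, x_obs) + sigma2 * np.eye(4)   # COV(y)
cov_gg = exp_cov(x_new, x_new)                       # COV(g, g)

# Kriging weights A = COV(g, y) [COV(y)]^{-1}; COV(y) is symmetric,
# so solving against the transpose and transposing back gives A.
A = np.linalg.solve(cov_y, cov_gy.T).T
g_hat = A @ y                       # conditional mean
cond_cov = cov_gg - A @ cov_gy.T    # conditional covariance
```

Each row of `A` holds the weights applied to the four observations to predict at one new location.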


Kriging as a smoother 

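Suppose the errors are uncorrelated Normals with variance $\sigma^2$. Then

$\rho K = \mathrm{COV}(g, y) = \mathrm{COV}(g, g)$

and

$\mathrm{COV}(y) = \rho K + \sigma^2 I$

so that

$\hat{g} = \rho K (\rho K + \sigma^2 I)^{-1} y = K (K + \lambda I)^{-1} y = A(\lambda) y$

with $\lambda = \sigma^2 / \rho$.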
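The equivalence of the two forms of the smoother, with λ equal to the ratio σ²/ρ, can be verified directly. This sketch assumes NumPy, an exponential correlation matrix K, and illustrative values of ρ and σ²; the helper name `smoother` is hypothetical.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 6)                          # made-up locations
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.5)    # exponential correlation (assumed)
rho, sigma2 = 2.0, 0.5                                # illustrative values
lam = sigma2 / rho

def smoother(K, lam):
    """A(lambda) = K (K + lambda I)^{-1} for a symmetric K."""
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), K).T

# Covariance form and lambda form of the same smoothing matrix.
A_cov = rho * K @ np.linalg.inv(rho * K + sigma2 * np.eye(6))
A_lam = smoother(K, lam)

A_small = smoother(K, 1e-12)   # close to the identity: interpolation
A_large = smoother(K, 1e6)     # close to zero: heavy smoothing toward the mean
```

The two limiting cases show why λ acts as a smoothing parameter: as λ goes to 0 the estimate reproduces the data, and as λ grows the estimate shrinks toward the (zero) mean.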


My geostatistics/BLUE overhead 

For any covariance and any smoothing matrix (not just A above) we can easily derive the prediction variance.

Question: Find the minimum of

$E[(g(x) - \hat{g}(x))^2]$

over all choices of A.

The answer: The Kriging weights ... or what we would do if we used the Gaussian process and the conditional distribution.

Folklore and intuition: The spatial estimates are not very sensitive to the use of suboptimal weights, especially if the observations contain some measurement error.

It does matter for measures of uncertainty.
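For a single prediction location and a linear predictor aᵀy of a zero-mean process, the mean squared error is k(x, x) − 2aᵀc + aᵀMa, where c = COV(y, g(x)) and M = COV(y). The sketch below, assuming NumPy and a made-up exponential covariance, compares the Kriging weights with randomly perturbed weights under this formula.

```python
import numpy as np

def mse(a, k00, c, M):
    """E[(g(x0) - a^T y)^2] for zero-mean g and y."""
    return k00 - 2.0 * a @ c + a @ M @ a

x_obs = np.array([0.0, 0.3, 0.7, 1.0])   # made-up observation locations
x0 = 0.5                                 # made-up prediction location
sigma2 = 0.1                             # error variance (assumed)
M = np.exp(-np.abs(x_obs[:, None] - x_obs[None, :])) + sigma2 * np.eye(4)  # COV(y)
c = np.exp(-np.abs(x_obs - x0))          # COV(y, g(x0))
k00 = 1.0                                # VAR(g(x0))

a_krig = np.linalg.solve(M, c)           # Kriging weights for x0
mse_krig = mse(a_krig, k00, c, M)

# Any other set of weights gives a larger mean squared error.
rng = np.random.default_rng(0)
mse_other = [mse(a_krig + 0.1 * rng.standard_normal(4), k00, c, M)
             for _ in range(100)]
```

The same `mse` function evaluates the prediction variance of any linear estimator, optimal or not, provided the covariance passed in is the true one.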


Kriging with a fixed part 

Adding a fixed component

$g(x) = \sum_i \phi_i(x) d_i + h(x)$

d is fixed

h is a mean zero process with covariance k.


BLUE/Universal Kriging 

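Find d by generalized least squares

$\hat{d} = (T^T M^{-1} T)^{-1} T^T M^{-1} y$

$M = K + \lambda I$

where T is the matrix with entries $T_{ij} = \phi_j(x_i)$.

"Krig" the residuals

$\hat{h} = K (K + \lambda I)^{-1} (y - T\hat{d})$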


In general:

$\hat{g}(x) = \sum_i \phi_i(x) \hat{d}_i + \sum_j k(x, x_j) \hat{c}_j$

with

$\hat{c} = (K + \lambda I)^{-1} (y - T\hat{d})$
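The two steps (generalized least squares for d, then Kriging the residuals) can be sketched as follows. This assumes NumPy, a linear fixed part with φ₁(x) = 1 and φ₂(x) = x, an exponential covariance, and simulated data; all of these are illustrative choices rather than anything specified in the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
x = np.sort(rng.uniform(0.0, 1.0, n))
# Made-up data: linear trend plus a smooth wiggle plus noise.
y = 1.0 + 2.0 * x + 0.3 * np.sin(6.0 * x) + 0.05 * rng.standard_normal(n)

T = np.column_stack([np.ones(n), x])                  # T[i, j] = phi_j(x_i)
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3)    # assumed covariance
lam = 0.1                                             # assumed smoothing parameter
M = K + lam * np.eye(n)

# Generalized least squares estimate of the fixed part.
d_hat = np.linalg.solve(T.T @ np.linalg.solve(M, T), T.T @ np.linalg.solve(M, y))
# Coefficients for Kriging the residuals.
c_hat = np.linalg.solve(M, y - T @ d_hat)

def g_hat(x_new):
    """Prediction: fixed part plus covariance-based part."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    T_new = np.column_stack([np.ones(len(x_new)), x_new])
    k_new = np.exp(-np.abs(x_new[:, None] - x[None, :]) / 0.3)
    return T_new @ d_hat + k_new @ c_hat

fitted = g_hat(x)
```

Two identities follow from the formulas and make useful checks: Tᵀĉ = 0, and the residuals y − ĝ equal λĉ at the observation locations.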


The spline connection 

Basis functions: determined by the covariance function.

Penalty function: $\Omega = K$ is based on the covariance.

The minimization criterion:

$\min_{d,c} \sum_{i=1}^{n} (y - (Td + Kc))_i^2 + \lambda c^T K c$

The Kriging estimator is a spline with reproducing kernel k.

λ is proportional to the measurement (nugget) variance.
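One way to see the connection is to check that the universal Kriging coefficients make the gradient of the penalized least squares criterion vanish. The sketch below assumes NumPy, simulated data, a linear fixed part, and an exponential covariance; these are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 15
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.cos(4.0 * x) + 0.1 * rng.standard_normal(n)    # made-up data
T = np.column_stack([np.ones(n), x])                  # assumed fixed part
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3)    # assumed covariance
lam = 0.2

def criterion(d, c):
    """Residual sum of squares plus the penalty lam * c^T K c."""
    r = y - (T @ d + K @ c)
    return r @ r + lam * c @ K @ c

# Universal Kriging coefficients.
M = K + lam * np.eye(n)
d_hat = np.linalg.solve(T.T @ np.linalg.solve(M, T), T.T @ np.linalg.solve(M, y))
c_hat = np.linalg.solve(M, y - T @ d_hat)

# Gradient of the criterion with respect to d and c at (d_hat, c_hat).
resid = y - (T @ d_hat + K @ c_hat)
grad_d = -2.0 * T.T @ resid
grad_c = -2.0 * K @ resid + 2.0 * lam * K @ c_hat

best = criterion(d_hat, c_hat)
perturbed = [criterion(d_hat + 0.1 * rng.standard_normal(2),
                       c_hat + 0.1 * rng.standard_normal(n))
             for _ in range(50)]
```

Because the criterion is a convex quadratic in (d, c), a zero gradient identifies the minimizer, and every perturbed coefficient vector gives a larger value.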


Robust Kriging 

$\rho(u)$: a robust function that grows slower than $u^2$.

The robust minimization criterion:

$\min_{d,c} \sum_{i=1}^{n} \rho((y - (Td + Kc))_i) + \lambda c^T K c$

[Figure: an example of ρ and its derivative.]
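The slides do not say which ρ is plotted. As one possible choice, the sketch below uses a Huber-type function (equal to u² near zero, linear beyond a threshold δ) and minimizes the robust criterion by iteratively reweighted least squares. It assumes NumPy, drops the fixed part T d for brevity, and uses made-up data with a single gross outlier; the names `huber_rho`, `huber_psi`, and `robust_krig` are hypothetical.

```python
import numpy as np

def huber_rho(u, delta):
    """Equals u^2 for |u| <= delta and grows linearly beyond that."""
    a = np.abs(u)
    return np.where(a <= delta, u ** 2, 2.0 * delta * a - delta ** 2)

def huber_psi(u, delta):
    """Derivative of huber_rho: 2u, clipped to [-2 delta, 2 delta]."""
    return 2.0 * np.clip(u, -delta, delta)

def robust_krig(K, y, lam, delta, n_iter=50):
    """IRLS for min_c sum_i rho((y - Kc)_i) + lam c^T K c (no fixed part)."""
    n = len(y)
    c = np.linalg.solve(K + lam * np.eye(n), y)     # start from ordinary Kriging
    for _ in range(n_iter):
        r = y - K @ c
        # Weights psi(r) / (2 r) = min(1, delta / |r|).
        w = np.minimum(1.0, delta / np.maximum(np.abs(r), 1e-12))
        # Weighted normal equations: (W K + lam I) c = W y.
        c = np.linalg.solve(w[:, None] * K + lam * np.eye(n), w * y)
    return c

x = np.linspace(0.0, 1.0, 21)
truth = np.sin(2.0 * np.pi * x)
y = truth.copy()
y[10] += 10.0                                       # one gross outlier
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3)  # assumed covariance
lam, delta = 0.1, 0.3                               # illustrative values

fit_plain = K @ np.linalg.solve(K + lam * np.eye(21), y)
fit_robust = K @ robust_krig(K, y, lam, delta)
```

In this example the ordinary Kriging fit is pulled strongly toward the outlier at `x[10]`, while the robust fit stays much closer to the underlying curve there.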


Identifying a covariance function 

The Matern class of covariances:

$\phi(d) = \rho \psi_\nu(d/\theta)$ with $\psi_\nu$ a Bessel function.

θ a range parameter, ν smoothness at 0.

• $\psi_\nu$ is an exponential for ν = 1/2; as ν → ∞, Gaussian.

• (m-th order thin plate spline in $R^d$: ν = 2m − d.)

[Figure: Matern correlation versus distance; curves labeled 6, 2, 1, and 0.5.]

Compactly supported Wendland covariance
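For half-integer ν the Matern correlation has a closed form that avoids Bessel functions. The sketch below, assuming NumPy, codes ν = 0.5, 1.5, and 2.5 with the d/θ scaling used above, together with one standard compactly supported Wendland function. Which Wendland function the slides use is not stated, so that choice is an assumption, as are the function names.

```python
import numpy as np

def matern(d, theta=1.0, nu=0.5):
    """Matern correlation psi_nu(d / theta) for nu in {0.5, 1.5, 2.5}."""
    t = np.abs(np.asarray(d, dtype=float)) / theta
    if nu == 0.5:
        poly = 1.0                        # plain exponential
    elif nu == 1.5:
        poly = 1.0 + t
    elif nu == 2.5:
        poly = 1.0 + t + t ** 2 / 3.0
    else:
        raise ValueError("closed form coded only for nu = 0.5, 1.5, 2.5")
    return poly * np.exp(-t)

def wendland(d, theta=1.0):
    """A Wendland correlation that is exactly zero for d >= theta."""
    t = np.abs(np.asarray(d, dtype=float)) / theta
    inside = (1.0 - t) ** 6 * (35.0 * t ** 2 + 18.0 * t + 3.0) / 3.0
    return np.where(t < 1.0, inside, 0.0)

d = np.linspace(0.0, 2.0, 5)
corr_exp = matern(d, nu=0.5)
corr_smooth = matern(d, nu=2.5)
corr_compact = wendland(d, theta=1.5)
```

At the same distance a larger ν gives a higher correlation under this scaling, and the Wendland function produces exact zeros beyond its range, which yields sparse covariance matrices.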


What kind of processes are these? 

Matern (.5,1.0,2.0) and Wendland (2.0) 


Correlations among ozone 

In many cases spatial processes also have a temporal component.

Here we take the 89 days over the "ozone season" and just find sample correlations among stations.

[Figure: sample correlation versus distance (miles).]
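The calculation behind such a plot can be sketched with simulated data standing in for the ozone measurements, which are not reproduced here. The sketch assumes NumPy, a days-by-stations data matrix, made-up station coordinates, and an exponential correlation used only to generate the fake fields.

```python
import numpy as np

rng = np.random.default_rng(3)
n_days, n_sta = 89, 12
loc = rng.uniform(0.0, 600.0, size=(n_sta, 2))     # made-up coordinates (miles)
dist = np.sqrt(((loc[:, None, :] - loc[None, :, :]) ** 2).sum(axis=2))

# Simulate independent daily fields with exponential correlation (range assumed).
L = np.linalg.cholesky(np.exp(-dist / 200.0))
Z = rng.standard_normal((n_days, n_sta)) @ L.T     # rows: days, columns: stations

# Sample correlations among stations, treating days as replicates.
corr = np.corrcoef(Z, rowvar=False)

# One (distance, correlation) pair for each pair of distinct stations.
iu = np.triu_indices(n_sta, k=1)
pair_dist, pair_corr = dist[iu], corr[iu]
```

Plotting `pair_corr` against `pair_dist` gives the kind of scatter described on this slide.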


Comparison to the variogram 

Sample variogram for 89 days during summer 1987:

[Figure: sample variogram, variance versus distance (miles).]

"." Sample variogram values

"–" fitted linear function, Variogram ∼ distance/θ
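A sample variogram of this kind can be computed by averaging half the squared differences between station pairs over days and then binning by distance. The sketch below assumes NumPy and simulated data in place of the ozone measurements; the function name `sample_variogram`, the bin width, and the line through the origin are illustrative choices.

```python
import numpy as np

def sample_variogram(Z, dist, bin_edges):
    """Binned sample variogram from a days-by-stations matrix Z."""
    iu = np.triu_indices(Z.shape[1], k=1)
    # Half the mean squared difference over days, for each station pair.
    diffs = Z[:, iu[0]] - Z[:, iu[1]]
    gamma_pairs = 0.5 * np.mean(diffs ** 2, axis=0)
    d_pairs = dist[iu]
    which = np.digitize(d_pairs, bin_edges)
    centers, values = [], []
    for b in range(1, len(bin_edges)):
        sel = which == b
        if sel.any():                       # skip empty distance bins
            centers.append(d_pairs[sel].mean())
            values.append(gamma_pairs[sel].mean())
    return np.array(centers), np.array(values)

rng = np.random.default_rng(5)
n_days, n_sta = 89, 15
loc = rng.uniform(0.0, 300.0, size=(n_sta, 2))     # made-up coordinates (miles)
dist = np.sqrt(((loc[:, None, :] - loc[None, :, :]) ** 2).sum(axis=2))
L = np.linalg.cholesky(np.exp(-dist / 200.0))      # assumed true correlation
Z = rng.standard_normal((n_days, n_sta)) @ L.T

centers, values = sample_variogram(Z, dist, np.arange(0.0, 451.0, 50.0))

# Least squares line through the origin: variogram ~ distance / theta.
slope = (centers @ values) / (centers @ centers)
theta_hat = 1.0 / slope
```

The linear fit is only a rough summary; it mirrors the "Variogram ∼ distance/θ" line in the legend rather than a full covariance fit.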


Sensitivity to the covariance 

[Figure: four panels titled Exp (200), Matern (2, 200), Thin plate spline, and Wendland (2, 180), plus a plot of correlation versus distance.]


Sensitivity to the covariance (SE) 

[Figure: four panels titled Exp (200), Matern (2, 200), Thin plate spline, and Wendland (2, 180).]


Uncertainty in exceeding 140 PPB 

Mean field, four realizations of the conditional distribution 

Contours from 10 cases 
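Realizations from the conditional distribution can be drawn as the conditional mean plus a Cholesky factor of the conditional covariance times standard normal noise. The sketch below assumes NumPy, a one-dimensional domain, made-up ozone values, and known mean and covariance parameters; all numbers are hypothetical and chosen only to illustrate estimating the probability of exceeding 140 PPB.

```python
import numpy as np

rng = np.random.default_rng(4)
x_obs = np.array([0.0, 0.2, 0.5, 0.8, 1.0])          # made-up locations
y = np.array([120.0, 135.0, 150.0, 138.0, 125.0])    # made-up ozone values (PPB)
x_grid = np.linspace(0.0, 1.0, 41)
mu, rho, theta, sigma2 = 130.0, 100.0, 0.3, 4.0      # assumed known parameters

def cov(a, b):
    """Exponential covariance with variance rho and range theta."""
    return rho * np.exp(-np.abs(a[:, None] - b[None, :]) / theta)

C_gy = cov(x_grid, x_obs)
C_yy = cov(x_obs, x_obs) + sigma2 * np.eye(len(x_obs))
A = np.linalg.solve(C_yy, C_gy.T).T                  # Kriging weights
mean = mu + A @ (y - mu)                             # conditional mean field
cond_cov = cov(x_grid, x_grid) - A @ C_gy.T          # conditional covariance

# Realizations: mean + L z, where L L^T is the conditional covariance.
# A tiny diagonal jitter guards against round-off in the factorization.
L = np.linalg.cholesky(cond_cov + 1e-8 * np.eye(len(x_grid)))
draws = mean[:, None] + L @ rng.standard_normal((len(x_grid), 1000))

# Fraction of realizations above the threshold at each grid point.
p_exceed = (draws > 140.0).mean(axis=1)
```

Contouring the 140 level in each of a handful of columns of `draws` would give an ensemble of exceedance regions, analogous to the "contours from 10 cases" display.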


Summary 

A spatial process model leads to a penalized least squares estimate

A spline = Kriging estimate = Bayesian posterior mode

For spatial estimators the basis functions are related to the covariance functions and can be identified from data
