02.06.2015 Views

Spatial Process Estimates - IMAGe

Spatial Process Estimates - IMAGe

Spatial Process Estimates - IMAGe

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

<strong>Spatial</strong> <strong>Process</strong> <strong>Estimates</strong><br />

SAMSI Summer School August, 2009<br />

Douglas Nychka,<br />

National Center for Atmospheric Research


Outline<br />

• A spatial model and Kriging<br />

• Kriging = Penalized least squares = splines<br />

• Robust Kriging.<br />

• Identifying a covariance function<br />

• Inference<br />

Supported by the National Science Foundation<br />

2


The additive model<br />

Given n pairs of observations (x i , y i ), i = 1, . . . , n<br />

ɛ i ’s are random errors.<br />

y i = g(x i ) + ɛ i<br />

Assume that g is a realization of a Gaussian process.<br />

MN(0, σ 2 I)<br />

and ɛ are<br />

Formulating a statistical model for g<br />

makes a very big difference in how we solve the problem.<br />

3


A Normal World<br />

We assume that g(x) is a Gaussian process,<br />

ρk(x, x ′ ) = COV (g(x), g(x ′ ))<br />

For the moment assume that E(g(x)) = 0.<br />

(A Gaussian process ≡ any subset of the field locations has a multivariate<br />

normal distribution. )<br />

We know what we need to do!<br />

If we know k we know how to make a prediction at x!<br />

ĝ(x) = E[g(x)|data]<br />

i.e. Just use the conditional multivariate normal distribution.<br />

4


A review of the conditional normal<br />

and<br />

u ∼ N(0, Σ)<br />

u =<br />

(<br />

u1<br />

u 2<br />

)<br />

( )<br />

Σ11 , Σ<br />

Σ = 12<br />

Σ 21 , Σ 22<br />

[u 2 |u 1 ] = N(Σ 2,1 Σ −1<br />

1,1 u 1, Σ 2,2 − Σ 2,1 Σ −1<br />

1,1 Σ 1,2)<br />

(distribution of u 2 given u 1 )<br />

5


Our application is<br />

u 1 = y<br />

and<br />

(the Data)<br />

u 2 = {g(x 1 ), ...g(x N )}<br />

a vector of function values where we would like to predict.<br />

6


The Kriging weights<br />

Conditional distribution of g given the data y is Gaussian.<br />

Conditional mean<br />

ĝ = COV (g, y) [COV (y)] −1 y = Ay<br />

rows of A are the Kriging weights.<br />

Conditional variance<br />

COV (g, g) − COV (g, y) [COV (y)] −1 COV (y, g)<br />

These two pieces characterize the entire conditional distribution<br />

7


Kriging as a smoother<br />

Suppose the errors are uncorrelated Normals with variance<br />

σ 2 .<br />

ρK = COV (g, y) = COV (g, g)<br />

and<br />

COV (y) = (ρK + σ 2 I)<br />

ĝ = ρK(ρK + σ 2 I) −1 y<br />

= K(K + λI) −1 y = A(λ)y<br />

8


My geostatistics/BLUE overhead<br />

For any covariance and any smoothing matrix (not just A above) we<br />

can easily derive the prediction variance.<br />

Question: Find the minimum of<br />

over all choices of A.<br />

E [ (g(x) − ĝ(x)) 2]<br />

The answer: The Kriging weights ... or what we would do if<br />

we used the Gaussian process and the conditional distribution.<br />

Folklore and intuition: The spatial estimates are not very sensitive<br />

if one uses suboptimal weights, especially if the observations<br />

contain some measurement error.<br />

It does matter for measures of uncertainty.<br />

9


Kriging with a fixed part<br />

Adding a fixed component<br />

g(x) = ∑ i<br />

φ i (x)d i + h(x)<br />

d is fixed<br />

h is a mean zero process with covariance, k.<br />

10


BLUE/Universal Kriging<br />

Find d by Generalized least squares<br />

ˆd =<br />

(<br />

T T M −1 T ) −1<br />

T T M −1 y<br />

M = K + λI<br />

”Krig” the residuals<br />

ĥ = K(K + λI) −1 (y − T ˆd)<br />

11


In general:<br />

ĝ(x) = ∑ i<br />

φ i (x) ˆd i + ∑ j<br />

k(x, x j )ĉ j<br />

with<br />

ĉ = (K + λI) −1 (y − T ˆd)<br />

12


The spline connection<br />

Basis functions: determined by the covariance function<br />

Penalty function: Ω = K is based on the covariance.<br />

The minimization criteria:<br />

min<br />

d,c<br />

n∑<br />

i=1<br />

(y − (T d + Kc) i ) 2 + λc T Kc<br />

Kriging estimator is a spline with reproducing kernel k.<br />

λ is proportional to the measurement (nugget) variance<br />

13


Robust Kriging<br />

ρ(u) a robust function that grows slower than u 2<br />

The robust minimization criteria:<br />

min<br />

d,c<br />

n∑<br />

i=1<br />

ρ(y − (T d + Kc) i ) + λc T Kc<br />

An example of ρ and its derviative<br />

0 5 10 15<br />

−5 0 5<br />

−4 −2 0 2 4<br />

−4 −2 0 2 4<br />

14


Identifying a covariance function<br />

The Matern class of covariances:<br />

φ(d) = ρψ ν (d/θ)) with ψ ν a Bessel function.<br />

correlation<br />

0.0 0.2 0.4 0.6 0.8 1.0<br />

6<br />

2<br />

1<br />

0.5<br />

0.0 0.5 1.0 1.5 2.0<br />

θ a range parameter, ν smoothness<br />

at 0.<br />

• ψ ν is an exponential for ν =<br />

1/2 as ν → ∞ Gaussian.<br />

• m t h order thin plate spline in<br />

R d ν = 2m − d.)<br />

distance<br />

Compactly support Wendland covariance<br />

15


What kind of processes are these?<br />

Matern (.5,1.0,2.0) and Wendland (2.0)<br />

16


Correlations among ozone<br />

In many cases spatial processes also have a temporal component.<br />

Here we take the 89 days over the ”ozone season” and just find<br />

sample correlations among stations.<br />

correlation<br />

−0.5 0.0 0.5 1.0<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

0 100 200 300 400 500 600<br />

miles<br />

●<br />

●<br />

●<br />

●<br />

17


Comparison to the variogram<br />

Sample variogram for 89 days during summer 1987 :<br />

Variance<br />

0 100 300 500<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

●<br />

0 50 100 150 200 250 300<br />

●<br />

●<br />

●<br />

●<br />

Distance (miles)<br />

”.” Sample variogram values<br />

”–” fitted linear function Variogram ∼ distance/θ<br />

●<br />

●<br />

●<br />

18


50<br />

Sensitivity to the covariance<br />

Exp (200) Matern (2, 200)<br />

150<br />

100<br />

100<br />

50<br />

50<br />

50<br />

150<br />

150<br />

150<br />

50<br />

100<br />

50<br />

50<br />

Thin plate spline Wendland (2, 180)<br />

100<br />

50<br />

100<br />

50<br />

0<br />

correlation<br />

0.0 0.4 0.8<br />

0 50 100 150 200 250 300<br />

distance<br />

19


10<br />

5<br />

5<br />

Sensitivity to the covariance (SE)<br />

Exp (200) Matern (2, 200)<br />

20<br />

20<br />

20<br />

10<br />

20<br />

20<br />

10<br />

5<br />

10<br />

10<br />

20<br />

20<br />

10<br />

20<br />

10<br />

10<br />

5<br />

10<br />

10<br />

20<br />

20<br />

10<br />

5<br />

10<br />

10<br />

10<br />

20<br />

10<br />

10<br />

20<br />

20<br />

20<br />

20<br />

10<br />

10<br />

5<br />

10<br />

10<br />

10<br />

10<br />

10<br />

10<br />

60<br />

50<br />

40<br />

30<br />

20<br />

5<br />

20<br />

10<br />

10<br />

5<br />

10<br />

10<br />

0<br />

Thin plate spline Wendland (2, 180)<br />

20


140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

140<br />

Uncertainty in exceeding 140 PPB<br />

Mean field, four realizations of the conditional distribution<br />

Contours from 10 cases<br />

0 50 100 150 200<br />

21


Summary<br />

A spatial process model leads to a penalized<br />

least squares estimate<br />

A spline = Kriging estimate= Bayesian posterior<br />

mode<br />

For spatial estimators the basis functions are<br />

related to the covariance functions and can be<br />

identified from data<br />

22

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!