
Parameter Estimation Methods in Physiological Modeling: An Introduction

Hien Tran
Center for Research in Scientific Computation
Department of Mathematics
NC State University


Overview

- Parameter Estimation: Concepts
- Sensitivity Identifiability
- Minimization (Nonlinear Least-squares Problems)
  - Gradient-free methods
  - Gradient-based methods
- Kalman Filter-based Method

GRAZ 2007


Parameter Estimation: Concepts

Scientists and modelers frequently wish to relate physical/biological parameters characterizing a model, θ, to collected observations making up some data set, y. We will assume that the fundamental physics/biology are adequately understood, so that a function G may be specified:

G(θ) = y

Computing G(θ) might involve solving an ordinary differential equation or partial differential equation.


Parameter Estimation: Concepts (cont'd)

Example: Drug concentration dynamics

dx(t)/dt = -p_1 x(t) + p_2 u(t),   x(0) = 0
y(t) = p_3 x(t)

x(t) - concentration of a drug
u(t) - test-input injection of a drug
y(t) - temporal measurement of the drug concentration
θ = (p_1, p_2, p_3) - model parameters

For any known u(t), the explicit solution is given by

y(t) = p_3 p_2 ∫_0^t e^{-p_1 (t-s)} u(s) ds


Parameter Estimation: Concepts (cont'd)

y(t) = p_3 p_2 ∫_0^t e^{-p_1 (t-s)} u(s) ds

If the drug is introduced rapidly as a brief pulse of unit magnitude, that is, u(t) = δ(t), then

y(t) = p_3 p_2 e^{-p_1 t} = G(θ)

The forward problem is: given θ, find y. Our focus is on the inverse problem: given y, find θ.

Inverse problems are hard!
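The forward map for this example can be sketched in a few lines. The numerical parameter values below are illustrative choices, not values from the slides:

```python
import math

def forward(theta, t):
    """Impulse response of the one-compartment drug model:
    y(t) = p3 * p2 * exp(-p1 * t) for u(t) = delta(t)."""
    p1, p2, p3 = theta
    return p3 * p2 * math.exp(-p1 * t)

# Forward problem: given theta, compute y at sample times.
theta_true = (0.5, 2.0, 1.5)          # illustrative values
times = [0.0, 1.0, 2.0]
y = [forward(theta_true, t) for t in times]
# The inverse problem would start from y (plus noise) and try to recover theta.
```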


Parameter Estimation: Concepts (cont'd)

Consider a model described by

∫_0^1 g(t,s) θ(s) ds = y(t)

a Fredholm integral equation of the first kind. Even in the simplest case, g(t,s) ≡ 1, the system

∫_0^1 θ(s) ds = y(t)

has no solution unless y(t) is a constant! Moreover, when a solution does exist, the solution is not unique!

Existence. There may be no model that exactly fits the data (the mathematical model is only an approximation, or the data contain noise).

Uniqueness. If exact solutions do exist, they may not be unique, even for an infinite number of exact data points.


Parameter Estimation: Concepts (cont'd)

Even if we do not encounter existence or uniqueness issues, instability is a fundamental feature of inverse problems. Consider the simpler system

A θ = y

If A is nearly singular (det A ≈ 0, or the condition number of the matrix is very large), a small change in the measurements leads to a large change in the model parameters. This issue lies in the mathematical model itself, not in the particular algorithm used to solve the problem. Inverse problems where this situation arises are referred to as ill-posed.

Issues in inverse problems: solution existence, solution uniqueness, and instability of the solution process.
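The instability can be seen in a small numerical sketch (the matrix and measurements below are illustrative):

```python
import numpy as np

# A nearly singular 2x2 system A @ theta = y: the columns are almost
# linearly dependent, so the condition number is huge.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
y = np.array([2.0, 2.0001])
theta = np.linalg.solve(A, y)                 # exact data -> theta = (1, 1)

# Perturb the second measurement by 1e-4 and solve again.
y_noisy = y + np.array([0.0, 1e-4])
theta_noisy = np.linalg.solve(A, y_noisy)

cond = np.linalg.cond(A)                      # very large condition number
shift = np.linalg.norm(theta_noisy - theta)   # order-1 change in the parameters
```

A measurement change of order 1e-4 moves the recovered parameters by order 1, exactly the amplification the condition number predicts.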


Parameter Estimation: Concepts (cont'd)

Remarks:
For a model describing the relationship between model parameters and data as G(θ) = y:

In practice, y may be a function of time and/or space, or may be a collection of discrete observations. An important issue is that actual observations always contain some amount of noise (from instrument readings, human errors, numerical round-off, etc.). We can thus envision data as generally consisting of noiseless observations from a "perfect" experiment, y_true = G(θ_true), plus a noise component ε:

y = y_true + ε = G(θ_true) + ε

Statistical methods for parameter estimation and inference (Prof. Banks)


Sensitivity Identifiability

References:
- J.G. Reid, Structural identifiability in linear time-invariant systems, IEEE Trans. AC 22: 242-246, 1977.
- M.S. Grewal and K. Glover, Identifiability of linear and nonlinear dynamical systems, IEEE Trans. AC 21: 833-837, 1976.
- J.J. DiStefano and C. Cobelli, On parameter and structural identifiability: Nonunique observability/reconstructibility for identifiable systems, other ambiguities and new definitions, IEEE Trans. AC 25: 830-833, 1980.

GRAZ 2007


Sensitivity Identifiability (cont'd)

Consider the simple example of drug concentration dynamics given by

dx(t)/dt = -p_1 x(t) + p_2 u(t),   x(0) = 0
y(t) = p_3 x(t)

For u(t) = δ(t),

y(t) = p_3 p_2 e^{-p_1 t}

It is clear that only p_1 and the product p_3 p_2 can be determined (and not p_2 or p_3 individually). We say that the model is unidentifiable!
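The unidentifiability is easy to confirm numerically: two parameter vectors with the same p_1 and the same product p_2 p_3 produce identical outputs, so no data set can separate them. The values are illustrative:

```python
import math

def y(theta, t):
    # Impulse response y(t) = p3 * p2 * exp(-p1 * t)
    p1, p2, p3 = theta
    return p3 * p2 * math.exp(-p1 * t)

# Two different parameter vectors with the same p1 and the same product p2*p3.
theta_a = (0.5, 2.0, 3.0)
theta_b = (0.5, 3.0, 2.0)   # p2 and p3 swapped: p2*p3 = 6 in both cases

# Their outputs agree at every time point checked.
same = all(abs(y(theta_a, t) - y(theta_b, t)) < 1e-12
           for t in [0.0, 0.5, 1.0, 2.0])
```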


Sensitivity Identifiability (cont'd)

More generally, consider the system-experiment model (or simply, structure)

dx/dt = f(x(t,θ), u(t), t; θ),   x(t_0, θ) = x_0,
y(t,θ) = g(x(t,θ); θ)

x ∈ R^n,   θ ∈ R^p,   y ∈ R^m

A standard approach to estimate the unknown parameters θ is to minimize the least-squares error criterion

J(θ) = ∫_{t_0}^T [y(t,θ) - z(t)]^2 dt,

where z(t) is the data fitted by the model output y(t,θ) by optimum choice of θ.


Sensitivity Identifiability (cont'd)

Structural Identifiability
The given structure is said to be locally identifiable at θ_0 if J(θ) has a local minimum at θ_0. If the minimum is global, the structure is said to be globally identifiable. These concepts have also become known as (local and global) least-squares identifiability.

Output Distinguishability
This notion addresses the question of whether system outputs obtained with different parameter values can be quantitatively distinguished from each other (a local concept). It has been shown that local output distinguishability is equivalent to local (structural) identifiability.


Sensitivity Identifiability (cont'd)

Sensitivity Identifiability
This notion is defined in terms of the output sensitivity functions with respect to the parameters, that is,

∂y_i/∂θ_j (t, θ_0)   (a local concept!)

Define the m × p sensitivity function matrix

S(t,θ) = [ ∂y_1/∂θ_1  ...  ∂y_1/∂θ_p
               ⋮               ⋮
           ∂y_m/∂θ_1  ...  ∂y_m/∂θ_p ]


Sensitivity Identifiability (cont'd)

Now, let Δθ = θ - θ_0 denote a small perturbation from θ_0. This gives rise to a small perturbation in the output, Δy = y(t,θ) - y(t,θ_0). Then, by the chain rule for differentiation, we obtain the following (approximate) relationship:

Δy = S Δθ

A structure is then said to be sensitivity identifiable if the above equation can be solved uniquely for Δθ. This is the case if and only if rank(S) = p, or equivalently, if and only if det(S^T S) ≠ 0.

It is clear that (local) output distinguishability and sensitivity identifiability are equivalent concepts.

How do we compute the sensitivity function matrix S(t,θ) = [∂y_i(t,θ)/∂θ_j]?


Sensitivity Identifiability (cont'd)

Remarks:
The name sensitivity in the sensitivity function matrix S(t,θ) = [∂y_i(t,θ)/∂θ_j] is used because the elements of the matrix are precisely the sensitivity functions (to be introduced by Prof. Kappel).

Problem:
How do we compute the elements of the sensitivity function matrix S(t,θ) = [∂y_i(t,θ)/∂θ_j]?


Sensitivity Identifiability (cont'd)

Finite differences:

∂y_i/∂θ_j (t, θ_0) ≈ [y_i(t, θ_0 + h e_j) - y_i(t, θ_0)] / h,

where e_j = (0, 0, ..., 0, 1, 0, ..., 0)^T (with the 1 in the j-th position) and h = √ε, ε = machine epsilon.
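The finite-difference recipe above can be sketched for the impulse-response model, checked against the analytic sensitivities. Parameter values are illustrative:

```python
import math

def y(theta, t):
    # Impulse response of the drug model: y(t) = p3 * p2 * exp(-p1 * t)
    p1, p2, p3 = theta
    return p3 * p2 * math.exp(-p1 * t)

def fd_sensitivity(theta, t, j, h=None):
    """Forward-difference approximation of dy/dtheta_j at (t, theta)."""
    if h is None:
        h = math.sqrt(2.2e-16)           # square root of machine epsilon
    pert = list(theta)
    pert[j] += h                         # theta_0 + h * e_j
    return (y(pert, t) - y(theta, t)) / h

theta0 = (0.5, 2.0, 1.5)                 # illustrative values
t = 1.0
# Analytic sensitivities at t = 1:
# dy/dp1 = -t*p3*p2*e^{-p1 t}, dy/dp2 = p3*e^{-p1 t}, dy/dp3 = p2*e^{-p1 t}
exact = (-3.0 * math.exp(-0.5), 1.5 * math.exp(-0.5), 2.0 * math.exp(-0.5))
approx = [fd_sensitivity(theta0, t, j) for j in range(3)]
```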


Sensitivity Identifiability (cont'd)

Direct method:
Using the chain rule for differentiation,

dx/dt = f(x(t,θ), t; θ)   ⟹   d/dt (dx/dθ) = (∂f/∂x)(dx/dθ) + ∂f/∂θ

y(t,θ) = g(x(t,θ); θ)   ⟹   dy/dθ = (∂g/∂x)(dx/dθ) + ∂g/∂θ

with initial conditions x(t_0, θ) = x_0, dx/dθ (t_0) = 0.

Remarks: The derivatives ∂g/∂x, ∂g/∂θ, ∂f/∂x, ∂f/∂θ can be computed by hand (tedious and error-prone) or by Automatic Differentiation (TOMLAB/MAD, http://tomopt.com/tomlab/products/mad/).
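The direct method can be sketched for the drug model by integrating the sensitivity ODEs alongside the state. This is a minimal forward-Euler sketch, not a production integrator; the impulse input is handled as the equivalent initial condition x(0+) = p_2, and the parameter values are illustrative:

```python
import math

# Drug model with impulse input u = delta(t): equivalent to
#   dx/dt = -p1 * x,  x(0+) = p2,  y = p3 * x.
# Sensitivity states s_j = dx/dp_j obey the chain-rule ODEs
#   ds1/dt = -x - p1 * s1,  s1(0) = 0
#   ds2/dt = -p1 * s2,      s2(0) = 1   (since x(0+) = p2)
p1, p2, p3 = 0.5, 2.0, 1.5              # illustrative values
dt, T = 1e-4, 1.0
x, s1, s2 = p2, 0.0, 1.0
t = 0.0
while t < T - 1e-12:
    dx_  = -p1 * x
    ds1 = -x - p1 * s1
    ds2 = -p1 * s2
    x, s1, s2 = x + dt * dx_, s1 + dt * ds1, s2 + dt * ds2
    t += dt

# Output sensitivities: dy/dp1 = p3*s1, dy/dp2 = p3*s2, dy/dp3 = x
dy_dp1, dy_dp2, dy_dp3 = p3 * s1, p3 * s2, x
# Analytic values at t = 1 for comparison:
exact = (-p3 * p2 * math.exp(-p1), p3 * math.exp(-p1), p2 * math.exp(-p1))
```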


Sensitivity Identifiability (cont'd)

Remarks: Relationship to the Fisher Information Matrix
For noisy data,

z(t_l) = y(t_l, θ) + e(t_l)

The Fisher information matrix F (Prof. Kappel's lecture) is a measure of the amount of information about the unknown parameters available in the noisy data. It is intimately related to the identifiability question in the broadest sense.

F(θ) = E[ (∂ log f(z|θ)/∂θ)^T (∂ log f(z|θ)/∂θ) ]

z - augmented vector of measurements
f(z|θ) - conditional probability density function of z given θ

Now, if the noise in the data has zero mean, unit variance, and identical normal distribution at each t_l, and if the errors e(t_l) are uncorrelated, then F = S^T S.
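For the drug example, F = S^T S can be assembled directly from the analytic sensitivities, and its rank deficiency reflects exactly the unidentifiability noted earlier (only p_1 and p_2 p_3 are determined). Parameter values and sample times are illustrative:

```python
import math
import numpy as np

# Sensitivity matrix S for y(t) = p3*p2*exp(-p1*t), stacked over sample times;
# under i.i.d. unit-variance Gaussian noise, F = S^T S.
p1, p2, p3 = 0.5, 2.0, 1.5              # illustrative values
times = [0.25, 0.5, 1.0, 2.0, 4.0]

rows = []
for t in times:
    e = math.exp(-p1 * t)
    rows.append([-t * p3 * p2 * e,      # dy/dp1
                 p3 * e,                # dy/dp2
                 p2 * e])               # dy/dp3
S = np.array(rows)
F = S.T @ S

# The dy/dp2 and dy/dp3 columns are proportional, so F is rank-deficient:
rank = np.linalg.matrix_rank(F)
```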


Sensitivity Identifiability (cont'd)

Remarks:
- Nonsingularity of the Fisher information matrix has been shown to be a necessary and sufficient condition for identifiability for a large class of problems (including stochastic models, with noise also in the dynamical system)
- The Fisher information matrix is also used in the computation of the generalized sensitivity functions (Prof. Kappel)
- The Fisher information matrix is a very useful tool for optimizing the design variables in a parameter estimation experiment. In particular, it has been used to optimize sampling schedules and test-inputs in physiological studies

References:
- J.J. DiStefano III, Matching the model and the experiment to the goals: Data limitations, complexity and optimal experiment design for dynamic systems with biochemical signals, J. Cybern. Inf. Sci. 2: 2-4, 1979.
- F. Mori and J.J. DiStefano III, Optimal nonuniform sampling interval and test-input design for identification of physiological systems from very limited data, IEEE Trans. AC 24: 893-900, 1979.


Minimization (Nonlinear Least-squares Problem)

Reference:
- C.T. Kelley, Iterative Methods for Optimization, SIAM, 1999.

Consider the structure

dx/dt = f(x(t,θ), u(t), t; θ),   x(t_0, θ) = x_0,
y(t,θ) = g(x(t,θ); θ)

x ∈ R^n,   θ ∈ R^p,   y ∈ R^m

In the least-squares formulation, the parameters θ are found by minimizing

J(θ) = ∫_{t_0}^T [y(t,θ) - z(t)]^2 dt

Algorithms fall into two classes:
- Gradient-free methods (sampling methods)
- Gradient-based methods (Quasi-Newton, subset selection)


Minimization (cont'd)

Gradient-free methods (Nelder-Mead algorithm)
The algorithm uses the concept of a simplex, which is a polytope of p + 1 vertices in p dimensions (a line segment in one-parameter space, a triangle in two-parameter space, a tetrahedron in three-parameter space, etc.)

Consider a two-parameter estimation, θ = (p_1, p_2); the simplex is a triangle with vertices θ_1, θ_2, θ_3.

Evaluate J(θ_i) and sort so that

J(θ_1) ≤ J(θ_2) ≤ ... ≤ J(θ_{p+1})   (θ_{p+1} is the worst point)

Idea: Replace the worst point with a point with a lower cost value!


Minimization (cont'd)

Nelder-Mead algorithm (cont'd)
- Compute the centroid of the simplex (not including the worst point):

  θ̄ = (1/p) Σ_{l=1}^p θ_l

- Replace θ_{p+1} with

  θ_new = -μ θ_{p+1} + (1 + μ) θ̄,   μ ∈ {1, 2, 1/2, -1/2} = (μ_r, μ_e, μ_oc, μ_ic)

  corresponding to the reflection, expansion, outside-contraction, and inside-contraction points.

- If none of these values is better than the previous worst, the algorithm shrinks the simplex towards the best point θ_1:

  θ_i^new = (θ_i + θ_1) / 2


Minimization (cont'd)

Nelder-Mead algorithm (cont'd)
Remarks:
- In theory, the Nelder-Mead algorithm is not guaranteed to converge to a minimum and can stagnate at a suboptimal point. However, in practice, the method performs well and often produces an initial rapid decrease of the objective function value.
- This suggests a hybrid approach: use the Nelder-Mead algorithm initially, then a gradient-based method (Gauss-Newton algorithm) to take advantage of the fast local convergence of the Gauss-Newton method.
- Other sampling methods: Implicit Filtering, DIRECT, genetic algorithms
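The simplex steps above can be sketched in a simplified Nelder-Mead loop (reflection, expansion, inside contraction, shrink; the outside-contraction case is omitted for brevity). The model, data, and starting point are illustrative:

```python
import numpy as np

def nelder_mead(J, theta0, n_iter=300, scale=0.1):
    """Simplified Nelder-Mead sketch with the standard coefficients
    (reflection 1, expansion 2, contraction 1/2, shrink toward the best)."""
    p = len(theta0)
    simplex = [np.asarray(theta0, float)]
    for j in range(p):                            # initial simplex vertices
        v = simplex[0].copy()
        v[j] += scale
        simplex.append(v)
    for _ in range(n_iter):
        simplex.sort(key=J)                       # best first, worst last
        best, worst = simplex[0], simplex[-1]
        centroid = np.mean(simplex[:-1], axis=0)  # excludes the worst point
        refl = centroid + (centroid - worst)      # reflection
        if J(refl) < J(best):
            exp = centroid + 2.0 * (centroid - worst)   # expansion
            simplex[-1] = exp if J(exp) < J(refl) else refl
        elif J(refl) < J(simplex[-2]):
            simplex[-1] = refl
        else:
            contr = centroid - 0.5 * (centroid - worst) # inside contraction
            if J(contr) < J(worst):
                simplex[-1] = contr
            else:                                 # shrink toward the best point
                simplex = [best] + [(v + best) / 2.0 for v in simplex[1:]]
    simplex.sort(key=J)
    return simplex[0]

# Fit the identifiable quantities (p1, c = p2*p3) of the impulse-response
# model y(t) = c * exp(-p1 * t); values are illustrative.
times = np.linspace(0.1, 3.0, 15)
z = 3.0 * np.exp(-0.5 * times)                    # noiseless "data"

def cost(theta):
    p1, c = theta
    return float(np.sum((c * np.exp(-p1 * times) - z) ** 2))

theta_hat = nelder_mead(cost, [1.0, 1.0])
```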


Minimization (cont'd)

Gradient-based Methods (Gauss-Newton Algorithm)
For discrete measurements,

J(θ) = Σ_{i=1}^k [y(t_i, θ) - z_i]^2

Define

Y(θ) = [y(t_1,θ) - z_1, ..., y(t_k,θ) - z_k]^T,   J(θ) = Y^T(θ) Y(θ)

Start with some current parameter estimate θ_c. A new estimate is computed as

θ_n = θ_c + Δθ   (Δθ is the correction)

Idea: Compute the correction term from a quadratic expansion of the cost functional J(θ_new) = J(θ_c + Δθ).


Minimization (cont'd)

Gauss-Newton Algorithm (cont'd)

J(θ_new) = J(θ_c + Δθ)
         = Y^T(θ_c + Δθ) Y(θ_c + Δθ)
         ≈ [Y(θ_c) + (∂Y/∂θ)(θ_c) Δθ]^T [Y(θ_c) + (∂Y/∂θ)(θ_c) Δθ]
         = Y^T(θ_c) Y(θ_c) + 2 Δθ^T (∂Y/∂θ)^T(θ_c) Y(θ_c) + Δθ^T (∂Y/∂θ)^T(θ_c) (∂Y/∂θ)(θ_c) Δθ

This is a quadratic function in Δθ. The minimum is given by taking the derivative with respect to Δθ and setting it equal to zero to obtain

(∂Y/∂θ)^T(θ_c) (∂Y/∂θ)(θ_c) Δθ = - (∂Y/∂θ)^T(θ_c) Y(θ_c),

where the matrix on the left-hand side is S^T S.
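The normal-equation step above translates into a short iteration. The model, data, and starting guess below are illustrative:

```python
import numpy as np

# Gauss-Newton iteration for fitting y(t; theta) = c * exp(-p1 * t),
# i.e. the identifiable parameters (p1, c = p2*p3) of the slide example.
times = np.linspace(0.1, 3.0, 15)
z = 3.0 * np.exp(-0.5 * times)                  # noiseless "data"

def residual(theta):
    p1, c = theta
    return c * np.exp(-p1 * times) - z          # Y(theta), one entry per t_i

def jacobian(theta):
    p1, c = theta
    e = np.exp(-p1 * times)
    # Columns: dY/dp1 = -t*c*e, dY/dc = e  (the sensitivity matrix S)
    return np.column_stack([-times * c * e, e])

theta = np.array([0.7, 2.5])                    # current estimate theta_c
for _ in range(20):
    Y = residual(theta)
    S = jacobian(theta)
    # Normal equations: (S^T S) dtheta = -S^T Y
    dtheta = np.linalg.solve(S.T @ S, -S.T @ Y)
    theta = theta + dtheta
```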


Minimization (cont'd)

Gauss-Newton Algorithm (cont'd)
Remarks:
- Clearly, the method fails if the matrix S^T S is (almost) singular. A well-known remedy is to modify the correction as

  (S^T S + λI) Δθ = - (∂Y/∂θ)^T(θ_c) Y(θ_c)

  The positive parameter λ is adjusted based on how nearly singular the matrix S^T S is (Levenberg-Marquardt algorithm). It is chosen to balance between the steepest-descent step (very slow but certain convergence) and the Gauss-Newton step (fast but uncertain convergence).
- To obtain a sufficient decrease in the cost functional, take θ_n = θ_c + s Δθ with backtracking (Armijo's rule):

  J(θ_n) ≤ J(θ_c) - s ||θ_n - θ_c||
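A common λ-adjustment scheme (shrink λ after a successful step, grow it after a failed one) can be sketched as follows; the model, schedule constants, and the deliberately poor starting guess are illustrative:

```python
import numpy as np

# Levenberg-Marquardt sketch: the correction solves
# (S^T S + lam*I) dtheta = -S^T Y, with lam halved after a successful
# step and increased tenfold after a failed one.
times = np.linspace(0.1, 3.0, 15)
z = 3.0 * np.exp(-0.5 * times)                  # noiseless "data"

def residual(theta):
    p1, c = theta
    return c * np.exp(-p1 * times) - z

def jacobian(theta):
    p1, c = theta
    e = np.exp(-p1 * times)
    return np.column_stack([-times * c * e, e])

theta, lam = np.array([2.0, 0.5]), 1.0          # deliberately poor start
for _ in range(50):
    Y, S = residual(theta), jacobian(theta)
    dtheta = np.linalg.solve(S.T @ S + lam * np.eye(2), -S.T @ Y)
    trial = theta + dtheta
    if np.sum(residual(trial) ** 2) < np.sum(Y ** 2):
        theta, lam = trial, lam * 0.5           # success: toward Gauss-Newton
    else:
        lam = lam * 10.0                        # failure: toward steepest descent
```

Only improving steps are accepted, which is what makes the method robust from starting points where plain Gauss-Newton might overshoot.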


Minimization (cont'd)

Subset Selection
Reference:
- M. Burth, G.C. Verghese, and M. Velez-Reyes, Subset selection for improved parameter estimation in on-line identification of a synchronous generator, IEEE Trans. on Power Systems 14: 218-225, 1999.

Idea:
- Use sensitivity information to partition the parameter set into two subsets: one subset associated with highly sensitive parameters and one subset associated with relatively insensitive parameters
- When solving for parameters, only update those in the first subset (the highly sensitive parameter set). This way, the problem should be better conditioned!


Kalman-filter Based Method

References:
- R.E. Kalman, A new approach to linear filtering and prediction problems, Trans. of the ASME - Journal of Basic Engineering 82 (Series D): 35-45, 1960.
- F.L. Lewis, Optimal Estimation with an Introduction to Stochastic Control Theory, John Wiley & Sons, 1986.

Kalman Filter (A Hypothetical Example)
Suppose there are 3 contestants on the Price is Right show! The first contestant's estimate of the price is z_1 with standard deviation σ_z1, described by the conditional probability density f(x | z_1).


Kalman-filter Based Method (cont'd)

The second contestant's estimate of the price is z_2 with standard deviation σ_z2, described by the conditional probability density f(x | z_2).

At this point, there are two estimates available for predicting the correct price. Question: How do you (the third contestant) combine these data (so that you have a better estimate than either the first or the second contestant)?

μ = [σ_z2^2 / (σ_z1^2 + σ_z2^2)] z_1 + [σ_z1^2 / (σ_z1^2 + σ_z2^2)] z_2

1/σ^2 = 1/σ_z1^2 + 1/σ_z2^2

If σ_z1 = σ_z2, the best estimate is the average of the two. If σ_z1 > σ_z2 (i.e., z_2 is a better estimate), then the formula indicates that we should weight our estimate more toward z_2.


Kalman-filter Based Method (cont'd)

Rewrite the updates as:

μ = z_1 + K [z_2 - z_1],   K = σ_z1^2 / (σ_z1^2 + σ_z2^2)

σ^2 = σ_z1^2 - K σ_z1^2

Now, suppose that z_1 is the output from your model and z_2 is the measurement. The Kalman filter is a technique that combines the model output with the measurement to arrive at a better estimate for the model output by considering both the error in the model and the error in the data.
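The two forms of the update can be checked numerically; both give the same fused mean and a variance smaller than either input (the numbers are illustrative):

```python
# Combining two scalar estimates z1, z2 with variances var1, var2.
z1, z2 = 100.0, 110.0
var1, var2 = 4.0, 1.0            # second estimate is more reliable

# Variance-weighted form:
mu = (var2 / (var1 + var2)) * z1 + (var1 / (var1 + var2)) * z2
var = 1.0 / (1.0 / var1 + 1.0 / var2)

# Update ("Kalman gain") form: mu = z1 + K*(z2 - z1), var = var1 - K*var1
K = var1 / (var1 + var2)
mu_k = z1 + K * (z2 - z1)
var_k = var1 - K * var1
```

Because var2 < var1, the fused mean lands closer to z_2, exactly as the slide argues.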


Kalman-filter Based Method (cont'd)

Problem: How do we extend this to dynamical systems?
For a linear system,

dx/dt = A(θ) x + g(t) w(t)
z_k = C(θ) x(t_k) + v_k

where w(t) and v_k are white noise processes with zero means and covariances Q and R, respectively. First, we define the conditional expected value and the conditional variance

x̂ = E[ x(t) | (z_k : t_k < t) ]
P(t) = E[ (x - x̂)(x - x̂)^T | (z_k : t_k < t) ]

The Kalman filter then attempts to estimate the true state with a "predictor-corrector" sort of implementation. Note that to also estimate the parameters θ, we augment the state equations with dθ/dt = 0.


Kalman-filter Based Method (cont'd)

Predictor (no measurement is used), for t_{k-1} < t < t_k:

dx̂/dt = A x̂
dP/dt = P A^T + A P + g Q g^T

Corrector (a new measurement is used):

x̂_k = x̂_k^- + K_k [z_k - C x̂_k^-]
K_k = P^-(t_k) C^T [C P^-(t_k) C^T + R]^{-1}
P(t_k) = [I - K_k C] P^-(t_k)

For nonlinear systems, one idea is to linearize the nonlinearities (Extended Kalman Filter). Other ideas, such as Gaussian filters and the Unscented Kalman Filter, attempt to approximate the underlying conditional distribution by applying the nonlinear function to a set of points and recovering information on this distribution based on the effect the function has on these points. In the ensemble Kalman filter, a Monte Carlo method is used to solve the time evolution equation of the probability density of the model state. Finally, neural networks are a viable approach for model design and parameter estimation (Dr. Spyros Courellis' lecture).

Reference: G. Evensen, The ensemble Kalman filter: Theoretical formulation and practical implementation, Ocean Dyn. 53: 343-367, 2003.
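The predictor-corrector structure can be sketched in its simplest setting: a discrete-time scalar analogue of the continuous-discrete filter on the slides. All numerical values are illustrative:

```python
import random

# Discrete-time scalar sketch of the predictor-corrector structure:
#   x_{k+1} = a * x_k + w_k,  z_k = x_k + v_k,  w ~ N(0,Q), v ~ N(0,R).
# (The slides use the continuous-discrete form; this is the simplest analogue.)
random.seed(0)
a, Q, R = 0.95, 0.01, 1.0
x_true, x_hat, P = 5.0, 0.0, 10.0     # true state, estimate, estimate variance

for _ in range(200):
    # Simulate the system and a noisy measurement.
    x_true = a * x_true + random.gauss(0.0, Q ** 0.5)
    z = x_true + random.gauss(0.0, R ** 0.5)
    # Predictor (no measurement used):
    x_hat = a * x_hat
    P = a * P * a + Q
    # Corrector (new measurement used):
    K = P / (P + R)                   # Kalman gain
    x_hat = x_hat + K * (z - x_hat)
    P = (1.0 - K) * P

err = abs(x_hat - x_true)             # tracking error after 200 steps
```

The variance recursion for P is independent of the data, so P settles to a deterministic steady-state value; the estimate then tracks the true state to within a few multiples of √P.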
