
2. Estimation theory

If the measurement $z$ is random, the coefficients $c_i$ are also random parameters. If they are required to be uncorrelated, we can write

\begin{align}
E\{cc^T\} &= E\{H^T zz^T H\} \tag{2.211}\\
&= H^T R_z H = \operatorname{diag}(\sigma_1^2,\ldots,\sigma_p^2) \tag{2.212}
\end{align}

This is an eigenproblem: the basis vectors are obtained as the eigenvectors of the data correlation matrix $R_z$. The sum (2.210) can then be called e.g. the discrete Karhunen-Loève transform or the principal component transform [157]. It can be shown that this choice of basis gives the minimum mean square error in $\hat{z}$ compared to any other set of the same number of basis vectors.
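As a concrete illustration, the following NumPy sketch builds the eigenvector basis of an estimated data correlation matrix and projects measurements onto the $p$ leading components; the synthetic data, the dimensions, and the choice $p = 3$ are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated measurements: columns of Z are realizations of z.
# (Sizes and data are illustrative assumptions.)
n, N = 8, 5000
A = rng.standard_normal((n, n))
Z = A @ rng.standard_normal((n, N))

# Estimate the data correlation matrix R_z = E{z z^T}.
R_z = (Z @ Z.T) / N

# The basis vectors are the eigenvectors of R_z; the eigenvalues are the
# coefficient variances sigma_i^2, so E{c c^T} is diagonal as in (2.212).
sigma2, H = np.linalg.eigh(R_z)        # ascending order
H, sigma2 = H[:, ::-1], sigma2[::-1]   # sort to descending variance

p = 3                                  # number of retained components
C = H[:, :p].T @ Z                     # Karhunen-Loeve / PCA coefficients
Z_hat = H[:, :p] @ C                   # rank-p reconstruction of z

# The mean square error equals the average of the discarded variances,
# which is the minimum over all bases of p vectors.
print(np.mean((Z - Z_hat) ** 2), sigma2[p:].sum() / n)
```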

In the wavelet transform the basis vectors are a set of orthogonal sampled functions with local support and (almost) non-overlapping spectra. The wavelet transform of the measurement can then be seen as filtering of the measurement with a bank of noncausal band-pass filters [40, 134].
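As a minimal sketch of this filter-bank view, the following uses the Haar filter pair (an assumed choice; the text does not fix a particular wavelet) to split a signal into low-pass and band-pass channels with downsampling by two.

```python
import numpy as np

# Haar analysis filter pair: local support, roughly complementary spectra.
# (The Haar choice and the test signal are illustrative assumptions.)
h_lo = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass
h_hi = np.array([1.0, -1.0]) / np.sqrt(2.0)  # band/high-pass

def haar_level(z):
    """One level of the discrete wavelet transform: filter, then
    downsample by two."""
    lo = np.convolve(z, h_lo)[1::2]   # approximation coefficients
    hi = np.convolve(z, h_hi)[1::2]   # detail (band-pass) coefficients
    return lo, hi

# Repeating the split on the low-pass branch gives the octave-band
# filter-bank interpretation of the wavelet transform.
z = np.sin(2 * np.pi * 0.05 * np.arange(64))
lo, hi1 = haar_level(z)
lo, hi2 = haar_level(lo)
print(len(hi1), len(hi2), len(lo))    # 32 16 16
```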

2.15 Modeling of prior information in Bayesian estimation

Modeling of prior information in Bayesian estimation is a fundamental problem. In some cases the parameters $\theta$ have a physical meaning and it is possible that we have some knowledge of their possible values. Typical interpretations of such knowledge (or assumptions) are that the parameters are positive, small, almost equal, or smoothly varying.
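As one concrete illustration (an assumption for this sketch, not a prescription from the text), "smoothly varying" or "almost equal" beliefs are often encoded through a first-difference matrix $D$, so that parameter vectors with small $\|D\theta\|$ are favored.

```python
import numpy as np

# Encode "smoothly varying" via a first-difference matrix D (illustrative):
# (D theta)_i = theta_{i+1} - theta_i, so smooth vectors give small ||D theta||.
p = 6
D = np.diff(np.eye(p), axis=0)           # (p-1) x p first differences

theta_smooth = np.linspace(0.0, 1.0, p)  # slowly varying
theta_rough = np.array([0., 1., 0., 1., 0., 1.])

print(np.linalg.norm(D @ theta_smooth))  # small: plausible under the prior
print(np.linalg.norm(D @ theta_rough))   # large: implausible under the prior
# A "small parameters" belief corresponds to D = I (shrinkage towards zero);
# positivity is typically imposed as a separate constraint on the density.
```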

A common way to incorporate prior information into Bayesian estimation is to use a prior density with a realistic shape. For example, if we know that the parameters should lie in some specific $p$-dimensional interval, we select the density to be a multidimensional Gaussian density with appropriate mean and covariance, so as to shrink the estimates towards the center of the desired area. The weaker our beliefs about the area are, the less information the prior density should contain; it then becomes more noninformative, i.e. flatter. In the limiting case we can use an improper prior density, which can still be useful for estimation, e.g. as a positivity constraint.
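The following sketch illustrates this shrinkage effect under an assumed linear Gaussian model $z = H\theta + v$ (the model, dimensions, and prior parameters are illustrative choices): the MAP estimate moves from the prior mean towards the least squares solution as the prior is made flatter.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed linear Gaussian model z = H theta + v, chosen here only to
# illustrate shrinkage by a Gaussian prior.
p, m, sigma = 4, 10, 0.5
H = rng.standard_normal((m, p))
theta_true = np.array([1.0, 2.0, 1.5, 0.5])
z = H @ theta_true + sigma * rng.standard_normal(m)

mu = np.full(p, 1.0)              # prior mean: center of the "desired area"
for tau in [0.01, 1.0, 100.0]:    # prior std: small = strong belief, large = flat
    C_inv = np.eye(p) / tau**2    # inverse prior covariance (isotropic)
    # MAP estimate for a Gaussian likelihood and a Gaussian prior:
    theta_map = np.linalg.solve(H.T @ H / sigma**2 + C_inv,
                                H.T @ z / sigma**2 + C_inv @ mu)
    print(tau, np.round(theta_map, 3))
# Small tau shrinks the estimate towards mu; large tau approaches the
# ordinary least squares solution (a nearly flat, noninformative prior).
```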

Since the parameters $\theta$ are random, we can assume that the form of the density of $\theta$ depends on hyperparameters $\phi$. Inference concerning $\theta$ is then based on the posterior density

\begin{align}
p(\theta|z,\phi) &= \frac{p(z,\theta|\phi)}{p(z|\phi)} = \frac{p(z,\theta|\phi)}{\int p(z,u|\phi)\,du} \tag{2.213}\\
&= \frac{p(z|\theta)\,p(\theta|\phi)}{\int p(z|u)\,p(u|\phi)\,du} \tag{2.214}
\end{align}
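As a numerical illustration of (2.213)-(2.214) for a scalar $\theta$ on a grid, the following assumes a Gaussian likelihood and a Gaussian prior whose standard deviation plays the role of the hyperparameter $\phi$; these model choices are for concreteness only.

```python
import numpy as np

def gauss(x, m, s):
    """Gaussian density N(x; m, s^2)."""
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

z, sigma = 1.3, 0.5          # one measurement, known noise std (assumed)
phi = 2.0                    # hyperparameter: prior std of theta (assumed)

u = np.linspace(-10, 10, 4001)
du = u[1] - u[0]

numerator = gauss(z, u, sigma) * gauss(u, 0.0, phi)   # p(z|theta) p(theta|phi)
evidence = np.sum(numerator) * du                     # ~ integral p(z|u) p(u|phi) du
posterior = numerator / evidence                      # p(theta|z,phi) as in (2.214)

print(np.sum(posterior) * du)    # ~1: the density is properly normalized
```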

If $\phi$ is not known, the proper full Bayesian approach would be to interpret it as random with a hyperprior $p(\phi)$. The desired posterior for $\theta$ is then obtained by marginalization:

$$p(\theta|z) = \frac{p(z,\theta)}{p(z)} = \frac{\int p(z,\theta,\phi)\,d\phi}{\int\!\!\int p(z,u,\phi)\,d\phi\,du} \tag{2.215}$$
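Continuing the same sketch, the following marginalizes over an assumed flat hyperprior $p(\phi)$ on a grid, as in (2.215); all densities and grid choices are again illustrative assumptions.

```python
import numpy as np

def gauss(x, m, s):
    """Gaussian density N(x; m, s^2)."""
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

z, sigma = 1.3, 0.5
theta = np.linspace(-10, 10, 2001)
phi = np.linspace(0.1, 5.0, 200)                 # hyperprior support (assumed)
dth, dph = theta[1] - theta[0], phi[1] - phi[0]

hyperprior = np.ones_like(phi) / (phi[-1] - phi[0])   # flat p(phi)

# p(z, theta) = integral of p(z|theta) p(theta|phi) p(phi) over phi, on the grid:
joint = gauss(z, theta[:, None], sigma) * gauss(theta[:, None], 0.0, phi[None, :])
p_z_theta = np.sum(joint * hyperprior[None, :], axis=1) * dph

posterior = p_z_theta / (np.sum(p_z_theta) * dth)     # p(theta|z) as in (2.215)
print(np.sum(posterior) * dth)                        # ~1: normalized density
```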
