Information Theory, Inference, and Learning ... - Inference Group
Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge.org/0521642981
You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.

45.2: From parametric models to Gaussian processes

and the (n, n') entry of C is

    C_{nn'} = \sigma_w^2 \sum_h \phi_h(x^{(n)}) \phi_h(x^{(n')}) + \delta_{nn'} \sigma_\nu^2,    (45.26)

where \delta_{nn'} = 1 if n = n' and 0 otherwise.

Example 45.4. Let's take as an example a one-dimensional case, with radial basis functions. The expression for Q_{nn'} becomes simplest if we assume we have uniformly-spaced basis functions, with the basis function labelled h centred on the point x = h, and take the limit H -> infinity, so that the sum over h becomes an integral; to avoid having a covariance that diverges with H, we had better make \sigma_w^2 scale as S/(\Delta H), where \Delta H is the number of basis functions per unit length of the x-axis, and S is a constant; then

    Q_{nn'} = S \int_{h_{min}}^{h_{max}} dh \, \phi_h(x^{(n)}) \phi_h(x^{(n')})    (45.27)

            = S \int_{h_{min}}^{h_{max}} dh \, \exp\!\left[ -\frac{(x^{(n)} - h)^2}{2r^2} \right] \exp\!\left[ -\frac{(x^{(n')} - h)^2}{2r^2} \right].    (45.28)

If we let the limits of integration be +/- infinity, we can solve this integral:

    Q_{nn'} = \sqrt{\pi r^2} \, S \exp\!\left[ -\frac{(x^{(n')} - x^{(n)})^2}{4r^2} \right].    (45.29)

We are arriving at a new perspective on the interpolation problem. Instead of specifying the prior distribution on functions in terms of basis functions and priors on parameters, the prior can be summarized simply by a covariance function,

    C(x^{(n)}, x^{(n')}) \equiv \theta_1 \exp\!\left[ -\frac{(x^{(n')} - x^{(n)})^2}{4r^2} \right],    (45.30)

where we have given a new name, \theta_1, to the constant out front.

Generalizing from this particular case, a vista of interpolation methods opens up.
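The limit taken in equations (45.27)-(45.29) can be checked numerically: with densely, uniformly spaced Gaussian basis functions and \sigma_w^2 = S/(\Delta H), the finite sum over basis functions should approach the analytic kernel \sqrt{\pi r^2}\, S \exp[-(x'-x)^2/(4r^2)]. A minimal sketch (the particular values of r, S, \Delta H, and the evaluation points are illustrative assumptions, not from the text):

```python
import numpy as np

# Check the limit of equations (45.27)-(45.29): with uniformly spaced
# Gaussian basis functions phi_h(x) = exp(-(x - h)^2 / (2 r^2)) and
# sigma_w^2 = S / Delta_H, the sum over h approaches the integral's
# closed form sqrt(pi r^2) * S * exp(-(x' - x)^2 / (4 r^2)).

r = 1.0                                        # basis-function width (illustrative)
S = 2.0                                        # scale constant (illustrative)
delta_H = 100                                  # basis functions per unit length
centres = np.arange(-50.0, 50.0, 1.0 / delta_H)  # uniformly spaced centres h

def phi(x):
    """All radial basis functions evaluated at scalar x, one per centre."""
    return np.exp(-(x - centres) ** 2 / (2 * r ** 2))

x1, x2 = 0.3, 1.7                              # two illustrative input points
sigma_w2 = S / delta_H                         # sigma_w^2 scales as S / Delta_H

Q_sum = sigma_w2 * np.sum(phi(x1) * phi(x2))   # finite-H version of (45.27)
Q_analytic = np.sqrt(np.pi * r ** 2) * S * np.exp(-(x2 - x1) ** 2 / (4 * r ** 2))  # (45.29)

print(Q_sum, Q_analytic)                       # the two values agree closely
```

With 100 basis functions per unit length the Riemann sum is already an excellent approximation to the integral, which is the point of the H -> infinity limit in the text.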
Given any valid covariance function C(x, x') -- we'll discuss in a moment what 'valid' means -- we can define the covariance matrix for N function values at locations X_N to be the matrix Q given by

    Q_{nn'} = C(x^{(n)}, x^{(n')})    (45.31)

and the covariance matrix for N corresponding target values, assuming Gaussian noise, to be the matrix C given by

    C_{nn'} = C(x^{(n)}, x^{(n')}) + \sigma_\nu^2 \delta_{nn'}.    (45.32)

In conclusion, the prior probability of the N target values t in the data set is:

    P(t) = Normal(t; 0, C) = \frac{1}{Z} e^{-\frac{1}{2} t^T C^{-1} t}.    (45.33)

Samples from this Gaussian process and a few other simple Gaussian processes are displayed in figure 45.1.
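The construction above can be sketched directly: build Q from the covariance function of equation (45.30), add the noise term of (45.32) to get C, and draw sample functions t ~ Normal(0, C) as in equation (45.33). The hyperparameter values (\theta_1, r, \sigma_\nu) and the input grid below are illustrative assumptions:

```python
import numpy as np

# A minimal sketch of sampling from the Gaussian-process prior of
# equation (45.33), using the squared-exponential covariance function
# of equation (45.30). Hyperparameter values are illustrative.

rng = np.random.default_rng(0)

def covariance(x, xp, theta_1=1.0, r=1.0):
    """Covariance function of equation (45.30)."""
    return theta_1 * np.exp(-(xp - x) ** 2 / (4 * r ** 2))

x = np.linspace(-5.0, 5.0, 200)              # the N input locations X_N
Q = covariance(x[:, None], x[None, :])       # Q_{nn'} = C(x^(n), x^(n')), (45.31)

sigma_nu = 0.05                              # noise standard deviation (assumed)
C = Q + sigma_nu ** 2 * np.eye(len(x))       # C_{nn'}, equation (45.32)

# Draw three sample functions t ~ Normal(0, C) via a Cholesky factor L,
# since L @ z has covariance L @ L.T = C when z is standard normal.
L = np.linalg.cholesky(C)
samples = L @ rng.standard_normal((len(x), 3))

print(samples.shape)                         # (200, 3): three sample functions
```

Plotting the columns of `samples` against `x` reproduces the kind of smooth random functions shown in figure 45.1; the noise term \sigma_\nu^2 on the diagonal also keeps the Cholesky factorization numerically stable.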
