
…representation. This is the fundamental background to the Kernel Trick, which is a way of non-linearizing algorithms that depend only on the inner product between data-points. Even though it is an accepted term, it is not clear where the term was first suggested. The Kernel Trick is based on the idea that, rather than finding a specific mapping Φ that takes the data to the feature space F, we specify a function k(y_i, y_j), called the kernel function, that parameterizes the inner product between Φ(y_i) and Φ(y_j),

k(y_i, y_j) = Φ(y_i)^T Φ(y_j). (2.32)
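To make Eq. 2.32 concrete, the following sketch (an illustrative example, not taken from the text) uses a homogeneous second-degree polynomial kernel on 2-D inputs, for which the explicit feature map Φ is known, and verifies that the two sides of the equation agree:

```python
import numpy as np

# Illustrative kernel: k(y_i, y_j) = (y_i^T y_j)^2 on 2-D inputs.
# Its explicit feature map is Phi(y) = [y_1^2, sqrt(2)*y_1*y_2, y_2^2],
# so the kernel evaluates the inner product in F without ever mapping there.

def kernel(yi, yj):
    """Kernel function: the feature-space inner product, computed in input space."""
    return float(yi @ yj) ** 2

def phi(y):
    """Explicit feature map into F (only needed here to verify Eq. 2.32)."""
    return np.array([y[0] ** 2, np.sqrt(2.0) * y[0] * y[1], y[1] ** 2])

yi = np.array([1.0, 2.0])
yj = np.array([0.5, -1.0])

print(kernel(yi, yj))      # 2.25
print(phi(yi) @ phi(yj))   # 2.25, i.e. k(y_i, y_j) = Phi(y_i)^T Phi(y_j)
```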

Evaluated between each pair of points in the data-set, the kernel function k defines the kernel matrix K(Y, Y), which is the Gram matrix in the feature space F. From Eq. 2.17 we know that the Gram matrix and a distance matrix are interchangeable representations for centered data. Therefore, as long as the kernel function k specifies a valid Gram matrix K, there is an underlying geometrical representation of the data in F.
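As a small illustration (an assumed sketch, not code from the text), the kernel matrix K(Y, Y) is formed by evaluating the kernel function between every pair of data-points; the squared-exponential kernel used here is just one example of a valid choice:

```python
import numpy as np

def rbf_kernel(yi, yj, length_scale=1.0):
    """A common Mercer kernel (squared exponential); any valid kernel fits here."""
    return np.exp(-0.5 * np.sum((yi - yj) ** 2) / length_scale ** 2)

def kernel_matrix(Y, k):
    """Evaluate k between each pair of rows of Y to form K(Y, Y)."""
    n = Y.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = k(Y[i], Y[j])
    return K

Y = np.random.randn(5, 3)          # five data-points in three dimensions
K = kernel_matrix(Y, rbf_kernel)   # the Gram matrix of Phi(Y) in the feature space F
```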

The class of kernel functions that specify geometrically representable feature spaces is known as Mercer Kernels [41, 50]. Mercer Kernels are positive semidefinite, i.e., in the spectral decomposition of the resulting kernel matrix K all eigenvalues are non-negative. Intuitively, this can be understood through Eq. 2.17: if an eigenvalue were negative, then adding basis vectors would reduce the distance between two points, which is not possible in a Euclidean space. When a kernel function is used to represent the data, the feature space F is known as a kernel-induced feature space.
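The positive-semidefiniteness condition can be checked numerically (a minimal helper, assuming the kernel matrix K built in the sketch above) by inspecting the eigenvalues of the symmetric matrix K:

```python
import numpy as np

def is_positive_semidefinite(K, tol=1e-10):
    """Mercer's condition in practice: every eigenvalue of the symmetric
    kernel matrix K must be non-negative (up to numerical tolerance)."""
    eigenvalues = np.linalg.eigvalsh(K)
    return bool(np.all(eigenvalues >= -tol))
```

For the squared-exponential kernel above this check returns True; a function yielding negative eigenvalues would not correspond to any geometrical configuration of points in F.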

One advantage of using a kernel-induced feature space is that if we aim to
