

belongs to is that of the component with the maximum value.

$k(x, z)$ : Kernel function of the variables $x$ and $z$.

$K$ : Kernel matrix. Its columns are the coefficients of the kernel functions used to represent the discriminative function $f$.

$\|\cdot\|_{\mathcal{H}}$ : Norm in the Hilbert space $\mathcal{H}$.

$\langle \cdot, \cdot \rangle_{\mathcal{H}}$ : Inner product in the Hilbert space $\mathcal{H}$.

$\mathrm{tr}(M)$ : The trace of the matrix $M$, that is, the sum of the diagonal elements of $M$.

$\mathrm{span}\{\cdot\}$ : Subspace spanned by the given set of functions.

In the implementation of this kernel-based learning approach, the Radial Basis Function (RBF), or Gaussian, kernel is often used, and the kernel $k$ defines a unique RKHS. Since the Gaussian kernel is isotropic, it has spherical symmetry: it generally does not conform to the particular geometry of the underlying classes. In other words, the underlying data structure is ignored, and the kernel may therefore fail to provide an accurate decision surface. To address these limitations, it is crucial to define a new kernel that is well adapted to the geometry of the data distribution.
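As a concrete reference point, the Gaussian kernel discussed above can be written down in a few lines. This is a minimal NumPy sketch; the bandwidth `sigma` is an illustrative free parameter, not a value from the paper:

```python
import numpy as np

def rbf_kernel(X, Z, sigma=1.0):
    """Gaussian (RBF) kernel k(x, z) = exp(-||x - z||^2 / (2 sigma^2)).

    X: (n, d) and Z: (m, d) sample arrays; returns the (n, m) kernel matrix.
    NOTE: sigma is an assumed, illustrative bandwidth.
    """
    # Squared Euclidean distances via the expansion ||x||^2 - 2 x.z + ||z||^2
    sq = (X ** 2).sum(1)[:, None] - 2 * X @ Z.T + (Z ** 2).sum(1)[None, :]
    return np.exp(-sq / (2 * sigma ** 2))
```

The single bandwidth acts identically in every direction, which is exactly the spherical symmetry criticized above: the kernel cannot stretch or bend to follow the geometry of the classes.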

Instead of solving (8) as a traditional kernel-based learning approach would, we modify (or deform) the original kernel in order to adapt it to the geometry of the underlying distribution. Defining a new deformed kernel $\tilde{k}$ with RKHS $\tilde{\mathcal{H}}$, the new problem to be solved becomes

$$\tilde{f}^* = \arg\min_{f \in \tilde{\mathcal{H}}} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma \|f\|_{\tilde{\mathcal{H}}}^2 , \qquad (9)$$

Problems (8) and (9) are solved with different kernels, and thus in different RKHSs. The solution of (9) is

$$\tilde{f}^*(x) = \sum_{i=1}^{l} \alpha_i \, \tilde{k}(x_i, x) , \qquad (10)$$

which is appropriate for real settings.

To “deform” the original kernel and adapt it to the geometry of the underlying distribution, let $\mathcal{V}$ be a linear space with a positive semi-definite inner product, and let $S : \mathcal{H} \to \mathcal{V}$ be a bounded linear operator. Define $\tilde{\mathcal{H}}$ to be the space containing the same functions as $\mathcal{H}$, with its inner product defined as

$$\langle f, g \rangle_{\tilde{\mathcal{H}}} = \langle f, g \rangle_{\mathcal{H}} + \langle Sf, Sg \rangle_{\mathcal{V}} , \qquad (11)$$

It is proved in [27] that $\tilde{\mathcal{H}}$ is a valid RKHS. In this specific problem, it is required that $\mathcal{V}$ and $S$ depend on the data. Therefore, let $\mathcal{V}$ be $\mathbb{R}^{l+u}$, and define $S : \mathcal{H} \to \mathbb{R}^{l+u}$ as the evaluation map $S(f) = (f(x_1), \dots, f(x_{l+u}))$. Using a symmetric positive semi-definite matrix $M$, the semi-norm on $\mathcal{V}$ can be written as $\langle Sf, Sf \rangle_{\mathcal{V}} = \mathbf{f}^T M \mathbf{f}$, where $\mathbf{f} = (f(x_1), \dots, f(x_{l+u}))^T$

. With such a norm, the regularization problem in (9) becomes

$$f^* = \arg\min_{f \in \mathcal{H}} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma \left( \|f\|_{\mathcal{H}}^2 + \mathbf{f}^T M \mathbf{f} \right) \qquad (12)$$

where $\mathbf{f}$ includes both labeled and unlabeled data and the matrix $M$ encodes smoothness w.r.t. the graph or manifold.

Let $\gamma = \gamma_A$ and $M = (\gamma_I / \gamma_A) L$, and substitute them into (12); we have

$$f^* = \arg\min_{f \in \mathcal{H}} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_{\mathcal{H}}^2 + \gamma_I \, \mathbf{f}^T L \mathbf{f} \qquad (13)$$

where the ratio $\gamma_I / \gamma_A$ is a free parameter that controls the “deformation” of the original kernel. Thus, Equation (13) is in fact a graph-based semi-supervised learning problem based on the manifold assumption; it can be set up indirectly through (12) and solved using (10).
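The loss $V$ is left generic in (13); as one concrete instance, with the squared loss $V = (y_i - f(x_i))^2$ the minimizer of (13) has a closed form (the Laplacian RLS solution from the manifold-regularization literature). The sketch below is illustrative; the function name and the zero-filling convention for unlabeled targets are assumptions, not notation from the paper:

```python
import numpy as np

def lap_rls_alpha(K, L, y, labeled, gamma_A, gamma_I):
    """Closed-form minimizer of (13) for the squared loss (Laplacian RLS).

    K       : (n, n) kernel matrix over all l + u samples
    L       : (n, n) graph Laplacian
    y       : (n,) targets, zero-filled at unlabeled positions (assumption)
    labeled : (n,) boolean mask marking the l labeled samples
    Returns alpha such that f(x) = sum_i alpha_i k(x_i, x) over all n samples.
    """
    n = K.shape[0]
    l = int(labeled.sum())
    J = np.diag(labeled.astype(float))  # 1 at labeled diagonal entries, else 0
    # Setting the gradient of (13) to zero gives
    # (J K + gamma_A * l * I + gamma_I * l * L K) alpha = y
    A = J @ K + gamma_A * l * np.eye(n) + gamma_I * l * (L @ K)
    return np.linalg.solve(A, y)
```

Note that the expansion here runs over all $l+u$ samples, whereas solving (9) with the deformed kernel and applying (10) expands over the $l$ labeled samples only, the unlabeled information being carried inside $\tilde{k}$.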

To utilize the geometric information of the data distribution, a graph $G$ can be constructed using both labeled and unlabeled pixels. The graph Laplacian of $G$ is a matrix defined as $L = D - W$, where $W$ is the adjacency matrix. The elements $W_{ij}$ are measures of the similarity between pixels $x_i$ and $x_j$, and the diagonal matrix $D$ is given by $D_{ii} = \sum_j W_{ij}$. The graph Laplacian $L$ measures the variation of the function $f$ along the graph built upon all labeled and unlabeled samples. By fixing $M = 0$, the original (undeformed) kernel is obtained.
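A minimal sketch of the construction $L = D - W$ just described, using Gaussian similarities restricted to a nearest-neighbour graph; the neighbourhood size and bandwidth are illustrative choices, not values from the paper:

```python
import numpy as np

def graph_laplacian(X, n_neighbors=10, sigma=1.0):
    """Build L = D - W from the labeled and unlabeled samples X (n, d).

    W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)) when x_j is among the
    n_neighbors nearest neighbours of x_i (symmetrized), else 0;
    D_ii = sum_j W_ij.  Both hyperparameters are assumed, not from the paper.
    """
    sq = (X ** 2).sum(1)[:, None] - 2 * X @ X.T + (X ** 2).sum(1)[None, :]
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                       # no self-loops
    mask = np.zeros_like(W, dtype=bool)            # keep strongest edges only
    idx = np.argsort(-W, axis=1)[:, :n_neighbors]
    np.put_along_axis(mask, idx, True, axis=1)
    W = np.where(mask | mask.T, W, 0.0)            # symmetrize the graph
    D = np.diag(W.sum(1))
    return D - W
```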

Next, we discuss the computation of the deformed kernel $\tilde{k}$. In [27], the resulting new kernel was computed explicitly in terms of labeled and unlabeled data. It is proved that $\tilde{k}(x, \cdot) \in \mathcal{H}$ and

$$\mathrm{span}\{ \tilde{k}(x_i, \cdot) \}_{i=1}^{l+u} = \mathrm{span}\{ k(x_i, \cdot) \}_{i=1}^{l+u} \qquad (14)$$

Thus, the two spans are the same and we have

$$\tilde{k}(x, \cdot) = k(x, \cdot) + \sum_{i=1}^{l+u} \beta_i(x) \, k(x_i, \cdot) \qquad (15)$$

where the coefficients $\beta_i$ depend on $x$; let

$$\boldsymbol{\beta}(x) = ( \beta_1(x), \dots, \beta_{l+u}(x) )^T . \qquad (16)$$

Using the reproducing property of $\tilde{\mathcal{H}}$, we can compute $\langle k(z, \cdot), \tilde{k}(x, \cdot) \rangle_{\tilde{\mathcal{H}}}$ at $x$:

$$k(z, x) = k(x, z) + \boldsymbol{\beta}(x)^T \mathbf{g} + \mathbf{g}^T M \left( \mathbf{k}_x + K \boldsymbol{\beta}(x) \right) \qquad (17)$$

where $\mathbf{k}_x = ( k(x_1, x), \dots, k(x_{l+u}, x) )^T$ and $\mathbf{g}$ is the vector given by components $g_i = k(x_i, z)$. Therefore, since (17) holds for every $z$, it can be derived from (17) that

$$(I + MK) \, \boldsymbol{\beta}(x) = -M \mathbf{k}_x \qquad (18)$$

where $K$ is the kernel matrix, $K_{ij} = k(x_i, x_j)$, and $I$ is the $(l+u) \times (l+u)$ identity matrix. Finally, we obtain the following explicit form for $\tilde{k}$:

$$\tilde{k}(x, z) = k(x, z) - \mathbf{k}_x^T (I + MK)^{-1} M \, \mathbf{k}_z \qquad (19)$$

where $\mathbf{k}_z = ( k(x_1, z), \dots, k(x_{l+u}, z) )^T$. It satisfies Mercer’s conditions, being a valid kernel. If $M = 0$, the deformed kernel degenerates into the original kernel.
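Equation (19) translates directly into code. The sketch below is a minimal, illustrative implementation: the base kernel `k` (for instance, the RBF sketch earlier), the choice $M = (\gamma_I/\gamma_A) L$, and all names are assumptions consistent with the derivation above, not an interface from the paper:

```python
import numpy as np

def deformed_kernel(k, X_all, M):
    """Return k~(x, z) = k(x, z) - k_x^T (I + M K)^{-1} M k_z, i.e. Eq. (19).

    k     : base kernel, k(A, B) -> (len(A), len(B)) matrix
    X_all : (l + u, d) labeled + unlabeled samples defining the deformation
    M     : (l + u, l + u) symmetric PSD matrix; M = 0 recovers the original k
    """
    n = X_all.shape[0]
    K = k(X_all, X_all)                          # kernel matrix K_ij = k(x_i, x_j)
    G = np.linalg.solve(np.eye(n) + M @ K, M)    # (I + MK)^{-1} M, computed once

    def k_tilde(X, Z):
        Kx = k(X_all, X)                         # columns are the vectors k_x
        Kz = k(X_all, Z)                         # columns are the vectors k_z
        return k(X, Z) - Kx.T @ G @ Kz

    return k_tilde
```

Because $\tilde{k}$ is a valid Mercer kernel, `k_tilde` can be passed to any standard kernel machine as a precomputed kernel, which is how the manifold information enters problem (9).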

