1604 JOURNAL OF COMPUTERS, VOL. 8, NO. 6, JUNE 2013
The class a sample belongs to is that of the component with the maximum value.

$k(\cdot,\cdot)$: kernel function of the variables $x$ and $z$.
$K$: kernel matrix. Its columns are the coefficients of the kernel functions used to represent the discriminative function $f$.
$\|\cdot\|_{\mathcal{H}}$: norm in the Hilbert space $\mathcal{H}$.
$\langle\cdot,\cdot\rangle_{\mathcal{H}}$: inner product in the Hilbert space $\mathcal{H}$.
$\mathrm{tr}(M)$: the trace of the matrix $M$, that is, the sum of the diagonal elements of $M$.
$\mathrm{span}\{\cdot\}$: subspace spanned by the given functions.
In implementations of this kernel-based learning approach, the Radial Basis Function (RBF), or Gaussian, kernel is commonly used, and the kernel $k$ defines a unique RKHS. Since the Gaussian kernel is isotropic, it is spherically symmetric: in general, it does not conform to the particular geometry of the underlying classes. In other words, the underlying data structure is ignored, and as a result the kernel cannot provide an accurate decision surface. To address these limitations, it is crucial to define a new kernel that is well adapted to the geometry of the data distribution.
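To make the isotropy concrete, the following minimal sketch (function and variable names are hypothetical, not from the paper) computes a Gaussian Gram matrix; the kernel value depends only on the distance $\|x - z\|$, which is exactly the spherical symmetry discussed above:

```python
import numpy as np

def rbf_kernel(X, Z, sigma=1.0):
    """Gaussian (RBF) kernel matrix: k(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    # Squared Euclidean distances between every row of X and every row of Z.
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2 * X @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel(X, X, sigma=1.0)
# K depends only on pairwise distances: rotating all points about a common
# centre leaves K unchanged, regardless of the geometry of the classes.
```

Because the similarity contours are spheres around each sample, no direction in feature space is preferred, which is why the data-distribution geometry is not captured.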
Instead of solving (8) as in a traditional kernel-based learning approach, we modify (or deform) the original kernel in order to adapt it to the geometry of the underlying distribution. Defining a new deformed kernel $\tilde{k}$ with associated RKHS $\tilde{\mathcal{H}}$, the new problem to be solved becomes

$$f^* = \arg\min_{f \in \tilde{\mathcal{H}}} \frac{1}{l}\sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_{\tilde{\mathcal{H}}}^2, \qquad (9)$$

Problems (8) and (9) are solved with different kernels, and thus in different RKHSs. By the representer theorem, the solution of (9) is

$$f^*(x) = \sum_{i=1}^{l} \alpha_i\, \tilde{k}(x_i, x), \qquad (10)$$

which is appropriate for real settings.
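If the loss $V$ is taken to be the squared loss (an assumption; the section does not fix $V$), the coefficients of the representer-theorem expansion in (10) have a well-known closed form, $\alpha = (K + \gamma l I)^{-1} y$. A minimal sketch (names hypothetical):

```python
import numpy as np

def kernel_rls(K, y, gamma):
    """Regularized least squares in an RKHS with Gram matrix K.

    With squared loss V, the representer theorem gives f(x) = sum_i alpha_i k(x_i, x),
    and the coefficients solve the linear system (K + gamma * l * I) alpha = y.
    """
    l = len(y)
    return np.linalg.solve(K + gamma * l * np.eye(l), y)

# Toy usage with a linear kernel on 1-D data.
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 2.0])
K = X @ X.T
alpha = kernel_rls(K, y, gamma=1e-6)
f_train = K @ alpha  # predictions at the training points, close to y
```

The same solver applies unchanged once $K$ is replaced by the Gram matrix of the deformed kernel, since only the kernel (and hence the RKHS) differs between (8) and (9).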
To “deform” the original kernel and adapt it to the geometry of the underlying distribution, let $\mathcal{V}$ be a linear space with a positive semi-definite inner product, and let $S: \mathcal{H} \to \mathcal{V}$ be a bounded linear operator. Define $\tilde{\mathcal{H}}$ to be the space containing the same functions as $\mathcal{H}$, with its inner product defined as

$$\langle f, g \rangle_{\tilde{\mathcal{H}}} = \langle f, g \rangle_{\mathcal{H}} + \langle Sf, Sg \rangle_{\mathcal{V}}, \qquad (11)$$

It is proved in [27] that $\tilde{\mathcal{H}}$ is a valid RKHS. In this specific problem, it is required that $\mathcal{V}$ and $S$ depend on the data. Therefore, let $\mathcal{V}$ be $\mathbb{R}^n$, and define $S$ as the evaluation map $Sf = (f(x_1), \ldots, f(x_n))^T$. Using a symmetric positive semidefinite matrix $M$, the semi-norm on $\mathcal{V}$ can be written as $\|Sf\|_{\mathcal{V}}^2 = \mathbf{f}^T M \mathbf{f}$, with $\mathbf{f} = (f(x_1), \ldots, f(x_n))^T$. With such a norm, the regularization problem in (9) becomes

$$f^* = \arg\min_{f \in \mathcal{H}} \frac{1}{l}\sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_{\mathcal{H}}^2 + \gamma_A\, \mathbf{f}^T M \mathbf{f} \qquad (12)$$
where $\mathbf{f}$ includes evaluations at both labeled and unlabeled data and the matrix $M$ encodes smoothness w.r.t. the graph or manifold. Let

$$M = \frac{\gamma_I}{\gamma_A} L$$

and substitute it into (12); we have

$$f^* = \arg\min_{f \in \mathcal{H}} \frac{1}{l}\sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_{\mathcal{H}}^2 + \gamma_I\, \mathbf{f}^T L \mathbf{f} \qquad (13)$$

where $\gamma_I / \gamma_A$ is a free parameter that controls the “deformation” of the original kernel. Equation (13) is, in fact, a graph-based semi-supervised learning problem based on the manifold assumption; it can be set out indirectly using (12) and solved using (10).
To utilize the geometric information of the data distribution, a graph $G$ can be constructed using labeled and unlabeled pixels. The graph Laplacian of $G$ is an $n \times n$ matrix defined as $L = D - W$, where $W$ is the adjacency matrix. The elements $W_{ij}$ are measures of the similarity between pixels $x_i$ and $x_j$, and the diagonal matrix $D$ is given by $D_{ii} = \sum_{j=1}^{n} W_{ij}$. The graph Laplacian $L$ measures the variation of the function $f$ along the graph built upon all labeled and unlabeled samples. By fixing $\gamma_I = 0$, the original (undeformed) kernel is obtained.
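The Laplacian construction above can be sketched as follows, assuming a fully connected graph with Gaussian similarities (a common choice; a k-nearest-neighbour adjacency would work equally well, and all names are hypothetical):

```python
import numpy as np

def graph_laplacian(X, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W over labeled and unlabeled samples.

    W_ij is a Gaussian similarity between samples i and j, and
    D is the diagonal degree matrix with D_ii = sum_j W_ij.
    """
    d2 = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    W = np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)        # no self-loops
    D = np.diag(W.sum(axis=1))      # degree matrix
    return D - W

X = np.random.default_rng(0).normal(size=(6, 2))
L = graph_laplacian(X)
# Each row of L sums to zero, and f^T L f = 0.5 * sum_ij W_ij (f_i - f_j)^2 >= 0,
# which is exactly the "variation along the graph" penalized in (13).
```

The quadratic form $\mathbf{f}^T L \mathbf{f}$ is small precisely when $f$ changes little between similar (strongly connected) samples, which is how the manifold assumption enters the objective.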
Next, we discuss the computation of the deformed kernel $\tilde{k}$. In [27], the resulting new kernel was computed explicitly in terms of labeled and unlabeled data. It is proved that $\tilde{k}_x \in \mathcal{H}$ and

$$\mathrm{span}\{\tilde{k}_{x_1}, \ldots, \tilde{k}_{x_n}\} = \mathrm{span}\{k_{x_1}, \ldots, k_{x_n}\}. \qquad (14)$$

Thus, the two spans are the same and we have

$$\tilde{k}_x = k_x + \sum_{j=1}^{n} \beta_j(x)\, k_{x_j}, \qquad (15)$$

where the coefficients $\beta_j(x)$ depend on $x$; let

$$\boldsymbol{\beta}(x) = (\beta_1(x), \ldots, \beta_n(x))^T. \qquad (16)$$

Using the reproducing property of $\tilde{\mathcal{H}}$, we can compute $f(x)$ at $x$:

$$f(x) = \langle f, \tilde{k}_x \rangle_{\tilde{\mathcal{H}}} = f(x) + \mathbf{g}^T \boldsymbol{\beta}(x) + \mathbf{g}^T M \big(\mathbf{k}_x + K \boldsymbol{\beta}(x)\big), \qquad (17)$$

where $\mathbf{k}_x = (k(x_1, x), \ldots, k(x_n, x))^T$ and $\mathbf{g}$ is the vector given by the components $g_j = f(x_j)$. Since (17) must hold for every $f$, it can be derived from (17) that

$$(I + MK)\, \boldsymbol{\beta}(x) = -M \mathbf{k}_x, \qquad (18)$$

where $K$ is the kernel matrix with entries $K_{ij} = k(x_i, x_j)$ and $I$ is the $n \times n$ identity matrix. Finally, we obtain the following explicit form for $\tilde{k}$:

$$\tilde{k}(x, z) = k(x, z) - \mathbf{k}_x^T (I + MK)^{-1} M\, \mathbf{k}_z, \qquad (19)$$

where $\mathbf{k}_z = (k(x_1, z), \ldots, k(x_n, z))^T$. It satisfies Mercer’s conditions, being a valid kernel. If $\gamma_I = 0$, so that $M = 0$, the deformed kernel degenerates into the original kernel.
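Restricted to the $n$ graph points themselves, the vectors $\mathbf{k}_x$ in (19) are just the columns of $K$, so the whole deformed Gram matrix collapses to the single expression $\tilde{K} = K - K(I + MK)^{-1}MK$. A minimal sketch (names hypothetical):

```python
import numpy as np

def deformed_kernel_matrix(K, M):
    """Gram matrix of the deformed kernel, following the explicit form (19):

        k~(x, z) = k(x, z) - k_x^T (I + M K)^{-1} M k_z,

    evaluated at the n points defining K, where each k_x is a column of K,
    so that K~ = K - K (I + M K)^{-1} M K.
    """
    n = K.shape[0]
    return K - K @ np.linalg.solve(np.eye(n) + M @ K, M @ K)

# Toy usage: an RBF Gram matrix deformed by a Laplacian penalty M = (gamma_I/gamma_A) * L.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
d2 = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
K = np.exp(-np.maximum(d2, 0.0) / 2.0)
W = np.exp(-np.maximum(d2, 0.0) / 2.0)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W
K_tilde = deformed_kernel_matrix(K, 0.5 * L)
# With M = 0 the linear solve returns zero, so K~ = K: the deformation vanishes,
# matching the gamma_I = 0 degenerate case in the text.
```

Since $K$ and $M$ are symmetric positive semidefinite, $\tilde{K}$ remains symmetric positive semidefinite, consistent with $\tilde{k}$ satisfying Mercer’s conditions.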
© 2013 ACADEMY PUBLISHER