
JOURNAL OF COMPUTERS, VOL. 8, NO. 6, JUNE 2013

expensive computation when training large-scale data sets.

To improve the performance of the ELM, it is essential to use the information provided by both labeled and unlabeled samples. We construct a deformed kernel for the ELM, which is adapted to the geometry of the underlying distribution. Based on the deformed kernel, we propose a deformed kernel-based extreme learning machine (DKELM) to provide a unified solution for regression, binary, and multiclass classification (like ELM). To address large-scale data training, we approximate the deformed kernel by random feature mapping, so that the proposed DKELM requires no parameter tuning, has lower computational complexity, and offers a natural out-of-sample extension for novel examples. We demonstrate the relationship between the traditional kernel-based learning approach and ELM; our approach can also be adopted by other kernel-based methods, from which a sequence of fast learning algorithms can be derived.

The rest of this paper is organized as follows. Some previous works are introduced in Section II. The method of constructing and approximating the deformed kernel is discussed in Section III. In Section IV, we first demonstrate the relationship between the traditional kernel-based learning approach and ELM, and then propose the deformed kernel-based extreme learning machine. The experiments using benchmark real-world data sets are reported in Section V. Finally, we conclude this paper in Section VI.
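The random-feature idea mentioned above replaces an expensive N-by-N kernel matrix with an explicit low-dimensional map whose inner products approximate the kernel. As an illustration of the general technique only (a standard random Fourier feature map for a Gaussian kernel, not the authors' deformed kernel), a minimal sketch:

```python
import numpy as np

def random_fourier_features(X, n_features=500, gamma=1.0, rng=None):
    """Approximate the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)
    by an explicit map z(x), so that z(x) @ z(y) ~= k(x, y).
    Illustrative random Fourier features, not the paper's deformed kernel."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # Frequencies drawn from the kernel's spectral density N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: compare the exact Gaussian kernel with its random-feature estimate.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Z = random_fourier_features(X, n_features=20000, gamma=0.5, rng=1)
K_exact = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
K_approx = Z @ Z.T
print(np.abs(K_exact - K_approx).max())  # small approximation error
```

Because the map is explicit, a novel example x is handled by simply computing z(x), which is the kind of natural out-of-sample extension the text refers to.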

II. BRIEF OF THE EXTREME LEARNING MACHINE

The output function of ELM for generalized SLFNs in the case of one output node is

f_L(x) = \sum_{i=1}^{L} \beta_i h_i(x) = h(x)\beta, (1)

where \beta = [\beta_1, \ldots, \beta_L]^T is the vector of the weights between the hidden layer of L nodes and the output node, and h(x) = [h_1(x), \ldots, h_L(x)] is the output (row) vector of the hidden layer with respect to the input x. In fact, h(x) maps the data from the d-dimensional input space to the L-dimensional hidden-layer feature space (ELM feature space) H. Different from traditional learning algorithms [11], ELM is meant to minimize the training error as well as the norm of the output weights [7]:

Minimize: \|H\beta - T\|^2 and \|\beta\|, (2)

where H is the hidden-layer output matrix, which is denoted by

H = [h(x_1)^T, \ldots, h(x_N)^T]^T. (3)

As with SVM for the binary classification, minimizing the norm of the output weights actually maximizes the distance of the separating margins of the two different classes in the ELM feature space: 2/\|\beta\|. The norm controls the complexity of the function in the ambient space, which will be elaborated later.

If a feature mapping h(x) is unknown to users, the output function of the ELM classifier is

f(x) = [K(x, x_1), \ldots, K(x, x_N)] (I/C + \Omega_{ELM})^{-1} T, (4)

where \Omega_{ELM} = HH^T with (\Omega_{ELM})_{i,j} = h(x_i) \cdot h(x_j) = K(x_i, x_j), and K is a kernel function.

If a feature mapping h(x) is known, we have

f(x) = h(x) H^T (I/C + HH^T)^{-1} T, (5)

where h_i(x) = G(a_i, b_i, x) is a nonlinear piecewise continuous function satisfying the ELM universal approximation capability theorems [7],[30], and (a_i, b_i) are randomly generated according to any continuous probability distribution. The output function of the ELM classifier is

label(x) = sign(f(x)), (6)

or

label(x) = \arg\max_{i \in \{1, \ldots, m\}} f_i(x), (7)

where f(x) = [f_1(x), \ldots, f_m(x)]^T and m is the number of classes.
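The kernel form of the classifier described above reduces training to one linear solve. A minimal sketch under assumed names (the function names, the Gaussian kernel, and the toy data are illustrative, not from the paper):

```python
import numpy as np

def kernel_elm_fit(K_train, T, C=1.0):
    """Kernel-ELM output coefficients: solve (I/C + Omega) alpha = T,
    where Omega is the N x N kernel (hidden-layer Gram) matrix."""
    N = K_train.shape[0]
    return np.linalg.solve(np.eye(N) / C + K_train, T)

def kernel_elm_predict(K_test_train, alpha):
    """f(x) = [K(x, x_1), ..., K(x, x_N)] alpha; class = argmax component."""
    F = K_test_train @ alpha
    return F.argmax(axis=1)

# Tiny usage example: two well-separated Gaussian blobs, Gaussian kernel.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
T = np.eye(2)[y]                           # 0-1 label matrix, one 1 per row
sq = ((X[:, None] - X[None, :]) ** 2).sum(-1)
K = np.exp(-sq)                            # Omega_{ij} = K(x_i, x_j)
pred = kernel_elm_predict(K, kernel_elm_fit(K, T, C=10.0))
print((pred == y).mean())                  # training accuracy on the toy data
```

Note that only the single regularization parameter C appears: with the kernel form, the random hidden weights never need to be instantiated.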

III. DEFORMING THE KERNEL BY WARPING AN RKHS

For a Mercer kernel K: X \times X \to R, there is an associated RKHS H_K of functions X \to R with the corresponding norm \|\cdot\|_K. Given a set of l labeled examples \{(x_i, y_i)\}_{i=1}^{l} and a set of u unlabeled examples \{x_i\}_{i=l+1}^{l+u}, where x_i \in X and y_i \in R, the classical kernel-based learning approach is based on solving the regularization problem given by

f^* = \arg\min_{f \in H_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma \|f\|_K^2, (8)

where V is some loss function, such as the squared loss (y_i - f(x_i))^2 for RLS and the hinge loss function \max(0, 1 - y_i f(x_i)) for SVM; \|f\|_K^2 is the norm of the classification function in the reproducing kernel Hilbert space H_K, and \gamma controls the complexity of the function f. The Representer Theorem [27] states that a solution can be found in the form f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x). In order to avoid confusion, we list the main notations of this paper in Table I.

TABLE I. NOTATIONS

Notation            | Explanation
R^d                 | The input d-dimensional Euclidean space
X = [x_1, ..., x_{l+u}] | The training data matrix; x_1, ..., x_l are labeled points, and x_{l+1}, ..., x_{l+u} are unlabeled points
m                   | The number of classes that the samples belong to
Y                   | The 0-1 label matrix; y_i is the label vector of x_i, and all components of y_i are 0s except one being 1
f = [f_1, ..., f_m]^T | The discriminative vector function. The index of the class which x

© 2013 ACADEMY PUBLISHER
