
role in the optimal separating hyperplane in the subspace $\mathrm{span}(f_1, f_2) = \mathrm{span}(\bar{f}_1, \bar{f}_2)$. For simplicity of discussion, let us denote the updated feature set $F = \{f_1, f_2, \ldots, f_k\} \leftarrow \bar{F}$.
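
To see why the span is preserved, note that the pair update performed in the Scan subroutine of Algorithm 2 below is an orthogonal change of basis; a short check, assuming the normalization $a_j^2 + b_j^2 = 1$ suggested by the "normalized" separating hyperplane:

```latex
\begin{pmatrix} \bar{f}_1 \\ \bar{f}_j \end{pmatrix}
=
\underbrace{\begin{pmatrix} a_j & b_j \\ b_j & -a_j \end{pmatrix}}_{Q_j}
\begin{pmatrix} f_1 \\ f_j \end{pmatrix},
\qquad
Q_j^{\top} Q_j = (a_j^2 + b_j^2)\, I_2 = I_2,
```

so $\mathrm{span}(\bar{f}_1, \bar{f}_j) = \mathrm{span}(f_1, f_j)$ for every scanned pair, including the pair $(f_1, f_2)$ above.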

Next, we pick the feature pair $(f_1, f_3)$, find the optimal separating hyperplane based on the data set projected onto $\mathrm{span}(f_1, f_3)$, and update $(f_1, f_3)$ in a similar manner. We repeat this process until all the features have been scanned. The algorithm stops when no significant improvement is obtained during a scan or when a prescribed number of scans has been performed. The algorithm is summarized as Algorithm 2.

Algorithm 2 (top) The successive approach for maximal margin clustering and (bottom) the subroutine for scanning the ranked feature set

SuccessiveApproach$(V, F, \epsilon)$
Input: $V = \{v_i \in \mathbb{R}^d : i = 1, \ldots, n\}$, $F = \{f_i \in \mathbb{R}^n : i = 1, \ldots, d\}$ and a tolerance parameter $\epsilon$;
Output: Cluster label $T$;
begin
    Rank the features $F = \{f_i : i = 1, \ldots, d\}$
    Set $h' = 0$
    $\{F, h, T\} \leftarrow \mathrm{Scan}(V, F, 2)$
    $j \leftarrow 3$
    while $h - h' > \epsilon$ do
        $h' = h$
        $\{F, h, T\} \leftarrow \mathrm{Scan}(V, F, j)$
        $j \leftarrow j + 1$
    end while
    Output the cluster label $T$ found in the last iteration
end

Scan$(V, F, j)$
Input: $V = \{v_i \in \mathbb{R}^d : i = 1, \ldots, n\}$, $F = \{f_i \in \mathbb{R}^n : i = 1, \ldots, d\}$ and the index $j$ of the current feature;
Output: $F = \{f_i : i = 1, \ldots, d\}$, $h$, and the cluster label set $T$ due to $h$;
begin
    Compute the projected data set $\bar{V}$ of $V$ onto the subspace $\mathrm{span}(f_1, f_j)$
    For $\bar{V}$, find the normalized optimal separating hyperplane $a_j f_1 + b_j f_j + c_j = 0$
    Set $f'_1 = f_1$, $f'_j = f_j$
    Update $f_1 \leftarrow a_j f'_1 + b_j f'_j$
    Update $f_j \leftarrow b_j f'_1 - a_j f'_j$
    Compute the projected data set $\bar{V}$ based on the updated subspace $\mathrm{span}(f_1, f_j)$
    Find the optimal separating hyperplane based on $\bar{V}$
    Calculate the corresponding maximal margin $h$ and the cluster label set $T$
end
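
For concreteness, here is a minimal Python sketch of Algorithm 2. The 2-D solver `max_margin_line` is a hypothetical stand-in for the paper's single-pair maximal-margin-clustering subroutine, assumed to return the normalized line coefficients $(a, b, c)$, the margin $h$, and the labels $T$; the feature-ranking step is assumed to have been applied to the columns of `X` already.

```python
import numpy as np

def scan(X, j, max_margin_line):
    """One Scan(V, F, j) step; X is the n x d data matrix whose columns
    are the ranked features f_1, ..., f_d, rotated in place."""
    # Projection onto span(f_1, f_j): keep the two feature columns.
    a, b, _c, _h, _T = max_margin_line(X[:, [0, j]])
    f1, fj = X[:, 0].copy(), X[:, j].copy()
    # Rotate the pair; [[a, b], [b, -a]] is orthogonal when a^2 + b^2 = 1,
    # so span(f_1, f_j) is preserved.
    X[:, 0] = a * f1 + b * fj
    X[:, j] = b * f1 - a * fj
    # Re-solve on the updated pair to obtain the margin h and labels T.
    _a, _b, _c, h, T = max_margin_line(X[:, [0, j]])
    return h, T

def successive_approach(X, eps, max_margin_line):
    """SuccessiveApproach(V, F, eps): scan pairs (f_1, f_j) for
    j = 2, 3, ... until the margin improvement drops to eps or below."""
    d = X.shape[1]
    h_prev = 0.0
    h, T = scan(X, 1, max_margin_line)   # column index 1 holds f_2
    j = 2                                # column index 2 holds f_3
    while h - h_prev > eps and j < d:
        h_prev = h
        h, T = scan(X, j, max_margin_line)
        j += 1
    return T
```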

The complexity of the algorithm is the number of scans multiplied by the cost of a single scan, $O(dn^3)$. If we allow the algorithm to run only a few scans, the total complexity remains $O(dn^3)$. We call such a process with a fixed number of scans the truncated successive approach. Empirically, we found the truncated successive approach useful for extensions to the semi-supervised case in Sect. 4.2.
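
A sketch of the truncated variant, reusing `scan` and the hypothetical `max_margin_line` from the sketch above; the scan budget `n_scans` is an illustrative parameter, not taken from the paper:

```python
def truncated_successive(X, n_scans, max_margin_line):
    """Truncated successive approach: stop after a fixed number of pair
    scans, keeping the total cost at O(d * n^3)."""
    T = None
    for j in range(1, min(n_scans + 1, X.shape[1])):
        _h, T = scan(X, j, max_margin_line)  # scan() defined above
    return T
```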

4.1 Convergence of the successive approach

Let us analyze the convergence of the proposed successive approach. First, we introduce the following definition.

