
1.4. Sparseness

Sparse projection

Algorithmically, we followed Hoyer's approach and solved the sparse NMF problem by alternately updating $A$ and $S$ using gradient descent on the residual error $\|X - AS\|^2$. After each update, the columns of $A$ and the rows of $S$ are projected onto

$$M := \{s \mid \|s\|_1 = \sigma\} \cap \{s \mid \|s\|_2 = 1\} \cap \{s \geq 0\} \qquad (1.14)$$

in order to satisfy the sparseness conditions of (1.13). For this, points $x \in \mathbb{R}^n$ have to be projected onto adjacent points $p \in M$, where adjacency is defined by $\|x - p\|_2 \leq \|x - q\|_2$ for all $q \in M$ and denoted as $p \lhd x$.
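To make this concrete, the following is a minimal NumPy sketch of a Hoyer-style projection of a vector onto $M$, i.e. onto the set with prescribed 1-norm $\sigma$, unit 2-norm and nonnegative entries. The function and variable names are ours, and the sketch is illustrative rather than the exact code used in the experiments.

```python
import numpy as np

def hoyer_project(x, sigma, max_iter=100):
    """Sketch of a Hoyer-style projection of x onto
    M = {s : ||s||_1 = sigma, ||s||_2 = 1, s >= 0}."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s = x + (sigma - x.sum()) / n          # start on the hyperplane sum(s) = sigma
    zeroed = np.zeros(n, dtype=bool)       # entries already fixed at zero

    for _ in range(max_iter):
        # Center of the restricted hyperplane (zero on the fixed entries).
        m = np.where(zeroed, 0.0, sigma / (n - zeroed.sum()))
        w = s - m
        # Choose alpha >= 0 so that ||m + alpha * w||_2 = 1 (quadratic in alpha).
        a, b, c = np.dot(w, w), 2.0 * np.dot(m, w), np.dot(m, m) - 1.0
        disc = max(b * b - 4.0 * a * c, 0.0)
        alpha = 0.0 if a == 0.0 else (-b + np.sqrt(disc)) / (2.0 * a)
        s = m + alpha * w

        if np.all(s >= 0):                 # all constraints satisfied
            return s

        # Otherwise fix negative entries at zero and return to the hyperplane.
        zeroed |= s < 0
        s[zeroed] = 0.0
        free = ~zeroed
        s[free] -= (s.sum() - sigma) / free.sum()

    return s
```

For instance, `hoyer_project(np.random.rand(10), sigma=2.0)` returns a nonnegative vector with unit 2-norm whose entries sum to approximately 2; for the constraints to be compatible, the target 1-norm must lie between $1$ and $\sqrt{n}$.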

A priori it is not clear when such a $p$ exists and, more so, when it is unique; see figure 1.14. We answered this question by proving the following theorem:

Theorem 1.4.7 (Existence and uniqueness of the Euclidean projection).


(i) If $M$ is closed and nonempty, then for every $x \in \mathbb{R}^n$ there exists a $p \in M$ with $p \lhd x$.

(ii) If $\mathcal{X}(M) := \{x \in \mathbb{R}^n \mid \#\{p \in M \mid p \lhd x\} > 1\}$ denotes the exception or non-uniqueness set of $M$, then $\mathrm{vol}(\mathcal{X}(M)) = 0$.
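For instance, if $M = \{p_1, p_2\}$ consists of two distinct points, then $\mathcal{X}(M)$ is exactly the bisecting hyperplane $\{x \in \mathbb{R}^n \mid \|x - p_1\|_2 = \|x - p_2\|_2\}$, an $(n-1)$-dimensional set and hence of measure zero; this is the situation shown in figure 1.14(a).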

The above is obvious if $M$ is convex. However here, with $M$ from equation (1.14), this is not the case, and the above theorem is needed. We then denote the (almost everywhere unique) projection by $\pi_M(x) := p$. In addition, in (Theis et al., 2005c), we proved convergence of Hoyer's projection algorithm.

[Figure 1.14: Two exception (non-uniqueness) sets: (a) the exception set of two points; (b) the exception set of a sector.]
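The overall alternating scheme can then be sketched as follows: a gradient step on $\|X - AS\|^2$ for each factor, followed by a projection of the columns of $A$ and the rows of $S$. To keep the snippet self-contained, the projection below is deliberately simplified to nonnegativity plus unit 2-norm; in the full method the exact projection onto $M$ from (1.14) (e.g. the Hoyer-style sketch above) would be used, and the step size and initialization are illustrative choices of ours.

```python
import numpy as np

def project_simple(s):
    """Stand-in projection: clip to s >= 0 and rescale to unit 2-norm.
    The full method would instead project onto M from (1.14)."""
    s = np.maximum(s, 0.0)
    norm = np.linalg.norm(s)
    return s / norm if norm > 0 else s

def sparse_nmf(X, r, n_iter=500, lr=1e-3, seed=0):
    """Alternating projected gradient descent on ||X - A S||^2 (sketch)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = np.abs(rng.standard_normal((m, r)))
    S = np.abs(rng.standard_normal((r, n)))
    for _ in range(n_iter):
        R = A @ S - X
        A = A - lr * R @ S.T                             # gradient step in A
        A = np.apply_along_axis(project_simple, 0, A)    # project columns of A
        R = A @ S - X
        S = S - lr * A.T @ R                             # gradient step in S
        S = np.apply_along_axis(project_simple, 1, S)    # project rows of S
    return A, S
```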

Iterative projection onto spheres

In Theis and Tanaka (2006), see chapter 14, our goal was to generalize the notion of sparseness. After all, we naturally interpret sparseness of a signal $x(t)$ as $x(t)$ having many zero entries. This can be measured by the 0-pseudo-norm $\|\cdot\|_0$, and it is common to approximate it by $p$-norms for $p \to 0$. Hence replacing the 1-norm in (1.14) by some $p$-norm is desirable.
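The connection between small $p$ and the 0-pseudo-norm is the standard limit

$$\lim_{p \to 0} \|x\|_p^p = \lim_{p \to 0} \sum_{i=1}^n |x_i|^p = \#\{i \mid x_i \neq 0\} = \|x\|_0,$$

since $|x_i|^p \to 1$ for $x_i \neq 0$ while $|x_i|^p = 0$ for $x_i = 0$. For example, for $x = (3, 0, 0.5, 0)$ one gets $\|x\|_{0.5}^{0.5} \approx 2.44$ and $\|x\|_{0.1}^{0.1} \approx 2.05$, already close to $\|x\|_0 = 2$.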

A $p$-sparse NMF algorithm can then be readily derived. However, we observed that the sparse projection can no longer be solved in closed form, and little attention had been paid to finding projections in the case $p \neq 1$, which is particularly important for $p \to 0$ since smaller $p$ better approximates $\|\cdot\|_0$. Hence, our goal in (Theis and Tanaka, 2006) was to explore this more general notion of sparseness and to construct an algorithm that projects a vector onto its closest vector of a given sparseness. The resulting algorithm is a non-convex extension of the 'projection onto convex sets' (POCS) algorithm (Combettes, 1993; Youla and Webb, 1982).
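For orientation, the classical convex POCS iteration of Combettes (1993) and Youla and Webb (1982) simply alternates Euclidean projections onto the constraint sets. The NumPy sketch below uses two convex sets chosen purely for illustration (an affine hyperplane and the nonnegative orthant); Theis and Tanaka (2006) apply the same alternating idea to the non-convex sphere constraints introduced next.

```python
import numpy as np

def project_hyperplane(x, a, b):
    """Euclidean projection onto the affine hyperplane {x : <a, x> = b}."""
    return x - (np.dot(a, x) - b) / np.dot(a, a) * a

def project_orthant(x):
    """Euclidean projection onto the nonnegative orthant {x : x >= 0}."""
    return np.maximum(x, 0.0)

def pocs(x0, a, b, n_iter=200):
    """Classical POCS: alternate projections onto the two convex sets.
    With convex sets and nonempty intersection, the iterates converge
    to a point in the intersection."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = project_orthant(project_hyperplane(x, a, b))
    return x

# Example: a nonnegative vector whose entries sum to 1.
print(pocs(np.array([2.0, -1.0, 0.5]), a=np.ones(3), b=1.0))
```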

Let $S^{n-1}_p := \{x \in \mathbb{R}^n \mid \|x\|_p = 1\}$ denote the $(n-1)$-dimensional sphere with respect to the $p$-norm ($p > 0$). A scaled version of this unit sphere is given by $cS^{n-1}_p := \{x \in \mathbb{R}^n \mid \|x\|_p = c\}$.
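As a quick numerical illustration of these definitions (the helper names are ours), rescaling any nonzero $x$ by $c/\|x\|_p$ places it on $cS^{n-1}_p$; note, however, that this rescaling coincides with the closest-point (Euclidean) projection onto the sphere only for $p = 2$, which is precisely why an iterative scheme is needed for general $p$.

```python
import numpy as np

def p_norm(x, p):
    """||x||_p = (sum_i |x_i|^p)^(1/p) for p > 0 (a quasi-norm for p < 1)."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

def rescale_to_sphere(x, p, c=1.0):
    """Rescale x onto c * S^{n-1}_p, i.e. enforce ||x||_p = c.
    This is the Euclidean projection onto the sphere only for p = 2."""
    return c * x / p_norm(x, p)

x = np.array([1.0, -2.0, 0.5])
for p in (0.5, 1.0, 2.0):
    print(p, p_norm(rescale_to_sphere(x, p), p))   # each prints ~1.0
```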

