Mathematics in Independent Component Analysis


for NGCA, essentially using the idea of separated characteristic functions from the proof, was proposed in (Kawanabe and Theis, 2007).

Finally, in (Theis and Kawanabe, 2007), we presented a modification of NGCA that evaluates the time structure of the multivariate observations instead of their higher-order statistics. We differentiated the signal subspace from noise by searching for a subspace of non-trivially autocorrelated data. In contrast to blind source separation approaches, however, we did not require the existence of sources, so the model is applicable to any wide-sense stationary time series without restrictions. Moreover, since the method is based on second-order time structure, it can be implemented efficiently even in large dimensions, which we illustrated with an application to dimension reduction of functional MRI recordings.
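
As a rough illustration of the second-order idea, consider the following minimal sketch. It is not the algorithm of (Theis and Kawanabe, 2007); the function name, the single lag tau, and the fixed subspace dimension k are assumptions made here for illustration. The sketch whitens the data, symmetrizes one lagged autocovariance matrix, and keeps the eigendirections whose autocorrelations differ most from zero:

```python
import numpy as np

def autocorrelated_subspace(X, tau=1, k=2):
    """Sketch: extract a k-dimensional, non-trivially autocorrelated subspace.

    X has shape (dim, T). Assumes full-rank, wide-sense stationary data.
    """
    # center and whiten the observations
    X = X - X.mean(axis=1, keepdims=True)
    C0 = X @ X.T / X.shape[1]
    d, E = np.linalg.eigh(C0)
    W = E @ np.diag(np.maximum(d, 1e-12) ** -0.5) @ E.T  # whitening matrix
    Z = W @ X
    # symmetrized lag-tau autocovariance of the whitened data
    Ct = Z[:, :-tau] @ Z[:, tau:].T / (Z.shape[1] - tau)
    Ct = (Ct + Ct.T) / 2
    lam, V = np.linalg.eigh(Ct)
    # signal subspace: eigenvalues furthest from zero (white noise gives ~0)
    order = np.argsort(-np.abs(lam))
    B = V[:, order[:k]].T @ W  # projection from observation space
    return B @ X
```

Since the estimate requires only covariance matrices and one eigendecomposition per lag, the cost grows mildly with the dimension, which is what makes this kind of approach feasible for large recordings such as fMRI data.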

1.5.3 Clustering

Clustering methods are an important tool in high-dimensional explorative data mining. They aim at identifying samples or regions of similar characteristics, and often encode them by a single codebook vector or centroid. In this section, we review clustering algorithms and employ these methods to solve the blind matrix factorization problem (1.12) from above under various source assumptions.

Clustering for solving overcomplete BSS problems

In Theis et al. (2006), see chapter 17, we discussed the blind source separation problem (1.1) in the difficult case of overcomplete BSS, where fewer mixtures than sources are observed (m < n). We focused on the usually more elaborate matrix-recovery part. Assuming statistically independent sources with existing variance and at most one Gaussian component, it is well-known that A is determined uniquely by the mixtures x(t) (Eriksson and Koivunen, 2003). However, how to do this algorithmically is far from obvious, and although several algorithms have been proposed (Bofill and Zibulevsky, 2001; Lee et al., 1999; O'Grady and Pearlmutter, 2004), their performance is still limited.

The most commonly used overcomplete algorithms rely on sparse sources (after possible sparsification by preprocessing), which can be identified by clustering, usually by k-means or one of its extensions (Bofill and Zibulevsky, 2001; O'Grady and Pearlmutter, 2004). However, apart from the fact that theoretical justifications have not been found, mean-based clustering only identifies the correct A if the data density approaches a delta distribution. In figure 1.18, we illustrate this deficiency of mean-based clustering: we get an error of up to 5° per mixing angle, which is rather substantial considering the sparse density and the simple, complete case of m = n = 2. Moreover, the figure indicates that median-based clustering performs much better. Indeed, mean-based clustering does not possess any equivariance property, which would guarantee performance independent of the choice of A.
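
To make this bias tangible, the following toy experiment compares the two estimators. The angles, sample size, and the oracle assignment to the true columns are hypothetical simplifications chosen here to isolate the estimator bias; this is not the exact setup of figure 1.18:

```python
import numpy as np

rng = np.random.default_rng(1)

# two sparse (Laplacian) sources, complete toy case m = n = 2
S = rng.laplace(size=(2, 50000))
true_deg = np.array([10.0, 70.0])                 # assumed mixing angles
A = np.vstack([np.cos(np.deg2rad(true_deg)),
               np.sin(np.deg2rad(true_deg))])     # columns = mixing directions
X = A @ S

Y = X / np.linalg.norm(X, axis=0)                 # project onto the unit circle
labels = np.argmax(np.abs(A.T @ Y), axis=0)       # oracle cluster assignment

for i in range(2):
    Yi = Y[:, labels == i]
    flip = np.where(A[:, i] @ Yi >= 0, 1.0, -1.0) # identify antipodal points
    Yi = Yi * flip
    # mean-based (k-means-style) estimate of the mixing angle
    m_dir = Yi.mean(axis=1)
    mean_deg = np.rad2deg(np.arctan2(m_dir[1], m_dir[0]))
    # median-based estimate via wrapped angular deviations
    ang = np.rad2deg(np.arctan2(Yi[1], Yi[0]))
    delta = (ang - true_deg[i] + 90.0) % 180.0 - 90.0
    med_deg = true_deg[i] + np.median(delta)
    print(f"column {i}: true {true_deg[i]:.1f}, "
          f"mean {mean_deg:.1f}, median {med_deg:.1f} (degrees)")
```

Even with the assignment handed to it, the mean direction is dragged by the heavy, asymmetric tails of the assigned samples, while the angular median typically stays closer to the true mixing angle.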

We proposed a novel overcomplete, median-based clustering method in (Theis et al., 2006), and proved its equivariance and convergence. Simply put, we first pick 2n normalized starting vectors w_1, w'_1, ..., w_n, w'_n, and iterate the following steps until an appropriate stopping condition has been met: choose a sample x(t) ∈ R^m and normalize it, y(t) := π(x(t)) = x(t)/|x(t)|. Let
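
The excerpt of the iteration breaks off at this point. As a hypothetical completion, one can assume a winner-takes-all assignment among the 2n vectors followed by an online median update (a componentwise signum step, the standard stochastic approximation of a median); this assumption is made here for illustration and is not necessarily the update proved convergent in (Theis et al., 2006):

```python
import numpy as np

def median_based_clustering(X, n, sweeps=10, eta0=0.1, seed=0):
    """Sketch of overcomplete, median-based clustering on the unit sphere.

    X has shape (m, T); maintains 2n normalized codebook vectors
    w_1, w_1', ..., w_n, w_n' as the rows of W.
    """
    rng = np.random.default_rng(seed)
    m, T = X.shape
    W = rng.standard_normal((2 * n, m))
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # normalized starting vectors
    for sweep in range(sweeps):
        eta = eta0 / (1 + sweep)                   # decaying step size
        for t in rng.permutation(T):
            x = X[:, t]
            r = np.linalg.norm(x)
            if r == 0.0:
                continue
            y = x / r                              # y(t) := pi(x(t))
            i = int(np.argmax(W @ y))              # winner among the 2n vectors
            w = W[i] + eta * np.sign(y - W[i])     # assumed online median step
            W[i] = w / np.linalg.norm(w)           # project back onto the sphere
    return W

# pairs of approximately antipodal rows of the returned W then estimate the
# columns of A up to sign and scaling
```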
