Mathematics in Independent Component Analysis

Chapter 20. Signal Processing 86(3):603-623, 2006

2.2.1 Independent component analysis

Independent component analysis (ICA) describes the task of (here: linearly) transforming a given multivariate random vector such that its transform is stochastically independent. In our setting the random vector is given by N realizations, and ICA is applied to solve the BSS problem (1), where S is assumed to be independent. As in all BSS problems, a key issue lies in the question of identifiability of the model, and it can be shown that A (and hence S) is already uniquely determined by X, up to column permutation and scaling, if S contains at most one Gaussian component and is square integrable [26,27]. This enables us to apply ICA to the BSS problem and to recover the original sources.
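As a hedged illustration of the linear model just described, the following sketch builds a hypothetical toy instance of X = AS with two independent non-Gaussian sources (the mixing matrix and distributions are invented for this example, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy BSS instance X = A S: two independent, non-Gaussian sources
# (Laplacian and uniform), N realizations each.
N = 10_000
S = np.vstack([rng.laplace(size=N), rng.uniform(-1.0, 1.0, size=N)])
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])   # "unknown" mixing matrix (invertible)
X = A @ S                    # observed mixtures, shape (2, N)

# Identifiability: at most one source is Gaussian, so A is determined by X
# up to column permutation and scaling.
```

Since at most one of the two sources is Gaussian (here neither is), the identifiability result above applies to this instance.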

The idea of ICA was first expressed by Jutten and Hérault [28], while the term ICA was later coined by Comon [26]. In contrast to principal component analysis (PCA), ICA uses higher-order statistics to fully separate the data. Typical algorithms are based on contrasts such as minimum mutual information, maximum entropy, cumulant diagonalization or non-Gaussianity. For more details on ICA we refer to the two available excellent textbooks [1,2].
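One of the contrasts mentioned above, non-Gaussianity, is commonly measured by the excess kurtosis (the fourth-order cumulant of a standardized signal); the following sketch illustrates this standard measure, though the text does not commit to a specific contrast:

```python
import numpy as np

def excess_kurtosis(y):
    """Fourth-order cumulant kappa_4 = E[y^4] - 3 of a standardized
    signal; it vanishes for Gaussian data and is positive for
    super-Gaussian (heavy-tailed) data."""
    y = (y - y.mean()) / y.std()
    return np.mean(y**4) - 3.0

rng = np.random.default_rng(1)
g = rng.normal(size=100_000)    # Gaussian: excess kurtosis near 0
l = rng.laplace(size=100_000)   # super-Gaussian: excess kurtosis near 3
```

Maximizing such a non-Gaussianity measure over unmixing directions is the principle behind contrast-based ICA algorithms.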

In the following we will use the so-called JADE (joint approximate diagonalization of eigenmatrices) algorithm, which identifies the sources using the fact that, due to independence, the cross-cumulants of the sources vanish. Furthermore, fixing two indices of the fourth-order cumulants, it is easy to see that such cumulant matrices of the mixtures are diagonalized by A.
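The diagonalization property can be checked numerically. The sketch below (an illustration under the stated assumptions, not the JADE implementation used in the text) estimates one fourth-order cumulant matrix with two indices fixed and verifies that an orthogonal mixing matrix A diagonalizes it:

```python
import numpy as np

def cumulant_matrix(X, k, l):
    """Q[i, j] = cum(x_i, x_j, x_k, x_l): the 4th-order cumulant matrix
    obtained by fixing the last two indices to (k, l); X is d x N.
    For whitened mixtures X = A S with independent sources and
    orthogonal A, every such Q equals A D A^T with D diagonal."""
    X = X - X.mean(axis=1, keepdims=True)
    N = X.shape[1]
    R = X @ X.T / N                    # second moments E[x_i x_j]
    E4 = (X * X[k] * X[l]) @ X.T / N   # fourth moments E[x_i x_j x_k x_l]
    return (E4
            - R * R[k, l]
            - np.outer(R[:, k], R[:, l])
            - np.outer(R[:, l], R[:, k]))

# Check on synthetic data: rotate two independent unit-variance Laplacian
# sources by an orthogonal A; A^T Q A is then close to diagonal.
rng = np.random.default_rng(2)
S = rng.laplace(size=(2, 400_000))
S = (S - S.mean(axis=1, keepdims=True)) / S.std(axis=1, keepdims=True)
c, s = np.cos(0.7), np.sin(0.7)
A = np.array([[c, -s], [s, c]])
Q = cumulant_matrix(A @ S, 0, 0)
D = A.T @ Q @ A   # nearly diagonal; residual is finite-sample error
```

The diagonal entries of D carry the (scaled) source kurtoses, while the off-diagonal entries are only finite-sample noise, which is exactly why estimation error motivates the joint diagonalization discussed next.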

After pre-processing by PCA, we can assume that A is orthogonal. Then diagonalization of a single mixture cumulant matrix already yields A, provided that its eigenvalues are pairwise different. In practice, however, this is not always the case; furthermore, estimation errors could result in a bad estimate of the cumulant matrix and hence of A. Therefore, joint diagonalization of a whole set of cumulant matrices yields an improved estimate of A. Algorithms for actually performing joint diagonalization include gradient descent on the sum of squared off-diagonal terms, iterative construction of A by Givens rotations in two coordinates [29], an iterative two-step recovery of A [30] and, more recently, a linear least-squares algorithm for diagonalization [31]; the latter two algorithms can also search for non-orthogonal matrices A.
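The joint-diagonalization objective itself is simple to state: the sum of squared off-diagonal entries of the transformed matrices. The following minimal sketch (a grid search over a single Givens angle in the 2x2 orthogonal case, invented for illustration; real algorithms such as [29] use closed-form angle updates) shows the objective recovering a common diagonalizer:

```python
import numpy as np

def off_cost(V, mats):
    """Joint-diagonalization objective: sum of squared off-diagonal
    entries of V^T M V over all matrices M in the set."""
    total = 0.0
    for M in mats:
        D = V.T @ M @ V
        total += np.sum(D**2) - np.sum(np.diag(D)**2)
    return total

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Two matrices sharing one orthogonal diagonalizer A = rotation(0.6);
# minimizing off_cost over the Givens angle recovers that angle.
A = rotation(0.6)
mats = [A @ np.diag(d) @ A.T for d in ([3.0, -1.0], [1.0, 2.0])]
thetas = np.linspace(0.0, np.pi / 2, 2001)
best = min(thetas, key=lambda t: off_cost(rotation(t), mats))
```

With exact (noise-free) matrices the minimum is attained at the true angle; with estimated cumulant matrices, minimizing the same objective over the whole set averages out the estimation errors of any single matrix.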

One method we use for analyzing the s-EMG signals is JADE-based ICA. Confirming results from [32], we show that ICA can indeed extract the underlying sources. In the case of s-EMG signals, all sources are strongly super-Gaussian and can therefore safely be assumed to be non-Gaussian, so identifiability holds. However, due to the nonnegativity of A, the scaling indeterminacy is reduced to multiplication with a positive scalar in each column. If we additionally use the common assumption of unit variance of the sources, this

