
Section 5 finally deals with separability of multidimensional ICA (group ICA).

2. Notation

Let K ∈ {ℝ, ℂ} be either the real or the complex numbers. For m, n ∈ ℕ let Mat(m × n; K) be the K-vector space of real, respectively complex, m × n matrices, and let Gl(n; K) := {W ∈ Mat(n × n; K) | det(W) ≠ 0} be the general linear group of Kⁿ. I ∈ Gl(n; K) denotes the unit matrix. For z ∈ ℂ we write Re(z) for its real and Im(z) for its imaginary part.

An invertible matrix L ∈ Gl(n; K) is said to be a scaling matrix if it is diagonal. We say two matrices B, C ∈ Mat(m × n; K) are (K-)equivalent, B ∼ C, if C can be written as C = BPL with a scaling matrix L ∈ Gl(n; K) and a permutation matrix P ∈ Gl(n; K), i.e. an invertible matrix with unit vectors in each row. Note that PL = L′P for some scaling matrix L′ ∈ Gl(n; K), so the order of the permutation and the scaling matrix does not play a role for equivalence. Furthermore, if B ∈ Gl(n; K) with B ∼ I, then also B⁻¹ ∼ I, and more generally, if BC ∼ A, then C ∼ B⁻¹A. So two matrices are equivalent if and only if they differ by right-multiplication by a matrix with exactly one non-zero entry in each row and each column. If K = ℝ, the two matrices are the same except for permutation, sign and scaling; if K = ℂ, they are the same except for permutation, sign, scaling and phase-shift.
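
As a concrete illustration (our addition, not part of the paper), the remark above gives a simple numerical test for equivalence of invertible matrices: B ∼ C holds if and only if B⁻¹C has exactly one non-zero entry in each row and each column. A minimal NumPy sketch, with illustrative function names:

```python
import numpy as np

def is_scaled_permutation(M, tol=1e-10):
    """True iff M has exactly one non-zero entry in each row and each column,
    i.e. M = P L for a permutation matrix P and a scaling matrix L."""
    mask = np.abs(M) > tol
    return bool((mask.sum(axis=0) == 1).all() and (mask.sum(axis=1) == 1).all())

def equivalent(B, C, tol=1e-10):
    """Test B ~ C for invertible B, C: C = B P L  iff  B^{-1} C = P L."""
    return is_scaled_permutation(np.linalg.solve(B, C), tol)

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
P = np.eye(3)[[2, 0, 1]]            # permutation matrix
L = np.diag([2.0, -0.5, 3.0])       # scaling matrix
print(equivalent(B, B @ P @ L))     # True
print(equivalent(B, B + 1.0))       # False (generically)
```

Up to numerical tolerance, this scaled-permutation ambiguity is exactly the indeterminacy left by linear ICA: the mixing matrix can only be recovered up to such an equivalence.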

3. A multivariate version of the Skitovitch–Darmois theorem

The original Skitovitch–Darmois theorem shows a non-trivial connection between Gaussian distributions and stochastic independence. More precisely, it states that if two linear combinations of non-Gaussian independent random variables are again independent, then each original random variable can appear in only one of the two linear combinations. It was proved independently by Darmois [6] and Skitovitch [14]; in a more accessible form, the proof can be found in [12]. Separability of linear BSS as shown by Comon [5] is a corollary of this theorem, although recently separability has also been shown without it [17].

Theorem 3.1 (Skitovitch–Darmois theorem). Let L₁ = ∑_{i=1}^n αᵢXᵢ and L₂ = ∑_{i=1}^n βᵢXᵢ with X₁, ..., Xₙ independent real random variables and αⱼ, βⱼ ∈ ℝ for j = 1, ..., n. If L₁ and L₂ are independent, then all Xⱼ with αⱼβⱼ ≠ 0 are Gaussian.

The converse is true if we assume that ∑_{j=1}^n αⱼβⱼ = 0: if all Xⱼ with αⱼβⱼ ≠ 0 are Gaussian and ∑_{j=1}^n αⱼβⱼ = 0, then L₁ and L₂ are independent. This follows because L₁ and L₂ are then uncorrelated, and since all variables common to both combinations are normal, uncorrelatedness already implies independence.
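
To make the dichotomy concrete, here is a small Monte Carlo sketch (our addition, not from the paper). With α = (1, 1) and β = (1, −1) we have ∑_j αⱼβⱼ = 0, so L₁ and L₂ are uncorrelated for unit-variance sources; for Gaussian sources they are fully independent, while for uniform sources a higher-order statistic such as the correlation of the squares exposes the remaining dependence:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
a = np.array([1.0, 1.0])    # alpha_j
b = np.array([1.0, -1.0])   # beta_j, with sum_j alpha_j * beta_j = 0

def corr(u, v):
    return np.corrcoef(u, v)[0, 1]

# Gaussian sources: L1 and L2 are uncorrelated and in fact independent.
Xg = rng.standard_normal((2, N))
L1, L2 = a @ Xg, b @ Xg
print(corr(L1, L2), corr(L1**2, L2**2))   # both ~ 0

# Uniform (non-Gaussian) sources: still uncorrelated, but the squares
# are clearly correlated, so L1 and L2 are not independent.
Xu = rng.uniform(-1.0, 1.0, size=(2, N))
L1, L2 = a @ Xu, b @ Xu
print(corr(L1, L2), corr(L1**2, L2**2))   # ~ 0 and clearly non-zero
```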

Theorem 3.2 (Multivariate S–D theorem). Let L₁ = ∑_{i=1}^n AᵢXᵢ and L₂ = ∑_{i=1}^n BᵢXᵢ with mutually independent k-dimensional random vectors Xⱼ and invertible matrices Aⱼ, Bⱼ ∈ Gl(k; ℝ) for j = 1, ..., n. If L₁ and L₂ are mutually independent, then all Xⱼ are Gaussian.

Here Gaussian (or jointly Gaussian) means that each component of the random vector is a Gaussian. Obviously, those Gaussians can have non-trivial correlations. This extension of Theorem 3.1 to random vectors was first noted by Skitovitch [15] and shown by Ghurye and Olkin [8]. Zinger gave a different proof for it in his Ph.D. thesis [18].
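
A quick simulation of the multivariate setting (ours, not the paper's): two independent, internally correlated jointly Gaussian 2-dimensional vectors are mixed so that the cross-covariance of L₁ and L₂ vanishes. L₁ and L₂ then come out independent even though the coordinates within each mixture remain correlated, illustrating that joint Gaussianity with non-trivial correlations is compatible with Theorem 3.2:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])        # non-trivial within-vector correlation
C = np.linalg.cholesky(Sigma)

# two independent jointly Gaussian vectors X1, X2 with covariance Sigma
X1 = C @ rng.standard_normal((2, N))
X2 = C @ rng.standard_normal((2, N))

# A1 = A2 = B1 = I and B2 = -I, so Cov(L1, L2) = Sigma - Sigma = 0
L1 = X1 + X2
L2 = X1 - X2

def corr(u, v):
    return np.corrcoef(u, v)[0, 1]

print(corr(L1[0], L2[0]), corr(L1[0]**2, L2[1]**2))  # ~ 0, ~ 0: independent
print(corr(L1[0], L1[1]))                            # ~ 0.5: correlated inside L1
```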

We need the following corollary:

Corollary 3.3. Let L₁ = ∑_{i=1}^n AᵢXᵢ and L₂ = ∑_{i=1}^n BᵢXᵢ with mutually independent k-dimensional random vectors Xⱼ and matrices Aⱼ, Bⱼ either zero or in Gl(k; ℝ) for j = 1, ..., n. If L₁ and L₂ are mutually independent, then all Xⱼ with AⱼBⱼ ≠ 0 are Gaussian.

Proof. We want to throw out all Xⱼ with AⱼBⱼ = 0; then Theorem 3.2 can be applied. Let j be given with AⱼBⱼ = 0. Without loss of generality assume that Bⱼ = 0. If also Aⱼ = 0, then we can simply leave out Xⱼ, since it appears in neither L₁ nor L₂. Assume Aⱼ ≠ 0. By assumption Xⱼ and X₁, ..., Xⱼ₋₁, Xⱼ₊₁, ..., Xₙ are mutually independent, and then so are Xⱼ and L₂, because Bⱼ = 0. Hence both −AⱼXⱼ and L₂ as well as L₁ and L₂ are mutually independent, so the two linear combinations L₁ − AⱼXⱼ and L₂ of the n − 1 variables X₁, ..., Xⱼ₋₁, Xⱼ₊₁, ..., Xₙ are also mutually independent. After successive application of this recursion we can assume that AⱼBⱼ ≠ 0, and hence Aⱼ, Bⱼ ∈ Gl(k; ℝ), for all j; Theorem 3.2 then yields the claim.
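
The recursion in this proof amounts to simple index bookkeeping; the following sketch (a hypothetical helper, not from the paper) computes which sources survive the reduction, after which Theorem 3.2 applies to the reduced mixtures:

```python
import numpy as np

def surviving_indices(A, B, tol=1e-12):
    """Indices j with A_j B_j != 0 (each matrix is zero or invertible).

    A source X_j with B_j = 0 is removed by replacing L1 with L1 - A_j X_j
    (symmetrically for A_j = 0); independence of the reduced mixtures is
    preserved, and Theorem 3.2 applies to the surviving sources."""
    return [j for j, (Aj, Bj) in enumerate(zip(A, B))
            if np.abs(Aj @ Bj).max() > tol]

# example: the second source enters L1 but not L2, so it is thrown out
A = [np.eye(2), 2.0 * np.eye(2), np.eye(2)]
B = [np.eye(2), np.zeros((2, 2)), -np.eye(2)]
print(surviving_indices(A, B))  # [0, 2]
```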
