
Mathematics in Independent Component Analysis


Chapter 8. Proc. NIPS 2006

Theorem 1.7 (Uniqueness of NGSA). The mixing matrix A of a minimal decomposition is unique except for transformations in each of the two subspaces.

Moreover, explicit algorithms can be constructed for identifying the subspaces [3]. This result enables us to generalize theorem 1.2 and to obtain a general decomposition theorem, which characterizes the solutions of ISA.

Theorem 1.8 (Existence and Uniqueness of ISA). Given a random vector X with existing covariance, an ISA of X exists and is unique except for permutation of components of the same dimension and invertible transformations within each independent component and within the Gaussian part.

Proof. Existence is obvious. Uniqueness follows after first applying theorem 1.7 to X and then theorem 1.2 to the non-Gaussian part.

2 Joint block diagonalization with unknown block-sizes

Joint diagonalization has become an important tool in ICA-based BSS (used for example in JADE) or in BSS relying on second-order temporal decorrelation. The task of (real) joint diagonalization (JD) of a set of symmetric real n × n matrices M := {M_1, ..., M_K} is to find an orthogonal matrix E such that E^⊤ M_k E is diagonal for all k = 1, ..., K, i.e. to minimize

f(\hat{E}) := \sum_{k=1}^{K} \left\| \hat{E}^\top M_k \hat{E} - \operatorname{diagM}(\hat{E}^\top M_k \hat{E}) \right\|_F^2

with respect to the orthogonal matrix Ê, where diagM(M) produces a matrix in which all off-diagonal elements of M have been set to zero, and ‖M‖_F^2 := tr(MM^⊤) denotes the squared Frobenius norm. The Frobenius norm is invariant under conjugation by an orthogonal matrix, so minimizing f is equivalent to maximizing

g(\hat{E}) := \sum_{k=1}^{K} \left\| \operatorname{diag}(\hat{E}^\top M_k \hat{E}) \right\|^2,

where now diag(M) := (m_ii)_i denotes the diagonal of M. For the actual minimization of f, respectively maximization of g, we will use the common approach of Jacobi-like optimization by iterative application of Givens rotations in two coordinates [5].
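As a concrete sketch (our own illustration, not from the paper), the objectives f and g can be written in a few lines of NumPy. The orthogonal invariance of the Frobenius norm mentioned above shows up numerically as f(Ê) + g(Ê) = Σ_k ‖M_k‖_F^2 for every orthogonal Ê:

```python
import numpy as np

def jd_off_cost(E, Ms):
    """f(E): summed squared off-diagonal Frobenius norm of E^T M_k E."""
    total = 0.0
    for M in Ms:
        T = E.T @ M @ E
        total += np.sum(T**2) - np.sum(np.diag(T)**2)
    return total

def jd_diag_gain(E, Ms):
    """g(E): summed squared norm of the diagonal of E^T M_k E."""
    return sum(np.sum(np.diag(E.T @ M @ E)**2) for M in Ms)

# Frobenius norm is invariant under orthogonal conjugation, so
# f(E) + g(E) equals sum_k ||M_k||_F^2 for any orthogonal E.
rng = np.random.default_rng(0)
Ms = [(lambda A: A + A.T)(rng.standard_normal((4, 4))) for _ in range(3)]
E, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal matrix
lhs = jd_off_cost(E, Ms) + jd_diag_gain(E, Ms)
rhs = sum(np.sum(M**2) for M in Ms)
assert np.isclose(lhs, rhs)
```

A Jacobi-style solver would sweep over coordinate pairs, choosing each Givens rotation angle to decrease f; the invariance above is why decreasing f and increasing g are the same thing.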

2.1 Generalization to blocks<br />

In the following we will use a generalization of JD in order to solve ISA problems. Instead of fully diagonalizing all n × n matrices M_k ∈ M, in joint block diagonalization (JBD) of M we want to determine E such that E^⊤ M_k E is block-diagonal. Depending on the application, we fix the block structure in advance or try to determine it from M. We are not interested in the order of the blocks, so the block structure is uniquely specified by fixing a partition of n, i.e. a way of writing n as a sum of positive integers, where the order of the addends is not significant. So let² n = m_1 + ... + m_r with m_1 ≤ m_2 ≤ ... ≤ m_r and set m := (m_1, ..., m_r) ∈ N^r. An n × n matrix is said to be m-block diagonal if it is of the form

\begin{pmatrix} D_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & D_r \end{pmatrix}

with arbitrary m_i × m_i matrices D_i.
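For illustration (a sketch of our own, not part of the paper), such an m-block diagonal matrix can be assembled directly from its diagonal blocks:

```python
import numpy as np

def m_block_diag(blocks):
    """Place square blocks D_1, ..., D_r on the diagonal, zeros elsewhere."""
    sizes = [B.shape[0] for B in blocks]
    n = sum(sizes)
    M = np.zeros((n, n))
    start = 0
    for B, s in zip(blocks, sizes):
        M[start:start + s, start:start + s] = B
        start += s
    return M

# Example with the partition m = (1, 2) of n = 3:
M = m_block_diag([np.array([[5.0]]), np.array([[1.0, 2.0], [3.0, 4.0]])])
```

The partition (1, 2) here follows the paper's convention of increasing block sizes.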

As a generalization of JD to the case of known block structure, we can formulate the joint m-block diagonalization (m-JBD) problem as the minimization of

f^m(\hat{E}) := \sum_{k=1}^{K} \left\| \hat{E}^\top M_k \hat{E} - \operatorname{diagM}^m(\hat{E}^\top M_k \hat{E}) \right\|_F^2

with respect to the orthogonal matrix Ê, where diagM^m(M) produces an m-block diagonal matrix by setting all other elements of M to zero. In practice, due to estimation errors, such an E will not exist, so we speak of approximate JBD and imply minimizing some error measure of non-block-diagonality. Indeterminacies of any m-JBD are m-scaling, i.e. multiplication by an m-block diagonal matrix from the right, and m-permutation, defined by a permutation matrix that only swaps blocks of the same size.
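A minimal sketch of the m-JBD objective (our own illustration; `block_mask` is a hypothetical helper, not from the paper), using a boolean mask for the m-block diagonal pattern:

```python
import numpy as np

def block_mask(m, n):
    """Boolean mask that is True exactly on the m-block diagonal pattern."""
    mask = np.zeros((n, n), dtype=bool)
    start = 0
    for size in m:
        mask[start:start + size, start:start + size] = True
        start += size
    return mask

def jbd_cost(E, Ms, m):
    """f^m(E): squared Frobenius norm of the off-block-diagonal part of
    E^T M_k E, summed over k; zero exactly when every E^T M_k E is
    m-block diagonal."""
    off = ~block_mask(m, E.shape[0])
    return sum(np.sum((E.T @ M @ E)[off]**2) for M in Ms)

# A matrix that is already (1, 2)-block diagonal has zero cost for E = I:
M0 = np.array([[2.0, 0.0, 0.0],
               [0.0, 1.0, 3.0],
               [0.0, 3.0, 4.0]])
```

Note that the m-scaling and m-permutation indeterminacies above leave this cost unchanged, since they only mix entries within the masked blocks or swap equal-sized blocks.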

Finally, we speak of general JBD if we search for a JBD but no block structure is given; instead it is to be determined from the matrix set. For this it is necessary to require a block

² We do not use the convention from Ferrers graphs of specifying partitions in decreasing order, as a visualization of increasing block-sizes seems to be preferable in our setting.
