Mathematics in Independent Component Analysis
Chapter 8. Proc. NIPS 2006
Theorem 1.7 (Uniqueness of NGSA). The mixing matrix A of a minimal decomposition is unique except for transformations in each of the two subspaces.
Moreover, explicit algorithms can be constructed for identifying the subspaces [3]. This result enables us to generalize theorem 1.2 and to obtain a general decomposition theorem, which characterizes the solutions of ISA.
Theorem 1.8 (Existence and Uniqueness of ISA). Given a random vector X with existing covariance, an ISA of X exists and is unique except for permutation of components of the same dimension and invertible transformations within each independent component and within the Gaussian part.

Proof. Existence is obvious. Uniqueness follows after first applying theorem 1.7 to X and then theorem 1.2 to the non-Gaussian part.
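Written out explicitly (our notation, not part of the original statement), the indeterminacy in theorem 1.8 says that two minimal ISA mixing matrices A and A′ of the same X are related by

```latex
A' = A \, L \, P ,
% L: block-diagonal, invertible within each independent component
%    and within the Gaussian part
% P: permutation matrix exchanging only components of equal dimension
```

so the recovered sources are determined only up to these within-block transformations and equal-size block permutations.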
2 Joint block diagonalization with unknown block-sizes
Joint diagonalization has become an important tool in ICA-based BSS (used for example in JADE) and in BSS relying on second-order temporal decorrelation. The task of (real) joint diagonalization (JD) of a set of symmetric real n × n matrices M := {M1, . . . , MK} is to find an orthogonal matrix E such that E⊤MkE is diagonal for all k = 1, . . . , K, i.e. to minimize

f(Ê) := ∑_{k=1}^{K} ‖Ê⊤MkÊ − diagM(Ê⊤MkÊ)‖²_F

with respect to the orthogonal matrix Ê, where diagM(M) produces a matrix in which all off-diagonal elements of M have been set to zero, and ‖M‖²_F := tr(MM⊤) denotes the squared Frobenius norm. The Frobenius norm is invariant under conjugation by an orthogonal matrix, so minimizing f is equivalent to maximizing

g(Ê) := ∑_{k=1}^{K} ‖diag(Ê⊤MkÊ)‖²,

where now diag(M) := (mii)i denotes the diagonal of M. For the actual minimization of f, respectively maximization of g, we will use the common approach of Jacobi-like optimization by iterative application of Givens rotations in two coordinates [5].
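As an illustration, the Jacobi-like scheme can be sketched in NumPy. This is a minimal sketch (function names are ours, not from [5]), using the standard closed-form Givens angle for symmetric matrices, as derived by Cardoso and Souloumiac:

```python
import numpy as np

def off_diag_norm(mats, E):
    """f(E): total squared off-diagonal mass of the matrices E^T M_k E."""
    total = 0.0
    for M in mats:
        D = E.T @ M @ E
        total += np.sum(D ** 2) - np.sum(np.diag(D) ** 2)
    return total

def joint_diagonalize(mats, sweeps=50, tol=1e-12):
    """Approximate JD of symmetric matrices by sweeps of Givens rotations.

    For each coordinate pair (i, j), the rotation angle minimizing the
    off-diagonal contribution has a closed form: (cos 2t, sin 2t) is the
    principal eigenvector of a 2x2 matrix built from the current entries.
    """
    n = mats[0].shape[0]
    E = np.eye(n)
    mats = [M.copy() for M in mats]
    for _ in range(sweeps):
        changed = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                # One row per matrix: [a_ii - a_jj, a_ij + a_ji].
                h = np.array([[M[i, i] - M[j, j], 2.0 * M[i, j]] for M in mats])
                G = h.T @ h
                # Principal eigenvector (x, y) of G, normalized with x >= 0,
                # equals (cos 2t, sin 2t) of the optimal angle t.
                _, V = np.linalg.eigh(G)
                x, y = V[:, -1]
                if x < 0:
                    x, y = -x, -y
                c = np.sqrt((x + 1.0) / 2.0)
                s = y / (2.0 * c)
                if abs(s) > tol:
                    changed = True
                    R = np.eye(n)
                    R[i, i] = R[j, j] = c
                    R[i, j], R[j, i] = -s, s
                    mats = [R.T @ M @ R for M in mats]
                    E = E @ R
        if not changed:  # all pair rotations negligible: converged
            break
    return E
```

On an exactly jointly diagonalizable set (Mk = Q Dk Q⊤ for a common orthogonal Q), the returned E drives f to machine precision within a few sweeps.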
2.1 Generalization to blocks<br />
In the following we will use a generalization of JD in order to solve ISA problems. Instead of fully diagonalizing all n × n matrices Mk ∈ M, in joint block diagonalization (JBD) of M we want to determine E such that E⊤MkE is block-diagonal. Depending on the application, we fix the block-structure in advance or try to determine it from M. We are not interested in the order of the blocks, so the block-structure is uniquely specified by fixing a partition of n, i.e. a way of writing n as a sum of positive integers, where the order of the addends is not significant. So let² n = m1 + . . . + mr with m1 ≤ m2 ≤ . . . ≤ mr and set m := (m1, . . . , mr) ∈ N^r. An n × n matrix is said to be m-block diagonal if it is of the form
⎛ D1  · · ·  0 ⎞
⎜ ⋮    ⋱    ⋮ ⎟
⎝ 0   · · ·  Dr ⎠

with arbitrary mi × mi matrices Di.
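For concreteness, the m-block diagonal structure and the corresponding projection can be sketched in NumPy as follows (helper names are ours):

```python
import numpy as np

def block_mask(m):
    """Boolean mask of the m-block diagonal pattern; n = sum(m)."""
    n = sum(m)
    mask = np.zeros((n, n), dtype=bool)
    start = 0
    for size in m:
        mask[start:start + size, start:start + size] = True
        start += size
    return mask

def diagM_m(M, m):
    """Project M onto m-block diagonal form: zero all off-block entries."""
    return np.where(block_mask(m), M, 0.0)

def is_m_block_diagonal(M, m, tol=1e-12):
    """True if all entries of M outside the m-blocks vanish (up to tol)."""
    return bool(np.all(np.abs(np.where(block_mask(m), 0.0, M)) <= tol))
```

For example, with m = (1, 2) the mask allows the (0, 0) entry and the trailing 2 × 2 block, and diagM_m discards everything else.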
As a generalization of JD in the case of known block-structure, we can formulate the joint m-block diagonalization (m-JBD) problem as the minimization of

fm(Ê) := ∑_{k=1}^{K} ‖Ê⊤MkÊ − diagM^m(Ê⊤MkÊ)‖²_F

with respect to the orthogonal matrix Ê, where diagM^m(M) produces an m-block diagonal matrix by setting all other elements of M to zero. In practice, due to estimation errors such an E will not exist, so we speak of approximate JBD and imply minimizing some error measure of non-block-diagonality. Indeterminacies of any m-JBD are m-scaling, i.e. multiplication by an m-block diagonal matrix from the right, and m-permutation, defined by a permutation matrix that only swaps blocks of the same size.
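The cost fm and the m-permutation indeterminacy can be illustrated directly (again a NumPy sketch with our own names): conjugating by an orthogonal E that block-diagonalizes all Mk drives fm to zero, and swapping two equal-size blocks leaves it at zero.

```python
import numpy as np

def f_m(mats, E, m):
    """m-JBD cost: sum_k of the squared off-block mass of E^T M_k E."""
    n = sum(m)
    mask = np.zeros((n, n), dtype=bool)  # m-block diagonal pattern
    start = 0
    for size in m:
        mask[start:start + size, start:start + size] = True
        start += size
    total = 0.0
    for M in mats:
        D = E.T @ M @ E
        total += np.sum(np.where(mask, 0.0, D) ** 2)  # off-block energy only
    return total
```

A quick check of the indeterminacy: build symmetric matrices that are exactly m-block diagonal in a common orthogonal basis E0 for m = (1, 1, 2); then fm vanishes at E0 and also at E0·P for any permutation P exchanging the two 1-blocks.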
Finally, we speak of general JBD if we search for a JBD but no block-structure is given; instead it is to be determined from the matrix set. For this it is necessary to require a block
² We do not use the convention from Ferrers graphs of specifying partitions in decreasing order, as a visualization of increasing block-sizes seems to be preferable in our setting.