Mathematics in Independent Component Analysis
74 Chapter 2. Neural Computation 16:1827-1850, 2004
A New Concept for Separability Problems in BSS 1845
Consider now the postnonlinear BSS model,
X = f(AS), (5.1)
where again S is an independent random vector, A ∈ Gl(n), and f is a diagonal
nonlinearity. We assume the components of f to be injective analytical
functions with invertible Jacobian at every point (locally diffeomorphic).
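As an illustration of model (5.1), the following minimal sketch computes X = f(AS) for a two-dimensional example. The matrix `A`, the componentwise functions in `f` (here sinh and y + y^3, both injective and analytic with nonvanishing derivative), the uniform source distribution, and the helper names `matvec` and `postnonlinear_mix` are assumptions chosen for the example; none of them come from the text.

```python
import math
import random


def matvec(A, s):
    """Multiply matrix A (a list of rows) by vector s."""
    return [sum(a * x for a, x in zip(row, s)) for row in A]


def postnonlinear_mix(A, s, f):
    """Return x = f(A s), with the diagonal nonlinearity f applied componentwise."""
    return [f_i(y_i) for f_i, y_i in zip(f, matvec(A, s))]


# Hypothetical example values (assumptions, not taken from the text).
A = [[1.0, 0.5], [-0.5, 1.0]]          # invertible, two nonzero entries per row
f = [math.sinh, lambda y: y + y ** 3]  # injective, analytic, derivative never zero

rng = random.Random(0)
# Independent source components: each coordinate is drawn separately.
sources = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(5)]
mixtures = [postnonlinear_mix(A, s, f) for s in sources]
```

Each entry of `mixtures` is one observation of X; only these observations would be available to a separation algorithm.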
Definition 4. An invertible matrix A ∈ Gl(n) is said to be mixing if A has at
least two nonzero entries in each row.
Note that if A is mixing, then A′, A^{-1}, and ALP for scaling matrix L and
permutation matrix P are also mixing.
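The row condition of Definition 4 is easy to check mechanically. The sketch below is one possible way to do so; the function name `is_mixing` and the tolerance parameter `tol` are assumptions for illustration, and invertibility of A is taken for granted rather than verified.

```python
def is_mixing(A, tol=0.0):
    """Check the row condition of Definition 4.

    A is a list of rows. Returns True if every row has at least two
    entries whose absolute value exceeds tol. Invertibility of A is
    assumed by the caller and is not checked here.
    """
    return all(sum(1 for a in row if abs(a) > tol) >= 2 for row in A)


print(is_mixing([[1, 1], [1, -1]]))  # True
print(is_mixing([[1, 0], [0, 1]]))   # False: the identity is not mixing
```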
Postnonlinear BSS is a generalization of linear BSS, so the indeterminacies
of postnonlinear ICA contain at least the indeterminacies of linear BSS:
A can be reconstructed only up to scaling and permutation. In the linear
case, affine linear transformation is ignored. Here, of course, additional
indeterminacies come into play because of translation: f_i can be recovered
only up to a constant. Also, if L ∈ Gl(n) is a scaling matrix, then
f(AS) = (f ◦ L)((L^{-1} A)S),
so f and A can interchange scaling factors in each component. Another obvious
indeterminacy could occur if A is not general enough. If, for example,
A = I, then f(S) is already independent, because independence is invariant
under diagonal nonlinear transformation; so f cannot be found in this case.
If we assume, however, that A is mixing, then we will show that except for
scaling interchange between f and A, no more indeterminacies than in the
affine linear case exist.
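The scaling interchange f(AS) = (f ◦ L)((L^{-1} A)S) can be checked numerically at a single point. The sketch below does this for a two-dimensional example; the values of `A`, `f`, the diagonal `scale` of L, and the test point `s` are assumptions made for the illustration.

```python
import math


def matvec(A, s):
    """Multiply matrix A (a list of rows) by vector s."""
    return [sum(a * x for a, x in zip(row, s)) for row in A]


# Hypothetical example values (assumptions, not taken from the text).
A = [[1.0, 0.5], [-0.5, 1.0]]
f = [math.sinh, lambda y: y + y ** 3]
scale = [2.0, -3.0]  # diagonal entries of the scaling matrix L

# L^{-1} A: row i of A divided by scale[i].
A_scaled = [[a / l for a in row] for row, l in zip(A, scale)]
# f o L: component i maps y to f_i(scale[i] * y).
f_scaled = [lambda y, fi=fi, l=l: fi(l * y) for fi, l in zip(f, scale)]

s = [0.3, -0.2]
lhs = [fi(y) for fi, y in zip(f, matvec(A, s))]                 # f(A s)
rhs = [gi(y) for gi, y in zip(f_scaled, matvec(A_scaled, s))]   # (f o L)((L^{-1} A) s)

same = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(lhs, rhs))
print(same)
```

Both pairs (f, A) and (f ◦ L, L^{-1} A) produce the same observations, which is why the observations alone cannot distinguish between them.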
Theorem 4 (separability of postnonlinear BSS). Let A, W ∈ Gl(n) be mixing,
h : R^n → R^n be a diagonal bijective function with analytical locally diffeomorphic
components, and S be an independent random vector with at most one
Gaussian component and existing covariance. If W(h(AS)) is independent, then
there exist a scaling matrix L ∈ Gl(n) and p ∈ R^n with LA ∼ W^{-1} and h ≡ L + p.
If analyticity of the components of h is not assumed, then h ≡ L + p can
only hold on {As | p_S(s) ≠ 0}.
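The sketch below illustrates only the easy converse of Theorem 4, not the theorem itself: if h = L + p is affine and W^{-1} = LA (a special case of LA ∼ W^{-1}, assuming ∼ denotes equality up to scaling and permutation as in the linear case), then W(h(As)) = s + Wp, so the sources are recovered up to a constant shift. All numerical values and helper names are assumptions for the example.

```python
import math


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def inv2(M):
    """Inverse of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical example values (assumptions, not taken from the text).
A = [[1.0, 0.5], [-0.5, 1.0]]
scale = [2.0, -3.0]  # diagonal entries of the scaling matrix L
p = [0.7, -1.1]      # translation vector

LA = [[l * a for a in row] for row, l in zip(A, scale)]
W = inv2(LA)  # choose W with W^{-1} = LA exactly


def h(y):
    """Affine diagonal map h = L + p, applied componentwise."""
    return [l * yi + pi for l, yi, pi in zip(scale, y, p)]


def separate(s):
    """Mixing-separating model W(h(A s))."""
    return matvec(W, h(matvec(A, s)))


offset = matvec(W, p)  # constant shift W p
s = [0.3, -0.2]
recovered = [u - o for u, o in zip(separate(s), offset)]
ok = all(math.isclose(r, t, abs_tol=1e-12) for r, t in zip(recovered, s))
print(ok)
```

The theorem asserts the harder direction: under its assumptions, independence of W(h(AS)) forces h and W to have this form.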
If f ◦ A is the mixing model, W ◦ g is the separating model. Putting the
two together, we get the above mixing-separating model W ◦ h ◦ A with h = g ◦ f. Since A has to be
assumed to be mixing, we can assume W to be mixing as well because the
inverse of a matrix that is mixing is again mixing. Furthermore, the mixing-separating
model is assumed to be bijective (hence, A and W invertible and
h bijective) because otherwise trivial solutions such as, for example, h ≡ c for a
constant c ∈ R, would also be solutions.