Mathematics in Independent Component Analysis

168 Chapter 11. EURASIP JASP, 2007

Fabian J. Theis et al. 7

hyperplanes will still be found if the grid resolution is high enough. Increasing the grid resolution (in polynomial time) also yields higher accuracy for larger source dimensions n. The memory requirement of the algorithm is dominated by the accumulator size, which is β^(m−1); this can limit the grid resolution.

4.3. Resolution error

The choice of the grid resolution β in the algorithm induces a systematic resolution error in the estimation of A (as a trade-off for robustness and speed). This error is calculated in this section.

Let A be the unknown mixing matrix and Â its estimate, constructed by the Hough SCA algorithm (Algorithm 4) with grid resolution β. Let n^(1), …, n^(d) be the normal vectors of the hyperplanes generated by (m − 1)-tuples of columns of A, and let n̂^(1), …, n̂^(d) be their corresponding estimates. Ignoring permutations, it is sufficient to describe how n̂^(i) differs from n^(i).

Assume that the maxima of the accumulator are correctly estimated, but that, due to the discrete grid resolution, an average error of π/(2β) is made when estimating the precise maximum position, because the size of one bin is π/β. How is this error propagated into n̂^(i)? By assumption, each estimate φ̂, θ̂_1, …, θ̂_{m−2} differs from φ, θ_1, …, θ_{m−2} by at most π/(2β). As we are only interested in an upper bound, we simply calculate the deviation of each component of n̂^(i) from n^(i). Using the fact that sine and cosine are bounded by one, (8) then gives the estimates |n̂_j^(i) − n_j^(i)| ≤ (m − 1)π/(2β) for each coordinate j, so altogether

    ‖n̂^(i) − n^(i)‖ ≤ (m − 1)√m π/(2β).    (12)

This estimate may be improved by using the Jacobian of the spherical coordinate transformation and its determinant, but for our purpose this bound is sufficient. In summary, we have shown that the grid resolution contributes a β^(−1)-perturbation to the estimation of A.
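The scaling of the bound (12) can be checked numerically. The sketch below assumes one common spherical-coordinate convention (which may differ in detail from (8)) and verifies that random per-angle quantization errors of at most π/(2β) never push the normal's deviation past the bound:

```python
import math
import random

def angles_to_normal(angles):
    """Unit normal in R^m from m-1 spherical angles (one common convention,
    possibly differing in detail from (8) in the text)."""
    n, s = [], 1.0
    for a in angles:
        n.append(s * math.cos(a))
        s *= math.sin(a)
    n.append(s)
    return n

def resolution_bound(m, beta):
    """Right-hand side of (12): (m - 1) * sqrt(m) * pi / (2 * beta)."""
    return (m - 1) * math.sqrt(m) * math.pi / (2 * beta)

random.seed(0)
m, beta = 4, 100
eps = math.pi / (2 * beta)    # maximal per-angle quantization error
worst = 0.0
for _ in range(1000):
    ang = [random.uniform(0.0, math.pi) for _ in range(m - 1)]
    est = [a + random.uniform(-eps, eps) for a in ang]   # bin-center estimate
    n, nh = angles_to_normal(ang), angles_to_normal(est)
    worst = max(worst, math.dist(n, nh))
print(worst <= resolution_bound(m, beta))  # True: the bound holds in every trial
```

Since each component of the normal is a product of sines and cosines of at most m − 1 angles, each with derivative bounded by one, the per-component error is at most (m − 1)·π/(2β), and the observed deviation stays well inside the bound.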

4.4. Robustness

Robustness with regard to additive noise as well as outliers is important for any algorithm to be used in the real world. Here an outlier is roughly defined to be a sample far away from the other observations; indeed, some researchers define outliers to be samples farther away from the mean than, say, 5 standard deviations. However, such definitions necessarily depend on the underlying random variable to be estimated, so most books only give examples of outliers, and indeed no consistent, context-free, precise definition of outliers exists [25]. In the following, given samples of a fixed random variable of interest, we call a sample an outlier if it is drawn from another, sufficiently different distribution.

Fitting a single hyperplane to the data set can be achieved by linear regression, namely by minimizing the squared distance to a candidate hyperplane. Such least-squares fitting algorithms are well known to be sensitive to outliers, and various extensions of the LS method, such as least median of squares and reweighted least squares [26], have been developed to overcome this problem. The breakdown point of the latter is 0.5, which means that the fit parameters are only stably estimated for data sets with fewer than 50% outliers. The other techniques typically have much lower breakdown points, usually below 0.3. The classical Hough transform, albeit not a regression method, is comparable in terms of breakdown with robust fitting algorithms such as the reweighted least-squares algorithm [27]. In the experiments we will observe similar results for the generalized method presented above: we achieve breakdown levels of up to 0.8 in the low-noise case, which decrease considerably with increasing noise.
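The breakdown behaviour discussed above can be illustrated with a toy regression experiment. The sketch below is not the paper's method; it contrasts ordinary least squares (breakdown point 0) with the Theil-Sen estimator, a robust fit in the same spirit as the cited reweighted schemes, with breakdown point of roughly 0.29:

```python
import random
from statistics import median

def least_squares_line(pts):
    """Ordinary least-squares fit y = a*x + b; a single gross outlier can ruin it."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    a = (sum((x - mx) * (y - my) for x, y in pts)
         / sum((x - mx) ** 2 for x, _ in pts))
    return a, my - a * mx

def theil_sen_line(pts):
    """Theil-Sen fit: median of pairwise slopes, breakdown point ~0.29."""
    slopes = [(y2 - y1) / (x2 - x1)
              for i, (x1, y1) in enumerate(pts)
              for (x2, y2) in pts[i + 1:] if x2 != x1]
    a = median(slopes)
    return a, median(y - a * x for x, y in pts)

random.seed(0)
true_a, true_b = 2.0, 1.0
pts = [(x, true_a * x + true_b + random.gauss(0, 0.1)) for x in range(50)]
for i in random.sample(range(50), 10):        # replace 20% by gross outliers
    pts[i] = (pts[i][0], 100.0)

ls = least_squares_line(pts)
ts = theil_sen_line(pts)
print("LS:", ls)   # intercept dragged far away from the true (2, 1)
print("TS:", ts)   # stays close to (2, 1)
```

With 20% outliers the median of pairwise slopes is still dominated by inlier-inlier pairs (0.8² = 64% of all pairs), which is why the Theil-Sen fit survives while least squares does not.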

From a mathematical point of view, the "classical" Hough transform has been studied quite extensively, both as an estimator (an extension of linear regression) and with regard to algorithmic and implementational aspects; see, for example, [28] and the references therein. Most of the theoretical results presented for the two-dimensional case could be extended to the more general setting presented here, but this is not within the scope of this manuscript. Simulations giving experimental evidence that the robustness also holds in our case are shown in Section 5.

4.5. Extensions

The following possible extensions of the Hough SCA algorithm can be employed to increase its performance.

If the noise level is known, smoothing the accumulator (antialiasing) helps to give results that are more robust to noise. For smoothing (usually with a Gaussian kernel), the smoothing radius must be set according to the noise level. If the noise level is not known, smoothing can still be applied by gradually increasing the radius until the number of clearly detectable maxima equals d.
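A minimal sketch of this radius-search heuristic on a toy one-dimensional accumulator (the real Hough accumulator is (m − 1)-dimensional; the kernel, peak shapes, and detection threshold here are illustrative assumptions):

```python
import math
import random

def gaussian_smooth(acc, sigma):
    """Circularly smooth a 1-D accumulator with a truncated Gaussian kernel."""
    r = max(1, int(3 * sigma))
    kernel = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-r, r + 1)]
    total = sum(kernel)
    n = len(acc)
    return [sum(kernel[j + r] / total * acc[(i + j) % n]
                for j in range(-r, r + 1))
            for i in range(n)]

def count_maxima(acc, frac=0.5):
    """Count local maxima exceeding frac * max(acc), with circular neighbors."""
    t, n = frac * max(acc), len(acc)
    return sum(1 for i in range(n)
               if acc[i] > t and acc[i] > acc[i - 1] and acc[i] >= acc[(i + 1) % n])

# toy accumulator: d = 3 hyperplane peaks on a uniform noise floor
random.seed(1)
d = 3
acc = [random.random() for _ in range(360)]
for p in (40, 150, 290):
    for off in range(-6, 7):
        acc[(p + off) % 360] += 8 * math.exp(-0.5 * (off / 2.0) ** 2)

# gradually increase the smoothing radius until exactly d clear maxima remain
for sigma in (0.5, 1.0, 2.0, 4.0, 8.0):
    if count_maxima(gaussian_smooth(acc, sigma)) == d:
        break
print(sigma)
```

The loop stops at the first radius for which the smoothed accumulator shows exactly d dominant maxima, mimicking the unknown-noise-level case described above.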

Furthermore, an additional fine-tuning step is possible: the estimated hyperplane normals are slightly degraded by the systematic resolution error shown previously. After application of Hough SCA, however, the data space can be clustered into sets of data points lying close to their corresponding hyperplanes. Within each cluster, linear regression (or some more robust version of it; see Section 4.4) can then be applied to improve the hyperplane estimate; this is in fact the idea used locally in the k-hyperplane clustering algorithm (Algorithm 3). Such a method requires additional computational power, but it makes the algorithm less dependent on the grid resolution, which is then needed only for the hyperplane clustering step. However, it is expected that this additional fine-tuning step may decrease robustness, especially against biased noise and outliers.
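A two-dimensional sketch of the fine-tuning idea, under simplifying assumptions (a single cluster of points near one hyperplane through the origin; the refit is a plain total-least-squares step via the smallest eigenvector of the cluster's scatter matrix, not one of the robust variants of Section 4.4):

```python
import math
import random

def smallest_eigvec_2x2(a, b, c):
    """Unit eigenvector of [[a, b], [b, c]] for its smallest eigenvalue."""
    lam = 0.5 * (a + c) - math.sqrt((0.5 * (a - c)) ** 2 + b ** 2)
    vx, vy = (b, lam - a) if abs(b) > 1e-12 else ((1.0, 0.0) if a <= c else (0.0, 1.0))
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

random.seed(0)
alpha = 1.13                                   # true normal angle (illustrative)
n_true = (math.cos(alpha), math.sin(alpha))
d_true = (-math.sin(alpha), math.cos(alpha))   # line direction (the "hyperplane" in R^2)
pts = [(t * d_true[0] + random.gauss(0, 0.01),
        t * d_true[1] + random.gauss(0, 0.01))
       for t in (random.uniform(-1, 1) for _ in range(200))]

# coarse Hough-style estimate: angle snapped to a grid with bin size pi/beta
beta = 30
alpha_hat = (math.floor(alpha / (math.pi / beta)) + 0.5) * (math.pi / beta)
n_coarse = (math.cos(alpha_hat), math.sin(alpha_hat))

# fine-tuning: refit the normal as the scatter matrix's smallest eigenvector
sxx = sum(x * x for x, _ in pts)
sxy = sum(x * y for x, y in pts)
syy = sum(y * y for _, y in pts)
n_fine = smallest_eigvec_2x2(sxx, sxy, syy)

def err(n):  # angle between estimate and true normal, sign-invariant
    return math.acos(min(1.0, abs(n[0] * n_true[0] + n[1] * n_true[1])))

print(err(n_coarse), err(n_fine))  # fine-tuning reduces the resolution error
```

The coarse estimate carries the systematic β^(−1) quantization error of Section 4.3, while the refit error is governed only by the sample noise, which is the trade-off the paragraph above describes.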

5. SIMULATIONS

We give a simulation example as well as batch runs to analyze the performance of the proposed algorithm.
