Multivariate Gaussianization for Data Processing

[...] crosses represent outliers of different nature. The figures show the classification boundaries found by SVDD (left) and G-PCA (right) when trained using a restricted set of outliers (crosses).

[SVDD outperforms the proposed] method for small training sets. This is because G-PCA needs more target samples for an accurate PDF estimation. However, for moderate and large training sets the proposed method substantially outperforms SVDD. Note that the training size requirements of G-PCA are not too demanding: 750 samples on a 10-dimensional problem are enough for G-PCA to outperform SVDD when very little is known of the non-target class.

[Figure 5: three panels (Naples 1995, Naples 1999, Rome 1999), κ statistic versus number of training samples (0 to 2500).]

Figure 5. Classification performance (κ statistic) as a function of the number of training samples for the three considered images by the SVDD (dashed) and the G-PCA (solid).

Experiment 3: One-class Classification. Classification accuracy

The estimated κ statistic jointly measures precision and recall. Results for test, 10^5 pixels. Poor results, very challenging problem: training with few samples and from an independent area; high variance of the spectral signatures. SVDD outperforms RBIG for small training sets. RBIG outperforms SVDD for moderate and large training sets.

Figure 6 shows the classification maps using a restricted training strategy. In this case, the experiment was carried out over a small region (200 × 200) of the Naples 1995 image. We used 2000 samples of the target class and only 10 samples of the non-target class. Here the classification performance (κ statistic) is better than the results reported in Fig. 5 because small regions have more homogeneous features, so the variance of the spectral signatures is smaller. As a consequence, the training data describe the particular behavior of the smaller spatial region more accurately, thus achieving better performance on the test set.

Note that, although the SVDD classification map is more homogeneous, G-PCA better rejects the 'non-urban' areas (in black). This may be because SVDD training with few non-target data gives rise to a too broad [...]
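The κ statistic used to score the classifiers above can be sketched in a few lines. This is a minimal illustration that assumes κ refers to Cohen's kappa, as is common for remote sensing classification maps; the function name `cohen_kappa` and the toy labels are hypothetical and not taken from the original experiments.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of samples where both labelings coincide.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: computed from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n**2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one label is present")
    return (p_o - p_e) / (1.0 - p_e)


# Toy example (assumed labels): 1 = urban (target), 0 = non-urban.
reference = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
kappa = cohen_kappa(reference, predicted)
```

In the toy example the observed agreement is 0.75 and the chance agreement is 0.5, so κ = 0.5. Values near 0 indicate chance-level agreement, which is consistent with reading the low κ values in Fig. 5 as a hard problem rather than a broken classifier.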
Experiment 3: One-class Classification. Classification accuracy (II)

[Figure 6: three panels: ground truth, SVDD (κ = 0.62), G-PCA (κ = 0.65).]

A small region (200 × 200) of the Naples 1995 image. 2000 samples of the target class and only 10 samples of the non-target class for tuning parameters. Much better results (lower spectral variance). The SVDD classification map is more homogeneous but fails in outlier identification. RBIG better rejects the 'non-urban' areas (in black). Noisy results can be solved by including spatial information.

Figure 6. Classification performance over a small region of the Naples image (1995). White points represent urban areas while black points represent non-urban areas.

6. CONCLUSIONS

We proposed a fast alternative to iterative Gaussianization methods, which makes Gaussianization suitable for high-dimensional problems such as those in remote sensing applications. The proposed G-PCA consists of iteratively applying marginal Gaussianization and PCA to any original dataset. The result is a multivariate Gaussian. Theoretical convergence of the proposed method was proved. The method exhibits fast and stable convergence rates through a suitable early-stopping criterion. The computational cost is dramatically reduced compared to ICA-based Gaussianization methods. The proposed [...]
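The iteration described in the conclusions (marginal Gaussianization followed by a PCA rotation, repeated) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the marginal step uses a simple rank-based empirical CDF, the loop runs for a fixed number of iterations instead of the early-stopping criterion mentioned in the text (whose details are not given here), and all function names (`marginal_gaussianize`, `pca_rotate`, `gpca_sketch`) are hypothetical.

```python
from statistics import NormalDist

import numpy as np

# Standard normal inverse CDF applied element-wise.
# Stdlib-based for self-containment; slow on large arrays.
_inv_norm_cdf = np.vectorize(NormalDist().inv_cdf)


def marginal_gaussianize(X):
    """Map each column to normal scores via its rank-based empirical CDF."""
    n = X.shape[0]
    ranks = X.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n per column
    U = ranks / (n + 1)  # strictly inside (0, 1)
    return _inv_norm_cdf(U)


def pca_rotate(X):
    """Center the data and rotate it onto the eigenvectors of its covariance."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    _, V = np.linalg.eigh(cov)  # orthonormal eigenvectors as columns
    return Xc @ V


def gpca_sketch(X, n_iter=30):
    """Alternate marginal Gaussianization and PCA rotation (assumes >= 2 features).

    n_iter is an assumed fixed budget; the source refers to an
    early-stopping criterion that is not reproduced here.
    """
    Z = np.asarray(X, dtype=float)
    for _ in range(n_iter):
        Z = pca_rotate(marginal_gaussianize(Z))
    return Z


# Toy usage on synthetic, clearly non-Gaussian 2-D data (assumed example).
rng = np.random.default_rng(0)
X = rng.exponential(size=(500, 2)) @ np.array([[1.0, 0.5], [0.2, 1.0]])
Z = gpca_sketch(X, n_iter=20)
```

Each marginal step makes every coordinate Gaussian on its own, and each rotation exposes the remaining non-Gaussian structure to the next marginal step. The sketch only transforms the training sample; using it for one-class classification would additionally require storing the per-iteration transforms so that new samples can be mapped and their density evaluated.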