11.07.2015 Views

2DkcTXceO

2DkcTXceO

2DkcTXceO

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

J. Fan 515 65432100.05 0 0.05 0.1 0.15FIGURE 43.4Left panel: Distribution of the sample correlation ĉorr(X j , ˆε), in which ˆε representsthe residuals after the FGMM fit. Right panel: Distribution of thesample correlation ĉorr(X j , ˆε) for only selected 5 genes by FGMM.selects five genes. Therefore, we need only validate 10 empirical correlationsspecified by conditions (43.4). The empirical correlations between the residualsafter the FGMM fit and the five selected covariates are zero, and theircorrelations with squared covariates are small. The results are displayed inthe right panel of Figure 43.4. Therefore, our model assumptions and modeldiagnostics are consistent.43.6 Noise accumulationWhen a method depends on the estimation of many parameters, the estimationerrors can accumulate. For high-dimensional statistics, noise accumulationis more severe and can even dominate the underlying signals. Consider, forexample, a linear classification which assigns the class label 1(x ⊤ β>0) foreach new data point x. Thisrulecanhavehighdiscriminationpowerwhenβ is known. However, when an estimator ˆβ is used instead, the classificationrule can be as bad as a random guess due to the accumulation of errors inestimating the high-dimensional vector ˆβ.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!