The literature provides many possible approaches to construct estimators of the required objects Q₀, and thereby ψ₀ = Ψ(Q₀). One strategy would be to define a large class of submodels that contains a sequence of submodels approximating the complete statistical model (a so-called sieve), and to construct for each submodel an estimator that achieves the minimax rate under the assumption that Q₀ is an element of this submodel. One can then use a data-adaptive selector to choose among all these submodel-specific candidate estimators. This general strategy, often referred to as sieve-based MLE, results in a minimax adaptive estimator of Q₀; that is, the estimator converges at the minimax rate of the smallest submodel (measured by entropy) that still contains the true Q₀. We refer to van der Laan and Dudoit (2003) and van der Laan et al. (2006) for such general minimum loss-based estimators relying on cross-validation to select the subspace. The same strategy can be employed with kernel regression estimators indexed by the degree of orthogonality of the kernel and a bandwidth, using a data-adaptive selector to choose the kernel and bandwidth. In this manner, the resulting data-adaptive kernel regression estimator will achieve the minimax rate of convergence corresponding with the unknown underlying smoothness of the true regression function.

40.3.6 Cross-validation

Cross-validation is a particularly powerful tool for selecting among candidate estimators. One defines a criterion that measures the performance of a given fit of Q₀ on a particular subsample: typically, this is the empirical mean of a loss function that maps a fit Q and an observation Oᵢ into a real number, chosen so that the minimizer of the expected loss over all Q equals the desired true Q₀ (for example, the squared-error loss (Y − Q(X))², whose expectation is minimized by the true regression function E(Y | X)). For each candidate estimator, one trains the estimator on a training sample and evaluates the resulting fit on the complement of the training sample, called the validation sample. This is carried out for a number of sample splits into training and validation samples, and one selects the estimator with the best average performance across the splits. Statistical theory teaches us that this procedure is asymptotically optimal in great generality, in the sense that it performs asymptotically as well as an oracle selector that picks the estimator based on the criterion applied to an infinite validation sample; see, e.g., Györfi et al. (2002), van der Laan and Dudoit (2003), van der Laan et al. (2006), and van der Vaart et al. (2006). The key conditions are that the loss function be uniformly bounded and that the size of the validation sample converge to infinity.
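The following is a minimal sketch, not taken from the book, of the procedure described above: candidate estimators are kernel regression fits indexed by a bandwidth, and V-fold cross-validation under squared-error loss selects among them. The helper names (fit_kernel_regression, cv_select) and the bandwidth grid are illustrative assumptions, not notation from the text.

    # Minimal sketch (illustrative, not from the book): select among
    # Nadaraya-Watson kernel regression estimators, indexed by a
    # bandwidth h, via V-fold cross-validation under squared-error loss.
    import numpy as np

    def fit_kernel_regression(x_train, y_train, h):
        """Return a Nadaraya-Watson fit with a Gaussian kernel and bandwidth h."""
        def Q(x):
            # Kernel weights between evaluation points and training points.
            w = np.exp(-0.5 * ((x[:, None] - x_train[None, :]) / h) ** 2)
            return (w @ y_train) / np.clip(w.sum(axis=1), 1e-12, None)
        return Q

    def cv_select(x, y, bandwidths, V=5, seed=0):
        """Pick the bandwidth with the best average validation-sample loss."""
        folds = np.random.default_rng(seed).permutation(len(x)) % V
        risks = []
        for h in bandwidths:
            losses = []
            for v in range(V):
                train, valid = folds != v, folds == v
                Q = fit_kernel_regression(x[train], y[train], h)  # train on the training sample
                losses.append(np.mean((y[valid] - Q(x[valid])) ** 2))  # evaluate on the validation sample
            risks.append(np.mean(losses))  # average performance across sample splits
        return bandwidths[int(np.argmin(risks))]

    # Toy usage: noisy draws from a smooth regression function.
    rng = np.random.default_rng(1)
    x = rng.uniform(0.0, 1.0, 200)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, 200)
    h_star = cv_select(x, y, bandwidths=[0.01, 0.03, 0.1, 0.3])

Here the empirical mean of the squared-error losses on the validation folds plays the role of the cross-validated criterion, and the minimizer of its expectation is the true regression function. For the oracle results cited above, the loss must be uniformly bounded, which in this sketch would correspond to bounded outcomes.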
