the doubly censored observations. For continuously distributed failure times, all the half-lines contain zero uncensored bivariate failure times, and as a consequence the likelihood provides essentially no information about how these masses $1/n$ should be distributed over the half-lines. That is, the MLE is highly non-unique, and thereby also inconsistent.

Similarly, consider the estimation of $\Psi(P_0) = E_0 E_0(Y \mid A = 1, W)$ based on $n$ i.i.d. observations on $(W, A, Y) \sim P_0$, and suppose the statistical model is the nonparametric model. The MLE of $E_0(Y \mid A = 1, W = w)$ is the empirical mean of the outcome among the observations with $A_i = 1$, $W_i = w$, while the MLE of the distribution of $W$ is the empirical distribution of $W_1, \ldots, W_n$. This estimator is ill-defined, since most strata will contain no observations.

40.3.5 Regularizing MLE through smoothing

In order to salvage the MLE, the literature suggests regularizing it in some manner. This often involves either smoothing or a sieve-based MLE, where the fine-tuning parameters need to be selected based on some empirical criterion. For example, in our bivariate survival function example, we could put strips around the half-lines of the singly censored observations, and compute the MLE as if the half-lines implied by the singly censored observations are now these strips. Under this additional level of coarsening, the MLE is uniquely defined as long as each strip contains at least one uncensored observation. In addition, if one makes sure that the number of observations in the strips converges to infinity as sample size increases, and the width of the strips converges to zero, then the MLE will also be consistent. Unfortunately, there is still a bias/variance trade-off that needs to be resolved in order to arrange that the MLE of the bivariate survival function is asymptotically linear. Specifically, we need to make sure that the width of the strips converges to zero fast enough that the bias of the MLE with respect to the conditional densities over the half-lines is $o(1/\sqrt{n})$. This would mean that the width of the strips is $o(1/\sqrt{n})$. For an extensive discussion of this estimation problem, and an alternative smoothing approach to repair the NPMLE, we refer to van der Laan (1996).

Similarly, we could estimate the regression function $E_0(Y \mid A = 1, W)$ with a histogram regression method. If $W$ is $k$-dimensional, then, in order to arrange that each bin contains at least one observation, one needs to select a large enough width $h$ so that the $k$-dimensional cube with width $h$ contains at least one observation with high probability. That is, we will need to select $h$ so that $n h^k \to \infty$. This binning causes bias $O(h)$ for the MLE of $E_0 E_0(Y \mid A = 1, W)$. As a consequence, since the bias must be $o(n^{-1/2})$ while $h$ cannot be of smaller order than $n^{-1/k}$, we will need $n^{-1/k}$ to converge to zero faster than $n^{-1/2}$, which only holds when $k = 1$. In other words, for $k > 1$ there is no value of the smoothing parameter that results in a regularized MLE that is asymptotically linear.

Even though no histogram regularization is possible, there might exist other ways of regularizing the MLE. The statistics and machine learning
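To make the rate argument for the histogram regression explicit, the following short calculation spells out the two competing requirements on the bandwidth $h$ (an editorial sketch added for clarity; it only restates the bounds given above):

\[
\text{non-empty bins: } n h^k \to \infty \;\Longleftrightarrow\; h \gg n^{-1/k},
\qquad
\text{negligible bias: } O(h) = o(n^{-1/2}) \;\Longleftrightarrow\; h \ll n^{-1/2}.
\]

Both requirements can be met simultaneously only if $n^{-1/k} \ll n^{-1/2}$, that is, $1/k > 1/2$, i.e., $k < 2$; for an integer dimension this forces $k = 1$, matching the conclusion above.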
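As a concrete illustration of the two plug-in estimators of $E_0 E_0(Y \mid A = 1, W)$ discussed above, here is a minimal simulation sketch. The data-generating distribution, the bandwidth, and the function names are hypothetical choices made only for illustration, not taken from the text; the point is that the unregularized nonparametric MLE is essentially undefined when $W$ is continuous, while a histogram regularization with bandwidth $h$ yields a well-defined estimate.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data-generating distribution, for illustration only.
n = 2000
W = rng.uniform(0.0, 1.0, size=n)               # continuous covariate
A = rng.binomial(1, 0.5, size=n)                # treatment indicator
Y = W + 0.3 * A + rng.normal(0.0, 0.1, size=n)  # E(Y | A=1, W) = W + 0.3, so the target is 0.8

def plugin_mle_unregularized(W, A, Y):
    # Stratum-specific empirical means: with continuous W, the stratum
    # {A=1, W=w} contains at most the observation itself, and is empty
    # for every untreated w, so the conditional mean is undefined there.
    est = np.full(W.shape, np.nan)
    for i, w in enumerate(W):
        mask = (A == 1) & (W == w)              # exact matching on a continuous W
        if mask.any():
            est[i] = Y[mask].mean()
    return est                                  # mostly NaN: the MLE is ill-defined

def plugin_mle_histogram(W, A, Y, h):
    # Histogram regularization: bin W with width h, estimate E(Y | A=1, W in bin)
    # by the mean outcome among treated observations in the bin, then average the
    # bin estimates over the empirical distribution of W (empty bins are skipped).
    bins = np.floor(W / h).astype(int)
    est = np.full(W.shape, np.nan)
    for b in np.unique(bins):
        mask = (A == 1) & (bins == b)
        if mask.any():
            est[bins == b] = Y[mask].mean()
    return np.nanmean(est)

print(np.isnan(plugin_mle_unregularized(W, A, Y)).mean())  # fraction of undefined strata, about 0.5
print(plugin_mle_histogram(W, A, Y, h=0.05))                # close to the target value 0.8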