11.07.2015 Views

[U] User's Guide


[U] 20 Estimation and postestimation commands

and taking the mean

$$(y_1 + y_2 + y_3)/3 = \{(x_1 + x_2 + x_3)/3\}\beta + (\epsilon_1 + \epsilon_2 + \epsilon_3)/3$$

For another observation in the data, which may be the result of summing of a different number of observations, the variance will be different. Hence, the model for the data is

$$y_j = x_j\beta + \epsilon_j, \qquad \epsilon_j \sim N(0, \sigma^2/N_j)$$

This makes intuitive sense. Consider 2 observations, one recording means over two subjects and the other means over 100,000 subjects. You would expect the variance of the residual to be less in the 100,000-subject observation; i.e., there is more information in the 100,000-subject observation than in the two-subject observation.

Now instead say that you are fitting the same model, $y_i = x_i\beta + \epsilon_i$, $\epsilon_i \sim N(0, \sigma^2)$, on probability-weighted data. Each observation in your data is one subject, but the different subjects have different chances of being included in your sample. Therefore, for each subject in your data

$$y_i = x_i\beta + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2)$$

That is, there is no heteroskedasticity problem. The use of the aweighted estimator cannot be justified on these grounds.

As a matter of fact, from the argument just given, you do not need to adjust for the weights at all, although the argument does not justify not making an adjustment. If you do not adjust, you are holding tightly to the assumed truth of your model. Two issues arise when considering adjustment for sampling weights:

1. the efficiency of the point estimate $\widehat{\beta}$ of $\beta$ and
2. the reported standard errors (and, more generally, the variance matrix of $\widehat{\beta}$).

Efficiency argues in favor of adjustment, and that, by the way, is why many researchers have used aweights with pweighted data. The adjustment implied by pweights to the point estimates is the same as the adjustment implied by aweights.

With regard to the second issue, the use of aweights produces incorrect results because it interprets larger weights as designating more accurately measured points. For pweights, however, the point is no more accurately measured: it is still just one observation with one residual $\epsilon_j$ and variance $\sigma^2$. In [U] 20.16 Obtaining robust variance estimates above, we introduced another estimator of variance that measures the variation that would be observed if the data collection followed by the estimation were repeated. Those same formulas provide the solution to pweights, and they have the added advantage that they are not conditioned on the model's being true. If we have any hopes of measuring the variation that would be observed were the data collection followed by estimation repeated, we must include the probability of the observations being sampled in the calculation.

In Stata, when you type

    . regress y x1 x2 [pw=pop]

the results are the same as if you had typed

    . regress y x1 x2 [pw=pop], vce(robust)
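One way to see the two points made above (identical point estimates under aweights and pweights, but different reported standard errors) is to fit the model all three ways and compare the results side by side. The do-file below is only a sketch under stated assumptions: the data are simulated, the seed, sample size, and coefficient values are arbitrary, and the variables y, x1, x2, and pop are hypothetical stand-ins for your own data.

```stata
* Sketch only: simulated data; all variable names and values are hypothetical
clear
set seed 12345
set obs 500
generate x1  = rnormal()
generate x2  = rnormal()
generate pop = ceil(100*runiform())   // stand-in for a sampling weight
generate y   = 1 + 2*x1 - x2 + rnormal()

* aweights: weighted point estimates with model-based standard errors
regress y x1 x2 [aw=pop]
estimates store aw

* pweights: same weighted point estimates, robust standard errors
regress y x1 x2 [pw=pop]
estimates store pw

* pweights with vce(robust) stated explicitly
regress y x1 x2 [pw=pop], vce(robust)
estimates store pw_robust

* Coefficients and standard errors side by side
estimates table aw pw pw_robust, b(%9.4f) se(%9.4f)
```

If the text above is right, the coefficient rows should agree across all three columns, and the standard errors in the pw and pw_robust columns should agree with each other while differing from those in the aw column.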
