11.07.2015 Views

[U] User's Guide


[U] 20.18 Weighted estimation

That is, specifying pweights implies the vce(robust) option and, hence, the robust variance calculation (but weighted). In this example, we use regress simply for illustration. The same is true of probit and all of Stata’s estimation commands. Estimation commands that do not have a vce(robust) option (there are a few) do not allow pweights.

pweights are adequate for handling random samples where the probability of being sampled varies. pweights may be all you need. If, however, the observations are not sampled independently but are sampled in groups (called clusters in the jargon), you should specify the estimator’s vce(cluster clustvar) option as well:

    . regress y x1 x2 [pw=pop], vce(cluster block)

There are two ways of thinking about this:

1. The robust estimator answers the question of which variation would be observed were the data collection followed by the estimation repeated; if that question is to be answered, the estimator must account for the clustered nature of how observations are selected. If observations 1 and 2 are in the same cluster, then you cannot select observation 1 without selecting observation 2 (and, by extension, you cannot select observations like 1 without selecting observations like 2).

2. If you prefer, you can think about potential correlations. Observations in the same cluster may not really be independent; that is an empirical question to be answered by the data. For instance, if the clusters are neighborhoods, it would not be surprising that the individual neighbors are similar in their income, their tastes, and their attitudes, and even more similar than two randomly drawn persons from the area at large with similar characteristics, such as age and sex.

Either way of thinking leads to the same (robust) estimator of variance.

Sampling weights usually arise from complex sampling designs, which often involve not only unequal probability sampling and cluster sampling but also stratified sampling.
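
The pweight and vce(cluster) behavior described above can be checked side by side. The following is a minimal sketch, not part of the original text. It assumes the nhanes2 example dataset is available through webuse and contains the variables bpsystol, age, weight, finalwgt, strata, and psu; the name cluster_id is introduced here purely for illustration. The first two fits should report the same standard errors, because pweights already imply vce(robust), and the third adds the cluster adjustment while ignoring stratification.

```stata
* Assumption: nhanes2 is reachable via webuse and has the variables named below
webuse nhanes2, clear

* PSUs are numbered within strata, so build one id per sampled cluster
* (cluster_id is a hypothetical name used only in this sketch)
egen long cluster_id = group(strata psu)

* pweights alone: the robust variance calculation is implied
regress bpsystol age weight [pweight = finalwgt]
estimates store pw_only

* pweights with vce(robust) spelled out: same standard errors expected
regress bpsystol age weight [pweight = finalwgt], vce(robust)
estimates store pw_robust

* pweights plus clustering on the sampled groups
regress bpsystol age weight [pweight = finalwgt], vce(cluster cluster_id)
estimates store pw_cluster

* Coefficients and standard errors for the three fits in one table
estimates table pw_only pw_robust pw_cluster, se
```

The point estimates are identical in all three columns; only the standard errors in the clustered fit should differ.
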
There is a family of commands in Stata designed to work with the features of complex survey data, and those are the commands that begin with svy. To fit a linear regression model with stratification, for example, you would use the svy:regress command.

Non-svy commands that allow pweights and clustering give essentially identical results to the svy commands. If the sampling design is simple enough that it can be accommodated by the non-svy command, that is a fine way to perform the analysis. The svy commands differ in that they have more features, and they do all the little details correctly for real survey data. See [SVY] survey for a brief discussion of some of the issues involved in the analysis of survey data and a list of all the differences between the svy and non-svy commands.

Not all model estimation commands in Stata allow pweights. This is often because they are computationally or statistically difficult to implement.

20.18.4 Importance weights

Stata’s iweights (importance weights) are the emergency exit. These weights are for those who want to take control and create special effects. For example, programmers have used regress with iweights to compute iteratively reweighted least-squares solutions for various problems.
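
As an illustration of the iteratively reweighted least-squares use mentioned above, here is a rough sketch, not taken from the original text. The choice of Huber-type weights, the tuning constant 1.345, the scale estimate based on the median absolute residual, and the variable names irls_w, resid, and absres are all assumptions made for this example. It uses the auto dataset that ships with Stata.

```stata
* Sketch only: Huber-type IRLS built on regress with iweights
sysuse auto, clear

generate double irls_w = 1      // equal weights, so the first fit is plain OLS
local b_old = .                 // no previous slope yet

forvalues iter = 1/50 {
    quietly regress mpg weight [iweight = irls_w]
    local b_new = _b[weight]

    * Stop once the slope has stabilized (a missing b_old never passes the test)
    if reldif(`b_new', `b_old') < 1e-8 {
        continue, break
    }
    local b_old = `b_new'

    * Residuals and a robust scale from the median absolute residual
    quietly predict double resid, residuals
    quietly generate double absres = abs(resid)
    quietly summarize absres, detail
    local scale = r(p50) / 0.6745

    * Huber weights: 1 for small residuals, downweighted for large ones
    quietly replace irls_w = min(1, 1.345 * `scale' / absres)
    drop resid absres
}

* Final weighted fit; the coefficients are the IRLS solution
regress mpg weight [iweight = irls_w]
```

The standard errors printed by the final regress are a by-product of the weighted fit and should not be read as valid robust-regression standard errors.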
