
Questionnaire Dwelling Unit-Level and Person Pair-Level Sampling ...



7. Weight Calibration at Questionnaire Dwelling Unit and Pair Levels

The 2006 National Survey on Drug Use and Health (NSDUH) was based on probability sampling so that valid inferences could be made from survey findings about the target population. Probability sampling refers to sampling in which every unit on the frame is given a known, nonzero probability of inclusion in the survey. This is required for unbiased estimation of the population total. The assumption of a nonzero joint inclusion probability for every pair of units in the frame is also required for unbiased variance estimation. The basic sampling plan involved four stages of selection across two phases of design. Within Phase I: (1) the selection of census tracts within each State sampling (SS) region; (2) the selection of subareas or segments (composed of U.S. Bureau of the Census blocks) within SS regions; and (3) the selection of dwelling units (DUs) within these subareas. Finally, within Phase II: (4) the selection of eligible individuals within DUs. Specific details of the sample design and selection procedures can be found in the 2006 NSDUH sample design report (Morton et al., 2007).
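The two requirements on inclusion probabilities stated above can be made concrete with the standard Horvitz-Thompson estimator. The source does not name a specific estimator, so the formulas below are a textbook illustration rather than the report's own method. The symbols are assumptions introduced here: s is the sample, y_i is the value for unit i, \pi_i is the inclusion probability of unit i, and \pi_{ij} is the joint inclusion probability of units i and j.

```latex
% Horvitz-Thompson estimator of the population total Y from sample s
\hat{Y} = \sum_{i \in s} \frac{y_i}{\pi_i},
  \qquad \pi_i > 0 \text{ for every unit } i \text{ on the frame}

% Horvitz-Thompson variance estimator, with \pi_{ii} = \pi_i
\hat{V}(\hat{Y}) = \sum_{i \in s} \sum_{j \in s}
  \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \, \frac{y_i}{\pi_i} \, \frac{y_j}{\pi_j},
  \qquad \pi_{ij} > 0 \text{ for every pair } (i, j)
```

The first expression is undefined if any \pi_i is zero, and the second if any \pi_{ij} is zero, which is why both conditions are needed for unbiased estimation.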

As part of the postsurvey data-processing activities, analysis weights that reflected the selection probabilities from the various stages of the sample design were calculated for respondents. These sample weights were adjusted at the DU (screening sample), questionnaire dwelling unit (QDU), person, and paired-respondent levels (the latter three all based on the drug questionnaire sample) to account for bias due to extreme values (ev), nonresponse (nr), and coverage (poststratification, ps).
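To give a feel for what a coverage adjustment does to a weight, here is a minimal sketch of a simple ratio poststratification. This is a deliberately simpler technique swapped in for illustration; it is not the model-based calibration used in the report. The function name, the cell labels, the weights, and the control totals are all hypothetical.

```python
from collections import defaultdict

def poststratify(units, control_totals):
    """Scale weights so each cell's weighted count matches its control total.

    Illustrative ratio adjustment only; cells, weights, and controls are made up.
    """
    # Sum the current weights within each poststratification cell.
    weight_sums = defaultdict(float)
    for unit in units:
        weight_sums[unit["cell"]] += unit["weight"]

    # Multiply every weight by (control total / current weighted total) for its cell.
    adjusted = []
    for unit in units:
        factor = control_totals[unit["cell"]] / weight_sums[unit["cell"]]
        adjusted.append({**unit, "weight": unit["weight"] * factor})
    return adjusted

# Hypothetical respondents and hypothetical population control totals.
units = [
    {"id": 1, "cell": "A", "weight": 10.0},
    {"id": 2, "cell": "A", "weight": 30.0},
    {"id": 3, "cell": "B", "weight": 25.0},
]
control_totals = {"A": 80.0, "B": 75.0}

adjusted = poststratify(units, control_totals)
```

After the adjustment, the weights in each cell sum to that cell's control total, which is the basic property any coverage adjustment of this kind is meant to deliver.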

The final sample weights for the Phase I screener dwelling unit (SDU) level and the Phase II QDU, person, and pair levels for the 2006 samples consisted of products of several factors, each representing either a probability of selection at a particular stage or some form of ev, nr, or ps calibration adjustment. The following sections describe the QDU and pair weight components in greater detail. In summary, the first 10 factors were defined for all SDUs and reflected the fully adjusted SDU sample weight. The remaining components branched to reflect QDU and pair selection probabilities, as well as additional adjustments for ev, nr, and ps. Note that the final QDU and pair weights for the 2006 survey sample are the product of all weight components for each type of sample, as illustrated in Exhibits 7.1 and 7.2.
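The product structure described above, with a shared SDU weight that branches into QDU and pair components, can be sketched as follows. This is a minimal illustration, not the report's implementation: the component names and numeric values are hypothetical, and far fewer components are shown than the report describes.

```python
from math import prod

def weight_from_components(components):
    """Return the product of all weight components for one sampled unit."""
    if any(value <= 0 for value in components.values()):
        raise ValueError("every weight component must be positive")
    return prod(components.values())

# Hypothetical SDU-level components (the report defines 10; only 4 are shown).
sdu_components = {
    "inverse_tract_selection_prob": 8.0,
    "inverse_segment_selection_prob": 4.0,
    "inverse_du_selection_prob": 2.0,
    "sdu_nr_adjustment": 1.25,
}

# Hypothetical components that branch off after the SDU weight.
qdu_components = {"inverse_qdu_selection_prob": 1.5, "qdu_nr_adjustment": 2.0}
pair_components = {"inverse_pair_selection_prob": 4.0, "pair_ps_adjustment": 0.5}

sdu_weight = weight_from_components(sdu_components)
# Each final weight is the shared SDU weight times its own branch of components.
qdu_weight = sdu_weight * weight_from_components(qdu_components)
pair_weight = sdu_weight * weight_from_components(pair_components)
```

Keeping the components in named dictionaries rather than a single number makes it easy to inspect which stage or adjustment contributes most to a given final weight.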

For QDU data, generalized exponential model (GEM) calibration was applied by partitioning the data into four groups of States based on census regions (Northeast, South, Midwest, and West) in the interest of computational feasibility. Previous experience showed that, with current computing power, the large number of variables and records prevented any further reduction in the number of modeling groups.
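The partitioning step described above can be sketched as a simple grouping of records by census region, so that a separate calibration model could then be fit to each group. This sketch covers only the partitioning, not GEM itself. The record layout, the function name, and the sample records are hypothetical, and the State-to-region lookup lists only a few States for brevity.

```python
from collections import defaultdict

# Partial State-to-census-region lookup (illustrative subset only).
CENSUS_REGION = {
    "NY": "Northeast", "PA": "Northeast",
    "TX": "South", "FL": "South",
    "OH": "Midwest", "IL": "Midwest",
    "CA": "West", "WA": "West",
}

def partition_by_region(records):
    """Split records into modeling groups keyed by census region."""
    groups = defaultdict(list)
    for record in records:
        groups[CENSUS_REGION[record["state"]]].append(record)
    return dict(groups)

# Hypothetical QDU records; only the State field matters for the partition.
records = [
    {"qdu_id": 1, "state": "NY"},
    {"qdu_id": 2, "state": "TX"},
    {"qdu_id": 3, "state": "FL"},
    {"qdu_id": 4, "state": "OH"},
    {"qdu_id": 5, "state": "CA"},
]

groups = partition_by_region(records)
```

Each resulting group is smaller than the full dataset, which is the property that makes fitting a model per group computationally feasible.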

For pair data, GEM modeling was initially applied by partitioning the pair data into four groups based on census regions. However, there were not enough observations in each group to fit a comprehensive model to reduce bias. Alternatively, a single model was attempted for the whole pair dataset, but it was rejected as impractical because of computational limitations. A compromise approach was adopted by combining census regions into two groups: Northeast with South and Midwest with West. This grouping proved both manageable and desirable as it
