GAO-12-208G, Designing Evaluations: 2012 Revision
GAO-12-208G, Designing Evaluations: 2012 Revision
GAO-12-208G, Designing Evaluations: 2012 Revision
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
Comparison Group Quasiexperiments<br />
Chapter 4: Designs for Assessing Program<br />
Implementation and Effectiveness<br />
over both program implementation and external factors that may influence<br />
program results. In fact, enforcing strict adherence to program protocols<br />
in order to strengthen conclusions about program effects may actually<br />
limit the ability to generalize those conclusions to less perfect, but more<br />
typical program operations.<br />
Ideally, randomized experiments in medicine are conducted as doubleblind<br />
studies, in which neither the subjects nor the researchers know who<br />
is receiving the experimental treatment. However, double-blind studies in<br />
social science are uncommon, making it hard sometimes to distinguish<br />
the effects of a new program from the effects of introducing any novelty<br />
into the classroom or workplace. Moreover, program staff may jeopardize<br />
the random assignment process by exercising their own judgment in<br />
recruiting and enrolling participants. Because of the critical importance of<br />
the comparison groups’ equivalence for drawing conclusions about<br />
program effects, it is important to check the effectiveness of random<br />
assignment by comparing the groups’ equivalence on key characteristics<br />
before program exposure.<br />
Because of the difficulties in establishing a random process for assigning<br />
units of study to a program, as well as the opportunity provided when only<br />
a portion of the targeted population is exposed to the program, many<br />
impact evaluations employ a quasi-experimental comparison group<br />
design instead. This design also uses a treatment group and one or more<br />
comparison groups; however, unlike the groups in the true experiment,<br />
membership in these groups is not randomly assigned. Because the<br />
groups were not formed through a random process, they may differ with<br />
regard to other factors that affect their outcomes. Thus, it is usually not<br />
possible to infer that the “raw” difference in outcomes between the groups<br />
has been caused by the treatment. Instead, statistical adjustments such<br />
as analysis of covariance should be applied to the raw difference to<br />
compensate for any initial lack of equivalence between the groups.<br />
Comparison groups may be formed from the pool of applicants who<br />
exceed the number of program slots in a given locale or from similar<br />
populations in other places, such as neighborhoods or cities, not served<br />
by the program. Drawing on the research literature to identify the key<br />
factors known to influence the desired outcomes will aid in forming<br />
treatment and comparison groups that are as similar as possible, thus<br />
strengthening the analyses’ conclusions. When the treatment group is<br />
made up of volunteers, it is particularly important to address the potential<br />
for “selection bias”—that is, that volunteers or those chosen to participate<br />
Page 42 <strong>GAO</strong>-<strong>12</strong>-<strong>208G</strong>