Practical Considerations in Raking Survey Data

Michael P. Battaglia 1 , David Izrael 1 , David C. Hoaglin 1 , and Martin R. Frankel 1,2 

(1) Abt Associates Inc., (2) Baruch College, CUNY 

Contact Author: Michael P. Battaglia 

Abt Associates Inc., 55 Wheeler Street, Cambridge, MA 02138 

(v) 617-349-2425, (f) 617-349-2605, mike_battaglia@abtassoc.com 

Abstract 

A survey sample may cover segments of the target population in proportions that do not match the 

proportions of those segments in the population itself. The differences may arise from sampling 

fluctuations, nonresponse, or because the sample design was not able to cover the entire population. 

In such situations one can use raking to improve the relation between the sample and the 

population by adjusting the sampling weights of the cases in the sample so that the marginal totals 

of the adjusted weights on specified characteristics agree with the corresponding totals for the 

population. The raking procedure is described, and convergence issues and problems are 

discussed. The details of several practical aspects of raking are then given. The topics covered 

have not received much attention in the literature on raking. Specific aspects of raking are 

illustrated with graphical displays of output from a SAS Macro that can be obtained for free from 

the authors. 

Key Words 

Control totals, convergence, raking margins, weights, nonresponse 

1. Introduction 

A survey sample may cover segments of the target population in proportions that do not match the 

proportions of those segments in the population itself. The differences may arise, for example, 

from sampling fluctuations, from nonresponse, or because the sample design was not able to cover 

the entire population. In such situations one can often improve the relation between the sample and 

the population by adjusting the sampling weights of the cases in the sample so that the marginal 

totals of the adjusted weights on specified characteristics agree with the corresponding totals for 

the population. This operation is known as raking ratio estimation (Kalton 1983), raking, or 

sample-balancing, and the population totals are usually referred to as control totals. Raking may 

reduce nonresponse and noncoverage biases, as well as sampling variability. The initial sampling 

weights in the raking process are often equal to the reciprocal of the probability of selection and 

may have undergone some adjustments for unit nonresponse and noncoverage. The weights from 

the raking process are used in estimation and analysis. 

The adjustment to control totals is sometimes achieved by creating a cross-classification of the 

categorical control variables (e.g., age categories x gender x race x family-income categories) and 

then matching the total of the weights in each cell to the control total. This approach, however, can 

spread the sample thinly over a large number of cells. It also requires control totals for all cells of 

the cross-classification. Often this is not feasible (e.g., control totals may be available for age x 

gender x race but not when those cells are subdivided by family income). The use of marginal 

control totals for single variables (i.e., each margin involves only one control variable) often avoids 

many of these difficulties. In return, of course, the two-variable (and higher-order) weighted 

distributions of the sample are not required to mimic those of the population. 

A somewhat different problem motivated the original development of sample-balancing (Deming 

1943). The Census Bureau needed to produce tabulations for the joint distribution of two (or more) 

variables in the U.S. population, in situations where information on the joint distribution was 

available only from a sample. The marginal totals, however, were available for the full population, 

and so the sample counts in the cells of the cross-classification were adjusted to provide an 

estimated tabulation that had the correct marginal totals. 

Raking (or sample-balancing) usually proceeds one variable at a time, applying a proportional 

adjustment to the weights of the cases that belong to the same category of the control variable. 

Software for sample-balancing has been available for many years, but not as part of SAS (except 

for the CALMAR macro from France) or most other major software systems (WESVAR includes a 

raking algorithm). Older readers may be familiar with a FORTRAN program developed in the 

1960s by MarketMath, Inc. Although that program executed rapidly, it had a variety of 

disadvantages. The user had to create an ASCII input data set, painstakingly prepare control 

statements (the original program was designed to read input from cards), and then process its 

ASCII output data set. It could rake on at most 12 variables. Also, it handled rounding in a way 

that could lose precision. Izrael et al. (2000) introduced a SAS macro for raking (sometimes 

referred to as the IHB raking macro) that combines simplicity and versatility. More recently, the 

IHB raking macro was enhanced to increase its utility and diagnostics (Izrael et al. 2004). 

The raking algorithm and issues related to convergence are discussed next. Several practical raking 

applications are then covered. 

2. Basic Algorithm 

The procedure known as raking adjusts a set of data so that its marginal totals match specified 

control totals on a specified set of variables. The term “raking” suggests an analogy with the 

process of smoothing the soil in a garden plot by alternately working it back and forth with a rake 

in two perpendicular directions. 

In a simple 2-variable example the marginal totals in various categories for the two variables are 

known from the entire population, but the joint distribution of the two variables is known only from 

a sample. In the cross-classification of the sample, arranged in rows and columns, one might begin 

with the rows, taking each row in turn and multiplying each entry in the row by the ratio of the 

population total to the weighted sample total for that category, so that the row totals of the adjusted 

data agree with the population totals for that variable. The weighted column totals of the adjusted 

data, however, may not yet agree with the population totals for the column variable. Thus the next 

step, taking each column in turn, multiplies each entry in the column by the ratio of the population 

total to the current total for that category. Now the weighted column totals of the adjusted data 

agree with the population totals for that variable, but the new weighted row totals may no longer 

match the corresponding population totals. The process continues, alternating between the rows 

and the columns, and agreement on both rows and columns is usually achieved after a few 

iterations. The result is a tabulation for the population that reflects the relation of the two variables 

in the sample. 

The above sketch of the raking procedure focuses on the counts in the cells and on the margins of a 

two-variable cross-classification of the sample. In the applications that survey statisticians often 

encounter, involving data from complex surveys, it is more common to work with the survey 

weights of the n individual respondents. Thus, the basic raking algorithm is described in terms of

those individual weights, $w_i$, $i = 1, 2, \ldots, n$. For an unweighted (i.e., equally weighted) sample, one

can simply take the initial weights to be $w_i = 1$ for each $i$.

In a cross-classification that has $J$ rows and $K$ columns, denote the sum of the $w_i$ in cell $(j, k)$

by $w_{jk}$. To indicate further summation, replace a subscript by a + sign. Thus, the initial row totals

and column totals of the sample weights are $w_{j+}$ and $w_{+k}$, respectively. Analogously, denote the

corresponding population control totals by $T_{j+}$ and $T_{+k}$.

The iterative raking algorithm produces modified weights, whose sums are denoted by a suitably

subscripted $m$ with a parenthesized superscript for the number of the step. Thus, in the two-variable

cross-classification $m^{(1)}_{jk}$ denotes the sum of the modified weights in cell $(j, k)$ at the end

of Step 1. If one begins by matching the control totals for the rows, $T_{j+}$, the initial steps of the

algorithm are

$$m^{(0)}_{jk} = w_{jk} \qquad (j = 1, \ldots, J;\ k = 1, \ldots, K)$$

$$m^{(1)}_{jk} = m^{(0)}_{jk} \left( T_{j+} / m^{(0)}_{j+} \right) \qquad \text{(for each $k$ within each $j$)}$$

$$m^{(2)}_{jk} = m^{(1)}_{jk} \left( T_{+k} / m^{(1)}_{+k} \right) \qquad \text{(for each $j$ within each $k$)}$$

The adjustment factors, $T_{j+} / m^{(0)}_{j+}$ and $T_{+k} / m^{(1)}_{+k}$, are actually applied to the individual weights,

which could be denoted by $m^{(2)}_{i}$, for example. In the iterative process an iteration rakes both rows

and columns. Thus, for iteration $s$ ($s = 0, 1, \ldots$) one may write

$$m^{(2s+1)}_{jk} = m^{(2s)}_{jk} \left( T_{j+} / m^{(2s)}_{j+} \right)$$

$$m^{(2s+2)}_{jk} = m^{(2s+1)}_{jk} \left( T_{+k} / m^{(2s+1)}_{+k} \right)$$
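
The alternating row and column adjustments can be sketched in Python as follows. This is a minimal illustration on a table of summed weights, not the IHB SAS macro; the function name `rake_two_way`, the stopping rule, and the example numbers are assumptions introduced here.

```python
import numpy as np

def rake_two_way(w, row_totals, col_totals, tol=1e-8, max_iter=50):
    """Rake a J x K table of summed weights to row and column control totals.

    Hypothetical helper: returns the raked table and the number of
    iterations used (one iteration = one row pass plus one column pass).
    Assumes every row and column of w has a positive total.
    """
    m = np.asarray(w, dtype=float).copy()
    row_totals = np.asarray(row_totals, dtype=float)
    col_totals = np.asarray(col_totals, dtype=float)
    for s in range(max_iter):
        # Step 2s+1: scale each row so that its total equals T_{j+}.
        m *= (row_totals / m.sum(axis=1))[:, None]
        # Step 2s+2: scale each column so that its total equals T_{+k}.
        m *= (col_totals / m.sum(axis=0))[None, :]
        # Columns now match exactly; stop once the rows are also close.
        if np.abs(m.sum(axis=1) - row_totals).max() < tol:
            return m, s + 1
    return m, max_iter

# Illustrative numbers only (not taken from the paper).
w = [[20.0, 30.0], [25.0, 25.0]]
raked, n_iter = rake_two_way(w, row_totals=[60.0, 40.0], col_totals=[45.0, 55.0])
```

Because each step multiplies whole rows or whole columns by a constant, the cross-product ratio of the sample table is unchanged, which is the sense in which the raked table reflects the relation of the two variables in the sample.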

Bishop et al. (1975) discuss the relationship between iterative proportional fitting and raking. They 

point out that raking was originally developed not for fitting an unsaturated model to a data set, but 

rather for combining information from two or more data sets. In the two-way table discussed 

above, one is in effect fitting a fully saturated log-linear model: the two-factor interaction present 

in the sample persists after raking, and the one-factor terms (reflected in the population control 

totals) are also fitted. Thus, in some ways raking can be thought of as fitting a “main effects” 

model, where the main effects correspond to the given margins. 

Raking can also adjust a set of data to control totals on three or more variables. In such situations 

the control totals often involve single variables, but they may involve two or more variables. In 

one example, in raking on three variables one might have control totals $T_{a++}$, $T_{+b+}$, and $T_{++c}$. In

another example, the control totals might be $T_{ab+}$ and $T_{++c}$: a two-variable margin and a one-variable

margin. In actually carrying out the raking for this second example, it suffices to treat the 

two-variable margin as the one-variable margin for a composite variable, whose values simply 

index the cells of the underlying two-variable margin. 
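
A composite variable of this kind can be built by pairing the two category labels for each case. The sketch below is illustrative only; the helper name `composite_margin`, the labels, and the control totals are hypothetical.

```python
def composite_margin(a_values, b_values):
    """Pair the categories of two variables into one composite raking variable."""
    return list(zip(a_values, b_values))

# Hypothetical case-level categories for variables A and B.
a = ["a1", "a1", "a2", "a2"]
b = ["b1", "b2", "b1", "b2"]
ab = composite_margin(a, b)

# Hypothetical two-variable control totals T_{ab+}, keyed by the same pairs,
# so the composite variable can be raked like any one-variable margin.
T_ab = {
    ("a1", "b1"): 100.0,
    ("a1", "b2"): 150.0,
    ("a2", "b1"): 120.0,
    ("a2", "b2"): 130.0,
}
```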

Ideally, one should rake on variables that exhibit strong associations with the key survey outcome 

variables or that are strongly related to nonresponse or noncoverage. This strategy will reduce the 

mean squared error of the key outcome variables. In practice, other considerations may enter. A 

variable such as gender may not be related to key outcome variables or to nonresponse or 

noncoverage, but raking on it may be desirable to preserve the “face validity” of the sample. 

3. Convergence 

Convergence of the raking algorithm has received considerable attention in the statistical literature, 

especially in the context of iterative proportional fitting for log-linear models, where the number of 

variables is at least three and the process begins with a different set of initial values in the fitted 

table (often 1 in each cell). For raking survey data it is enough that the iterative raking algorithm 

(ordinarily) converges, as one would expect from the fact that (in a suitable scale) the fitted cell 

counts produced by the raking are the weighted-least-squares fit to the observed cell counts in the 

full cross-classification of the sample by all the raking variables (Deming 1943). Convergence is not 

guaranteed, however: as an extreme example, for the 2 x 2 table shown in Table 1, convergence is impossible. 

Convergence may require a large number of iterations. Oh and Scheuren (1978) note that the 

available convergence proofs make strong assumptions about the cell counts in the cross-classification 

of the raking variables – that no cells are empty or that some particular combination 

of nonempty cells is present. They recommend setting up the raking problem in a “sensible” 

manner to avoid: 1) imposing too many marginal constraints on the sample, 2) defining marginal 

categories that contain a small percentage of the sample, and 3) imposing contradictory constraints 

on the sample. 

The authors’ experience indicates that, in general, raking on a large number of variables slows the 

convergence process. However, other factors also affect convergence. One is the number of 

categories of the raking variables. Convergence will typically be slower for raking on 10 variables 

each with 5 categories than for 10 variables each with only 2 categories. A second factor is the 

number of sample cases in each category of the raking variables. Convergence may be slow if any 

categories contain fewer than 5% of the sample cases. A third factor is the size of the difference 

between each control total and the corresponding weighted sample total prior to raking. If some 

differences are large, the number of iterations will typically be higher. One can guard against the 

possibility of nonconvergence or slow convergence by setting an upper limit on the number of 

iterations (e.g., 50). 
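
The iteration cap can be combined with a tolerance check in a loop over case-level weights, as in the following sketch. It is not the IHB macro; `rake_weights`, its arguments, and the example data are assumptions made for illustration, and the code assumes every control category has at least one sample case.

```python
import numpy as np

def rake_weights(weights, categories, controls, tol=1.0, max_iter=50):
    """Rake case-level weights to one-variable control totals.

    categories: {variable name: sequence of category labels, one per case}
    controls:   {variable name: {category label: control total}}
    Returns (raked weights, iterations used, converged flag).
    """
    w = np.asarray(weights, dtype=float).copy()
    cats = {v: np.asarray(c) for v, c in categories.items()}
    for iteration in range(1, max_iter + 1):
        # One iteration rakes on every variable in turn.
        for var, targets in controls.items():
            for level, total in targets.items():
                mask = cats[var] == level
                w[mask] *= total / w[mask].sum()
        # Largest gap between any weighted margin and its control total.
        worst = max(
            abs(w[cats[var] == level].sum() - total)
            for var, targets in controls.items()
            for level, total in targets.items()
        )
        if worst <= tol:
            return w, iteration, True
    return w, max_iter, False  # hit the cap without meeting the tolerance

# Hypothetical 8-case sample with equal initial weights.
sex = ["F", "F", "F", "M", "M", "M", "M", "M"]
age = ["young", "old", "young", "old", "young", "old", "young", "old"]
raked, n_iter, converged = rake_weights(
    np.ones(8),
    {"sex": sex, "age": age},
    {"sex": {"F": 50.0, "M": 50.0}, "age": {"young": 40.0, "old": 60.0}},
    tol=1e-6,
)
```

Returning the converged flag separately from the iteration count lets the caller distinguish a run that met the tolerance on the last allowed iteration from one that was simply cut off.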

Brick et al. (2003) also discuss problems with convergence. They point out that a large number of 

iterations indicates a raking application that is not “well-behaved” and that problems may exist with 

the resulting weights – highly variable weights inflate sampling variances and produce unstable 

domain estimates. One example of a problem is the use of raking variables that have a strong 

association (correlation). In this situation the number of iterations may be large, and convergence 

will not occur if there are inconsistencies between the associations in the sample and the control 

totals (Table 1 shows such an example). The log-linear models literature on structural zeros in 

contingency tables is directly related to this issue. For example, if one rakes on Food Stamps 

eligibility and a poverty status variable, the cross-tabulation of these two variables in the sample 

will likely result in one or more cells that must be empty by definition. 

One simple definition of convergence requires that each marginal total of the raked weights be 

within a specified tolerance of the corresponding control total. As noted above, in practice, when a 

number of raking variables are involved, one must check for the possibility that the iterations do 

not converge (e.g., because of sparseness or some other feature in the full cross-classification of the 

sample). As already noted, one can guard against this possibility by setting an upper limit on the 

number of iterations. As elsewhere in data analysis, it is sensible to examine the sample (including 

its joint distribution with respect to all the raking variables) before doing any raking. For example, 

if the sample contains no cases in a category of one of the raking variables, it will be necessary to 

revise the set of categories and their control totals (say, by combining categories). The authors 

recommend, at a minimum, checking the unweighted percentage of sample cases and the 

percentage of control cases in each category of each raking variable. Small categories in the 

sample or in the control totals (say under 5%) are potential candidates for collapsing. This step 

will reduce the chance of creating very unequal weights in raking. Category collapsing always 

needs to be done carefully, and in some instances it may be important to retain a small category in 

the raking. 
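
The recommended pre-raking check of category sizes might look like the following sketch. The function name `small_categories`, the 5% default, and the data are illustrative assumptions; the check only flags candidates and leaves the collapsing decision to the analyst.

```python
from collections import Counter

def small_categories(sample_values, control_totals, threshold=0.05):
    """Flag categories whose unweighted sample share or control share is small.

    Returns {category: (sample share, control share)} for flagged categories.
    """
    counts = Counter(sample_values)
    n = len(sample_values)
    total = sum(control_totals.values())
    flagged = {}
    for level, control in control_totals.items():
        sample_share = counts.get(level, 0) / n
        control_share = control / total
        if sample_share < threshold or control_share < threshold:
            flagged[level] = (sample_share, control_share)
    return flagged

# Hypothetical raking variable with three categories.
sample = ["A"] * 60 + ["B"] * 37 + ["C"] * 3
controls = {"A": 5500.0, "B": 4000.0, "C": 500.0}
flagged = small_categories(sample, controls)  # only "C" falls under 5% of the sample
```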

4. The IHB Raking Macro 

The IHB SAS macro produces diagnostic output that contains the following information: number 

of iterations, name of variable currently being raked on, name of BY-variable if there is one, and 

marginal control total and calculated total weight for each level of the current raking variable, 

along with their difference and percentage difference. At termination, the macro gives the iteration 

number at which termination occurred and the reason, which is either that the tolerance has been 

met or that the process did not converge. The macro also writes diagnostics into the SAS LOG, 

from several of the checks that it makes. 

Table 2 illustrates the use of the macro with an example involving two raking variables (called 

VARIABLE1 and VARIABLE2 in Table 2) and a BY-variable, AREA, which has two levels. The 

marginal percentage and general control total for each level of the BY-variable are obtained outside 

the example, from PROC FREQ. Preliminary analyses of the data set showed that all categories of 

the raking variables represented in the marginal control data sets exist in the sample as well. Table 

2 shows the unweighted distribution of each variable. The actual raking uses the weights of the 

individual cases. With the convergence tolerance set to 1, the raking converged after 3 iterations 

for Area 1, and also after 3 iterations for Area 2. 

5. Sources of Control Totals 

The discussion of control totals refers to actual totals as opposed to percents. Surveys that use 

demographic and socioeconomic variables for raking must locate a source for the population 

control totals. An example of a source of true population control totals is the 2000 U.S. Census 

short-form data. The U.S. Census long-form variables, the 2000 U.S. Census 5-Percent Public Use 

Microdata Sample (PUMS) files, the Current Population Survey (CPS), U.S. Census Bureau 

population projections, the National Health Interview Survey, and private-sector sources such as 

Claritas are better viewed as estimated control totals, because they are based either on large samples or on 

projection methodologies. 

Control totals obtained from a large sample such as the CPS are subject to much smaller 

sampling variability and nonresponse bias, and may be subject to much lower noncoverage bias, 

than a survey sample. For state-specific control totals, say for persons aged 0-17 years, the CPS 

estimates will be subject to considerably larger sampling variability; thus they are useful for 

national control totals, but potentially less useful for stable state control totals. Combining two 

years of CPS data can reduce the sampling variability of the state control totals. For projection 

methods (e.g., age by sex by race mid-year population projections from the U.S. Census Bureau), 

the basic approach is to project information forward from 2000 for the non-censal years. Clearly, 

the farther one gets from 2000, the greater the likelihood that the projections will be off. This 

happened, for example, with the projection of the size of the Hispanic population for the years 

before the 2000 Census results came out. Eventually, the American Community Survey should 

provide a new source of information for non-censal years. 

It is important to make sure that control totals from different sources all add to the same population 

total. If not, the raking will not converge. For example, for a survey in the middle of 2003, one 

would use Census Bureau age, sex, and race projections of the civilian noninstitutionalized 

population for July 2003, and obtain control totals by household income from the March 2003 CPS. 

In this situation one would most likely need to ratio-adjust the CPS income control totals so that 

they summed to the Census projection control totals for July 2003. 
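
The ratio adjustment amounts to rescaling one set of control totals by a single factor so that it sums to the chosen population total. A minimal sketch, with a hypothetical helper `ratio_adjust` and made-up income totals:

```python
def ratio_adjust(control_totals, target_total):
    """Scale control totals proportionally so that they sum to target_total."""
    factor = target_total / sum(control_totals.values())
    return {level: total * factor for level, total in control_totals.items()}

# Hypothetical income control totals that sum to 98,000, adjusted to a
# hypothetical projection total of 100,000.
cps_income = {"low": 30_000.0, "middle": 50_000.0, "high": 18_000.0}
adjusted = ratio_adjust(cps_income, target_total=100_000.0)
```

The adjustment changes the level of the income totals but not their relative distribution.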

One must also consider how the variables are measured. A telephone survey may ask a single 

question to obtain household income. The source for the control totals, however, may have an 

income variable that is constructed from a series of questions about income from several sources 

(wages, cash-assistance programs, interest, dividends, etc.). One needs to consider carefully 

whether using income as a raking variable makes sense. If the sample is thought to substantially 

under-represent low-income persons, then raking on income may be preferred. If, on the other 

hand, there is concern that the survey is measuring income very differently from the source of the 

control totals, then consideration should be given to raking on a proxy variable such as educational 

attainment or even a dichotomous poverty-status variable. 

Control totals usually do not come with a “missing” category. The same variable in the survey 

may have a nontrivial percentage of cases that fall in a DK or Refused category. In this situation it 

may be possible to impute for item nonresponse in the survey before the raking takes place. When 

imputation is not feasible, the following procedure can be used to adjust the control totals. Run a 

weighted frequency distribution on the raking variable in order to determine the percentage of 

sample cases that have a missing value (e.g., 4.3%). Allocate 4.3% of the control total to a newly 

created missing category (e.g., 4.3% of 1,500,000 = 64,500). Reapportion the control totals in the 

other categories so that they add to the reduced control total (1,500,000 – 64,500 = 1,435,500). 

After raking, the weighted distribution of the sample will agree with the revised control totals and 

will reflect a 4.3% missing-data rate in weighted frequencies and tabulations. 
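
The arithmetic of this adjustment can be sketched as follows. The 4.3% rate and the 1,500,000 total come from the example above; the helper name `add_missing_category` and the split of the total into two categories are hypothetical.

```python
def add_missing_category(control_totals, missing_rate, label="missing"):
    """Carve a 'missing' category out of the control totals.

    The existing categories are scaled down proportionally so that the
    revised totals still sum to the original overall control total.
    """
    total = sum(control_totals.values())
    missing_total = missing_rate * total
    scale = (total - missing_total) / total
    revised = {level: value * scale for level, value in control_totals.items()}
    revised[label] = missing_total
    return revised

# Hypothetical two-category split of the 1,500,000 control total.
controls = {"cat1": 900_000.0, "cat2": 600_000.0}
revised = add_missing_category(controls, missing_rate=0.043)
```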

6. Trade-offs Related to Number of Margins and Numbers of Categories 

Some raking applications use margins for age, sex, and race, because it is relatively easy to obtain 

control totals for these variables. In other situations (especially in surveys with lower response or 

important noncoverage issues) one may need to rake on a considerably larger number of variables. 

This is feasible if control totals can be assembled. The authors have seen rakings that used well 

over ten variables. Raking on many variables will almost always require a large number of 

iterations. The authors have also seen rakings that used a smaller number of variables, but with 

fairly detailed categories. Again, a large number of iterations may be required. In both situations 

the cross-classification of the raking variables often yields an extremely large number of cells. For 

example, raking on 12 dichotomous variables yields 4,096 cells. Raking on five variables each 

containing six categories yields 7,776 cells. Many of these cells will contain no cases in the 

sample. Such cells, by definition, remain empty after raking. However, the two-variable, three-variable, 

and higher-order interactions in the sample are maintained in the raking to the marginal 

control totals. The small cell sizes increase the chance that the raked weights will exhibit 

considerable variability, because those weights are maintaining sample interactions that are quite 

unstable. 
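
The cell counts quoted above follow from multiplying the numbers of categories. As a quick check, with $C_v$ denoting the number of categories of raking variable $v$ and $V$ the number of raking variables (notation introduced here only for illustration):

```latex
\[
  \text{number of cells} = \prod_{v=1}^{V} C_v, \qquad
  2^{12} = 4{,}096, \qquad 6^{5} = 7{,}776 .
\]
```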

On top of the challenges of the numbers of variables and categories and the resulting number of 

underlying cells, large differences, before raking, between the weighted sample totals and the 

marginal control totals will generally increase the number of iterations. These issues point to the 

need to closely examine: 1) the variables selected for raking, 2) the number and size of the 

categories of those raking variables, and 3) the magnitude of differences between the weighted 

sample totals and the control totals. Ideal variables for raking are those related to the key survey 

outcome variables and related to nonresponse and/or noncoverage. Variables that do not meet 

these conditions are candidates for exclusion from raking when a large number of variables are 

being considered. The categories of each candidate raking variable should be examined to see 

whether they contain a small proportion of the sample cases (say, under 5%) or whether the control 

total percentage is small (also, say, under 5%). Such small categories should be considered for 

collapsing. Sometimes the small categories of a nominal categorical variable can be collapsed 

into a larger residual category. For ordinal variables, collapsing with an adjacent category is often 

the best approach. If one or more weighted sample totals differ by a large amount from the 

corresponding control totals, one should first try to determine the source of the difference. Is it 

extreme differential nonresponse, or has the variable in the sample been measured in a very 

different manner than the corresponding variable used to form the control total? One should 

consider whether it is appropriate to use such a variable in raking. 

7. Examining and Diagnosing Slow Convergence 

Sometimes the raking process does not converge in a specified number of iterations. As an aid to 

diagnosing such situations and taking appropriate action, the enhanced IHB raking macro 

incorporates a module that, in case of non-convergence, uses the data to predict the number of 

iterations needed for convergence. 

The prediction is based on an empirical observation that the logarithm of the magnitude of the 

difference between an adjusted weighted total and its control total declines linearly with the 

number of iterations. In the authors’ experience, this relation holds reasonably well when a slowly 

converging raking process approaches the specified number of iterations (50 in most applications). 

The enhanced macro extrapolates the last iteration slope and estimates the iteration at which the 

slowest converging variable will cross a given tolerance threshold. 
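
One way to read this extrapolation is sketched below. The macro's own code is not reproduced in this paper, so the function `predict_iterations`, its inputs, and the use of only the last two iterations to estimate the slope are assumptions made for illustration.

```python
import math

def predict_iterations(max_abs_diffs, tol):
    """Predict the iteration at which the difference falls to the tolerance.

    max_abs_diffs[i] is the largest |weighted total - control total| for the
    slowest variable after iteration i + 1 (all values assumed positive, and
    at least two iterations recorded). Returns None if the last slope is not
    negative, since extrapolation is then meaningless.
    """
    n = len(max_abs_diffs)
    last = math.log10(max_abs_diffs[-1])
    prev = math.log10(max_abs_diffs[-2])
    slope = last - prev  # change in log10(difference) per iteration
    if slope >= 0:
        return None
    extra = (math.log10(tol) - last) / slope
    return n + max(0, math.ceil(extra))

# Hypothetical run: the difference shrinks by 10% per iteration for 50 iterations.
diffs = [1000.0 * 0.9 ** i for i in range(1, 51)]
predicted = predict_iterations(diffs, tol=1.0)
```

In this made-up series the log of the difference is exactly linear, so the prediction matches the first iteration at which the difference drops below the tolerance; real raking runs follow the linear pattern only approximately.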

One usually considers a raking process to be “converging slowly” if either it does not converge in a 

specified number of iterations or convergence takes substantially more iterations than usual. In the 

authors’ work, convergence usually takes place in 5 to 20 iterations. However, when the number of 

raking variables is large (say, more than 8) and some of the raking variables have numerous levels 

(the variable State with 51 categories, for instance), the process may take much longer to converge 

or may even not converge in an initially set number of iterations. The statistician then has two options for 

proceeding with raking. The first is to use the predicted number of iterations from the 

diagnostics to rerake the sample, trying to achieve complete convergence. This option is illustrated 

later. However, the predicted number of iterations may be impractically large. Then, as a second 

option, one may attempt to preprocess the sample data. 

A common strategy collapses categories of slowly converging variables. If, for instance, State is 

such a variable (with a value for each U.S. state and D.C.), it could be collapsed into, say, Census 

Division (9 levels) or even Census Region (4 levels). Of course, the statistician may not always 

have flexibility in collapsing. He/she may be required to rake by the original variables, or the 

“slow” variables may already be dichotomous. But if there is some flexibility in the statistical 

weighting methods, the authors recommend trying collapsing to accelerate convergence. 

How does one determine which raking variables are “slow”? The most effective way to examine a 

convergence process is to draw graphs. Figure 1 displays a plot of a slow raking process involving 

12 variables; the x-axis is the iteration number, and the y-axis is $\log_{10}$ of the maximum (taken over 

all categories of a given raking variable) of the absolute value of the difference between the 

adjusted weighted total and the control total. The reference line indicates the tolerance level, in this 

example $\log_{10}(1) = 0$. One can easily construct this kind of graph using standard SAS/GRAPH 

facilities. 
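
The plotted quantity for one raking variable at one iteration can be computed as in this sketch. The function name `log10_max_abs_diff` and the totals are hypothetical; plotting is left to whatever graphics tool is at hand.

```python
import math

def log10_max_abs_diff(weighted_totals, control_totals):
    """log10 of the largest |weighted total - control total| over categories.

    Both arguments are {category: total} for a single raking variable.
    Returns -inf when every category matches its control total exactly.
    """
    worst = max(abs(weighted_totals[c] - control_totals[c]) for c in control_totals)
    return math.log10(worst) if worst > 0 else float("-inf")

# Hypothetical dichotomous variable after some iteration.
value = log10_max_abs_diff({"1": 480.0, "2": 520.0}, {"1": 500.0, "2": 500.0})
# A positive value lies above the reference line log10(1) = 0, i.e. not yet converged.
```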

From the graph, one can easily single out the four slowest converging variables (their traces cluster 

distinctly higher): EEE, JJJ, GGG, and AAA. The variables GGG and AAA are dichotomous, so it 

is not possible to collapse them. To explore how categories of the variables EEE and JJJ (which 

are ordinal) converge and which of them might be collapsed, similar graphs show the individual 

categories of those two variables (Figure 2). 

Besides visual exploration of convergence of slow categories, one should apply common sense 

when combining them. For ordinal variables, for instance, it would be logical to combine adjacent 

categories. Taking the meaning of the values of EEE and JJJ into account, in addition to the graphs in 

Figure 2, the collapsing combined Categories 1 and 2, and Categories 4 and 5, for both variables 

(keeping Category 3 separate). Correspondingly, the respective marginal totals were combined, 

after which the raking was rerun and new convergence graphs were constructed for those two 

collapsed variables (Figure 3). Because convergence of EEE and JJJ looked promising, a new 

overall convergence graph was constructed for all 12 raking variables (Figure 4). Comparing this 

graph with Figure 1, one can see that collapsing did play a dramatic role in speeding convergence. 

The raking process now converges in 17 iterations. 
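
Collapsing has to be applied consistently to the sample variable and to its control totals. The sketch below mirrors the 1-2 / 3 / 4-5 grouping described above; the helper names, sample values, and control totals are hypothetical.

```python
def collapse(values, mapping):
    """Recode each case's category; unmapped categories are left unchanged."""
    return [mapping.get(v, v) for v in values]

def collapse_controls(control_totals, mapping):
    """Sum the control totals of categories that are merged together."""
    collapsed = {}
    for level, total in control_totals.items():
        new_level = mapping.get(level, level)
        collapsed[new_level] = collapsed.get(new_level, 0.0) + total
    return collapsed

# Combine Categories 1 and 2, and Categories 4 and 5; Category 3 stays separate.
mapping = {1: "1-2", 2: "1-2", 4: "4-5", 5: "4-5"}
sample_eee = [1, 2, 3, 4, 5, 3, 2]                                   # hypothetical cases
controls_eee = {1: 100.0, 2: 200.0, 3: 300.0, 4: 250.0, 5: 150.0}    # hypothetical totals

new_sample = collapse(sample_eee, mapping)
new_controls = collapse_controls(controls_eee, mapping)
```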

As already noted, the statistician may not always have the flexibility to collapse categories, or 

he/she may still want to achieve convergence without altering the raking variables, i.e., using as 

many iterations as required. But how many are required? The enhanced macro calculates a 

predicted number of iterations needed for full convergence. The graph in Figure 5 demonstrates a 

two-variable raking process that initially did not converge in the default 50 iterations (vertical 

reference line) and predicted 65 as the needed number. When rerun, the raking did converge at 

exactly the 65th iteration. In a fairly rare situation, rerunning the raking with the predicted number 

of iterations could give non-convergence again, with a new and much larger number of predicted 

iterations. If this occurs, it makes sense to thoroughly examine sample and population data and 

make appropriate changes. 

8. Inclusion of Two-Variable Raking Margins 

Raking can be viewed as analogous to fitting a main-effects-only model. Because of sample size 

limitations and/or availability of only one-variable (factor or dimension) control totals, many 

raking applications follow this approach. In some situations it may be important to fit a two-variable 

interaction to the data. For example, suppose one is planning to rake on Variables A, B, C, and D. 

However, control totals for Variable C crossed with Variable D are available and exhibit a strong 

interaction (e.g., persons aged 0-17 years are more likely to be Hispanic than persons aged 65+ 

years). If the cell counts in the C x D margin of the sample are large enough to support fitting a C 

x D interaction, one would rake on three margins: A, B, and C x D. It is not necessary also to rake 

on separate margins for Variables C and D. If, however, the C x D raking margin involved 

collapsing, one could consider adding one-variable margins to the raking for Variables C and D 

without any collapsing of their categories. 

9. Forming Control Totals for Quantity Variables 

In a specialized raking situation one is planning on raking a sample of persons on some categorical 

variables (e.g., age, sex, and race), but the source of the control totals also has a quantity variable 

related to, say, the total number of glasses of milk consumed in a week. The survey has also 

measured this same quantity variable; but the survey response rate is, let us assume, only 50%. 

One may want to ensure that the weighted total number of glasses of milk consumed per week from 

the sample agrees closely with the control total. This can be accomplished by dividing the sample 

into groups; each group will have a mean number of glasses of milk consumed in a week and a sum 

of weights. In the raking process one can modify the sum of the weights in each group so that the 

sum of the weights times the mean, summed over all the groups, adds to the control value of total 

glasses of milk consumed in a week. In the simplest application one can divide the sample into two 

groups: below versus above the median number of glasses of milk consumed in a week based on 

the control total data. For each group one can use the control data to obtain the total number of 

glasses of milk consumed in a week. This two-category margin is then added to the raking. 

Convergence may not occur, making it necessary to shift the group boundary point away from the

median in order to achieve convergence. Once convergence is achieved, the weighted total number

of glasses of milk consumed in a week will be in close agreement with the control total value. This 

procedure may be extended to modify not only the total over the entire sample but also the totals

for various subpopulations.

10. Raking at the State Level in a Large National Survey 

Some large surveys stratify by state and are designed to yield state estimates. The resulting total 

national sample is usually very large. The survey statisticians seek to provide national estimates as 

well as state estimates. Often one sets up raking control totals at the state level and carries out 51 

individual rakings. Assume those rakings use Variables A, B, and C; but the number of categories 

of each variable is limited because of the state sample sizes. For example, one might collapse 

Variables A, B, and C differently by state. If Variable A were race/ethnicity, one might be able to 

use Hispanic as a separate race/ethnicity category in California, but not in Vermont because of the 

small sample size. After the 51 rakings one might compare the weighted distributions of Variables A,

B, and C with national control totals and observe some differences that are caused by the state-level 

collapsing of categories. If having precise weighted distributions at the national level is important 

for analytic or “face validity” reasons, one can use the IHB raking macro in the following manner. 

Set up a single raking that includes margins for State x A, State x B, and State x C (i.e., combine 

the 51 individual state rakings into a single raking). Then add detailed national margins for 

Variables A, B, and C. Another, similar example would involve adding Variable D as a national 

raking margin because its control total is available only at the national level (e.g., household 

income). This strategy needs to be implemented carefully. Checks should be made for raking 

variables that contain small sample sizes. The coefficient of variation of the weights prior to raking 


and after raking should be examined in each state to check for large increases in the variability of 

the weights. Finally, the raking diagnostics discussed above should be used if convergence 

problems arise. 
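
The check on the variability of the weights could be sketched as follows. This is only an illustration: the function names and the toy data are assumptions, and the coefficient of variation is computed here with the population form of the standard deviation, which the text does not specify.

```python
import math
from collections import defaultdict

def weight_cv(weights):
    """Coefficient of variation of the weights (population SD divided by mean)."""
    n = len(weights)
    mean = sum(weights) / n
    variance = sum((w - mean) ** 2 for w in weights) / n
    return math.sqrt(variance) / mean

def cv_by_state(states, weights):
    """Coefficient of variation of the weights within each state."""
    groups = defaultdict(list)
    for state, w in zip(states, weights):
        groups[state].append(w)
    return {state: weight_cv(ws) for state, ws in groups.items()}

# Hypothetical weights before and after raking for four cases in two states.
states = ["CA", "CA", "VT", "VT"]
before = cv_by_state(states, [10.0, 10.0, 5.0, 5.0])
after = cv_by_state(states, [8.0, 12.0, 5.0, 5.0])
for state in before:
    print(state, round(before[state], 3), round(after[state], 3))
```

States in which the second figure is much larger than the first would be candidates for closer review of the collapsing or of the national margins.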

11. Maintaining Prior Nonresponse and Noncoverage Adjustments in the Final Weights

Frankel et al. (2003) have discussed methods based on data on interruptions in telephone service 

(of a week or longer in the past 12 months) to compensate for the exclusion of persons in 

nontelephone households in random-digit-dialing surveys. One typically adjusts the base sampling 

weights of persons with versus without an interruption in telephone service. The resulting 

interruption-based weight adjusts for the noncoverage of nontelephone households. If one then 

rakes the sample on age, sex, and race, the impact of the nontelephone adjustment may be diluted 

somewhat, even though the raking starts with the interruption-based weight. In that case it generally

makes sense to create weighted control totals (using the interruption-based weight) from the sample 

for persons residing in households with versus without an interruption in telephone service. These 

weighted control totals should be ratio-adjusted so that they have the same sum as the age, sex, and 

race control totals. For example, if the age, sex, and race margins sum to 180,000,000 persons, 

then the interruption margin needs to be adjusted so that it also sums to 180,000,000. The raking 

would use the four variables instead of just three and would ensure that the nontelephone 

adjustment is fully reflected in the final weights. This would be appropriate where the

interruption-in-telephone-service category could be small (e.g., in states where telephone coverage is very

high), but one still wants to maintain that small category in the raking. 
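
The ratio adjustment of the interruption margin can be illustrated with a short Python sketch. The 180,000,000 target comes from the example above; the two weighted totals from the sample and the function name are hypothetical.

```python
def ratio_adjust(control_totals, target_sum):
    """Scale a set of control totals so that they add to target_sum."""
    factor = target_sum / sum(control_totals.values())
    return {category: total * factor for category, total in control_totals.items()}

# Hypothetical weighted totals from the sample (interruption-based weight).
interruption_margin = {"interruption": 7_500_000.0, "no interruption": 142_500_000.0}

# The age, sex, and race margins sum to 180,000,000 persons in the example above.
adjusted = ratio_adjust(interruption_margin, 180_000_000.0)
print(adjusted)
```

The adjustment preserves the relative sizes of the two categories, so the share of persons with an interruption in telephone service implied by the interruption-based weight is carried into the raking.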


12. Raking Surveys that Screen for a Specific Target Population 

A common survey model for obtaining interviews with a specific target population is to screen a 

sample of households for the presence of members of the target population. An example would be 

children with special health care needs. The screening interview collects a roster of children with, 

say, their age, sex, and race, and determines whether each child has special health care needs. If 

the household contains one child with special health care needs, a detailed interview is conducted 

for that child. If the household has two or more such children, one is selected at random for the 

detailed interview. Of course, the interview response rate will be less than 100%, because some 

parents will not agree to do the detailed interview. 

Assume that the survey statisticians need to look at the prevalence of children with special health 

care needs, and they will also be analyzing the detailed interview data. In this situation one would 

calculate the usual base sampling weights, make adjustments for unit nonresponse and possibly 

make a noncoverage adjustment if warranted. One first obtains control totals for age, sex, and race 

in the U.S. population aged 0-17 years. One then rakes the entire sample of children in the 

screened households to those control totals, because that sample is a sample of children aged 0-17 

in the U.S. The resulting screener weights can then be used to estimate the prevalence of children 

with special health care needs in the U.S. 

That screener weight would typically serve as the input weight in the calculation of weights for the 

children with completed detailed interviews. As part of that calculation process one also seeks to 

weight the detailed-interview sample by age, sex, and race. Of course, control totals are unlikely to 

be available for children with special health care needs. One can, however, use the screener weight 


and the sample of children with special health care needs identified in the screened households to 

form weighted control totals for age, sex, and race and then use those in raking the

detailed-interview weights. This method ensures that the survey analysts do not ask why the age

distribution of children with special health care needs from the screener sample does not agree 

exactly with the distribution in the detailed interview data. Some caution needs to be exercised in 

using this approach when the screener shows survey evidence of false positives. 
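
Forming the weighted control totals from the screener sample amounts to summing the screener weights within each category. A minimal sketch, with hypothetical age groups, weights, and function name:

```python
from collections import defaultdict

def weighted_totals(categories, weights):
    """Sum the weights within each category to form weighted control totals."""
    totals = defaultdict(float)
    for category, w in zip(categories, weights):
        totals[category] += w
    return dict(totals)

# Hypothetical children identified in the screener as having special health
# care needs, with their age group and screener weight.
age_group = ["0-5", "6-11", "0-5", "12-17"]
screener_weight = [100.0, 250.0, 50.0, 300.0]

age_controls = weighted_totals(age_group, screener_weight)
print(age_controls)
```

The same function would be applied to sex and to race, and the resulting totals would serve as the margins when raking the detailed-interview weights.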

13. Raking to Control Totals Expressed as Percentages and Raking with No “Input” Weight 

Frequently, the user working with a weighted or an unweighted sample needs to weight it to fit 

marginal population proportions. As an example (Table 3), the authors created an 11-case sample 

data set that contains two variables: VAR1, which takes values 1, 2, and 3 with frequencies 

27.27%, 45.45% and 27.27%, respectively; and VAR2, which takes values 1 and 2 with 

frequencies 45.45% and 54.55%, respectively. The objective was to weight this sample so that the 

distributions of VAR1 and VAR2 met the population distributions, (20%, 35%, 45%) and (60%,

40%) respectively, within a tolerance of 0.001%.
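
The example can be reproduced approximately with a few lines of Python. This is a sketch of ordinary raking (iterative proportional fitting), not the IHB macro itself: the function names are illustrative, and the convergence check, applied after each full pass, may not match the macro's, so the iteration count can differ from the one in Table 3. The joint layout of the 11 cases is not listed in the text; the cell counts used here are inferred from the two margins together with the first-iteration VAR2 margins of 42.667 and 57.333 reported in Table 3.

```python
def weighted_percents(cases, weights, var):
    """Weighted percent distribution of `var` over its observed categories."""
    total = sum(weights)
    out = {}
    for case, w in zip(cases, weights):
        out[case[var]] = out.get(case[var], 0.0) + 100.0 * w / total
    return out

def adjust(cases, weights, var, control):
    """One raking step: scale the weights so the margin of `var` equals `control`."""
    margin = {}
    for case, w in zip(cases, weights):
        margin[case[var]] = margin.get(case[var], 0.0) + w
    return [w * control[case[var]] / margin[case[var]]
            for case, w in zip(cases, weights)]

def rake(cases, targets, tol=0.001, max_iter=100):
    """Rake to percentage controls, starting every case at a weight of 1."""
    weights = [1.0] * len(cases)
    for iteration in range(1, max_iter + 1):
        for var, control in targets.items():
            weights = adjust(cases, weights, var, control)
        worst = max(abs(weighted_percents(cases, weights, var)[cat] - pct)
                    for var, control in targets.items()
                    for cat, pct in control.items())
        if worst < tol:
            return weights, iteration
    raise RuntimeError("raking did not converge")

# Inferred cell counts (VAR1, VAR2): 3 + 5 + 3 = 11 cases in total.
cells = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 2, (3, 1): 1, (3, 2): 2}
cases = [{"VAR1": a, "VAR2": b} for (a, b), n in cells.items() for _ in range(n)]
targets = {"VAR1": {1: 20.0, 2: 35.0, 3: 45.0}, "VAR2": {1: 60.0, 2: 40.0}}

weights, iterations = rake(cases, targets)
print(iterations, weighted_percents(cases, weights, "VAR1"))
```

Because the control totals are percentages that sum to 100, the final weights also sum to 100; they can be rescaled afterward to any desired total without disturbing the margins.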

14. Weight Trimming and Raking 

Weight trimming refers to truncation of high or extreme weight values in order to reduce their 

impact on the variance of the estimates, especially for subgroup estimates. One consequence of the 

truncation of high weight values is that the weights of the entire sample will not add to the 

population size. Although weight trimming is a separate topic from raking, the two are certainly

related in the sense that weight trimming typically takes place at the last step in the calculations, 

which is often raking. Many large surveys use weight trimming (Srinath 2003, Abt Associates 

memorandum). Its objective is to reduce the mean squared error of the key outcome estimates. By 


trimming high weight values one generally lowers sampling variability but may incur some bias. 

The MSE will be lower if the reduction in variance is large relative to the increase in bias arising 

from weight trimming. There are no established rules for weight trimming; rather, most people use

a general set of guidelines. Some common truncation points are: 1) the median weight plus five or

six times the interquartile range (IQR) of the weights, 2) five times the mean weight, and 3) the 95th

percentile of the weights.

How can weight trimming be incorporated in raking? The IHB SAS macro can be used for weight 

trimming in the following steps (using as an example the median weight plus six times the IQR as 

the truncation point):¹

1. Prior to raking i, where i references the number of times the raking is run, examine the 

distribution of the raking “input” weight and calculate the median weight plus six times the 

interquartile range (IQR) of the weights.

2. Truncate values of the input weight that are above the median weight plus six times the 

IQR plus one to the median weight plus six times the IQR (values at or below the median 

weight plus six times the IQR plus one are not altered). 

3. Using the truncated input weight, run the raking to obtain raking weight i. 

4. Repeat Steps 1 to 3 (i.e., run the raking a second time, third time, etc.) until there are no 

weights that are above the median weight plus six times the IQR plus one. 

¹ A somewhat more sophisticated, but computer-intensive, procedure is to apply bounds to the weights as the

raking is taking place.

Although the cutoff value equals the median weight plus six times the IQR, weights that exceed the

median weight plus six times the IQR plus one are truncated to the median weight plus six times

the IQR, because the raking may increase the weight values of the cases that have been truncated,

and thus cause the raking steps to repeat endlessly. The approach described above does not 

guarantee convergence (i.e., after running the raking several times there could still be weights 

above the median weight plus six times the IQR plus one), and one could consider adding a larger 

constant to increase the chances of convergence, but the authors have found in their applications 

that convergence is often achieved by adding a constant of one. Table 4 shows an example of the 

use of weight trimming with raking. Before raking there are four cases with “input” weights that 

exceed the median weight plus six times the IQR plus one of 439.847 (condition). The weights of 

those cases are truncated to 438.847 (cutoff) and the raking is run for the first time. After the first 

raking the condition equals 444.490. Only one case has a weight that exceeds this value and that 

weight is truncated to the cutoff of 443.490. After the second raking no cases have a weight that 

exceeds the condition and the process is stopped. The weights from the second raking add to the 

population size and meet the raking control totals. 
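
The four steps can be sketched in Python as a loop around a raking routine. This is an illustration under stated assumptions, not the IHB macro: the function names are hypothetical, the quartiles use NumPy's default interpolation (which may differ from the quantile definition used in SAS), and the `toy_rake` stand-in only rescales the weights to a population total, whereas a real application would pass in a full raking on all margins.

```python
import numpy as np

def truncation_points(weights, k=6.0, slack=1.0):
    """Return (cutoff, condition) = (median + k*IQR, median + k*IQR + slack)."""
    q1, median, q3 = np.percentile(weights, [25, 50, 75])
    cutoff = median + k * (q3 - q1)
    return cutoff, cutoff + slack

def trim_and_rerake(weights, rake, k=6.0, slack=1.0, max_rakings=20):
    """Truncate weights above the condition to the cutoff, rake, and repeat.

    `rake` is any function that takes an array of input weights and returns
    the raked weights. Returns the final weights and the number of rakings.
    """
    w = np.asarray(weights, dtype=float)
    for i in range(1, max_rakings + 1):
        cutoff, condition = truncation_points(w, k, slack)   # Step 1
        w = rake(np.where(w > condition, cutoff, w))         # Steps 2 and 3
        _, condition = truncation_points(w, k, slack)        # Step 4
        if not np.any(w > condition):
            return w, i
    raise RuntimeError("weights still exceed the condition after max_rakings rakings")

POPULATION_TOTAL = 1045.0  # hypothetical control total

def toy_rake(w):
    """Stand-in for a raking: rescale the weights to the population total."""
    return w * POPULATION_TOTAL / w.sum()

input_weights = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]  # hypothetical input weights
final_weights, rakings = trim_and_rerake(input_weights, toy_rake)
print(rakings, final_weights.max(), truncation_points(final_weights))
```

In this toy run the truncated case ends up exactly at the new cutoff after rescaling, so it passes only because the condition is one unit above the cutoff; with a real raking the loop may need several passes and, as noted above, is not guaranteed to stop.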

15. Summary 

The authors have sought to give some background on how raking works and to discuss the 

convergence process. They have also sought to give some warnings of conditions that need to be 

checked before and after raking. Brick et al. (2003) discuss other examples of issues that one 

should be aware of when using raking. The IHB SAS macro discussed in this paper is available for 

free from the first author. 


References

Bishop YMM, Fienberg SE, and Holland PW. (1975). Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: MIT Press.

Brick JM, Montaquila J, and Roth S. (2003). Identifying Problems with Raking Estimators. 2003 Proceedings of the Annual Meeting of the American Statistical Association [CD-ROM], Alexandria, VA: American Statistical Association, pp. 710-717.

Deming WE. (1943). Statistical Adjustment of Data. New York: Wiley.

Frankel MR, Srinath KP, Hoaglin DC, Battaglia MP, Smith PJ, Wright RA, and Khare M. (2003). Adjustments for non-telephone bias in random-digit-dialling surveys. Statistics in Medicine, Volume 22, pp. 1611-1626.

Izrael D, Hoaglin DC, and Battaglia MP. (2000). A SAS Macro for Balancing a Weighted Sample. Proceedings of the Twenty-Fifth Annual SAS Users Group International Conference, Cary, NC: SAS Institute Inc., pp. 1350-1355.

Izrael D, Hoaglin DC, and Battaglia MP. (2004). To Rake or Not To Rake Is Not the Question Anymore with the Enhanced Raking Macro. May 2004 SUGI Conference, Montreal, Canada.

Kalton G. (1983). Compensating for Missing Survey Data. Survey Research Center, Institute for Social Research, University of Michigan.

Oh HL, and Scheuren F. (1978). Some Unresolved Application Issues in Raking Ratio Estimation. 1978 Proceedings of the Section on Survey Research Methods, Washington, DC: American Statistical Association, pp. 723-728.

Table 1. A 2 x 2 Table for Which Raking Cannot Produce Agreement with the Control Totals

                                Variable 1         Marginal
Variable 2                      1        2         Control Total
1                               20       0         70
2                               0        10        30
Marginal Control Total          50       50        100

Table 2. Example of Raking Using the IHB SAS Macro

Raking AREA - 1 VARIABLE1, iteration - 1

             Calculated    Control                   Calculated   Control    Difference
VARIABLE1    margin        Total       Difference    %            %          in %
1            15915.87      22154.39    6238.52       35.486       35.278      0.209
2            10912.05      16533.88    5621.83       24.330       26.328     -1.998
3            18022.90      24112.03    6089.13       40.184       38.395      1.789
             ==========    ========                  ==========   ========
             44850.82      62800.30                  100.00       100.00

Raking AREA - 1, second raking variable, iteration - 1

1            32684.74      30697.33    -1987.40      52.046       48.881      3.165
2            30115.56      32102.97    1987.40       47.954       51.119     -3.165
             ==========    ========                  ==========   ========
             62800.30      62800.30                  100.00       100.00

Raking AREA - 1 VARIABLE1, iteration - 2

1            22102.81      22154.39    51.586        35.195       35.278     -0.082
2            16442.32      16533.88    91.553        26.182       26.328     -0.146
3            24255.17      24112.03    -143.139      38.623       38.395      0.228
             ==========    ========                  ==========   ========
             62800.30      62800.30                  100.00       100.00

Raking AREA - 1, second raking variable, iteration - 2

1            30708.98      30697.33    -11.6455      48.899       48.881      0.019
2            32091.32      32102.97    11.6455       51.101       51.119     -0.019
             ==========    ========                  ==========   ========
             62800.30      62800.30                  100.00       100.00

Raking AREA - 1 VARIABLE1, iteration - 3

1            22154.09      22154.39    0.29992       35.277       35.278     -0.000
2            16533.34      16533.88    0.53717       26.327       26.328     -0.001
3            24112.87      24112.03    -0.83710      38.396       38.395      0.001
             ==========    ========                  ==========   ========
             62800.30      62800.30                  100.00       100.00

Raking AREA - 1, second raking variable, iteration - 3

1            30697.40      30697.33    -0.068148     48.881       48.881      0.000
2            32102.90      32102.97    0.068148      51.119       51.119     -0.000
             ==========    ========                  ==========   ========
             62800.30      62800.30                  100.00       100.00

**** Program for AREA 1 terminated at iteration 3 because all calculated margins differ from Marginal Control Totals by less than 1

Raking AREA - 2 VARIABLE1, iteration - 1

1            31377.80      38598.04    7220.24       41.734       37.292      4.441
2            17512.57      29596.11    12083.54      23.292       28.595     -5.303
3            26295.48      35307.30    9011.82       34.974       34.113      0.861
             ==========    =========                 ==========   ========
             75185.84      103501.44                 100.00       100.00

Raking AREA - 2, second raking variable, iteration - 1

1            51930.05      51902.14    -27.9123      50.173       50.146      0.027
2            51571.39      51599.30    27.9123       49.827       49.854     -0.027
             ==========    =========                 ==========   ========
             103501.44     103501.44                 100.00       100.00

Raking AREA - 2 VARIABLE1, iteration - 2

1            38596.66      38598.04    1.37510       37.291       37.292     -0.001
2            29599.80      29596.11    -3.69114      28.598       28.595      0.004
3            35304.98      35307.30    2.31605       34.111       34.113     -0.002
             ==========    =========                 ==========   ========
             103501.44     103501.44                 100.00       100.00

Raking AREA - 2, second raking variable, iteration - 2

1            51902.75      51902.14    -0.61296      50.147       50.146      0.001
2            51598.69      51599.30    0.61296       49.853       49.854     -0.001
             ==========    =========                 ==========   ========
             103501.44     103501.44                 100.00       100.00

Raking AREA - 2 VARIABLE1, iteration - 3

1            38598.01      38598.04    0.030193      37.292       37.292     -0.000
2            29596.19      29596.11    -0.081052     28.595       28.595      0.000
3            35307.25      35307.30    0.050859      34.113       34.113     -0.000
             ==========    =========                 ==========   ========
             103501.44     103501.44                 100.00       100.00

Raking AREA - 2, second raking variable, iteration - 3

1            51902.15      51902.14    -0.013460     50.146       50.146      0.000
2            51599.29      51599.30    0.013460      49.854       49.854     -0.000
             ==========    =========                 ==========   ========
             103501.44     103501.44                 100.00       100.00

**** Program for AREA 2 terminated at iteration 3 because all calculated margins differ from Marginal Control Totals by less than 1

Figure 1. Convergence of a Raking Process Involving 12 Variables

[Line plot with one series per raking variable (AAA, BBB, CCC, DDD, EEE, FFF, GGG, HHH, III, JJJ, KKK, LLL); horizontal axis from 0 to 51, vertical axis from -2 to 6.]

Figure 2. Convergence of Variables EEE and JJJ before Collapsing

[Two line plots, one panel for Variable EEE and one for Variable JJJ, each with one series per category (1, 2, 3, 4, 5); horizontal axis from 0 to 50.]

Figure 3. Convergence of Variables EEE and JJJ after Collapsing

[Two line plots, one panel for Variable EEE and one for Variable JJJ, each with one series per category (1, 3, 5); horizontal axis from 0 to 50. Note on each panel: Category 2 and 4 collapsed into 1 and 5 respectively.]

Figure 4. Convergence of All 12 Variables in the Raking Process after Collapsing Variables EEE and JJJ

[Line plot with one series per raking variable (AAA, BBB, CCC, DDD, EEE, FFF, GGG, HHH, III, JJJ, KKK, LLL); horizontal axis from 0 to 51, vertical axis from -3 to 6.]

Figure 5. Prediction of the Number of Iterations Needed for Convergence

[Line plot with one series each for variables AAA and BBB; horizontal axis from 0 to 66, vertical axis from -1 to 6. Annotation: predicted number of iterations for convergence - 65.]

Table 3. Raking Using Marginal Percentage Controls and No “Input” Weight (first and last iteration shown).

The FREQ Procedure

                                Cumulative    Cumulative
VAR1    Frequency    Percent    Frequency     Percent
---------------------------------------------------------
1       3            27.27      3             27.27
2       5            45.45      8             72.73
3       3            27.27      11            100.00

                                Cumulative    Cumulative
VAR2    Frequency    Percent    Frequency     Percent
---------------------------------------------------------
1       5            45.45      5             45.45
2       6            54.55      11            100.00

Raking VAR1, iteration - 1

        Calculated    Control                    Calculated   Control    Difference
VAR1    margin        Total      Difference      %            %          in %
1       3             20         17              27.273       20.000       7.273
2       5             35         30              45.455       35.000      10.455
3       3             45         42              27.273       45.000     -17.727
        ==========    ========                   ==========   ========
        11            100                        100.00       100.00

Raking VAR2, iteration - 1

1       42.667        60         17.3333         42.667       60.000     -17.333
2       57.333        40         -17.3333        57.333       40.000      17.333
        ==========    ========                   ==========   ========
        100.000       100                        100.00       100.00

Raking VAR1, iteration - 5

1       20.000        20         0.000256716     20.000       20.000      -0.000
2       35.001        35         -.000834329     35.001       35.000       0.001
3       44.999        45         0.000577612     44.999       45.000      -0.001
        ==========    ========                   ==========   ========
        100.000       100                        100.00       100.00

Raking VAR2, iteration - 5

1       60.000        60         0.000205597     60.000       60.000      -0.000
2       40.000        40         -.000205597     40.000       40.000       0.000
        ==========    ========                   ==========   ========
        100.000       100                        100.00       100.00

**** Program terminated at iteration 5 because all Calculated Percents differ from Marginal Percents by less than 0.001

Table 4. Example of Weight Trimming During Raking

OBSERVATIONS IN ORIGINAL DATASET TO BE TRUNCATED
CUTOFF: MEDIAN+6*IQR
CONDITION: MEDIAN+6*IQR +1

  id    weight_to_truncate    mean       median     IQR        cutoff     condition
 715    477.576               144.250    132.491    51.0592    438.847    439.847
 651    509.018               144.250    132.491    51.0592    438.847    439.847
1085    690.762               144.250    132.491    51.0592    438.847    439.847
 770    515.720               144.250    132.491    51.0592    438.847    439.847

OBSERVATIONS TO BE TRUNCATED AFTER ITERATION = 1
CONDITION: MEDIAN+6*IQR + 1

  id    truncated_weight      mean       median     IQR        cutoff     condition
1085    451.059               144.250    133.108    51.7302    443.490    444.490

OBSERVATIONS TO BE TRUNCATED AFTER ITERATION = 2
CONDITION: MEDIAN+6*IQR + 1

THERE ARE NO WEIGHTS TO TRUNCATE

RERAKING-TRUNCATION PROCESS CONVERGED IN 2 ITERATIONS WITH CONDITION MEDIAN+6*IQR+1.
