REPORT - Search CIMMYT repository
performing a conventional analysis of variance
of the drought indices and testing the significance
of treatment and block effects. A high
association between drought indices and treatments
was taken to mean that the definition
of a given treatment should include not only
the direct effect of the applied treatment on
yield, but also the indirect effect of the moisture
stress that is associated with the treatment.
In such a case, the drought indices
should be adjusted for treatments and only
the remaining variation among the drought
indices used as a covariate. Drought indices
should also be adjusted for blocks when there is
a significant association between indices and
blocks.
In summary, it is seen that the evaluation<br />
of the drought index, for use as a covariate,<br />
requires three operations: (1) an estimate of<br />
measurement error, (2) comparison of the total
variability with measurement error by means
of an F test, and (3) analysis of variance of<br />
the drought indices and tests of significance<br />
of treatment and block effects. When either<br />
or both of these effects is significant, it is<br />
necessary to adjust the drought indices for<br />
the indicated effects. This procedure is applicable<br />
in the evaluation of other production<br />
factors for use as a covariate in the analysis<br />
of yield data.<br />
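
The adjustment step can be illustrated with a minimal sketch. It assumes a randomized complete block layout with one drought index per plot, and it assumes the measurement-error variance has already been estimated separately. The function names, the data, and the value of `meas_var` are hypothetical and are not taken from the report. The F ratios would still need to be compared with tabulated critical values to decide which adjustments apply.

```python
def anova_rcb(d):
    """Two-way ANOVA without replication for d[treatment][block]."""
    t, b = len(d), len(d[0])
    grand = sum(map(sum, d)) / (t * b)
    treat_means = [sum(row) / b for row in d]
    block_means = [sum(d[i][j] for i in range(t)) / t for j in range(b)]
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_total = sum((x - grand) ** 2 for row in d for x in row)
    ms_error = (ss_total - ss_treat - ss_block) / ((t - 1) * (b - 1))
    return {
        "grand": grand,
        "treat_means": treat_means,
        "block_means": block_means,
        "total_var": ss_total / (t * b - 1),
        "F_treat": (ss_treat / (t - 1)) / ms_error,
        "F_block": (ss_block / (b - 1)) / ms_error,
    }


def adjust_index(d, for_treatments, for_blocks):
    """Remove treatment and/or block effects, keeping the remaining variation."""
    a = anova_rcb(d)
    adjusted = []
    for i, row in enumerate(d):
        new_row = []
        for j, x in enumerate(row):
            if for_treatments:
                x -= a["treat_means"][i] - a["grand"]
            if for_blocks:
                x -= a["block_means"][j] - a["grand"]
            new_row.append(x)
        adjusted.append(new_row)
    return adjusted


# Hypothetical drought indices: 3 treatments (rows) x 4 blocks (columns).
d = [
    [0.20, 0.35, 0.30, 0.25],
    [0.40, 0.55, 0.45, 0.50],
    [0.60, 0.70, 0.75, 0.65],
]
meas_var = 0.002  # assumed measurement-error variance, estimated elsewhere

a = anova_rcb(d)
f_total = a["total_var"] / meas_var  # operation (2): total vs. measurement error
print("F total/measurement:", round(f_total, 2))
print("F treatments:", round(a["F_treat"], 2), "F blocks:", round(a["F_block"], 2))

# Operation (3): if both effects were judged significant, adjust for both.
covariate = adjust_index(d, for_treatments=True, for_blocks=True)
```

After adjustment for both effects, every treatment mean and block mean of `covariate` equals the grand mean, so only the plot-to-plot variation remains for use as a covariate.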
Evaluation of the Predictive Ability<br />
of Regression Models<br />
Fertilizer recommendations for farmers are<br />
commonly developed using information generated<br />
in a large number of well-conducted<br />
field trials. These trials are distributed over<br />
time and geographical area to sample a wide<br />
range of values among important factors affecting<br />
yields. Variables are described carefully<br />
for each experiment at the plot or site
level, and the results are combined into a
prediction function, using multiple linear regression
procedures.<br />
The number of potential variables to be<br />
included in the prediction equation is large<br />
as it includes applied fertilizer variables, site<br />
variables, and interaction variables. Currently<br />
most investigators select the variables for the<br />
prediction equation by means of a stepwise<br />
procedure. One such approach introduces<br />
potential variables into the equation in an<br />
order determined by their correlation with the<br />
yield or response variable. A second stepwise<br />
procedure begins with all the potential<br />
predictor variables and eliminates some of<br />
them, one at a time, until a satisfactory prediction<br />
equation is obtained.
These procedures are known by the terms<br />
forward selection and backward elimination,<br />
respectively. The criterion used in the selection
or elimination of a variable is the change
in the residual sum of squares, calculated
from the differences between the observed
and predicted yields.
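
Both stepwise procedures can be sketched in a few lines, using the residual sum of squares as the criterion. This is a minimal illustration, not the procedure used in the studies: the helper names and the data are hypothetical, and a fixed number of variables stands in for the unspecified rule that decides when the equation is "satisfactory".

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit on the columns in cols."""
    Z = [[1.0] + [row[c] for c in cols] for row in X]  # intercept first
    p = len(cols) + 1
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve(A, b)
    return sum((yi - sum(bk * zk for bk, zk in zip(beta, z))) ** 2
               for z, yi in zip(Z, y))


def forward_selection(X, y, n_select):
    """Add, one at a time, the variable giving the smallest RSS."""
    chosen, remaining = [], list(range(len(X[0])))
    while remaining and len(chosen) < n_select:
        best = min(remaining, key=lambda c: rss(X, y, chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


def backward_elimination(X, y, n_keep):
    """Start with all variables; drop the one whose removal raises RSS least."""
    kept = list(range(len(X[0])))
    while len(kept) > n_keep:
        drop = min(kept, key=lambda c: rss(X, y, [k for k in kept if k != c]))
        kept.remove(drop)
    return kept


# Hypothetical data: yield depends on columns 0 and 2, not on column 1.
X = [[1, 4, 2], [2, 4, 1], [3, 2, 4], [4, 2, 3],
     [5, 6, 6], [6, 6, 5], [7, 3, 8], [8, 3, 7]]
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.2, -0.15, -0.05]
y = [2 + 3 * r[0] + 0.5 * r[2] + e for r, e in zip(X, noise)]

print(forward_selection(X, y, 2))
print(backward_elimination(X, y, 2))
```

At the first forward step, minimizing the residual sum of squares is equivalent to choosing the variable most highly correlated with yield, which matches the ordering described above.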
Recent CIMMYT studies have shown that
the full model (the equation containing all<br />
potential predictor variables) performs well<br />
when used in predicting yields for combinations<br />
of the site variables measured at the<br />
experimental sites. However, the full model<br />
performs poorly when used to predict yields<br />
for combinations which are not the specific<br />
combinations measured in the study. This<br />
holds true even though the combinations are<br />
within the range of values studied. The backward
elimination model and, to a lesser extent,
the forward selection model give about
the same general results.
These irregularities led to CIMMYT's development
of a new criterion for the selection<br />
of predictor variables in the general yield<br />
function. Called the predictive sum of squares,<br />
it is based on the performance of the estimated<br />
equation for predicting observations<br />
not included in the estimation of the predictor<br />
variable coefficients. Its calculation is similar to
that of the commonly used residual sum of squares:
both are sums of squared deviations
between observed and predicted
values.
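
The contrast between the two criteria can be written out as follows. The notation is an assumption introduced here for illustration and does not appear in the report: $y_i$ is an observed yield, $\hat{y}_i$ is its prediction from coefficients estimated on all $n$ observations, $H$ is a set of observations held out of the estimation, and $\hat{y}_i^{(-H)}$ is the prediction from coefficients estimated without $H$.

```latex
\mathrm{RSS} = \sum_{i=1}^{n} \bigl( y_i - \hat{y}_i \bigr)^2 ,
\qquad
\mathrm{PSS} = \sum_{i \in H} \bigl( y_i - \hat{y}_i^{(-H)} \bigr)^2
```

The two sums have the same form; they differ only in whether the observation being predicted contributed to the estimated coefficients.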
A predictor variable selection procedure,<br />
based on the predictive sum of squares criterion,<br />
involves the following steps:<br />
The total number of fertilizer trials is randomly<br />
divided into a small number of groups.<br />
With one of the groups set aside, the prediction
equation is estimated from the remaining groups for each of the
potential predictor variables, yields are predicted
for the omitted group, and the predictive sum of squares
is calculated for that group. The procedure
is then repeated, with each of the groups
set aside in turn. At the end of the cycle, the
total predictive sum of squares is calculated
by summing over the omitted groups, and
the predictor with the smallest value is selected.
The entire procedure is then repeated<br />
in a stepwise manner for the one-by-one selection<br />
of several predictor variables.<br />
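
The steps above can be sketched as follows. This is an illustration under stated assumptions rather than the CIMMYT implementation: the function names and the data are hypothetical, the groups are formed once and reused at every step (the text does not say whether they are redrawn), and a fixed number of variables is selected.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit(X, y, cols):
    """Least-squares coefficients (intercept first) for the columns in cols."""
    Z = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)


def split_trials(trial_ids, n_groups, seed=0):
    """Randomly divide the trials (not the plots) into n_groups groups."""
    trials = sorted(set(trial_ids))
    random.Random(seed).shuffle(trials)
    return [set(trials[g::n_groups]) for g in range(n_groups)]


def predictive_ss(X, y, trial_ids, groups, cols):
    """Total predictive sum of squares, summed over the omitted groups."""
    total = 0.0
    for held_out in groups:
        train = [i for i, t in enumerate(trial_ids) if t not in held_out]
        test = [i for i, t in enumerate(trial_ids) if t in held_out]
        beta = fit([X[i] for i in train], [y[i] for i in train], cols)
        for i in test:
            pred = beta[0] + sum(bk * X[i][c] for bk, c in zip(beta[1:], cols))
            total += (y[i] - pred) ** 2
    return total


def select_by_pss(X, y, trial_ids, n_groups, n_select, seed=0):
    """Stepwise selection: at each step add the predictor with the smallest PSS."""
    groups = split_trials(trial_ids, n_groups, seed)
    chosen, remaining = [], list(range(len(X[0])))
    while remaining and len(chosen) < n_select:
        best = min(remaining,
                   key=lambda c: predictive_ss(X, y, trial_ids, groups, chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical data: 6 trials with 2 plots each.
# Column 0: applied fertilizer, column 1: a site variable, column 2: unrelated.
trial_ids = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
site = {0: 1, 1: 4, 2: 2, 3: 6, 4: 3, 5: 5}
unrelated = [3, 1, 2, 5, 4, 0, 1, 3, 5, 2, 0, 4]
noise = [0.05, -0.1, 0.1, -0.05, 0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1]
X = [[i % 2, site[t], unrelated[i]] for i, t in enumerate(trial_ids)]
y = [1 + 2 * r[0] + r[1] + e for r, e in zip(X, noise)]

print(select_by_pss(X, y, trial_ids, n_groups=3, n_select=2))
```

Grouping by trial rather than by plot keeps all plots of a trial on the same side of each split, so every prediction is made for a site that played no part in estimating the coefficients.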
As is commonly known, the residual sum<br />
of squares decreases as more predictors are<br />
added to the general yield function and asymptotically<br />
approaches zero as the number of<br />
variables approaches the number of observations.<br />
With the usual significance levels of<br />