15.03.2015 Views

REPORT - Search CIMMYT repository


performing a conventional analysis of variance of the drought indices and testing the significance of treatment and block effects. A high association between drought indices and treatments was taken to mean that the definition of a given treatment should include not only the direct effect of the applied treatment on yield, but also the indirect effect of the moisture stress that is associated with the treatment. In such a case, the drought indices should be adjusted for treatments, and only the remaining variation among the drought indices used as a covariate. Drought indices should also be adjusted for blocks when there is a significant association between indices and blocks.

In summary, evaluating the drought index for use as a covariate requires three operations: (1) an estimate of measurement error, (2) comparison of the total variability with measurement error by means of an F test, and (3) analysis of variance of the drought indices and tests of significance of treatment and block effects. When either or both of these effects are significant, it is necessary to adjust the drought indices for the indicated effects. This procedure is applicable in the evaluation of other production factors for use as covariates in the analysis of yield data.
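As a rough illustration of operation (3) and the adjustment that follows it, the sketch below assumes a randomized block layout with exactly one drought index per treatment-block cell. The function name `anova_adjust` and the array layout are assumptions made for this example; the report does not give computational details.

```python
import numpy as np

def anova_adjust(x):
    """Two-way analysis of variance of a covariate (assumed layout).

    x[i, j] is the drought index for treatment i in block j, with one
    value per cell. Returns the F statistics for treatments and blocks
    and the indices adjusted for both effects (the two-way residuals).
    """
    x = np.asarray(x, dtype=float)
    t, b = x.shape
    grand = x.mean()
    treat_means = x.mean(axis=1, keepdims=True)   # shape (t, 1)
    block_means = x.mean(axis=0, keepdims=True)   # shape (1, b)

    ss_treat = b * np.sum((treat_means - grand) ** 2)
    ss_block = t * np.sum((block_means - grand) ** 2)

    # Variation left after removing treatment and block effects.
    adjusted = x - treat_means - block_means + grand
    ss_error = np.sum(adjusted ** 2)

    df_treat, df_block = t - 1, b - 1
    ms_error = ss_error / (df_treat * df_block)
    f_treat = (ss_treat / df_treat) / ms_error
    f_block = (ss_block / df_block) / ms_error
    return f_treat, f_block, adjusted
```

Each F statistic would be compared with the tabulated value for its degrees of freedom. If only one effect proves significant, the covariate would be adjusted for that effect alone; the function returns the fully adjusted values only to keep the sketch short.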

Evaluation of the Predictive Ability of Regression Models

Fertilizer recommendations for farmers are commonly developed using information generated in a large number of well-conducted field trials. These trials are distributed over time and geographical area to sample a wide range of values among important factors affecting yields. Variables are described carefully for each experiment at the plot or site level, and the results are combined into a prediction function using multiple linear regression procedures.

The number of potential variables to be included in the prediction equation is large, as it includes applied fertilizer variables, site variables, and interaction variables. Currently most investigators select the variables for the prediction equation by means of a stepwise procedure. One such approach introduces potential variables into the equation in an order determined by their correlation with the yield or response variable. A second stepwise procedure begins with all the potential predictor variables and eliminates some of them, one at a time, until a satisfactory prediction equation is obtained.

These procedures are known as forward selection and backward elimination, respectively. The criterion used in the selection or elimination of a variable is the change in the residual sum of squares, calculated from the differences between the observed and predicted values.
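The two stepwise procedures can be sketched as follows, using the residual sum of squares as the criterion. This is a minimal illustration rather than the procedure used in the studies: the function names are made up for the example, and the fixed target `n_keep` is an assumed stand-in for the unspecified rule that decides when a prediction equation is "satisfactory".

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_selection(X, y, n_keep):
    """Add, one at a time, the variable that lowers the RSS the most."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < n_keep:
        best = min(remaining, key=lambda c: rss(X, y, selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected

def backward_elimination(X, y, n_keep):
    """Start from the full model; drop the variable whose removal
    raises the RSS the least, until n_keep variables are left."""
    selected = list(range(X.shape[1]))
    while len(selected) > n_keep:
        drop = min(selected,
                   key=lambda c: rss(X, y, [k for k in selected if k != c]))
        selected.remove(drop)
    return selected
```

At the first forward step, the variable giving the largest drop in the residual sum of squares is also the one with the highest absolute correlation with yield, which matches the ordering described above.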

Recent CIMMYT studies have shown that the full model (the equation containing all potential predictor variables) performs well when used in predicting yields for combinations of the site variables measured at the experimental sites. However, the full model performs poorly when used to predict yields for combinations other than the specific combinations measured in the study. This holds true even though the combinations are within the range of values studied. The backward elimination model and, to a lesser extent, the forward selection model give about the same general results.

These irregularities led to CIMMYT's development of a new criterion for the selection of predictor variables in the general yield function. Called the predictive sum of squares, it is based on the performance of the estimated equation in predicting observations not included in the estimation of the predictor variable coefficients. Its calculation is similar to that of the commonly used residual sum of squares, as both are the sum of squares of the deviations between the observed and predicted values.
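The contrast between the two criteria can be written out as below. The notation is an assumption introduced here, not taken from the report: y_i is an observed yield, \hat{y}_i is its prediction from an equation fitted to all n observations, and \hat{y}_i^{(-g)} is its prediction from an equation fitted with the group g containing observation i left out, for G groups in all.

```latex
\mathrm{RSS} = \sum_{i=1}^{n} \bigl( y_i - \hat{y}_i \bigr)^2 ,
\qquad
\mathrm{PSS} = \sum_{g=1}^{G} \sum_{i \in g} \bigl( y_i - \hat{y}_i^{(-g)} \bigr)^2 .
```

Both are sums of squared deviations; the only difference is whether the observation being predicted took part in estimating the coefficients.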

A predictor variable selection procedure based on the predictive sum of squares criterion involves the following steps:

The full set of fertilizer trials is randomly divided into a small number of groups. Setting aside one of the groups, the prediction equation is estimated for each of the potential predictor variables, the yields are predicted, and the predictive sum of squares is calculated for the omitted group. The procedure is then repeated, with each of the groups set aside in turn. At the end of the cycle, the total predictive sum of squares is calculated by summing over the omitted groups, and the predictor with the smallest value is selected. The entire procedure is then repeated in a stepwise manner for the one-by-one selection of several predictor variables.
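The steps above can be sketched in code as follows. This is an illustrative reading of the procedure, not CIMMYT's implementation: the function names, the `trial_ids` array marking which trial each observation came from, and the choice to keep one random grouping fixed across all selection steps are assumptions made for the example.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test, cols):
    """Fit a least-squares equation with an intercept; predict test rows."""
    A = np.column_stack([np.ones(len(y_train))] + [X_train[:, c] for c in cols])
    B = np.column_stack([np.ones(len(X_test))] + [X_test[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return B @ coef

def predictive_ss(X, y, groups, cols):
    """Total predictive sum of squares, summed over the omitted groups."""
    total = 0.0
    for g in np.unique(groups):
        held_out = groups == g
        pred = fit_predict(X[~held_out], y[~held_out], X[held_out], cols)
        total += float(np.sum((y[held_out] - pred) ** 2))
    return total

def select_by_predictive_ss(X, y, trial_ids, n_groups, n_select, seed=0):
    """Stepwise selection of predictors by smallest predictive SS."""
    rng = np.random.default_rng(seed)
    trials = np.unique(trial_ids)
    # Randomly assign whole trials (not single plots) to groups.
    trial_group = dict(zip(trials, rng.permutation(len(trials)) % n_groups))
    groups = np.array([trial_group[t] for t in trial_ids])

    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(min(n_select, len(remaining))):
        # Try each candidate alongside the predictors already chosen.
        best = min(remaining,
                   key=lambda c: predictive_ss(X, y, groups, selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

A call such as `select_by_predictive_ss(X, y, trial_ids, n_groups=4, n_select=3)` would return the column indices of the chosen predictors in the order selected. Assigning whole trials to groups means each prediction is made for trials that played no part in estimating the coefficients.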

As is commonly known, the residual sum of squares decreases as more predictors are added to the general yield function and asymptotically approaches zero as the number of variables approaches the number of observations. With the usual significance levels of
