NCC Report No. 1 - (IMD), Pune


The most sensible approach for selecting a best subset of the available variables in a complex linear model is to compare all possible subsets. This procedure simply fits all the possible regression models (i.e. all possible combinations of predictors) and chooses the best one (or more than one). Standard selection procedures such as stepwise regression (forward or backward selection) have some serious problems in selecting the best model. The problems with the stepwise regression method are described in detail by Quinn and Keough (2002). With the availability of fast computing facilities, trying all possible combinations of the predictors is not a difficult task.

In this study, we have fitted all the possible regression models (63 = 2⁶ − 1 for 6 predictors) and used the jackknife method to select the 5 best models among them. For m standardised predictors and n years, the multiple linear regression (MR) model equation can be written as Y = ZBᵀ + ε, where Y is the (n × 1) predictand (ISMR) matrix, B is the (1 × m) matrix of regression coefficients, Z is the (n × m) matrix of predictors for model size m, and ε is the (n × 1) error matrix.

The jackknife method (Crask and Perreault 1977; Tukey 1958) is the most suitable for comparing the performance of different models when the data period is short and testing is to be made over a relatively long period. In this method, development and testing of the models are possible within the same period. For SET-I, data for the period 1951-2000, and for SET-II, data for 1958-2000, were used for this purpose. In the jackknife method, for each of the possible models, the prediction for each year (say the i-th year) within the given data period of k years was made using the remaining k − 1 years. For SET-I, k is 50 (1951-2000), and for SET-II, k is 43 (1958-2000).

After preparing the hindcasts from the different regression models for the given period, the best performing models can be selected using a suitable criterion.
This criterion is based on the error statistics computed for the test period. When comparing performance, it is possible that the error measures of various models, particularly models of different size, are approximately equal. In such cases, the principle of parsimony (Box et al. 1994) suggests giving preference
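The all-subsets fit and the jackknife hindcast described above can be sketched as follows. This is a minimal illustration, not the code used in the report: the function names (`solve_normal_equations`, `jackknife_rmse`, `rank_all_subsets`), the synthetic data, the inclusion of an intercept term, and the use of root-mean-square error as the ranking criterion are all assumptions made for the example. The predictors are assumed to be standardised already.

```python
import itertools
import math
import random


def solve_normal_equations(X, y):
    """Least-squares coefficients from (X'X) b = X'y, solved by
    Gaussian elimination with partial pivoting."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= f * A[col][c]
    b = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, p))
        b[i] = (A[i][p] - tail) / A[i][i]
    return b


def jackknife_rmse(Z, y, subset):
    """RMSE of leave-one-year-out hindcasts for the predictors in `subset`."""
    k = len(y)
    sq_err = 0.0
    for i in range(k):
        # Fit on the remaining k-1 years (assumed intercept + chosen predictors).
        X_train = [[1.0] + [Z[t][j] for j in subset] for t in range(k) if t != i]
        y_train = [y[t] for t in range(k) if t != i]
        b = solve_normal_equations(X_train, y_train)
        # Hindcast the withheld year i.
        pred = b[0] + sum(bj * Z[i][j] for bj, j in zip(b[1:], subset))
        sq_err += (y[i] - pred) ** 2
    return math.sqrt(sq_err / k)


def rank_all_subsets(Z, y, n_best=5):
    """Score every non-empty predictor subset and return the n_best by RMSE."""
    m = len(Z[0])
    subsets = [s for size in range(1, m + 1)
               for s in itertools.combinations(range(m), size)]
    scored = sorted((jackknife_rmse(Z, y, s), s) for s in subsets)
    return subsets, scored[:n_best]


# Synthetic stand-in data (hypothetical): k = 50 years, m = 6 predictors,
# with the predictand driven by predictors 0 and 2 plus noise.
random.seed(0)
k, m = 50, 6
Z = [[random.gauss(0, 1) for _ in range(m)] for _ in range(k)]
y = [2.0 * row[0] - 1.5 * row[2] + random.gauss(0, 0.5) for row in Z]

subsets, best = rank_all_subsets(Z, y)
print(len(subsets), "candidate models")  # 2**6 - 1 = 63
for rmse, s in best:
    print(s, round(rmse, 3))
```

The least-squares solver is written out by hand only to keep the sketch free of dependencies; in practice a library routine such as NumPy's `numpy.linalg.lstsq` would normally be used. Strictly, the predictors could also be re-standardised within each set of k − 1 training years, which this sketch does not do.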
