How to use FSBforecast Excel add-in for regression analysis

fitting a regression model, only rows of data in which all the chosen dependent and independent variables have numeric values can be used to estimate the model.

Correlation and scatterplot matrices: The Data Analysis procedure always shows you the correlation matrix of the selected variables (i.e., all correlations between one variable and another), because correlations are the key statistics that are used to measure linear relationships among variables. If you check the Show Scatter Plots box when running the Data Analysis procedure, you will also get a matrix of all 2-way scatterplots, which is the visual counterpart of the correlation matrix.

The scatter plots may take some time to draw if you choose to analyze a large number of variables at once (e.g., 15 or more) and there are also many rows of data (e.g., 1000 or more). If you run the procedure and select n variables, you will get n² plots, and they are drawn at the rate of several per second (faster or slower depending on the number of rows of data). If you try this with 50 variables, you will get 2500 scatterplots on a single worksheet. The result is impressive to look at, but you may wait a while for it!

Here is a picture of what the output looks like when only 3 variables are chosen:

[Figure: Data Analysis output for 3 variables. The correlation matrix is displayed farther down on the Data Analysis worksheet, and there is an option to generate a full matrix of all 2-way scatterplots.]

Any of the individual scatterplots can be enlarged by pulling on its corners, and it can be copied and pasted to another worksheet or to a Word or PowerPoint document and re-formatted there as well. The same is true of all chart output in FSBforecast. Note that in these plots, the relationship between MPG_City and the two other variables appears to be somewhat nonlinear, i.e., the points appear to be distributed around a curved line rather than a straight line.
Other patterns you might (or might not) observe in a scatterplot are extreme values of some variables (“outliers”), which may or may not line up with extreme values of other variables, or clusters of points along the edges or in the corners of some plots. These sorts of patterns can present challenges for fitting models that assume linear relationships and normally distributed errors. Sometimes transformations of variables are needed to “straighten things out.”
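For readers who want to see what the correlation matrix contains, the following is a minimal Python sketch, not FSBforecast's own code. It assumes the data are held as a list of dictionaries, drops any row in which a chosen variable is missing or non-numeric, and computes the Pearson correlation for every pair of variables. The variable names Weight and Horsepower and all of the data values are hypothetical, made up for illustration only.

```python
import math

def complete_rows(rows, columns):
    """Keep only rows in which every chosen column holds a numeric value."""
    kept = []
    for row in rows:
        values = [row.get(col) for col in columns]
        if all(isinstance(v, (int, float)) and not isinstance(v, bool)
               and not math.isnan(v) for v in values):
            kept.append(values)
    return kept

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(rows, columns):
    """Correlation of every chosen variable with every other, on complete rows."""
    data = complete_rows(rows, columns)
    cols = list(zip(*data))  # transpose rows into one tuple per variable
    return [[pearson(ci, cj) for cj in cols] for ci in cols]

# Hypothetical sample data (values invented for this example).
cars = [
    {"MPG_City": 30, "Weight": 2200, "Horsepower": 90},
    {"MPG_City": 25, "Weight": 2800, "Horsepower": 120},
    {"MPG_City": 20, "Weight": 3400, "Horsepower": 160},
    {"MPG_City": 18, "Weight": None, "Horsepower": 200},  # dropped: missing value
    {"MPG_City": 16, "Weight": 4000, "Horsepower": 210},
]
columns = ["MPG_City", "Weight", "Horsepower"]

matrix = correlation_matrix(cars, columns)
n_plots = len(columns) ** 2  # a full scatterplot matrix has n squared panels

for name, row in zip(columns, matrix):
    print(name.ljust(12), "  ".join(f"{r:6.3f}" for r in row))
print("scatterplots in the full matrix:", n_plots)
```

The matrix is symmetric with ones on the diagonal, which is why a matrix of n variables corresponds to n² scatterplot panels.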
Regression: The Regression procedure fits multiple regression models and allows them to be easily compared side-by-side. Just hit the Regression button, select the dependent variable you want to use, check the boxes for the independent variables from which you wish to predict it, and then hit the “Run” button. Consecutive models are named “Model 1”, “Model 2”, etc., by default, but you can also enter a name of your choice in the Model Name box before hitting “Run”, and the custom name will be used to label all of the output. A model can have up to 50 independent variables and over 18,000 rows of data.

If you also check the Brief Output box, then some of the usual regression output (the normal probability plot, the descriptive statistics and plots of the individual variables, the residuals-vs-independent-variable plots, and the residual table) will not be included on the model worksheet. These take a large amount of time and space to produce compared to the rest of the standard output. If you have relatively large numbers of independent variables (say, a dozen or more) and/or relatively large numbers of rows (say, 500 or more), you may wish to ask for brief output when first running a model. Brief output will give you more compact model sheets, and it will also cut down on the time needed to re-draw plots with large numbers of points when you scroll up and down the sheet. Once you have identified a promising-looking model for a large data set, you can re-run it with full output for a more complete picture.

Brief-output mode will also keep the file size more manageable if you fit a large number of models in one workbook. It is easy to end up with file sizes of 10 MB or 20 MB or more if you run a lot of full-output regressions with many variables and many rows of data.
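The idea of fitting several models to the same dependent variable and comparing them side-by-side can be illustrated outside Excel. The sketch below is an assumption-laden illustration, not the add-in's implementation: it fits ordinary least squares by solving the normal equations in plain Python, and the data values, the helper names solve and fit_ols, and the choice of R-squared as the comparison statistic are all hypothetical choices made for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(y, xs):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, R-squared)."""
    design = [[1.0] + list(row) for row in xs]  # prepend the constant term
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in design]
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1.0 - sse / sst

# Hypothetical data, constructed so that y = 1 + 2*x1 + 3*x2 exactly.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

# Same dependent variable, different sets of independent variables.
models = {
    "Model 1": fit_ols(y, [[a] for a in x1]),
    "Model 2": fit_ols(y, [[a, b] for a, b in zip(x1, x2)]),
}

for name, (coefs, r2) in models.items():
    print(name, [round(c, 4) for c in coefs], "R-squared:", round(r2, 4))
```

Solving the normal equations directly is adequate for a small illustration like this one; numerical libraries generally prefer more stable factorizations when the independent variables are strongly correlated.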
If all your variables consist of time series (i.e., variables whose values are ordered in time, such as daily or weekly or monthly or annual observations of some quantities), then you should also check the Time Series Data box. This will provide additional model statistics that are relevant only for time series, such as autocorrelations of the residuals, which reveal whether there are unexplained time patterns.
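As a rough illustration of what a residual autocorrelation measures, the sketch below computes the lag-k sample autocorrelation of a list of residuals in plain Python. The formula shown is the common textbook definition; whether FSBforecast uses exactly this form is an assumption, and the residual values are made up for the example.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag (textbook form)."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    num = sum((residuals[t] - mean) * (residuals[t - lag] - mean)
              for t in range(lag, n))
    return num / denom

# Hypothetical residual series.
trending = [-3, -2, -1, 1, 2, 3]      # errors drift steadily upward over time
alternating = [1, -1, 1, -1, 1, -1]   # errors flip sign every period

print("lag-1, trending:   ", round(autocorrelation(trending, 1), 3))
print("lag-1, alternating:", round(autocorrelation(alternating, 1), 3))
```

A lag-1 value well above zero, as in the trending series, suggests the model is missing a slowly changing pattern, while a strongly negative value, as in the alternating series, suggests the errors overcorrect from one period to the next.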