14.03.2014 Views

Modeling and Multivariate Methods - SAS


Appendix A Statistical Details

Key Statistical Concepts

In business, you want to maximize revenues and minimize costs. In science, you want to minimize uncertainty. Uncertainty in science plays the same role as cost plays in business. All statistical methods fit models such that uncertainty is minimized.

It is not difficult to visualize uncertainty. Just think of flipping a series of coins where each toss is independent. The probability of tossing a head is 0.5, and $-\log(0.5)$ is 1 for base 2 logarithms. The probability of tossing h heads in a row is simply

$$p = \left(\frac{1}{2}\right)^{h}$$

Solving for h produces

$$h = -\log_2 p$$

You can think of the uncertainty of some event as the number of consecutive “head” tosses you have to flip to get an equally rare event.

Almost everything we do statistically has uncertainty, $-\log p$, at the core. Statistical literature refers to uncertainty as negative log-likelihood.
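The coin-toss reading of uncertainty can be checked numerically. The following is a minimal sketch and not part of the original text; the function name `uncertainty_in_heads` is a hypothetical helper introduced here for illustration.

```python
import math


def uncertainty_in_heads(p):
    """Return h = -log2(p): the number of consecutive heads that is
    as rare as an event with probability p (hypothetical helper)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    return -math.log2(p)


# A single fair toss carries one unit of uncertainty.
one_toss = uncertainty_in_heads(0.5)

# Ten heads in a row has probability (1/2)**10, so h recovers 10.
ten_heads = uncertainty_in_heads(0.5 ** 10)

# An event with probability 1/8 is as rare as three heads in a row.
one_in_eight = uncertainty_in_heads(0.125)
```

Rarer events map to larger values of h, which is why minimizing uncertainty favors models that assign high probability to the observed data.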

The Two Basic Fitting Machines

An amazing fact about statistical fitting is that most of the classical methods reduce to using two simple machines, the spring and the pressure cylinder.

Springs

First, springs are the machine of fit for a continuous response model (Farebrother, 1987). Suppose that you have n points and that you want to know the expected value (mean) of the points. Envision what happens when you lay the points out on a scale and connect them to a common junction with springs (see Figure A.6). When you let go, the springs wiggle the junction point up and down and then bring it to rest at the mean. This is what must happen according to physics.
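The settling point can be worked out from a force balance. As a sketch, assume ideal springs that obey Hooke's law with a common stiffness $k$ and zero natural length, and consider only displacement along the scale (these assumptions and the symbols $k$, $y_i$, and $m$ are introduced here, not stated in the text). With points $y_1, \dots, y_n$ and the junction at position $m$, the junction rests where the net force is zero:

$$\sum_{i=1}^{n} k\,(y_i - m) = 0 \quad\Longrightarrow\quad m = \frac{1}{n}\sum_{i=1}^{n} y_i = \bar{y}$$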

If the data are normally distributed with a mean at the junction point where springs are attached, then the physical energy in each point’s spring is proportional to the uncertainty of the data point. All you have to do to calculate the energy in the springs (the uncertainty) is to compute the sum of squared distances of each point to the mean.
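The link between spring energy and uncertainty can be made explicit. As a sketch, assume a normal density $f$ with mean $\mu$ and standard deviation $\sigma$, and natural logarithms (the symbols are introduced here for illustration). The uncertainty of a single point $y_i$ is then

$$-\log f(y_i) = \frac{(y_i - \mu)^2}{2\sigma^2} + \log\!\left(\sigma\sqrt{2\pi}\right)$$

The second term does not depend on $\mu$, so the part of the total uncertainty that varies with $\mu$ is the sum of squared distances scaled by $1/(2\sigma^2)$. An ideal spring of stiffness $k$ stretched by $y_i - \mu$ stores energy $\tfrac{1}{2}k\,(y_i - \mu)^2$, which has the same quadratic form.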

To choose an estimate that attributes the least uncertainty to the observed data, the spring settling point is chosen as the estimate of the mean. That is the point that requires the least energy to stretch the springs and is equivalent to the least squares fit.
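The equivalence between the least-energy junction and the mean can be illustrated with a small numerical search. This is a minimal sketch under stated assumptions: the data values are made up, the function name `spring_energy` is hypothetical, and the energy is taken as the plain sum of squared distances (ignoring any constant spring factor).

```python
from typing import Sequence


def spring_energy(junction: float, points: Sequence[float]) -> float:
    """Sum of squared distances from each point to the junction
    (spring energy up to a constant factor; hypothetical helper)."""
    return sum((y - junction) ** 2 for y in points)


# Made-up illustrative data, not taken from the source.
points = [2.0, 3.0, 5.0, 10.0]
mean = sum(points) / len(points)

# Scan candidate junction positions from 0.00 to 12.00 in steps of 0.01
# and keep the one with the lowest energy.
candidates = [i / 100 for i in range(0, 1201)]
best = min(candidates, key=lambda m: spring_energy(m, points))

print(f"mean = {mean}, least-energy junction = {best}")
```

A grid search is used here only because it mirrors the physical picture of the junction settling; in practice the mean is computed directly.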
