11.07.2015 Views

statisticalrethinkin..


4 Linear Models

History has been unkind to Ptolemy. Claudius Ptolemy (born 90 CE, died 168 CE) was an Egyptian mathematician and astronomer, most famous for his geocentric model of the solar system. These days, when scientists wish to mock someone, they might compare him to a supporter of the geocentric model. But Ptolemy was a genius. His mathematical model of the motions of the planets (FIGURE 4.1) was extremely accurate. To achieve its accuracy, it employed a device known as an epicycle, a circle on a circle. It is even possible to have epi-epicycles, circles on circles on circles. With enough epicycles in the right places, Ptolemy’s model could predict with accuracy greater than anyone had achieved before him. And so the model was utilized for over a thousand years. And Ptolemy and people like him, toiling over centuries, worked it all out without the aid of a computer. Anyone should be flattered to be compared to Ptolemy.

The trouble of course is that the geocentric model is wrong, in many respects. If you used it to plot the path of your Mars probe, you’d miss the red planet by quite a distance. But for spotting Mars in the night sky, it remains an excellent model. It would have to be re-calibrated every century or so, depending upon which heavenly body you wish to locate. But the geocentric model continues to make useful predictions, provided those predictions remain within a narrow domain of questioning.

The strategy of using epicycles might seem crazy, once you know the correct physical structure of the solar system. But it turns out that the ancients had hit upon a generalized system of approximation. Given enough circles embedded in enough places, the Ptolemaic strategy is the same as a Fourier series, a way of decomposing a periodic function (like an orbit) into a series of sine and cosine functions.
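
The link between epicycles and Fourier series can be checked numerically. The sketch below is only an illustration, not anything from the text: it treats a closed path as complex numbers, uses NumPy's discrete Fourier transform to get one rotating circle per coefficient, and rebuilds the path from the largest few circles. The square-shaped target path and the helper names `square_path`, `epicycle_fit`, and `rms_error` are assumptions made up for this example.

```python
import numpy as np


def square_path(n_per_side=100):
    """Hypothetical target: a square traced counterclockwise, as complex points."""
    s = np.linspace(0.0, 1.0, n_per_side, endpoint=False)
    right = 1 + 1j * (2 * s - 1)    # from 1-1j up to 1+1j
    top = (1 - 2 * s) + 1j          # from 1+1j left to -1+1j
    left = -1 + 1j * (1 - 2 * s)    # from -1+1j down to -1-1j
    bottom = (2 * s - 1) - 1j       # from -1-1j right to 1-1j
    return np.concatenate([right, top, left, bottom])


def epicycle_fit(path, n_circles):
    """Rebuild a closed path from its n_circles largest epicycles.

    Each Fourier coefficient c_k describes one circle: radius |c_k|,
    turning k times per period. Dropping small coefficients drops
    small circles.
    """
    n = len(path)
    coeffs = np.fft.fft(path) / n
    keep = np.argsort(np.abs(coeffs))[::-1][:n_circles]
    truncated = np.zeros_like(coeffs)
    truncated[keep] = coeffs[keep]
    return np.fft.ifft(truncated * n)


def rms_error(path, approx):
    """Root-mean-square distance between the path and its approximation."""
    return float(np.sqrt(np.mean(np.abs(path - approx) ** 2)))


if __name__ == "__main__":
    path = square_path()
    for n_circles in (1, 5, 25, len(path)):
        err = rms_error(path, epicycle_fit(path, n_circles))
        print(f"{n_circles:4d} circles: RMS error {err:.2e}")
```

Running the script prints the approximation error for a growing number of circles. Because the largest circles are kept first, the RMS error cannot increase as circles are added, and with as many circles as sample points the sampled path is reproduced up to floating-point error.
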
So no matter the actual arrangement of planets and moons, a geocentric model can be built to describe their paths against the night sky.

LINEAR REGRESSION is the geocentric model of applied statistics. By “linear regression,” we will mean a family of simple statistical golems that attempt to learn about the mean and variance of some measurement, using an additive combination of other measurements. Like geocentrism, linear regression can usefully describe a very large variety of natural phenomena. Like geocentrism, linear regression is a descriptive model that corresponds to many different process models. If we read its structure too literally, we’re likely to make mistakes. But used wisely, these little linear golems continue to be useful.

This chapter introduces linear regression as a Bayesian procedure. Under a probability interpretation, which is necessary for Bayesian work, linear regression uses a Gaussian (normal) distribution to describe our golem’s uncertainty about some measurement of interest. This type of model is simple, flexible, and commonplace. Like all statistical models, it is not universally useful. But linear regression has a strong claim to being foundational, in the
