11.07.2015 Views

statisticalrethinkin..

4. LINEAR MODELS

4.1.4.2. Epistemological justification. But the natural occurrence of the Gaussian distribution is only one reason to build models around it. Another route to justifying the Gaussian as our choice of skeleton, and a route that will help us appreciate later why it is often a poor choice, is that it represents a particular state of ignorance. When all we know or are willing to say about a distribution of measures (measures are continuous values on the real number line) is their mean and variance, then the Gaussian distribution arises as the most consistent with our assumptions.

That is to say that the Gaussian distribution is the most natural expression of our state of ignorance, because if all we are willing to assume is that a measure has finite variance, the Gaussian distribution is the shape that can be realized in the largest number of ways and does not introduce any new assumptions. It is the least surprising and least informative assumption to make. In this way, the Gaussian is the distribution most consistent with our assumptions. Or rather, it is the most consistent with our golem’s assumptions. If you don’t think the distribution should be Gaussian, then that implies that you know something else that you should tell your golem about, something that would improve inference.

This epistemological justification is premised on INFORMATION THEORY and MAXIMUM ENTROPY. We’ll dwell on information theory in Chapter 6, when it’ll be more useful. Then in later chapters, other common and useful distributions will be used to build generalized linear models (GLMs). When these other distributions are introduced, you’ll learn the constraints that make them the uniquely most appropriate (consistent with our assumptions) distributions.

For now, let’s take the ontological and epistemological justifications of just the normal distribution as reasons to start building models of measures around it.
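
The maximum entropy claim above can be made a little more concrete with a small numerical comparison. The sketch below is written in Python and uses the standard closed-form differential entropies of three distributions that share the same standard deviation. The choice of the uniform and Laplace distributions as comparisons, and all function names, are assumptions made for this illustration. Three cases do not prove the general result; they only show the Gaussian coming out on top among these candidates.

```python
import math

def gaussian_entropy(sigma):
    """Differential entropy (in nats) of a Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

def uniform_entropy(sigma):
    """Differential entropy (in nats) of a uniform distribution with standard deviation sigma."""
    # A uniform distribution on an interval of width w has variance w^2 / 12.
    width = sigma * math.sqrt(12)
    return math.log(width)

def laplace_entropy(sigma):
    """Differential entropy (in nats) of a Laplace distribution with standard deviation sigma."""
    # A Laplace distribution with scale b has variance 2 * b^2.
    b = sigma / math.sqrt(2)
    return 1 + math.log(2 * b)

sigma = 1.0  # illustrative value; the ranking is the same for any positive sigma
for name, h in [
    ("Gaussian", gaussian_entropy(sigma)),
    ("Laplace", laplace_entropy(sigma)),
    ("Uniform", uniform_entropy(sigma)),
]:
    print(f"{name:<8} {h:.4f}")
```

With the standard deviation held fixed, the Gaussian has the largest entropy of the three, which is one way of reading the statement that it adds the least information beyond the mean and variance.
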
Throughout all of this modeling, keep in mind that using a model is not equivalent to swearing an oath to it. The golem is your servant, not the other way around. And so the golem’s beliefs are not your own. There is no contract that forces us to believe the assumptions of our models. They just need to be useful robots.

Overthinking: Gaussian distribution. You don’t have to memorize the Gaussian probability distribution formula to make good use of it. Your computer already knows it. But a little knowledge of its form can help demystify it. The probability of some value y, given a Gaussian (normal) distribution with mean µ and standard deviation σ, is:

$$\Pr(y \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y - \mu)^2}{2\sigma^2}\right)$$

This looks monstrous. But the important bit is just the $(y - \mu)^2$ bit. This is the part that gives the normal distribution its fundamental shape, a quadratic shape. Once you exponentiate the quadratic shape, you get the classic bell curve. The rest of it just scales and standardizes the distribution so that it sums to one, as all probability distributions must. But an expression as simple as $\exp(-y^2)$ yields the Gaussian prototype.

The Gaussian is also a continuous distribution, unlike the binomial probabilities of earlier chapters. This means that the value y in the Gaussian distribution can be any continuous value. The binomial, in contrast, requires integers. Probability distributions with only discrete outcomes, like the binomial, are often called probability mass functions, while continuous ones like the Gaussian are called probability density functions. For mathematical reasons, probability densities, but not masses, can be greater than 1. Try dnorm(0,0,0.1), for example, which is the way to make R calculate Pr(0|0, 0.1). The answer, about 4, is no mistake. Probability density is the rate of change in cumulative probability. So where cumulative probability is increasing rapidly, density can easily exceed 1. But if we calculate the area under the density function, it will never exceed 1. Such areas are also
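
The density example above can be reproduced outside of R as well. Here is a minimal Python sketch, using only the standard library, that evaluates the Gaussian density formula directly and then approximates the area under the curve with a simple trapezoid rule. The function names, the integration range of ten standard deviations on each side, and the number of steps are assumptions chosen for this illustration, not anything prescribed by the text.

```python
import math

def gaussian_density(y, mu, sigma):
    """Gaussian density at y, following the formula in the text (what R's dnorm computes)."""
    coeff = 1.0 / math.sqrt(2 * math.pi * sigma ** 2)
    return coeff * math.exp(-((y - mu) ** 2) / (2 * sigma ** 2))

def area_under_density(mu, sigma, lo, hi, steps=100_000):
    """Approximate the area under the Gaussian density between lo and hi (trapezoid rule)."""
    h = (hi - lo) / steps
    # Endpoints get half weight, interior points full weight.
    total = 0.5 * (gaussian_density(lo, mu, sigma) + gaussian_density(hi, mu, sigma))
    for i in range(1, steps):
        total += gaussian_density(lo + i * h, mu, sigma)
    return total * h

# Same inputs as dnorm(0, 0, 0.1): the density at the mean exceeds 1.
peak = gaussian_density(0.0, 0.0, 0.1)
# Integrate over ten standard deviations on each side of the mean.
area = area_under_density(0.0, 0.1, -1.0, 1.0)

print(f"density at 0: {peak:.4f}")
print(f"area under curve: {area:.6f}")
```

The density at the mean comes out at about 3.99, matching the "about 4" in the text, while the numerically integrated area stays at 1 up to rounding error. A tall, narrow curve can have a density well above 1 and still enclose a total probability of 1.
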
