12.07.2015 Views

Counterexamples to a Likelihood Theory of Evidence


For each hypothesis, the likelihood is obtained by multiplying the probabilities of the data points together, where each probability has the form

$$P_F(x < X < x + dx,\; y < Y < y + dy) = k\, p_F(x, y),$$
$$P_B(x < X < x + dx,\; y < Y < y + dy) = k\, p_B(x, y),$$

where $k = dx\,dy$. Furthermore, $p_F(x, y) = p_F(x)\, p_F(y \mid x)$ and $p_B(x, y) = p_B(y)\, p_B(x \mid y)$. Since $B_2$ has a lower likelihood than $F_2$, it must be because $p_B(y)$ is smaller than $p_F(x)$, on average.

This is odd because the specification of the marginal distributions, $p_B(y)$ and $p_F(x)$, is not what we think of as the essential content of a 'causal' hypothesis. The falsity of $B_2$ is already apparent from the pattern that forms when the x values are generated from y values, even if we look at a narrow range of y values. The x values do not vary randomly to the left and to the right of the line Y = X, as $B_2$ claims. Instead, they vary randomly to the left and the right of the line Y = 2X with half the variance, just as we would expect if $B_1$ were true. This is easily seen by plotting the residuals (x − y) against y (see Fig. 4). The residual variance is equal to 1 because it is the sum of two terms: one due to the deviation of the line Y = 2X from the line Y = X (the 'explainable' variation) and the other due to the smaller random variation about the line Y = 2X.
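The residual diagnostic can be checked with a small simulation. This excerpt does not restate the setup of Example 2, so the sketch below assumes a forward process in which X is standard normal and Y = X + U with independent standard normal noise U. That assumption is consistent with the figures quoted here (a residual variance of 1, and half that variance about the line Y = 2X), but it is an assumption. The function names are illustrative only.

```python
import random


def simulate_forward(n, seed=0):
    """Draw (x, y) pairs under the ASSUMED forward process:
    X ~ N(0, 1) and Y = X + U with U ~ N(0, 1) independent of X."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def slope_and_scatter(predictor, response):
    """Least-squares slope of response on predictor, plus the variance
    that remains after the fitted linear trend is removed."""
    mp, mr = mean(predictor), mean(response)
    cov = sum((p - mp) * (r - mr) for p, r in zip(predictor, response)) / len(predictor)
    slope = cov / variance(predictor)
    leftover = [(r - mr) - slope * (p - mp) for p, r in zip(predictor, response)]
    return slope, variance(leftover)


xs, ys = simulate_forward(100_000)
x_resid = [x - y for x, y in zip(xs, ys)]  # what the backward hypothesis treats as pure noise
y_resid = [y - x for x, y in zip(xs, ys)]  # what the forward hypothesis treats as pure noise

total_var = variance(x_resid)
slope_b, scatter_b = slope_and_scatter(ys, x_resid)
slope_f, scatter_f = slope_and_scatter(xs, y_resid)

print(f"var(x - y)            = {total_var:.3f}")
print(f"slope of (x - y) on y = {slope_b:.3f}, leftover variance = {scatter_b:.3f}")
print(f"slope of (y - x) on x = {slope_f:.3f}, leftover variance = {scatter_f:.3f}")
```

Under the stated assumption, the x residuals should show a total variance near 1 that splits into a systematic trend in y (slope near −1/2) and a leftover scatter near 1/2, while the y residuals should show no trend in x. The trend in the x residuals is the pattern that contradicts the backward hypothesis.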
In contrast, if we were to plot the y residuals against x, then there would be no discernible correlation between the y residuals and x.

Example 3: We will have a counterexample to LTE if the competing hypotheses can be constructed so that the marginal components of their likelihoods are the same. Suppose that we are told an alternative story about how the marginal values are generated. According to this version of the Forward hypothesis, the x values are read from a predetermined list of values, and then a slight stochastic element is imposed on the final value by randomly generating a small error from a uniform distribution of (small) width $\delta$ around the listed value. That is, X has a uniform distribution between the listed value minus $\delta/2$ and the listed value plus $\delta/2$. If we define $\varepsilon = 1/\delta$, then for all observed x, $P_F(x) = \varepsilon\, dx$. Similarly, B asserts that y values are first drawn from a list and then randomized in the same way, so that $P_B(y) = \varepsilon\, dy$. The story about how the other variable is generated is the same as in Example 2, in each case. Now, we are told that one of these hypotheses is true. Can we tell which one? Yes, by looking at the behavior of the residuals (as explained in Example 2). Can we tell if we are just given the

Figure 4: The x residuals (x − y) plotted as a function of y. The residual has an average variation of 1, but the variation varies in a systematic way for different values of y.
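To illustrate why equal marginals matter in Example 3, the sketch below computes both log-likelihoods on simulated data. It rests on assumptions not stated in this excerpt: the forward hypothesis treats y − x as standard normal noise, the backward hypothesis treats x − y as standard normal noise, and the data are generated by the forward process. The list of x values, the width `delta`, and the sample size are hypothetical choices. The common factors dx and dy are dropped because they multiply both likelihoods equally.

```python
import math
import random


def log_normal_pdf(z):
    """Log density of a standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def log_likelihoods(xs, ys, delta):
    """Log-likelihoods of the forward (F) and backward (B) hypotheses when
    both marginal densities equal the constant epsilon = 1 / delta."""
    log_eps = math.log(1.0 / delta)
    log_f = sum(log_eps + log_normal_pdf(y - x) for x, y in zip(xs, ys))
    log_b = sum(log_eps + log_normal_pdf(x - y) for x, y in zip(xs, ys))
    return log_f, log_b


def slope(predictor, response):
    """Least-squares slope of response on predictor."""
    mp = sum(predictor) / len(predictor)
    mr = sum(response) / len(response)
    cov = sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
    var = sum((p - mp) ** 2 for p in predictor)
    return cov / var


rng = random.Random(1)
delta = 0.01                                             # hypothetical jitter width
listed_x = [rng.gauss(0.0, 1.0) for _ in range(20_000)]  # hypothetical predetermined list
xs = [v + rng.uniform(-delta / 2, delta / 2) for v in listed_x]
ys = [x + rng.gauss(0.0, 1.0) for x in xs]               # ASSUMED forward process

log_f, log_b = log_likelihoods(xs, ys, delta)
resid_slope = slope(ys, [x - y for x, y in zip(xs, ys)])

print(f"log-likelihood difference (F minus B): {log_f - log_b:.6f}")
print(f"slope of (x - y) on y: {resid_slope:.3f}")
```

Because the conditional terms depend only on the squared residual and the marginal terms are the same constant, the two log-likelihoods coincide under these assumptions, while the trend of the x residuals in y still differs from what the backward hypothesis predicts.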
