
Online Papers

Brian Weatherson

January 17, 2011


Contents

I General Philosophy 1
1 What Good are Counterexamples? 2
2 Morality, Fiction and Possibility 23
3 David Lewis 51

II Epistemology 90
4 Can We Do Without Pragmatic Encroachment? 91
5 Knowledge, Bets and Interests 116
6 Defending Interest-Relative Invariantism 141
7 Deontology and Descartes' Demon 157
8 Luminous Margins 181
9 Scepticism, Rationalism and Externalism 190
10 Disagreements, Philosophical and Otherwise 205
11 Do Judgments Screen Evidence? 220
12 Easy Knowledge and Other Epistemic Virtues 243
13 Induction and Supposition 257

III Language 264
14 Epistemic Modals in Context 265
   Co-authored with Andy Egan and John Hawthorne
15 Attitudes and Relativism 300
16 Conditionals and Indexical Relativism 316
17 Indicatives and Subjunctives 340
18 Assertion, Knowledge and Action 354
   Co-authored with Ishani Maitra
19 No Royal Road to Relativism 373
20 Epistemic Modals and Epistemic Modality 381
21 Questioning Contextualism 397

IV Vagueness 410
22 Many Many Problems 411
23 Vagueness as Indeterminacy 428
24 True, Truer, Truest 442
25 Vagueness and Pragmatics 458

V Probability 485
26 From Classical to Intuitionistic Probability 486
27 Should We Respond to Evil With Indifference? 498
28 The Bayesian and the Dogmatist 516
29 Keynes, Uncertainty and Interest Rates 528
30 Stalnaker on Sleeping Beauty 546
31 Dogmatism and Intuitionistic Probability 557
   Co-authored with David Jehle

VI Metaphysics 571
32 Intrinsic Properties and Combinatorial Principles 572
33 The Asymmetric Magnets Problem 586
34 Chopping Up Gunk 598
   Co-authored with John Hawthorne
35 Intrinsic and Extrinsic Properties 606

VII Comments and Criticism 619
36 In Defense of a Kripkean Dogma 620
   Co-authored with Jonathan Ichikawa and Ishani Maitra
37 Defending Causal Decision Theory 630
38 Keynes and Wittgenstein 639
39 Blome-Tillmann on Interest-Relative Invariantism 657
40 Doing Philosophy With Words 665
41 Epistemicism, Parasites and Vague Names 674
42 Begging the Question and Bayesians 678
43 Prankster's Ethics 686
   Co-authored with Andy Egan
44 Are You a Sim? 693
45 Humeans Aren't Out of Their Minds 700
46 Nine Objections to Steiner and Wolff on Land Disputes 705
47 Misleading Indexicals 711

Bibliography 713


Part I

General Philosophy

What Good are Counterexamples?

The following kind of scenario is familiar throughout analytic philosophy. A bold philosopher proposes that all Fs are Gs. Another philosopher proposes a particular case that is, intuitively, an F but not a G. If intuition is right, then the bold philosopher is mistaken. Alternatively, if the bold philosopher is right, then intuition is mistaken, and we have learned something from philosophy. Can this alternative ever be realised, and if so, is there a way to tell when it is? In this paper, I will argue that the answer to the first question is yes, and that recognising the right answer to the second question should lead to a change in some of our philosophical practices.

The problem is pressing because there is no agreement across the sub-disciplines of philosophy about what to do when theory and intuition clash. In epistemology, particularly in the theory of knowledge, and in parts of metaphysics, particularly in the theory of causation, it is almost universally assumed that intuition trumps theory. Shope's The Analysis of Knowledge contains literally dozens of cases where an interesting account of knowledge was jettisoned because it clashed with intuition about a particular case. In the literature on knowledge and lotteries it is not as widely assumed that intuitions about cases are inevitably correct, but this still seems to be the working hypothesis.1 And recent work on causation by a variety of authors, with a wide variety of opinions, generally takes the same line: if a theory disagrees with intuition about a case, the theory is wrong.2 In this area exceptions to the rule are a little more frequent, particularly on the issues of whether causation is transitive and whether omissions can be causes, but in most cases the intuitions are taken to override the theories. Matters are quite different in ethics. It is certainly not a good thing for utilitarian theories that we very often feel that the action that maximises utility is not the right thing to do. But the existence of such cases is rarely taken to be obviously and immediately fatal for utilitarian theories in the way that, say, Gettier cases are taken to be obviously and immediately fatal for theories of knowledge that proclaim those cases to be cases of knowledge. Either there is some important difference here between the anti-utilitarian cases and the Gettier cases, a difference that justifies our differing reactions, or someone is making a mistake. I claim that it is (usually) the epistemologists and the metaphysicians who are wrong. In more cases than we usually imagine, a good philosophical theory can teach us that our intuitions are mistaken. Indeed, I think it is possible (although perhaps not likely) that the justified true belief (hereafter, JTB) theory of knowledge is so plausible that we should hold onto it in preference to keeping our intuition that Gettier cases are not cases of knowledge.

My main interests here are methodological, not epistemological. Until the last section I will be arguing for the JTB theory of knowledge, but my main interest is in showing that one particular argument against the JTB theory, the one that turns on the fact that it issues in some rather unintuitive pronouncements about Gettier cases, is not in itself decisive. Still, the epistemological issues are important, which is one reason I chose to focus on the JTB theory, and at the end I will discuss how the methodological conclusions drawn here may impact on them in an unexpected way.

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical Studies 115 (2003): 1-31.
1. See, for example, DeRose (1996) and Nelkin (2000).
2. See, for example, Menzies (1996), or any of the papers in the special Journal of Philosophy issue on causation, April 2000.

1 Intuitions

Let us say that a counterexample to the theory that all Fs are Gs is a possible situation such that most people have an intuition that some particular thing in the story is an F but not a G. The kinds of intuition I have in mind are what George Bealer (1998) calls intellectual "seemings". Bealer distinguishes intellectual seemings, such as the intuition that Hume's Principle is true, or that punishing a person for a crime they did not commit is unjust, from physical seemings, such as the 'intuition' that objects fall if released, or perhaps that the sun rotates around the earth. We shall be primarily concerned here with intellectual seemings, and indeed I shall only call these intuitions in what follows.

As Bealer notes, whether something seems to be true can be independent of whether we believe it to be true. Bealer himself notes that Frege's Axiom V seems to be true, though we know it is false. It does not seem to be the case, in the relevant sense, that 643 x 721 = 463603. Unless one is rather good at mental arithmetic, there is nothing that 643 x 721 seems to be; it is out of the reach of intuition. These are not the only ways that seemings and belief can come apart. One can judge that something seems to be the case while neither believing nor disbelieving it. This is a sensible attitude to take towards the view that one cannot know that a particular ticket will lose in a fair lottery. This is despite the fact that it certainly seems one cannot know this. If one's intuitions are running rampant, one may even have an intuition about something that one believes to be strictly indeterminate. For example, some people may have the intuition that the continuum hypothesis is true, even though they believe on reflection that it is indeterminate whether it is true.

The distinction between intuitions and belief is important because it helps reduce the violence that revisionary philosophical views do to our pre-existing positions. When I say that Gettier cases may be cases of knowledge, I am not denying that there is a strong intuition that they are not cases of knowledge. I am not denying that a Gettier case does not seem to be a case of knowledge. The same thing occurs in ethics. Utilitarians rarely deny that it seems that punishing innocents is the wrong thing to do. They urge that in certain, rare, cases this might be one of those things that seems to be true despite being false. The case that knowledge is justified true belief is meant to be made in full awareness of the fact that certain cases of justified true beliefs seem to not be cases of knowledge.

Actually, although we will not make much of it here, this last claim is not true as a general statement about all people. Jonathan Weinberg, Stephen Stich and Shaun Nichols have reported (Weinberg et al., 2001) that the intuition that Gettier cases are not cases of knowledge is not universally shared. It is not entirely clear what the philosophical relevance of these discoveries is. It might show that we who have Gettier intuitions speak a different language from those who do not. It might show (though as Stich and Nichols point out it is rather hard to see how) that philosophers know a lot more about knowledge than other folk. I think it is rather unlikely that this is true, but we shall bracket such concerns for now, and continue on the assumption that all parties have the Gettier intuitions. Since I shall want to argue that knowledge may still be justified true belief in any case, I am hardly tilting the playing field in my direction by making this assumption.

Given that intuitions are what Bealer calls intellectual seemings, and given that the example of Axiom V shows that seemings can be mistaken, what evidence have we that they are not mistaken in the cases we consider here? Arguably, we have very little indeed. Robert Cummins (1998) argues that in general intuition should not be trusted as an evidential source because it cannot be calibrated. We wouldn't have trusted the evidence Galileo's telescope gave us about the moon without an independent reason for thinking his telescope reliable. Fortunately, this can be done; we can point the telescope at far away terrestrial mountains, and compare its findings with the findings of examining the mountains up close and personal. There is no comparable way of calibrating intuitions. Clearly we should be suspicious of any method that has been tested and found unreliable, but there are tricky questions about the appropriate level of trust in methods that have not been tested. Ernest Sosa (1998) argues in response to Cummins that this kind of reasoning leads to an untenable kind of scepticism. Sosa notes that one can make the same point about perception as Cummins makes about intuition: we have no independent way of calibrating perception as a whole. There is a distinction to be drawn here, since perception divides into natural kinds, visual perception, tactile perception, etc., and we can use each of these to calibrate the others. It is hard to see how intuitions can be so divided in ways that permit us to check some kinds of intuitions against the others. In any case, the situation is probably worse than Cummins suggests, since we know that several intuitions are just false. It is interesting to note the many ways in which intuition does, by broad agreement, go wrong.

Many people are prone to many kinds of systematic logical mistakes. Most famously, the error rates on the Wason Selection Task are disturbingly large. Although this test directly measures beliefs rather than intuitions, it seems very likely that many of the false beliefs are generated by mistaken intuitions. As has been shown in a variety of experiments, the most famous of which were conducted by Kahneman and Tversky, most people are quite incompetent at probabilistic reasoning. In the worst cases, subjects held that a conjunction was more probable than one of its conjuncts. Again, this only directly implicates subjects' beliefs, but it is very likely that the false beliefs are grounded in false intuitions. (The examples in this paragraph are discussed in detail in Stich (1988, 1992).)
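Why that last judgement must be mistaken is a one-line consequence of the probability axioms. The derivation below is a standard textbook observation, not part of the argument above; the Pr notation is mine.

```latex
% A conjunction entails each conjunct, so the worlds where $A \wedge B$
% holds are among the worlds where $A$ holds. By additivity over the
% disjoint remainder:
\Pr(A) = \Pr(A \wedge B) + \Pr(A \wedge \neg B) \ge \Pr(A \wedge B).
% Hence judging a conjunction more probable than one of its conjuncts
% contradicts the axioms, whatever the propositions involved.
```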

As noted above, most philosophers would agree that many, if not most, people have mistaken moral intuitions. We need not agree with those consequentialists who think that vast swathes of our moral views are in error to think that (a) people make systematic moral mistakes and (b) some of these mistakes can be traced to mistaken intuitions. To take the most dramatic example, for thousands of years it seemed to many people that slavery was morally acceptable. On a more mundane level, many of us find that our intuitive judgements about a variety of cases cannot all be acceptable, for it is impossible to find a plausible theory that covers them all.3 Whenever we make a judgement inconsistent with such an intuition, we are agreeing that some of our original intuitions were mistaken.

From a rather different direction, there are many mistaken conceptual intuitions, with the error traceable to the way Gricean considerations are internalised in the process of learning a language. Having learned that it would be improper to use t to describe a particular case, we can develop the intuition that this case is not an F, where F is the property denoted by t. For example, if one is careless, one can find oneself sharing the intuition expressed by Ryle in The Concept of Mind that morally neutral actions, like scratching one's head, are neither voluntary nor involuntary (Ryle, 1949). The source of this intuition is the simple fact that it would be odd to describe an action as voluntary or involuntary unless there was some reason to do so, with the most likely such reason being that the action was in some way morally suspect. The fact that the intuition has a natural explanation does not stop it being plainly false. We can get errors in conceptual intuitions from another source. At one stage it was thought that whales are fish, that Mars is a star, and that the sun isn't. These are beliefs, not intuitions, but there are clearly related intuitions. Anyone who had these beliefs would have had the intuition that in a situation like this (here demonstrating the world) the object in the Mars position was a star, and the objects in the whale position were fish. The empirical errors in the person's beliefs will correlate to conceptual errors in their intuitions. To see further that the kind of error being made here is conceptual not empirical, and hence the kind of error that occurs in intuition, note that we need not have learned anything new about whales, the sun or Mars to come to our modern beliefs. (In fact we did, but that's a different matter.) Rather, we need only have learned something about the vast bulk of the objects that are fish, or stars, to realise that these objects had been wrongly categorised. The factor we had thought to be the most salient similarity to the cases grouped under the term, being a heavenly body visible in the night sky for 'star', living in water for 'fish', turned out not to be the most important similarity between most things grouped under that term. So there is an important sense in which saying whales are fish, or that the sun is not a star, may reveal a conceptual (rather than an empirical) error.

There seems to be a link between these two kinds of conceptual error. The reason we say that the Rylean intuitions, or more generally the intuitions of what Grice (1989, Ch. 1) called the Type-A philosophers, are mistaken is that the rival, Gricean, theory attaches to each word a relatively natural property. There is no natural property that actions satisfy when, and only when, we ordinarily describe them as voluntary. There is a natural property that covers all these cases, and other more mundane actions like scratching one's head, and that is the property we now think is denoted by 'voluntary'. This notion of naturalness, and the associated drive for systematicity in our philosophical and semantic theories, will play an important role in what follows.

3. The myriad examples in Unger (1996) are rather useful for reminding us just how unreliable our moral intuitions are, and how necessary it is to employ reflection and considered judgement in regimenting such intuitions.

2 Correcting Mistakes

The following would be a bad defence of the JTB theory against counterexamples. We can tell that all counterexamples to the JTB theory are based on mistaken intuitions, because the JTB theory is true, so all counterexamples to it are false. Unless we have some support for the crucial premise that the JTB theory is true, this argument is rather weak. And that support should be enough not only to make the theory prima facie plausible, but to make it so convincing that we are prepared to trust it rather than our judgements about Gettier cases.

In short, the true theory of knowledge is the one that does best at (a) accounting for as many as possible of our intuitions about knowledge while (b) remaining systematic. A 'theory' that simply lists our intuitions is no theory at all, so condition (b) is vital. And it is condition (b), when fully expressed, that will do most of the work in justifying the preservation of the JTB theory in the face of the counterexamples.

The idea that our theory should be systematic is accepted across a wide range of philosophical disciplines. This idea seems to be behind the following plausible claims by Michael Smith: "Not only is it a platitude that rightness is a property that we can discover to be instantiated by engaging in rational argument, it is also a platitude that such arguments have a characteristic coherentist form." (1994: 40) The second so-called platitude just points out that it is a standard way of arguing in ethics to say: you think we should do X in circumstances C1, circumstances C2 are just like C1, so we should do X in C2. The first points out that not only is this standard, it can yield surprising ethical knowledge. But this is only plausible if it is more important that final ethics is systematic than that first ethics, the ethical view delivered by intuition, is correct. In other words, it is only plausible if ethical intuitions are classified as mistaken to the extent that they conflict with the most systematic plausible theory. So, for example, it would be good news for utilitarianism if there were no plausible rival with any reasonable degree of systematicity.

This idea also seems to do important work in logic. If we just listed intuitions about entailment, we would have a theory on which disjunctive syllogism (A and ˜A ∨ B entail B) is valid, while ex falso quodlibet (A and ˜A entail B) is not. Such a theory is unsystematic because no concept of entailment that satisfies these two intuitions will satisfy a generalised transitivity requirement: that if C and D entail E, and F entails D, then C and F entail E. (This last step assumes that ˜A entails ˜A ∨ B, but that is rarely denied.) Now one can claim that a theory of entailment that gives up this kind of transitivity can still be systematic enough, and Neil Tennant (1992) does exactly this, but it is clear that we have a serious cost for the theory here, and many people think avoiding this cost is more important than preserving all intuitions.
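The clash just described can be made fully explicit. The display below is my reconstruction of the instantiation the paragraph sketches, writing ¬ for the ˜ used in the text:

```latex
% Instantiate generalised transitivity with
% $C = A$, \; $D = \neg A \vee B$, \; $E = B$, \; $F = \neg A$:
A,\ \neg A \vee B \vdash B
  \quad\text{(disjunctive syllogism)} \\
\neg A \vdash \neg A \vee B
  \quad\text{($\vee$-introduction, the rarely denied step)} \\
\therefore\ A,\ \neg A \vdash B
  \quad\text{(by transitivity: ex falso quodlibet)}
```

So anyone who keeps both intuitions must give up either ∨-introduction or this form of transitivity, which is the cost Tennant accepts.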

In more detail, there are four criteria by which we can judge a philosophical theory. First, counterexamples to a theory count against it. While a theory can be reformist, it cannot be revolutionary. A theory that disagreed with virtually all intuitions about possible cases is, for that reason, false. The theory that X knows that p iff X exists and p is true is systematic, but hardly plausible. As a corollary, while intuitions about any particular possible case can be mistaken, not too many of them could be. Counterexamples are problematic for a theory, and the fewer reforms needed the better; it's just that they are not fatal. Importantly, not all counterexamples are as damaging to a theory as others. Intuitions come in various degrees of strength, and theories that violate weaker intuitions are not as badly off as those that violate stronger intuitions. Many people accept that the more obscure or fantastic a counterexample is, the less damaging it is to a theory. This seems to be behind the occasional claim that certain cases are "spoils to the victor" – the idea is that the case is so obscure or fantastic that we should let theory rather than intuition be our guide. Finally, if we can explain why we have the mistaken intuition, that counts for a lot in reducing the damage the counterexample does. Grice did not just assert that the theory on which an ordinary head scratch was voluntary was more systematic than the theory of voluntariness Ryle proposed; he provided an explanation of why it might seem that his theory was wrong in certain cases.

Secondly, the analyses must not have too many theoretical consequences which are unacceptable. Consider Kahneman and Tversky's account of how agents actually make decisions, prospect theory, as an analysis of 'good decision'. (Disclaimer: this is not how Kahneman and Tversky intend it.) So the analysis of 'good decision' is 'decision authorised by prospect theory'. It is a consequence of prospect theory that which decision is "best" depends on which outcome is considered to be the neutral point. In practice this is determined by contextual factors. Redescribing a story to make different points neutral, which can be done by changing the context, licences different decisions. I take it this would be unacceptable in an analysis of 'good decision', even though it means the theory gives intuitively correct results in more possible cases than its Bayesian rivals.4 In general, we want our normative theories to eliminate arbitrariness as much as possible, and this is usually taken to be more important than agreeing with our pre-theoretic intuitions about particular cases. Unger uses a similar argument in Living High and Letting Die to argue against the reliance on intuitions about particular cases in ethics. We have differing ethical intuitions towards particular cases that differ only in the conspicuousness of the suffering caused (or not prevented), we know that conspicuousness is not a morally salient difference, so we should stop trusting the particular intuitions. (Presumably this is part of the reason that we find Tennant's theory of entailment so incredible, prima facie. It is not just that violating transitivity seems unsystematic, it is that we have a theoretical intuition that transitivity should be maintained.)

Thirdly, the concept so analysed should be theoretically significant, and should be analysed in other theoretically significant terms. This is why we now analyse 'fish' in such a way that whales aren't fish, and 'star' in such a way that the sun is a star. This is not just an empirical fact about our language. Adopting such a constraint on categories is a precondition of building a serious classificatory scheme, so it is a constraint on languages, which are classificatory schemes par excellence. Even if I'm wrong about this, the fact that we do reform our language with the advance of science to make our predicates refer to theoretically more significant properties shows that we have a commitment to this restriction.

4. A point very similar to this is made in Horowitz (1998).

Finally, the analysis must be simple. This is an important part of why we don't accept Ryle's analysis of 'voluntary'. His analysis can explain all the intuitive data, even without recourse to Gricean implicature, and arguably it doesn't do much worse than the Gricean explanation on the second and third tests. But Grice's theory can explain away the intuitions that it violates, and importantly it does so merely with the aid of theories of pragmatics that should be accepted for independent reasons, and it is simpler, so it trumps Ryle's theory.

My main claim is that even once we have accepted that the JTB theory seems to say the wrong thing about Gettier cases, we should still keep an open mind about the question of whether it is true. The right theory of knowledge, the one that attributes the correct meaning to the word 'knows', will do best on balance at these four tests. Granted that the JTB theory does badly on test one, it seems to do better than its rivals on tests two, three and four, and this may be enough to make it correct.
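For definiteness, the analysis being weighed against these four tests can be displayed as a biconditional. The symbolisation below is just the standard gloss on 'justified true belief', not an addition to the theory; K, B and J are my labels for the knowledge, belief and justification relations:

```latex
% K(S,p): S knows that p;  B(S,p): S believes that p;
% J(S,p): S's belief that p is justified.
K(S,p) \;\iff\; p \,\wedge\, B(S,p) \,\wedge\, J(S,p)
% A Gettier case is one where the right-hand side holds while,
% intuitively, K(S,p) fails -- the clash the four tests must adjudicate.
```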

3 Naturalness in a theory of meaning

Let's say I have convinced you that it would be better to use 'knows' in such a way that we all now assent to "She knows" whenever the subject of that pronoun truly, justifiably, believes. You may have been convinced that only by doing this will our term pick out a natural relation, and there is evident utility in having our words pick out relations that carve nature at something like its joints. Only in that way, you may concede, will our language be a decent classificatory scheme of the kind described above, and it is a very good thing to have one's language be a decent classificatory scheme. I have implicitly claimed above that if you concede this you should agree that I will have thereby corrected a mistake in your usage. But, an objector may argue, it is much more plausible to say that in doing so I simply changed the meaning of 'knows' and its cognates in your idiolect. The meaning of your words is constituted by your responses to cases like Gettier cases, so when I convince you to change your response, I change the meaning of your words.

This objection relies on a faulty theory of meaning, one that equates meaning with use in a way which is quite implausible. If this objection were right, it would imply infallibilism about knowledge ascriptions. Still, the objection does point to a rather important point. There is an implicit folk theory of the meaning of 'knows', one according to which it does not denote justified true belief. I claim this folk theory is mistaken. It is odd to say that we can all be mistaken about the meanings of our words; it is odd to say that we can't make errors in word usage. I think the latter is the greater oddity, largely because I have a theory which explains how we can all make mistakes about meanings in our own language.

How can we make such mistakes? The short answer is that meanings ain't in the head. The long answer turns on the kind of tests on analyses I discussed in section two. The meaning of a predicate is a property in the sense described by Lewis (1983c):5 a set, or class, or plurality of possibilia. (That is, in general the meaning of a predicate is its intension.6) The interesting question is determining which property it is. In assigning a property to a predicate, there are two criteria we would like to follow. The first is that it validates as many as possible of our pre-theoretic beliefs. The second is that it is, in some sense, simple and theoretically important. How to make sense of this notion of simplicity is a rather complex matter. Lewis canvasses the idea that there is a primitive 'naturalness' of properties which measures simplicity and theoretical significance,7 and I will adopt this idea. Space restrictions prevent me going into greater detail concerning 'naturalness', but if something more definite is wanted, for the record I mean by it here just what Lewis means by it in the works previously cited.8

5. The theory of meaning outlined here is deeply indebted to Lewis (1983c, 1984b, 1992).

So, recapitulating what I said in section two, for any predicate t and property F,<br />

we want F meet two requirements before we say it is the meaning of t. We want this<br />

meaning assignment to validate many of our pre-theoretic intuitions (this is what<br />

we test for in tests one and two) and we want F to be reasonably natural (this is<br />

what we test for in tests three and four). In hard cases, these requirements pull in<br />

opposite directions; the meaning of t is the property which on balance does best.<br />

Saying ‘knows’ means ‘justifiably truly believes’ does not do particularly well on the<br />

first requirement. Gettier isolated a large class of cases where it goes wrong. But<br />

it does very well on the second, as it analyses knowledge in terms of a short list of<br />

simple and significant features. I claim that all its rivals don’t do considerably better<br />

on the first, and arguably do much worse on the second. (There are considerations<br />

pulling either way here, as I note in section seven, but it is prima facie plausible that<br />

it does very well on the second, which is all that we consider for now.) That the JTB<br />

theory is the best trade-off is still a live possibility, even considering Gettier cases.<br />
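The trade-off described in this paragraph can be pictured as a toy scoring exercise. The following Python sketch is purely illustrative: the candidate analyses, the numerical scores, and the equal weighting are my invented assumptions, not anything argued for in the text. It just makes vivid how an analysis that loses on intuition-fit can still win on balance:<br />

```python
# Toy model of the 'balance' view: the meaning of a term is the candidate
# that does best on a weighted combination of (a) fit with pre-theoretic
# intuitions and (b) naturalness. All numbers are invented for illustration.
candidates = {
    "justified true belief": {"fit": 0.80, "naturalness": 0.90},
    "JTB + no false lemmas": {"fit": 0.88, "naturalness": 0.55},
    "safe true belief":      {"fit": 0.85, "naturalness": 0.60},
}

def balance(scores, weight=0.5):
    # weight trades off intuition-fit against naturalness
    return weight * scores["fit"] + (1 - weight) * scores["naturalness"]

best = max(candidates, key=lambda name: balance(candidates[name]))
print(best)
```

On these made-up numbers the JTB analysis wins despite scoring worst on intuition-fit, which is the shape of the possibility the paragraph leaves open.<br />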

This little argument will be perfectly useless unless this theory of meaning (owing in all<br />

its essential features to Lewis) is roughly right. There are several reasons for believing<br />

it. First, it can account for the possibility of mistaken intuitions, while still denying<br />

the possibility that intuitions about meaning can be systematically and radically<br />

mistaken. This alone is a nice consequence, and not one which is shared by every<br />

theory of meaning on the market. Secondly, as was shown in sections one and two, it<br />

seems to make the right kinds of predictions about when meaning will diverge from<br />

intuitions about meaning.<br />

Thirdly, it can account for the fact that some, but not all, disagreements about the<br />

acceptability of assertions are disputes about matters of fact, not matters of meaning.<br />

This example is from Cummins: “If a child, asked to use ‘fair’ in a sentence, says, “It<br />

isn’t fair for girls to get as much as boys,” we should suspect the child’s politics, not his<br />

language” (1998, 120). This seems right; but if the child had said “It is fair that dreams<br />

are purple”, we would suspect his language. Perhaps by ‘fair’ he means ‘nonsensical’<br />

or something similar. A theory of meaning needs to account for this divergence, and<br />

6 There are tricky questions concerning cointensional predicates, but these have fairly familiar solutions,<br />

which I accept. For ease of expression here I will ignore the distinction between properties and<br />

relations – presumably ‘knows’ denotes a relation, that is, a set of ordered pairs.<br />

7 ‘Measures’ may be inappropriate here. Plausibly a property is simple because it is natural.<br />

8 For more recent applications of naturalness in Lewis’s work, see Langton and Lewis (1998, 2001) and<br />

Lewis (2001a).



for the fact that it is a vague matter when we say the problem is with the child’s language,<br />

and when with his politics. In short, saying which disputes are disputes about<br />

facts (or values or whatever), and which about meanings, is a compulsory question<br />

for a theory of meaning.<br />

The balance theory of meaning I am promoting can do this, as the following<br />

demonstration shows. This theory of meaning is determinedly individualistic. Every<br />

person has an idiolect determined by her dispositions to apply terms; a shared<br />

language is a collection of closely-enough overlapping idiolects. So the child’s idiolect<br />

might differ from ours, especially if he uses ‘fair’ to mean ‘nonsensical’. But if<br />

the idiolect differs in just how a few sentences are used, it is likely that the meaning<br />

postulate which does best, according to our two criteria, at capturing his dispositions to<br />

use is the same as the meaning postulate which does best at capturing our dispositions<br />

to use. The reason is that highly natural properties are pretty thin on the<br />

ground; one’s dispositions to use a term have to change quite a lot before they get into<br />

the orbit of a distinct natural property. So despite the fact that I allow for nothing<br />

more than overlapping idiolects, in practice the overlap is much closer to being exact<br />

than on most ‘overlapping idiolect’ theories.<br />

With this, I can now distinguish which disputes are disputes about facts, and<br />

which are disputes about meaning. Given that there is a dispute, the parties must<br />

have different dispositions to use some important term. In some disputes, the same<br />

meaning postulate does best on balance at capturing the dispositions of each party.<br />

I say that here the parties mean the same thing by their words, and the dispute is a<br />

dispute about facts. In others, the difference will be so great that different meaning<br />

postulates do best at capturing the dispositions of the competing parties. In these<br />

cases, I say the dispute is a dispute about meaning.<br />

Now, I can explain the intuition that the JTB theorist means something different<br />

to the rest of us by ‘knows’. That is, I can explain this intuition away. It seems<br />

a fair assumption that the reasonably natural properties will be evenly distributed<br />

throughout the space of possible linguistic dispositions. If this is right, then any<br />

change of usage beyond a certain magnitude will, on my theory, count as a change of<br />

meaning. And it is plausible to suppose the change I am urging to our usage, affirming<br />

rather than denying sentences like “Smith knows Jones owns a Ford”, is beyond that<br />

certain magnitude. But the assumption of even distribution of the reasonably natural<br />

properties is false. That, I claim, is what the failure of the ‘analysis of knowledge’<br />

merry-go-round to stop shows us. There are just no reasonably natural properties in<br />

the neighbourhood of our disposition to use ‘knows’. If this is right, then even some<br />

quite significant changes to usage will not be changes in meaning, because they will<br />

not change which is the closest reasonably natural property to our usage pattern. The<br />

assumption that the reasonably natural properties are reasonably evenly distributed<br />

is plausible, but false. Hence the hunch that I am trying to change the meaning of<br />

‘knows’ is plausible, but false.<br />

The hypothesis that when we alter intuitions because of a theory we always<br />

change meanings, on the other hand, is not even plausible. When the ancients said<br />

“Whales are fish”, or “The sun is not a star”, they simply said false sentences. That<br />

is, they said that whales are fish, and believed that the sun is not a star. This seems



platitudinous, but the ‘use-change implies meaning-change’ hypothesis would deny<br />

it.<br />

It has sometimes been suggested to me that conceptual intuitions should be given<br />

greater privilege than other intuitions; that I am wrong to generalise from the massive<br />

fallibility of logical, ethical or semantic intuitions to the massive fallibility of<br />

conceptual intuitions. Since I am on much firmer ground when talking about these<br />

non-conceptual cases, if such an attack were justified it would severely weaken my<br />

argument. Given what has been said so far we should be able to see what is wrong<br />

with this suggestion. Consider a group of people who systematically assent to “If<br />

A then B implies if B then A.” On this view these people are expressing a mistaken<br />

logical intuition, but a correct conceptual intuition. So their concept of ‘implication’<br />

doesn’t pick out implication, or at the very least doesn’t pick out our concept of ‘implication’.<br />

Now if we are in that group, this summary becomes incoherent, so this<br />

position immediately implies that we can’t be mistaken about our logical intuitions.<br />

Further, we are no longer able to say that when these people say “If A then B implies<br />

if B then A,” they are saying something false, because given the reference of ‘implies’<br />

in their idiolect, this sentence expresses a true proposition. This is odd, but odder is<br />

to come. Assuming again we are in this group, it turns out to be vitally important in<br />

debates concerning philosophical logic to decide whether we are engaging in logical<br />

analysis or conceptual analysis. It might turn out that a correct piece of conceptual analysis<br />

of ‘implication’ picks out a different relation from the correct implication relation we<br />

derive from purely logical considerations. If logical intuitions are less reliable than<br />

conceptual intuitions, as proposed, and assent to sentences like “If A then B implies if<br />

B then A” reveals simultaneously a logical and a conceptual intuition, this untenable<br />

conclusion seems forced. I conclude that conceptual intuitions are continuous with<br />

other intuitions, and should be treated in a similar way.<br />

4 Keeping Conceptual Analysis<br />

The following would be a bad way to respond to the worry that the JTB theory<br />

amounts to a change in the meaning of the word ‘knows’. For the worry to have any<br />

bite, facts about the meaning of ‘knows’ will have to be explicable in terms of facts<br />

about the use of ‘knows’. But facts about use can only tell us about the beliefs of this<br />

community about knowledge, not what knowledge really is. Since different communities<br />

adopt different standards for knowledge, we should only trust ours over theirs<br />

if (a) we have special evidence that ours is correct or (b) we are so xenophobic that we<br />

trust ours simply because it is ours. “Many of us care very much whether our cognitive<br />

processes lead to beliefs that are true, or give us power over nature, or lead to happiness.<br />

But only those with a deep and free-floating conservatism in matters epistemic<br />

will care whether their cognitive processes are sanctioned by the evaluative standards<br />

that happen to be woven into our language” (Stich, 1988, 109). “The intuitions and<br />

tacit knowledge of the man or woman in the street are quite irrelevant. The theory<br />

seeks to say what [knowledge] really is, not what folk [epistemology] takes it to be”



(Stich, 1992, 252). 9 Facts about use can only give us the latter, so they are not what<br />

are relevant to my inquiry.<br />

Stich takes this to be a general reason for abandoning conceptual analysis. Now<br />

while I think, and have argued above, that conceptual analysis need not slavishly<br />

follow intuition, I do not think that we should abandon it altogether. Stich’s worry<br />

seems to be that conceptual analysis can only tell us about our words, not about our<br />

world. But is this kind of worry coherent? Can we say what will be found when we<br />

get to this real knowledge about the world? Will we be saying, “This belief of Smith’s<br />

shouldn’t be called knowledge, but really it is”? We need to attend to facts about the<br />

meaning of ‘knows’ in order to define the target of our search. If not, we have no way<br />

to avoid incoherencies like this one.<br />

To put the same point another way, when someone claims to find this deep truth<br />

about knowledge, why should anyone else care? She will say, “Smith really knows<br />

that Jones owns a Ford, but I don’t mean what everyone else means by ‘knows’.”<br />

Why is this any more interesting than saying, “Smith really is a grapefruit, but I<br />

don’t mean what everyone else means by ‘grapefruit’”? If she doesn’t use words in<br />

the way that we do, we can ignore what she says about our common word usage.<br />

Or at least we can ignore it until she (or one of her colleagues) provides us with a<br />

translation manual. But to produce a translation manual, or to use words the way we<br />

do, she needs to attend to facts about our meanings. Again, incoherence threatens if<br />

she doesn’t attend to these facts but claims nevertheless to be participating in a debate<br />

with us. These points are all to be found in Chapter 2 of Jackson (1998).<br />

An underlying assumption of the first reply is that there is a hard division between<br />

facts about meaning and facts about the world at large; that a principle like: No ‘is’<br />

from a ‘means’ holds. This principle is, however, mistaken. All instances of the<br />

following argument pattern, where t ranges over tokenings of referring terms, are<br />

valid.<br />

P1: t refers unequivocally to α.<br />

P2: t refers unequivocally to β.<br />

C: α = β<br />

For example, from the premise that ‘POTUS’ refers unequivocally to the President<br />

of the United States, and the premise that ‘POTUS’ refers unequivocally to Bush,<br />

we can validly infer that Bush is President of the United States. Since P1 and P2<br />

are facts about meaning, and C is a fact about the world, any principle like No ‘is’<br />

from a ‘means’ must be mistaken. So this worry about how much we can learn from<br />

conceptual analysis, from considerations of meaning, is mistaken.<br />
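The validity claim just made can be checked mechanically. Here is a toy formalisation in Lean; the `refersTo` predicate and the gloss of ‘unequivocally’ as ‘refers to at most one object’ are my modelling assumptions, not anything in the text, but given them the R-inference is a one-line theorem:<br />

```lean
-- A toy formalisation of the R-inference. `refersTo` is a hypothetical
-- stand-in for the reference relation; nothing here is from the paper.
variable {Term Object : Type}
variable (refersTo : Term → Object → Prop)

-- A token refers 'unequivocally' if it refers to at most one object.
def Unequivocal (t : Term) : Prop :=
  ∀ x y, refersTo t x → refersTo t y → x = y

-- P1, P2 ⊢ C: if t refers unequivocally to α and to β, then α = β.
theorem r_inference {t : Term} {α β : Object}
    (h : Unequivocal refersTo t)
    (p1 : refersTo t α) (p2 : refersTo t β) : α = β :=
  h α β p1 p2
```

On this modelling, denying the R-inference requires denying that reference is ever functional in this sense, which is one way of putting the point in the text.<br />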

I call this inference pattern the R-inference. That the R-inference is valid doesn’t<br />

just show Stich’s critique rests on the false assumption No ‘is’ from a ‘means’. It can<br />

9 The paper from which this quote is drawn is about the content of mental states, so originally it had<br />

‘mental representation’ for ‘knowledge’ and ‘psychology’ for ‘epistemology’. But I take it that (a) this isn’t<br />

an unfair representation of Stich’s views and (b) even if it is, it is an admirably clear statement of the way<br />

many people feel about the use of intuitions about possible cases, and worth considering for that reason<br />

alone.



be used to provide a direct response to his critique. The problem is meant to be<br />

that conceptual analysis, the method of counterexamples, can at best provide us with<br />

claims like: ‘knows’ refers to the relation justifiably truly believes. We want to know<br />

facts about knowledge, not about the term ‘knows’, so the conceptual analyst seems<br />

to have been looking in the wrong place. But it is a platitude that ‘knows’ refers to the<br />

relation knows. I call such platitudes, that ‘t’ refers to t, instances of the R-schema. 10<br />

We can use the R-schema together with the R-inference to get the kind of conclusion<br />

our opponents are looking for.<br />

P1: ‘Knowledge’ refers unequivocally to the relation justifiably truly believes.<br />

P2: ‘Knowledge’ refers unequivocally to the relation knows.<br />

C: The relation knows is the relation justifiably truly believes.<br />

More colloquially, the conclusion says that knowledge is justified true belief. Everyone<br />

agrees (I take it) that conceptual analysis could, in principle, give us knowledge<br />

of facts of the form of P1. So the opponents of conceptual analysis must either deny<br />

P2, or deny that C follows from P1 and P2. In other words, for any such argument<br />

they must deny that the R-schema is true, or that the R-inference is valid. I hope the<br />

reader will agree that neither option looks promising.<br />

5 Against the Psychologists<br />

Someone excessively impressed by various results in the psychological study of concepts<br />

may make the following objection to the theory of meaning here proffered.<br />

“Why think that we should prefer short lists of necessary and sufficient conditions?<br />

This seems like another one of those cases where philosophers take their aesthetic<br />

preferences to be truth-indicative, much like the ‘taste for desert landscapes’ argument.<br />

Besides, haven’t psychologists like Eleanor Rosch shown that our concepts<br />

don’t have simple necessary and sufficient conditions? If that’s right, your argument<br />

falls down in several different places.”<br />

Strictly speaking, my preference is not just for short lists of necessary and sufficient<br />

conditions. But it is, for reasons set out more fully in the next section, for short<br />

theories that fit the meaning of some term into a network of other properties. And<br />

my argument would fall down if there was no reason to prefer such short theories.<br />

And, of course, short lists of necessary and sufficient conditions are paradigmatically<br />

short theories. One reason I prefer the JTB analysis to its modern rivals is its brevity.<br />

Some of the reasons for preferring short lists are brought out by considering the objections<br />

to this approach developed by psychologists. I’ll just focus on one of the<br />

experiments performed by Rosch and Mervis; the points I make can be generalised.<br />

Rosch and Mervis (1975) claim that “subjects rate superordinate semantic categories<br />

as having few, if any, attributes common to all members.” (p. 20) (A superordinate<br />

semantic category is one, like ‘fruit’, which has other categories, like ‘apple’,<br />

10 Horwich (1999, 115-130) discusses similar schemas, noting that instances involving words in foreign<br />

languages, or indexical expressions, will not be platitudinous. He also notes a way to remove the presumption<br />

that there is such a thing as knowledge, by stating the schema as ∀x (‘knowledge’ refers to x iff<br />

knowledge = x). For ease of expression I will stick with the simpler formulation in the text.



‘pear’ and ‘banana’, as sub-categories.) Here’s the experiment they ran to show this.<br />

For each of six superordinate categories (‘furniture’, ‘fruit’, ‘weapon’, ‘vegetable’, ‘vehicle’<br />

and ‘clothing’) they selected twenty category members. So for ‘fruit’ the members<br />

ranged from ‘orange’ and ‘apple’ to ‘tomato’ and ‘olive’. They then asked a range<br />

of subjects to list the attributes they associated with some of these 120 category members.<br />

Each subject was presented with six members, one from each category, and for<br />

each member had a minute and a half to write down its salient attributes.<br />

[F]ew attributes were given that were true of all twenty members of the<br />

category – for four of the categories there was only one such item; for<br />

two of the categories, none. Furthermore, the single attribute that did<br />

apply to all members, in three cases was true of many items besides those<br />

within that superordinate (for example, “you eat it” for fruit). Rosch and<br />

Mervis (1975)<br />

They go on to conclude that the superordinate is not defined by necessary and sufficient<br />

conditions, but by a ‘family resemblance’ between members. This particular<br />

experiment was taken to confirm that the number of attributes a member has with<br />

other members of the category is correlated with a previously defined measure of prototypicality. 11<br />

They claim that the intuition, commonly held amongst philosophers,<br />

that there must be some attribute in common to all the members, is explicable by the<br />

fact that the highly prototypical members of the category all do share quite a few attributes<br />

in common, ranging from 3 attributes in common for the highly prototypical<br />

vegetables, to 36 for the highly prototypical vehicles.<br />
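Rosch and Mervis's tabulation amounts to intersecting attribute lists across category members. The sketch below uses invented attribute sets (my illustrative assumptions, not their data) to show how quickly the common core shrinks toward the single ‘you eat it’-style survivor they report:<br />

```python
from functools import reduce

# Invented attribute lists for four 'fruit' category members; these are
# illustrative stand-ins, not Rosch and Mervis's actual materials.
attributes = {
    "orange": {"you eat it", "sweet", "round", "has seeds", "juicy"},
    "apple":  {"you eat it", "sweet", "round", "has seeds", "grows on trees"},
    "tomato": {"you eat it", "round", "has seeds", "red", "savoury"},
    "olive":  {"you eat it", "has seeds", "salty", "small"},
}

# Attributes true of every listed member: the analogue of the lone
# surviving attribute ("you eat it") in the experiment.
common = reduce(set.intersection, attributes.values())
print(sorted(common))
```

With these made-up lists only two attributes survive four members; adding the remaining sixteen members would, as in the experiment, leave little or nothing.<br />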

One occasionally hears people deride the assumption that there are necessary and<br />

sufficient conditions for the application of a term, as if this was the most preposterous<br />

piece of philosophy possible. Really, this assumption is no more than the assumption<br />

that dictionaries can be written, and without any reason to think otherwise, seems<br />

perfectly harmless. Perhaps, though, the Rosch and Mervis experiments provide a<br />

reason to think otherwise, a reason for thinking that the conditions of applicability<br />

for terms like ‘fruit’, ‘weapon’, and perhaps ‘knowledge’ are Wittgensteinian family<br />

resemblance conditions, rather than short lists of necessary and sufficient conditions,<br />

the kinds of conditions that fill traditional dictionaries.<br />

When we look closely, we see that the experiments do not show this at all. One<br />

could try and knock any such argument away by claiming the proposal is incoherent.<br />

The psychologists claim that there are no necessary and sufficient conditions<br />

for being a weapon, but something is a weapon iff it bears a suitable resemblance to<br />

paradigmatic weapons. In one sense, bearing a suitable resemblance to a paradigmatic<br />

weapon is a condition, so it looks like we just have a very short list of necessary and<br />

sufficient conditions, a list of length one. Jackson (1998, 61) makes a similar point in<br />

response to Stich’s invocation of Rosch’s experiments. This feels like it’s cheating, so<br />

I’ll move onto other objections. I’ll explain below just why it feels like cheating.<br />

11 In previous work they had done some nice experiments aimed at getting a grip on our intuition that<br />

apples are more prototypical exemplars of fruit than olives are.



Philosophers aren’t particularly interested in terms like ‘weapon’, so these experiments<br />

only have philosophical interest if the results can be shown to generalise<br />

to terms philosophers care about. In other words, if it can be shown that terms like<br />

‘property’, ‘justice’, ‘cause’ and particularly ‘knows’ are cluster concepts, or family<br />

resemblance terms. But there is a good reason to think this is false. As William<br />

Ramsey (1998) notes, if F refers to a cluster concept, then for any proposed list of<br />

necessary and sufficient properties for F-hood, it should be easy to find an individual<br />

which is an F but which lacks some of these properties. To generate such an example,<br />

just find an individual which lacks one of the proposed properties, but which has<br />

several other properties from the cluster. It should be harder to find an individual<br />

which has the properties without being an F. If the proposed analysis is even close to<br />

being right, then having these conditions will entail having enough of the cluster of<br />

properties that are constitutive of F-hood to be an F. Note, for example, that all of the<br />

counterexamples Wittgenstein (1953) lists to purported analyses of ‘game’ are cases<br />

where something is, intuitively, a game but does not satisfy the analysis. If<br />

game is really a cluster concept, this is how things should be. But it is not how things<br />

are with knowledge; virtually all counterexamples, from Gettier on, are cases which<br />

are intuitively not cases of knowledge, but which satisfy the proposed analysis. This<br />

is good evidence that even if some terms in English refer to cluster concepts, ‘knows’<br />

is not one of them.<br />
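Ramsey's asymmetry can be made concrete with a toy threshold model of a cluster concept. Everything numerical here is my assumption, not Ramsey's: suppose something is an F iff it has at least 4 of 6 cluster properties, and a proposed analysis lists 4 of the 6 as jointly necessary and sufficient. Enumerating all cases shows that ‘is an F but fails the analysis’ counterexamples abound, while ‘satisfies the analysis but is not an F’ counterexamples cannot exist:<br />

```python
from itertools import combinations

CLUSTER = {"p1", "p2", "p3", "p4", "p5", "p6"}  # cluster of F-properties
THRESHOLD = 4                                    # F iff it has at least 4 of them
ANALYSIS = {"p1", "p2", "p3", "p4"}              # a proposed 4-condition analysis

def is_F(props):
    return len(props & CLUSTER) >= THRESHOLD

def satisfies_analysis(props):
    return ANALYSIS <= props

subsets = [set(c) for r in range(len(CLUSTER) + 1)
           for c in combinations(sorted(CLUSTER), r)]

# Wittgenstein-style counterexamples: an F that fails the analysis.
f_but_fails = [s for s in subsets if is_F(s) and not satisfies_analysis(s)]
# Gettier-style counterexamples: satisfies the analysis but is not an F.
passes_but_not_f = [s for s in subsets if satisfies_analysis(s) and not is_F(s)]

print(len(f_but_fails), len(passes_but_not_f))  # many of the first, none of the second
```

If counterexamples to analyses of ‘knows’ were overwhelmingly of the first kind, that would fit the cluster picture; as the text notes, from Gettier on they are overwhelmingly of the second.<br />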

Secondly, Rosch and Mervis’s conclusions about the nature of the superordinate<br />

categories make some rather mundane facts quite inexplicable. In this experiment<br />

the subjects weren’t told which category each member was in, but for other categories<br />

they were. Imagine, as seems plausible, that one of the subjects objected to putting the<br />

member in that category. Many people, even undergraduates, don’t regard olives and<br />

tomatoes as fruit. (“Fruit on pasta? How absurd!”) When the student asks why<br />

this thing is called a fruit, other speakers can provide a response. It is not a brute fact of<br />

language that tomatoes are fruit. It is not just by magic that we happened to come to<br />

a shared meaning for fruit that includes tomatoes, and that if faced with a new kind<br />

of object, we would generally agree about whether it is a fruit. It is because we know<br />

how to answer such questions. This answer to the “Why is it called ‘fruit’?” question<br />

had better be a sufficient condition for fruitness. If not, the subject is entitled to ask<br />

why having that property makes it a fruit. And unless there are very many possible<br />

distinct answers to this question, which seems very improbable, there will be a short<br />

list of necessary and sufficient conditions for being a fruit. But the choice of ‘fruit’ as<br />

an example was relatively arbitrary, so there will be a short list of necessary and<br />

sufficient conditions for being an F, for pretty much any F.<br />

Thirdly, returning to ‘fruit’, we can see that Rosch and Mervis’s experiments<br />

could not possibly show that many superordinate predicates in English are cluster<br />

concepts. For they would, if successful, show that ‘fruit’ is a cluster concept, and<br />

it quite plainly is not. So by modus tollens, there is something wrong with their<br />

methodology. Some of the other categories they investigate, particularly ‘weapon’<br />

and ‘furniture’ might be relatively cluster-ish, in a sense to be explained soon, but not<br />

‘fruit’. As the OED says, a fruit is “the edible product of a tree, shrub or other plant,



consisting of the seed and its envelope.” If nothing like this is right, then we couldn’t<br />

explain to the sceptical why we call tomatoes, olives and so on fruit.<br />

So the conclusion that philosophically significant terms are likely to be cluster<br />

concepts is mistaken. To close, I note one way the cluster concept view could at least<br />

be coherent. Many predicates do have necessary and sufficient conditions for their<br />

applicability, just as traditional conceptual analysis assumed. In other words, they<br />

have analyses. However, any analysis must be in words, and sometimes the words<br />

needed will refer to quite recherché properties. The properties in the analysans may,<br />

that is, be significantly less natural than the analysandum.<br />

In some contexts, we only consider properties that are above a certain level of<br />

naturalness. If I claim two things, say my carpet and the Battle of Agincourt, have<br />

nothing in common, I will not feel threatened by an objector who points out that<br />

they share some gruesome, gerrymandered property, like being elements of {my carpet,<br />

the Battle of Agincourt}. Say that the best analysis of F-hood requires us to use<br />

predicates denoting properties which are below the contextually defined border between<br />

the ‘natural enough’ and ‘too gruesome to use’. Then there will be a sense in<br />

which there is no analysis of F into necessary and sufficient conditions; just the sense<br />

in which my carpet and the Battle of Agincourt have nothing in common. Jackson’s<br />

argument feels like a cheat because he just shows that there will be necessary and sufficient<br />

conditions for any concept provided we are allowed to use gruesome properties,<br />

but he makes it sound like this proviso is unnecessary. If Rosch and Mervis’s experiments<br />

show anything at all, it is that this is true of some common terms in some<br />

everyday-ish contexts. In particular, if we restrict our attention to the predicates that<br />

might occur to us within ninety seconds (which plausibly correlates well with some<br />

level of naturalness), very few terms have analyses. Thus far, Rosch and Mervis are<br />

correct. They go wrong by projecting truths of a particular context to all contexts.<br />

6 In Defence of Analysis<br />

In the previous section I argued that various empirical arguments gave us no reason<br />

to doubt that ‘knows’ will have a short analysis. In this section we look at various<br />

philosophical arguments to this conclusion. One might easily imagine the following<br />

objection to what has been claimed so far. At best, the above reasoning shows that<br />

if ‘knows’ has a short analysis, then the JTB analysis is correct, notwithstanding the<br />

intuitions provoked by Gettier cases. But there is little reason to think English terms<br />

have analyses, as evidenced by the failure of philosophers to analyse even one interesting<br />

term, and particular reasons to think that ‘knows’ does not have an analysis.<br />

These reasons are set out by Williamson (2000a, Ch. 3), who argues, by appeal to<br />

intuitions about a particular kind of case, that there can be no analysis of ‘knows’<br />

into independent clauses, one of which describes an internal state of the agent and<br />

the other of which describes an external state of the agent. This does not necessarily<br />

refute the JTB analysis, since the concepts of justification and belief in use may be<br />

neither internal nor external in Williamson’s sense. And if we are going to revise intuitions<br />

about the Gettier cases, we may wish to revise intuitions about Williamson’s<br />

cases as well, though here it is probably safest to not do this, because it is unclear just



what philosophical benefit is derived from this revision. In response to these arguments<br />

I will make two moves: one defensive and one offensive. The defensive move<br />

is to distinguish the assumptions made here about the structure of the meaning of<br />

‘knows’, and show how these assumptions do not have some of the dreadful consequences<br />

suggested by various authors. The offensive move, with which we begin, is<br />

to point out the rather unattractive consequences of not making these assumptions<br />

about the structure of the meaning of ‘knows’.<br />

In terms of the concept of naturalness used above, the relation denoted by ‘knows’<br />

might fall into one of three broad camps:<br />

(a) It might be rather unnatural;<br />

(b) It might be fairly natural in virtue of its relation to other, more natural, properties;<br />

or<br />

(c) It might be a primitive natural property, one that does not derive its naturalness<br />

from anything else.<br />

My preferred position is (b). I think that the word ‘knows’, like every other denoting<br />

term in English, denotes something fairly natural. And I don’t think there are any<br />

primitively natural properties or relations in the vicinity of the denotation of this<br />

word, so it must derive its naturalness from its relation to other properties or relations.<br />

If this is so, we can recover some of the structure of its meaning by elucidating<br />

those relationships. And that, if it is correct, is exactly what I think the JTB theory does.<br />

This is not to say that justification, truth or belief are themselves primitively natural<br />

properties, but rather that we can make some progress towards recovering the source<br />

of the naturalness of knowledge via its decomposition into justification, truth and<br />

belief. But before investigating the costs of (b), let us look at the costs of (a) and (c).<br />

I think we can dispense with (c) rather quickly. It would be surprising, to say<br />

the least, if knowledge was a primitive relation. That X knows that p can hardly be<br />

one of the foundational facts that make up the universe. If X knows that p, this fact<br />

obtains in virtue of the obtaining of other facts. We may not be able to tell exactly<br />

what these facts are in general, but we have fairly strong opinions about whether<br />

they obtain or not in a particular case. This is why we are prepared to say whether or<br />

not a character knows something in a story, perhaps a philosophical story, without<br />

being told exactly that. We see the facts in virtue of which the character does, or does<br />

not, know this. This does not conclusively show that knowledge is not a primitively<br />

natural property. Electrical charge presumably is a primitively natural property, yet<br />

sometimes we can figure out the charge of an object by the behaviour of other objects.<br />

For example, if we know it is repulsed by several different negatively charged things,<br />

it is probably negatively charged. But in these cases it is clear our inference is from<br />

some facts to other facts that are inductively implied, not to facts that are constituted<br />

by the facts we know. (Only a rather unreformed positivist would say that charge is<br />

constituted by repulsive behaviour.) And it does not at all feel that in philosophical<br />

examples we are inductively (or abductively) inferring whether the character knows<br />

that p.


What Good are Counterexamples? 18<br />

The more interesting question is whether (a) might be correct. This is, perhaps<br />

surprisingly, consistent with the theory of meaning advanced above. I held, following<br />

Lewis, that the meaning of a denoting term is the most natural object, property or<br />

relation that satisfies most of our usage dispositions. It is possible that the winner of<br />

this contest will itself be quite unnatural. This is what happens all the time with vague<br />

terms, and indeed it is what causes, or perhaps constitutes, their vagueness. None<br />

of the properties (or relations) that we may pick out by ‘blue’ is much more natural<br />

than several other properties (or relations) that would do roughly as well at capturing<br />

our usage dispositions, were they the denotation of ‘blue’. 12 And indeed none of<br />

these properties (or relations) are particularly natural; they are all rather arbitrary<br />

divisions of the spectrum. The situation is possibly worse when we consider what<br />

Theodore Sider (2001a) calls maximal properties. A property F is maximal iff things<br />

that massively overlap an F are not themselves an F. So being a coin is maximal, since<br />

large parts of a coin, or large parts of a coin fused with some nearby atoms outside the<br />

coin, are not themselves coins. Sider adopts the following useful notation: something<br />

is an F* iff it is suitable to be an F in every respect save that it may massively overlap<br />

an F. So a coin* is a piece of metal (or suitable substance) that is (roughly) coinshaped<br />

and is (more or less) the deliberate outcome of a process designed to produce<br />

legal tender. Assuming that any collection of atoms has a fusion, in the vicinity of<br />

any coin there will be literally trillions of coin*s. At most one of these will be a coin,<br />

since coins do not, in general, overlap. That is, the property being a coin must pick<br />

out exactly one of these coin*s. Since the selection will be ultimately arbitrary, this<br />

property is not very natural. There are just no natural properties in the area, so the<br />

denotation of ‘coin’ is just not natural.<br />

These kinds of considerations show that option (a) is a live possibility. But they do<br />

not show that it actually obtains. And there are several contrasts between ‘knows’,<br />

on the one hand, and ‘blue’ and ‘coin’ on the other, which suggest that it does not<br />

obtain. First, we do not take our word ‘knows’ to be as indeterminate as ‘blue’ or<br />

‘coin’, despite the existence of some rather strong grounds for indeterminacy in it.<br />

Secondly, we take apparent disputes between different users of the word ‘knows’ to<br />

be genuine disputes, ones in which at most one side is correct, which we do not<br />

necessarily do with ‘blue’ and ‘coin’. Finally, we are prepared to use the relation<br />

denoted by ‘knows’ in inductive arguments in ways that seem a little suspect with<br />

genuinely unnatural relations, as arguably evidenced by our attitudes towards ‘coin’<br />

and ‘blue’. Let’s look at these in more detail.<br />

If we insisted that the meaning of ‘knows’ must validate all of our dispositions<br />

to use the term, we would find that the word has no meaning. If we just look at<br />

intuitions, we will find that our intuitions about ‘knows’ are inconsistent with some<br />

simple known facts. (Beliefs, being regimented by reflection, might not be inconsistent,<br />

depending on how systematic the regimentation has been.) For example, the<br />

following all seem true to many people.<br />

12 I include the parenthetical comments here so as not to prejudge the question of whether colours are<br />

properties or relations. It seems unlikely to me that colours are relations, either to viewers or to environments,<br />

but it is not worth quibbling over this here.


(1) Knowledge supervenes on evidence: if two people (not necessarily in the same<br />

possible world) have the same evidence, they know the same things.<br />

(2) We know many things about the external world.<br />

(3) We have the same evidence as some people who are the victims of massive<br />

deception, and who have few true beliefs about their external world.<br />

(4) Whatever is known is true.<br />
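To make the inconsistency vivid, the argument can be regimented as a short derivation. The regimentation below is my own gloss, not part of the text:<br />

```latex
% Why (1)-(4) cannot all be true. Let $p$ be some external-world claim we
% take ourselves to know, and let $v$ be a massively deceived victim who
% has the same evidence we have.
\begin{enumerate}
  \item By (2), we know that $p$.
  \item By (3), $v$ has our evidence, and $v$'s belief that $p$ is false,
        being one of $v$'s many false beliefs about the external world.
  \item By (1), since $v$ has our evidence, $v$ knows whatever we know;
        in particular, $v$ knows that $p$.
  \item By (4), whatever $v$ knows is true, so $p$ is true in $v$'s
        situation, contradicting step 2.
\end{enumerate}
```

Dropping any one premise blocks the derivation, which is why any three of the claims yield an argument against the fourth.<br />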

These are inconsistent, so they cannot all be true. We could take any three of these as<br />

an argument for the negation of the fourth, though probably the argument from (1),<br />

(2) and (3) to the negation of (4) is less persuasive than the other three such arguments.<br />

I don’t want to adjudicate here which such argument is sound. All I want to claim<br />

here is that there is a fact of the matter about which of these arguments is sound,<br />

and hence about which of these four claims is false. If two people are disagreeing<br />

about which of these is false, at most one of them is right, and the other is wrong. If<br />

‘knows’ denoted a rather unnatural relation, there would be little reason to believe<br />

these things to be true. Perhaps by more carefully consulting intuitions we could<br />

determine that one of them is false by seeing that it had the weakest intuitive pull. If<br />

we couldn’t do this, it would follow that in general there was no fact of the matter<br />

about which is false, and if someone wanted to use ‘know’ in their idiolect so that a<br />

particular one of these is false, there would be no way we could argue that they were<br />

wrong. It is quite implausible that this is what should happen in such a situation. It<br />

is more plausible that the dispute should be decided by figuring out which group of<br />

three can be satisfied by a fairly natural relation. This, recall, is just how we resolve<br />

disputes in many other areas of philosophy, from logic to ethics. If there is no natural<br />

relation eligible to be the meaning of ‘knows’, then probably this dispute has no<br />

resolution, just like the dispute about what ‘mass’ means in Newtonian mechanics. 13<br />

The above case generalises quite widely. If one speaker says that a Gettier case is a<br />

case of knowledge and another denies this (as Stich assures us actually happens if we<br />

cast our linguistic net wide enough) we normally assume that one of them is making<br />

a mistake. But if ‘knows’ denotes something quite unnatural, then probably each is<br />

saying something true in her own idiolect. Each party may make other mistaken<br />

claims, for example that what they say is also true in the language of all their compatriots,<br />

but in just making these claims about knowledge they would not be making<br />

a mistake. Perhaps there really is no fact of the matter here about who is right, but<br />

thinking so would be a major change to our common way of viewing matters, and<br />

hence would be a rather costly consequence of accepting option (a). Note here the<br />

contrast with ‘blue’ and ‘coin’. If one person adopts an idiosyncratic usage of ‘blue’<br />

and ‘coin’, one on which there are determinate facts about matters where, we say,<br />

there are none, the most natural thing to say is that they are using the terms differently<br />

to us. If they insist that it is part of their intention in using the terms to speak<br />

the same way as their fellows we may (but only may) revise this judgement. But in<br />

13 Note that in that dispute the rivals are quite natural properties, but seem to be matched in their<br />

naturalness. In the dispute envisaged here, the rivals are quite unnatural, but still seem to be matched. For<br />

more on ‘mass’, see Field (1973).


general there is much more inclination to say that a dispute over whether, say, a patch<br />

is blue is merely verbal than to say this about a dispute over whether X knows that p.<br />

Finally, if knowledge was a completely unnatural relation, we would no more<br />

expect it to play a role in inductive or analogical arguments than does grue, but it<br />

seems it can play such a role. One might worry here that blueness also plays a role<br />

in inductive arguments, as in: The sky has been blue the last n days, so probably it<br />

will be blue tomorrow. If blueness is not natural, this might show that unnatural<br />

properties can play a role in inductive arguments. But what is really happening here<br />

is that there is, implicitly, an inductive argument based on a much narrower colour<br />

spectrum, and hence a much more natural property. To see this, note that we would<br />

be just as surprised tomorrow if the sky was navy blue, or perhaps of the dominant<br />

blue in Picasso’s blue period paintings, as if it were not blue at all.<br />

So there are substantial costs to (a) and (c). Are there similar costs to (b)? If we<br />

take (b) to mean that there is a decomposition of the meaning of ‘knows’ into conditions,<br />

expressible in English, which we can tell a priori are individually necessary and<br />

jointly sufficient for knowledge, and such that it is also a priori that they represent<br />

natural properties, then (b) would be wildly implausible. To take just one part of<br />

this, Williamson (2000a) notes it is clear that there are some languages in which such<br />

conditions cannot be expressed, so perhaps English is such a language too. And if this<br />

argument for ‘knows’ works it presumably works for other terms, like ‘pain’, but it<br />

is hard to find such an a priori decomposition of ‘pain’ into more natural properties.<br />

Really, all (b) requires is that there be some connection, perhaps only discoverable a<br />

posteriori, perhaps not even humanly comprehensible, between knowledge and other<br />

more primitively natural properties. These properties need not be denoted by any<br />

terms of English, or any other known language.<br />

Most importantly, this connection need not be a decomposition. If knowledge<br />

is the most general factive mental state, as Williamson proposes, and being factive<br />

and being a mental state are natural properties, then condition (b) will be thereby<br />

satisfied. If knowledge is the norm of assertion, as Williamson also proposes, then<br />

that could do as the means by which knowledge is linked into the network of natural<br />

properties. This last assumes that being an assertion is a natural property, and more<br />

dangerously that norms are natural, but these are relatively plausible assumptions in<br />

general. In neither case do we have a factorisation, in any sense, of knowledge into<br />

constituent properties, but we do have, as (b) requires, a means by which knowledge<br />

is linked into the network of natural properties. It is quite plausible that for every<br />

term which, unlike ‘blue’ and ‘coin’, is not excessively vague and does not denote a<br />

maximal property, something like (b) is correct. Given the clarifications made here<br />

to (b), this is consistent with most positions normally taken to be anti-reductionist<br />

about those terms, or their denotata.<br />

7 Naturalness and the JTB theory<br />

I have argued here that the following argument against the JTB theory is unsound.<br />

P1. The JTB theory says that Gettier cases are cases of knowledge.


P2. Intuition says that Gettier cases are not cases of knowledge.<br />

P3. Intuition is trustworthy in these cases.<br />

C. The JTB theory is false.<br />

The objection has been that P3 is false in those cases where following intuition slavishly<br />

would mean concluding that some common term denoted a rather unnatural<br />

property while accepting deviations from intuition would allow us to hold that it denoted<br />

a rather natural property. Peter Klein (in conversation) has suggested that there<br />

is a more sophisticated argument against the JTB theory that we can draw out of the<br />

Gettier cases. Since this argument is a good illustration of the way counterexamples<br />

should be used in philosophy, I’ll close with it.<br />

Klein’s idea, in effect, is that we can use Gettier cases to argue that being a justified<br />

true belief is not a natural property, and hence that P3 is after all true. Remember<br />

that P3 only fails when following intuition too closely would lead too far away from<br />

naturalness. If being a justified true belief is not a natural property to start with,<br />

there is no great danger of this happening. What the Gettier cases show us, goes the<br />

argument, is that there are two ways to be a justified true belief. The first way is<br />

where the belief is justified in some sense because it is true. The second way is where<br />

it is quite coincidental that the belief is both justified and true. These two ways of<br />

being a justified true belief may be natural enough, but the property being a justified<br />

true belief is just the disjunction of these two not especially related properties.<br />

I think this is, at least, a prima facie compelling argument. There are three<br />

important points to note about it. First, this kind of reasoning does not obviously<br />

generalise. Few of the examples described in Shope (1983) could be used to show that<br />

some target theory in fact made knowledge into a disjunctive kind. The second point<br />

is that accepting this argument is perfectly consistent with accepting everything I<br />

said above against the (widespread) uncritical use of appeal to intuition. Indeed, if<br />

what I said above is broadly correct then this is just the kind of reasoning we should<br />

be attempting to use when looking at fascinating counterexamples. Thirdly, if the<br />

argument works it shows something much more interesting than just that the JTB<br />

theory is false. It shows that naturalness is not always transferred to a conjunctive<br />

property by its conjuncts.<br />

I assume here that being a justified belief and being a true belief are themselves natural<br />

properties, and being a justified true belief is the conjunction of these. The only<br />

point here that seems possibly contentious is that being a true belief is natural.<br />

On some forms of minimalism about truth this may be false, but those forms seem<br />

quite implausibly strong. Remember that saying being a true belief is natural does not<br />

imply that it has an analysis – truth might be a primitively natural component of this<br />

property. And remember also that naturalness is intensional rather than hyperintensional.<br />

If all true beliefs correspond with reality in a suitable way, and corresponding<br />

with reality in that way is a natural property, then so is being a true belief, even if truth<br />

of belief cannot be explained in terms of correspondence.<br />

This is a surprising result, because the way naturalness was originally set up by<br />

Lewis suggested that it would be transferred to a conjunctive property by its conjuncts.<br />

Lewis gave three accounts of naturalness. The first is that properties are


perfectly natural in virtue of being co-intensive with a genuine universal. The third<br />

is that properties are natural in virtue of the mutual resemblance of their members,<br />

where resemblance is taken to be a primitive. On either account, it seems that whenever<br />

being F is natural, and so is being G, then being F and G will be natural. 14 The<br />

second account, if it can be called that, is that naturalness is just primitive. If the<br />

Gettier cases really do show that being a justified true belief is not natural, then they<br />

will have shown that we have to fall back on just this account of naturalness.<br />

14 I follow Armstrong (1978) here in assuming that there are conjunctive universals.


Morality, Fiction and Possibility<br />

1 Four Puzzles<br />

Several things go wrong in the following story.<br />

Death on a Freeway<br />

Jack and Jill were arguing again. This was not in itself unusual, but this<br />

time they were standing in the fast lane of I-95 having their argument.<br />

This was causing traffic to bank up a bit. It wasn’t significantly worse<br />

than normally happened around Providence, not that you could have<br />

told that from the reactions of passing motorists. They were convinced<br />

that Jack and Jill, and not the volume of traffic, were the primary causes<br />

of the slowdown. They all forgot how bad traffic normally is along there.<br />

When Craig saw that the cause of the bankup had been Jack and Jill, he<br />

took his gun out of the glovebox and shot them. People then started<br />

driving over their bodies, and while the new speed hump caused some<br />

people to slow down a bit, mostly traffic returned to its normal speed.<br />

So Craig did the right thing, because Jack and Jill should have taken their<br />

argument somewhere else where they wouldn’t get in anyone’s way.<br />

The last sentence raises a few related puzzles. Intuitively, it is not true, even in the<br />

story, that Craig’s murder was morally justified. What the narrator tells us here is<br />

just false. That should be a little surprising. We’re being told a story, after all, so<br />

the storyteller should be an authority on what’s true in it. Here we hearers get to<br />

rule on which moral claims are true and false, not the author. But usually the author<br />

gets to say what’s what. The action takes place in Providence, on Highway 95, just<br />

because the author says so. And we don’t reject those claims in the story just because<br />

no such murder has ever taken place on Highway 95. False claims can generally be<br />

true in stories. Normally, the author’s say so is enough to make it so, at least in the<br />

story, even if what is said is really false. The first puzzle, the alethic puzzle, is why<br />

authorial authority breaks down in cases like Death on a Freeway. Why can’t the<br />

author just make sentences like the last sentence in Death true in the story by saying<br />

they are true? At this stage I won’t try and give a more precise characterisation of<br />

which features of Death lead to the break down of authorial authority, for that will<br />

be at issue below.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophers’<br />

Imprint vol. 4, number 3. I’ve spoken to practically everyone I know about the issues here, and a full<br />

list of thanks for useful advice, suggestions, recommendations, criticisms, counterexamples and encouragement<br />

would double the size of the paper. If I thank philosophy departments rather than all the individuals<br />

in them it might cut the size a little, so thanks to the departments at Brown, UC Davis, Melbourne, MIT<br />

and Monash. Thanks also to Kendall Walton, Tamar Gendler and two referees for Philosophers’ Imprint.<br />

The most useful assistance came from Wolfgang Schwarz and especially Tyler Doggett, without whose<br />

advice this could never have been written, and from George Wilson, who prevented me from (keeping on)<br />

making a serious error of over-generalisation.


Morality, Fiction and Possibility 24<br />

The second puzzle concerns the relation between fiction and imagination. Following<br />

Kendall Walton (1990), it is common to construe fictional works as invitations<br />

to imagine. The author requests, or suggests, that we imagine a certain world.<br />

In Death we can follow along with the author for most of the story. We can imagine<br />

an argument taking place in peak hour on Highway 95. We can imagine this frustrating<br />

the other drivers. And we can imagine one of those drivers retaliating with<br />

a loaded gun. What we cannot, or at least do not, imagine is that this retaliation is<br />

morally justified. There is a limit to our imaginative ability here. We refuse, fairly<br />

systematically, to play along with the author here. Call this the imaginative puzzle.<br />

Why don’t we play along in cases like Death? Again, I won’t say for now which cases<br />

are like Death.<br />

The third puzzle concerns the phenomenology of Death and stories like it. The<br />

final sentence is striking, jarring in a way that the earlier sentences are not. Presumably<br />

this is closely related to the earlier puzzles, though I’ll argue below that the cases<br />

that generate this peculiar reaction are not identical with cases that generate alethic<br />

or imaginative puzzles. So call this the phenomenological puzzle.<br />

Finally, there is a puzzle that David Hume (1757) first noticed. Hume suggested<br />

that artistic works that include morally deviant claims, moral claims that wouldn’t<br />

be true were the descriptive aspects of the story true, are thereby aesthetically compromised.<br />

Why is this so? Call that the aesthetic puzzle. I will have nothing to say<br />

about that puzzle here, though hopefully what I have to say about the other puzzles<br />

will assist in solving it.<br />

I’m going to call sentences that raise the first three puzzles puzzling sentences.<br />

Eventually I’ll look at the small differences between those three puzzles, but for now<br />

we’ll focus on what they have in common. The puzzles, especially the imaginative<br />

puzzle, have become quite a focus of debate in recent years. The aesthetic puzzle is<br />

raised by David Hume (1757), and is discussed by Kendall Walton (1994) and Richard<br />

Moran (1995). Walton and Moran also discuss the imaginative and alethic puzzles,<br />

and they are the focus of attention in recent work by Tamar Szabó Gendler (2000),<br />

Gregory Currie (2002) and Stephen Yablo (2002). My solution to the puzzles is best<br />

thought of as a development of some of Walton’s ‘sketchy story’ (to use his description).<br />

Gendler suggests one way to develop Walton’s views, and shows it leads to<br />

an unacceptable solution, because it leads to mistaken predictions. I will argue that<br />

there are more modest developments of Walton’s views that don’t lead to so many<br />

predictions, and in particular don’t lead to mistaken predictions, but which still say<br />

enough to solve the puzzles.<br />

2 The Range of the Puzzles<br />

As Walton and Yablo note, the puzzle does not only arise in connection with thin<br />

moral concepts. But it has not been appreciated how widespread the puzzle is, and<br />

getting a sense of this helps us narrow the range of possible solutions.<br />

Sentences in stories attributing thick moral concepts can be puzzling. If my<br />

prose retelling of Macbeth included the line “Then the cowardly Macduff called on<br />

the brave Macbeth to fight him face to face,” the reader would not accept that in


the story Macduff was a coward. If my retelling of Hamlet frequently described the<br />

young prince as decisive, the reader would struggle to go along with me imaginatively.<br />

Try imagining Hamlet doing exactly what he does, and saying exactly what he says,<br />

and thinking what he thinks, but always decisively. For an actual example, it’s easy<br />

to find the first line in Bob Dylan’s Ballad of Frankie Lee and Judas Priest, that the<br />

titular characters ‘were the best of friends’ puzzling in the context of how Frankie<br />

Lee treats Judas Priest later in the song. It isn’t too surprising that the puzzle extends<br />

to the thick moral concepts, and Walton at least doesn’t even regard these as a separate<br />

category.<br />

More interestingly, any kind of evaluative sentence can be puzzling. Walton and<br />

Yablo both discuss sentences attributing aesthetic properties. Yablo (2002, 485) suggests<br />

that a story in which the author talks about the sublime beauty of a monster<br />

truck rally, while complaining about the lack of aesthetic value in sunsets, is in most<br />

respects like our morally deviant story. The salient aesthetic claims will be puzzling.<br />

Note that we are able to imagine a community that prefers the sight of a ‘blood bath<br />

death match of doom’ (to use Yablo’s evocative description) to sunsets over Sydney<br />

Harbour and it could certainly be true in a fiction that such attitudes were commonplace.<br />

But that does not imply that those people could be right in thinking the trucks<br />

are more beautiful. Walton (1994, 43-44) notes that sentences describing jokes that<br />

are actually unfunny as being funny will be puzzling. We get to decide what is funny,<br />

not the author.<br />

Walton and Yablo’s point here can be extended to epistemic evaluations. Again<br />

it isn’t too hard to find puzzling examples when we look at attributions of rationality<br />

or irrationality.<br />

Alien Robbery<br />

Sam saw his friend Lee Remnick rushing out of a bank carrying in one<br />

hand a large bag with money falling out of the top and in the other hand<br />

a sawn-off shotgun. Lee Remnick recognised Sam across the street and<br />

waved with her gun hand, which frightened Sam a little. Sam was a little<br />

shocked to see Lee do this, because despite a few childish pranks involving<br />

stolen cars, she’d been fairly law-abiding. So Sam decided that<br />

it wasn’t Lee, but really a shape-shifting alien that looked like Lee, that<br />

robbed the bank. Although shape-shifting aliens didn’t exist, and until<br />

that moment Sam had no evidence that they did, this was a rational<br />

belief. False, but rational.<br />

The last two sentences of Alien Robbery are fairly clearly puzzling.<br />

So far all of our examples have involved normative concepts, so one might think<br />

the solution to the puzzle will have something to do with the distinctive nature of<br />

normative concepts, or with their distinctive role in fiction. Indeed, Gendler’s and<br />

Currie’s solutions have just this feature. But sentences that seem somewhat removed<br />

from the realm of the normative can still be puzzling. (It is of course contentious just<br />

where the normative/non-normative barrier lies. Most of the following cases will be


regarded as involving normative concepts by at least some philosophers. But I think<br />

few people will hold that all of the following cases involve normative concepts.)<br />

Attributions of mental states can, in principle, be puzzling. If I retell Romeo and<br />

Juliet, and in this say ‘Although he believed he loved Juliet, and acted as if he did,<br />

Romeo did not really love Juliet, and actually wanted to humiliate her by getting her<br />

to betray her family’, that would I think be puzzling. This example is odd, because<br />

it is not obviously impossible that Romeo could fail to love Juliet even though he<br />

thought he loved her (people are mistaken about this kind of thing all the time) and<br />

acted as if he did (especially if he was trying to trick her). But given the full detail<br />

of the story, it is impossible to imagine that Romeo thought he had the attitudes<br />

towards Juliet he is traditionally thought to have, and that he is mistaken about this.<br />

Attributions of content, either mental content or linguistic content, can be just<br />

as puzzling. The second and third sentences in this story are impossible to imagine,<br />

and false even in the story.<br />

Cats and Dogs<br />

Rhodisland is much like a part of the actual world, but with a surprising<br />

difference. Although they use the word ‘cat’ in all the circumstances<br />

when we would (i.e. when they want to say something about cats), and<br />

the word ‘dog’ in all the circumstances we would, in their language ‘cat’<br />

means dog and ‘dog’ means cat. None of the Rhodislanders are aware of<br />

this, so they frequently say false things when asked about cats and dogs.<br />

Indeed, no one has ever known that their words had this meaning, and<br />

they would probably investigate just how this came to be in some detail,<br />

if they knew it were true.<br />

A similar story can be told to demonstrate how claims about mental content can be<br />

puzzling. Perhaps these cases still involve the normative. Loving might be thought<br />

to entail special obligations and Kripke (1982) has argued that content is normative.<br />

But we are clearly moving away from the moral, narrowly construed.<br />

Stephen Yablo recently suggested that certain shape predicates generate imaginative<br />

resistance. These predicates are meant to be a special case of a broader<br />

category that we’ll discuss further below. Here’s Yablo’s example.<br />

Game Over<br />

They flopped down beneath the giant maple. One more item to find,<br />

and yet the game seemed lost. Hang on, Sally said. It’s staring us in the<br />

face. This is a maple tree we’re under. She grabbed a five-fingered leaf.<br />

Here was the oval they needed! They ran off to claim their prize. (Yablo,<br />

2002, 485, title added)<br />

There’s a potential complication in this story in that one might think that it’s metaphysically<br />

impossible that maple trees have ovular leaves. That’s not what is meant<br />

to be resisted, and I don’t think it is resisted. What is resisted is that maple leaves have<br />

their distinctive five-fingered look, that the shape of the leaf Sally collects is like that<br />

(imagine I demonstrate a maple leaf here) and that its shape be an oval.


Fewer people may care about the next class of cases, or have clear intuitions about<br />

them, but if one has firm ontological beliefs, then deviant ontological claims can<br />

be puzzling. I’m a universalist about mereology, at least with respect to ordinary<br />

concrete things, so I find many of the claims in this story puzzling.<br />

Wiggins’ World<br />

The Hogwarts Express was a very special train. It had no parts at all.<br />

Although you’d be tempted to say that it had carriages, an engine, seats,<br />

wheels, windows and so on, it really was a mereological atom. And it<br />

certainly had no temporal parts - it was wholly present wherever and whenever<br />

it was. Even more surprisingly, it did not enter into fusions, so when<br />

the Hogwarts Local was linked to it for the first few miles out of Kings<br />

Cross, there was no one object that carried all the students through north<br />

London.<br />

I think that even in fictions any two concrete objects have a fusion. So the Hogwarts<br />

Express and the Hogwarts Local have a fusion, and when it is a connected object it<br />

is commonly called a train. I know how to describe a situation where they have no<br />

fusion (I did so just above) but I have no idea how to imagine it, or make it true in a<br />

story.<br />

More generally, there are all sorts of puzzling sentences involving claims about<br />

constitution. These I think are the best guide to a solution to the puzzle.<br />

A Quixotic Victory<br />

–What think you of my redecorating Sancho?<br />

–It’s rather sparse, said Sancho.<br />

–Sparse. Indeed it is sparse. Just a television and an armchair.<br />

–Where are they, Señor Quixote? asked Sancho. All I see are a knife and<br />

fork on the floor, about six feet from each other. A sparse apartment<br />

for a sparse mind. He said the last sentence under his breath so Quixote<br />

would not hear him.<br />

–They might look like a knife and fork, but they are a television and an<br />

armchair, replied Quixote.<br />

–They look just like the knife and fork I have in my pocket, said Sancho,<br />

and he moved as if to put his knife and fork beside the objects on<br />

Quixote’s floor.<br />

–Please don’t do that, said Quixote, for I may be unable to tell your knife<br />

and fork from my television and armchair.<br />

–But if you can’t tell them apart from a knife and fork, how could they<br />

be a television and an armchair?<br />

–Do you really think being a television is an observational property?<br />

asked Quixote with a grin.<br />

–Maybe not. OK then, how do you change the channels? asked Sancho.<br />

–There’s a remote.<br />

–Where? Is it that floorboard?


Morality, Fiction and Possibility 28<br />

–No, it’s at the repair shop, admitted Quixote.<br />

–I give up, said Sancho.<br />

Sancho was right to give up. Despite their odd appearance, Quixote’s<br />

items of furniture really were a television and an armchair. This was the<br />

first time in months Quixote had won an argument with Sancho.<br />

Quixote is quite right that whether something is a television is not determined entirely<br />

by how it looks. A television could be indistinguishable from a non-television.<br />

Nonetheless, something indistinguishable from a knife is not a television. Not in this<br />

world, and not in the world of Victory either, whatever the author says. For whether<br />

something is a television is determined at least in part by how it looks, and while it is<br />

impossible to provide a non-circular constraint on how a television may look, it may<br />

not look like a common knife.<br />

In general, if whether or not something is an F is determined in part by ‘lower-level’<br />

features, such as the shape and organisation of its parts, and the story specifies<br />

that the lower-level features are incompatible with the object being an F, it is not an<br />

F in the fiction. Suitably generalised and qualified, I think this is the explanation<br />

of all of the above categories. To understand better what the generalisations and<br />

qualifications must be, we need to look at some cases that aren’t like Death, and some<br />

alternative explanations of what is going on in Death.<br />

Sentences that are intentional errors on the part of storytellers are not puzzling<br />

in our sense. We will use real examples for the next few pages, starting with the<br />

opening line of Joyce’s most famous short story.<br />

The Dead<br />

Lily, the caretaker’s daughter, was literally run off her feet.<br />

(Joyce, 1914/2000, 138)<br />

It isn’t true that Lily is literally run off her feet. She is run off her feet by the incoming<br />

guests, and if you asked her she might well say she was literally run off her feet, but this<br />

would reveal as much about her lack of linguistic care as about her demanding routine.<br />

Is this a case where the author loses authority over what’s true in the story? No,<br />

we are not meant to read the sentence as being true in the story, but as a faithful<br />

report of what Lily (in the story) might say to herself. In practice it’s incredibly difficult<br />

to tell just when the author intends a sentence to be true in the story, as opposed<br />

to being a report of some character’s view of what is true. (See Holton (1997) for an<br />

illustration of the complications this can cause.) But since we are operating in theory<br />

here, we will assume that problem solved. The alethic puzzle only arises when it is<br />

clear that the author intends that p is true in her story, but we think p is not true.<br />

The imaginative puzzle only arises when the author invites us to imagine p, but we<br />

can not, or at least do not. Since Joyce does not intend this sentence to be true in The<br />

Dead, nor invite us to imagine it being true, neither puzzle arises. What happens to<br />

the phenomenological puzzle in cases like these is a little more interesting, and I’ll<br />

return to that in §7.



Just as intentional errors are not puzzling, careless errors are not puzzling. Writing<br />

a full length novel is a perilous business. Things can go wrong. Words can be<br />

miswritten, mistyped or misprinted at several different stages. Sometimes the errors<br />

are easily detectable, sometimes they are not, especially when they concern names. In<br />

one of the drafts of Ulysses, Joyce managed to write “Connolly Norman” in place of<br />

“Conolly Norman”. Had that draft been used for the canonical printing of the work,<br />

it would be tempting to say that we had another alethic puzzle. For the character<br />

named here is clearly the Superintendent of the Richmond District Lunatic Asylum,<br />

and his name had no double-‘n’, so in the story there is no double-‘n’ either. 1<br />

Here we do have an instance where what is true in the story differs from what<br />

is written in the text. But this is not a particularly interesting deviation. To avoid<br />

arcane discussions of typographical errors, we will assume that in every case we possess an<br />

ideal version of the text, and are comparing it with the author’s intentions. Slip-ups<br />

that would be detected by a careful proof-reader, whether they reveal an unintended<br />

divergence between word and world, as here, or between various parts of the text, as<br />

would happen if Dr Norman were not named after a real person but had his name<br />

spelled differently in parts of the text, will be ignored. 2<br />

Note two ways in which the puzzles as I have stated them are narrower than they<br />

first appear. First, I am only considering puzzles that arise from a particular sentence<br />

in the story, intentionally presented in the voice of an authoritative narrator. We<br />

could try and generalise, asking why it is that we sometimes (but not always) question<br />

the moral claims that are intended to be tacit in a work of fiction. For instance,<br />

we might hold that for some Shakespearean plays there are moral propositions that<br />

Shakespeare intended to be true in the play, but which are not in fact true. Such cases<br />

are interesting, but to keep the problem of manageable proportions I won’t explicitly<br />

discuss them here. (I believe the solution I offer here generalises to those cases, but<br />

I won’t defend that claim here.) Second, all the stories I have discussed are either<br />

paragraph-long examples, or relatively detachable parts of longer stories. For all I’ve<br />

said so far, the puzzle may be restricted to such cases. In particular, it might be the<br />

case that a suitably talented author could make it true in a story that killing people<br />

for holding up traffic is morally praiseworthy, or that a television is phenomenally<br />

and functionally indistinguishable from a knife. What we’ve seen so far is just that<br />

an author cannot make these things true in a story simply by saying they are true. 3 I<br />

leave open the question of whether a more subtle approach could make those things<br />

true in a fiction. Similarly, I leave it open whether a more detailed invitation to<br />

imagine that these things are true would be accepted. All we have seen so far is that<br />

1 For details on the spelling of Dr Norman’s name, and the story behind it, see Kidd (1988). The good<br />

doctor appears on page 6 of Joyce (1922/1993).<br />

2 At least, they will be ignored if it is clear they are errors. If there seems to be a method behind<br />

the misspellings, as in Ulysses there frequently is, the matter is somewhat different, and somewhat more<br />

difficult.<br />

Tyler Doggett has argued that these cases are more similar to paradigm cases of imaginative resistance<br />

than I take them to be. Indeed, I would not have noticed the problems they raise without reading his paper.<br />

It may be a shortcoming of my theory here that I have to set questions about whether these sentences are<br />

puzzling to one side and assume an ideal proof-reader.<br />

3 Thanks here to George Wilson for reminding me that we haven’t shown anything stronger than that.



simple direct invitations to imagine these things are rejected, and it feels like we could<br />

not accept them.<br />

3 An Impossible Solution<br />

Here’s a natural solution to the puzzles, one that you may have been waiting for me to<br />

discuss. The alethic puzzle arises because only propositions that are possibly true can<br />

be true in a story, or can be imagined. The latter claim rests on the hypothesis that we<br />

can imagine only what is possible, and that we resist imagining what is impossible.<br />

This solution assumes that it is impossible that killing people for holding up freeway<br />

traffic is the right thing to do. Given enough background assumptions, that is<br />

plausible. It is plausible, that is, that the moral facts supervene on the non-moral<br />

facts. And the supervenience principle here is quite a strong one - in every possible<br />

world where the descriptive facts are thus and so, the moral facts are the same way. 4<br />

If we assume the relevant concept of impossibility is truth in no possible worlds, we<br />

get the nice result that the moral claims at the core of the problem could not possibly<br />

be true.<br />

Several authors have discussed solutions around this area. Kendall Walton (1994)<br />

can easily be read as endorsing this solution, though Walton’s discussion is rather<br />

tentative. Tamar Szabó Gendler rejects the theory, but thinks it is the most natural<br />

idea, and spends much of her paper arguing against this solution. As those authors,<br />

and Gregory Currie (2002), note, the solution needs to be tidied up a little before it<br />

will work for the phenomenological and imaginative puzzles. (It is less clear whether<br />

the tidying matters to the alethic puzzle.) For one thing, there is no felt asymmetry<br />

between a story containing, “Alex proved the twin primes theorem,” and one<br />

containing, “Alex found the largest pair of twin primes,” even though one of them is<br />

impossible. Since we don’t know which it is, the impossibility of the false one cannot<br />

help us here. So the theory must be that it is believed impossibilities that matter,<br />

for determining what we can imagine, not just any old impossibilities. Presumably<br />

impossibilities that are not salient will also not prevent imagination.<br />

Even thus qualified, the solution still overgenerates, as Gendler noted. There are<br />

stories that are not puzzling in any way that contain known salient impossibilities.<br />

Gendler suggests three kinds of case, of which I think only the third<br />

clearly works. The first kind of case is where we have direct contradictions true in<br />

the story. Gendler suggests that her Tower of Goldbach story, where seven plus five<br />

both does and does not equal twelve, is not puzzling. Graham Priest (1999) makes a<br />

similar point with a story, Sylvan’s Box, involving an empty box with a small statue<br />

in one corner. These are clear cases of known, salient impossibility, but arguably are<br />

not puzzling in any respect. (There is a distinction between the puzzles though. It<br />

is very plausible to say that it’s true in Priest’s story that there’s an empty box with<br />

a small statue in one corner. It is less plausible to say we really can imagine such a<br />

4 Arguably the relevant supervenience principle is even stronger than that. To use some terminology of<br />

Stephen Yablo’s, there’s no difference in moral facts without a difference in non-moral facts between any<br />

two counteractual worlds, as well as between any two counterfactual worlds. This might be connected to<br />

some claims I will make below about the relationship between the normative and the descriptive.



situation.) Opinion about such cases tends to be fairly sharply divided, and I suspect<br />

it is unwise to rest too much weight on them one way or the other.<br />

The second kind of case Gendler suggests is where we have a distinctively metaphysical<br />

impossibility, such as a singing snowman or a talking playing card. Similar<br />

cases are discussed by Alex Byrne (1993), who takes them to raise problems for David<br />

Lewis’s (1978b) subjunctive conditionals account of truth in fiction. If we believe a<br />

strong enough kind of essentialism, then these will be impossible, but they clearly<br />

do not generate puzzling stories. For a quick proof of this, note that Alice in Wonderland<br />

is not puzzling, but several essentialist theses are violated there. It is true in<br />

Alice in Wonderland, for example, that playing cards plant rose trees.<br />

But these examples don’t strike me as particularly convincing either. For one<br />

thing, the essentialism assumed here may be wrong. For another, the essentialism<br />

might not be both salient and believed to be right, which is what is needed. And<br />

most importantly, we can easily reinterpret what the authors are saying in order to<br />

make the story possibly true. We can assume, for example, that the rosebush<br />

planting playing cards are not playing cards as we know them, but roughly human-shaped<br />

beings with playing cards for torsos. Gendler and Byrne each say that this is<br />

to misinterpret the author, but I’m not sure this is true. As some evidence, note that<br />

the authorised illustrations in Alice tend to support the reinterpretations. 5<br />

Gendler’s third case is better. There are science fiction stories, especially time<br />

travel stories, which are clearly impossible but do not generate resistance. Here are<br />

two such stories, the first lightly modified from a surprisingly popular movie, and<br />

the second lifted straight from a very popular source.<br />

Back to the Future ′<br />

Marty McFly unintentionally travelled back in time to escape some marauding<br />

Libyan terrorists. In doing so he prevented the chance meeting<br />

which had, in the timeline that had been, caused his father and mother<br />

to start dating. Without that event, his mother saw no reason to date the<br />

unattractive, boring nerdy kid who had been, in a history that no longer<br />

is, Marty’s father. So Marty never came into existence. This was really a<br />

neat trick on Marty’s part, though he was of course no longer around to<br />

appreciate it. Some people manage to remove themselves from the future<br />

of the world by foolish actions involving cars. Marty managed to remove<br />

himself from the past as well.<br />

The Restaurant at the End of the Universe<br />

The Restaurant at the End of the Universe is one of the most extraordinary<br />

ventures in the entire history of catering.<br />

5 Determining whether this is true in all such stories would be an enormous task, I fear, and somewhat<br />

pointless given the next objection. If anyone wants to say all clearly impossible statements in fiction are<br />

puzzling, I suspect the best strategy is to divide and conquer. The most blatantly impossible claims are<br />

most naturally fit for reinterpretation, and the other claims rest on an essentialism that is arguably not<br />

proven. I won’t try such a massive defence of a false theory here.



It is built on the fragmented remains of an eventually ruined planet<br />

which is enclosed in a vast time bubble and projected forward in time<br />

to the precise moment of the End of the Universe.<br />

This is, many would say, impossible.<br />

. . .<br />

You can visit it as many times as you like . . . and be sure of never meeting<br />

yourself, because of the embarrassment this usually causes.<br />

This, even if the rest were true, which it isn't, is patently impossible, say<br />

the doubters.<br />

All you have to do is deposit one penny in a savings account in your own<br />

era, and when you arrive at the End of Time the operation of compound<br />

interest means that the fabulous cost of your meal has been paid for.<br />

This, many claim, is not merely impossible but clearly insane. (Adams,<br />

1980, 213-214)<br />

Neither of these is puzzling. Perhaps it’s hard to imagine the last couple of sentences<br />

of the McFly story, but everything the respective authors say is true in their stories.<br />

So the impossibility theory cannot be right, because it overgenerates, just as Gendler<br />

said.<br />

Recently Kathleen Stock (2003) has argued that one of the assumptions that Gendler<br />

makes, specifically that it isn’t true that “a judgement of conceptual impossibility<br />

renders a scenario unimaginable” (Gendler, 2000, 66), is false. Even if Stock is right,<br />

this doesn’t threaten the kind of response that I have (following Gendler) offered to<br />

the puzzles. But actually there are a few reasons to doubt Stock’s reply. I’ll discuss<br />

these points in order.<br />

It isn’t entirely clear from Stock’s discussion what she is taking a conceptual impossibility<br />

to be. I think it is a proposition of the form Some F is a G (or That F is a<br />

G, or something of this sort) where it is constitutive of being an F that the F is not a<br />

G. There is no positive characterisation of conceptual impossibility in Stock’s paper,<br />

but it is clearly meant to be something stronger than mere impossibility, or a priori<br />

falsehood. In any case, most of the core arguments turn on worries about allegedly<br />

deploying a concept while refusing to draw inferences that are constitutive of that<br />

concept, so the kind of definition I’ve offered above seems to be on the right track.<br />

Now if this is the case then Stock has no objection to the imaginability of the<br />

two stories I offered that involve known and salient impossibilities. For neither of<br />

these stories includes a conceptual impossibility in this sense. So even if conceptual<br />

impossibilities cannot be imagined, some impossibilities can be imagined. (And at<br />

this point what holds for imagination also holds for truth in fiction.)<br />

While this suffices as a response to the particular claims Stock makes, it might<br />

be thought it undercuts the objection I have made to the impossible solution. For<br />

it might be thought that what is wrong with the puzzling sentences just is that they<br />

represent conceptual impossibilities in this sense, and we have no argument that these<br />

can be imagined, or true in fiction. This is not too far removed from the actual



solution I will offer, so it is a serious worry. The problem with this line is that<br />

not all of our puzzles are conceptual impossibilities. It isn’t constitutive of being a<br />

television that a thing is phenomenally or functionally distinguishable from a knife,<br />

but the claim in Victory that some television is not phenomenally or functionally<br />

distinguishable from a knife is puzzling. Even in our core cases, of morally deviant<br />

claims in fiction, there need not be any conceptual impossibilities. As R. M. Hare<br />

(1951) pointed out long ago, people with very different moral beliefs could have in<br />

common the concept GOOD. Arguably, someone who thinks that what Craig does<br />

in Death is good is morally confused, not conceptually confused. So whether Gendler<br />

or Stock is right about the imaginability of conceptual impossibility is neither here<br />

nor there with respect to these puzzles.<br />

Having said that, there are some reasons to doubt Stock’s argument. One of her<br />

moves is to argue that we couldn’t imagine conceptual impossibilities because we<br />

can’t believe conceptual impossibilities. But as Sorensen (2001) persuasively argues,<br />

we can believe conceptual impossibilities. One of Sorensen’s arguments, lightly modified,<br />

helps us respond to another of Stock’s arguments. Stock notes, rightly, that we<br />

shouldn’t take the fact that it seems we can imagine impossibilities to be conclusive<br />

evidence we can do so. After all, we are wrong about whether things are as they<br />

seem all the time. But this might be a special case. I think that if it seems to be the<br />

case that p then we can imagine that p. And Stock agrees it seems to be the case that<br />

we can imagine conceptual impossibilities. So we can imagine that we can imagine<br />

conceptual impossibilities. Hence it can’t be a conceptual impossibility that we can<br />

imagine at least one conceptual impossibility. This doesn’t tell against the claim that<br />

it is some other kind of impossibility, though as we’ll see Stock’s main argument rests<br />

on considerations about the conceptual structure of imagination, so it isn’t clear how<br />

she could argue for this.<br />

The main argument Stock offers is that no account of how concepts work is<br />

compatible with our imagining conceptual impossibilities. Her argument that atomist<br />

theories of concepts (as in Fodor (1998)) are incompatible with imagining conceptual<br />

impossibilities isn’t that persuasive. She writes that “clearly it is not the case that<br />

imagining “the cow jumped over the moon” stands in a lawful relation to the property<br />

of being a cow (let alone the property of [being] a cow jumping over the moon).<br />

Imagining by its very nature is resistant to any attempt to incorporate it into an externalist<br />

theory of content” (2003, 114). But this isn’t clear at all. When I imagine going<br />

out drinking with Bill Clinton there is, indeed there must be, some kind of causal<br />

chain running back from my imagining to Bill Clinton himself. If there was not, I’d<br />

at most be imagining going out drinking with a guy who looks a lot like Bill Clinton.<br />

Perhaps it isn’t as clear, but when I imagine that a cow (and not just a zebra disguised<br />

to look like a cow) is jumping over the moon it’s nomologically necessary that there’s<br />

a causal chain of the right kind stretching back to actual cows. And it’s arguable that<br />

the concept I deploy in imagining that a cow (a real cow) is jumping over the moon<br />

just is the concept whose content is fixed by the lawful connections between various<br />

cows and my (initial) deployment of it. So I don’t see why a conceptual atomist<br />

should find this kind of argument convincing.



Stock’s response to Gendler was presented at a conference on Imagination and the<br />

Arts at Leeds in 2001, and at the same conference Derek Matravers (2003) offered an<br />

alternative solution to the alethic puzzle. Although it does not rest on claims about<br />

impossibility, it also suffers from an overgeneration problem. Matravers suggests<br />

that in at least some fictions, we treat the text as a report by a (fictional) narrator<br />

concerning what is going on in a faraway land. Now in reality when we hear reports<br />

from generally trustworthy foreign correspondents, we are likely to believe their<br />

descriptive claims about the facts on the ground. Since they have travelled to the<br />

lands in question, and we have not, the correspondent is epistemologically privileged<br />

with respect to those facts on the ground. But when the correspondent makes moral<br />

evaluations of those facts, she is not in a privileged position, so we don’t just take her<br />

claims as the final word. Matravers suggests there are analogous limits to how far we<br />

trust a fictional narrator.<br />

The problem with this approach is that there are several salient disanalogies between<br />

the position of the correspondent and the fictional narrator. The following<br />

case, which I heard about from Mark Liberman, illustrates this nicely. On March 5,<br />

2004, the BBC reported that children in a nursery in England had found a frog with<br />

three heads and six legs. Many people, including Professor Liberman, were sceptical,<br />

notwithstanding the fact that the BBC was actually in England and Professor Liberman<br />

was not. The epistemological privilege generated by proximity doesn’t extend<br />

to implausible claims about three-headed frogs. The obvious disanalogy is that if a<br />

fictional narrator said that there was a three-headed six-legged frog in the children’s<br />

nursery then other things being equal we would infer it is true in the fiction that there<br />

was indeed a three-headed six-legged frog in the children’s nursery. 6 So there isn’t an<br />

easy analogy between when we trust foreign correspondents and fictional narrators.<br />

Now we need an explanation of why the analogy does hold when either party makes<br />

morally deviant claims, even though it doesn’t when they both make biologically deviant<br />

claims. But it doesn’t seem any easier to say why the analogy holds then than it<br />

is to solve the original puzzle.<br />

Two other quick points about Matravers’s solution. It’s going to be a little delicate<br />

to extend this solution to all the cases I have discussed above, for normally we<br />

do think fictional narrators are privileged with respect to where the televisions and<br />

windows are. What matters here is that how far narratorial privilege extends depends<br />

on what other claims the narrator makes. Perhaps the same is true of foreign correspondents,<br />

though we’d need to see an argument for that. Second, it isn’t clear how<br />

this solution could possibly generalise to cover cases, of the sort that frequently occur in<br />

plays, where the deviant moral claim is clearly intended by the author to be true in<br />

the fiction but the reader (or watcher) does not agree even though the author’s intention<br />

is recognised. As I mentioned at the start, these cases aren’t our concern here,<br />

though it would be nice to see how a generalisation to these cases is possible. But the<br />

primary problem with Matravers’s solution is that as it stands it (improperly) rules<br />

6 There is a complication here in that such a sentence might be evidence that the fictional work is not to<br />

be understood as this kind of report, and instead understood as something like a recording of the children’s<br />

thoughts. I’ll assume we’re in a story where it is clear that the sentences are not to be so interpreted.



out three-headed frogs in fiction, and it is hard to see how to remedy this problem<br />

without solving the original puzzle.<br />

4 Some Ethical Solutions<br />

If one focuses on cases like Death, it is natural to think the puzzle probably has something<br />

to do with the special nature of ethical predicates, or perhaps of ethical concepts,<br />

or perhaps of the role of either of these in fiction. I don’t think any such<br />

solution can work because it can’t explain what goes wrong in Victory, and this will<br />

recur as an objection in what follows.<br />

The most detailed solution to the puzzles has been put forward by Tamar Szabó<br />

Gendler. She focuses on the imaginative puzzle, but she also makes valuable points<br />

about the other puzzles. My solution to the phenomenological puzzle is basically<br />

hers plus a little epicycle.<br />

She says that we do not imagine morally deviant fictional worlds because of our<br />

“general desire to not be manipulated into taking on points of view that we would<br />

not reflectively endorse as our own.” How could we take on a point of view by<br />

accepting something in a fiction? Because of the phenomenon, noted above, that some<br />

things become true in a story because they are true in the world. If this is right,<br />

its converse must be true as well. If what is true in the story must match what is<br />

true in the world, then to accept that something is true in the story just is to accept<br />

that it is true in the world. Arguably, the same kind of ‘import/export’ principles<br />

hold for imagination as for truth in fiction. Some propositions become part of the<br />

content of an imagining because they are true. So, in the right circumstances, they<br />

will only be part of an imagining if they are true. Hence to imagine them (in the<br />

right circumstances) is to commit oneself to their truth. Gendler holds that we are<br />

sensitive to this phenomenon, and that we refuse to accept stories that are morally<br />

deviant because that would involve accepting that morally deviant claims are true in<br />

the world.<br />

That’s a relatively rough description of Gendler’s theory, but it says enough to<br />

illustrate what she has in mind, and to show where two objections may slip in. First,<br />

it is not clear that it generalises to all the cases. Gendler is aware of some of these<br />

cases and just bites the relevant bullets. She holds, for instance, that we can imagine<br />

that actually lame jokes are funny, and it could be true in a story that such a joke is<br />

funny. It would be a serious cost to her theory if she had to say the same thing about<br />

all the examples discussed above.<br />

The second problem is more serious. The solution is only as good as the claim<br />

that moral claims are more easily exported than descriptive claims, and more generally<br />

that the types of claims we won’t imagine are more easily exported than those we<br />

don’t resist. Gendler has two arguments for why the first of these should be true, but<br />

neither of them sounds persuasive. First, she says that the moral claims are true in all<br />

possible worlds if true at all. But this won’t do on its own, because as she proved, we<br />

don’t resist some necessarily false claims. (This objection is also made by Matravers<br />

(2003, 94).)<br />



Secondly, she claims that in other cases where there are necessary falsehoods true<br />

in a story, as in Alice in Wonderland, or the science fiction cases, the author makes it<br />

clear that unusual export restrictions are being imposed. But this is wrong for two<br />

reasons. First, I don’t think that any particularly clear signal to this effect occurs in<br />

my version of Back to the Future. Secondly, even if I had explicitly signalled that I had<br />

intended to make some of the facts in the story available for export, and you didn’t<br />

believe that, that isn’t enough reason to resist imagining the story. For my intent as<br />

to what can and cannot be exported is not part of the story.<br />

To see this, consider one relatively famous example. At one stage B. F. Skinner<br />

tried to promote behaviourism by weaving his theories into a novel (of sorts): Walden<br />

Two. Now I’m sure Skinner intended us to export some psychological and political<br />

claims from the story to the real world. But it is entirely possible to read the story<br />

with full export restrictions in force without rejecting that what Skinner says is true<br />

in that world. (It is dreadfully boring, since there’s nothing but propagandising going<br />

on, but possible.) If exporting was the only barrier here, we should be able to impose<br />

our own tariff walls and read the story along, whatever the intent of the author, as<br />

we can with Walden Two. One can accept it is true in Walden Two that behaviourism<br />

is the basis of a successful social policy, even though Skinner wants us to accept this<br />

as true in the story iff it is true in the world, and it isn’t true in the world. We cannot<br />

read Death or Victory with the same ironic detachment, and Gendler’s theory lacks<br />

the resources to explain this.<br />

Currie’s theory attacks the problem from a quite different direction. He relies on the motivational consequences of accepting moral claims. Assume internalism about moral motivation, so that to accept that φ-ing is right is to be motivated to φ, at least ceteris paribus. So accepting that φ-ing is right involves acquiring a desire to φ, as well, perhaps, as beliefs about φ-ing. Currie suggests that there is a mental state that stands to desire the way that ordinary imagination stands to belief. It is, roughly, a state of having an off-line desire, in the way that imagining that p is like having an off-line belief that p, a state like a belief that p but without the motivational consequences. Currie suggests that imagining that φ-ing is right involves off-line acceptance that φ-ing is right, and that in part involves having an off-line desire (a desire-like imagination) to φ. Finally, Currie says, it is harder to alter our off-line desires at will than it is to alter our off-line beliefs, and this explains the asymmetry. The argument for this last claim seems very hasty, but we’ll let that pass. For even if it is true, Currie’s theory does little to explain the later cases of imaginative resistance, from Alien Robbery to Victory. It cannot explain why we have resistance to claims about what is rational to believe, or what is beautiful, or what attitudes other people have. The idea that there is a state that stands to desire as imagination stands to belief is, I suspect, a very fruitful one, but I don’t think its fruits include a solution to these puzzles.

5 Grok

Stephen Yablo has suggested that the puzzles, or at least the imaginative puzzle, are closely linked to what he calls response-enabled concepts, or grokking concepts. (I’ll also use ‘response-enabled’ (‘grokking’) as a property of the predicates that pick out these concepts.) These are introduced by examples, particularly by the example ‘oval’.

Here are some platitudes about OVAL. It is a shape concept - any two objects in any two worlds, counterfactual or counteractual, that have the same shape are alike in whether they are ovals. But which shape concept it is is picked out by our reactions. They are the shapes that strike us as being egg-like, or perhaps more formally, like the shape of all ellipses whose length/width ratio is the golden ratio. In this way the concept OVAL is meant to be distinguished on the one hand from, say, PRIME NUMBER, which is entirely independent of us, and on the other from WATER, which would have picked out a different chemical substance had our reactions to various chemicals been different. Note that what ‘prime number’ picks out is determined by us, like all semantic facts are. So the space into which OVAL is meant to fit is quite tiny. We matter to its extension, but not the way we matter to ‘prime number’ (or that we don’t matter to PRIME NUMBER), and not the way we matter to ‘water’.

I’m not sure there’s any space here at all. Yablo’s grokking predicates strike me as words that have associated egocentric descriptions that fix their reference without having egocentric reference-fixing descriptions, and such words presumably don’t exist. But for present purposes I’ll bracket those general concerns and see how this idea can help solve the puzzles. For despite my disagreement about what these puzzles show about the theory of concepts, Yablo’s solution is not too dissimilar to mine.

The important point for fiction about grokking concepts is that we matter, in a non-constitutive way, for their extension. Not we as we might have been, or we as we are in a story, but us. So an author can’t say that in the story squares looked egg-shaped to the people, so in the story squares are ovals, because we get to say what’s an oval, not some fictional character. Here’s how Yablo puts it:

Why should resistance [meaning, roughly, unimaginability] and grokkingness be connected in this way? It’s a feature of grokking concepts that their extension in a situation depends on how the situation does or would strike us. ‘Does or would strike us’ as we are: how we are represented as reacting, or invited to react, has nothing to do with it. Resistance is the natural consequence. If we insist on judging the extension ourselves, it stands to reason that any seeming intelligence coming from elsewhere is automatically suspect. This applies in particular to being ‘told’ about the extension by an as-if knowledgeable narrator. (2002, 485)

It might look at first as if Victory will be a counterexample to Yablo’s solution, just as it is to the Ethical solutions. After all, the concept that seems to generate the puzzles there is TELEVISION, and that isn’t at all like his examples of grokking concepts. (The examples, apart from evaluative concepts, are all shape concepts.) On the other hand, if there are any grokking concepts, perhaps it is plausible that TELEVISION should be one of them. Indeed, the platitudes about TELEVISION provide some support for this. (The following two paragraphs rely heavily on Fodor (1998).)



Three platitudes about TELEVISION stand out. One is that it’s very hard to define just what a television is. A second is that there’s a striking correlation between people who have the concept TELEVISION and people who have been acquainted with a television. Not a perfect correlation - some infants have acquaintance with televisions but not as such, and some people acquire TELEVISION by description - but still strikingly high. And a third is that conversations about televisions are rarely at cross purposes, even when they consist of people literally talking different languages. TELEVISION is a shared concept.

Can we put these into a theory of the concept TELEVISION? Fodor suggests we can, as long as we are not looking for an analysis of TELEVISION. Televisions are those things that strike us, people in general, as being sufficiently like the televisions we’ve seen, in a televisual kind of way. This isn’t an account of the meaning of the word ‘television’ - there’s no reference to us in that word’s dictionary entry, and rightly so. Nor is it an analysis of what constitutes the concept TELEVISION. There’s no reference to us there either. But it does latch on to the right concept, or at least the right extension, in perhaps the only way we could. And this proposal certainly explains the platitudes well. The epistemic necessity of having a paradigm television to use as a basis for similarity judgments explains the striking correlation between televisual acquaintance and concept possession. The fact that the only way of picking out the extension uses something that is not constitutive of the concept, namely our reactions to televisions, explains why we can’t reductively analyse the concept. And the use of people’s reactions in general rather than idiosyncratic reactions explains why it’s a common concept. These look like good reasons to think something like Fodor’s theory of the concept TELEVISION is right, and if it is then TELEVISION seems to be response-enabled in Yablo’s sense. So unlike the Ethical solutions, Yablo’s solution might yet predict that Victory will be puzzling.

Still, I have three quibbles about his solution, and that’s enough to make me think a better solution is still to be found.

First, there’s a missing antecedent in a key sentence in his account, and it’s hard to see how to fill it in. What does he mean when he says ‘how the situation does or would strike us’? Does or would strike us if what? If we were there? But we don’t know where there is. There, in Victory, is allegedly a place where televisions look like knives and forks. What if the antecedent is ‘If all the non-grokking descriptions were accurate’? The problem now is that this will be too light. If TELEVISION is grokking, then there is a worry that many concepts, including perhaps all artefact concepts, will be grokking. Fodor didn’t illustrate his theory with TELEVISION; he always used DOORKNOB. But the theory was meant to be rather general. If we take out all the claims involving grokking concepts, there may not be much left.

Second, despite the generality of Fodor’s account, it isn’t clear that mental concepts, and content concepts, are grokking. We would need another argument that LOVE is grokking, and that so is BELIEVING THAT THERE ARE SPACE ALIENS. Perhaps such an argument can be given, but it will not be a trivial exercise.

Finally, I think Yablo’s solution, at least as most naturally interpreted, overgeneralises. Here’s a counterexample to it. The following story is not, I take it, puzzling.



Fixing a Hole

DQ and his buddy SP leave DQ’s apartment at midday Tuesday, leaving a well-arranged lounge suite and home theatre unit, featuring DQ’s prized oval television. They travel back in time to Monday, where DQ has some rather strange and unexpected adventures. He intended to correct something that happened yesterday, that had gone all wrong the first time around, and by the time the buddies reunite and leave for Tuesday (by sleeping and waking up in the future) he’s sure it’s all been sorted. When DQ and his buddy SP get back to his apartment midday Tuesday, it looks for all the world like there’s nothing there except an ordinary knife and fork.

Now this situation would not strike us, were we to see it, as one where there is a lounge suite and home theatre unit in DQ’s apartment midday Tuesday, for it looks as if there’s an ordinary knife and fork there. But still, the author gets to say that what’s in DQ’s apartment as the story opens includes an oval television. And this despite the fact that the two concepts, TELEVISION and OVAL, are grokking. Perhaps some epicycles could be added to Yablo’s theory to solve this problem, but for now the solution is incomplete.

6 Virtue

The content cases may remind us of one of Fodor’s most famous lines about meaning.

I suppose that sooner or later the physicists will complete the catalogue they’ve been compiling of the ultimate and irreducible properties of things. When they do, the likes of spin, charm, and charge will perhaps appear on the list. But aboutness surely won’t; intentionality doesn’t go that deep . . . If the semantic and the intentional are real properties of things, it must be in virtue of their identity with (or maybe their supervenience on?) properties that are themselves neither intentional nor semantic. If aboutness is real, it must really be something else. (Fodor, 1987, 97)

If meaning doesn’t go that deep, but there are meaning facts, then those facts must hold in virtue of more fundamental facts. “Molino de viento” means windmill in Spanish in virtue of a pattern of usage of those words by Spanish speakers, for instance.

It seems that many of the stories above involve facts that hold, if they hold at all, in virtue of other facts. Had Fodor other interests than intentionality, he might have written instead that beauty doesn’t go that deep, and neither does television. If an event is to be beautiful, this is a fact that must obtain in virtue of other facts about it, perhaps its integrity, wholeness, symmetry and radiance as Aquinas says (Joyce, 1944/1963, 212), and that event being a monster truck death match of doom probably precludes those facts from obtaining. 7 If Quixote’s favourite item of furniture is to be a television, this must be in virtue of its filling certain functional roles, and being indistinguishable from a common knife probably precludes that.

What is it for a fact to obtain in virtue of other facts obtaining? A good question, but not one we will answer here. Still, the concept seems clear enough that we can still use it, as Fodor does. What we have in mind by ‘virtue’ is understandable from the examples. One thing to note from the top is that it is not just supervenience: whether x is good supervenes on whether it is good, but it is not good in virtue of being good. How much our concept differs from supervenience is a little delicate, but it certainly differs.
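The point that in-virtue-of is not mere supervenience can be put schematically. The notation below is introduced purely for illustration and is not the paper’s own; the goodness example marks the one uncontroversial difference, on the diagonal:

```latex
% Notation introduced for illustration only (not the paper's own):
% Supervenes(p, q): p supervenes on q.
% InVirtueOf(p, q): p obtains in virtue of q.
%
% Supervenience is reflexive -- whether x is good trivially
% supervenes on whether x is good:
\[
  \forall p \;\; \mathrm{Supervenes}(p,\, p)
\]
% The in-virtue-of relation is irreflexive -- nothing is good in
% virtue of being good:
\[
  \forall p \;\; \neg\, \mathrm{InVirtueOf}(p,\, p)
\]
```

How far the two relations diverge beyond the diagonal is, as the text says, delicate; the schema records only the difference the goodness example establishes.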

Returning to our original example, moral properties are also less than perfectly fundamental. It is not a primitive fact that the butcher or the baker is generous, but a fact that obtains in virtue of the way they treat their neighbours. It is not a primitive fact that what Craig does is wrong, but a fact that obtains in virtue of the physical features of his actions.

How are these virtuous relations relevant to the puzzles? To a first approximation, these relations are always imported into stories and into imagination. The puzzles arise when we try to tell stories or imagine scenes where they are violated. The rest of the paper will be concerned with making this claim more precise, motivating it, and arguing that it solves the puzzles. In making the claim precise, we will largely be qualifying it.

The first qualification follows from something we noted at the end of section 2. We don’t know whether puzzles like the ones with which we started arise whenever there is a clash between real-world morality (or epistemology or mereology) and the morality (or epistemology or mereology) the author tries to put in the story. We do know they arise for simple stories and direct invitations to imagine. So if we aren’t to make claims that go beyond our evidence, we should say there is a default assumption that these relations are imported into stories or imaginations, and it is not easy to overcome this assumption. (I will say for short there is a strong default assumption, meaning just that an author cannot cancel the assumption by saying so, and that we cannot easily follow invitations to imagine that violate the relations.)

The second qualification is that sometimes we simply ignore, either in fiction or imagination, what goes on at some levels of detail. This means that sometimes, in a sense, the relations are not imported into the story. For instance, for it really to be true that in a language “glory” means a nice knockdown argument, this must be true in virtue of facts about how the speakers of that language use, or are disposed to use, “glory”. But we can simply say in a story that “glory” in a character’s language means a nice knockdown argument without thereby making any more general facts about usage or disposition to use true in the story. 8 More generally, we can simply pick a level of conceptual complexity at which to write our story or conduct our imaginings. Even if those concepts apply, when they do, in virtue of more basic facts, no more basic facts need be imported into the story. For a more vivid, if more controversial, example, one might think that cows are cows in virtue of their DNA having certain chemical characteristics. But when we imagine a cow jumping over the moon, we need not imagine anything about chemistry. Those facts are simply below the radar of our imagining. What do we mean then when we say that these relations are imported into the story? Just that if the story regards both the higher-level facts and the lower-level facts as being within its purview, then they must match up. This does not rule out the possibility of simply leaving out all lower-level facts from the story. In general the same thing is true for imagining, though we will look at some cases below where it seems there is a stronger constraint on imagining.

7 Although it isn’t obvious just which of the Thomistic properties the death match lacks.

8 Do we make facts about the actual speaker’s usage true in the story? No. The character might have idiosyncratic reasons for not using the word “glory”, and for ignoring all others who use it. That’s consistent with the word meaning a nice knockdown argument.

The third qualification is needed to handle an example pressed on me by a referee. Recall our example Fixing a Hole.

Fixing a Hole

DQ and his buddy SP leave DQ’s apartment at midday Tuesday, leaving a well-arranged lounge suite and home theatre unit, featuring DQ’s prized oval television. They travel back in time to Monday, where DQ has some rather strange and unexpected adventures. He intended to correct something that happened yesterday, that had gone all wrong the first time around, and by the time the buddies reunite and leave for Tuesday (by sleeping and waking up in the future) he’s sure it’s all been sorted. When DQ and his buddy SP get back to his apartment midday Tuesday, it looks for all the world like there’s nothing there except an ordinary knife and fork.

In this story it seems that on Tuesday there is a television that looks exactly like a knife. If we interpret the claim about the relations between higher-level facts and the lower-level facts as a kind of impossibility claim, e.g. as the claim that a conjunction p ∧ q is never true in a story if the conditional ‘If q, then p is false in virtue of q being true’ is true, then we have a problem. Let p be the claim that there is a television, and let q be the claim that the only things in the apartment looked like a knife and fork. If that’s how the more basic phenomenal and functional facts are, then there isn’t a television in virtue of those facts. (That is, this relation between phenomenal and functional facts and facts about where the televisions are really holds.) So this rule would say p ∧ q could not be true in the story. But in fact p ∧ q is true in the story.

The difficulty here is that Fixing a Hole is a contradictory story, and contradictory stories need care. First, here’s how we should interpret the rule:

Virtue

If p is the kind of claim that if true must be true in virtue of lower-level facts, and if the story is about those lower-level facts, then it must be true in the story that there is some true proposition r which is about those lower-level facts such that p is true in virtue of r.
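The contrast between Virtue and the rejected impossibility reading can be displayed schematically. The ‘true in the story’ operator S and the predicate letters below are introduced purely for illustration; they are not notation the paper itself uses:

```latex
% S(phi): phi is true in the story (notation for illustration only).
%
% Rejected impossibility reading -- refuted by Fixing a Hole, where
% p = "there is a television" and q = "the only things in the
% apartment looked like a knife and fork":
\[
  \bigl(q \rightarrow \mathrm{InVirtueOf}(\neg p,\, q)\bigr)
  \;\Longrightarrow\; \neg\, S(p \wedge q)
\]
% Virtue -- the existential quantifier sits inside S, so the
% grounding witness r need only be true in the story:
\[
  S\Bigl(\exists r\, \bigl[\mathrm{LowerLevel}(r) \wedge r \wedge
  \mathrm{InVirtueOf}(p,\, r)\bigr]\Bigr)
\]
```

Placing the quantifier inside S is what lets Fixing a Hole satisfy Virtue: the witness (how DQ’s television looked before his trip) is true in the story, even though, the story being contradictory, no such witness is true in reality.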



In Fixing a Hole there are some true lower-level claims that are inconsistent with there being a television. But there is also in the story a true proposition about how DQ’s television looked before his time-travel misadventure. And it is true (both in reality and in the story) that something is a television in virtue of looking that way. (Note that we don’t say there must be some proposition r that is true in the story in virtue of which p is true. For there is no fact of the matter in Fixing a Hole about how DQ’s television looked before he left. So in reality we could not find such a proposition. But it is true in the story that his television looks some way or other, so as long as we talk about what in the story is true, and don’t quantify over propositions that are (in reality) true in the story, we avoid this pitfall.)

So my solution to the alethic puzzle is that Virtue is a strong default principle of fictional interpretation. I haven’t done much yet to motivate it, apart from noting that it seems to cover a lot of the cases that have been raised without overgenerating in the manner of the impossible solution. A more positive motivation must wait until I have presented my solutions to the phenomenological and imaginative puzzles. I’ll do that in the next section, then in §8 tell a story about why we should believe Virtue.

7 More Solutions

7.1 The Phenomenological Puzzle

My solution here is essentially the same as Gendler’s. She thinks that when we strike a sentence that generates imaginative resistance we respond with something like, “That’s what you think!” What makes this notable is that it’s constitutive of playing the fiction game that we not normally respond that way, that we give the author some flexibility in setting up a world. I think that’s basically right, but a little more is needed to put the puzzle to bed.

Sometimes the “That’s what you think!” response does not constitute abandoning the fiction game. At times it is the only correct way to play the game. It’s the right thing to say to Lily when reading the first line of The Dead. (Maybe it would be rude to say it aloud to poor Lily, who is run off her feet after all, but it’s appropriate to think it.) This pattern recurs throughout Dubliners. When in Eveline the narrator says that Frank has sailed around the world, the right reaction is to say to Eveline (or whoever is narrating then), “That’s what you think!” There’s a cost to playing the game this way. We end up knowing next to nothing about Frank. But it is not as if making the move stops us playing, or even stops us playing correctly. It’s part of the point of Eveline that we know next to nothing about Frank.

What makes cases like Death and Victory odd is that our reaction is directed at someone who isn’t in the story. One of Alex Byrne’s (1993) criticisms of Lewis was that on Lewis’s theory it is true in every story that the story is being told. Byrne argued that in many fictions it is not true that in the fictional world there is someone sufficiently knowledgeable to tell the story. In these fictions, we have a story without a storyteller. If there are such stories, then presumably Death and Victory are amongst them. It is not a character in the story who ends by saying that Craig’s action was right or that Quixote’s apartment contains a television. The author says that, and hence deserves our reproach, but the author isn’t in the story. Saying “That’s what you think!” directly to him or her breaks the fictional spell, for suddenly we have to recognise a character not in the fictional world.

This proposal for the phenomenological puzzle yields a number of predictions which seem to be true and interesting. First, a story that has a narrator should not generate a phenomenological puzzle, even when outlandish moral claims are made. The more prominent the narrator, the less striking the moral claim. Imagine, for example, a version of Death where the text purports to be Craig’s diary, and it includes, naturally enough, his own positive evaluation of what he did. We wouldn’t believe him, of course, but we wouldn’t be struck by the claim the same way we are in the actual version of Death.

One might have thought that what is shocking is what we discover about the author. But this isn’t right, as can be seen if we reflect on stories that contain Craig’s diary. It is possible, difficult but possible, to embed the diary entry corresponding to Death in a longer story where it is clear that the author endorses Craig’s opinions. (Naturally I won’t do this. Examples have to come to an end somewhere.) Such a story would, in a way, be incredibly shocking. But it wouldn’t make the final line shocking in just the way that the final line of Death is shocking. Our reactions to these cases suggest that the strikingness of the last line of Death is not a function of what it reveals about the author, but of how it reveals it.

The final prediction my theory makes is somewhat more contentious. Some novels announce themselves as works of fiction. They go out of their way to prevent you from ignoring the novel’s role as mediation to a fictional world. (For an early example of this, consider the sudden appearance of newspaper headlines in the ‘Aeolus’ episode of Ulysses.) In such novels we already have to recognise the author as a player in the fictional game, if not a character in the story. I predict that sentences where we do not take what is written to really be true in the story, even though this is what the author intended, should be less striking in these cases, because we are already used to reacting to the author as such rather than just to the characters. Such books go out of their way to break the fictional spell, so spell breaking should matter less in these cases. I think this prediction is correct, although the works in question tend to be so complicated that it is hard to generate clear intuitions about them.

7.2 The Imaginative Puzzle

Imagine, if you will, a chair. Have you done so? Good. Let me make some guesses about what you imagined. First, it was a specific kind of chair. There is a fact of the matter about whether the chair you imagined is, for example, an armchair or a dining chair or a classroom chair or an airport lounge chair or an outdoor chair or an electric chair or a throne. We can verbally represent something as being a chair without representing it as being a specific kind of chair, but imagination cannot be quite so coarse. 9

Secondly, what you imagined was incomplete in some respects. You possibly imagined a chair that if realised would contain some stitching somewhere, but you did not imagine any details about the stitching. There is no fact of the matter about how the chair you imagined holds together, if indeed it does. If you imagined a chair by imagining bumping into something chair-like in the dead of night, you need not have imagined a chair of any colour, although in reality the chair would have some colour or other. 10

Were my guesses correct? Good. The little I needed to know about imagination to get those guesses right goes a long way towards solving the puzzle.

Chairs are not very distinctive. Whenever we try to imagine that a non-fundamental property is instantiated, the content of our imagining will be to some extent more specific than just that the object imagined has the property, but not so much more specific as to amount to a complete description of a possibile. It’s the latter fact that does the work in explaining how we can imagine impossible situations. If we were, foolishly, to try to fill in all the details of the impossible science fiction cases it would be clear they contained not just impossibilities, but violations of Virtue, and then we would no longer be able to imagine them. But we can imagine the restaurant at the end of the universe without imagining it in all its glorious gory detail. And when we do so our imagining appears to contain no such violations.

But why can’t we imagine these violations in fictions? It is primarily because we can only imagine the higher-level claim some way or another, just as we only imagine a chair as some chair or other, and the instructions that go along with the fiction forbid us from imagining any relevant lower-level facts that would constitute the truth of the higher-level claim. We have not stressed it much above, but it is relevant that fictions understood as invitations to imagine have a “That’s all” clause. 11 We are not imagining Death if we imagine that Jack and Jill had just stopped arguing with each other and were about to shoot everyone in sight when Craig shot them in self-defence. The story does not explicitly say that wasn’t about to happen. It doesn’t include a “That’s all” clause. But such clauses have to be understood. So not only are we instructed to imagine something that seems incompatible with Craig’s action being morally acceptable, we are also instructed (tacitly) to not imagine anything that would make it the case that his action is morally acceptable. But we can’t simply imagine moral goodness in the abstract; to imagine it we have to imagine a particular kind of goodness.

9 This relates to another area in which my solution owes a debt to Gendler’s solution. Supposing can be coarse in a way that imagining cannot. We can suppose that Jack sold a chair without supposing that he sold an armchair or a dining chair or any particular kind of chair at all. Gendler concludes that what we do in fiction, where we try and imagine the fictional world, is very different to what we do, say, in philosophical argumentation, where we often suppose that things are different to the way they actually are. We can suppose, for the sake of argument as it’s put, that Kantian or Aristotelian ethical theories are entirely correct, even if we have no idea how to imagine either being correct. Thanks to Tyler Doggett for pointing out the connection to Gendler here.

10 Thanks to Kendall Walton for pointing out this possibility.

11 “That’s all” clauses play a distinct, but related, role in Jackson (1998, Ch. 1). It’s also crucial to my solution to the alethic puzzle that there be a “That’s all” clause in the story. What’s problematic about these cases is that the story (implicitly) rules out there being the lower-level facts that would make the expressed higher-level claims true.



7.3 Two Thoughts Too Many?

I have presented three solutions to the three different puzzles with which we started. Might it not be better to have a uniform solution? No, because although the puzzles are related, they are not identical. Three puzzles demand three solutions.

We saw already that the phenomenological puzzle is different to the other two. If we rewrite Death as Craig’s diary there would be nothing particularly striking about the last sentence, certainly in the context of the story as so told. But the last sentence generates alethic and imaginative puzzles. Or at least it could generate these puzzles if the author has made it clear elsewhere in the story that Craig’s voice is authoritative. So we shouldn’t expect the same solution to that puzzle as to the other two.

The alethic puzzle is different to the other two because ultimately it depends on<br />

what the moral and conceptual truths are, not on what we take them to be. Consider<br />

the following story.<br />

The Benefactor<br />

Smith was a very generous, just and in every respect moral man. Every<br />

month he held a giant feast for the village where they were able to escape<br />

their usual diet of grains, fruits and vegetables to eat the many and varied<br />

meats that Smith provided for them.<br />

Consider in particular, as should be easy for some, how Benefactor reads to someone<br />

who believes that we are morally required to be vegetarian if this is feasible. In<br />

Benefactor it is clear in the story that most villagers can survive on a vegetarian diet.<br />

So it is morally wrong to serve them the many and varied meats that Smith does.<br />

Hence such a reader should disagree with the author’s assessment that Smith is moral<br />

‘in every respect’. Such a reader will think that in fact in the story Smith is quite<br />

immoral in one important respect.<br />

Now for our final assumption. Assume it is really true that we morally shouldn’t<br />

eat meat if it is avoidable. Since the ethical vegetarians have true ethical beliefs about<br />

the salient facts here, it seems plausible that their views on what is true in the story<br />

should carry more weight than ours. (I’m just relying on a general epistemological<br />

principle here: other things being equal trust the people who have true beliefs about<br />

the relevant background facts.) So it seems that it really is false in the story that<br />

Smith is in every respect moral. Benefactor raises an alethic puzzle even though for<br />

non-vegetarians it does not raise a phenomenological or imaginative puzzle.<br />

This point generalises, so we need not assume that vegetarianism<br />

is true or that our typical reader is not vegetarian. We can be very confident<br />

that some of our ethical views will be wrong, though for obvious reasons it is hard<br />

to say which ones. Let p be a false moral belief that we have. And let S be a story<br />

in which p is asserted by the (would-be omniscient) narrator. For reasons similar to<br />

what we said about Benefactor, p is not true in S. But S need not raise any imaginative<br />

or phenomenological puzzles. Hence the alethic puzzle is different to the other two<br />

puzzles.


8 Why Virtue Matters<br />

I owe you an argument for why authors should be unable to easily generate violations<br />

of Virtue, though there is no general bar on making impossibilities true in a story.<br />

My general claims here are not too dissimilar to Yablo’s solution to the puzzles, but<br />

there are a couple of distinctive new points. Before we get to the argument, it’s time<br />

for another story.<br />

Three design students walk into a furniture showroom. The new season’s fashions<br />

are all on display. The students are all struck by the pièce de résistance, though<br />

they are all differently struck by it. Over drinks later, it is revealed that while B and C<br />

thought it was a chair, A did not. But the differences did not end there. When asked<br />

to sketch this contentious object, A and B produced identical sketches, while C’s recollections<br />

were drawn somewhat differently. B clearly disagrees with both A and C,<br />

but her differences with each are quite different. With C she disagrees on some simple<br />

empirical facts, what the object in question looked like. With A she disagrees on a<br />

conceptual fact, or perhaps a semantic fact, whether the concept CHAIR, or perhaps<br />

just the term ‘chair’, applies to the object in question. As it turns out, A and B agree<br />

that ‘chair’ means CHAIR, and agree that CHAIR is a public concept so one of them<br />

is right and the other wrong about whether this object falls under the concept. In<br />

this case, their disagreement will have a quite different feel to B’s disagreement with<br />

C. It may well be that there is no analytic/synthetic distinction, and that questions<br />

about whether an object satisfies a concept are always empirical questions, but this is<br />

not how it feels to A and B. They feel that they agree on what the world is like, or<br />

at least what this significant portion of it is like, and disagree just on which concepts<br />

apply to it.<br />

The difference between these two kinds of disagreement is at the basis of our attitudes<br />

towards the alethic puzzle. It may look like we are severely cramping authorial<br />

freedom by not permitting violations of Virtue. 12 From A and B’s perspective, however,<br />

this is no restriction at all. Authors, they think, are free to stipulate which<br />

world will be the site of their fiction. But as their disagreement about whether the<br />

pièce de résistance was a chair showed, we can agree about which world we are discussing<br />

and disagree about which concepts apply to it. The important point is that<br />

the metaphysics and epistemology of concepts comes apart here.<br />

There can be no difference in whether the concept CHAIR applies without a<br />

difference in the underlying facts. But there can be a difference of opinion about<br />

whether a thing is a chair without a difference of opinion about the underlying facts.<br />

The fact that it’s the author’s story, not the reader’s, means that the author gets to<br />

say what the underlying facts are. But that still leaves the possibility for differences<br />

of opinion about whether there are chairs, and on that question the author’s opinion<br />

is just another opinion.<br />
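The two-way split in the last paragraph can be put schematically. The metaphysical half is a supervenience claim; in a formalisation of my own (the symbols B and w are my shorthand, not notation from the text):

```latex
% CHAIR supervenes on the underlying facts B: no difference in whether
% CHAIR applies without a difference in B (schematic, my own notation).
\forall w \,\forall w' \; \bigl( B(w) = B(w') \;\rightarrow\;
    ( \mathrm{CHAIR}(w) \leftrightarrow \mathrm{CHAIR}(w') ) \bigr)
```

The epistemological half is the denial of the parallel claim about opinions: two readers can agree completely about B while disagreeing about whether CHAIR applies.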

Authorial authority extends as far as saying which world is fictional in a story;<br />

it does not extend as far as saying which concepts are instantiated there. Since the<br />

12 Again, it is worth noting that I am not ruling out any violation of Virtue, just easy violations of it.<br />

The point being made in the text is that even a blanket ban on violations would not be a serious restriction<br />

on authorial freedom.


main way that we specify which world is fictional is by specifying which concepts are<br />

instantiated at it, authorial authority will usually let authors get away with any kind<br />

of conceptual claim. But once we have locked onto the world being discussed, the<br />

author has no special authority to say which concepts, especially which higher-level<br />

concepts like RIGHT or FUNNY or CHAIR, are instantiated there.<br />

(Does it matter much that the distinction between empirical disagreements and<br />

conceptual disagreements with which I started might turn out not to rest on very<br />

much? Not really. I am trying to explain why we have the attitudes towards fiction<br />

that we do, which in turn determines what is true in fiction generally. All<br />

that matters is that people generally think that there is something like a conceptual<br />

truth/empirical truth distinction, and I think enough people would agree that A and<br />

B’s disagreement is different in kind from B and C’s disagreement to show that this is<br />

true. If folks are generally wrong about this, if there is no difference in kind between<br />

conceptual truths and empirical truths, then our communal theory of truth<br />

in fiction will rest on some fairly untenable supports. But it will still be our theory,<br />

although any coherent telling of it will have to be in terms of things that are taken to<br />

be conceptual truths and things that are taken to be empirical truths.)<br />

This explanation of why authorial authority collapses just when it does yields<br />

one fairly startling, and I think true, prediction. I argued above that authors could<br />

not easily generate violations of Virtue. That this is impossible is compatible with<br />

any number of hypotheses about how readers will resolve those impossibilities that<br />

authors attempt to slip in. The story here, that authors get to say which world is at<br />

issue but not which concepts apply to it, yields the prediction that readers will resolve<br />

the tension in favour of the lower-level claims. When given a physical description of a<br />

world and an incompatible moral description, we will take the physical description to<br />

fix which world is at issue and reduce the moral description to a series of questionable<br />

claims about the world. Compare what happens with A, B and C. We take A and B<br />

to agree about the world and disagree about concepts, rather than, say, taking B and<br />

C to agree about what the world is like (there’s a chair at the heart of the furniture<br />

show) and saying that A and B disagree about the application of some recognitional<br />

concepts. This prediction is borne out in every case discussed in §2. We do not<br />

conclude that Craig did not really shoot Jack and Jill, because after all the world at<br />

issue is stipulated to be one where he did the right thing. Even more surprisingly, we<br />

do not conclude that Quixote’s furniture does not look like kitchen utensils, because<br />

it consists of a television and an armchair. This is surprising because in Victory I<br />

never said that the furniture looked like kitchen utensils. The tacit low-level claim<br />

about appearances is given precedence over the explicit high-level claims about which<br />

objects populate Quixote’s apartment. The theory sketched here predicts that, and<br />

supports the solution to the alethic puzzle sketched in §5, which is good news for<br />

both the theory and the solution.<br />

It’s been a running theme here that the puzzles do not have anything particularly<br />

to do with normativity. But some normative concepts raise the kind of issues<br />

about authority mentioned here in a particularly striking way. There is always some<br />

division of cognitive labour in fiction. The author’s role is, among other things, to<br />

say which world is being made fictional. The audience’s role is, among other things,


to determine the artistic merit of the fictional work. On other points there may be<br />

some sharing of roles, but this division is fairly absolute. The division threatens to<br />

collapse when authors start commenting on the aesthetic quality of words produced<br />

by their characters. At the end of Ivy Day in the Committee Room Joyce has one character<br />

describe a poem just recited by another character as “A fine piece of writing”<br />

(Joyce, 1914/2000, 105). Most critics seem to be happy to accept the line, because<br />

Joyce’s poem here really is, apparently, a fine piece of writing. But to me it seems<br />

rather jarring, even if it happens to be true. It’s easy to feel a similar reaction when<br />

characters in a drama praise the words of another character. 13 This is a special, and<br />

especially vivid, illustration of the point I’ve been pushing towards here. The author<br />

gets to describe the world at whichever level of detail she chooses. But once it has<br />

been described, the reader has just as much say in which higher-level concepts apply<br />

to parts of that world. When the concepts are evaluative concepts that directly reflect<br />

on the author, the reader’s role rises from being an equal to having more say than the<br />

author, just as we normally have less say than others about which evaluative concepts<br />

apply to us.<br />

This idea is obviously similar to Yablo’s point that we get to decide when grokking<br />

concepts apply, not the author. But it isn’t quite the same. I think that if any<br />

concepts are grokking, most concepts are, so it can’t be the case that authors never<br />

get to say when grokking concepts apply in their stories. Most of the time authors<br />

will get to say which grokking concepts apply, because they have to use them to tell<br />

us about the world. What’s special about the kind of concepts that cause puzzles is<br />

that we get to decide when they apply full stop, but that we get to decide how they<br />

apply given how more fundamental concepts apply. So the conciliatory version of the<br />

relation between my picture here and Yablo’s is that I’ve been filling in, in rather<br />

laborious detail, his missing antecedent.<br />

9 Two Hard Cases<br />

The first hard case is suggested by Kendall Walton (1994). Try to imagine a world<br />

where the over-riding moral duty is to maximise the amount of nutmeg in the world.<br />

If you are like me, you will find this something of a challenge. Now consider a story<br />

Nutmeg that reads (in its entirety!): “Nobody ever discovered this, but it turned out that<br />

all along their over-riding moral duty was to maximise the amount of nutmeg in the<br />

world.” What is true in Nutmeg? It seems that there are no violations of Virtue here,<br />

but it is hard to imagine what is being described.<br />

The second hard case is suggested by Tamar Szabó Gendler (2000). (I’m simplifying<br />

this case a little, but it’s still hard.) In her Tower of Goldbach, God decrees that 12<br />

shall no longer be the sum of two primes, and from this it follows (even in the story)<br />

13 For a while this would happen frequently on the TV series The West Wing. President Bartlet would<br />

deliver a speech, and afterwards his staffers would congratulate themselves on what a good speech it was.<br />

The style of the congratulations was clearly intended to convey the author’s belief that the speech they<br />

themselves had written was a good speech, not just the characters’ beliefs to this effect. When in fact it<br />

was a very bad speech, this became very jarring. In later series they would often not show the speeches in<br />

question and hence avoid this problem.


that it is not the sum of 7 and 5. (It is not clear why He didn’t just make 5 no longer<br />

prime, say the product of 68 and 57. That may have been simpler.) Interestingly, this<br />

has practical consequences. When a group of seven mathematicians from one city<br />

attempts to join a group of five from another city, they no longer form a group of<br />

twelve. Again, two questions. Can we imagine a Goldbachian situation, where 7 and<br />

5 do not equal 12? Is it true in Gendler’s story that 7 and 5 do not equal 12? If we cannot<br />

imagine Goldbach’s tower, where is the violation of Virtue?<br />

First, a quick statement of my responses to the two cases; then I’ll end with my<br />

detailed responses. To respond properly we need to tease apart the alethic and imaginative<br />

puzzles. I claim that the alethic puzzle only arises when there’s a violation of<br />

Virtue. There’s no violation in either story, so there is no alethic puzzle. I think there<br />

are independent arguments for this conclusion in both cases. We can’t imagine either<br />

(if we can’t) because any way of filling in the more basic facts leads to violations.<br />

It follows from my solution to the alethic puzzle that Nutmegism (Tyler Doggett’s<br />

name for the principle that we must maximise quantities of nutmeg) could be<br />

true in a story. There is no violation in Nutmeg, since there are no lower level claims<br />

made. Still, the story is very hard to imagine. The reason for this is quite simple. As<br />

noted, we cannot just imagine a chair, we have to imagine something more detailed<br />

that is a chair in virtue of its more basic properties. (There is no particular more basic<br />

property we need imagine, as is shown by the fact that we can imagine a chair just<br />

by imagining something with a certain look, or we can imagine a chair in the dark<br />

with no visual characteristics. But there is always something more basic.) Similarly to<br />

imagine a duty, we have to imagine something more detailed, in this case presumably<br />

a society or an ecology, in virtue of which the duty exists. But no such possible, or<br />

even impossible, society readily springs to mind. So we cannot imagine Nutmegism<br />

is true.<br />

But it is hard to see how, or why, this inability should be raised into a restriction<br />

on what can be true in a story. One might think that what is wrong with Nutmeg<br />

is that the fictional world is picked out using high-level predicates. If we extend the<br />

story any way at all, the thought might go, we will generate a violation of Virtue.<br />

And that is enough to say that Nutmegism is not true in the story. But actually this<br />

isn’t quite right. If we extend the story by adding more moral claims (there is no<br />

duty to minimise suffering, there is no duty to help the poor, etc.), there are still no<br />

violations in the story. The restriction we would have to impose is that there is no<br />

way of extending the story to fill out the facts in virtue of which the described facts<br />

obtain, without generating a violation. But that looks like too strong a constraint,<br />

mostly because if we applied it here, to rule out Nutmegism being true in Nutmeg, we<br />

would have to apply it to every story written in a higher level language than that of<br />

microphysics. It doesn’t seem true that we have to be able to continue a story all the<br />

way to the microphysical before we can be confident that what the author says about,<br />

for instance, where the furniture in the room is. So there’s no reason to not take the<br />

author’s word in Nutmeg, and since the default is always that what the author says is<br />

true, Nutmegism is true in the story.<br />

The mathematical case is more difficult. The argument that 7 and 5 could fail<br />

to equal 12 in the story turns on an example by Gregory Currie (1990). (The main


conclusions of this example are also endorsed by Byrne (1993).) Currie imagines a<br />

story in which the hero refutes Gödel’s Incompleteness Theorem. Currie argues that<br />

the story could be written in such a way that it is true in the story not merely that<br />

everyone believes our hero refuted Gödel, but that she really did. But if it could be<br />

true in a story that Gödel’s Incompleteness Theorem could be false, then it’s hard to<br />

see just why it could not be true in a story that a simpler arithmetic claim, say that 7<br />

and 5 make 12, could also be false. Anything that can’t be true in a story can’t be true<br />

in virtue of some feature it has. The only difference between Gödel’s Incompleteness<br />

Theorem and a simple arithmetic statement appears to be the simplicity of the simple<br />

statement. And it doesn’t seem possible, or advisable, to work that kind of feature<br />

into a theory of truth in fiction.<br />

The core problem here is that how simple a mathematical impossibility seems is very<br />

much a function of the reader’s mathematical knowledge and acumen. Some readers<br />

probably find the unique prime factorisation theorem so simple and evident that for<br />

them a story in which it is false is as crashingly bad as a story in which 7 and 5<br />

do not make 12. For other readers, it is so complex that a story in which it has<br />

a counterexample is no more implausible than a story in which Gödel is refuted.<br />

I think it cannot be true for the second reader that the unique prime factorisation<br />

theorem fails in the story and false for the first reader. That amounts to a kind of<br />

relativism about truth in fiction that seems preposterous. But I agree with Currie<br />

that some mathematical impossibilities can be true in a fiction. So I conclude that,<br />

whether it is imaginable or not, it could be true in a story that 7 and 5 do not equal 12.<br />

I think, however, that it is impossible to imagine that 7 plus 5 doesn’t equal 12.<br />

Can we explain that unimaginability in the same way we explained why Nutmeg<br />

couldn’t be imagined? I think we can. It seems that the sum of 7 and 5 is what<br />

it is in virtue of the relations between 7, 5 and other numbers. It is not primitive<br />

that various sums take the values they take. That would be inconsistent with, for<br />

example, it being constitutive of addition that it’s associative, and associativity does<br />

seem to be constitutive of addition. We cannot think about 7, 5, 12 and addition<br />

without thinking about those more primitive relations. So we cannot imagine 7 and<br />

5 equalling anything else. Or so I think. There’s some rather sophisticated, or at<br />

least complicated, philosophy of mathematics in the story here, and not everyone<br />

will accept all of it. So we should predict that not everyone will think that these<br />

arithmetic claims are unimaginable. And, pleasingly, not everyone does. Gendler,<br />

for instance, takes it as a data point that Tower of Goldbach is imaginable. So far so<br />

good. Unfortunately, if this story is right we should also expect that whether people<br />

find the story imaginable links up with the various philosophies of mathematics they<br />

believe. And the evidence for that is thin. So there may be more work to do here.<br />

But there is clearly a story that we can tell that handles the case.
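The idea that sums take their values in virtue of more primitive relations can be illustrated with a standard Peano-style derivation (this presentation is mine, not the paper’s): once addition is defined recursively, the value of 7 + 5 is fixed by the definition rather than stipulated case by case.

```latex
% Addition defined recursively on the successor function S:
%   n + 0 = n,  and  n + S(m) = S(n + m).
\begin{align*}
7 + 5 &= 7 + S(4) = S(7 + 4) = S^{2}(7 + 3) = S^{3}(7 + 2) \\
      &= S^{4}(7 + 1) = S^{5}(7 + 0) = S^{5}(7) = 12.
\end{align*}
```

To imagine 7 and 5 summing to anything other than 12, one would have to imagine this recursion, or associativity, failing; and that is just the kind of unimaginability at issue.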


David Lewis<br />

David Lewis (1941–2001) was one of the most important philosophers<br />

of the 20th Century. He made significant contributions to philosophy<br />

of language, philosophy of mathematics, philosophy of science, decision<br />

theory, epistemology, meta-ethics and aesthetics. In most of these fields<br />

he is essential reading; in many of them he is among the most important<br />

figures of recent decades. And this list leaves out his two most significant<br />

contributions.<br />

In philosophy of mind, Lewis developed and defended at length a new<br />

version of materialism (see the entry on physicalism). He started by<br />

showing how the motivations driving the identity theory of mind and<br />

functionalism could be reconciled in his theory of mind. He called<br />

this an identity theory, though his theory motivated the position now<br />

known as analytic functionalism. And he developed detailed accounts<br />

of mental content (building on Davidson’s interpretationism) and phenomenal<br />

knowledge (building on Nemirow’s ability hypothesis) that are<br />

consistent with his materialism. The synthesis Lewis ended up with is<br />

one of the central positions in contemporary debates in philosophy of<br />

mind.<br />

But his largest contributions were in metaphysics. One branch of his<br />

metaphysics was his Hume-inspired reductionism about the nomological.<br />

He developed a position he called “Humean supervenience”, the theory<br />

that said that there was nothing to reality except the spatio-temporal<br />

distribution of local natural properties. And he did this by showing in<br />

detail how laws, chances, counterfactual dependence, causation, dispositions<br />

and colours could be located within this Humean mosaic. The<br />

other branch of his metaphysics was his modal realism. Lewis held that<br />

the best theory of modality posited concrete possible worlds. A proposition<br />

is possible iff it is true at one of these worlds. Lewis defended this<br />

view in his most significant book, On the Plurality of Worlds. Alongside<br />

this, Lewis developed a new account of how to think about modal properties<br />

of individuals, namely counterpart theory, and showed how this<br />

theory resolved several long-standing puzzles about modal properties.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Stanford<br />

Encyclopaedia of Philosophy. I’ve learned a lot over the years from talking about Lewis’s philosophy with<br />

Wolfgang Schwarz. I trust his book (2009) is excellent on all these topics, but unfortunately it’s only out in<br />

German so far, which I don’t read. But a lot of important points are collected on his blog, which is listed<br />

under other internet resources. The best book in English on Lewis is Daniel Nolan’s David Lewis (2005).<br />

Without that book, section 7.5 of this entry wouldn’t exist, section 6.3 would be unintelligible, and every<br />

section would be worse. Much of the biographical information in the introduction is taken from Hájek<br />

(2010). Many people helpfully spotted typos and infelicities of expression in earlier versions of this entry.<br />

Thanks especially to Zachary Miller for many suggested improvements and revisions. The bibliography is<br />

based in large part on a bibliography provided to me by Stephanie Lewis.


David Lewis 52<br />

1 Lewis’s Life and Influence<br />

As we’ve already seen, part of Lewis’s significance came from the breadth of subject<br />

matter on which he made major contributions. It is hard to think of a philosopher<br />

since Hume who has contributed so much to so many fields. And in all of these cases,<br />

Lewis’s contributions involved defending, or in many cases articulating, a big picture<br />

theory of the subject matter, as well as an account of how the details worked. Because<br />

of all his work on the details of various subjects, his writings were a font of ideas even<br />

for those who didn’t agree with the bigger picture. And he was almost invariably<br />

clear about which details were relevant only to his particular big picture, and which<br />

were relevant to anyone who worked on the subject.<br />

Lewis was born in Oberlin, Ohio in 1941, to two academics. He was an undergraduate<br />

at Swarthmore College. During his undergraduate years, his interest in<br />

philosophy was stimulated by a year abroad in Oxford, where he heard J. L. Austin’s<br />

final series of lectures, and was tutored by Iris Murdoch. He returned to Swarthmore<br />

as a philosophy major, and never looked back. He studied at Harvard for his Ph.D.,<br />

writing a dissertation under the supervision of W. V. O. Quine that became his first<br />

book, Convention. In 1966 he was hired at UCLA, where he worked until 1970,<br />

when he moved to Princeton. He remained at Princeton until his death in 2001.<br />

While at Harvard he met his wife Stephanie. They remained married throughout<br />

Lewis’s life, jointly attended numerous conferences, and co-authored three papers.<br />

Lewis visited Australia in 1971, 1975, every year from 1979 to 1999, and again shortly<br />

before his death in 2001.<br />

Lewis was a Fellow of the American Academy of Arts and Sciences, a Corresponding<br />

Fellow of the British Academy, and an Honorary Fellow of the Australian<br />

Academy of the Humanities. He received honorary doctorates from the University<br />

of Melbourne, the University of York in England, and Cambridge University. His<br />

Erdős number was 3.<br />

Lewis published four books: Convention (1969a), Counterfactuals (1973b), On the<br />

Plurality of Worlds (1986b) and Parts of Classes (1991). His numerous papers have been<br />

largely collected in five volumes: Philosophical Papers Vol. I (1983d), Philosophical Papers<br />

Vol. II (1986c), Papers in Philosophical Logic (1998), Papers in Metaphysics and<br />

Epistemology (1999a) and Papers in Social Philosophy (2000). This entry starts with a<br />

discussion of Lewis’s first two books, then looks at his contributions to philosophy<br />

of mind. Sections 5 and 6 are on his metaphysics, looking in turn at Humean Supervenience<br />

and modal realism. Section 7 looks very briefly at some of the many works<br />

that have not been covered in the previous five categories.<br />

2 Convention<br />

David Lewis’s first book was Convention (1969a, note that all citations are to works<br />

by David Lewis, unless explicitly stated otherwise). It was based on his Harvard<br />

Ph.D. thesis, and published in 1969. The book was an extended response to the arguments<br />

of Quine and others that language could not be conventional. Quine’s argument<br />

was that conventions are agreements, and agreements require language, so language<br />

must be prior to any convention, not a consequence of a convention. Lewis’s<br />

response is to deny that conventions require anything like an agreement. Rather,<br />

on his view, conventions are regularities in action that solve co-ordination problems.<br />

We can stumble into such a regularity without ever agreeing to do so. And such a<br />

regularity can persist simply because it is in everyone’s best interest that it persist.<br />

2.1 Analysis of Convention<br />

Lewis viewed conventions as solutions to co-ordination problems (see Section 3.2<br />

of the entry on convention). His thinking about these problems was heavily influenced<br />

by Thomas Schelling’s work on co-operative games in The Strategy of Conflict<br />

(Schelling, 1960). Many of the key ideas in Lewis’s book come from game theory.<br />

The simplest cases in which conventions arise are ones where we are repeatedly<br />

playing a game that is purely co-operative, i.e. the payoffs to each agent are the same,<br />

and there are multiple equilibria. In such a case, we may well hope for the equilibrium<br />

to persist. At the very least, we will prefer the persistence of the equilibrium to any<br />

one person deviating from it. And we will have this preference even if we would<br />

prefer, all things considered, to be in some other equilibrium state. In such a case,<br />

there may well be a practice of continuing to play one’s part in the equilibrium that<br />

has been reached. This is a regularity in action—it involves making moves in the<br />

repeated game. Given that everyone else is following the regularity, each agent has a<br />

reason to follow the regularity; otherwise it wouldn’t be an equilibrium. But if other<br />

agents acted differently, agents would not be interested in following the regularity,<br />

since there are alternative equilibria. Because these three conditions are met, Lewis<br />

argued that the practice is really a convention, even if there was never any explicit<br />

agreement to continue it.<br />
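Lewis’s simplest case can be made concrete with a toy example. The sketch below is my illustration, not anything in Convention: it takes a two-player pure co-ordination game (the payoffs and strategy names are invented, on the model of the which-side-to-drive-on example) and checks which pure strategy profiles are equilibria, confirming that there is more than one, so the payoffs alone cannot select among them.

```python
from itertools import product

# A pure co-ordination game: both players receive the same payoff,
# and co-ordinating (on either strategy) beats failing to co-ordinate.
STRATEGIES = ["left", "right"]
PAYOFF = {("left", "left"): 1, ("right", "right"): 1,
          ("left", "right"): 0, ("right", "left"): 0}

def is_equilibrium(row, col):
    """True if neither player gains by unilaterally deviating."""
    row_ok = all(PAYOFF[(row, col)] >= PAYOFF[(r, col)] for r in STRATEGIES)
    col_ok = all(PAYOFF[(row, col)] >= PAYOFF[(row, c)] for c in STRATEGIES)
    return row_ok and col_ok

equilibria = [s for s in product(STRATEGIES, STRATEGIES) if is_equilibrium(*s)]
print(equilibria)  # [('left', 'left'), ('right', 'right')]
```

Because both co-ordinated outcomes are equilibria, neither is privileged by rationality alone; on Lewis’s view it is precedent and mutual expectation, not any agreement, that keeps a population at one of them.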

The case we started with was restricted in two important ways. First, the case<br />

involved games that were perfectly repeated. Second, it involved games where the<br />

payoffs were perfectly symmetric. Lewis’s theory of convention involved getting rid<br />

of both restrictions.<br />

Instead of focussing on repeated co-ordination problems, Lewis just focussed on<br />

repeated situations which collectively constitute a co-ordination problem. Lewis does<br />

not identify situations with games. A repeated situation may come in different ‘versions’,<br />

each of which is represented by a different game. For example, it may be that<br />

the costs of performing some kind of action differ on different occasions, so the formal<br />

game will be different, but the differences are small enough that it makes sense<br />

to have a common practice. And Lewis does not require that there be identity of<br />

interests. In Convention he does require that there be large overlap of interests, but<br />

this requirement does not do much work, and is abandoned in later writing. With<br />

those requirements weakened, we get the following definition of convention.<br />

A regularity R in the behaviour of members of a population P when they<br />

are agents in a recurrent situation S is a convention if and only if it is true


David Lewis 54<br />

that, and it is common knowledge in P that, in almost any instance of S<br />

among members of P,<br />

1. almost everyone conforms to R;<br />

2. almost everyone expects everyone else to conform to R;<br />

3. almost everyone has approximately the same preferences regarding<br />

all possible combinations of actions;<br />

4. almost everyone prefers that any one more conform to R, on condition<br />

that almost everyone conform to R;<br />

5. almost everyone would prefer that any one more conform to R′, on<br />

condition that almost everyone conform to R′,<br />

where R′ is some possible regularity in the behaviour of members of P in<br />

S, such that almost no one in almost any instance of S among members<br />

of P could conform to both R′ and to R. (Lewis, 1969a, 78)<br />

This is clearly a vague definition, with many ‘almost’s scattered throughout. But<br />

Lewis, characteristically, thought this was a feature of the view, not a bug. Our intuitive<br />

notion of a convention is vague, and any analysis of it should capture the vagueness.<br />

The idea that analyses of imprecise folk concepts should be imprecise recurs<br />

throughout Lewis’s career.<br />
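The self-sustaining character of the preference clauses (4) and (5) can be illustrated with a small computational sketch. The driving-side game, the payoff function, and the population below are all invented for illustration; this is not Lewis's own formalism, just a toy model of why each of two incompatible regularities is an equilibrium that agents prefer to sustain.

```python
# A toy model of clauses (4) and (5) of Lewis's definition of convention,
# using the driving-side case as the recurrent situation S.
# Payoffs, agents, and the choice of sides are illustrative assumptions.

def payoff(my_side, others_sides):
    """An agent gains 1 for each other agent driving on the same side."""
    return sum(1 for s in others_sides if s == my_side)

def prefers_one_more_conformity(regularity, population):
    """Given that everyone else conforms to `regularity`, does each agent
    do better by conforming than by deviating? (Clauses 4 and 5.)"""
    n = len(population)
    for _ in range(n):
        others = [regularity] * (n - 1)      # almost everyone conforms
        conform = payoff(regularity, others)
        deviate = payoff('left' if regularity == 'right' else 'right', others)
        if conform <= deviate:
            return False
    return True

population = list(range(10))
# Clause 4: conformity to R ('right') is preferred when others follow R.
assert prefers_one_more_conformity('right', population)
# Clause 5: the alternative regularity R' ('left') would equally be
# preferred if others followed it, which is what makes following R a
# convention rather than the only possible way to act.
assert prefers_one_more_conformity('left', population)
```

The existence of the second self-sustaining regularity is the point: a practice counts as conventional only because the population could have coordinated on R′ instead.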

The notion of ‘common knowledge’ that Lewis is working with here is not the<br />

standard modern notion. Lewis does not require that everyone know that everyone<br />

know etc., that all of these conditions hold. Rather, when Lewis says that it is common<br />

knowledge that p, he means that everyone has a reason to believe that p, and<br />

everyone has a reason to believe everyone has a reason to believe that p, and everyone<br />

has a reason to believe that everyone has a reason to believe everyone has a reason to<br />

believe that p, and so on. That people act on these reasons, or are known to act on<br />

these reasons, to form beliefs is unnecessary. And that the beliefs people would get if<br />

they acted on their reasons are true is also not part of the view. Hence it is necessary<br />

to specify truth as well as common belief in the definition.<br />

Lewis argues that this definition captures many of our ordinary conventions, such<br />

as the convention of driving on the right side of the road in the United States, the<br />

convention of taking certain pieces of paper as payments for debts, and, most importantly,<br />

the conventions governing the use of language.<br />

2.2 Conventions of Language<br />

In the final chapter of Convention, Lewis gives his theory of what it is for a community<br />

to speak a language (see the section on conventional theories of meaning in the<br />

entry on convention), i.e., for a community to have adopted one language as their<br />

language by convention. Lewis individuates languages largely by the truth conditions<br />

they assign to sentences. And his account of truth conditions is given in terms<br />

of possible worlds. So the truth condition of an indicative sentence is the set of possible<br />

worlds in which it is true. Somewhat less standardly, Lewis takes the truth<br />

condition for an imperative to be the set of possible worlds in which the imperative


is obeyed. (The account of language in Convention covers many different moods, but<br />

we will focus here on the account of indicatives.)<br />

The focus on truth conditions is not because Lewis thinks truth conditions are<br />

all that there are to languages. He acknowledges that languages also have ‘grammars’.<br />

A grammar, in Lewis’s sense, is a lexicon (i.e. a set of elementary constituents, along<br />

with their interpretation), a generative component (i.e. rules for combining constituents<br />

into larger constituents), and a representing component (i.e. rules for verbally<br />

expressing constituents). Lewis’s preferred interpretations are functions from<br />

possible worlds to extensions. So we can sensibly talk about the meaning of a<br />

non-sentential constituent of the language, but these meanings are derived from the truth<br />

conditions of sentences, rather than determining the meanings of sentences. That’s<br />

because, as we’ll see, what the conventions of language establish in the first instance<br />

are truth conditions for entire messages, i.e., sentences.<br />

Given this understanding of what a language is, Lewis goes on to say what it is for<br />

a population to speak a language. One natural approach would be to say that speakers<br />

and hearers face a co-ordination problem, and settling on one language to communicate<br />

in would be a solution to that problem. When Lewis is analysing signalling, that<br />

is the approach he takes. But he doesn’t think it will work for language in general.<br />

The reason is that he takes conventions to be regularities in action, and it is hard to<br />

say in general what actions are taken by hearers.<br />

So instead Lewis says that a population P speaks a language L iff there is a convention<br />

of speaking truthfully in L that persists amongst P. The parties to the co-ordination<br />

problem (and the convention that solves it) are the different people who<br />

want to communicate in P. They solve their problem by speaking truthfully (on the<br />

whole) in L.<br />

It might be wondered whether it could really be a convention to speak truthfully<br />

in L. After all, there is no obvious alternative to speaking truthfully. As Lewis points<br />

out, however, there are many natural alternatives to speaking truthfully in L; we<br />

could speak truthfully in L ′ instead. The existence of alternative languages makes our<br />

use of L conventional. And the convention can be established, and persist, without<br />

anyone agreeing to it.<br />

2.3 Later Revisions<br />

In “Languages and Language” (1975b), Lewis makes two major revisions to the picture<br />

presented in Convention. He changes the account of what a convention is, and<br />

he changes the account of just what convention must obtain in order for a population<br />

to speak a language.<br />

There are two changes to the account of convention. First, Lewis now says that<br />

conventions may be regularities in action and belief, rather than just in action. Second,<br />

he weakens the third condition, which was approximate sameness of preferences,<br />

to the condition that (almost) each agent has a reason to conform when they<br />

believe others conform. The reason in question may be a practical reason, when<br />

conformity requires action, or an epistemic reason, when conformity requires belief.<br />

In Convention, the conventions that sustained language were regularities amongst<br />

speakers. As we noted, it would be more natural to say that the conventions solved


co-ordination problems between speakers of a language and their hearers. That is<br />

what the new account of what it is for a population to speak a language does. The<br />

population P speaks the language L iff there are conventions of truthfulness and trust<br />

in L. Speakers are truthful in L iff they only utter sentences they believe are true<br />

sentences of L. Hearers are trusting in L iff they take the sentences they hear to be<br />

(generally) true sentences of L.<br />

The old account took linguistic conventions to be grounded in co-ordination between<br />

speakers generally. We each communicate in English because we think we’ll<br />

be understood that way given everyone else communicates that way, and we want to<br />

be understood. In the new account there is still this kind of many-way co-ordination<br />

between all the speakers of a language, but the most basic kind of co-ordination is<br />

a two-way co-ordination between individual speakers, who want to be understood,<br />

and hearers, who want to understand. This seems like a more natural starting point.<br />

The new account also makes it possible for someone to be part of a population that<br />

uses a language even if they don’t say anything because they don’t have anything to<br />

say. As long as they are trusting in L, they are part of the population that conforms<br />

to the linguistic regularity.<br />

John Hawthorne (1990) argued that Lewis’s account cannot explain the intuitive<br />

meaning of very long sentences. While not accepting all of Hawthorne’s reasons<br />

as to why very long sentences are a problem, in “Meaning Without Use: Reply to<br />

Hawthorne” (1992) Lewis agreed that such sentences pose a problem for his view.<br />

To see the problem, let L be the function from each sentence of English to its intuitive<br />

truth condition, and let L* be the restriction of that function to sentences that<br />

aren’t very long. Arguably we do not trust speakers who utter very long sentences<br />

to have uttered truths, under the ordinary English interpretation of their sentences.<br />

We think, as Lewis said, that such speakers are “trying to win a bet or set a record,<br />

or feigning madness or raving for real, or doing it to annoy, or filibustering, or making<br />

an experiment to test the limits of what it is humanly possible to say and mean.”<br />

(1992, 108) That means that while there may be a convention of truthfulness and trust<br />

in L*, there is no convention of trust in L in its full generality. So the “Languages and<br />

Language” theory implies that we speak L*, not L, which is wrong.<br />
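The contrast between L and L* can be rendered as a small sketch. Here a language is modelled, very crudely, as a map from sentences to truth conditions (sets of world-names), and L* is just the restriction of L to sentences that aren't very long; the sentences, truth conditions, and length cutoff are all invented for illustration.

```python
# A toy rendering of the L vs L* contrast from the long-sentences problem.
# L maps each sentence to a truth condition (here a set of world-names);
# L* is L restricted to sentences below an arbitrary length cutoff.

LENGTH_CUTOFF = 5  # "very long" threshold in words; purely illustrative

L = {
    'snow is white': {'w1', 'w2'},
    'grass is green': {'w1'},
    'snow is white and grass is green and ' * 3 + 'pigs fly': set(),
}

L_star = {s: tc for s, tc in L.items() if len(s.split()) <= LENGTH_CUTOFF}

# L* agrees with L wherever it is defined...
assert all(L[s] == L_star[s] for s in L_star)
# ...but is silent on very long sentences. Since our actual practice of
# truthfulness and trust only covers the L* fragment, the unamended theory
# would wrongly conclude that L*, not L, is our language.
assert any(s not in L_star for s in L)
```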

Lewis’s solution to this puzzle relies on his theory of natural properties, described<br />

below in Section 4.6. He argues that some grammars (in the above sense of grammar)<br />

are more natural than others. By default, we speak a language with a natural grammar.<br />

Since L has a natural grammar, and L* doesn’t, other things being equal, we should be<br />

interpreted as speaking L rather than L*. Even if other things are not quite equal, i.e.<br />

we don’t naturally trust speakers of very long sentences, if there is a convention of<br />

truthfulness and trust in L in the vast majority of verbal interactions, and there is no<br />

other language with a natural grammar in which there is a convention of truthfulness<br />

and trust, then the theory will hold, correctly, that we do speak L.<br />

3 Counterfactuals<br />

David Lewis’s second book was Counterfactuals (1973b). Counterfactual conditionals<br />

were important to Lewis for several reasons. Most obviously, they are a distinctive


part of natural language and it is philosophically interesting to figure out how they<br />

work. But counterfactuals would play a large role in Lewis’s metaphysics. Many of<br />

Lewis’s attempted reductions of nomic or mental concepts would be either directly<br />

in terms of counterfactuals, or in terms of concepts (such as causation) that he in turn<br />

defined in terms of counterfactuals. And the analysis of counterfactuals, which uses<br />

possible worlds, would in turn provide motivation for believing in possible worlds.<br />

We will look at these two metaphysical motivations in more detail in section 4, where<br />

we discuss the relationship between counterfactuals and laws, causation and other<br />

high-level concepts, and in section 5, where we discuss the motivations for Lewis’s<br />

modal metaphysics.<br />

3.1 Background<br />

To the extent that there was a mid-century orthodoxy about counterfactual conditionals,<br />

it was given by the proposal in Nelson Goodman (1955). Goodman proposed<br />

that counterfactual conditionals were a particular variety of strict conditional.<br />

To a first approximation, If it were the case that p, it would be the case that q (hereafter<br />

p □→ q) is true just in case Necessarily, either p is false or q is true, i.e. □(p ⊃<br />

q). Goodman realised that this wouldn’t work if the modal ‘necessarily’ was interpreted<br />

unrestrictedly. He first suggested that we needed to restrict attention to those<br />

possibilities where all facts ‘co-tenable’ with p were true. More formally, if S is the<br />

conjunction of all the co-tenable facts, then p □→ q is true iff □((p ∧ S) ⊃ q).<br />

Lewis argued that this could not be the correct set of truth conditions for p □→<br />

q in general. His argument was that strict conditionals were in a certain sense indefeasible.<br />

If a strict conditional is true, then adding more conjuncts to the antecedent<br />

cannot make it false. But intuitively, adding conjuncts to the antecedent of a counterfactual<br />

can change it from being true to false. Indeed, intuitively we can have long<br />

sequences of counterfactuals of ever increasing strength in the antecedent, but with<br />

the same consequent, that alternate in truth value. So we can imagine that (3.1) and<br />

(3.3) are true, while (3.2) and (3.4) are false.<br />

(3.1) If Smith gets the most votes, he will be the next mayor.<br />

(3.2) If Smith gets the most votes but is disqualified due to electoral fraud, he will be<br />

the next mayor.<br />

(3.3) If Smith gets the most votes, but is disqualified due to electoral fraud, then<br />

launches a military coup that overtakes the city government, he will be the<br />

next mayor.<br />

(3.4) If Smith gets the most votes, but is disqualified due to electoral fraud, then<br />

launches a military coup that overtakes the city government, but dies during<br />

the coup, he will be the next mayor.<br />

If we are to regard p □→ q as true iff □((p ∧ S) ⊃ q), then the S must vary for<br />

different values of p. More seriously, we have to say something about how S varies<br />

with variation in p. Goodman’s own attempts to resolve this problem had generally<br />

been regarded as unsuccessful, for reasons discussed in Bennett (1984). So a new<br />

solution was needed.


3.2 Analysis<br />

The basic idea behind the alternative analysis was similar to that proposed by Robert<br />

Stalnaker (1968). Let’s say that an A-world is simply a possible world where A is true.<br />

Stalnaker had proposed that p □→ q was true just in case the most similar p-world to<br />

the actual world is also a q-world. Lewis offered a nice graphic way of thinking about<br />

this. He proposed that we think of similarity between worlds as a kind of metric,<br />

with the worlds arranged in some large-dimensional space, and more similar worlds<br />

being closer to each other than more dissimilar worlds. Then Stalnaker’s idea is that<br />

the closest p-world has to be a q-world for p □→ q to be true. Lewis considered several<br />

ways of filling out the details of this proposal, three of which will be significant here.<br />

First, he rejected Stalnaker’s presupposition that there is a most similar p-world<br />

to actuality. He thought there might be many worlds which are equally similar to<br />

actuality, with no p-world being more similar. Using the metric analogy suggested<br />

above, these worlds all fall on a common ‘sphere’ of worlds, where the centre of<br />

this sphere is the actual world. In such a case, Lewis held that p □→ q is true iff<br />

all the p-worlds on this sphere are q-worlds. One immediate consequence of this is<br />

that Conditional Excluded Middle, i.e., (p □→ q) ∨ (p □→ ¬q), is not a theorem of<br />

counterfactual logic for Lewis, as it was for Stalnaker.<br />

Second, he rejected the idea that there must even be a sphere of closest p-worlds.<br />

There might, he thought, be closer and closer p-worlds without limit. He called the<br />

assumption that there was a sphere of closest worlds the “Limit Assumption”, and<br />

noted that we could do without it. The new truth conditions are that p □→ q is true<br />

at w iff there is a p ∧ q-world closer to w than any p ∧ ¬q-world.<br />

Third, he considered dropping the assumption that w is closer to itself than any<br />

other world, or even the assumption that w is among the worlds that are closest to<br />

it. When we think in terms of similarity (or indeed of metrics) these assumptions<br />

seem perfectly natural, but some philosophers have held that they have bad proof<br />

theoretic consequences. Given the truth conditions Lewis adopts, the assumption<br />

that w is closer to itself than any other world is equivalent to the claim that p ∧ q<br />

entails p □→ q, and the assumption that w is among the worlds that are closest to it<br />

is equivalent to the claim that p □→ q and p entail q. The first of these entailments<br />

in particular has been thought to be implausible. But Lewis ultimately decided to<br />

endorse it, in large part because of the semantic model he was using. When we don’t<br />

think about entailments, and instead simply ask ourselves whether any other world<br />

could be as similar to w as w is to itself, the answer seems clearly to be no.<br />

As well as offering these semantic models for counterfactuals, in the book Lewis<br />

offers an axiomatisation of the counterfactual logic he prefers (see the section on<br />

the Logic of Ontic Conditionals in the entry on the logic of conditionals), as well as<br />

axiomatisations for several other logics that make different choices about some of the<br />

assumptions we’ve discussed here. And he has proofs that these axiomatisations are<br />

sound and complete with respect to the described semantics.<br />

He also notes that his preferred counterfactual logic invalidates several familiar<br />

implications involving conditionals. We already mentioned that strengthening the<br />

antecedent, the implication of (p ∧ r) □→ q by p □→ q, is invalid on Lewis’s theory,<br />


and gave some natural language examples that suggest that it should be invalid. Lewis<br />

also shows that contraposition, the implication of ¬q □→ ¬p by p □→ q, and conditional<br />

syllogism, the implication of p □→ r by p □→ q and q □→ r, are invalid on his model,<br />

and gives arguments that they should be considered invalid.<br />
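Lewis's truth condition, and the invalidity of strengthening the antecedent and contraposition, can be checked in a toy finite model. The worlds, distances, and valuations below are invented for illustration; the evaluator simply implements the clause that p □→ q is true at w iff some p ∧ q-world is closer to w than any p ∧ ¬q-world (and is vacuously true if there are no p-worlds).

```python
# A toy evaluator for Lewis's truth condition for counterfactuals.
# Worlds are named sets of true atoms; `dist` orders them by closeness
# to the actual world. All the particular values are illustrative.

def would(p, q, worlds, dist):
    """p □→ q at actuality: some p∧q-world is closer than every p∧¬q-world."""
    p_worlds = [w for w, v in worlds.items() if p(v)]
    if not p_worlds:
        return True                      # vacuous truth
    pq = [dist[w] for w in p_worlds if q(worlds[w])]
    p_not_q = [dist[w] for w in p_worlds if not q(worlds[w])]
    return bool(pq) and (not p_not_q or min(pq) < min(p_not_q))

# w0 is actual (distance 0); w1 is closer to actuality than w2.
worlds = {'w0': {'q'}, 'w1': {'p', 'q'}, 'w2': {'p', 'r'}}
dist = {'w0': 0, 'w1': 1, 'w2': 2}
P = lambda v: 'p' in v
Q = lambda v: 'q' in v
PR = lambda v: 'p' in v and 'r' in v
notQ = lambda v: 'q' not in v
notP = lambda v: 'p' not in v

assert would(P, Q, worlds, dist)            # p □→ q holds
assert not would(PR, Q, worlds, dist)       # strengthening the antecedent fails
assert not would(notQ, notP, worlds, dist)  # contraposition fails too
```

With only finitely many worlds the Limit Assumption holds automatically, so this sketch cannot exhibit chains of ever-closer p-worlds; it can, though, exhibit the logical failures Lewis argued for.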

3.3 Similarity<br />

In Counterfactuals, Lewis does not say a lot about similarity of worlds. He has some<br />

short arguments that we can make sense of the notion of two worlds being similar.<br />

And he notes that on different occasions we may wish to use different notions of similarity,<br />

suggesting a kind of context dependency of counterfactuals. But the notion is<br />

not spelled out in much more detail.<br />

Some reactions to the book showed that Lewis needed to say more here. Kit Fine<br />

(1975a) argued that given what Lewis had said to date, (3.5) would be false, when it<br />

should be true.<br />

(3.5) If Richard Nixon had pushed the button, there would have been a nuclear war.<br />

(‘The button’ in question is the button designed to launch nuclear missiles.) The<br />

reason it would be false is that a world in which the mechanisms of nuclear warfare<br />

spontaneously failed but then life went on as usual, would be more similar, all things<br />

considered, to actuality than a world in which the future consisted entirely of a<br />

post-nuclear apocalypse.<br />

In “Counterfactual Dependence and Time’s Arrow” (1979b), Lewis responded by<br />

saying more about the notion of similarity. In particular, he offered an algorithm for<br />

determining similarity in standard contexts. He still held that the particular measure<br />

of similarity in use on an occasion is context-sensitive, so there is no one true measure<br />

of similarity. Nevertheless there is, he thought, a default measure that we use unless<br />

there is a reason to avoid it. Here is how Lewis expressed this default measure.<br />

1. It is of the first importance to avoid big, widespread, diverse violations of law.<br />

2. It is of the second importance to maximize the spatio-temporal region throughout<br />

which perfect match of particular fact prevails.<br />

3. It is of the third importance to avoid even small, localized, simple violations of<br />

law.<br />

4. It is of little or no importance to secure approximate similarity of particular<br />

fact, even in matters that concern us greatly. (1979b, 47-48)<br />

Lewis argues that by this measure, worlds in which the mechanisms of nuclear warfare<br />

spontaneously fail will be less similar to the actual world than the post-nuclear<br />

apocalypse. That’s because the failure of those mechanisms will either lead to divergence<br />

from the actual world (if they fail partially) or widespread, diverse violations<br />

of law (if they fail completely). In the former case, there’s a violation of law that isn’t<br />

made up for in an increase in how much spatio-temporal match we get. In the latter<br />

case the gain we get in similarity is only an expansion of the spatio-temporal region<br />

throughout which perfect match of particular fact prevails, but that doesn’t help in


getting us closer to actuality if we’ve added a big miracle. So in fact the nearest worlds<br />

are ones where a nuclear war occurs, and (3.5) is true.<br />
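The 1979 ordering is naturally read as a lexicographic comparison, which can be sketched as follows. The numerical records for the two worlds are invented for illustration, and the fourth criterion (approximate similarity of particular fact) is deliberately ignored, as Lewis's own weighting suggests.

```python
# A sketch of Lewis's default similarity ordering as a lexicographic key:
# fewer big, widespread miracles first; then more perfect match of
# particular fact; then fewer small, localized miracles. The world
# records below are invented numbers for Fine's Nixon case.

def similarity_key(world):
    # Smaller keys mean closer to actuality.
    return (world['big_miracles'],
            -world['perfect_match_region'],  # maximise perfect match
            world['small_miracles'])

# One small miracle (the button-press) followed by nuclear war, versus a
# big diverse miracle that makes the button-pressing come to nothing but
# preserves more approximate match with actual history.
war =    {'big_miracles': 0, 'perfect_match_region': 90, 'small_miracles': 1}
fizzle = {'big_miracles': 1, 'perfect_match_region': 95, 'small_miracles': 0}

closest = min([war, fizzle], key=similarity_key)
assert closest is war  # the war world counts as closer, so (3.5) comes out true
```

The first coordinate dominates: no amount of extra perfect match compensates for adding a big miracle, which is exactly the point of Lewis's reply to Fine.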

One way to see the effects of Lewis’s ordering is to work through its implication<br />

for an important class of cases. When the antecedent of a counterfactual is about<br />

the occurrence or non-occurrence of a particular event E at time t, the effect of these<br />

rules is to say that the nearest worlds are the worlds where the following claims all<br />

hold, with t* being as late as possible.<br />

• There is an exact match of particular fact with actuality up to t*.<br />

• There is a small, localized law violation at t*.<br />

• There is exact conformity to the laws of actuality after t*.<br />

• The antecedent is true.<br />

So we find a point just before t where we can make the antecedent true by making<br />

a small law violation, and let the laws take over from there. There is something<br />

intuitively plausible about this way of viewing counterfactuals; often we do aim to<br />

talk about what would have happened if things had gone on in accordance with the<br />

laws, given a starting point slightly different from the one that actually obtained.<br />

Jonathan Bennett (2003) notes that when the antecedent of a conditional is not<br />

about a particular event, Lewis’s conditions provide the wrong results. For instance,<br />

if the antecedent is of the form If one of these events had not happened, then Lewis’s<br />

rules say that the nearest world where the antecedent is true is always the world where<br />

the most recent such event did not happen. But this does not seem to provide intuitively<br />

correct truth conditions for such conditionals. This need not bother Lewis’s<br />

larger project. For one thing, Lewis was not committed to there being a uniform<br />

similarity metric for all counterfactuals. Lewis could say that his default metric was<br />

only meant to apply to cases where the antecedent was about the happening or<br />

non-happening of a particular event at a particular time, and it wouldn’t have seriously<br />

undermined his larger project. Indeed, as we’ll see in Section 5.2 below, the counterfactuals<br />

he was most interested in, and for which these criteria of similarity were<br />

devised, did have antecedents concerning specific events.<br />

4 Philosophy of Mind<br />

In “Reduction of Mind” (1994b), David Lewis separates his contributions to philosophy<br />

of mind into two broad categories. The first category is his reductionist metaphysics.<br />

From his first published philosophy paper, “An Argument for the Identity<br />

Theory” (1966), Lewis defended a version of the mind-brain identity theory (see the<br />

entry on the identity theory of mind). As he makes clear in “Reduction of Mind”,<br />

this became an important part of his global reductionism. We’ll look at his metaphysics<br />

of mind in sections 4.1–4.3.<br />

The second category is his interpretationist theory of mental content. Following<br />

Donald Davidson in broad outlines, Lewis held that the contents of a person’s mental<br />

states are those contents that a radical interpreter would interpret them as having,<br />

assuming the interpreter went about their task in the right way. Lewis had some


disagreements with Davidson (and others) over the details of interpretationism, but<br />

we won’t focus on those here. What we will look at are two contributions that are<br />

of interest well beyond interpretationism, indeed beyond theories of mental content.<br />

Lewis held that mental contents are typically properties, not propositions. And he held<br />

that a theory of mental content requires an inegalitarian theory of properties. We’ll<br />

look at his theory of content in sections 4.4–4.6.<br />

4.1 Ramsey Sentences<br />

The logical positivists faced a hard dilemma when trying to make sense of science.<br />

On the one hand, they thought that all meaningful talk was ultimately talk about<br />

observables. On the other hand, they respected science enough to deny that talk of<br />

unobservables was meaningless. The solution was to ‘locate’ the unobservables in the<br />

observation language; in other words, to find a way to reduce talk of unobservables<br />

to talk about observables.<br />

Lewis didn’t think much of the broader positivist project, but he was happy to<br />

take over some of their technical advances in solving this location problem. Lewis<br />

noted that this formal project, the project of trying to define theoretical terms in an<br />

already understood language, was independent of the particular use we make of it.<br />

All that really matters is that we have some terms introduced by a new theory, and<br />

that the new theory is introduced in a language that is generally understood. In any<br />

such case it is an interesting question whether we can extract the denotation of an<br />

introduced term from the theory used to introduce it.<br />

The term-introducing theory could be a scientific theory, such as the theory that<br />

introduces terms like ‘electron’, and the language of the theory could be observation<br />

language. Or, more interestingly, the term-introducing theory could be folk psychology,<br />

and the language of the theory could be the language of physics. If we have a<br />

tool for deriving the denotations of terms introduced by a theory, and we have a way<br />

of treating folk psychology as a theory (i.e., a conjunction of sentences to which folk<br />

wisdom is committed), we can derive the denotations of terms like ‘belief’, ‘pain’, and<br />

so on using this theory. Some of Lewis’s important early work on the metaphysics of<br />

mind was concerned with systematising the progress positivists, especially Ramsey<br />

and Carnap, had made on just this problem. The procedure is introduced in “An Argument<br />

for the Identity Theory”, “Psychophysical and Theoretical Identifications”<br />

(1972) and “How to Define Theoretical Terms” (1970b). There are important later<br />

discussions of it in “Reduction of Mind” and “Naming the Colours” (1997c), among<br />

many others.<br />

In the simplest case, where we have a theory T that introduces one new name t,<br />

Lewis says that t denotes the x such that T[x], where T[x] is the sentence we get by (a)<br />

converting T to a single sentence, perhaps a single long conjunction, and (b) replacing<br />

all occurrences of t with the variable x. That is, if there is a unique x such that T[x],<br />

t denotes it, and t is denotationless otherwise. (Note that it isn’t meaningless, but it is<br />

denotationless.)<br />

The simplest case is not fully general in a few respects. First, theories often introduce<br />

many terms simultaneously, not just one. So the theory might introduce<br />

new terms t 1 , t 2 , ..., t n . No problem, we can just quantify over n-tuples, where n is


the number of new terms introduced. So instead of looking at ∃₁x T[x], where ∃₁<br />

means ‘exists a unique’ and x is an individual variable, we look at ∃₁x T[x], where x<br />

is a variable that ranges over n-tuples, and T[x] is the sentence you get by replacing<br />

t₁ with the first member of x, t₂ with the second member of x, ..., and tₙ with the<br />

nth member of x. Although this is philosophically very important, for simplicity I’ll<br />

focus here on the case where a single theoretical term is to be introduced.<br />
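The single-term recipe has a direct computational analogue: treat the theory as a predicate over candidate denotations and look for a unique realiser. The toy theory and domain below are invented for illustration; the only part taken from the text is the rule that t denotes the unique x such that T[x], and is denotationless otherwise.

```python
# A minimal sketch of Lewis's recipe for defining a single theoretical
# term: t denotes the unique realiser of T[x], else nothing.

def denotation(theory, domain):
    """Return the unique x in `domain` satisfying `theory`, else None
    (the term is still meaningful, but denotationless)."""
    realisers = [x for x in domain if theory(x)]
    return realisers[0] if len(realisers) == 1 else None

# Toy theory T[x]: "x is even and greater than 6 and less than 10".
T = lambda x: x % 2 == 0 and 6 < x < 10
assert denotation(T, range(20)) == 8      # unique realiser, so t denotes 8

# Multiple realisers: on Lewis's early view, t then fails to denote.
T2 = lambda x: x % 2 == 0
assert denotation(T2, range(20)) is None
```

Lewis's later refinements replace the multiple-realiser clause: in "Reduction of Mind" the term is indeterminate between the realisers, and in "Naming the Colours" indeterminate only when the realisers are sufficiently similar.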

The simplest case is not general in another, more important, respect. Not all theoretical<br />

terms are names, so it isn’t obvious that we can quantify over them. Lewis’s<br />

response, at least in the early papers, is to say we can always replace them with names<br />

that amount to the same thing. So if T says that all Fs are Gs, and we are interested<br />

in the term ‘G’, then we’ll rewrite T so that it now says Gness is a property of all Fs.<br />

In the early papers, Lewis says that this is a harmless restatement of T, but this isn’t<br />

correct. Indeed, in later papers such as “Void and Object” (2004c) and “Tensing the<br />

Copula” (2002) Lewis notes that some predicates don’t correspond to properties or<br />

relations. There is no property of being non-self-instantiating, for instance, though<br />

we can predicate that of many things. In those cases the rewriting will not be possible.<br />

But in many cases, we can rewrite T, and then we can quantify into it.<br />

The procedure here is often called Ramsification, or Ramseyfication. (Both spellings<br />

have occurred in print. The first is in the title of Braddon-Mitchell and Nola<br />

(1997), the second in the title of Melia and Saatsi (2006).) The effect of the procedure<br />

is that if we had a theory T which was largely expressed in the language O, except<br />

for a few terms t₁, t₂, ..., tₙ, then we end up with a theory expressed entirely in the<br />

O-language, but which, says Lewis, has much the same content. Moreover, if the converted<br />

theory is true, then the T-terms can be defined as the substituends that make<br />

the converted sentence true. This could be used as a way of eliminating theoretical<br />

terms from an observation language, if O is the observation language. Or it could be<br />

a way of understanding theoretical terms in terms of natural language, if O is the old<br />

language we had before the theory was developed.<br />

In cases where there is a unique x such that T[x], Lewis says that t denotes that x.<br />

What if there are many such x? Lewis’s official view in the early papers is that in such<br />

a case t does not have a denotation. In “Reduction of Mind”, Lewis retracted this,<br />

and said that in such a case t is indeterminate between the many values. In “Naming<br />

the Colours” he partially retracts the retraction, and says that t is indeterminate if the<br />

different values of x are sufficiently similar, and lacks a denotation otherwise.<br />

A more important complication is the case where there is no realiser of the theory.<br />

Here it is important to distinguish two cases. First, there is the case where the<br />

theory is very nearly realised. That is, a theory that contains enough of the essential<br />

features of the original theory turns out to be true. In that case we still want to say<br />

that the theory manages to provide denotations for its new terms. Second, there are<br />

cases where the theory is a long way from the truth. The scientific theory of phlogiston,<br />

and the folk theory of witchcraft, are examples of this. In this case we want to<br />

say that the terms of the theory do not denote.<br />

As it stands, the formal theory does not have the resources to make this distinction.<br />

But this is easy to fix. Just replace the theory T with a theory T*, which is<br />

a long disjunction of conjunctions of various important conjuncts of T. So if T consisted of three


David Lewis 63<br />

claims, p₁, p₂ and p₃, and it is close enough to true if any two of them are true, then T*<br />

would be the disjunction (p₁ ∧ p₂) ∨ (p₁ ∧ p₃) ∨ (p₂ ∧ p₃). Lewis endorses this method<br />

in “Psychophysical and Theoretical Identifications”. The disjuncts are propositions<br />

that are true in states that would count as close enough to the world as described<br />

by T that T’s terms denote. Note that in a real-world case, some parts of T will be<br />

more important than others, so we won’t be able to just ‘count the conjuncts’. Still,<br />

we should be able to generate a plausible T* from T. And the rule in general is that<br />

we apply the above strategy to T* rather than T to determine the denotation of the<br />

terms.<br />
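The two-of-three example above can be checked mechanically. This is a toy boolean encoding of my own, not anything in Lewis's texts: conjuncts are truth values, and T* is the disjunction of all k-membered conjunctions of them.

```python
# Toy sketch of evaluating T* when T is "close enough to true" iff any
# k of its conjuncts hold. T* is the disjunction of all k-membered
# conjunctions of T's conjuncts. (Encoding invented for illustration.)
from itertools import combinations

def close_enough(conjunct_values, k):
    """True iff some k-sized conjunction of the conjuncts is true."""
    return any(all(combo) for combo in combinations(conjunct_values, k))

# T consists of three claims p1, p2, p3; suppose p3 turns out false.
p1, p2, p3 = True, True, False

print(all([p1, p2, p3]))              # T itself: False
print(close_enough([p1, p2, p3], 2))  # T*: True, so the terms still denote
```

As the text notes, a realistic T* would weight some conjuncts more heavily than others rather than merely counting them; this sketch implements only the simple counting version.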

4.2 Arguing for the Identity Theory<br />

Lewis’s first, and most important, use of Ramsification was to argue for the mind-brain<br />

identity theory, in “An Argument for the Identity Theory”. Lewis claims in<br />

this paper that his argument does not rely on parsimony considerations. The orthodox<br />

argument for the identity theory at the time, as in e.g. J. J. C. Smart (1959),<br />

turned on parsimony. The identity theory and dualism explain the same data, but the<br />

dualist explanation involves more ontology than the identity theory explanation. So<br />

the identity theory is preferable. Lewis says that this abductive step is unnecessary.<br />

(He even evinces concern that it is unsound.) Lewis offers instead an argument from<br />

the causal efficacy of experience. The argument is something like the following. (I’ve<br />

given the argument that pains are physical; a similar argument can be given for any<br />

other kind of experience.)<br />

1. Pains are the kind of thing that typically have such-and-such physical causes<br />

and such-and-such physical effects, where the ‘such-and-such’s are filled in by<br />

our folk theory of pain.<br />

2. Since the physical is causally closed, the things that have such-and-such physical<br />

causes and such-and-such physical effects are themselves physical.<br />

3. So, pains are physical.<br />

The first premise is analytically true; it follows from the way we define theoretical<br />

terms. The second premise is something we learn from modern physics. (It isn’t clear,<br />

by the way, that we can avoid Smart’s parsimony argument if we really want to argue<br />

for premise 2.) So the conclusion is contingent, since modern physics is contingent,<br />

but it is well-grounded. Indeed, if we change the second premise a little, drawing on<br />

neurology rather than physics, we can draw a stronger conclusion, one that Lewis<br />

draws in “Psychophysical and Theoretical Identifications”.<br />

1. Pains are the kind of thing that typically have such-and-such physical causes<br />

and such-and-such physical effects, where the ‘such-and-such’s are filled in by<br />

our folk theory of pain.<br />

2. Neural state N is the state that has such-and-such physical causes and such-and-such<br />

physical effects.<br />

3. So, pains are instances of neural state N.



So, at least in the second argument, Lewis is defending a kind of identity theory. Pains<br />

just are instances of neural states. I’ll finish up this survey of Lewis’s metaphysics of<br />

mind with a look at two complications to this theory.<br />

4.3 Madmen and Martians<br />

Pain is defined by its causal role. Central to that role is that we are averse to pain, and<br />

try to avoid it. But not all of us do. Some of us seek out pain. Call them madmen. A<br />

good theory of pain should account for the possibility of madmen.<br />

The simplest way to account for madmen would be to simply identify pain with<br />

a neural state. So Lewis’s identity theory is well-placed to deal with them. But there<br />

is a complication. Not every creature in the universe who is in pain has the same<br />

neural states as us. It is at least possible that there are creatures in which some silicon<br />

state S plays the pain role. That is, the creatures are averse to S, they take S to be an<br />

indicator of bodily damage, and so on. Those creatures are in pain whenever they<br />

are in state S. Call any such creature a Martian. A simple identification of pain with<br />

neural state N will stipulate that there couldn’t be any Martians. That would be a<br />

bad stipulation to make.<br />

The possibility of madmen pushes us away from a simple functional definition<br />

of pain. Some creatures have pains that do not play the pain role. The possibility<br />

of Martians pushes us away from a purely neural definition of pains. Some creatures<br />

have pains that are not like our neural pain states. Indeed, some of them might have<br />

pains without having any neural states at all. Lewis’s way of threading this needle<br />

is to say that pains, like all mental states, are defined for kinds of creatures. Pains in<br />

humans are certain neural states. They are the neural states that (typically, in humans)<br />

have the functional role that we associate with pain. In other kinds of creatures pains<br />

are other states that (typically, in those creatures) play the pain role. The details of<br />

these views are worked out in “Mad Pain and Martian Pain” (1980b).<br />

4.4 Interpretationism<br />

In a recent Philosophical Review paper, J. Robert G. Williams describes the theory of<br />

content that Lewis endorses as ‘interpretationist’ (Williams, 2007). It is a good name.<br />

It’s a platitude that the content of someone’s mental states is the interpretation of<br />

those states that a good interpreter would make. If it were otherwise, the interpreter<br />

wouldn’t be good. What’s distinctive about interpretationism is the direction of explanatory<br />

priority. What makes a person’s states have the content they do is that a<br />

good interpreter would interpret them that way. This is the core of Lewis’s theory<br />

of mental content.<br />

Put this broadly, Lewis’s position is obviously indebted to Donald Davidson’s<br />

work, and Lewis frequently acknowledges the debt. But Lewis differs from Davidson<br />

in several respects. I’ll briefly mention four of them here, then look at two substantial<br />

changes in the next two sections. (The primary sources for the discussion in this section<br />

are “Radical Interpretation” (1974a) and especially its appendices in Philosophical<br />

<strong>Papers</strong>: Volume I (1983d), and “Reduction of Mind”.)<br />

First, Lewis does not think that part of being a good interpreter is that we interpret<br />

the subject so that as many of their beliefs as possible come out true. Rather, he



thinks we should interpret someone so that as many of their beliefs as possible come<br />

out rational. If the subject is surrounded by misleading evidence, we should interpret<br />

her as having false beliefs rather than lucky guesses.<br />

Second, Lewis does not give a particularly special place to the subject’s verbal<br />

behaviour in interpreting them. In particular, we don’t try to (radically) interpret the<br />

subject’s language and then use that to interpret their mind. Rather, Lewis follows<br />

Grice (among others) in taking mental content to be metaphysically primary, and<br />

linguistic content to be determined by mental states (see the section on meaning in<br />

the entry on Grice).<br />

Third, Lewis believes in narrow content. Indeed, there is a sense in which he<br />

thinks narrow content is primary. He disagrees with Davidson, and several others,<br />

when he holds that Swampman has contentful states. And he thinks that we share<br />

many beliefs (most clearly metalinguistic beliefs) with denizens of Twin Earth.<br />

Finally, Lewis’s theory of mental content, like his theory of mind in general, is<br />

anti-individualistic. What matters is the functional role that a state typically has in<br />

creatures of a certain kind, not what role it has in this creature. So there might be a<br />

madman who does not attempt to get what they desire. A pure functionalist may say<br />

that such a person has no desires, since desires, by definition, are states that agents<br />

attempt to satisfy. Lewis says that as long as this state typically leads to satisfaction-attempts<br />

in creatures of this kind, it is a desire. Indeed, if it typically leads to attempts<br />

to get X, it is a desire for X, even if little about the role the state plays in this agent<br />

would suggest it is a desire for X.<br />

4.5 De Se Content<br />

Some of our beliefs and desires are about specific individuals. I might, for instance,<br />

believe that BW is a crook and desire that he be punished. Some of our beliefs and<br />

desires are self-directed. I might, for instance, believe that I am not a crook and<br />

desire that I not be punished. If I know that I am BW, then I should not have all<br />

of those beliefs and desires. But I might be ignorant of this. In some circumstances<br />

(e.g., amnesia, or receiving deceptive information about your identity) it is no sign of<br />

irrationality to not know who you are. And if you don’t know you are X, you may<br />

ascribe different properties to yourself and to X.<br />

Lewis’s way of handling this problem was exceedingly simple. His original version<br />

of interpretationism had it that belief-states were ultimately probability distributions<br />

over possible worlds, and desire-states were ultimately utility functions, again<br />

defined over possible worlds. In “Attitude De Dicto and De Se” (1979a), he argued that<br />

this isn’t correct. Beliefs and desires are, at the end of the day, probability and utility<br />

functions. (Or at least they are approximations to those functions.) But they are not<br />

defined over possible worlds. Rather, they are defined over possible individuals.<br />

What that means for belief and desire is easiest to express using the language of<br />

possible worlds. The standard view is that propositions are (or at least determine)<br />

sets of possible worlds, and that the content of a belief is a proposition. To believe<br />

something then is to locate yourself within a class of possible worlds; to believe that<br />

you inhabit one of the worlds at which the proposition is true. Lewis’s view is that<br />

properties are (or at least determine) sets of possible individuals, and that the content



of a belief is a property. To believe something then is to locate yourself within a<br />

class of possible individuals; to believe that you are one of the individuals with the<br />

property. More simply, beliefs are the self-ascriptions of properties.<br />

Within this framework, it is easy to resolve the puzzles we addressed at the top of<br />

the section. If I believe that BW is a crook, I self-ascribe the property of inhabiting<br />

a world in which BW is a crook. (On Lewis’s theory, beliefs that are not explicitly<br />

self-locating will be beliefs about which world one is in.) If I believe I am not a crook,<br />

I self-ascribe the property of not being a crook. Since there are possible individuals<br />

who are (a) not crooks but (b) in worlds where BW is a crook, this is a consistent<br />

self-ascription. Indeed, I may even have strong evidence that I have both of these<br />

properties. So there is no threat of inconsistency, or even irrationality here.<br />
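The consistency point can be made vivid with a toy model of centered worlds, i.e. world-individual pairs. The two-world setup below is invented purely for illustration; it just checks that the two self-ascriptions in the text overlap on at least one possible individual.

```python
# Toy sketch of de se belief as self-ascription of properties.
# Doxastic possibilities are centered worlds: (world, individual) pairs.
# (The two-world model is invented for illustration.)

# Each world lists its inhabitants and which of them are crooks.
worlds = {
    "w1": {"inhabitants": ["BW", "me"], "crooks": ["BW"]},
    "w2": {"inhabitants": ["BW", "me"], "crooks": []},
}

# All centered worlds: a world paired with an individual in it.
centered = [(w, i) for w, d in worlds.items() for i in d["inhabitants"]]

# Believing "BW is a crook" = self-ascribing the property of
# inhabiting a world in which BW is a crook.
bw_is_crook = {(w, i) for (w, i) in centered if "BW" in worlds[w]["crooks"]}

# Believing "I am not a crook" = self-ascribing not being a crook.
i_am_no_crook = {(w, i) for (w, i) in centered if i not in worlds[w]["crooks"]}

# The two self-ascriptions are jointly satisfiable: some centered worlds
# fall under both, e.g. being "me" in w1.
both = bw_is_crook & i_am_no_crook
print(both)
```

The nonempty intersection is the formal analogue of the point in the text: there are possible individuals who are not crooks but inhabit worlds where BW is one.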

Lewis’s suggestion about how to think of self-locating mental states has recently<br />

been very influential in a variety of areas. Adam Elga (2000a, 2004a) has extensively<br />

investigated the consequences of Lewis’s approach for decision theory. Andy Egan<br />

(2007a) has developed a novel form of semantic relativism using Lewis’s approach<br />

as a model. Daniel Nolan (2007) has recently argued that Lewis’s approach is less<br />

plausible for desire than for belief, and Robert Stalnaker (2008a) argues that the view<br />

makes the wrong judgments about sameness and difference of belief across agents and<br />

times.<br />

4.6 Natural Properties<br />

One classic problem for interpretationism is that our dispositions massively underdetermine<br />

contents. I believe that (healthy) grass is green. But for some interpretations<br />

of ‘grue’, ascribing to me the belief that grass is grue will fit my dispositions just as<br />

well. As Lewis points out towards the end of “New Work For a Theory of Universals”<br />

(1983c), if we are allowed to change the interpretations of my beliefs and desires<br />

at the same time, the fit can be made even better. This looks like a problem for<br />

interpretationism.<br />

The problem is of course quite familiar. In different guises it is Goodman’s<br />

grue/green problem, Kripkenstein’s plus/quus problem, Quine’s gavagai problem,<br />

and Putnam’s puzzle of the brain in a vat with true beliefs (Goodman, 1955; Wittgenstein,<br />

1953; Kripke, 1982; Quine, 1960; Putnam, 1981). One way or another it<br />

has to be solved.<br />

Lewis’s solution turns on a metaphysical posit. Some properties, he says, are<br />

more natural than others. The natural properties are those that, to use an ancient<br />

phrase, carve nature at the joints. They make for objective resemblance amongst the<br />

objects that have them, and objective dissimilarity between things that have them<br />

and those that lack them. The natural properties, but not in general the unnatural<br />

properties, are relevant to the causal powers of things. Although science is in the<br />

business of discovering which natural properties are instantiated, when Lewis talks<br />

about natural properties he doesn’t mean properties given a special role by nature. It<br />

is not a contingent matter which properties are natural, because it isn’t a contingent<br />

matter which properties make for objective similarity.<br />

Some properties are perfectly natural. Other properties are less natural, but not<br />

all unnatural properties are alike. Green things are a diverse and heterogeneous



bunch, but they are more alike than the grue things are. And the grue things are<br />

more alike than some other even more disjunctive bunches. So as well as positing<br />

perfectly natural properties, Lewis posits a relation of comparative naturalness among properties.<br />

He suggests that we just need to take the perfectly natural as primitive, and<br />

we can define the naturalness of other properties in terms of it. The idea is that the<br />

naturalness of a property is a function of the complexity of that property’s definition<br />

in terms of perfectly natural properties. It isn’t at all obvious that this suggestion will<br />

capture the intuitive idea, and Lewis does not defend it at any length.<br />
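Lewis's suggestion that naturalness tracks definitional complexity over a perfectly natural basis can be sketched in miniature. Everything below is invented for illustration: the "basis" predicates, the toy objects, and the crude convention of counting connectives in a hand-written definition stand in for whatever real measure of complexity the proposal would need.

```python
# Toy sketch: grading naturalness by definitional complexity over a
# perfectly natural basis. Properties are boolean functions of a toy
# object; "complexity" counts connectives in the definition.
# (Basis, objects and counts are all invented for illustration.)

# Perfectly natural basis predicates (complexity 0 each, by fiat).
def is_green(x): return x["colour"] == "green"
def is_blue(x):  return x["colour"] == "blue"
def examined(x): return x["examined_before_t"]

# 'green' is a basis predicate; 'grue' needs three connectives:
# (examined AND green) OR (NOT examined AND blue).
def grue(x):
    return (examined(x) and is_green(x)) or (not examined(x) and is_blue(x))

complexity = {"green": 0, "grue": 3}

# Lower complexity -> more natural -> more projectible, other things equal.
more_natural = min(complexity, key=complexity.get)
print(more_natural)  # green
```

The sketch inherits the worry noted in the text: whether counting connectives (or any such syntactic measure) really captures the intuitive comparative notion is exactly what Lewis leaves undefended.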

One of the roles of natural properties is in induction. Other things being equal,<br />

more natural properties are more projectible. That’s Lewis’s solution to Goodman’s<br />

problem. We don’t project grue because doing so would conflict with projecting<br />

green, and green is more natural.<br />

Rational agents have beliefs that follow inductively from their evidence. So rational<br />

agents tend to have beliefs involving natural rather than unnatural properties.<br />

If the contents of beliefs are properties, as we suggested in the previous section, we<br />

can simplify this a bit and say that rational agents have beliefs whose contents are<br />

natural properties. Given interpretationism, what’s true of rational agents is true,<br />

other things equal, of any agent, since the correct interpretation of an agent’s beliefs<br />

assumes they are rational. So, other things being equal, our beliefs have more rather<br />

than less natural content. So, other things being equal, we believe that grass is green<br />

not grue. That’s Lewis’s solution to the Kripkenstein (and Putnam) problems. Even<br />

if my dispositions would be consistent with my believing grass is grue, ascribing that<br />

to me would uncharitably attribute gratuitous irrationality to me. Since a correct<br />

interpretation doesn’t ascribe gratuitous irrationality, that ascription would be incorrect.<br />

So I don’t believe grass is grue.<br />

Natural properties will play a major role for Lewis. We’ve already seen one place<br />

where it turns out they are needed; namely, in saying what it is for two worlds to<br />

have an ‘exact match’ of spatiotemporal regions. What Lewis means by that is that<br />

the regions are intrinsic duplicates. And the way he analyses intrinsic duplication in<br />

(1983c) is that two things are duplicates if they have the same perfectly natural properties.<br />

We will see many other uses of natural properties as we go along, particularly in the<br />

discussion of Humean supervenience in section 5.<br />

This topic, natural properties, was one of very few topics where Lewis had a serious<br />

change of view over the course of his career. Of course, Lewis changed the<br />

details of many of his views, in response to criticism and further thought. But the<br />

idea that some properties could be natural, could make for objective similarity, in<br />

ways that most sets of possibilia do not, is notably absent from his writings before<br />

“New Work”. Indeed, as late as “Individuation by Acquaintance and by Stipulation”<br />

(1983b), he was rather dismissive of the idea. But natural properties came to play<br />

central roles in his metaphysics and, as we see here, his theory of mind. As Lewis<br />

notes in “New Work”, much of the impetus for his change of view came from discussions<br />

with D. M. Armstrong, and from the arguments in favour of universals that<br />

Armstrong presented in his (1978).



5 Humean Supervenience<br />

Many of David Lewis’s papers in metaphysics were devoted to setting out, and defending,<br />

a doctrine he called “Humean Supervenience”. Here is Lewis’s succinct<br />

statement of the view.<br />

It is the doctrine that all there is to the world is a vast mosaic of local<br />

matters of particular fact, just one little thing and then another. (1986c,<br />

ix)<br />

The doctrine can be factored into two distinct theses. The first is the thesis that,<br />

in John Bigelow’s words, “truth supervenes on being”. That is, all the truths about<br />

a world supervene on the distribution of perfectly natural properties and relations<br />

in that world. The second is the thesis that the perfectly natural properties and relations<br />

in this world are intrinsic properties of point-sized objects, and spatiotemporal<br />

relations. Lewis held that the first of these was necessary and a priori. (See,<br />

for instance, “Parts of Classes” (1991), “Reduction of Mind”, “Truthmaking and<br />

Difference-making” (2001c).) The second is contingently true if true at all. Indeed,<br />

modern physics suggests that it is not true (Maudlin, 2007, Ch. 2). Lewis was aware<br />

of this. His aim in defending Humean supervenience was to defend, as he put it, its<br />

“tenability” (1986b, xi). We will return at the end of this section to the question of<br />

why he might have wanted to do this. For now, we will focus on how he went about<br />

this project.<br />

The primary challenge to Humean supervenience comes from those who hold<br />

that providing a subvenient basis for all the truths of this world requires more than<br />

intrinsic properties of point-sized objects and spatiotemporal relations. Some of these<br />

challenges come from theorists who think the best physics will need non-spatiotemporal<br />

relations in order to explain Bell’s Theorem. But more commonly it comes from<br />

those who think that grounding the modal, the nomic or the mental requires adding<br />

properties and relations to any Humean mosaic constructed from properties found in<br />

fundamental physics. (I’m using ‘mental’ here to cover all the properties that Lewis<br />

considered mental, broadly construed. This includes contents, since Lewis thought<br />

linguistic content was grounded in mental content, and value, since he thought values were<br />

grounded in idealised desires. So it’s a fairly broad category, and there is a lot that<br />

isn’t obviously reducible to fundamental physics. As we’ll see, Lewis attempts to<br />

reduce it all step-by-step.)<br />

We’ve discussed in the previous section how Lewis aimed to reduce the mental to<br />

the nomic. (Or at least much of it; we’ll return to the question of value in section<br />

7.5.) We’ll discuss in the next section his distinctive modal metaphysics. In this<br />

section we’ll look at how he attempted to locate the nomic in the Humean mosaic.<br />

Lewis’s aim was to show that nomic properties and relations could be located in the<br />

Humean mosaic by locating them as precisely and as explicitly as he could. So the<br />

location project revealed a lot about these nomic features. We’ll spend the next two<br />

subsections looking at the two important parts of this project. Notably, they are two<br />

parts where Lewis refined his views several times on the details of the location.



5.1 Laws and Chances<br />

Lewis’s reductionist project starts with laws of nature. Building on some scattered<br />

remarks by Ramsey and Mill, Lewis proposed a version of the ‘best-system’ theory<br />

of laws of nature. There is no paper devoted to this view, but it is discussed in section<br />

3.3 of Counterfactuals, in “New Work For a Theory of Universals”, extensively in<br />

Postscript C to the reprint of “A Subjectivist’s Guide to Objective Chance” in (1986c),<br />

and in “Humean Supervenience Debugged” (1994a).<br />

The simple version of the theory is that the laws are the winners of a ‘competition’<br />

among all collections of truths. Some truths are simple, e.g. the truth that<br />

this table is brown. Some truths are strong; they tell us a lot about the world. For<br />

example, the conjunction of every truth in this Encyclopedia rules out a large chunk<br />

of modal space. Typically, these are exclusive categories; simple truths are not strong,<br />

and strong truths are not simple. But there are some exceptions. The truth that any<br />

two objects are attracted to one another, with a force proportional to the product of<br />

their masses and inversely proportional to the square of the distance between them, is relatively<br />

simple, but also quite strong in that it tells us a lot about the forces between many<br />

distinct objects. The laws, says Lewis, are these simple but strong truths.<br />
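The "competition" among collections of truths can be caricatured computationally. The sketch below is entirely invented for illustration: the eight-world space, the three candidate "systems", and above all the scoring rule (worlds ruled out minus statement length) are toy stand-ins for the real, and much harder, measures of strength and simplicity.

```python
# Toy sketch of a best-system 'competition': candidate systems are
# scored for strength (worlds ruled out) and simplicity (shorter
# statements score better); the winner's generalisations are the laws.
# (The worlds, systems and scoring rule are invented for illustration.)

# A toy space of eight possible worlds, each a string of three facts.
all_worlds = [f"{a}{b}{c}" for a in "01" for b in "01" for c in "01"]
actual = "101"

# Candidate systems: each maps a statement to the worlds it allows.
systems = {
    "every particular fact": {w for w in all_worlds if w == actual},
    "first digit is 1": {w for w in all_worlds if w[0] == "1"},
    "anything goes": set(all_worlds),
}

def score(name, allowed):
    strength = len(all_worlds) - len(allowed)   # worlds ruled out
    simplicity = -len(name)                     # shorter statement, simpler
    return strength + simplicity                # invented equal weighting

best = max(systems, key=lambda n: score(n, systems[n]))
print(best)
```

As in the text, the maximally strong system (listing every particular fact) loses on simplicity, and the maximally simple one (ruling nothing out) loses on strength; a reasonably simple, reasonably strong generalisation wins.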

Two qualifications are needed before we get to Lewis’s 1973 view of laws. It is<br />

collections of truths, not individual truths, that are measured and compared for simplicity<br />

and strength. And it is not every truth in the winning collection (or best<br />

system), but only the generalisations within it, that are laws. So even if the best system<br />

includes particular facts about the Big Bang or its immediate aftermath, e.g. that<br />

the early universe was in a low-entropy state, those facts are not laws on Lewis’s view.<br />

In “New Work For a Theory of Universals”, Lewis notes another restriction that<br />

is needed. If we measure the simplicity of some truths by the length of their statement<br />

in an arbitrarily chosen language, then any truth at all can be made simple. Let<br />

Fx be true iff x is in a world where every truth in this Encyclopedia is true. Then ‘Everything<br />

is F’ is simply stateable in a language containing F, and is presumably strong.<br />

So ‘Everything is F’ will be a law. But this kind of construction would clearly trivialise<br />

the theory of laws. Lewis’s solution is to say that we measure the simplicity of a claim<br />

by how easily stateable it is in a language where all predicates denote perfectly natural<br />

properties. He notes that this move requires that the natural properties are specified<br />

prior to specifying the laws, which means that we can’t reductively specify naturalness<br />

in terms of laws. (In any case, since Lewis holds that laws are contingent (1986b,<br />

91) but which properties are natural is not contingent (1986b, 60n), this approach<br />

would not be open to Lewis.)<br />

In “Humean Supervenience Debugged”, Lewis notes how to extend this theory<br />

to indeterministic worlds. Some laws don’t say what will happen, but what will have<br />

a chance of happening. If the chances of events could be determined antecedently<br />

to the laws being determined, we could let facts about chances be treated more or<br />

less like any other fact for the purposes of our ‘competition’. But, as we’ll see, Lewis<br />

doesn’t think the prospects for doing this are very promising. So instead he aims to<br />

reduce laws and chances simultaneously to distributions of properties.



Instead of ranking collections of truths by two measures, strength and simplicity,<br />

we will rank them by three, strength, simplicity and fit. A collection of truths that<br />

entails that what does happen has (at earlier times) a higher chance of happening<br />

has better fit than a collection that entails that what happens had a lower chance of<br />

happening. The laws are those generalisations in the collection of truths that do the<br />

best by these three measures of strength, simplicity and fit. The collection will entail<br />

various ‘history-to-chance’ conditionals. These are conditionals of the form If Hₜ then<br />

Pₜ(A) = x, where Hₜ is a proposition about the history of the world to t, and Pₜ is the<br />

function from propositions to their chance at t. The chance of A at t in w is x iff there<br />

is some such conditional If Hₜ then Pₜ(A) = x, where Hₜ is the history of w to t.<br />

The position I’ve sketched here is the one that Lewis says he originally<br />

was drawn towards in 1975, and that he endorsed in print in 1994. (The dates are<br />

from his own description of the evolution of his views in (1994a).) But in between, in<br />

both (1980c) and Postscript C to its reprinting in (1986c), he rejected this position because<br />

he thought it conflicted with a non-negotiable conceptual truth about chance.<br />

This truth was what he called the “Principal Principle”.<br />

The Principal Principle says that a rational agent conforms their credences to the<br />

chances. More precisely, it says the following is true. Assume we have a number x,<br />

proposition A, time t, rational agent whose evidence is entirely about times up to and<br />

including t, and a proposition E that (a) is about times up to and including t and (b)<br />

entails that the chance of A at t is x. In any such case, the agent’s credence in A given<br />

E is x.<br />

An agent who knows what happens after t need not be guided by chances at t. If<br />

I’ve seen the coin land heads, that its chance of landing heads was 0.5 at some earlier<br />

time is no reason to have my credence in heads be 0.5. Conversely, if all I know is<br />

that the chance is 0.5, that’s no reason for my conditional credence in heads to be 0.5<br />

conditional on anything at all. Conditional on it landing heads, my credence in heads<br />

is 1, for instance. But given these two restrictions, the Principal Principle seems like<br />

a good constraint. Lewis calls evidence about times after t ‘inadmissible’, which lets<br />

us give a slightly more concise summary of what the Principal Principle says. For<br />

agents with no inadmissible evidence, the rational credence in A, conditional on the<br />

chance of A being x, combined with any admissible evidence, is x.<br />

The problem Lewis faced in the 1980s papers is that the best systems account<br />

of chance makes the Principal Principle either useless or false. Here is a somewhat<br />

stylised example. (I make no claims about the physical plausibility of this setup;<br />

more plausible examples would be more complicated, but would make much the<br />

same point.) Let t be some time before any particle has decayed. Let A be the proposition<br />

that every radioactive particle will decay before it reaches its actual half-life.<br />

At t, A has a positive chance of occurring. Indeed, its chance is 1 in 2ⁿ, where n is<br />

the number of radioactive particles in the world. (Assume, again for the sake of our<br />

stylised example, that n is finite.) But if A occurred, the best system of the world<br />

would be different from how it actually is. It would improve fit, for instance, to say<br />

that the chance of decay within the actual half-life would be 1. So someone who<br />

knows that the chance of A is 1 in 2ⁿ knows that A won’t happen.<br />



Lewis called A an ‘undermining’ future; it has a chance of happening, but if it happens<br />

the chances are different. The problem with underminers is that they conflict<br />

with the Principal Principle. Someone who knows the chance of A should, by the<br />

Principal Principle, have credence 1 in 2ⁿ that A will happen. But given the chance<br />

of A, it is possible to deduce that A won’t happen, and hence to have credence 0 in A. This looks like an<br />

inconsistency, so like any principle that implies a contradiction, the Principal Principle<br />

must be false. The most obvious way out is to say that information about the<br />

chance of A is inadmissible, since it reveals something about the future, namely that<br />

A doesn’t occur. But to say that chances are inadmissible is to make the Principal<br />

Principle useless. So given the best systems theory of laws and chances, the Principal<br />

Principle is either false or useless. Since the Principal Principle is neither false nor<br />

useless, Lewis concluded in these 1980s papers that the best systems theory of laws<br />

and chances was false.<br />

The problem with this was that it wasn’t clear what could replace the best systems<br />

theory. Lewis floated two approaches in the postscripts to the reprinting of (1980c),<br />

one based on primitive chances, and the other based on history-to-chance conditionals<br />

being necessary. But neither seemed metaphysically plausible, and although each<br />

was consistent with the Principal Principle, they made it either mysterious (in the<br />

first case) or implausible (in the second). A better response, as set out in “Humean<br />

Supervenience Debugged”, was to qualify the Principal Principle. Lewis said that<br />

what was really true was the “New Principle”. His proposal was based on ideas developed<br />

by Ned Hall (1994) and Michael Thau (1994).<br />

We’ll explain the New Principle by starting with a special case of the old Principle.<br />

Let T be the ‘theory of chance’ for the world, the conjunction of all history-to-chance<br />

conditionals. And let H be the history of the world to t. Assuming T is<br />

admissible, the old Principal Principle says that the credence in A given H ∧ T should<br />

be the chance of A at t. The New Principle says that the credence in A given H ∧<br />

T should be the chance of A given T at t. That is, where C is the agent’s credence<br />

function, and P is the chance function, and the agent has no inadmissible evidence, it<br />

should be that C(A | H ∧ T ) = P(A | T ). This compares to the old principle, which<br />

held that C(A | H ∧ T ) = P(A).<br />

That’s the special case of the New Principle for an agent with no inadmissible<br />

evidence. The general case follows from this special case. In general, assuming the<br />

agent has no inadmissible evidence, the rational credence in A given E is the expected<br />

value, given E, of the chance of A given H ∧ T. That is, where C is the agent’s credence<br />

function, and P is the chance function, it should be the sum across all possible<br />

combinations of H and T of C(H ∧ T | E)P(A | H ∧ T ).<br />
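For reference, the old principle, the New Principle, and its general case can be displayed together. This is just a restatement of the equations in the text, set in LaTeX notation:<br />

```latex
% Old Principal Principle (special case, no inadmissible evidence):
C(A \mid H \wedge T) = P(A)

% New Principle (special case):
C(A \mid H \wedge T) = P(A \mid T)

% New Principle (general case): the rational credence in A given E is the
% expected chance of A, summed across all possible combinations of H and T.
C(A \mid E) = \sum_{H,\,T} C(H \wedge T \mid E)\, P(A \mid H \wedge T)
```

Here C is the agent’s credence function and P the chance function, as in the text.<br />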

The New Principle is, Lewis argues, consistent with the best systems theory of<br />

laws and chances. Lewis had originally thought that any specification of chance had<br />

to be consistent with the Principal Principle. But in later works he argued that the<br />

New Principle was a close enough approximation to the Principal Principle that a<br />

theory of chances consistent with it was close enough to our pre-theoretic notion of<br />

chance to deserve the name. So he could, and did, happily endorse the best systems<br />

theory of laws and chance.


David Lewis 72<br />

5.2 Causation<br />

In “Causation” (1973a), Lewis put forward an analysis of causation in terms of counterfactual<br />

dependence. The idea was that event B was counterfactually dependent on<br />

event A if and only if the counterfactual Had A not occurred, B would not have occurred<br />

was true. Then event C causes event E if and only if there is a chain C, D₁,<br />

..., Dₙ, E such that each member in the chain (except C) is counterfactually dependent<br />

on the event before it. In summary, causation is the ancestral of counterfactual<br />

dependence.<br />
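Schematically, the 1973 analysis can be displayed as follows. The ‘Dep’ notation is ours, not Lewis’s, and ‘\boxright’ stands for Lewis’s counterfactual conditional:<br />

```latex
% Counterfactual dependence: B depends on A iff, had A not occurred,
% B would not have occurred. (\boxright is the counterfactual conditional,
% as provided by e.g. the amssymb/stmaryrd symbol sets.)
\mathrm{Dep}(B, A) \;\equiv\; (\neg A \boxright \neg B)

% Causation as the ancestral of dependence: C causes E iff there is a
% chain C, D_1, \ldots, D_n, E in which each event after C depends
% counterfactually on its predecessor.
\mathrm{Causes}(C, E) \;\equiv\; \exists D_1 \ldots D_n\,
  \bigl(\mathrm{Dep}(D_1, C) \wedge \mathrm{Dep}(D_2, D_1)
        \wedge \cdots \wedge \mathrm{Dep}(E, D_n)\bigr)
```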

The reasoning about chains helped Lewis sidestep a problem that many thought<br />

unavoidable for a counterfactual theory of causation, namely the problem of preempting<br />

causes. Imagine that Suzy throws a rock, the rock hits a window and the<br />

window shatters. Suzy’s throw caused the window to shatter. But there is a backup<br />

thrower—Billy. Had Suzy not thrown, Billy would have thrown another rock and<br />

broken the window. So the window breaking is not counterfactually dependent on<br />

Suzy’s throw. Lewis’s solution was to posit an event of the rock flying towards the<br />

window. Had Suzy not thrown, the rock would not have been flying towards the<br />

window. And had the rock not been flying towards the window, the window would<br />

have not shattered. Lewis’s thought here is that it is Suzy’s throwing that causes<br />

Billy to not throw; once she has thrown Billy is out of the picture and the window’s<br />

shattering depends only on what Suzy’s rock does. So we avoid this problem of preempters.<br />

Much of the argumentation in “Causation” concerns the superiority of the counterfactual<br />

analysis to deductive-nomological theories. These arguments were so successful<br />

that from a contemporary perspective they seem somewhat quaint. There are<br />

so few supporters of deductive-nomological theories in contemporary metaphysics<br />

that a modern paper would not spend nearly so much time on them.<br />

After “Causation” the focus, at least of those interested in reductive theories,<br />

moved to counterfactual theories. And it became clear that Lewis had a bit of work<br />

left to do. He needed to say more about the details of the notion of counterfactual dependence.<br />

He did this in “Counterfactual Dependence and Time’s Arrow” (1979b), as<br />

discussed in section 2. He needed to say more about the nature of events. In “Events”<br />

(1986a) he said that they were natural properties of regions of space-time. And prodded<br />

by Jaegwon Kim (1973), he needed to add that A and B had to be wholly distinct<br />

events for B to counterfactually depend on A. The alternative would be to say that an<br />

event’s happening is caused by any essential part of the event, which is absurd.<br />

But the biggest problem concerned what became known as “late pre-emption”. In<br />

the rock throwing example above, we assumed that Billy decided not to throw when<br />

he saw Suzy throwing. But we can imagine a variant of the case where Billy waits to<br />

see whether Suzy’s rock hits, and only then decides not to throw. In such a case, it<br />

is the window’s shattering, not anything prior to this, that causes Billy not to throw.<br />

That means that there is no event between Suzy’s throw and the window’s shattering<br />

on which the shattering is counterfactually dependent.<br />

Lewis addressed this issue in “Redundant Causation”, one of the six postscripts to<br />

the reprinting of “Causation” in (1986c). He started by introducing a new concept:



quasi-dependence. B quasi-depends on A iff there is a process starting with A*, and<br />

ending with B*, and B* counterfactually depends on A*, and the process from A*<br />

to B* is an intrinsic duplicate of the process from A to B, and the laws governing the<br />

process from A* to B* (i.e. the laws of the world in which A* and B* happen) are<br />

the same as the laws governing the process from A to B. In short, quasi-dependence<br />

is the relation you get if you start with dependence, then add all of the duplicates of<br />

dependent processes. Causation is then the ancestral of quasi-dependence. Although<br />

the window’s shattering does not depend on Suzy’s throw, it does quasi-depend on it.<br />

That’s because there is a world, with the same laws, with a duplicate of Suzy’s throw,<br />

but Billy determined not to throw, and in that world the window shatters in just the<br />

same way, and depends on Suzy’s throw.<br />

Eventually, Lewis became dissatisfied with the quasi-dependence-based theory. In<br />

“Causation as Influence” (2000; 2004a) he set out several reasons for being unhappy<br />

with it, and a new theory to supersede it.<br />

One argument against it is that it makes causation intrinsic to the pair C and E,<br />

but some cases, especially cases of double prevention, show that causation is extrinsic.<br />

Double prevention occurs when an event, call it C, prevents something that would<br />

have prevented E from happening. Intuitively, these are cases of causation. Indeed,<br />

when we look at the details we find that many everyday cases of causation have this<br />

pattern. But that C causes E does not depend on the intrinsic natures of C and E.<br />

Rather, it depends on there being some threat to E, a threat that C prevents, and the<br />

existence of threats is typically extrinsic to events.<br />

Another argument is that quasi-dependence cannot account for what came to be<br />

known as ‘trumping pre-emption’. Lewis illustrated this idea with an example from<br />

Jonathan Schaffer (2000). The troops are disposed to obey all orders from either<br />

the Sergeant or the Major. But they give priority to the Major’s orders, due to the<br />

Major’s higher rank. Both the Major and the Sergeant order the troops to advance,<br />

and they do advance. Intuitively, it is the Major, not the Sergeant, who caused the<br />

advance, since the Major’s orders have priority. But the advance does quasi-depend<br />

on the Sergeant’s orders, since in a world where the Major doesn’t make an order, the<br />

advance does depend on the Sergeant.<br />

Lewis’s alternative theory relied on changing the definition of counterfactual dependence.<br />

The theory in “Causation” was based on what he came to call ‘whether-whether’<br />

dependence. What’s crucial is that whether B happens depends counterfactually<br />

on whether A happens. The new theory was based on what we might call<br />

‘how-how’ dependence. Lewis says that B depends on A if there are large families of<br />

counterfactuals of the form If A had happened in this way, then B would have happened<br />

in that way, and the ways in which B would happen are systematically dependent on<br />

the ways in which A happens. How much A influences B depends on how big this<br />

family is, how much variation there is in the way B changes, and how systematic the<br />

influence of A on B is. He then defines causation as the ancestral of this notion of<br />

counterfactual dependence.<br />

On this new theory, causation is a degree concept, rather than an ‘all-or-nothing’<br />

concept, since counterfactual dependence comes in degrees. Sometimes Lewis says<br />

we properly ignore small amounts of causation. For instance, the location of nearby



parked cars influences the smashing of a window by a rock in virtue of small gravitational<br />

effects of the cars on the flight of the rock. But it’s very little influence, and we<br />

properly ignore it most of the time.<br />

There are two other notable features of “Causation as Influence”. It contains<br />

Lewis’s most comprehensive defence of the transitivity of causation. This principle<br />

was central to Lewis’s theory of causation from the earliest days, but had come under<br />

sustained attack over the years. And the paper has a brief attack on non-Humean<br />

theories that take causation to be a primitive. Lewis says that these theories can’t<br />

explain the variety of causal relations that we perceive and can think about. These<br />

passages mark an interesting change in what Lewis took to be the primary alternatives<br />

to his counterfactual-based reductionism. In 1973 the opponents were other kinds<br />

of reductionists; in 2000 they were the non-reductionists.<br />

5.3 Why Humean Supervenience<br />

Given these concepts, a number of other concepts fall into place. Dispositions are<br />

reduced to counterfactual dependencies, though as is made clear in “Finkish Dispositions”<br />

(1997b), the reduction is not as simple as it might have seemed. Perception<br />

is reduced to dispositions and causes. (See, for instance, “Veridical Hallucination and<br />

Prosthetic Vision” (1980d).) We discussed the reduction of mental content to dispositions<br />

and causes in section 4. And we discussed the reduction of linguistic content<br />

to mental content in section 1. Values are reduced to mental states in “Dispositional<br />

Theories of Value” (1989b).<br />

But we might worry about the very foundation of the project. We started with<br />

the assumption that our subvenient base consists of intrinsic properties of point-sized<br />

objects and spatiotemporal relations. But Bell’s inequality suggests that modern<br />

physics requires, as primitive, other relations between objects. (Or it requires intrinsic<br />

properties of dispersed objects.) So Humean supervenience fails in this world.<br />

Lewis’s response is somewhat disarming. Writing in 1986, part of his response is<br />

scepticism about the state of quantum mechanics. (There is notably less scepticism<br />

in “How Many Lives Has Schrödinger’s Cat” (2004b).) But the larger part of his<br />

response is to suggest that scientific challenges to Humean supervenience are outside<br />

his responsibility.<br />

Really, what I uphold is not so much the truth of Humean supervenience<br />

as the tenability of it. If physics itself were to teach me that it is false,<br />

I wouldn’t grieve ... What I want to fight are philosophical arguments<br />

against Humean supervenience. When philosophers claim that one or<br />

another common-place feature of the world cannot supervene on the arrangement<br />

of qualities, I make it my business to resist. Being a commonsensical<br />

fellow (except where unactualized possible worlds are concerned)<br />

I will seldom deny that the features in question exist. I grant their existence,<br />

and do my best to show how they can, after all, supervene on the<br />

arrangement of qualities. (1986c, xi)



We might wonder why Lewis found this such an interesting project. If physics teaches<br />

that Humean supervenience is false, why care whether there are also philosophical<br />

objections to it? There are two (related) reasons why we might care.<br />

Recall that we said that Humean supervenience is a conjunction of several theses.<br />

One of these is a thesis about which perfectly natural properties are instantiated in<br />

this world, namely local ones. That thesis is threatened by modern physics. But the<br />

rest of the package, arguably, is not. In particular, the thesis that all facts supervene<br />

on the distribution of perfectly natural properties and relations does not appear to be<br />

threatened. (Though see (Maudlin, 2007, Ch. 2) for a dissenting view.) Nor is the thesis<br />

that perfectly natural properties and relations satisfy a principle of recombination<br />

threatened by modern physics. The rough idea of the principle of recombination is<br />

that any distribution of perfectly natural properties is possible. This thesis is Lewis’s<br />

version of the Humean principle that there are no necessary connections between<br />

distinct existences, and Lewis is determined to preserve as strong a version of it as he<br />

can.<br />

Although physics does not seem to challenge these two theses, several philosophers<br />

do challenge them on distinctively philosophical grounds. Some of them suggest<br />

that the nomic, the intentional, or the normative do not supervene on the distribution<br />

of perfectly natural properties. Others suggest that the nomic, intentional,<br />

or normative properties are perfectly natural, and as a consequence perfectly natural<br />

properties are not freely recombinable. The philosophical arguments in favour<br />

of such positions rarely turn on the precise constitution of the Humean’s preferred<br />

subvenient base. If Lewis can show that such arguments fail in the setting of classical<br />

physics, then he’ll have refuted all of the arguments against Humean supervenience<br />

that don’t rely on the details of modern physics. In practice that means he’ll have<br />

refuted many, though not quite all, of the objections to Humean supervenience.<br />

A broader reason for Lewis to care about Humean supervenience comes from<br />

looking at his overall approach to metaphysics. When faced with something metaphysically<br />

problematic, say free will, there are three broad approaches. Some philosophers<br />

will argue that free will can’t be located in a scientific world-view, so it should<br />

be eliminated. Call these ‘the eliminativists’. Some philosophers will agree that free<br />

will can’t be located in the scientific world-view, so that’s a reason to expand our<br />

metaphysical picture to include free will, perhaps as a new primitive. Call these ‘the<br />

expansionists’. And some philosophers will reject the common assumption of an incompatibility.<br />

Instead they will argue that we can have free will without believing in<br />

anything that isn’t in the scientific picture. Call these ‘the compatibilists’.<br />

As the above quote makes clear, Lewis was a compatibilist about most questions<br />

in metaphysics. He certainly was one about free will. (“Are We Free to Break the<br />

Laws?” (1981a).) And he was a compatibilist about most nomic, intentional and<br />

normative concepts. This wasn’t because he had a global argument for compatibilism.<br />

Indeed, he was an eliminativist about religion (“Anselm and Actuality” (1970a),<br />

“Divine Evil” (2007)). And in some sense he was an expansionist about modality.<br />

Lewis may have contested this; he thought introducing more worlds did not increase<br />

the number of kinds of things in our ontology, because we are already committed



to there being at least one world. As (Melia, 1992, 192) points out though, the inhabitants<br />

of those worlds include all kinds of things not found in, or reducible to,<br />

fundamental physics. They include spirits, gods, trolls and every other consistent<br />

beast imaginable. So at least when it came to what there is, as opposed to what there<br />

actually is, Lewis’s ontology was rather expansionist.<br />

For all that, Lewis’s default attitude was to accept that much of our commonsense<br />

thinking about the nomic, the intentional and the normative was correct, and<br />

that this was perfectly compatible with this world containing nothing more than is<br />

found in science, indeed than is found in fundamental physics.<br />

Compatibilists should solve what Frank Jackson calls ‘the location problem’ (Jackson<br />

1998). If you think that there are, say, beliefs, and you think that having beliefs in<br />

one’s metaphysics doesn’t commit you to having anything in your ontology beyond<br />

fundamental physics, then you should, as Jackson puts it, be able to locate beliefs<br />

in the world described by fundamental physics. More generally, for whatever you<br />

accept, you should be able to locate it in the picture of the world you accept.<br />

This was certainly the methodology that Lewis accepted. And since he thought<br />

that so much of our common sense worldview was compatible with fundamental<br />

physics, he had many versions of the location problem to solve. One way to go<br />

about this would be to find exactly what the correct scientific theory is, and locate all<br />

the relevant properties in that picture. But this method has some shortcomings. For<br />

one thing, it might mean having to throw out your metaphysical work whenever the<br />

scientific theories change. For another, it means having your metaphysics caught up<br />

in debates about the best scientific theories, and about their interpretation. So Lewis<br />

took a somewhat different approach.<br />

What Lewis’s defence of Humean supervenience gives us is a recipe for locating<br />

the nomic, intentional and normative properties in a physical world. And it is a<br />

recipe that uses remarkably few ingredients; just intrinsic properties of point-sized<br />

objects, and spatio-temporal relations. It is likely that ideal physics will have more<br />

in it than that. For instance, it might have entanglement relations, as are needed<br />

to explain Bell’s inequality. But it is unlikely to have less. And the more there is<br />

in fundamental physics, the easier it is to solve the location problem, because the<br />

would-be locator has more resources to work with.<br />

The upshot of all this is that a philosophical defence of Humean supervenience,<br />

especially a defence like Lewis’s that shows us explicitly how to locate various folk<br />

properties in classical physics, is likely to show us how to locate those properties in<br />

more up-to-date physics. So Lewis’s defence of Humean supervenience then generalises<br />

into a defence of the compatibility of large swathes of folk theory with ideal<br />

physics. And the defence is consistent with the realist principle that truth supervenes<br />

on being, and with the Humean denial of necessary connections between distinct<br />

existences. And that, quite clearly, is a philosophically interesting project.<br />

6 Modal Realism<br />

This entry has been stressing Lewis’s many and diverse contributions to philosophy.<br />

But there is one thesis with which he is associated above all others: modal realism.



Lewis held that this world was just one among many like it. A proposition p is<br />

possibly true if and only if p is true in one of these worlds. Relatedly, he held that<br />

individuals like you or me (or this computer) exist in only one possible world. So what<br />

it is for a proposition like You are happy to be true in another world is not for you to<br />

be happy in that world; you aren’t in that world. Rather, it is for your counterpart to<br />

be happy in that world.<br />

Lewis wrote about modal realism in many places. As early as Counterfactuals he<br />

wrote this famous passage.<br />

I believe, and so do you, that things could have been different in countless<br />

ways. But what does this mean? Ordinary language permits the<br />

paraphrase: there are many ways things could have been besides the way<br />

they actually are. I believe that things could have been different in countless<br />

ways; I believe permissible paraphrases of what I believe; taking the<br />

paraphrase at its face value, I therefore believe in the existence of entities<br />

that might be called ‘ways things could have been.’ I prefer to call them<br />

‘possible worlds.’ (1973b, 84)<br />

And Lewis used counterpart theory throughout his career to resolve metaphysical<br />

puzzles in fields stretching from personal identity (“Counterparts of Persons and<br />

Their Bodies” (1971b)) to truthmaker theory (“Things qua Truthmakers” (2003)). Indeed,<br />

Lewis’s original statement of counterpart theory is in one of his first published<br />

metaphysics papers (“Counterpart Theory and Quantified Modal Logic” (1968)).<br />

But the canonical statement and defence of both modal realism and counterpart<br />

theory is in On the Plurality of Worlds (1986b), the book that grew out of his 1984<br />

John Locke lectures. This section will follow the structure of that book.<br />

The little ‘argument by paraphrase’ from Counterfactuals is a long way from an<br />

argument for Lewis’s form of modal realism. For one thing, the argument relies on<br />

taking a folksy paraphrase as metaphysically revealing; perhaps we would be better<br />

off treating this as just a careless manner of speaking. For another, the folksy paraphrase<br />

Lewis uses isn’t obviously innocuous; like many other abstraction principles<br />

it could be hiding a contradiction. And the argument does little to show that other<br />

possible worlds are concreta; talking of them as ways things could be makes them<br />

sound like properties, which are arguably abstracta if they exist at all. The first three<br />

chapters of Plurality address these three issues. The fourth chapter is an extended discussion<br />

of the place of individuals in modal realism. We’ll look at these chapters in<br />

order.<br />

6.1 A Philosophers’ Paradise<br />

The short argument from Counterfactuals quoted above seems deeply un-Quinean.<br />

Rather than saying that possible worlds exist because they are quantified over in the<br />

best paraphrase of our theories, Lewis says they exist because they are quantified<br />

over in just one paraphrase of our theories. To be sure, he says this is a permissible<br />

paraphrase. On the other hand, there is vanishingly little defence of its permissibility.



In the first chapter of Plurality Lewis takes a much more orthodox, Quinean line.<br />

He argues, at great length, that the best version of many philosophical theories requires<br />

quantification over possibilities. In traditional terms, he offers an extended<br />

indispensability argument for unactualised possibilities. But traditional terms are<br />

perhaps misleading here. Lewis does not say that possibilities are absolutely indispensable,<br />

only that they make our philosophical theories so much better that we<br />

have sufficient reason to accept them.<br />

There are four areas in which Lewis thinks that possible worlds earn their keep.<br />

Modality: Traditional treatments of modal talk in terms of operators face several difficulties.<br />

They can’t, at least without significant cost, properly analyse talk<br />

about contingent existence, or talk about modal comparatives, or modal supervenience<br />

theses. All of these are easy to understand in terms of quantification<br />

across possibilities.<br />

Closeness: Our best theory of counterfactuals, Lewis’s theory, relies on comparisons<br />

between possible worlds. Indeed, it relies on comparisons between this world<br />

and other worlds. Such talk will be hard to paraphrase away if worlds aren’t<br />

real.<br />

Content: Lewis argues, in part following Stalnaker (1984), that our best theory of<br />

mental and verbal content analyses content in terms of sets of possibilities.<br />

This, in turn, requires that the possibilities exist.<br />

Properties: We often appear to quantify over properties. The modal realist can take<br />

properties to be sets of possibilia, and take such quantification at face value.<br />

In his discussion of properties here, Lewis expands upon his theory of natural<br />

properties that he introduced in “New Work for a Theory of Universals”, and<br />

that we discussed in section 3.<br />

After arguing that we are best off in all these areas of philosophy if we accept unactualised<br />

possibilities, Lewis spends the rest of chapter 1 saying what possible worlds<br />

are on his view. He isn’t yet arguing for this way of thinking about possible worlds;<br />

that will come in chapter 3. For now he is just describing what he takes to be the<br />

best theory of possible worlds. He holds that possible worlds are isolated; no part<br />

of one is spatio-temporally related to any other world. Indeed, he holds that lack of<br />

spatio-temporal relation (or something like it) is what marks individuals as being in<br />

different worlds. So his theory has the somewhat odd consequence that there could<br />

not have been two parts of the world that aren’t spatio-temporally connected. He<br />

holds that worlds are concrete, though spelling out just what the abstract/concrete<br />

distinction comes to in this context isn’t a trivial task. And he holds that worlds are<br />

plenitudinous. There is a world for every way things could be. And worlds satisfy a<br />

principle of recombination: shape and size permitting, any number of duplicates of<br />

any number of possible things can co-exist or fail to co-exist.<br />

6.2 Paradox in Paradise?<br />

Chapter 2 deals with several objections to modal realism. Some of these objections<br />

claim that modal realism leads to paradox. Other objections claim that it undermines<br />

our ordinary practice. We will look at two examples of each.



Peter Forrest and D. M. Armstrong (1984) argue that modal realism leads to problems<br />

given the principle of recombination. An unrestricted principle of recombination<br />

says that for any things that could exist, there is a world in which there is a<br />

duplicate of all of them. Forrest and Armstrong apply the principle by taking the<br />

things to be the different possible worlds. A world containing a duplicate of all the<br />

worlds would, they show, be bigger than any world. But by the principle it would<br />

also be a world. Contradiction. Lewis’s reply is to deny the unrestricted version of<br />

the principle. He insists that there is independent reason to qualify the principle to<br />

those things whose size and shape permit them to fit into a single world. Without<br />

an unrestricted principle of recombination, there is no way to create the large world<br />

that’s at the heart of Forrest and Armstrong’s paradox.<br />

David Kaplan argued that there could be no cardinality of the worlds. Kaplan<br />

did not publish this argument, so Lewis replies to the version presented by Martin<br />

Davies (1981, 262). On Lewis’s theory, every set of worlds is a proposition. For<br />

any proposition, says Kaplan, that proposition might be the only proposition being<br />

thought by a person at location l at time t. So for each proposition, there is a world<br />

where it (alone) is thought by a person at location l at time t. That means there<br />

is a one-one correspondence between the sets of worlds and a subset of the worlds.<br />

Contradiction. Lewis’s reply is to deny that every proposition can be thought. He<br />

claims that functionalism about belief, plus the requirement that beliefs latch onto<br />

relatively natural properties, mean that most propositions cannot be thought, and<br />

this blocks the paradox.<br />

Peter Forrest (1982) argues that modal realism leads to inductive scepticism. According<br />

to modal realism, there are other thinkers very much like us who are deceived<br />

by their surroundings. Given this, we should doubt our inductive inferences. Lewis’s<br />

reply is that modal realism does not make inductive challenges any worse than they<br />

were before. It is common ground that inductive inference is fallible. That is, it is<br />

common ground that these inferences could fail. Thinking of the possibilities of failure<br />

as concrete individuals might focus the mind on them, and hence make us less<br />

confident, but does not seem to change the inference’s justificatory status. Lewis’s<br />

argument seems hard to dispute here. Given the mutually agreed upon fact that the<br />

inference could fail, it’s hard to see what epistemological cost is incurred by agreeing<br />

that it does fail for someone kind of like the inferrer in a distant possible world.<br />

Robert Adams (1974) argues that modal realism leads to surprising results in<br />

moral philosophy. The modal realist says that the way things are, in the broadest<br />

possible sense, is not a contingent matter, since we can’t change the nature of the<br />

pluriverse. Hence we cannot do anything about it. So if moral requirements flow<br />

from a requirement to improve the way things are, in this broadest possible sense,<br />

then there are no moral requirements. Lewis rejects the antecedent of this conditional<br />

as something that only an extreme utilitarian could accept. What is crucial<br />

about morality is that we not do evil. Even if their actions won’t make a difference<br />

to the nature of the pluriverse, a virtuous agent will not want to, for instance, cause<br />

suffering. By rejecting the view that in our moral deliberations we should care about<br />

everyone, possible and actual, equally, Lewis avoids the problem.



6.3 Paradise on the Cheap?<br />

In chapter 3 Lewis looks at the alternatives to his kind of modal realism. He takes<br />

himself to have established that we need to have possible worlds of some kind in our<br />

ontology, but not that these possible worlds must be concrete. In particular, they can<br />

be abstract, or what he calls “ersatz” possible worlds. Lewis does not have a single<br />

knock-down argument against all forms of ersatzism. Instead he divides the space of<br />

possible ersatzist positions into three, and launches different attacks against different<br />

ones.<br />

Lewis starts with what he calls “linguistic ersatzism”. This is the view that ersatz<br />

possible worlds are representations, and the way they represent possibilities is something<br />

like the way that language represents possibilities. In particular, they represent<br />

possibilities without resembling possibilities, but instead in virtue of structural features<br />

of the representation.<br />

He levels three main objections to linguistic ersatzism. First, it takes modality as a<br />

primitive, rather than reducing modality to something simpler (like concrete possible<br />

worlds). Second, it can’t distinguish qualitatively similar individuals in other possible<br />

worlds. Lewis argues that will mean that we can’t always quantify over possibilia, as<br />

we can in his theory. Third, it can’t allow as full a range of ‘alien’, i.e. uninstantiated,<br />

natural properties as we would like. Sider (2002) has replied that some of these challenges<br />

can be met, or at least reduced in intensity, if we take the pluriverse (i.e. the<br />

plurality of worlds) to be what is represented, rather than the individual worlds.<br />

The second theory he considers is what he calls “pictorial ersatzism”. This is the<br />

view that ersatz possible worlds are representations, and the way they represent possibilities<br />

is something like the way that pictures or models represent possibilities. That<br />

is, they represent by being similar, in a crucial respect, to what they are representing.<br />

The pictorial ersatzist, says Lewis, is caught in something of a bind. If the representations<br />

are not detailed enough, they will not give us enough possibilities to do the<br />

job that possible worlds need to do. If they are detailed enough to do that job, and<br />

they represent by resembling possibilities, then arguably they will contain as much<br />

problematic ontology as Lewisian concrete possible worlds. So they have the costs of<br />

Lewis’s theory without any obvious advantage.<br />

The final theory he considers is what he calls “magical ersatzism”. Unlike the previous<br />

two theories, this theory is defined negatively. The magical ersatzist is defined<br />

by their denial that possible worlds represent, or at least that they represent in either<br />

of the two ways (linguistic and pictorial) that we are familiar with. And Lewis’s primary<br />

complaint is that this kind of theory is mysterious, and that it could only seem<br />

attractive if it hides from view the parts of the theory that are doing the philosophical<br />

work. Lewis argues that as soon as we ask simple questions about the relationship<br />

that holds between a possibility and actuality if that possibility is actualised, such as<br />

whether this is an internal or external relation, we find the magical ersatzist saying<br />

things that are either implausible or mysterious.<br />

It isn’t clear just who is a magical ersatzist. Lewis said that, at the time he wrote<br />

Plurality, no one explicitly endorsed this theory. This was perhaps unfair to various<br />

primitivists about modality, such as Adams (1974), Plantinga (1974) and Stalnaker



(1976). Given the negative definition of magical ersatzism, and given the fact that<br />

primitivists do not think that possible worlds represent possibilities via any familiar<br />

mechanism, it seems the primitivists should count as magical ersatzists, or, as Lewis<br />

calls them, “magicians”. In any case, if magical ersatzism, in all its varieties, is objectionably<br />

mysterious, that suggests ersatzism is in trouble, and hence if we want the<br />

benefits of possible worlds, we have to pay for them by accepting concrete possible<br />

worlds.<br />

6.4 Counterparts or Double Lives?<br />

The last chapter of Plurality changes tack somewhat. Instead of focussing on different<br />

ways the world could be, Lewis’s focus becomes different ways things could be. The<br />

chapter defends, and expands upon, Lewis’s counterpart theory.<br />

Counterpart theory was first introduced by Lewis in “Counterpart Theory and<br />

Quantified Modal Logic” (1968) as a way of making modal discourse extensional. Instead<br />

of worrying just what a name inside the scope of a modal operator might mean,<br />

we translate the language of quantified modal logic into a language without operators,<br />

but with quantifiers over worlds and other non-actual individuals. So instead of<br />

saying □Fa, we say ∀w∀x ((Ww ∧ Ixw ∧ Cxa) ⊃ Fx). That is, for all w and x, if w is<br />

a world, and x is in w, and x is a counterpart of a, then Fx. Or, more intuitively, all<br />

of a’s counterparts are F. The paper shows how we can extend this intuitive idea into<br />

a complete translation from the language of quantified modal logic to the language of<br />

counterpart theory. In “Tensions” (1974b) Lewis retracts the claim that it is an advantage<br />

of counterpart theory over quantified modal logic that it is extensional rather<br />

than intensional, largely because he finds the distinction between these two notions<br />

much more elusive than he had thought. But he still thought counterpart theory had<br />

a lot of advantages, and these were pressed in chapter 4.<br />
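The translation scheme can be illustrated with a minimal computational sketch. The worlds, individuals, and counterpart relation below are invented for illustration, not Lewis’s own examples.<br />

```python
# Toy model of the 1968 translation scheme: "necessarily Fa" becomes
# "every counterpart of a, in every world, is F". All names and data
# here are illustrative.

worlds = {
    "w_actual": {"hubert", "rudolf"},
    "w1": {"hubert1"},
    "w2": {"hubert2"},
}

# Which other-worldly individuals are counterparts of which individuals.
counterpart = {
    "hubert": {"hubert", "hubert1", "hubert2"},
}

def necessarily(individual, F):
    """Translate []F(a) as: for every world w and every x in w,
    if x is a counterpart of a, then F(x)."""
    cps = counterpart.get(individual, set())
    return all(F(x) for inhabitants in worlds.values()
               for x in inhabitants if x in cps)

def possibly(individual, F):
    """<>F(a): some counterpart of a, in some world, is F."""
    return not necessarily(individual, lambda x: not F(x))

is_human = lambda x: x.startswith("hubert")
print(necessarily("hubert", is_human))  # True: every counterpart is human
```

Note that `necessarily("hubert", lambda x: x == "hubert")` comes out false here: having other-worldly counterparts is exactly what gives a world-bound individual non-trivial modal properties.<br />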

The intuitive idea behind counterpart theory was that individuals, at least ordinary<br />

individuals of the kind we regularly talk about, are world-bound. That is, they<br />

exist in only one world. But they do not have all of their properties essentially. We<br />

can truly say of a non-contender, say Malloy, that he could have been a contender. In<br />

the language of possible worlds, there is a possible world w such that, according to it,<br />

Malloy is a contender. But what in turn does this mean? Does it mean that Malloy<br />

himself is in w? Not really, according to counterpart theory. Rather, a counterpart<br />

of Malloy’s is a contender in w. And Malloy himself has the modal property could<br />

have been a contender in virtue of having a counterpart in w who is a contender. This<br />

way of thinking about modal properties of individuals has, claims Lewis, a number<br />

of advantages.<br />

For one thing, it avoids an odd kind of inconsistency. Malloy might not only have<br />

been a contender, he might have been 6 inches taller. If we think that is because there<br />

is a world in which Malloy himself is 6 inches taller, then it seems like we’re saying<br />

that Malloy can have two heights, his actual height and one 6 inches taller. And that<br />

looks inconsistent. The obvious way out of this is to say that he bears one height in<br />

relation to this world, and another in relation to another world. But that turns height from an<br />

intrinsic property into a relation, and that seems like a mistake. Lewis thinks this



problem, what he dubs the ‘problem of accidental intrinsics’, is a reason to deny that<br />

Malloy himself is in multiple worlds.<br />

For another, it allows us a kind of inconstancy in our modal predications. Could<br />

Malloy have been brought by a stork, or must he have had the parents he actually<br />

had? In some moods we think one, in other moods we think another. Lewis thinks<br />

that counterpart theory can reflect our indecision. There is a world with someone<br />

brought by a stork who has a life much like Malloy’s. Is he one of Malloy’s counterparts?<br />

Well, he is according to some counterpart relations, and not according to<br />

others. When one of the former relations is contextually salient, it’s true to say that<br />

Malloy could have been brought by a stork. When more demanding counterpart relations<br />

are salient, he isn’t one of Malloy’s counterparts, and indeed all of Malloy’s<br />

counterparts share his parents. (More precisely, all of his counterparts have parents<br />

who are counterparts of Malloy’s actual parents.) In those contexts, it is true to say<br />

that one’s parentage is essential. Throughout his career, Lewis uses this inconstancy<br />

of the counterpart relation to resolve all manner of metaphysical puzzles, from puzzles<br />

about personal identity (1971b) to puzzles about truthmakers (2003). The final<br />

section of Plurality is Lewis’s most extended argument that this variability of the<br />

counterpart relation is a strength, not a weakness, of the theory.<br />
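The inconstancy of the counterpart relation can be sketched in the same toy style: one modal sentence, two contextually selected counterpart relations, two verdicts. The data are invented.<br />

```python
# Sketch of the inconstancy of the counterpart relation: the same modal
# claim gets different truth values relative to different, contextually
# selected counterpart relations. All names and data are invented.

worlds = {
    "w_stork": {"malloy_s"},    # stork-delivered, life much like Malloy's
    "w_parents": {"malloy_p"},  # shares Malloy's actual parents
}

# A lax relation counts the stork-delivered person as a counterpart;
# a strict, parentage-preserving relation does not.
counterparts_by_context = {
    "lax": {"malloy": {"malloy_s", "malloy_p"}},
    "strict": {"malloy": {"malloy_p"}},
}

def could_have_been(individual, F, context):
    cps = counterparts_by_context[context][individual]
    return any(F(x) for ws in worlds.values() for x in ws if x in cps)

stork_delivered = lambda x: x == "malloy_s"
print(could_have_been("malloy", stork_delivered, "lax"))     # True
print(could_have_been("malloy", stork_delivered, "strict"))  # False
```

When the lax relation is salient, it is true to say Malloy could have been brought by a stork; when the strict relation is salient, it is false, and his parentage counts as essential.<br />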

7 Other Writings<br />

Lewis wrote a lot that isn’t covered by the broad categories we’ve discussed so far.<br />

The point of this section is to provide a sample of that material. It isn’t close to being<br />

comprehensive. It doesn’t include his treatment of qualia in (1988e) and (1995). It<br />

doesn’t include his contributions to causal decision theory in (1979d) and (1981b). It<br />

goes very quickly over his many papers in ethics. And it skips his contributions to<br />

debates about non-classical logics, such as (1982) and (1990). We’ve tried to restrict<br />

attention to those areas where Lewis’s contributions were groundbreaking, influential,<br />

and set out a new positive theory. Shockingly, there is a lot to cover that meets<br />

those constraints, and is not included in the above survey of the major themes of his<br />

philosophy.<br />

7.1 Mathematics and Mereology<br />

Parts of Classes (1991) and “Mathematics is Megethology” (1993c) consider the distinctive<br />

philosophical problems raised by set theory. As Lewis notes, it is widely<br />

held that all of mathematics reduces to set theory. But there is little consensus about<br />

what the metaphysics of set theory is. Lewis puts forward two proposals that might,<br />

collectively, help to clarify matters.<br />

The first proposal is what he calls the Main Thesis: “The parts of a class are all and<br />

only its subclasses” (1991, 7). By ‘class’ here, Lewis does not mean ‘set’. Classes are<br />

things with members. Some classes are proper classes, and hence not sets. And one<br />

set, the null set, has no members, so is not a class. Individuals, for Lewis, are things<br />

without members. Since the null set has no members, it is an individual. But the<br />

overlap between the sets and the classes is large; most sets we think about are classes.



The big payoff of the Main Thesis is that it reduces the mysteries of set theory<br />

to a single mystery. Any class is a fusion of singletons, i.e., sets with one member.<br />

If we understand what a singleton is, and we understand what fusions are, then we<br />

understand all there is to know about classes, and about sets. That’s because any set<br />

is just the fusion of the singletons of its members.<br />
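The Main Thesis can be checked in a small finite model, with classes modelled as nonempty frozensets, subclasses as nonempty subsets, and mereological fusion as union. The three-atom universe is invented for illustration.<br />

```python
# A finite sketch of Lewis's Main Thesis: the parts of a class are all
# and only its subclasses. Classes are modelled as nonempty frozensets,
# fusion as union. The three-member universe is illustrative.

from itertools import chain, combinations

cls = frozenset  # a "class" here is a nonempty frozenset of its members

def subclasses(c):
    """All nonempty subsets of c, i.e. its subclasses."""
    members = list(c)
    return {frozenset(s) for s in chain.from_iterable(
        combinations(members, r) for r in range(1, len(members) + 1))}

def fusion(classes):
    """Mereological fusion, modelled here as union."""
    return frozenset().union(*classes)

c = cls({"a", "b", "c"})
singletons = {cls({x}) for x in c}

# Any class is the fusion of the singletons of its members ...
assert c == fusion(singletons)

# ... and the parts of a class (the fusions of some of those singletons)
# are all and only its subclasses.
parts = {fusion(s) for s in subclasses(singletons)}
assert parts == subclasses(c)
print(len(parts))  # 7 = 2**3 - 1 nonempty subclasses
```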

But singletons are deeply mysterious. The usual metaphors that are used to introduce<br />

sets, metaphors about combining or collecting or gathering multiple things into<br />

one, are less than useless when it comes to understanding the relationship between a<br />

singleton and its member. In (1993c), Lewis settles for a structuralist understanding<br />

of singletons. He also says that he “argued (somewhat reluctantly) for a ‘structuralist’<br />

approach to the theory of singleton functions” in (1991), though on page 54 of (1991)<br />

he appears to offer qualified resistance to structuralism.<br />

One of the technical advances of (1991) and (1993c) was that they showed how a<br />

structuralist account of set theory was even possible. This part of the work was coauthored<br />

with John P. Burgess and A. P. Hazen. Given a large enough universe (i.e.,<br />

that the cardinality of the mereological atoms is an inaccessible cardinal), and given<br />

plural quantification, we can say exactly what constraints a function must satisfy for<br />

it to do the work we want the singleton function to do. (By ‘the singleton function’<br />

I mean the function that maps anything that has a singleton onto its singleton. Since<br />

proper classes don’t have singletons, and nor do fusions of sets and objects, this will<br />

be a partial function.) Given that, we can understand mathematical claims made<br />

in terms of sets/classes as quantifications over singleton functions. That is, we can<br />

understand any claim that would previously have used ‘the’ singleton function as a<br />

claim of the form for all s: ...s...s..., where the terms s go where we would previously<br />

have referred to ‘the’ singleton function. It is provable that this translation won’t<br />

introduce any inconsistency into mathematics (since there are values for s), or any<br />

indeterminacy (since the embedded sentence ...s...s... has the same truth value for any<br />

eligible value for s).<br />

Should we then adopt this structuralist account, and say that we have removed the<br />

mysteries of mathematics? As noted above, Lewis is uncharacteristically equivocal on<br />

this point, and seemed to change his mind about whether structuralism was, all things<br />

considered, a good or a bad deal. His equivocation comes from two sources. One<br />

worry is that when we work through the details, some of the mysteries of set theory<br />

seem to have been relocated rather than solved. For instance, if we antecedently<br />

understood the singleton function, we might have thought it could be used to explain<br />

why the set theoretic universe is so large. Now we have to simply posit a very large<br />

universe. Another is that the proposal is in some way revisionary, since it takes<br />

ordinary mathematical talk to be surreptitiously quantificational. Parts of Classes<br />

contains some famous invective directed against philosophers who seek to overturn<br />

established science on philosophical grounds.<br />

I’m moved to laughter at the thought of how presumptuous it would be to<br />

reject mathematics for philosophical reasons. How would you like the<br />

job of telling the mathematicians that they must change their ways, and<br />

abjure countless errors, now that philosophy has discovered that there are



no classes? Can you tell them, with a straight face, to follow philosophical<br />

argument wherever it may lead? If they challenge your credentials,<br />

will you boast of philosophy’s other great discoveries: that motion is<br />

impossible, that a Being than which no greater can be conceived cannot<br />

be conceived not to exist, that it is unthinkable that anything exists outside<br />

the mind, that time is unreal, that no theory has ever been made<br />

at all probable by evidence (but on the other hand that an empirically<br />

adequate ideal theory cannot possibly be false), that it is a wide-open scientific<br />

question whether anyone has ever believed anything, and so on,<br />

and on, ad nauseam? Not me! (1991, 59)<br />

And yet Lewis’s positive theory here is somewhat revisionary. It doesn’t revise<br />

the truth value of any mathematical claim, but it does revise the understanding of<br />

them. Is even this too much revision to make on philosophical grounds? Perhaps not,<br />

but it is worrying enough for Lewis to conclude merely that the theory he proposes<br />

seems better than the alternatives, not that there is a compelling positive case for its<br />

truth.<br />

7.2 Philosophy of Language<br />

Lewis’s major contribution to formal semantics was his theory of counterfactual conditionals.<br />

But there were several other contributions that he made, both on specific<br />

topics in formal semantics, and on the role of semantic theory.<br />

In “Adverbs of Quantification” (1975a), Lewis notes several difficulties in translating<br />

sentences involving “usually”, “frequently”, “rarely” or related adverbs into<br />

first-order logic or some similar formal system. Lewis’s solution to the puzzles raised<br />

involves two formal advances. First, he treats the adverbs as unselective quantifiers,<br />

binding all free variables in their scope. The second advance concerns the if-clauses<br />

in sentences like Usually, if a team plays well, they win. It is difficult for various reasons<br />

to take the structure of this sentence to involve a quantifier over a compound<br />

sentence with a conditional connective. Lewis’s second advance is to say that these<br />

if-clauses are simply domain restrictors. The ‘if’ is no more a sentential connective<br />

than the ‘and’ in New York is between Boston and Washington. Instead, the if-clause<br />

restricts what things the quantifier denoted by ‘usually’ ranges over.<br />
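The restricted-quantifier treatment can be sketched computationally: ‘usually’ quantifies over cases, and the if-clause restricts its domain rather than acting as a conditional connective. The cases below are invented.<br />

```python
# Sketch of "Adverbs of Quantification": 'usually' as a quantifier over
# cases whose domain is restricted by the if-clause. The data are invented.

cases = [
    {"plays_well": True,  "wins": True},
    {"plays_well": True,  "wins": True},
    {"plays_well": True,  "wins": False},
    {"plays_well": False, "wins": False},
    {"plays_well": False, "wins": False},
]

def usually(cases, restrictor, scope):
    """Usually, if R, S: most of the cases satisfying R also satisfy S."""
    domain = [c for c in cases if restrictor(c)]
    return sum(scope(c) for c in domain) > len(domain) / 2

plays_well = lambda c: c["plays_well"]
wins = lambda c: c["wins"]

# Restricted to the plays-well cases, 2 of 3 are wins:
print(usually(cases, plays_well, wins))      # True
# Unrestricted, only 2 of 5 cases are wins:
print(usually(cases, lambda c: True, wins))  # False
```

The contrast between the two calls shows the work the if-clause does: it changes what the adverb quantifies over, not what sentence it operates on.<br />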

This paper is not widely read by philosophers, but it has been very influential<br />

among linguists, especially semanticists. Indeed, its uptake by semanticists has made<br />

it the fourth most cited paper of Lewis’s on Google Scholar. His most cited paper on<br />

Google Scholar is also in philosophy of language; it is “Scorekeeping in a Language<br />

Game” (1979f).<br />

That paper is about conversational dynamics. Lewis develops an extended analogy<br />

between the role of context in a conversation and the role of score in a baseball<br />

game. One central role of the score is to keep a record of what has already happened.<br />

In that way, score is influenced by what happens on the field, or in the conversation.<br />

But the causal influence runs the other way as well. Some events on the field are<br />

influenced by the score. You’re only out after the third strike, for example. Similarly,



Lewis holds that context (or the conversational score) can influence, or even be partially<br />

constitutive of, what happens in the conversation. If I say “None of the cats are<br />

afraid of Barney”, which cats I’ve managed to talk about depends on which cats are<br />

conversationally salient. And in saying this, I’ve made Barney salient, so the score<br />

changes in that respect. That change matters; now I can denote Barney by “he”.<br />

Lewis argues that this model can make sense of a number of otherwise puzzling<br />

features of language. One notable example of this involves quantification. Most quantifiers<br />

we use do not range over the entire universe. We quantify only over a restricted<br />

range. Lewis says that we quantify over the contextually salient objects. He also says that this happens not just<br />

when we explicitly quantify, but also when we use terms that have a quantificational<br />

analysis. He mentions in passing that “knows” might be one such term.<br />

This idea is developed more fully in “Elusive Knowledge” (1996b). Lewis argues<br />

that S knows that p is true iff S is in a position to rule out all possibilities in which p is<br />

false. But when we say S knows that p, we don’t mean to quantify over all possibilities<br />

there are, only over the salient possibilities. The big advantage of Lewis’s approach<br />

is that it lets him explain the appeal of scepticism. When the sceptic starts talking<br />

about fantastic possibilities of error, she makes those possibilities salient. Since we<br />

can’t rule them out, when we’re talking to the sceptic we can’t say we know very<br />

much. But since those possibilities aren’t usually salient, we are usually correct in our<br />

knowledge-ascriptions. So Lewis lets the sceptic win any debate they are in, without<br />

conceding that ordinary knowledge-ascriptions are false.<br />
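The contextualist analysis can be put as a small model: knowledge depends on which not-p possibilities are salient, and the sceptic changes that set. The possibility names and contexts below are invented.<br />

```python
# Toy model of the "Elusive Knowledge" analysis: S knows that p iff S's
# evidence rules out every salient possibility in which p is false.
# Possibility names and contexts are invented for illustration.

possibilities = {"normal", "hallucinating", "brain_in_vat"}

def knows(p_true_in, eliminated, salient):
    """True iff every salient not-p possibility is ruled out by evidence."""
    return (salient - p_true_in) <= eliminated

p_true_in = {"normal"}          # say, p = "I have hands"
eliminated = {"hallucinating"}  # what perception rules out

everyday = possibilities - {"brain_in_vat"}  # sceptical scenarios ignored
sceptical = possibilities                    # the sceptic made them salient

print(knows(p_true_in, eliminated, everyday))   # True
print(knows(p_true_in, eliminated, sceptical))  # False
```

The knowledge-ascription comes out true in the everyday context and false in the sceptical one, with no change in the subject’s evidence; only the salient domain has shifted.<br />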

The kind of position Lewis defends here, which came to be known as contextualism,<br />

has been a central focus of inquiry in epistemology for the last fifteen years.<br />

“Elusive Knowledge”, along with papers such as Cohen (1986) and DeRose (1995)<br />

founded this research program.<br />

7.3 Bayesian Philosophy<br />

This subsection is largely about two pairs of papers: “Probabilities of Conditionals<br />

and Conditional Probabilities” (1976b) and its sequel (1986d), and “Desire as Belief”<br />

(1988b) and its sequel (1996a). The papers have more in common than merely having<br />

a common naming convention. (They’re not even Lewis’s only sequels; “Lucas<br />

Against Mechanism” (1969b) also has a sequel (1979c).) In both cases Lewis aims to<br />

defend orthodox Bayesian epistemology against some challenges. And in both cases<br />

the argument turns on principles to do with updating. Lewis was throughout his career<br />

a Bayesian; he frequently said that the ideal epistemic agent was a Bayesian conditionaliser<br />

and utility maximiser. And he defended this position with some gusto.<br />

The conditionals papers concern a position that was gaining popularity before<br />

Lewis showed it was untenable. The position in question starts with the idea that a<br />

speaker can properly say Probably, if p, q iff their subjective probability of q given<br />

p is high. And the position then offers an explanation of this purported fact. The<br />

English word ‘if’ is a binary connective which forms a sentence to be written as p →<br />

q, and it is true in virtue of the meaning of this connective that Pr(q | p) = Pr(p → q).<br />

So, assuming ‘probably’ means something like high subjective probability, Probably, if p,<br />

q means that the subjective probability of p → q is high, and, assuming the agent is coherent,<br />

that is true just in case the subjective probability of q given p is high.



Lewis doubted several aspects of this story. He briefly notes in “Adverbs of Quantification”<br />

that he didn’t think the ‘if’ in Probably, if p, q is a binary connective. But<br />

the more telling objection was his proof that there could not be a connective → such<br />

that for all p, q, Pr(q | p) = Pr(p → q). Lewis first argued for this in (1976b), and<br />

showed how to weaken some of the assumptions of the argument in (1986d). The<br />

effect of Lewis’s position was to essentially end the hope of analysing English ‘if’ in<br />

terms of a binary connective with these probabilistic properties.<br />
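The core of the first triviality result can be stated compactly. Suppose, for reductio, that → is a connective such that Pr(p → q) = Pr(q | p) for every probability function Pr in a class closed under conditionalisation, and that Pr(p ∧ q) and Pr(p ∧ ¬q) are both positive. Conditionalising on q and on ¬q, and applying the law of total probability:<br />

```latex
\begin{align*}
\Pr(p \to q) &= \Pr(p \to q \mid q)\,\Pr(q) + \Pr(p \to q \mid \lnot q)\,\Pr(\lnot q)\\
             &= \Pr(q \mid p \land q)\,\Pr(q) + \Pr(q \mid p \land \lnot q)\,\Pr(\lnot q)\\
             &= 1 \cdot \Pr(q) + 0 \cdot \Pr(\lnot q)\\
             &= \Pr(q).
\end{align*}
```

So Pr(q | p) = Pr(q): any p and q meeting the side conditions would be probabilistically independent, and only trivial probability functions satisfy that constraint.<br />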

The desire papers (1988b; 1996a) concern the Humean view that motivation<br />

requires both a belief and a desire. Lewis aims to attack the anti-Humean position that<br />

some beliefs, in particular beliefs that a certain thing is good, can play the functional<br />

roles of both beliefs and desires. He argues that this is not, in general, possible. And<br />

the argument is that beliefs and desires update in different ways. Or, at least, that<br />

anyone who updates their beliefs by conditionalisation and updates their valuation<br />

functions in a plausible way, will not be able to preserve any correlation between<br />

desire for a proposition being true and belief in that proposition’s goodness.<br />

Both of these papers rely on the idea that conditionalisation is a good way to update<br />

beliefs. Neither, by the way, relies on the idea that conditionalisation is the only<br />

rational way to update beliefs; the arguments go through given merely the permissibility<br />

of conditionalising. Many Bayesians hold something stronger, namely that<br />

conditionalisation is the way to update beliefs. One widely used argument in favour<br />

of this position is a so-called ‘Dutch Book’ argument. This argument shows that if<br />

you plan to follow any strategy for revising beliefs other than conditionalisation, and<br />

you do follow that strategy, then someone who knows the strategy that you’re going<br />

to follow can produce a series of bets that will seem favourable to you when each is<br />

offered, but which will collectively lead to a sure loss. If you conditionalise, however,<br />

no such series of bets can be produced. This argument was introduced to the<br />

literature by Paul Teller (1973), who credited it to Lewis. Lewis’s own version of the<br />

argument did not appear until 1999, in <strong>Papers</strong> in Metaphysics and Epistemology, under<br />

the title “Why Conditionalize?” (1999b). This was something he had written as a<br />

course handout in 1972, and which had been very widely circulated, and, via Teller’s<br />

paper, very influential on the development of Bayesian epistemology.<br />
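A numerical sketch shows how the diachronic Dutch book works against a plan to deviate from conditionalisation. All numbers below are invented for illustration: the prior has P(E) = 0.5 and P(H | E) = 0.5, and the agent plans, on learning E, to adopt credence q = 0.4 in H rather than conditionalising.<br />

```python
# Sketch of the diachronic Dutch book. Each trade is fair by the agent's
# credences at the moment it is offered, yet the agent loses in every
# outcome. The numbers are invented.

P_E, P_H_given_E, q = 0.5, 0.5, 0.4
x = P_H_given_E - q  # the planned deviation, here 0.1

def agent_net(outcome):
    """Agent's net payoff from three trades:
    1. buy a conditional bet on H given E: pay P(H|E) for a bet that
       pays 1 if H-and-E and is called off (price refunded) if not-E;
    2. buy a bet paying x if E, at the fair price x * P(E);
    3. if E occurs, sell the still-live H-bet back at the new credence q."""
    net = -x * P_E              # price of the bet on E
    if outcome in ("H_and_E", "notH_and_E"):
        net += x                # the bet on E pays off
        net += q - P_H_given_E  # bought the H-bet at 0.5, sold back at 0.4
    return net                  # if not-E, the H-bet's price is refunded

for outcome in ("H_and_E", "notH_and_E", "not_E"):
    print(outcome, round(agent_net(outcome), 4))  # -0.05 in every case
```

However things turn out, the agent is down x · P(E) = 0.05, even though each trade looked fair when it was accepted; a conditionaliser (q = 0.5, so x = 0) faces no such book.<br />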

Lewis was an early proponent of one of the two major views about the Sleeping<br />

Beauty puzzle. (There is a good description of the puzzle in section 6.3 of the entry<br />

on epistemic paradoxes, so I won’t repeat the description here.) The puzzle was introduced<br />

to the philosophical community by Adam Elga (2000b), who argued that<br />

when Beauty woke up, her credence in Heads should be 1/3. Lewis argued that the<br />

correct answer was 1/2. The core of his argument was that before Beauty went to<br />

sleep, her credence in Heads should be 1/2. That was agreed on all sides. Moreover,<br />

nothing happened that surprised Beauty. Indeed, everything happened exactly as she<br />

expected it would. Lewis argued that “Only new relevant evidence, centred or uncentred,<br />

produces a change in credence” (2001b, 174), and that Beauty got no new evidence.<br />

This idea has featured heavily in subsequent work defending the 1/2 answer to the<br />

Sleeping Beauty puzzle.



The Sleeping Beauty puzzle is important for another reason. As the quote above<br />

indicates, the puzzle is usually set up in terms of sets of centered worlds, following<br />

the work of Lewis we described in section 4.5. The work generated by the puzzle<br />

has been one of the reasons that that work, in particular (1979a), has received a large<br />

amount of attention in recent years.<br />

7.4 Philosophy of Religion<br />

In “Anselm and Actuality” (1970a), Lewis tries to give as good a formulation of the<br />

ontological argument as can be made in modal realist terms. This is a good framework<br />

for discussing the ontological argument, since on one interpretation, the argument<br />

rests crucially on cross-world comparisons of greatness and the modal realist<br />

can make sense of that kind of talk better than views that reject possible objects.<br />

Lewis argues that the principle “A being than which nothing greater can be conceived<br />

is possible” is crucially ambiguous. One kind of reading is that the imagined being’s<br />

greatness in its world is greater than the greatness of any other being in that being’s<br />

world. That may be true, but it doesn’t imply that the being actually exists. Another<br />

kind of reading focusses on the imagined being’s greatness in this world. It says that<br />

there (actually) is a being whose actual greatness is greater than the greatness of any<br />

possible being. That entails the conclusion, but is not plausibly true. The broader<br />

conclusion here, that the ontological argument derives its persuasive force from an<br />

equivocation, is one that has been widely adopted since Lewis’s paper.<br />
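Writing g(x, w) for the greatness of x in w, E(x, w) for x existing in w, and @ for the actual world, the two readings just described can be rendered as follows (the notation is illustrative, not Lewis’s own formulation):<br />

```latex
\begin{align*}
\text{Weak:}\quad & \exists w\,\exists x\,\bigl(E(x,w) \land \forall y\,(E(y,w) \rightarrow g(x,w) \geq g(y,w))\bigr)\\
\text{Strong:}\quad & \exists x\,\bigl(E(x,@) \land \forall v\,\forall y\,(E(y,v) \rightarrow g(x,@) \geq g(y,v))\bigr)
\end{align*}
```

The weak reading may well be true, but it does not entail that the being exists in @; the strong reading entails the conclusion, but is not plausibly true as a premise.<br />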

In “Evil for Freedom’s Sake” (1993a), Lewis reflects at length on the free will defence<br />

to the problem of evil. Lewis argues that for the defence to work, God must<br />

make quite different trade-offs between freedom and welfare than we are usually disposed<br />

to make, and our understanding of what freedom consists in, and what divine<br />

foreknowledge consists in, must be different to what they currently are.<br />

In “Do We Believe in Penal Substitution?” (1997a), Lewis notes that we only<br />

sometimes accept that one person can be properly punished for another’s misdeeds.<br />

He uses this to raise an interesting difficulty for the Christian idea that Christ died<br />

for our sins, suggesting this may not be a form of penal substitution that is normally<br />

acceptable.<br />

In “Divine Evil” (2007), Lewis suggests that proponents of the problem of evil<br />

should not focus on what God fails to prevent, but on what God does. In orthodox<br />

forms of theism, particularly Christianity and Islam, God is presented as perpetrating<br />

great evil against sinners of various stripes in the form of extreme punishments in the<br />

afterlife. Lewis suggests that a God who did this would be so evil that we should not<br />

only reject Him, but we may regard those who endorse the divine punishments as<br />

themselves somewhat culpable for divine evil. (The published version of this paper<br />

was composed by Philip Kitcher after Lewis’s death from notes Lewis made, and<br />

conversations Kitcher had with Lewis.)



7.5 Ethics<br />

Lewis is obviously not as well known for his work in ethics as for his work in other<br />

areas of philosophy. It was something of a surprise when one of the volumes of his<br />

collected papers was called <strong>Papers</strong> in Ethics and Social Philosophy (2000). On the other<br />

hand, the existence of this volume indicates that there is a large body of work that<br />

Lewis put together in moral philosophy, very broadly construed. The best guide to<br />

this work is chapter 8 of Nolan (2005), and I’ll follow Nolan very closely here.<br />

As Nolan suggests, the least inaccurate summary of Lewis’s ethical positions is<br />

that he was a virtue ethicist. Indeed, a focus on virtue, as opposed to consequences,<br />

plays a role in his defence of modal realism, as we saw in section 6.2. Nolan also<br />

notes that this position is somewhat surprising. Most philosophers who accept views<br />

related to Lewis’s about psychology and decision-making (in particular, who accept<br />

a Humean story about beliefs and desires being the basis for motivation, and who<br />

accept some or other version of expected utility maximisation as the basis for rational<br />

decision) have broadly consequentialist positions. But not Lewis.<br />

Lewis was also a value pluralist (1984a; 1989b; 1993a). Indeed, this was part of<br />

his objection to consequentialism. He rejected the idea that there was one summary<br />

judgment we could make about the moral value of a person. In “Reply to McMichael”<br />

(1978a) he complains about the utilitarian assumption that “any sort or amount of<br />

evil can be neutralized, as if it had never been, by enough countervailing good —and<br />

that the balancing evil and good may be entirely unrelated” (1978a, 85).<br />

In meta-ethics, Lewis defended a variety of subjectivism (1989b). Like many subjectivists,<br />

Lewis held that something is valuable for us iff we would value it under<br />

ideal circumstances. And he held, following Frankfurt (1971), that valuing something<br />

is simply desiring to desire it. What is distinctive about Lewis’s position is his<br />

view about what ideal circumstances are. He thinks they are circumstances of “full<br />

imaginative acquaintance”. This has some interesting consequences. In particular, it<br />

allows Lewis to say that different goods have different conditions of full imaginative<br />

acquaintance. It might, he suggests, be impossible to properly imagine instantiating<br />

several different values at once. And that in turn lets him argue that his value<br />

pluralism is consistent with this kind of subjectivism, in a way that it might not be<br />

consistent with other varieties of subjectivism.<br />

Lewis also wrote several more papers in applied ethics. In two interesting papers<br />

on tolerance (1989a; 1989c), he suggests that one reason for being tolerant, and<br />

especially of being tolerant of speech we disapprove of, comes from game-theoretic<br />

considerations. In particular, he thinks our motivation for tolerance comes from<br />

forming a ‘tacit treaty’ with those with differing views. If we agree not to press our<br />

numerical superiority to repress them when we are in the majority, they will do the<br />

same. So tolerating opposing views may be an optimal strategy for anyone who isn’t<br />

sure that they will be in the majority indefinitely. In these works it is easy to see<br />

the legacies of Lewis’s early work on philosophical lessons to be drawn from game<br />

theory, and especially from the work of Thomas Schelling.



7.6 Applied Metaphysics<br />

There’s much more that could be said about Lewis’s contributions to philosophy, but<br />

we’ll end with a discussion of two wonderful pieces of applied metaphysics.<br />

In “The Paradoxes of Time Travel” (1976a), Lewis discusses the many complicated<br />

philosophical issues about time travel. He discusses temporal parts, personal identity,<br />

causation and causal loops, free will, and the complications arising from our many<br />

different modal concepts. In some cases he uses the canvas provided to illustrate his<br />

own take on the metaphysical issues that arise. But in some cases he notes that the<br />

problems that arise are problems for everyone.<br />

“Holes” (Lewis and Lewis, 1970) was co-written with Stephanie Lewis. In it they<br />

discuss, in dialogue form, some of the metaphysical issues that holes generate. One of<br />

the characters, Argle, wants to eliminate holes from his ontology, and the paper goes<br />

over what costs must be met to make this form of nominalism work. The other character,<br />

Bargle, pushes Argle to clarify his commitments, and in doing so draws out<br />

many details of the nominalist framework. The case is of some interest in itself, but<br />

it is also, as the authors note at the end, a useful case-study in the kind of moves nominalists<br />

can make in eliminating unwanted ontology, and the costs of those moves.<br />

Each paper can be, and indeed often has been, used for introducing complicated<br />

metaphysical issues to students. The papers are, like many of Lewis’s papers, widely<br />

anthologised. They are both excellent illustrations of the fact that, as well as being a<br />

wonderful philosopher, Lewis was one of the best philosophical writers of his time.


Part II<br />

Epistemology


Can We Do Without Pragmatic Encroachment?<br />

1 Introduction<br />

Recently several authors have defended claims suggesting that there is a closer connection<br />

between practical interests and epistemic justification than has traditionally<br />

been countenanced. Jeremy Fantl and Matthew McGrath (2002) argue that there is a<br />

“pragmatic necessary condition on epistemic justification” (77), namely the following.<br />

(PC) S is justified in believing that p only if S is rational to prefer as if p. (77)<br />

And John Hawthorne (2004b) and Jason Stanley (2005) have argued that what it takes<br />

to turn true belief into knowledge is sensitive to the practical environment the subject<br />

is in. These authors seem to be suggesting there is, to use Jonathan Kvanvig’s<br />

phrase, “pragmatic encroachment”, in epistemology. In this paper I’ll argue that their<br />

arguments do not quite show this is true, and that concepts of epistemological justification<br />

need not be pragmatically sensitive. The aim here isn’t to show that (PC) is<br />

false, but rather that it shouldn’t be described as a pragmatic condition on justification.<br />

Rather, it is best thought of as a pragmatic condition on belief. There are two<br />

ways to spell out the view I’m taking here. These are both massive simplifications,<br />

but they are close enough to the truth to show the kind of picture I’m aiming for.<br />

First, imagine a philosopher who holds a very simplified version of functionalism<br />

about belief, call it (B).<br />

(B) S believes that p iff S prefers as if p<br />

Our philosopher one day starts thinking about justification, and decides that we can<br />

get a principle out of (B) by adding normative operators to both sides, inferring (JB).<br />

(JB) S is justified in believing that p iff S is justified to prefer as if p<br />

Now it would be a mistake to treat (JB) as a pragmatic condition on justification<br />

(rather than belief) if it was derived from (B) by this simple means. And if our<br />

philosopher goes on to infer (PC) from (JB), by replacing ‘justified’ with ‘rational’,<br />

and inferring the conditional from the biconditional, we still don’t get a pragmatic<br />

condition on justification.<br />

Second, Fantl and McGrath focus their efforts on attacking the following principle.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical<br />

Perspectives 19 (2005): 417-43. Thanks to Michael Almeida, Tamar Szabó Gendler, Peter Gerdes,<br />

Jon Kvanvig, Barry Lam, Ishani Maitra, Robert Stalnaker, Jason Stanley, Matthew Weiner for helpful<br />

discussions, and especially to Matthew McGrath for correcting many mistakes in an earlier draft of this<br />

paper.



Evidentialism For any two subjects S and S′, necessarily, if S and S′ have the same<br />

evidence for/against p, then S is justified in believing that p iff S′ is, too.<br />

I agree, evidentialism is false. And I agree that there are counterexamples to evidentialism<br />

from subjects who are in different practical situations. What I don’t accept is<br />

that we learn much about the role of pragmatic factors in epistemology, properly defined,<br />

from these counterexamples to evidentialism. Evidentialism follows from the<br />

following three principles.<br />

Probabilistic Evidentialism For any two subjects S and S′, and any degree of belief<br />

α, necessarily, if S and S′ have the same evidence for/against p, then S is justified<br />

in believing that p to degree α iff S′ is, too.<br />

Threshold View For any two subjects S and S′, and any degree of belief α, if S and<br />

S′ both believe p to degree α, then S believes that p iff S′ does too.<br />

Probabilistic Justification For any S, S is justified in believing p iff there is some<br />

degree of belief α such that S is justified in believing p to degree α, and in S’s<br />

situation, believing p to degree α suffices for believing p.<br />

(Degrees of belief here are meant to be the subjective correlates of Keynesian probabilities.<br />

See Keynes (1921) for more details. They need not, and usually will not, be<br />

numerical values. The Threshold View is so-called because given some other plausible<br />

premises it implies that S believes that p iff S’s degree of belief in p is above a<br />

threshold.)<br />

I endorse Probabilistic Justification, and for present purposes at least I endorse<br />

Probabilistic Evidentialism. The reason I think Evidentialism fails is that the<br />

Threshold View is false. It is plausible that Probabilistic Justification and Probabilistic<br />

Evidentialism are epistemological principles, while the Threshold View is a<br />

principle from philosophy of mind. So this matches up with the earlier contention<br />

that the failure of Evidentialism tells us something interesting about the role of pragmatics<br />

in philosophy of mind, rather than something about the role of pragmatics in<br />

epistemology.<br />

As noted, Hawthorne and Stanley are both more interested in knowledge than<br />

justification. So my discussion of their views will inevitably be somewhat distorting.<br />

I think what I say about justification here should carry over to a theory of knowledge,<br />

but space prevents a serious examination of that question. The primary bit of<br />

‘translation’ I have to do to make their works relevant to a discussion of justification<br />

is to interpret their defences of the principle (KP) below as implying some support<br />

for (JP), which is obviously similar to (PC).<br />

(KP) If S knows that p, then S is justified in using p as a premise in practical reasoning.<br />

(JP) If S justifiably believes that p, then S is justified in using p as a premise in<br />

practical reasoning.



I think (JP) is just as plausible as (KP). In any case it is independently plausible<br />

whether or not Hawthorne and Stanley are committed to it. So I’ll credit recognition<br />

of (JP)’s importance to a theory of justification to them, and hope that in doing so<br />

I’m not irreparably damaging the public record.<br />

The overall plan here is to use some philosophy of mind, specifically functionalist<br />

analyses of belief, to respond to some arguments in epistemology. But, as you<br />

can see from the role the Threshold View plays in the above argument, our starting<br />

point will be the question: what is the relation between the credences decision theory<br />

deals with and our traditional notion of belief? I’ll offer an analysis of this relation<br />

that supports my above claim that we should work with a pragmatic notion of belief<br />

rather than a pragmatic notion of justification. The analysis I offer has a hole in it<br />

concerning propositions that are not relevant to our current plans, and I’ll fix the<br />

hole in section 3. Sections 4 and 5 concern the role that closure principles play in<br />

my theory, in particular the relationship between having probabilistically coherent<br />

degrees of belief and logically coherent beliefs. In this context, a closure principle<br />

is a principle that says probabilistic coherence implies logical coherence, at least in a<br />

certain domain. (It’s called a closure principle because we usually discuss it by working<br />

out properties of probabilistically coherent agents, and showing that their beliefs are<br />

closed under entailment in the relevant domain.) In section 4 I’ll defend the theory<br />

against the objection, most commonly heard from those wielding the preface paradox,<br />

that we need not endorse as strong a closure principle as I do. In section 5 I’ll<br />

defend the theory against those who would endorse an even stronger closure principle<br />

than is defended here. Once we’ve got a handle on the relationship between<br />

degrees of belief and belief tout court, we’ll use that to examine the arguments for<br />

pragmatic encroachment. In section 6 I’ll argue that we can explain the intuitions behind<br />

the cases that seem to support pragmatic encroachment, while actually keeping<br />

all of the pragmatic factors in our theory of belief. In section 7 I’ll discuss how to<br />

endorse principles like (PC) and (JP) (as far as they can be endorsed) while keeping<br />

a non-pragmatic theory of probabilistic justification. The interesting cases here are<br />

ones where agents have mistaken and/or irrational beliefs about their practical environment,<br />

and intuitions in those cases are cloudy. But it seems the most natural path<br />

in these cases is to keep a pragmatically sensitive notion of belief, and a pragmatically<br />

insensitive notion of justification.<br />

2 Belief and Degree of Belief<br />

Traditional epistemology deals with beliefs and their justification. Bayesian epistemology<br />

deals with degrees of belief and their justification. In some sense they are<br />

both talking about the same thing, namely epistemic justification. Two questions<br />

naturally arise. Do we really have two subject matters here (degrees of belief and<br />

belief tout court) or two descriptions of the one subject matter? If just one subject<br />

matter, what relationship is there between the two modes of description of this subject<br />

matter?<br />

The answer to the first question is, I think, rather easy. There is no reason to<br />

believe that the mind contains two representational systems, one to represent things



as being probable or improbable and the other to represent things as being true or<br />

false. The mind probably does contain a vast plurality of representational systems,<br />

but they don’t divide up the doxastic duties this way. If there are distinct visual and<br />

auditory representational systems, they don’t divide up duties between degrees of<br />

belief and belief tout court, for example. If there were two distinct systems, then we<br />

should be able to imagine that they could vary independently, at least as much as is allowed by<br />

constitutive rationality. But such variation is hard to fathom. So I’ll infer that the<br />

one representational system accounts for our credences and our categorical beliefs.<br />

(It follows from this that the question Bovens and Hawthorne (1999) ask, namely<br />

what beliefs should an agent have given her degrees of belief, doesn’t have a nontrivial<br />

answer. If fixing the degrees of belief in an environment fixes all her doxastic<br />

attitudes, as I think it does, then there is no further question of what she should<br />

believe given these are her degrees of belief.)<br />

The second question is much harder. It is tempting to say that S believes that p<br />

iff S’s credence in p is greater than some salient number r, where r is made salient<br />

either by the context of belief ascription, or the context that S is in. I’m following<br />

Mark Kaplan (1996) in calling this the threshold view. There are two well-known<br />

problems with the threshold view, both of which seem fatal to me.<br />

As Robert Stalnaker (1984, 91) emphasised, any number r is bound to seem arbitrary.<br />

Unless these numbers are made salient by the environment, there is no special<br />

difference between believing p to degree 0.9875 and believing it to degree 0.9876.<br />

But if r is 0.98755, this will be the difference between believing p and not believing<br />

it, which is an important difference. The usual response to this, as found in (Foley,<br />

1993, Ch. 4) and Hunter (1996) is to say that the boundary is vague. But it’s not clear<br />

how this helps. On an epistemic theory of vagueness, there is still a number such that<br />

degrees of belief above that count, and degrees below that do not, and any such number<br />

is bound to seem unimportant. On supervaluational theories, the same is true.<br />

There won’t be a determinate number, to be sure, but there will be a number, and that<br />

seems false. My preferred degree of belief theory of vagueness, as set out in Weatherson<br />

(2005c) has the same consequence. Hunter defends a version of the threshold<br />

view combined with a theory of vagueness based around fuzzy logic, which seems to<br />

be the only theory that could avoid the arbitrariness objection. But as Williamson<br />

(1994) showed, there are deep and probably insurmountable difficulties with that position.<br />

So I think the vagueness response to the arbitrariness objection is (a) the only<br />

prima facie plausible response and (b) unsuccessful.<br />

The second problem concerns conjunction. It is also set out clearly by Stalnaker.<br />

Reasoning in this way from accepted premises to their deductive consequences<br />

(P, also Q, therefore R) does seem perfectly straightforward.<br />

Someone may object to one of the premises, or to the validity of the argument,<br />

but one could not intelligibly agree that the premises are each<br />

acceptable and the argument valid, while objecting to the acceptability<br />

of the conclusion. (Stalnaker, 1984, 92)



If categorical belief is having a credence above the threshold, then one can coherently<br />

do exactly this. Let x be a number strictly between r and r^(1/2), such that an atom<br />

of type U has probability x of decaying within a time t, for some t and U. Assume<br />

our agent knows this fact, and is faced with two (isolated) atoms of U. Let p be that<br />

the first decays within t, and q be that the second decays within t. She should, given<br />

her evidence, believe p to degree x, q to degree x, and p ∧ q to degree x². If she<br />

believed p ∧ q to a degree greater than r, she’d have to either have credences that<br />

were not supported by her evidence, or credences that were incoherent. (Or, most<br />

likely, both.) So this theory violates the platitude. This is a well-known argument, so<br />

there are many responses to it, most of them involving something like appeal to the<br />

preface paradox. I’ll argue in section 4 that the preface paradox doesn’t in fact offer<br />

the threshold view proponent much support here. But even before we get there,<br />

we should note that the arbitrariness objection gives us sufficient reason to reject the<br />

threshold view.<br />
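The arithmetic behind the decay case can be checked directly. A minimal sketch, with r = 0.9 and x = 0.93 as purely illustrative values (any x strictly between r and r^(1/2) will do):<br />

```python
import math

r = 0.9   # hypothetical threshold: believe p iff credence in p exceeds r
x = 0.93  # chance a single atom of U decays within t; note r < x < sqrt(r)
assert r < x < math.sqrt(r)

cred_p = x       # credence that the first atom decays within t
cred_q = x       # credence that the second atom decays within t
cred_pq = x * x  # the atoms are independent, so credence in p ∧ q is x²

# On the threshold view the agent counts as believing p and believing q...
assert cred_p > r and cred_q > r
# ...yet fails to believe their conjunction, despite perfectly coherent,
# evidence-supported credences. That is the violation of the platitude.
assert cred_pq < r
```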

A better move is to start with the functionalist idea that to believe that p is to treat<br />

p as true for the purposes of practical reasoning. To believe p is to have preferences<br />

that make sense, by your own lights, in a world where p is true. So, if you prefer A<br />

to B and believe that p, you prefer A to B given p. For reasons that will become apparent<br />

below, we’ll work in this paper with a notion of preference where conditional<br />

preferences are primary. 1 So the core insight we’ll work with is the following:<br />

If you prefer A to B given q, and you believe that p, then you prefer A<br />

to B given p ∧ q<br />

The bold suggestion here is that if that is true for all the A, B and q that matter, then<br />

you believe p. Put formally, where Bel(p) means that the agent believes that p, and<br />

A ≥_q B means that the agent thinks A is at least as good as B given q, we have the<br />

following:<br />

1. Bel(p) ↔ ∀A∀B∀q (A ≥_q B ↔ A ≥_{p∧q} B)<br />

In words, an agent believes that p iff conditionalising on p doesn’t change any conditional<br />

preferences over things that matter. 2 The left-to-right direction of this seems<br />

trivial, and the right-to-left direction seems to be a plausible way to operationalise the<br />

functionalist insight that belief is a functional state. There is some work to be done<br />

if (1) is to be interpreted as a truth though.<br />

1 To say the agent prefers A to B given q is not to say that if the agent were to learn q, she would prefer<br />

A to B. It’s rather to say that she prefers the state of the world where she does A and q is true to the state<br />

of the world where she does B and q is true. These two will come apart in cases where learning q changes<br />

the agent’s preferences. We’ll return to this issue below.<br />

2 This might seem much too simple, especially when compared to all the bells and whistles that functionalists<br />

usually put in their theories to (further) distinguish themselves from crude versions of behaviourism.<br />

The reason we don’t need to include those complications here is that they will all be included in the analysis<br />

of preference. Indeed, the theory here is compatible with a thoroughly anti-functionalist treatment<br />

of preference. The claim is not that we can offer a functional analysis of belief in terms of non-mental<br />

concepts, just that we can offer a functionalist reduction of belief to other mental concepts. The threshold<br />

view is also such a reduction, but it is such a crude reduction that it doesn’t obviously fall into any category.



If we interpret the quantifiers in (1) as unrestricted, then we get the (false) conclusion<br />

that just about no one believes any contingent propositions. To prove this,<br />

consider a bet that wins iff the statue in front of me waves back at me due to random<br />

quantum effects when I wave at it. If I take the bet and win, I get to live forever in<br />

paradise. If I take the bet and lose, I lose a penny. Letting A be that I take the bet, B<br />

be that I decline the bet, q be a known tautology (so my preferences given q are my<br />

preferences tout court) and p be that the statue does not wave back, we have that I<br />

prefer A to B, but not A to B given p. So by this standard I don’t believe that p. This<br />

is false – right now I believe that statues won’t wave back at me when I wave at them.<br />
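As a sketch of why the unrestricted reading fails, here is the bet in expected-utility form (the specific utilities and the tiny credence are illustrative assumptions, not values from the text):<br />

```python
eps = 1e-30          # credence that the statue waves back (quantum fluke)
u_paradise = 1e40    # utility of eternity in paradise (huge by stipulation)
u_penny = -0.01      # utility of losing a penny
u_decline = 0.0      # declining the bet changes nothing

# Unconditionally, A (take the bet) beats B (decline): the enormous prize
# outweighs its minuscule probability.
ev_take = eps * u_paradise + (1 - eps) * u_penny
assert ev_take > u_decline

# Conditional on p (the statue does not wave back), the bet surely loses,
# so the preference reverses.
ev_take_given_p = u_penny
assert ev_take_given_p < u_decline

# So with unrestricted quantifiers, clause (1) wrongly rules that
# I don't believe p.
```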

This seems like a problem. But the solution to it is not to give up on functionalism,<br />

but to insist on its pragmatic foundations. The quantifiers in (1) should be<br />

restricted, with the restrictions motivated pragmatically. What is crucial to the theory<br />

is to say what the restrictions on A and B are, and what the restrictions on q are.<br />

We’ll deal with these in order.<br />

For better or worse, I don’t right now have the option of taking that bet and hence<br />

spending eternity in paradise if the statue waves back at me. Taking or declining such<br />

unavailable bets are not open choices. For any option that is open to me, assuming<br />

that statues do not in fact wave does not change its utility. That’s to say, I’ve already<br />

factored in the non-waving behaviour of statues into my decision-making calculus.<br />

That’s to say, I believe statues don’t wave.<br />

An action A is a live option for the agent if it is really possible for the agent to<br />

perform A. An action A is a salient option if it is an option the agent takes seriously in<br />

deliberation. Most of the time gambling large sums of money on internet gambling<br />

sites over my phone is a live option, but not a salient option. I know this option<br />

is suboptimal, and I don’t have to recompute every time whether I should do it.<br />

Whenever I’m making a decision, I don’t have to add to the list of choices bet<br />

thousands of dollars on internet gambling sites, and then rule that out again every time. I<br />

just don’t consider that option, and properly so. If I have a propensity to daydream,<br />

then becoming the centrefielder for the Boston Red Sox might be a salient option to<br />

me, but it certainly isn’t a live option. We’ll say the two initial quantifiers range over<br />

the options that are live and salient for the agent.<br />

Note that we don’t say that the quantifiers range over the options that are live and<br />

salient for the person making the belief ascription. That would lead us to a form of<br />

contextualism for which we have little evidence. We also don’t say that an option<br />

becomes salient for the agent iff they should be considering it. At this stage we are<br />

just saying what the agent does believe, not what they should believe, so we don’t<br />

have any clauses involving normative concepts.<br />

Now we’ll look at the restrictions on the quantifier over propositions. Say a<br />

proposition is relevant if the agent is disposed to take seriously the question of whether<br />

it is true (whether or not she is currently considering that question) and conditionalising<br />

on that proposition or its negation changes some of the agent’s unconditional<br />

preferences over live, salient options. 3 The first clause is designed to rule out wild<br />

3 Conditionalising on the proposition There are space aliens about to come down and kill all the people<br />

writing epistemology papers will make me prefer to stop writing this paper, and perhaps grab some old<br />

metaphysics papers I could be working on. So that proposition satisfies the second clause of the definition



hypotheses that the agent does not take at all seriously. If q is not such a proposition,<br />

that is, if the agent is disposed to take it seriously, then it is relevant if there are live, salient<br />

A and B such that A ≥_q B ↔ A ≥ B is false. Say a proposition is salient if the agent<br />

is currently considering whether it is true. Finally, say a proposition is active relative<br />

to p iff it is a (possibly degenerate) conjunction of propositions such that each<br />

conjunct is either relevant or salient, and such that the conjunction is consistent with<br />

p. (By a degenerate conjunction I mean a conjunction with just one conjunct. The<br />

consistency requirement is there because it might be hard in some cases to make sense<br />

of preferences given inconsistencies.) Then the propositional quantifier in (1) ranges<br />

over active propositions.<br />

We will expand and clarify this in the next section, but our current solution to the<br />

relationship between beliefs and degrees of belief is that degrees of belief determine<br />

an agent’s preferences, and she believes that p iff the claim (1) about her preferences<br />

is true when the quantifiers over options are restricted to live, salient actions, and<br />

the quantifier over propositions is restricted to active propositions. The simple view<br />

would be to say that the agent believes that p iff conditioning on p changes none of<br />

her preferences. The more complicated view here is that the agent believes that p<br />

iff conditioning on p changes none of her conditional preferences over live, salient<br />

options, where the conditions are also active relative to p.<br />

3 Impractical Propositions<br />

The theory sketched in the previous paragraph seems to me right in the vast majority<br />

of cases. It fits in well with a broadly functionalist view of the mind, and as we’ll see it<br />

handles some otherwise difficult cases with aplomb. But it needs to be supplemented<br />

a little to handle beliefs about propositions that are practically irrelevant. I’ll illustrate<br />

the problem, then note how I prefer to solve it.<br />

I don’t know what Julius Caesar had for breakfast the morning he crossed the<br />

Rubicon. But I think he would have had some breakfast. It is hard to be a good<br />

general without a good morning meal after all. Let p be the proposition that he had<br />

breakfast that morning. I believe p. But this makes remarkably little difference to<br />

my practical choices in most situations. True, I wouldn’t have written this paragraph<br />

as I did without this belief, but it is rare that I have to write about Caesar’s dietary<br />

habits. In general whether p is true makes no practical difference to me. This makes<br />

it hard to give a pragmatic account of whether I believe that p. Let’s apply (1) to see<br />

whether I really believe that p.<br />

1. Bel(p) ↔ ∀A∀B∀q (A ≥_q B ↔ A ≥_{p∧q} B)<br />

Since p makes no practical difference to any choice I have to make, the right hand<br />

side is true. So the left hand side is true, as desired. The problem is that the right<br />

hand side of (2) is also true here.<br />

of relevance. But it clearly doesn’t satisfy the first clause. This part of the definition of relevance won’t do<br />

much work until the discussion of agents with mistaken environmental beliefs in section 7.



2. Bel(¬p) ↔ ∀A∀B∀q (A ≥_q B ↔ A ≥_{¬p∧q} B)<br />

Adding the assumption that Caesar had no breakfast that morning doesn’t change<br />

any of my practical choices either. So I now seem to inconsistently believe both p and<br />

¬ p. I have some inconsistent beliefs, I’m sure, but those aren’t among them. We need<br />

to clarify what (1) claims.<br />

To do so, I supplement the theory sketched in section 2 with the following principles.<br />

• A proposition p is eligible for belief if it satisfies ∀A∀B∀q (A ≥_q B ↔ A ≥_{p∧q} B),<br />

where the first two quantifiers range over the open, salient actions in the<br />

sense described in section 2.<br />

• For any proposition p, and any proposition q that is relevant or salient, among<br />

the actions that are (by stipulation!) open and salient with respect to p are<br />

believing that p, believing that q, not believing that p and not believing that q.<br />

• For any proposition, the subject prefers believing it to not believing it iff (a)<br />

it is eligible for belief and (b) the agent’s degree of belief in the proposition is<br />

greater than 1/2.<br />

• The previous stipulation holds both unconditionally and conditional on p, for<br />

any p.<br />

• The agent believes that p iff ∀A∀B∀q (A ≥_q B ↔ A ≥_{p∧q} B), where the first<br />

two quantifiers range over all actions that are either open and salient tout court<br />

(i.e. in the sense of section 2) or open and salient with respect to p (as described<br />

above).<br />

This all looks moderately complicated, but I’ll explain how it works in some detail<br />

as we go along. One simple consequence is that an agent believes that p only if their<br />

degree of belief in p is greater than 1/2. Since my degree of belief in Caesar’s foodless<br />

morning is not greater than 1/2, in fact it is considerably less, I don’t believe ¬ p. On<br />

the other hand, since my degree of belief in p is considerably greater than 1/2, I prefer<br />

believing it to not believing it, so I believe it.<br />
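A toy rendering of how the supplemented criteria settle the Caesar case (the 0.9 credence is an assumed figure for illustration):<br />

```python
def believes(credence, eligible):
    # Per the criteria above: the agent prefers believing a proposition to
    # not believing it iff it is eligible for belief and her credence in it
    # exceeds 1/2; for practically irrelevant propositions that preference
    # settles whether she believes it.
    return eligible and credence > 0.5

cred_p = 0.9  # credence that Caesar had breakfast that morning

# p makes no practical difference, so both p and ¬p are eligible; only the
# credence clause separates them.
assert believes(cred_p, eligible=True)           # I believe p
assert not believes(1 - cred_p, eligible=True)   # I don't believe ¬p
```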

There are many possible objections to this position, which I’ll address sequentially.<br />

Objection: Even if I have a high degree of belief in p, I might prefer to not believe<br />

p because I think that belief in p is bad for some other reason. Perhaps, if p is a<br />

proposition about my brilliance, it might be immodest to believe that p.<br />

Reply: Any of these kinds of considerations should be put into the credences. If it<br />

is immodest to believe that you are a great philosopher, it is equally immodest to<br />

believe to a high degree that you are a great philosopher.<br />

Objection: Belief that p is not an action in the ordinary sense of the term.<br />

Reply: True, which is why this is described as a supplement to the original theory,<br />

rather than just cashing out its consequences.<br />

Objection: It is impossible to choose to believe or not believe something, so we<br />

shouldn’t be applying these kinds of criteria.



Reply: I’m not as convinced of the impossibility of belief by choice as others are, but I<br />

won’t push that for present purposes. Let’s grant that beliefs are always involuntary.<br />

So these ‘actions’ aren’t open actions in any interesting sense, and the theory in section<br />

2 was really incomplete. As I said, this is a supplement to the theory in section<br />

2.<br />

This doesn’t prevent us using principles of constitutive rationality, such as that we<br />

prefer to believe p iff our credence in p is over 1/2. Indeed, on most occasions where<br />

we use constitutive rationality to infer that a person has some mental state, the mental<br />

state we attribute to them is one they could not fail to have. But functionalists<br />

are committed to constitutive rationality (Lewis, 1994b). So my approach here is<br />

consistent with a broadly functionalist outlook.<br />

Objection: This just looks like a roundabout way of stipulating that to believe that<br />

p, your degree of belief in p has to be greater than 1/2. Why not just add that as<br />

an extra clause rather than going through these little-understood detours about preferences<br />

about beliefs?<br />

Reply: There are three reasons for doing things this way rather than adding such a<br />

clause.<br />

First, it’s nice to have a systematic theory rather than a theory with an ad hoc<br />

clause like that.<br />

Second, the effect of this constraint is much more than to restrict belief to propositions<br />

whose credence is greater than 1/2. Consider a case where p and q and their<br />

conjunction are all salient, p and q are probabilistically independent, and the agent’s<br />

credence in each is 0.7. Assume also that p, q and p ∧ q are completely irrelevant to<br />

any practical deliberation the agent must make. Then the criteria above imply that<br />

the agent does not believe that p or that q. The reason is that the agent’s credence in<br />

p ∧ q is 0.49, so she prefers not to believe p ∧ q. But conditional on p, her credence in<br />

p ∧ q is 0.7, so she prefers to believe it. So conditionalising on p does change her preferences<br />

with respect to believing p ∧ q, so she doesn’t believe p. So the effect of these<br />

stipulations rules out much more than just belief in propositions whose credence is<br />

below 1/2.<br />
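The arithmetic of this case, spelled out (0.7 is the stipulated credence from the example; independence gives the conjunction credence 0.49):<br />

```python
pr_p, pr_q = 0.7, 0.7    # independent, practically irrelevant propositions
pr_pq = pr_p * pr_q      # credence in p ∧ q = 0.49

# Unconditionally, the agent prefers NOT to believe p ∧ q (credence below 1/2)...
assert pr_pq < 0.5
# ...but conditional on p, Pr(p ∧ q | p) = Pr(q) = 0.7 > 1/2, so she prefers
# to believe it.
pr_pq_given_p = pr_pq / pr_p
assert abs(pr_pq_given_p - 0.7) < 1e-9
assert pr_pq_given_p > 0.5

# Conditionalising on p thus changes a preference over a stipulated-salient
# 'action' (believing p ∧ q), so by the criteria the agent doesn't believe p.
```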

This suggests the third, and most important point. The problem with the threshold<br />

view was that it led to violations of closure. Given the theory as stated, we can<br />

prove the following theorem. Whenever p and q and their conjunction are all open<br />

or salient, and both are believed, and the agent is probabilistically coherent, the agent<br />

also believes p ∧ q. This is a quite restricted closure principle, but it is not a trivial<br />

one, since it fails to be true on the threshold view.<br />

The proof of this theorem is a little complicated, but worth working through.<br />

First we’ll prove that if the agent believes p, believes q, and p and q are both salient,<br />

then the agent prefers believing p ∧ q to not believing it, if p ∧ q is eligible for belief.<br />

In what follows Pr(x|y) is the agent’s conditional degree of belief in x given y. Since<br />

the agent is coherent, we’ll assume this is a probability function (hence the name).<br />

1. Since the agent believes that q, they prefer believing that q to not believing that<br />

q (by the criteria for belief)


Can We Do Without Pragmatic Encroachment? 100<br />

2. So the agent prefers believing that q to not believing that q given p (From 1<br />

and the fact that they believe that p, and that q is salient)<br />

3. So Pr(q | p) > 1/2 (from 2)<br />

4. Pr(q | p) = Pr(p ∧ q | p) (by probability calculus)<br />

5. So Pr(p ∧ q | p) > 1/2 (from 3, 4)<br />

6. So, if p ∧ q is eligible for belief, then the agent prefers believing that p ∧ q to<br />

not believing it, given p (from 5)<br />

7. So, if p ∧ q is eligible for belief, the agent prefers believing that p ∧ q to not<br />

believing it (from 6, and the fact that they believe that p, and p ∧ q is salient)<br />
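Step (4) above is the only probabilistic identity the proof relies on; it can be spot-checked on an arbitrary joint distribution over the four p/q possibilities. The particular numbers below are purely illustrative.<br />

```python
# Spot-check of step (4): Pr(q | p) = Pr(p ∧ q | p), on an arbitrary
# joint distribution over the four truth-value combinations for (p, q).
pr = {(True, True): 0.42, (True, False): 0.28,
      (False, True): 0.18, (False, False): 0.12}

def prob(event):
    """Probability of the set of worlds satisfying `event`."""
    return sum(weight for world, weight in pr.items() if event(world))

pr_p = prob(lambda w: w[0])
pr_q_given_p = prob(lambda w: w[0] and w[1]) / pr_p
# (p ∧ q) ∧ p picks out exactly the same worlds as p ∧ q:
pr_pq_given_p = prob(lambda w: w[0] and w[1] and w[0]) / pr_p

print(pr_q_given_p == pr_pq_given_p)  # True
```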

So whenever p, q and p ∧ q are salient, and the agent believes each conjunct, the agent<br />

prefers believing the conjunction p ∧ q to not believing it, if p ∧ q is eligible. Now<br />

we have to prove that p ∧ q is eligible for belief, to prove that it is actually believed.<br />

That is, we have to prove that (5) follows from (4) and (3), where the initial quantifiers<br />

range over actions that are open and salient tout court.<br />

(3) ∀A∀B∀r (A ≥_r B ↔ A ≥_{p∧r} B)<br />

(4) ∀A∀B∀r (A ≥_r B ↔ A ≥_{q∧r} B)<br />

(5) ∀A∀B∀r (A ≥_r B ↔ A ≥_{p∧q∧r} B)<br />

Assume that (5) isn’t true. That is, there are A, B and s such that ¬(A ≥_s B ↔ A ≥_{p∧q∧s} B).<br />

By hypothesis s is active, and consistent with p ∧ q. So it is the conjunction<br />

of relevant, salient propositions. Since q is salient, this means q ∧ s is also active.<br />

Since s is consistent with p ∧ q, it follows that q ∧ s is consistent with p. So q ∧ s is<br />

a possible substitution instance for r in (3). Since (3) is true, it follows that<br />

A ≥_{q∧s} B ↔ A ≥_{p∧q∧s} B. By similar reasoning, it follows that s is a permissible substitution<br />

instance in (4), giving us A ≥_s B ↔ A ≥_{q∧s} B. Putting the last two biconditionals<br />

together we get A ≥_s B ↔ A ≥_{p∧q∧s} B, contradicting our hypothesis that there is a<br />

counterexample to (5). So whenever (3) and (4) are true, (5) is true as well, assuming<br />

p, q and p ∧ q are all salient.<br />

4 Defending Closure<br />

So on my account of the connection between degrees of belief and belief tout court,<br />

probabilistic coherence implies logical coherence amongst salient propositions. The<br />

last qualification is necessary. It is possible for a probabilistically coherent agent to<br />

not believe the non-salient consequences of things they believe, and even for a probabilistically<br />

coherent agent to have inconsistent beliefs as long as not all the members<br />

of the inconsistent set are active. Some people argue that even this weak a closure<br />

principle is implausible. David Christensen (2005), for example, argues that the preface<br />

paradox provides a reason for doubting that beliefs must be closed under entailment,<br />

or even must be consistent. Here is his description of the case.



We are to suppose that an apparently rational person has written a long<br />

non-fiction book—say, on history. The body of the book, as is typical,<br />

contains a large number of assertions. The author is highly confident<br />

in each of these assertions; moreover, she has no hesitation in making<br />

them unqualifiedly, and would describe herself (and be described by others)<br />

as believing each of the book’s many claims. But she knows enough<br />

about the difficulties of historical scholarship to realize that it is almost<br />

inevitable that at least a few of the claims she makes in the book are mistaken.<br />

She modestly acknowledges this in her preface, by saying that<br />

she believes the book will be found to contain some errors, and she graciously<br />

invites those who discover the errors to set her straight. (Christensen,<br />

2005, 33-4)<br />

Christensen thinks such an author might be rational in every one of her beliefs, even<br />

though these are all inconsistent. Although he does not say this, nothing in his discussion<br />

suggests that he is using the irrelevance of some of the propositions in the<br />

author’s defence. So here is an argument that we should abandon closure amongst<br />

relevant beliefs.<br />

Christensen’s discussion, like other discussions of the preface paradox, makes frequent<br />

use of the fact that examples like these are quite common. We don’t have to go<br />

to fake barn country to find a counterexample to closure. But it seems to me that we<br />

need two quite strong idealisations in order to get a real counterexample here.<br />

The first of these is discussed in forthcoming work by Ishani Maitra (Maitra,<br />

2010), and is briefly mentioned by Christensen in setting out the problem. We only<br />

have a counterexample to closure if the author believes everything she writes in her<br />

book. (Indeed, we only have a counterexample if she reasonably believes every one<br />

of them. But we’ll assume a rational author who only believes what she ought to<br />

believe.) This seems unlikely to be true to me. An author of a historical book is like<br />

a detective who, when asked to put forward her best guess about what explains the<br />

evidence, says “If I had to guess, I’d say . . . ” and then launches into spelling out her<br />

hypothesis. It seems clear that she need not believe the truth of her hypothesis. If<br />

she did that, she could not later learn it was true, because you can’t learn the truth of<br />

something you already believe. And she wouldn’t put any effort into investigating alternative<br />

suspects. But she can come to learn her hypothesis was true, and it would be<br />

rational to investigate other suspects. It seems to me (following here Maitra’s discussion)<br />

that we should understand scholarly assertions as being governed by the same<br />

kind of rules that govern detectives making the kind of speech being contemplated<br />

here. And those rules don’t require that the speaker believe the things they say without<br />

qualification. The picture is that the little prelude the detective explicitly says is<br />

implicit in all scholarly work.<br />

There are three objections I know to this picture, none of them particularly conclusive.<br />

First, Christensen says that the author doesn’t qualify their assertions. But<br />

neither does our detective qualify most individual sentences. Second, Christensen<br />

says that most people would describe our author as believing her assertions. But it is<br />

also natural to describe our detective as believing the things she says in her speech. It’s



natural to say things like “She thinks it was the butler, with the lead pipe,” in reporting<br />

her hypothesis. Third, Timothy Williamson (2000a) has argued that if speakers<br />

don’t believe what they say, we won’t have an explanation of why Moore’s paradoxical<br />

sentences, like “The butler did it, but I don’t believe the butler did it,” are always<br />

defective. Whatever the explanation of the paradoxicality of these sentences might<br />

be, the alleged requirement that speakers believe what they say can’t be it. For our<br />

detective cannot properly say “The butler did it, but I don’t believe the butler did it”<br />

in setting out her hypothesis, even though believing the butler did it is not necessary<br />

for her to say “The butler did it” in setting out just that hypothesis.<br />

It is plausible that for some kinds of books, the author should only say things<br />

they believe. This is probably true for travel guides, for example. Interestingly, casual<br />

observation suggests that authors of such books are much less likely to write modest<br />

prefaces. This makes some sense if those books can only include statements their<br />

authors believe, and the authors believe the conjunctions of what they believe.<br />

The second idealisation is stressed by Simon Evnine in his paper “Believing Conjunctions”.<br />

The following situation does not involve me believing anything inconsistent.<br />

• I believe that what Manny just said, whatever it was, is false.<br />

• Manny just said that the stands at Fenway Park are green.<br />

• I believe that the stands at Fenway Park are green.<br />

If we read the first claim de dicto, that I believe that Manny just said something false,<br />

then there is no inconsistency. (Unless I also believe that what Manny just said was<br />

that the stands in Fenway Park are green.) But if we read it de re, that the thing Manny<br />

just said is one of the things I believe to be false, then the situation does involve me<br />

being inconsistent. The same is true when the author believes that one of the things<br />

she says in her book is mistaken. If we understand what she says de dicto, there is<br />

no contradiction in her beliefs. It has to be understood de re before we get a logical<br />

problem. And the fact is that most authors do not have de re attitudes towards the<br />

claims made in their book. Most authors don’t even remember everything that’s in<br />

their books. (I’m not sure I remember how this section started, let alone this paper.)<br />

Some may argue that authors don’t even have the capacity to consider a proposition as<br />

long and complicated as the conjunction of all the claims in their book. Christensen<br />

considers this objection, but says it isn’t a serious problem.<br />

It is undoubtedly true that ordinary humans cannot entertain book-length<br />

conjunctions. But surely, agents who do not share this fairly superficial<br />

limitation are easily conceived. And it seems just as wrong to say of such<br />

agents that they are rationally required to believe in the inerrancy of the<br />

books they write. (38: my emphasis)<br />

I’m not sure this is undoubtedly true; it isn’t clear that propositions (as opposed to<br />

their representations) have lengths. And humans can believe propositions that can be<br />

represented by sentences as long as books. But even without that point, Christensen<br />

is right that there is an idealisation here, since ordinary humans do not know exactly



what is in a given book, and hence don’t have de re attitudes towards the propositions<br />

expressed in the book.<br />

I’m actually rather suspicious of the intuition that Christensen is pushing here,<br />

that idealising in this way doesn’t change intuitions about the case. The preface paradox<br />

gets a lot of its (apparent) force from intuitions about what attitude we should<br />

have towards real books. Once we make it clear that the real life cases are not relevant<br />

to the paradox, I find the intuitions become rather murky. But I won’t press this<br />

point.<br />

A more important point is that we believers in closure don’t think that authors<br />

should think their books are inerrant. Rather, following Stalnaker (1984), we think<br />

that authors shouldn’t unqualifiedly believe the individual statements in their book if<br />

they don’t believe the conjunction of those statements. Rather, their attitude towards<br />

those propositions (or at least some of them) should be that they are probably true.<br />

(As Stalnaker puts it, they accept the story without believing it.) Proponents of the<br />

preface paradox know that this is a possible response, and tend to argue that it is<br />

impractical. Here is Christensen on this point.<br />

It is clear that our everyday binary way of talking about beliefs has immense<br />

practical advantages over a system which insisted on some more<br />

fine-grained reporting of degrees of confidence . . . At a minimum, talking<br />

about people as believing, disbelieving, or withholding belief has at<br />

least as much point as do many of the imprecise ways we have of talking<br />

about things that can be described more precisely. (96)<br />

Richard Foley makes a similar point.<br />

There are deep reasons for wanting an epistemology of beliefs, reasons<br />

that epistemologies of degrees of belief by their very nature cannot possibly<br />

accommodate. (Foley, 1993, 170, my emphasis)<br />

It’s easy to make too much of this point. It’s a lot easier to triage propositions into<br />

TRUE, FALSE and NOT SURE and work with those categories than it is to<br />

assign precise numerical probabilities to each proposition. But these are not the only<br />

options. Foley’s discussion subsequent to the above quote sometimes suggests they<br />

are, especially when he contrasts the triage with “indicat[ing] as accurately as I can<br />

my degree of confidence in each assertion that I defend.” (171) But really it isn’t much<br />

harder to add two more categories, PROBABLY TRUE and PROBABLY FALSE to<br />

those three, and work with that five-way division rather than a three-way division.<br />

It’s not clear that humans as they are actually constructed have a strong preference for<br />

the three-way over the five-way division, and even if they do, I’m not sure in what<br />

sense this is a ‘deep’ fact about them.<br />
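The five-way division can be pictured as a simple triage function. This is a minimal sketch under assumed cutoffs: the 0.95/0.55/0.45/0.05 boundaries are my illustrative choices, since the text fixes no numbers.<br />

```python
def triage(credence):
    """Five-way triage of a proposition by credence.

    The numerical cutoffs here are illustrative assumptions only.
    """
    if credence >= 0.95:
        return "TRUE"
    if credence >= 0.55:
        return "PROBABLY TRUE"
    if credence > 0.45:
        return "NOT SURE"
    if credence > 0.05:
        return "PROBABLY FALSE"
    return "FALSE"

# An author can put each individual claim in PROBABLY TRUE while putting
# the conjunction of all her claims in FALSE, with no closure violation:
print(triage(0.9))    # PROBABLY TRUE
print(triage(0.01))   # FALSE
```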

Once we have the five-way division, it is clear what authors should do if they<br />

want to respect closure. For any conjunction that they don’t believe (i.e. classify<br />

as true), they should not believe one of the conjuncts. But of course they can classify<br />

every conjunct as probably true, even if they think the conjunction is false, or<br />

even certainly false. Still, might it not be considered something of an idealisation to



say rational authors must make this five-way distinction amongst propositions they<br />

consider? Yes, but it’s no more of an idealisation than we need to set up the preface<br />

paradox in the first place. To use the preface paradox to find an example of someone<br />

who reasonably violates closure, we need to insist on the following three constraints.<br />

a) They are part of a research community where only asserting propositions you<br />

believe is compatible with active scholarship;<br />

b) They know exactly what is in their book, so they are able to believe that one<br />

of the propositions in the book is mistaken, where this is understood de re; but<br />

c) They are unable to effectively function if they have to effect a five-way, rather<br />

than a three-way, division amongst the propositions they consider.<br />

Put more graphically, to motivate the preface paradox we have to think that our inability<br />

to have de re thoughts about the contents of books is a “superficial constraint”,<br />

but our preference for working with a three-way rather than a five-way division is a<br />

“deep” fact about our cognitive system. Maybe each of these attitudes could be plausible<br />

taken on its own (though I’m sceptical of that) but the conjunction seems just<br />

absurd.<br />

I’m not entirely sure an agent subject to exactly these constraints is even fully<br />

conceivable. (Such an agent is negatively conceivable, in David Chalmers’s terminology,<br />

but I rather doubt they are positively conceivable.) But even if they are a genuine<br />

possibility, why the norms applicable to an agent satisfying that very gerrymandered<br />

set of constraints should be considered relevant norms for our state is far from clear.<br />

I’d go so far as to say it’s clear that the applicability (or otherwise) of a given norm to<br />

such an odd agent is no reason whatsoever to say it applies to us. But since the preface<br />

paradox only provides a reason for just these kinds of agents to violate closure, we<br />

have no reason for ordinary humans to violate closure. So I see no reason here to say<br />

that we can have probabilistic coherence without logical coherence, as proponents of<br />

the threshold view insist we can have, but which I say we can’t have at least when the<br />

propositions involved are salient. The more pressing question, given the failure of the<br />

preface paradox argument, is why I don’t endorse a much stronger closure principle,<br />

one that drops the restriction to salient propositions. The next section will discuss<br />

that point.<br />

I’ve used Christensen’s book as a stalking horse in this section, because it is the<br />

clearest and best statement of the preface paradox. Since Christensen is a paradox-mongerer<br />

and I’m a paradox-denier, it might be thought we have a deep disagreement<br />

about the relevant epistemological issues. But actually I think our overall views are<br />

fairly close despite this. I favour an epistemological outlook I call “Probability First”,<br />

the view that getting the epistemology of partial belief right is of the first importance,<br />

and everything else should flow from that. Christensen’s view, reduced to a slogan,<br />

is “Probability First and Last”. This section has been basically about the difference<br />

between those two slogans. It’s an important dispute, but it’s worth bearing in mind<br />

that it’s a factional squabble within the Probability Party, not an outbreak of partisan<br />

warfare.



5 Too Little Closure?<br />

In the previous section I defended the view that a coherent agent has beliefs that are<br />

deductively cogent with respect to salient propositions. Here I want to defend the<br />

importance of the qualification. Let’s start with what I take to be the most important<br />

argument for closure, the passage from Stalnaker’s Inquiry that I quoted above.<br />

Reasoning in this way from accepted premises to their deductive consequences<br />

(P, also Q, therefore R) does seem perfectly straightforward.<br />

Someone may object to one of the premises, or to the validity of the argument,<br />

but one could not intelligibly agree that the premises are each<br />

acceptable and the argument valid, while objecting to the acceptability<br />

of the conclusion. (Stalnaker, 1984, 92)<br />

Stalnaker’s wording here is typically careful. The relevant question isn’t whether we<br />

can accept p, accept q, accept p and q entail r , and reject r . As Christensen (2005,<br />

Ch. 4) notes, this is impossible even on the threshold view, as long as the threshold<br />

is above 2/3. The real question is whether we can accept p, accept q, accept p and<br />

q entail r , and fail to accept r . And this is always a live possibility on any threshold<br />

view, though it seems absurd at first that this could be coherent.<br />
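Christensen’s observation can be made concrete with a little arithmetic. If p and q jointly entail r, then ¬r is contained in ¬p ∪ ¬q, so Pr(¬r) ≤ Pr(¬p) + Pr(¬q). A sketch, with an illustrative threshold of 0.7:<br />

```python
t = 0.7                 # belief threshold, above 2/3
pr_p, pr_q = 0.7, 0.7   # p and q are each just barely accepted

# Since p and q jointly entail r, ¬r ⊆ ¬p ∪ ¬q, so Pr(¬r) is at most:
max_pr_not_r = (1 - pr_p) + (1 - pr_q)   # 0.6 = 2(1 - t), which is below t
min_pr_r = 1 - max_pr_not_r              # 0.4

# Rejecting r (believing ¬r) would need Pr(¬r) > t — impossible here:
print(max_pr_not_r > t)   # False
# But accepting r needs Pr(r) > t, and Pr(r) can be as low as 0.4:
print(min_pr_r > t)       # False
```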

But it’s important to note how active the verbs in Stalnaker’s description are.<br />

When faced with a valid argument we have to object to one of the premises, or the<br />

validity of the argument. What we can’t do is agree to the premises and the validity<br />

of the argument, while objecting to the conclusion. I agree. If we are really agreeing<br />

to some propositions, and objecting to others, then all those propositions are salient.<br />

And in that case closure, deductive coherence, is mandatory. This doesn’t tell us what<br />

we have to do if we haven’t previously made the propositions salient in the first place.<br />

The position I endorse here is very similar in its conclusions to that endorsed by<br />

Gilbert Harman in Change in View. There Harman endorses the following principle.<br />

(At least he endorses it as true – he doesn’t seem to think it is particularly explanatory<br />

because it is a special case of a more general interesting principle.)<br />

Recognized Logical Implication Principle One has reason to believe P if one recognizes<br />

that P is logically implied by one’s view. (Harman, 1986, 17)<br />

This seems right to me, both what it says and its implicature that the reason in question<br />

is not a conclusive reason. My main objection to those who use the preface<br />

paradox to argue against closure is that they give us a mistaken picture of what we<br />

have to do epistemically. When I have inconsistent beliefs, or I don’t believe some<br />

consequence of my beliefs, that is something I have a reason to deal with at some<br />

stage, something I have to do. When we say that we have things to do, we don’t<br />

mean that we have to do them right now, or instead of everything else. My current<br />

list of things to do includes cleaning my bathroom, yet here I am writing this paper,<br />

and (given the relevant deadlines) rightly so. We can have the job of cleaning up our<br />

epistemic house as something to do while recognising that we can quite rightly do<br />

other things first. But it’s a serious mistake to infer from the permissibility of doing



other things that cleaning up our epistemic house (or our bathroom) isn’t something<br />

to be done. The bathroom won’t clean itself after all, and eventually this becomes a<br />

problem.<br />

There is a possible complication when it comes to tasks that are very low priority.<br />

My attic is to be cleaned, or at least it could be cleaner, but there are no imaginable<br />

circumstances under which something else wouldn’t be higher priority. Given that,<br />

should we really leave “clean the attic” on the list of things to be done? Similarly, there<br />

might be implications I haven’t followed through that it couldn’t possibly be worth<br />

my time to sort out. Are they things to be done? I think it’s worthwhile recording<br />

them as such, because otherwise we might miss opportunities to deal with them in the<br />

process of doing something else. I don’t need to put off anything else in order to clean<br />

the attic, but if I’m up there for independent reasons I should bring down some of<br />

the garbage. Similarly, I don’t need to follow through implications mostly irrelevant<br />

to my interests, but if those propositions come up for independent reasons, I should<br />

deal with the fact that some things I believe imply something I don’t believe. Having<br />

it be the case that all implications from things we believe to things we don’t believe<br />

constitute jobs to do (possibly in the loose sense that cleaning my attic is something<br />

to do) has the right implications for what epistemic duties we do and don’t have.<br />

While waxing metaphorical, it seems time to pull out a rather helpful metaphor<br />

that Gilbert Ryle develops in The Concept of Mind at a point where he’s covering what<br />

we’d now call the inference/implication distinction. (This is a large theme of chapter<br />

9, see particularly pages 292-309.) Ryle’s point in these passages, as it frequently is<br />

throughout the book, is to stress that minds are fundamentally active, and the activity<br />

of a mind cannot be easily recovered from its end state. Although Ryle doesn’t use<br />

this language, his point is that we shouldn’t confuse the difficult activity of drawing<br />

inferences with the smoothness and precision of a logical implication. The language<br />

Ryle does use is more picturesque. He contrasts the easy work a farmer does when<br />

sauntering down a path with the hard work he did when building the path. A good<br />

argument, in philosophy or mathematics or elsewhere, is like a well made path that<br />

permits sauntering from the start to finish without undue strain. But from that it<br />

doesn’t follow that the task of coming up with that argument, of building that path<br />

in Ryle’s metaphor, was easy work. The easiest paths to walk are often the hardest<br />

to build. Path-building, smoothing out our beliefs so they are consistent and<br />

closed under implication, is hard work, even when the finished results look clean and<br />

straightforward. It’s work that we shouldn’t do unless we need to. But making sure<br />

our beliefs are closed under entailment even with respect to irrelevant propositions is<br />

suspiciously like the activity of building paths between points without first checking<br />

you need to walk between them.<br />

For a less metaphorical reason for doubting the wisdom of this unchecked commitment<br />

to closure, we might notice the difficulties that theorists who endorse it tend to get<br />

into. Consider, for example, the view put forward by Mark Kaplan in Decision<br />

Theory as Philosophy. Here is his definition of belief.<br />

You count as believing P just if, were your sole aim to assert the truth (as<br />

it pertains to P), and your only options were to assert that P, assert that<br />



¬P or make neither assertion, you would prefer to assert that P. (109)<br />

Kaplan notes that conditional definitions like this are prone to Shope’s conditional<br />

fallacy. If my sole aim were to assert the truth, I might have different beliefs to what I<br />

now have. He addresses one version of this objection (namely that it appears to imply<br />

that everyone believes their sole desire is to assert the truth) but as we’ll see presently<br />

he can’t avoid all versions of it.<br />

These arguments are making me thirsty. I’d like a beer. Or at least I think I<br />

would. But wait! On Kaplan’s theory I can’t think that I’d like a beer, for if my sole<br />

aim were to assert the truth as it pertains to my beer-desires, I wouldn’t have beer<br />

desires. And then I’d prefer to assert that I wouldn’t like a beer, I’d merely like to<br />

assert the truth as it pertains to my beer desires.<br />

Even bracketing this concern, Kaplan ends up being committed to the view that<br />

I can (coherently!) believe that p even while regarding p as highly improbable. This<br />

looks like a refutation of the view to me, but Kaplan accepts it with some equanimity.<br />

He has two primary reasons for saying we should live with this. First, he says that it<br />

only looks like an absurd consequence if we are committed to the Threshold View.<br />

To this all I can say is that I don’t believe the Threshold View, but it still seems absurd<br />

to me. Second, he says that any view is going to have to be revisionary to some extent,<br />

because our ordinary concept of belief is not “coherent” (142). His view is that,<br />

“Our ordinary notion of belief both construes belief as a state of confidence short of<br />

certainty and takes consistency of belief to be something that is at least possible and,<br />

perhaps, even desirable” and this is impossible. I think the view here interprets belief<br />

as a state less than confidence and allows for as much consistency as the folk view<br />

does (i.e. consistency amongst salient propositions), so this defence is unsuccessful as<br />

well.<br />

None of the arguments here in favour of our restrictions on closure are completely<br />

conclusive. In part the argument at this stage rests on the lack of a plausible<br />

rival theory that doesn’t interpret belief as certainty but implements a stronger closure<br />

principle. It’s possible that tomorrow someone will come up with a theory that<br />

does just this. Until then, we’ll stick with the account here, and see what its epistemological<br />

implications might be.<br />

6 Examples of Pragmatic Encroachment<br />

Fantl and McGrath’s case for pragmatic encroachment starts with cases like the following.<br />

(The following case is not quite theirs, but is similar enough to suit their<br />

plan, and easier to explain in my framework.)<br />

Local and Express<br />

There are two kinds of trains that run from the city to the suburbs: the<br />

local, which stops at all stations, and the express, which skips the first<br />

eight stations. Harry and Louise want to go to the fifth station, so they<br />

shouldn’t catch the Express. Though if they do it isn’t too hard to catch<br />

a local back the other way, so it isn’t usually a large cost. Unfortunately,



the trains are not always clearly labelled. They see a particular train<br />

about to leave. If it’s a local they are better off catching it, if it is an<br />

express they should wait for the next local, which they can see is already<br />

boarding passengers and will leave in a few minutes. While running towards<br />

the train, they hear a fellow passenger say “It’s a local.” This gives<br />

them good, but far from overwhelming, reason to believe that the train<br />

is a local. Passengers get this kind of thing wrong fairly frequently, but<br />

they don’t have time to get more information. So each of them faces a<br />

gamble, which they can take by getting on the train. If the train is a local,<br />

they will get home a few minutes early. If it is an express they will<br />

get home a few minutes later. For Louise, this is a low stakes gamble, as<br />

nothing much turns on whether she is a few minutes early or late, but she<br />

does have a weak preference for arriving earlier rather than later. But for<br />

Harry it is a high stakes gamble, because if he is late he won’t make the<br />

start of his daughter’s soccer game, which will greatly upset her. There is<br />

no large payoff for Harry arriving early.<br />

What should each of them do? What should each of them believe?<br />

The first question is relatively easy. Louise should catch the train, and Harry<br />

should wait for the next. For each of them that’s the utility maximising thing to<br />

do. The second one is harder. Fantl and McGrath suggest that, despite being in the<br />

same epistemic position with respect to everything except their interests, Louise is<br />

justified in believing the train is a local and Harry is not. I agree. (If you don’t think<br />

the particular case fits this pattern, feel free to modify it so the difference in interests<br />

grounds a difference in what they are justified in believing.) Does this show that our<br />

notion of epistemic justification has to be pragmatically sensitive? I’ll argue that it<br />

does not.<br />

The fundamental assumption I’m making is that what is primarily subject to epistemic<br />

evaluation are degrees of belief, or what are more commonly called states of<br />

confidence in ordinary language. When we think about things this way, we see that<br />

Louise and Harry are justified in adopting the very same degrees of belief. Both of them<br />

should be confident, but not absolutely certain, that the train is a local. We don’t<br />

have even the appearance of a counterexample to Probabilistic Evidentialism here. If<br />

we like putting this in numerical terms, we could say that each of them is justified<br />

in assigning a probability of around 0.9 to the proposition That train is a local. 4 So<br />

as long as we adopt a Probability First epistemology, where we in the first instance<br />

evaluate the probabilities that agents assign to propositions, Harry and Louise are<br />

evaluated alike iff they do the same thing.<br />

How then can we say that Louise alone is justified in believing that the train is<br />

a local? Because that state of confidence they are justified in adopting, the state of<br />

4 I think putting things numerically is misleading because it suggests that the kind of bets we usually<br />

use to measure degrees of belief are open, salient options for Louise and Harry. But if those bets were<br />

open and salient, they wouldn’t believe the train is a local. Using qualitative rather than quantitative<br />

language to describe them is just as accurate, and doesn’t have misleading implications about their practical<br />

environment.


Can We Do Without Pragmatic Encroachment? 109<br />

being fairly confident but not absolutely certain that the train is a local, counts as<br />

believing that the train is a local given Louise’s context but not Harry’s context.<br />

Once Louise hears the other passenger’s comment, conditionalising on That’s a local<br />

doesn’t change any of her preferences over open, salient actions, including such ‘actions’<br />

as believing or disbelieving propositions. But conditional on the train being a<br />

local, Harry prefers catching the train, which he actually does not prefer.<br />
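The test just described can be sketched directly: a degree of confidence amounts to belief only if conditionalising on the proposition (sending the credence to 1) leaves the agent's preferences over the open, salient actions unchanged. The payoffs are the same illustrative assumptions as before; only the catch-versus-wait choice is treated as salient.

```python
cr_local = 0.9  # shared credence in "the train is a local" (assumed figure)

def prefers_catching(credence, gain_if_local, loss_if_express):
    """True iff catching the train beats waiting (utility 0) at this credence."""
    return credence * gain_if_local + (1 - credence) * loss_if_express > 0

def amounts_to_belief(gain_if_local, loss_if_express):
    """Conditionalising on the proposition must not flip the preference
    over the salient actions; here that is just catch vs wait."""
    return (prefers_catching(cr_local, gain_if_local, loss_if_express)
            == prefers_catching(1.0, gain_if_local, loss_if_express))

print(amounts_to_belief(1, -1))    # Louise: True  -- her 0.9 credence is belief
print(amounts_to_belief(1, -100))  # Harry: False -- same credence, no belief
```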

In cases like this, interests matter not because they affect the degree of confidence<br />

that an agent can reasonably have in a proposition’s truth. (That is, not because they<br />

matter to epistemology.) Rather, interests matter because they affect whether those<br />

reasonable degrees of confidence amount to belief. (That is, because they matter to<br />

philosophy of mind.) There is no reason here to let pragmatic concerns into epistemology.<br />

7 Justification and Practical Reasoning<br />

The discussion in the last section obviously didn’t show that there is no encroachment<br />

of pragmatics into epistemology. There are, in particular, two kinds of concerns<br />

one might have about the prospects for extending my style of argument to block all<br />

attempts at pragmatic encroachment. The biggest concern is that it might turn out<br />

to be impossible to defend a Probability First epistemology, particularly if we do not<br />

allow ourselves pragmatic concerns. For instance, it is crucial to this project that we<br />

have a notion of evidence that is not defined in terms of traditional epistemic concepts<br />

(e.g. as knowledge), or in terms of interests. This is an enormous project, and<br />

I’m not going to attempt to tackle it here. The second concern is that we won’t be<br />

able to generalise the discussion of that example to explain the plausibility of (JP)<br />

without conceding something to the defenders of pragmatic encroachment.<br />

(JP) If S justifiably believes that p, then S is justified in using p as a premise in<br />

practical reasoning.<br />

And that’s what we will look at in this section. To start, we need to clarify exactly<br />

what (JP) means. Much of this discussion will be indebted to Fantl and McGrath’s<br />

discussion of various ways of making (JP) more precise. To see some of the complications<br />

at issue, consider a simple case of a bet on a reasonably well established<br />

historical proposition. The agent has a lot of evidence that supports p, and is offered<br />

a bet that returns $1 if p is true, and loses $500 if p is false. Since her evidence doesn’t<br />

support that much confidence in p, she properly declines the bet. One might try to<br />

reason intuitively as follows. Assume that she justifiably believed that p. Then she’d<br />

be in a position to make the following argument.<br />

p<br />

If p, then I should take the bet<br />

So, I should take the bet



Since she isn’t in a position to draw the conclusion, she must not be in a position to<br />

endorse both of the premises. Hence (arguably) she isn’t justified in believing that p.<br />

But we have to be careful here. If we assume also that p is true (as Fantl and McGrath<br />

do, because they are mostly concerned with knowledge rather than justified belief),<br />

then the second premise is clearly false, since it is a conditional with a true antecedent<br />

and a false consequent. So the fact that she can’t draw the conclusion of this argument<br />

only shows that she can’t endorse both of the premises, and that’s not surprising since<br />

one of the premises is most likely false. (I’m not assuming here that the conditional<br />

is true iff it has a false antecedent or a true consequent, just that it is only true if it<br />

has a false antecedent or a true consequent.)<br />
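The verdict that she properly declines is itself just an expected-value fact. With a credence of around 0.9 in p (an assumed figure; the text says only that p is well but not conclusively supported), the calculation runs:

```python
cr_p = 0.9  # credence in p (assumed: high, but short of certainty)

# The offered bet: win $1 if p is true, lose $500 if p is false.
ev_bet = cr_p * 1 + (1 - cr_p) * (-500)
print(ev_bet)  # about -49, so declining is clearly right

# Taking the bet only has positive expected value once the credence in p
# exceeds 500/501, i.e. roughly 0.998.
threshold = 500 / 501
```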

In order to get around this problem, Fantl and McGrath suggest a few other ways<br />

that our agent might reason to the bet. They suggest each of the following principles.<br />

S knows that p only if, for any act A, if S knows that if p, then A is the<br />

best thing she can do, then S is rational to do A. (72)<br />

S knows that p only if, for any states of affairs A and B, if S knows that<br />

if p, then A is better for her than B, then S is rational to prefer A to B.<br />

(74)<br />

(PC) S is justified in believing that p only if S is rational to prefer as if p.<br />

(77)<br />

Hawthorne (2004b, 174-181) appears to endorse the second of these principles. He<br />

considers an agent who endorses the following implication concerning a proposed<br />

sale of a lottery ticket for a cent, which is well below its actuarially fair value.<br />

I will lose the lottery.<br />

If I keep the ticket, I will get nothing.<br />

If I sell the ticket, I will get a cent.<br />

So I ought to sell the ticket. (174)<br />

(To make this fully explicit, it helps to add the tacit premise that a cent is better than<br />

nothing.) Hawthorne says that this is intuitively a bad argument, and concludes that<br />

the agent who attempts to use it is not in a position to know its first premise. But<br />

that conclusion only follows if we assume that the argument form is acceptable. So it<br />

is plausible to conclude that he endorses Fantl and McGrath’s second principle.<br />

The interesting question here is whether the theory endorsed in this paper can<br />

validate the true principles that Fantl and McGrath articulate. (Or, more precisely,<br />

whether we can validate the equivalent true principles concerning justified belief, since knowledge<br />

is outside the scope of the paper.) I’ll argue that it can in the following way. First,<br />

I’ll just note that given the fact that the theory here implies the closure principles we<br />

outlined in section 5, we can easily enough endorse Fantl and McGrath’s first two<br />

principles. This is good, since they seem true. The longer part of the argument involves<br />

arguing that their principle (PC), which doesn’t hold on the theory endorsed<br />

here, is in fact incorrect.



One might worry that the qualifications on the closure principles in section 5<br />

mean that we can’t fully endorse the principles Fantl and McGrath endorse. In particular,<br />

it might be worried that there could be an agent who believes that p, believes<br />

that if p, then A is better than B, but doesn’t put these two beliefs together to infer<br />

that A is better than B. This is certainly a possibility given the qualifications listed<br />

above. But note that in this position, if those two beliefs were justified, the agent<br />

would certainly be rational to conclude that A is better than B, and hence rational to<br />

prefer A to B. So the constraints on the closure principles don’t affect our ability to<br />

endorse these two principles.<br />

The real issue is (PC). Fantl and McGrath offer a lot of cases where (PC) holds,<br />

as well as arguing that it is plausibly true given the role of implications in practical<br />

reasoning. What’s at issue is that (PC) is stronger than a deductive closure principle.<br />

It is, in effect, equivalent to endorsing the following schema as a valid principle of<br />

implication.<br />

p<br />

Given p, A is preferable to B<br />

So, A is preferable to B<br />

I call this Practical Modus Ponens, or PMP. The middle premise in PMP is not a<br />

conditional. It is not to be read as If p, then A is preferable to B. Conditional valuations<br />

are not conditionals. To see this, again consider the proposed bet on (true) p at<br />

exorbitant odds, where A is the act of taking the bet, and B the act of declining the<br />

bet. It’s true that given p, A is preferable to B. But it’s not true that if p, then A<br />

is preferable to B. Even if we restrict our attention to cases where the preferences<br />

in question are perfectly valid, this is a case where PMP is invalid. Both premises<br />

are true, and the conclusion is false. It might nevertheless be true that whenever<br />

an agent is justified in believing both of the premises, she is justified in believing<br />

the conclusion. To argue against this, we need a very complicated case, involving<br />

embedded bets and three separate agents, Quentin, Robby and Thom. All of them<br />

have received the same evidence, and all of them are faced with the same complex bet,<br />

with the following properties.<br />

• p is an historical proposition that is well (but not conclusively) supported by<br />

their evidence, and happens to be true. All the agents have a high credence in<br />

p, which is exactly what the evidence supports.<br />

• The bet A, which they are offered, wins if p is true, and loses if p is false.<br />

• If they win the bet, the prize is the bet B.<br />

• s is also an historical proposition, but the evidence tells equally for and against<br />

it. All the agents regard s as being about as likely as not. Moreover, s turns out<br />

to be false.<br />

• The bet B is worth $2 if s is true, and worth -$1 if s is false. Although it is<br />

actually a losing bet, the agents all rationally value it at around 50 cents.<br />

• How much A costs is determined by which proposition from the partition<br />

{q, r, t} is true.



• If q is true, A costs $2<br />

• If r is true, A costs $500<br />

• If t is true, A costs $1<br />

• The evidence the agents have strongly supports r, though t is in fact true<br />

• Quentin believes q<br />

• Robby believes r<br />

• Thom believes t<br />

All of the agents make the utility calculations that their beliefs support, so Quentin<br />

and Thom take the bet and lose a dollar, while Robby declines it. Although Robby<br />

has a lot of evidence in favour of p, he correctly decides that it would be unwise to<br />

bet on p at effective odds of 1000 to 1 against. I’ll now argue that both Quentin and<br />

Thom are potential counterexamples to (PC). There are three possibilities for what<br />

we can say about those two.<br />
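To make the case concrete, here is one way of running the numbers. The credences of 0.9 in p and 0.5 in s are assumptions (the text says only "high" and "about as likely as not"), and I read "A costs $X" as the stake forfeited if A loses, which is the reading that makes Robby's bet one on p at effective odds of 1000 to 1 against a prize valued at about 50 cents.

```python
cr_p, cr_s = 0.9, 0.5  # assumed credences in p and s

# The prize bet B pays $2 if s, -$1 if not-s: valued at about 50 cents.
ev_B = cr_s * 2 + (1 - cr_s) * (-1)

def ev_take_A(stake):
    """Expected value of taking A: win the prize B with probability cr_p,
    forfeit the stake (fixed by which of q, r, t holds) otherwise."""
    return cr_p * ev_B - (1 - cr_p) * stake

print(ev_take_A(500))  # rational credences favour r: about -50, so decline
print(ev_take_A(2))    # Quentin's q-based calculation: positive, so he bets
print(ev_take_A(1))    # Thom's t-based calculation: positive, so he bets
```

On these figures Robby's decision is the one that maximises rational expected utility, while Quentin's and Thom's calculations come out positive only because of their irrational beliefs about the partition.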

First, we could say that they are justified in believing p, and rational to take the<br />

bet. The problem with this position is that if they had rational beliefs about the<br />

partition {q, r, t} they would realise that taking the bet does not maximise expected<br />

utility. If we take rational decisions to be those that maximise expected utility given<br />

a rational response to the evidence, then the decisions are clearly not rational.<br />

Second, we could say that although Quentin and Thom are not rational in accepting<br />

the bet, nor are they justified in believing that p. This doesn’t seem particularly<br />

plausible for several reasons. The irrationality in their belief systems concerns<br />

whether q, r or t is true, not whether p is true. If Thom suddenly got a lot of evidence<br />

that t is true, then all of his (salient) beliefs would be well supported by the<br />

evidence. But it is bizarre to think that whether his belief in p is rational turns on<br />

how much evidence he has for t. Finally, even if we accept that agents in higher stakes<br />

situations need more evidence to have justified beliefs, the fact is that the agents are<br />

in a low-risk situation, since t is actually true, so the most they could lose is $1.<br />

So it seems like the natural thing to say is that Quentin and Thom are justified<br />

in believing that p, and are justified in believing that given p, it maximises expected<br />

utility to take the bet, but they are not rational to take the bet. (At least, they are<br />

counterexamples to (PC) in the version of the story where, in deciding whether to take<br />

the bet, they are thinking about which of q, r and t is correct given their evidence.)<br />

Against this, one might respond that if belief in p is justified, there are<br />

arguments one might make to the conclusion that the bet should be taken. So it<br />

is inconsistent to say that the belief is justified, but the decision to take the bet is<br />

not rational. The problem is finding a premise that goes along with p to get the<br />

conclusion that taking the bet is rational. Let’s look at some of the premises the<br />

agent might use.<br />

• If p, then the best thing to do is to take the bet.<br />

This isn’t true ( p is true, but the best thing to do isn’t to take the bet). More importantly,<br />

the agents think this is only true if s is true, and they think s is a 50/50<br />

proposition. So they don’t believe this premise, and it would not be rational to believe<br />

it.



• If p, then probably the best thing to do is to take the bet.<br />

Again this isn’t true, and it isn’t well supported, and it doesn’t even support the<br />

conclusion, for it doesn’t follow from the fact that x is probably the best thing to do<br />

that x should be done.<br />

• If p, then taking the bet maximises rational expected utility.<br />

This isn’t true – it is a conditional with a true antecedent and a false consequent.<br />

Moreover, if Quentin and Thom were rational, like Robby, they would recognise<br />

this.<br />

• If p, then taking the bet maximises expected utility relative to their beliefs.<br />

This is true, and even reasonable to believe, but it doesn’t imply that they should<br />

take the bet. It doesn’t follow from the fact that doing something maximises expected<br />

utility relative to my crazy beliefs that I should do that thing.<br />

• Given p, taking the bet maximises rational expected utility.<br />

This is true, and even reasonable to believe, but it isn’t clear that it supports the<br />

conclusion that the agents should take the bet. The implication appealed to here is<br />

PMP, and in this context that’s close enough to equivalent to (PC). If we think that<br />

this case is a prima facie problem for (PC), as I think is intuitively plausible, then we<br />

can’t use (PC) to show that it doesn’t pose a problem. We could obviously continue<br />

for a while, but it should be clear it will be very hard to find a way to justify taking<br />

the bet even spotting the agents p as a premise they can use in rational deliberation.<br />

So it seems to me that (PC) is not in general true, which is good because, as we’ll see<br />

in cases like this one, the theory outlined here does not support it.<br />

The theory we have been working with says that belief that p is justified iff the<br />

agent’s degree of belief in p is sufficient to amount to belief in their context, and<br />

they are justified in believing p to that degree. Since by hypothesis Quentin and<br />

Thom are justified in believing p to the degree that they do, the only question left is<br />

whether this amounts to belief. This turns out not to be settled by the details of the<br />

case as yet specified. At first glance, assuming there are no other relevant decisions, we<br />

might think they believe that p because (a) they prefer (in the relevant sense) believing<br />

p to not believing p, and (b) conditionalising on p doesn’t change their attitude<br />

towards the bet. (They prefer taking the bet to declining it, both unconditionally<br />

and conditional on p.)<br />

But that isn’t all there is to the definition of belief tout court. We must also ask<br />

whether conditionalising on p changes any preferences conditional on any active<br />

proposition. And that may well be true. Conditional on r, Quentin and Thom<br />

prefer not taking the bet to taking it. But conditional on r and p, they prefer taking<br />

the bet to not taking it. So if r is an active proposition, they don’t believe that p.<br />

If r is not active, they do believe it. In more colloquial terms, if they are concerned<br />

about the possible truth of r (if it is salient, or at least not taken for granted to be



false) then p becomes a potentially high-stakes proposition, so they don’t believe<br />

it without extraordinary evidence (which they don’t have). Hence they are only a<br />

counterexample to (PC) if r is not active. But if r is not active, our theory predicts<br />

that they are a counterexample to (PC), which is what we argued above is intuitively<br />

correct.<br />

Still, the importance of r suggests a way of saving (PC). Above I relied on the<br />

position that if Quentin and Thom are not maximising rational expected utility, then<br />

they are being irrational. This is perhaps too harsh. There is a position we could take,<br />

derived from some suggestions made by Gilbert Harman in Change in View, that an<br />

agent can rationally rely on their beliefs, even if those beliefs were not rationally<br />

formed, if they cannot be expected to have kept track of the evidence they used to<br />

form that belief. If we adopt this view, then we might be able to say that (PC) is<br />

compatible with the correct normative judgments about this case.<br />

To make this compatibility explicit, let’s adjust the case so Quentin takes q for<br />

granted, and cannot be reasonably expected to have remembered the evidence for q.<br />

Thom, on the other hand, forms the belief that t rather than r is true in the course<br />

of thinking through his evidence that bears on the rationality of taking or declining<br />

the bet. (In more familiar terms, t is part of the inference Thom uses in coming to<br />

conclude that he should take the bet, though it is not part of the final implication<br />

he endorses whose conclusion is that he should take the bet.) Neither Quentin nor<br />

Thom is a counterexample to (PC) thus understood. (That is, with the notion of<br />

rationality in (PC) understood as Harman suggests that it should be.) Quentin is<br />

not a counterexample, because he is rational in taking the bet. And Thom is not a<br />

counterexample, because in his context, where r is active, his credence in p does not<br />

amount to belief in p, so he is not justified in believing p.<br />

We have now two readings of (PC). On the strict reading, where a rational choice<br />

is one that maximises rational expected utility, the principle is subject to counterexample,<br />

and seems generally to be implausible. On the loose reading, where we allow<br />

agents to rely on beliefs formed irrationally in the past in rational decision making,<br />

(PC) is plausible. Happily, the theory sketched here agrees with (PC) on the plausible<br />

loose reading, but not on the implausible strict reading. In the previous section<br />

I argued that the theory also accounts for intuitions about particular cases like Local<br />

and Express. And now we’ve seen that the theory accounts for our considered opinions<br />

about which principles connecting justified belief to rational decision making<br />

we should endorse. So it seems at this stage that we can account for the intuitions<br />

behind the pragmatic encroachment view while keeping a concept of probabilistic<br />

epistemic justification that is free of pragmatic considerations.<br />

8 Conclusions<br />

Given a pragmatic account of belief, we don’t need to have a pragmatic account of<br />

justification in order to explain the intuitions that whether S justifiably believes that<br />

p might depend on pragmatic factors. My focus here has been on sketching a theory<br />

of belief on which it is the belief part of the concept of a justified belief which is<br />

pragmatically sensitive. I haven’t said much about why we should prefer to take that



option rather than say that the notion of epistemic justification is a pragmatic notion. I’ve<br />

mainly been aiming to show that a particular position is an open possibility, namely<br />

that we can accept that whether a particular agent is justified in believing p can be<br />

sensitive to their practical environment without thinking that the primary epistemic<br />

concepts are themselves pragmatically sensitive.


Knowledge, Bets and Interests<br />

1 Knowledge in Decision Making<br />

When you pick up a volume like this one, which describes itself as being about<br />

‘knowledge ascriptions’, you probably expect to find it full of papers on epistemology,<br />

broadly construed. And you’d probably expect many of those papers to concern<br />

themselves with cases where the interests of various parties (ascribers, subjects of the<br />

ascriptions, etc.) change radically, and this affects the truth values of various ascriptions.<br />

And, at least in this paper, your expectations will be clearly met.<br />

But here’s an interesting contrast. If you’d picked up a volume of papers on ‘belief<br />

ascriptions’, you’d expect to find a radically different menu of writers and subjects.<br />

You’d expect to find a lot of concern about names and demonstratives, and about<br />

how they can be used by people not entirely certain about their denotation. More<br />

generally, you’d expect to find less epistemology, and much more mind and language.<br />

I haven’t read all the companion papers to mine in this volume, but I bet you won’t<br />

find much of that here.<br />

This is perhaps unfortunate, since belief ascriptions and knowledge ascriptions<br />

raise at least some similar issues. Consider a kind of contextualism about belief ascriptions,<br />

which holds that (L) can be truly uttered in some contexts, but not in<br />

others, depending on just what aspects of Lois Lane’s psychology are relevant in the<br />

conversation. 1<br />

(L) Lois Lane believes that Clark Kent is vulnerable to kryptonite.<br />

We could imagine a theorist who says that whether (L) can be uttered truly depends<br />

on whether it matters to the conversation that Lois Lane might not recognise Clark<br />

Kent when he’s wearing his Superman uniform. And, this theorist might continue,<br />

this isn’t because ‘Clark Kent’ is a context-sensitive expression; it is rather because<br />

‘believes’ is context-sensitive. Such a theorist will also, presumably, say that whether<br />

(K) can be uttered truly is context-sensitive.<br />

(K) Lois Lane knows that Clark Kent is vulnerable to kryptonite.<br />

And so, our theorist is a kind of contextualist about knowledge ascriptions. But<br />

they might agree with approximately none of the motivations for contextualism<br />

about knowledge ascriptions put forward by Cohen (1988), DeRose (1995) or Lewis<br />

(1996b). Rather, they are a contextualist about knowledge ascriptions solely because<br />

they are contextualist about belief ascriptions like (L).<br />

Call the position I’ve just described doxastic contextualism about knowledge<br />

ascriptions. It’s a kind of contextualism all right; it says that whether (K) can be<br />

† In progress. Intended for a volume on knowledge ascriptions, forthcoming from OUP in 2011/12.<br />

1 The reflections in the next few paragraphs are inspired by some comments by Stalnaker (2008a),<br />

though I don’t want to suggest the theory I’ll discuss is actually Stalnaker’s.



truly uttered is context sensitive, and not because of the context-sensitivity of any<br />

term in the ‘that’-clause. But it explains the contextualism solely in terms of the<br />

contextualism of belief ascriptions. The more familiar kind of contextualism about<br />

knowledge ascriptions we’ll call non-doxastic contextualism.<br />

We can make the same kind of division among interest-relative invariantist, or<br />

IRI, theories of knowledge ascriptions. Any kind of IRI will say that there are sentences<br />

of the form S knows that p whose truth depends on the interests, in some sense,<br />

of S. But we can divide IRI theories up the same way that we divide up contextualist<br />

theories.<br />

Doxastic IRI Knowledge ascriptions are interest-relative, but their interest-relativity<br />

traces solely to the interest-relativity of the corresponding belief ascriptions.<br />

Non-Doxastic IRI Knowledge ascriptions are interest-relative, and their interest-relativity<br />

goes beyond the interest-relativity of the corresponding belief ascriptions.<br />

In my (2005a), I tried to motivate Doxastic IRI. More precisely, I argued for Doxastic<br />

IRI about ascriptions of justified belief, and hinted that the same arguments would<br />

generalise to knowledge ascriptions. I now think those hints were mistaken, and want<br />

to defend Non-Doxastic IRI about knowledge ascriptions. 2 My change of heart has<br />

been prompted by cases like those Jason Stanley (2005) calls ‘Ignorant High Stakes’<br />

cases. 3 But to see why these cases matter, it will help to start with why I think some<br />

kind of IRI must be true. And that story starts with some reflections on the way we<br />

teach decision theory.<br />

1.1 The Structure of Decision Problems<br />

Professor Dec is teaching introductory decision theory to her undergraduate class.<br />

She is trying to introduce the notion of a dominant choice. So she introduces the following<br />

problem, with two states, S1 and S2, and two choices, C1 and C2, as is normal<br />

for introductory problems.<br />

          S1        S2<br />
C1      -$200     $1000<br />
C2      -$100     $1500<br />

2 Whether Doxastic or Non-Doxastic IRI is true about justified belief ascriptions turns on some tricky<br />

questions about what to say when a subject’s credences are nearly, but not exactly appropriate given her<br />

evidence. Space considerations prevent a full discussion of those cases here.<br />

3 I mean here the case of Coraline, to be discussed in section 3 below. Several people have remarked in<br />

conversation that Coraline doesn’t look to them like a case of Ignorant High Stakes. This isn’t surprising;<br />

Coraline is better described as being mistaken than ignorant, and she’s mistaken about odds not stakes.<br />

If they’re right, that probably means my argument for Non-Doxastic IRI is less like Stanley’s, and hence<br />

more original, than I think it is. So I don’t feel like pressing the point! But I do want to note that I thought<br />

the Coraline example was a variation on a theme Stanley originated.



She’s hoping that the students will see that C1 and C2 are bets, but C2 is clearly the<br />

better bet. If S1 is actual, then both bets lose, but C2 loses less money. If S2 is actual,<br />

then both bets win, but C2 wins more. So C2 is better. That analysis is clearly wrong<br />

if the state is causally dependent on the choice, and controversial if the states are<br />

evidentially dependent on the choices. But Professor Dec has not given any reason<br />

for the students to think that the states are dependent on the choices in either way,<br />

and in fact the students don’t worry about that kind of dependence.<br />
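The dominance reasoning Professor Dec wants her students to run can be written out mechanically; this sketch just encodes the payoff table above.

```python
# Payoff table from Professor Dec's example: choice -> state -> payoff ($).
payoffs = {
    "C1": {"S1": -200, "S2": 1000},
    "C2": {"S1": -100, "S2": 1500},
}

def dominates(a, b):
    """a weakly dominates b: at least as good in every state, strictly
    better in at least one. As the text notes, this settles the choice
    only when the states are independent of the choices."""
    pairs = [(payoffs[a][s], payoffs[b][s]) for s in ("S1", "S2")]
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

print(dominates("C2", "C1"))  # True: C2 is the better bet in either state
```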

That doesn’t mean, however, that the students all adopt the analysis that Professor<br />

Dec wants them to. One student, Stu, is particularly unwilling to accept that C2 is<br />

better than C1. He thinks, on the basis of his experience, that when more than<br />

$1000 is on the line, people aren’t as reliable about paying out on bets. So while<br />

C1 is guaranteed to deliver $1000 if S2, if the agent bets on C2, she might face some<br />

difficulty in collecting on her money.<br />

Given the context, i.e., that they are in an undergraduate decision theory class, it<br />

seems that Stu has misunderstood the question that Professor Dec intended to ask.<br />

But it is a little harder than it first seems to specify just exactly what Stu’s mistake<br />

is. It isn’t that he thinks Professor Dec has misdescribed the situation. It isn’t that he<br />

thinks it is false that the agent will collect $1500 if she chooses C2 and is in S2. He just<br />

thinks that she might not be able to collect it, so the expected payout might really be<br />

a little less than $1500.<br />

Before we try to say just what the misunderstanding between Professor Dec and<br />

Stu consists in, let’s focus on a simpler problem. Alice is out of town on a holiday,<br />

and she faces the following decision choice concerning what to do with a token in<br />

her hand.<br />

Choice                    Outcome<br />
Put token on table        Win $1000<br />
Put token in pocket       Win nothing<br />

This looks easy, especially if we’ve taken Professor Dec’s class. Putting the token on<br />

the table dominates putting the token in her pocket. It returns $1000, versus no gain.<br />

So she should put the token on the table.<br />

I’ve left Alice’s story fairly schematic; let’s fill in some of the details. Alice is on<br />

holiday at a casino. It’s a fair casino; the probabilities of the outcomes of each of the<br />

games are just what you’d expect. And Alice knows this. The table she’s standing at is<br />

a roulette table. The token is a chip from the casino worth $1000. Putting the token<br />

on the table means placing a bet. As it turns out, it means placing a bet on the roulette<br />

wheel landing on 28. If that bet wins she gets her token back and another token of<br />

the same value. There are many other bets she could make, but Alice has decided not<br />

to make all but one of them. Since her birthday is the 28th, she is tempted to put a<br />

bet on 28; that’s the only bet she is considering. If she makes this bet, the objective<br />

chance of her winning is 1/38, and she knows this. As a matter of fact she will win,<br />

but she doesn’t know this. (This is why the description in the table I presented above<br />

is truthful, though frightfully misleading.) As you can see, the odds on this bet are



terrible. She should have a chance of winning around 1/2 to justify placing this bet. 4<br />

So the above table, which makes it look like placing the bet is the dominant, and<br />

hence rational, option, is misleading.<br />
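The numbers behind "terrible": a single-number roulette bet wins with chance 1/38, while an even-money bet like this one breaks even only at a winning chance of 1/2. A quick check:

```python
p_win = 1 / 38   # fair wheel landing on 28 (from the text)
stake = 1000     # the token's value

# Win another $1000 token with probability 1/38; lose the token otherwise.
ev_bet = p_win * stake + (1 - p_win) * (-stake)
print(ev_bet)    # about -947, versus a sure $0 for pocketing the token
```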

Just how is the table misleading though? It isn’t because what it says is false. If<br />

Alice puts the token on the table she wins $1000; and if she doesn’t, she stays where<br />

she is. It isn’t, or isn’t just, that Alice doesn’t believe the table reflects what will<br />

happen if she places the bet. As it turns out, Alice is smart, so she doesn’t form<br />

beliefs about chance events like roulette wheels. But even if she did, that wouldn’t<br />

change how misleading the table is. The table suggests that it is rational for Alice to<br />

put the token on the table. In fact, that is irrational. And it would still be irrational<br />

if Alice believed, irrationally, that the wheel would land on 28.<br />

A better suggestion is that the table is misleading because Alice doesn’t know<br />

that it accurately depicts the choice she faced. If she did know that these were the<br />

outcomes to putting the token on the table versus in her pocket, it seems it would<br />

be rationally compelling for her to put it on the table. If we take it as tacit in a<br />

presentation of a decision problem that the agent knows that the table accurately<br />

depicts the outcomes of various choices in different states, then we can tell a plausible<br />

story about what the miscommunication between Professor Dec and Stu was. Stu<br />

was assuming that if the agent wins $1500, she might not be able to easily collect.<br />

That is, he was assuming that the agent does not know that she’ll get $1500 if she<br />

chooses C2 and is in state S2. Professor Dec, if she's anything like other decision<br />

theory professors, will have assumed that the agent did know exactly that.<br />

As we’ve seen, the standard presentation of a decision problem presupposes not<br />

just that the table states what will happen, but that the agent stands in some special doxastic<br />

relationship to that information. Could that relationship be weaker than knowledge?<br />

It’s true that it is hard to come up with clear counterexamples to the suggestion<br />

that the relationship is merely justified true belief. But I think it is somewhat implausible<br />

to hold that the standard presentation of an example merely presupposes that<br />

the agent has a justified true belief that the table is correct, and does not in addition<br />

know that the table is correct.<br />

My reasons for thinking this are similar to one of the reasons Timothy Williamson<br />

(Williamson, 2000a, Ch. 9) gives for doubting that one’s evidence is all that one<br />

justifiably truly believes. To put the point in Lewisian terms, it seems that knowledge<br />

is a much more natural relation than justified true belief. And when ascribing contents,<br />

especially contents of tacitly held beliefs, we should strongly prefer to ascribe<br />

more rather than less natural contents.<br />

I’m here retracting some things I said a few years ago in a paper on philosophical<br />

methodology (Weatherson, 2003c). There I argued that identifying knowledge with<br />

justified true belief would give us a theory on which knowledge was more natural<br />

than a theory on which we didn’t identify knowledge with any other epistemic property.<br />

I now think that is wrong for a couple of reasons. First, although it’s true (as<br />

I say in the earlier paper) that knowledge can’t be primitive or perfectly natural, this<br />

4 Assuming Alice’s utility curve for money curves downwards, she should be looking for a slightly<br />

higher chance of winning than 1/2 to place the bet, but that level of detail isn’t relevant to the story we’re<br />

telling here.



doesn’t make it less natural than justification, which is also far from a fundamental<br />

feature of reality. Indeed, given how usual it is for languages to have a simple representation<br />

of knowledge, we have some evidence that it is very natural, at least for a term from<br />

a special science. Second, I think in the earlier paper I didn’t fully appreciate the<br />

point (there attributed to Peter Klein) that the Gettier cases show that the property<br />

of being a justified true belief is not particularly natural. In general, when F and G<br />

are somewhat natural properties, then so is the property of being F ∧ G. But there<br />

are exceptions, especially in cases where these are properties that a whole can have in<br />

virtue of a part having the property. In those cases, a whole that has an F part and<br />

a G part will be F ∧ G, but this won’t reflect any distinctive property of the whole.<br />

And one of the things the Gettier cases show is that the properties of being justified<br />

and being true, as applied to belief, fit this pattern. 5<br />

So the ‘special doxastic relationship’ is not weaker than knowledge. Could it<br />

be stronger? Could it be, for example, that the relationship is certainty, or some<br />

kind of iterated knowledge? Plausibly in some game-theoretic settings it is stronger<br />

– it involves not just knowing that the table is accurate, but knowing that the other<br />

player knows the table is accurate. In some cases, the standard treatment of games will<br />

require positing even more iterations of knowledge. For convenience, it is sometimes<br />

explicitly stated that iterations continue indefinitely, so each party knows the table<br />

is correct, and knows each party knows this, and knows each party knows that, and<br />

knows each party knows that, and so on. An early example of this in philosophy is<br />

in the work by David Lewis (1969a) on convention. But it is usually acknowledged<br />

(again in a tradition extending back at least to Lewis) that only the first few iterations<br />

are actually needed in any problem, and it seems a mistake to attribute more iterations<br />

than are actually used in deriving solutions to any particular game.<br />

The reason that would be a mistake is that we want game theory, and decision<br />

theory, to be applicable to real-life situations. There is very little that we know, and<br />

know that we know, and know we know we know, and so on indefinitely (Williamson,<br />

2000a, Ch. 4). There is, perhaps, even less that we are certain of. If we could<br />

only say that a person is playing a particular game when they stand in these very<br />

strong relationships to the parameters of the game, then people will almost never be<br />

playing any games of interest. Since game theory, and decision theory, are not meant<br />

to be that impractical, I conclude that the ‘special doxastic relationship’ cannot be<br />

that strong. It could be that in some games, the special relationship will involve a<br />

few iterations of knowledge, but in decision problems, where the epistemic states of<br />

others are irrelevant, even that is unnecessary, and simple knowledge seems sufficient.<br />

It might be argued here that we shouldn’t expect to apply decision theory directly<br />

to real-life problems, but only to idealised versions of them, so it would be acceptable<br />

5 Note that even if you think that philosophers are generally too quick to move from instinctive reactions<br />

to the Gettier case to abandoning the justified true belief theory of knowledge, this point holds<br />

up. What is important here is that on sufficient reflection, the Gettier cases show that some justified true<br />

beliefs are not knowledge, and that the cases in question also show that being a justified true belief is not a<br />

particularly natural or unified property. So the point I’ve been making in the last few paragraphs is independent<br />

of the point I wanted to stress in “What Good are Counterexamples?”, namely, that philosophers<br />

in some areas (especially epistemology) are insufficiently reformist in their attitude towards our intuitive<br />

reactions to cases.



to, for instance, require that the things we put in the table are, say, things that have<br />

probability exactly 1. In real life, virtually nothing has probability 1. In an idealisation,<br />

many things do. But to argue this way seems to involve using ‘idealisation’<br />

in an unnatural sense. There is a sense in which, whenever we treat something with<br />

non-maximal probability as simply given in a decision problem, we're ignoring,<br />

or abstracting away from, some complication. But we aren’t idealising. On the contrary,<br />

we’re modelling the agent as if they were irrationally certain in some things<br />

which are merely very very probable.<br />

So it’s better to say that any application of decision theory to a real-life problem<br />

will involve ignoring certain (counterfactual) logical or metaphysical possibilities in<br />

which the decision table is not actually true. But not any old abstraction will do. We<br />

can’t ignore just anything, at least not if we want a good model. Which abstractions<br />

are acceptable? I have an answer to this: we can abstract away from any possibility<br />

in which something the agent actually knows is false. I don’t have a knock-down<br />

argument that this is the best of all possible abstractions, but nor do I know of any<br />

alternative answer to the question of which abstractions are acceptable that is nearly<br />

as plausible.<br />

In part that is because it is plausible that the ‘special doxastic relationship’ should<br />

be a fairly simple, natural relationship. And it seems that any simple, natural relationship<br />

weaker than knowledge will be so weak that when we plug it into our decision<br />

theory, it will say that Alice should do clearly irrational things in one or other of the<br />

cases we described above. And it seems that any simple, natural relationship stronger<br />

than knowledge will be so strong that it makes decision theory or game theory impractical.<br />

I also cheated a little in making this argument. When I described Alice in the<br />

casino, I made a few explicit comments about her information states. And every<br />

time, I said that she knew various propositions. It seemed plausible at the time that<br />

this is enough to think those propositions should be added to the table. That’s some<br />

evidence against the idea that more than knowledge, perhaps iterated knowledge or<br />

certainty, is needed before we add propositions to the decision table.<br />

1.2 From Decision Theory to Interest-Relativity<br />

This way of thinking about decision problems offers a new perspective on the issue of<br />

whether we should always be prepared to bet on what we know. 6 To focus intuitions,<br />

let’s take a concrete case. Barry is sitting in his apartment one evening when he hears<br />

a musician performing in the park outside. The musician, call her Beth, is one of<br />

Barry’s favourite musicians, so the music is familiar to Barry. Barry is excited that<br />

Beth is performing in his neighbourhood, and he decides to hurry out to see the<br />

show. As he prepares to leave, a genie appears and offers him a bet. If he takes the bet,<br />

and the musician is Beth, then the genie gives Barry ten dollars. On the other hand, if<br />

the musician is not Beth, he will be tortured in the fires of hell for a millennium. Let's<br />

put Barry’s options in table form.<br />

6 This issue is of course central to the plotline in Hawthorne (2004b).



            Musician is Beth   Musician is not Beth<br />

Take Bet    Win $10            1000 years of torture<br />

Decline Bet Status quo         Status quo<br />

Intuitively, it is extremely irrational for Barry to take the bet. People do make mistakes<br />

about identifying musicians, even very familiar musicians, by the strains of<br />

music that drift up from a park. It's not worth risking a millennium of torture for<br />

$10.<br />
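To make the intuition vivid, here is a hedged expected-utility sketch. The 0.999 credence and the one-million-util disutility for the torture are invented figures, not from the story; any remotely realistic values give the same verdict:<br />

```python
# Illustrative check on the genie's bet. cred_beth and u_torture
# are assumed numbers, chosen only to show how the comparison goes.
cred_beth = 0.999            # very confident, but short of certainty
u_win = 10                   # ten dollars, read as utils for simplicity
u_torture = -1_000_000       # stands in for a millennium of torture

ev_take = cred_beth * u_win + (1 - cred_beth) * u_torture
ev_decline = 0.0             # status quo either way

print(ev_take < ev_decline)   # True: declining wins despite 'dominance'
```

Even a one-in-a-thousand chance of misidentifying the musician swamps the ten-dollar gain.<br />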

But it also seems that we’ve misstated the table. Before the genie showed up, it<br />

seemed clear that Barry knew that the musician was Beth. That was why he went out<br />

to see her perform. (If you don’t think this is true, make the sounds from the park<br />

clearer, or make it that Barry had some prior evidence that Beth was performing<br />

which the sounds from the park remind him of. It shouldn’t be too hard to come up<br />

with an evidential base such that (a) in normal circumstances we’d say Barry knew<br />

who was performing, but (b) he shouldn’t take this genie’s bet.) Now our decision<br />

tables should reflect the knowledge of the agent making the decision. If Barry knows<br />

that the musician is Beth, then the second column is one he knows will not obtain.<br />

Including it is like including a column for what will happen if the genie is lying about<br />

the consequences of taking or declining the bet. So let’s write the table in the standard<br />

form.<br />

            Musician is Beth<br />

Take Bet    Win $10<br />

Decline Bet Status quo<br />

And it is clear what Barry’s decision should be in this situation. Taking the bet<br />

dominates declining it, and Barry should take dominating options.<br />

What has happened? It is incredibly clear that Barry should decline the bet, yet<br />

here we have an argument that he should take the bet. If you accept that the bet<br />

should be declined, then there are, it seems to me, three options available.<br />

1. Barry never knew that the musician was Beth.<br />

2. Barry did know that the musician was Beth, but this knowledge was destroyed<br />

by the genie’s offer of the bet.<br />

3. States of the world that are known not to obtain should still be represented in<br />

decision problems, so taking the bet is not a dominating option.<br />

The first option is basically a form of scepticism. If the take-away message from the<br />

above discussion is that Barry doesn’t know the musician is Beth, we can mount a<br />

similar argument to show that he knows next to nothing. 7 And the third option<br />

would send us back into the problems about interpreting and applying decision theory<br />

that we spent the first few pages trying to get out of.<br />

7 The idea that interest-relativity is a way of fending off scepticism is a very prominent theme in Fantl<br />

and McGrath (2009).



So it seems that the best solution here, or perhaps the least bad solution, is to<br />

accept that knowledge is interest-relative. Barry did know that the musician was<br />

Beth, but the genie’s offer destroyed that knowledge.<br />

The argument here bears more than a passing resemblance to the arguments in<br />

favour of interest-relativity that are made by Hawthorne, Stanley, and Fantl and<br />

McGrath. But I think the focus on decision theory shows how we can get to<br />

interest-relativity with slightly weaker premises than they are using. 8 In particular, the only<br />

premises I’ve used to derive an interest-relative conclusion are:<br />

1. Before the genie showed up, Barry knew the musician was Beth.<br />

2. It’s rationally permissible, in cases like Barry’s, to take dominating options.<br />

3. It’s always right to model decision problems by including what the agent knows<br />

in the ‘framework’, i.e., the specification of the options and payouts.<br />

4. It is rationally impermissible for Barry to take the genie’s offered bet.<br />

The second premise there is much weaker than the principles linking knowledge and<br />

action defended in previous arguments for interest-relativity. It isn’t the claim that<br />

one can always act on what one knows, or that one can only act on what one knows,<br />

or that knowledge always (or only) provides reason to act. It’s just the claim that in<br />

one very specific type of situation, in particular when one has to make a relatively<br />

simple bet, which affects nobody but the person making the bet, it’s rationally permissible<br />

to take a dominating option. In conjunction with the third premise, it entails<br />

that in those kinds of cases, the fact that one knows taking the bet will lead to a better<br />

outcome suffices for making acceptance of the bet rationally permissible. It doesn’t<br />

say anything about what else might or might not make acceptance rationally permissible.<br />

It doesn’t say anything about what suffices for rationally permissibility in<br />

other kinds of cases, such as cases where someone else’s interests are at stake, or where<br />

taking the bet might violate a deontological constraint, or any other way in which<br />

real-life choices differ from the simplest decision problems. 9 It doesn’t say anything<br />

about any other kind of permissibility, e.g., moral permissibility. But it doesn’t need<br />

to, because we’re only in the business of proving that there is some interest-relativity<br />

to knowledge, and an assumption about practical rationality in some range of cases<br />

suffices to prove that.<br />

The case of Barry and Beth also bears some relationship to one of the kinds of<br />

case that have motivated contextualism about knowledge. Indeed, it has been widely<br />

noted in the literature on interest-relativity that interest-relativity can explain away<br />

many of the puzzles that motivate contextualism. And there are difficulties that face<br />

any contextualist theory (Weatherson, 2006b). So I prefer an invariantist form of<br />

8 As they make clear in Hawthorne and Stanley (2008), Hawthorne and Stanley are interested in defending<br />

relatively strong premises linking knowledge and action independently of the argument for the<br />

interest-relativity of knowledge. What I’m doing here is showing how that conclusion does not rest on<br />

anything nearly as strong as the principles they believe, and so there is plenty of space to disagree with<br />

their general principles, but accept interest-relativity.<br />

9 I have more to say about those cases in section 2.2.



interest-relativity about knowledge. That is, my view is a form of interest-relative invariantism,<br />

or IRI. 10<br />

Now everything I’ve said here leaves it open whether the interest-relativity of<br />

knowledge is a natural and intuitive theory, or whether it is a somewhat unhappy<br />

concession to the difficulties that the case of Barry and Beth raises. I think the former is<br />

correct, and interest-relativity is fairly plausible on its own merits, but it would be<br />

consistent with my broader conclusions to say that in fact the interest-relative theory<br />

of knowledge is very implausible and counterintuitive. If we said that, we could<br />

still justify the interest-relative theory by noting that we have on our hands here a<br />

paradoxical situation, and any option will be somewhat implausible. This consideration<br />

has a bearing on how we should think about the role of intuitions about cases, or<br />

principles, in arguments that knowledge is interest-relative. Several critics of the view<br />

have argued that the view is counter-intuitive, or that it doesn’t accord with the reactions<br />

of non-expert judges. 11 In a companion paper, “Defending Interest-Relative Invariantism”,<br />

I note that those arguments usually misconstrue what the consequences<br />

of interest-relative theories of knowledge are. But even if they don’t, I don’t think<br />

there’s any quick argument that if interest-relativity is counter-intuitive, it is false.<br />

After all, the only alternatives that seem to be open here are very counter-intuitive.<br />

Finally, it’s worth noting that if Barry is rational, he’ll stop (fully) believing that<br />

the musician is Beth once the genie makes the offer. Assuming the genie allows this,<br />

it would be very natural for Barry to try to acquire more information about the<br />

singer. He might walk over to the window to see if he can see who is performing in<br />

the park. So this case leaves it open whether the interest-relativity of knowledge can<br />

be explained fully by the interest-relativity of belief. I used to think it could be; I no<br />

longer think that. To see why this is so, it’s worth rehearsing how the interest-relative<br />

theory of belief runs.<br />

2 The Interest-Relativity of Belief<br />

2.1 Interests and Functional Roles<br />

The previous section was largely devoted to proving an existential claim: there is some<br />

interest-relativity to knowledge. Or, if you prefer, it proved a negative claim: the<br />

best theory of knowledge is not interest-neutral. But this negative conclusion invites<br />

a philosophical challenge: what is the best explanation of the interest-relativity of<br />

knowledge? My answer is in two parts. Part of the interest-relativity of knowledge<br />

comes from the interest-relativity of belief, and part of it comes from the fact that<br />

interests generate certain kinds of doxastic defeaters.<br />

We can see that belief is interest-relative by seeing that the best functionalist theory<br />

of belief is interest-relative. Since the best functionalist theory of belief is the true<br />

theory of belief, that means belief is interest-relative. That last sentence assumed a<br />

non-trivial premise, namely that functionalism about belief is true. I’m not going to<br />

10 This is obviously not a full argument against contextualism; that would require a much longer paper<br />

than this.<br />

11 See, for instance, Blome-Tillmann (2009a), or Feltz and Zarpentine (forthcoming).<br />



argue for that, since the argument would take at least a book. (The book in question<br />

might look a lot like Braddon-Mitchell and Jackson (2007).) But I am going to, in this<br />

section, argue for what I said in the first sentence of this paragraph, namely that the<br />

best functionalist account of belief is interest-relative.<br />

In my (2005a), I suggested that was all of the explanation of the interest-relativity<br />

of knowledge. I was wrong, and section 3 of this paper will show why I was wrong.<br />

Interests also generate certain kinds of doxastic defeaters, and they enter independently<br />

into the explanation of why knowledge is interest-relative. Very roughly, the<br />

idea is that S doesn’t know p if S’s belief that p does not sufficiently cohere with the<br />

rest of what she should believe and does believe. If this coherence constraint fails,<br />

there is a doxastic defeater. When is some incoherence too much incoherence for<br />

knowledge? That turns out to be interest-relative, in cases like the case of Coraline<br />

described in the next section. But we're getting ahead of ourselves; the first task is to<br />

link functionalism about belief and the interest-relativity of belief.<br />

We start not with the functional characterisation of belief, but with something<br />

close by, the functional characterisation of credence. Frank Ramsey (1926) provides<br />

a clear statement of one of the key functional roles of credences: their connection<br />

to action. Of course, Ramsey did not take himself to be providing one component<br />

of the functional theory of credence. He took himself to be providing a behaviourist/operationalist<br />

reduction of credences to dispositions. But we do not have<br />

to share Ramsey’s metaphysics to use his key ideas. Those ideas include that it’s distinctively<br />

betting dispositions that are crucial to the account of credence, and that all<br />

sorts of actions in everyday life constitute bets.<br />

The connection to betting behaviour lives on today most prominently in the<br />

work on ‘representation theorems’. 12 What a representation theorem shows is that<br />

for any agent whose pairwise preferences satisfy some structural constraints, there is<br />

a probability function and a utility function such that the agent prefers bet X to bet<br />

Y just in case the expected utility of X (given that probability and utility function)<br />

is greater than that of Y . Moreover, the probability function is unique (and the<br />

utility function is unique up to positive affine transformations). Given that, it might<br />

seem plausible to identify the agent’s credence with this probability function, and the<br />

agent’s (relative) values with this utility function.<br />

Contemporary functionalism goes along with much, but not quite all, of this<br />

picture. The betting preferences are an important part of the functional role of a<br />

credence; indeed, they just are the output conditions. But there are two other parts<br />

to a functional role: an input condition and a set of internal connections. So the<br />

functionalist thinks that the betting dispositions are not quite sufficient for having<br />

credences. A pre-programmed automaton might have dispositions to accept (or at<br />

least move as if accepting) various bets, but this will not be enough for credences<br />

(Braddon-Mitchell and Jackson, 2007). So when we’re considering whether someone<br />

is in a particular credal state, we have to consider not just their actions, and their<br />

disposition to act, but the connections between the alleged credal state and other<br />

states.<br />

12 See Maher (1993) for the most developed account in recent times.



The same will be true of belief. Part of what it is to believe p is to act as if p is<br />

true. Another part is to have other mental states that make sense in light of p. This<br />

isn’t meant to rule out the possibility of inconsistent beliefs. As David Lewis (1982)<br />

points out, it is easy to have inconsistent beliefs if we don’t integrate our beliefs fully.<br />

What it is meant to rule out is the possibility of a belief that is not at all integrated<br />

into the agent’s cognitive system. If the agent believes q, but refuses to infer p ∧ q,<br />

and believes p → r , but refuses to infer r , and so on for enough other beliefs, she<br />

doesn’t really believe p.<br />

When writers start to think about the connection between belief and credence,<br />

they run into the following problem fairly quickly. The alleged possibility where S<br />

believes p, and doesn’t believe q, but her credence in q is higher than her credence in<br />

p strikes many theorists as excessively odd. 13 That suggests the so-called ‘threshold<br />

view’ of belief, that belief is simply credence above a threshold. It is also odd to<br />

say that a rational, reflective agent could believe p, believe q, yet take it as an open<br />

question whether p ∧q is true, refusing to believe or disbelieve it. Finally, it seems we<br />

can believe things to which we don’t give credence 1. In the case of Barry and Beth<br />

from section 1, for example, before the genie comes in, it seems Barry does believe<br />

the musician is Beth. But he doesn’t have credence 1 in this, since having credence<br />

1 means being disposed to make a bet at any odds. And we can’t have all three of<br />

these ideas. That is, we can’t accept the ‘threshold view’, the closure of belief under<br />

conjunction, and the possibility that propositions with credence less than one are<br />

believed.<br />
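The tension between the three ideas can be seen with a two-line calculation. The 0.9 credences, the 0.85 threshold, and the independence assumption are all arbitrary choices for illustration:<br />

```python
# Why the threshold view, closure of belief under conjunction, and
# belief below credence 1 can't all hold. Numbers are arbitrary.
threshold = 0.85
cr_p, cr_q = 0.9, 0.9           # each above the threshold, so believed
cr_p_and_q = cr_p * cr_q        # 0.81, assuming p and q independent

believes_p = cr_p >= threshold
believes_q = cr_q >= threshold
believes_conj = cr_p_and_q >= threshold

# Believing p and q without believing their conjunction violates closure.
print(believes_p and believes_q, believes_conj)   # True False
```

Raising the threshold to 1 restores closure but gives up belief below certainty; keeping belief below certainty and closure gives up the threshold view.<br />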

We can raise the same kind of problem by looking directly at functional roles. A<br />

key functional role of credences is that if an agent has credence x in p she should be<br />

prepared to buy a bet that returns 1 util if p, and 0 utils otherwise, iff the price is no<br />

greater than x utils. A key functional role of belief is that if an agent believes p, and<br />

recognises that φ is the best thing to do given p, then she’ll do φ. Given p, it’s worth<br />

paying any price up to 1 util for a bet that pays 1 util if p. So believing p seems to<br />

mean being in a functional state that is like having credence 1 in p. But as we argued<br />

in the previous paragraph, it is wrong to identify belief with credence 1.<br />

If we spell out more carefully what the functional states of credence and belief<br />

are, a loophole emerges in the argument that belief implies credence 1. The interest-relative<br />

theory of belief exploits that loophole. What’s the difference, in functional<br />

terms, between having credence x in p, and having credence x + ɛ in p? Well, think<br />

again about the bet that pays 1 util if p, and 0 utils otherwise. And imagine that bet<br />

is offered for x + ɛ/2 utils. The person whose credence is x will decline the offer; the<br />

person whose credence is x + ɛ will accept it. Now it will usually be that no such bet<br />

is on offer. 14 No matter; as long as one agent is disposed to accept the offer, and the<br />

13I’m going to argue below that cases like that of Barry and Beth suggest that in practice this isn’t nearly<br />

as odd as it first seems.<br />

14 There are exceptions, especially in cases where p concerns something significant to financial markets,<br />

and the agent trades financial products. If you work through the theory that I’m about to lay out, one<br />

consequence is that such agents should have very few unconditional beliefs about financially-sensitive information,<br />

just higher and lower credences. I think that’s actually quite a nice outcome, but I’m not going<br />

to rely on that in the argument for the view.



other agent is not, that suffices for a difference in credence.<br />
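The functional difference just described can be put as a simple decision rule. This is only a sketch, assuming the orthodox picture on which accepting the bet is just a matter of the price not exceeding one's credence:<br />

```python
# A sketch of the betting-disposition difference between credences.
# An agent with credence c in p accepts a bet that pays 1 util if p
# (0 otherwise) iff its price is at most c utils.
def accepts(credence, price):
    return credence >= price

# Agents differing by epsilon diverge only on bets priced strictly
# between their credences -- offers that may never actually arise.
x, eps = 0.5, 0.01
price = x + eps / 2
print(accepts(x, price), accepts(x + eps, price))   # False True
```

The divergence shows up only at that one price point, which is why the difference between the two credal states is typically constituted by dispositions about merely possible offers.<br />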

The upshot of that is that differences in credences might be, indeed usually will<br />

be, constituted by differences in dispositions concerning how to act in choice situations<br />

far removed from actuality. I’m not usually in a position of having to accept or<br />

decline a chance to buy a bet for 0.9932 utils that the local coffee shop is currently<br />

open. Yet whether I would accept or decline such a bet matters to whether my credence<br />

that the coffee shop is open is 0.9931 or 0.9933. This isn’t a problem with the<br />

standard picture of how credences work. It’s just an observation that the high level<br />

of detail embedded in the picture relies on taking the constituents of mental states to<br />

involve many dispositions.<br />

It isn’t clear that belief should be defined in terms of the same kind of dispositions<br />

involving betting behaviour in remote possibilities. It's true that if I believe that p,<br />

and I’m rational enough, I’ll act as if p is true. Is it also true that if I believe p, I’m<br />

disposed to act as if p is true no matter what choices are placed in front of me? I don’t<br />

see any reason to say yes, and there are a few reasons to say no. As we saw in the case<br />

of Barry and Beth, Barry can believe that p, but be disposed to lose that belief rather<br />

than act on it if odd choices, like that presented by the genie, emerge.<br />

This suggests the key difference between belief and credence 1. For a rational<br />

agent, a credence of 1 in p means that the agent is disposed to answer a wide range<br />

of questions the same way she would answer them conditional on p. That<br />

follows from the fact that these four principles are trivial theorems of the orthodox<br />

theory of expected utility. 15<br />

C1AP For all q, x, if Pr(p) = 1 then Pr(q) = x iff Pr(q|p) = x.<br />

C1CP For all q, r, if Pr(p) = 1 then Pr(q) ≥ Pr(r) iff Pr(q|p) ≥ Pr(r|p).<br />

C1AU For all φ, x, if Pr(p) = 1 then U(φ) = x iff U(φ|p) = x.<br />

C1CU For all φ, ψ, if Pr(p) = 1 then U(φ) ≥ U(ψ) iff U(φ|p) ≥ U(ψ|p).<br />

In the last two lines, I use U (φ) to denote the expected utility of φ, and U (φ| p) to<br />

denote the expected utility of φ conditional on p. It’s often easier to write this as<br />

simply U (φ ∧ p), since the utility of φ conditional on p just is the utility of doing φ<br />

in a world where p is true. That is, it is the utility of φ ∧ p being realised. But we get<br />

a nicer symmetry between the probabilistic principles and the utility principles if we<br />

use the explicitly conditional notation for each.<br />

If we make the standard kinds of assumptions in orthodox decision theory, i.e.,<br />

assume at least some form of probabilism and consequentialism 16 , then the agent will<br />

answer each of these questions the same way simpliciter and conditional on p.<br />

• How probable is q?<br />

• Is q or r more probable?<br />

• How good an idea is it to do φ?<br />

15 The presentation in this section, as in Weatherson (2005a), assumes at least a weak form of consequentialism.<br />

This was arguably a weakness of the earlier paper. We’ll return to the issue of what happens in<br />

cases where the agent doesn’t, and perhaps shouldn’t, maximise expected utility, at the end of the section.<br />

16 I mean here consequentialism in roughly the sense used by Hammond (1988).


Knowledge, Bets and Interests 128<br />

• Is it better to do φ or ψ?<br />

Each of those questions is schematic. As in the more technical versions given above,<br />

they quantify over propositions and actions, albeit tacitly in the case of these versions.<br />

And these quantifiers have a very large domain. The standard theory is that an agent<br />

whose credence in p is 1 will have the same credence in q as in q given p for any q<br />

whatsoever.<br />
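The breadth of that quantifier can be made concrete with a toy model. The following sketch is illustrative only: the worlds, weights and utilities are made up, and the functions are not part of the paper’s formal apparatus. It checks C1AP and C1AU on a distribution where all the mass sits on p-worlds.<br />

```python
# A toy probability space: three worlds, with all the probability mass on
# p-worlds, so Pr(p) = 1. The worlds, weights and utilities are invented
# purely for illustration.
worlds = [
    {"p": True,  "q": True},
    {"p": True,  "q": False},
    {"p": False, "q": True},   # weight 0, so Pr(p) = 1
]
pr = {0: 0.7, 1: 0.3, 2: 0.0}
u = {0: 10.0, 1: -2.0, 2: -100.0}  # utility of some fixed act in each world

def prob(prop):
    return sum(pr[i] for i, w in enumerate(worlds) if w[prop])

def prob_given(prop, cond):
    joint = sum(pr[i] for i, w in enumerate(worlds) if w[prop] and w[cond])
    return joint / prob(cond)

def eu():
    return sum(pr[i] * u[i] for i in pr)

def eu_given(cond):
    return sum(pr[i] * u[i] for i, w in enumerate(worlds) if w[cond]) / prob(cond)

# C1AP: with Pr(p) = 1, Pr(q) and Pr(q | p) coincide, for any q.
assert abs(prob("q") - prob_given("q", "p")) < 1e-12

# C1AU: likewise the unconditional and p-conditional expected utilities.
assert abs(eu() - eu_given("p")) < 1e-12
```

Whatever q and whatever act we plug in, conditioning on the credence-1 proposition changes nothing, which is all the four principles say.<br />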

Now in one sense, exactly the same things are true if the agent believes p. If one<br />

is wondering whether q or r is more probable, and one believes p, then the fact that<br />

p will be taken as a given in structuring this inquiry. So conditionalising on p should<br />

not change the answer to the question. And the same goes for any actions φ and ψ<br />

that the agent is choosing between. But this isn’t true unrestrictedly. That’s what<br />

we saw in the case of Barry and Beth. Which choices are on the table will change<br />

which things the agent will take as given, will use to structure inquiry, will believe.<br />

If we return to the technical version of our four questions, we might put all this the<br />

following way.<br />

BAP For all relevant q, x, if p is believed then Pr(q) = x iff Pr(q| p) = x.<br />

BCP For all relevant q, r , if p is believed then Pr(q) ≥ Pr(r ) iff Pr(q| p) ≥ Pr(r | p).<br />

BAU For all relevant φ, x, if p is believed then U (φ) = x iff U (φ| p) = x.<br />

BCU For all relevant φ,ψ, if p is believed then U (φ) ≥ U (ψ) iff U (φ| p) ≥ U (ψ| p).<br />

This is where interests, theoretical or practical, matter. The functional definition<br />

of belief looks a lot like the functional definition of credence 1. But there’s one<br />

key difference. Both definitions involve quantifiers: quantifying over propositions,<br />

actions and values. In the definition of credence 1, those quantifiers are (largely)<br />

unrestricted. In the definition of belief, those quantifiers are tightly restricted, and<br />

the restrictions are in terms of the interests, practical and theoretical, of the agent.<br />

In the earlier paper I went into a lot of detail about just what ‘relevant’ means in<br />

this context, and I won’t repeat that here. But I will say a little about one point I<br />

didn’t sufficiently stress in that paper: the importance of the restriction of BAP and<br />

BAU to relevant values of x. This lets us have the following nice consequence.<br />

Charlie is trying to figure out exactly what the probability of p is. That is, for any<br />

x ∈ [0,1], whether Pr( p) = x is a relevant question. Now Charlie is well aware that<br />

Pr( p| p) = 1. So unless Pr( p) = 1, Charlie will give a different answer to the questions<br />

How probable is p? and Given p, how probable is p? So unless Charlie holds that Pr( p)<br />

is 1, she won’t count as believing that p. One consequence of this is that Charlie<br />

can’t reason, “The probability of p is exactly 0.978, so p.” That’s all to the good,<br />

since that looks like bad reasoning. And it looks like bad reasoning even though in<br />

some circumstances Charlie can rationally believe propositions that she (rationally)<br />

gives credence 0.978 to.<br />

But note that the reasoning in the previous paragraph assumes that every question<br />

of the form Is the probability of p equal to x? is relevant. In practice, fewer questions<br />

than that will be relevant. Let’s say that the only questions relevant to Charlie are<br />

of the form What is the probability of p to one decimal place? And assume that no



other questions become relevant in the course of her inquiry into this question. 17<br />

Charlie decides that to the first decimal place, Pr( p) = 1.0, i.e., Pr( p) ≥ 0.95. That<br />

is compatible with simply believing that p. And that seems right; if for practical<br />

purposes, the probability of p is indistinguishable from 1, then the agent is confident<br />

enough in p to believe it.<br />

2.2 Two Caveats<br />

The theory sketched so far seems to me right in the vast majority of cases. It fits in<br />

well with a broadly functionalist view of the mind, and it handles difficult cases, like<br />

that of Kate, nicely. But it needs to be supplemented and clarified a little to handle<br />

some other difficult cases. In this section I’m going to supplement the theory a little<br />

to handle what I call ‘impractical propositions’, and say a little about morally loaded<br />

action.<br />

Jones has a false geographic belief: he believes that Los Angeles is west of Reno,<br />

Nevada. 18 This isn’t because he’s ever thought about the question. Rather, he’s just<br />

disposed to say “Of course” if someone asks, “Is Los Angeles west of Reno?” That<br />

disposition has never been triggered, because no one’s ever bothered to ask him this.<br />

Call the proposition that Los Angeles is west of Reno p.<br />

The theory given so far will get the right result here: Jones does believe that p.<br />

But it gets the right answer for an odd reason. Jones, it turns out, has very little interest<br />

in American geography right now. He’s a schoolboy in St Andrews, Scotland,<br />

getting ready for school and worried about missing his schoolbus. There’s no inquiry<br />

he’s currently engaged in for which p is even close to relevant. So conditionalising<br />

on p doesn’t change the answer to any inquiry he’s engaged in, but that would be<br />

true no matter what his credence in p is.<br />

There’s an immediate problem here. Jones believes p, since conditionalising on p<br />

doesn’t change the answer to any relevant inquiry. But for the very same reason, conditionalising<br />

on ¬ p doesn’t change the answer to any relevant inquiry. It seems our<br />

theory has the bizarre result that Jones believes ¬ p as well. That is both wrong and<br />

unfair. We end up attributing inconsistent beliefs to Jones simply because he’s a harried<br />

schoolboy who isn’t currently concerned with the finer points of the geography<br />

of the American southwest.<br />

Here’s a way out of this problem in four relatively easy steps. First, we say that<br />

which questions are relevant questions is not just relative to the agent’s interests, but<br />

also relative to the proposition being considered. A question may be relevant relative<br />

to p, but not relative to q. Second, we say that relative to p, the question of whether<br />

to believe p is a relevant question. Third, we say that an agent only prefers believing<br />

p to not believing it if their credence in p is greater than their credence in ¬ p, i.e., if<br />

their credence in p is greater than 1/2. Finally, we say that when the issue is whether<br />

the subject believes that p, the question of whether to believe p is not just a relevant<br />

17 This is probably somewhat unrealistic. It’s hard to think about whether Pr( p) is closer to 0.7 or 0.8<br />

without raising to salience questions about, for example, what the second decimal place in Pr( p) is. This<br />

is worth bearing in mind when coming up with intuitions about the cases in this paragraph.<br />

18 I’m borrowing this example from Fred Dretske, who uses it to make some interesting points about<br />

dispositional belief.



question on its own, but it remains a relevant question conditional on any q that is relevant to the subject. In the earlier paper (Weatherson, 2005a) I argue that this<br />

solves the problem raised by impractical propositions in a smooth and principled<br />

way.<br />

That’s the first caveat. The second is one that isn’t discussed in the earlier paper. If<br />

the agent is merely trying to get the best outcome for themselves, then it makes sense<br />

to represent them as a utility maximiser. And within orthodox decision theory, it is<br />

easy enough to talk about, and reason about, conditional utilities. That’s important,<br />

because conditional utilities play an important role in the theory of belief offered<br />

here. But if the agent faces moral constraints on her decision, it isn’t always so easy<br />

to think about conditional utilities.<br />

When agents have to make decisions that might involve them causing harm to<br />

others if certain propositions turn out to be true, then I think it is best to supplement<br />

orthodox decision theory with an extra assumption. The assumption is, roughly, that<br />

for choices that may harm others, expected value is absolute value. It’s easiest to see<br />

what this means using a simple case of three-way choice. The kind of example I’m<br />

considering here has been used for (slightly) different purposes by Frank Jackson<br />

(1991).<br />

The agent has to do ϕ or ψ. Failure to do either of these will lead to disaster, and<br />

is clearly unacceptable. Either ϕ or ψ will avert the disaster, but one of them will<br />

be moderately harmful and the other one will not. The agent has time before the<br />

disaster to find out which of ϕ and ψ is harmful and which is not for a nominal cost.<br />

Right now, her credence that ϕ is the harmful one is, quite reasonably, 1/2. So the<br />

agent has three choices:<br />

• Do ϕ;<br />

• Do ψ; or<br />

• Wait and find out which one is not harmful, and do it.<br />

We’ll assume that other choices, like letting the disaster happen, or finding out which<br />

one is harmful and doing it, are simply out of consideration. In any case, they are<br />

clearly dominated options, so the agent shouldn't do them. Let p be the proposition<br />

that ϕ is the harmful one. Then if we assume the harm in question has a disutility of<br />

10, and the disutility of waiting to act until we know which is the harmful one is 1,<br />

the values of the possible outcomes are as follows:<br />

                             p     ¬p<br />
Do ϕ                       -10      0<br />
Do ψ                         0    -10<br />
Find out which is harmful   -1     -1<br />

Given that Pr( p) = 1/2, it’s easy to compute that the expected value of doing either ϕ<br />

or ψ is -5, while the expected value of finding out which is harmful is -1, so the agent<br />

should find out which thing is to be done before acting. So far most consequentialists



would agree, and so probably would most non-consequentialists for most ways of<br />

fleshing out the abstract example I’ve described. 19<br />
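The expected values just computed can be checked in a few lines. This sketch encodes nothing beyond the table above; the labels are mine.<br />

```python
# Expected values for the three-way choice, using the disutilities from the
# text: the harm is -10, waiting is -1, and Pr(p) = 1/2, where p is the
# proposition that phi is the harmful option.
pr_p = 0.5
payoffs = {                  # (value if p, value if not-p)
    "do phi":   (-10, 0),
    "do psi":   (0, -10),
    "find out": (-1, -1),
}

def expected_value(act):
    v_if_p, v_if_not_p = payoffs[act]
    return pr_p * v_if_p + (1 - pr_p) * v_if_not_p

# Doing phi or psi blind has expected value -5; finding out costs only -1,
# so the agent should find out which option is harmful before acting.
assert expected_value("do phi") == expected_value("do psi") == -5
assert expected_value("find out") == -1
```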

But most consequentialists would also say something else about the example that<br />

I think is not exactly true. Just focus on the column in the table above where p is<br />

true. In that column, the highest value, 0, is alongside the action Do ψ. So you might<br />

think that conditional on p, the agent should do ψ. That is, you might think the<br />

conditional expected value of doing ψ, conditional on p being true, is 0, and that’s<br />

higher than the conditional expected value of any other act, conditional on p. If you<br />

thought that, you’d certainly be in agreement with the orthodox decision-theoretic<br />

treatment of this problem.<br />

In the abstract statement of the situation above, I said that one of the options<br />

would be harmful, but I didn’t say who it would be harmful to. I think this matters. I<br />

think what I called the orthodox treatment of the situation is correct when the harm<br />

accrues to the person making the decision. But when the harm accrues to another<br />

person, particularly when it accrues to a person that the agent has a duty of care<br />

towards, then I think the orthodox treatment isn’t quite right.<br />

My reasons for this go back to Jackson’s original discussion of the puzzle. Let the<br />

agent be a doctor, the actions ϕ and ψ be her prescribing different medication to a<br />

patient, and the harm a severe allergic reaction that the patient will have to one of<br />

the medications. Assume that she can run a test that will tell her which medication<br />

the patient is allergic to, but the test will take a day. Assume that the patient will die<br />

in a month without either medication; that’s the disaster that must be averted. And<br />

assume that the patient is in some discomfort that either medication would relieve; that's the small cost of finding out which medication is risky. Assume finally that<br />

there is no chance the patient will die in the day it takes to run the test, so the cost of<br />

running the test is really nominal.<br />

A good doctor in that situation will find out which medication the patient is<br />

allergic to before prescribing either medicine. It would be reckless to prescribe a medicine<br />

that is unnecessary and that the patient might be allergic to. It is worse than reckless<br />

if the patient is actually allergic to the medicine prescribed, and the doctor harms the<br />

patient. But even if she’s lucky and prescribes the ‘right’ medication, the recklessness<br />

remains. It was still, it seems, the wrong thing for her to do.<br />

All of that is in Jackson’s discussion of the case, though I’m not sure he’d agree<br />

with the way I’m about to incorporate these ideas into the formal decision theory.<br />

Even under the assumption that p, prescribing ψ is still wrong, because it is reckless.<br />

That should be incorporated into the values we ascribe to different actions in different<br />

circumstances. The way I do it is to associate the value of each action, in each<br />

circumstance, with its actual expected value. So the decision table for the doctor’s<br />

decision looks something like this.<br />

19 Some consequentialists say that what the agent should do depends on whether p is true. If p is true,<br />

she should do ψ, and if p is false she should do ϕ. As we’ll see, I have reasons for thinking this is rather<br />

radically wrong.



                             p     ¬p<br />
Do ϕ                        -5     -5<br />
Do ψ                        -5     -5<br />
Find out which is harmful   -1     -1<br />

In fact, the doctor is making a decision under certainty. She knows that the value of<br />

prescribing either medicine is -5, and the value of running the tests is -1, so she should<br />

run the tests.<br />

In general, when an agent has a duty to maximise the expected value of some<br />

quantity Q, then the value that goes into the agent’s decision table in a cell is not the<br />

value of Q in the world-action pair that cell represents. Rather, it’s the expected<br />

value of Q given that world-action pair. In situations like this one where the relevant<br />

facts (e.g., which medicine the patient is allergic to) don’t affect the evidence the agent<br />

has, the decision is a decision under certainty. This is all as things should be. When<br />

you have obligations that are drawn in terms of the expected value of a variable, the<br />

actual values of that variable cease to be directly relevant to the decision problem.<br />
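The transformation can be sketched with the doctor’s numbers from above. The function and dictionary names here are mine, not the paper’s; the point is just that every cell is replaced by an expectation computed from the evidence.<br />

```python
# When the agent's duty is drawn in terms of the expected value of Q, each
# cell of the decision table gets the expected value of Q given that act,
# not Q's actual value in that world.
pr_p = 0.5                        # the evidence leaves Pr(p) at 1/2
actual_value = {                  # actual value of Q: (if p, if not-p)
    "prescribe phi": (-10, 0),
    "prescribe psi": (0, -10),
    "run the test":  (-1, -1),
}

def expected_q(act):
    v_if_p, v_if_not_p = actual_value[act]
    return pr_p * v_if_p + (1 - pr_p) * v_if_not_p

# Since which medicine is harmful doesn't affect the doctor's evidence, the
# transformed cell value is the same whether p is true or false: the doctor
# faces a decision under certainty.
transformed = {act: (expected_q(act), expected_q(act)) for act in actual_value}
assert transformed["prescribe phi"] == (-5.0, -5.0)
assert transformed["run the test"] == (-1.0, -1.0)
```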

One upshot of these considerations is that when moral and epistemic considerations<br />

get entangled, for example when agents have a moral duty not to take certain<br />

kinds of risks, it can get tricky to apply the theory of belief developed here. In a<br />

separate paper (“Defending Interest-Relative Invariantism”) I’ve shown how this idea<br />

can help respond to some criticisms of similar views raised by Jessica Brown (2008).<br />

3 Fantl and McGrath on Interest-Relativity<br />

Jeremy Fantl and Matthew McGrath (2009) have argued that my interest-relative theory<br />

of belief cannot explain all of the interest-relativity in epistemology. I’m going<br />

to agree with their conclusion, but not with their premises. At this point you might<br />

suspect, dear reader, that you’re about to be drawn into a micro-battle between two<br />

similar but not quite identical explanations of the same (alleged) phenomena. And<br />

you wouldn’t be entirely mistaken. But I think seeing why Fantl and McGrath’s objections<br />

to my theory fail will show us something interesting about the relationship<br />

between interests and knowledge. In particular, it will show us that interests can<br />

generate certain kinds of defeaters of claims to knowledge.<br />

Fantl and McGrath’s primary complaint against the interest-relative theory of<br />

belief I developed in Weatherson (2005a) and in the previous section is that it is<br />

not strong enough to entail principles such as (JJ).<br />

(JJ) If you are justified in believing that p, then p is warranted enough to justify you<br />

in ϕ-ing, for any ϕ. (Fantl and McGrath, 2009, 99)<br />

In practice, what this means is that there can’t be a salient p,ϕ such that:<br />

• The agent is justified in believing p;<br />

• The agent is not warranted in doing ϕ; but<br />

• If the agent had more evidence for p, and nothing else, the agent would be<br />

warranted in doing ϕ.



That is, once you’ve got enough evidence, or warrant, for justified belief in p, then<br />

you’ve got enough evidence for p as matters for any decision you face. This seems<br />

intuitive, and Fantl and McGrath back up its intuitiveness with some nicely drawn<br />

examples.<br />

Now it’s true that the interest-relative theory of belief cannot be used to derive<br />

(JJ), at least on the reading of it I just provided. But that’s because on the intended<br />

reading, it is false, and the interest-relative theory is true. So the fact that (JJ) can’t<br />

be derived is a feature, not a bug. The problem arises because of cases like that of<br />

Coraline. Here’s what we’re going to stipulate about Coraline.<br />

• She knows that p and q are independent, so her credence in any conjunction<br />

where one conjunct is a member of { p,¬p} and the other is a member<br />

of {q,¬q} will be the product of her credences in the conjuncts.<br />

• Her credence in p is 0.99, just as the evidence supports.<br />

• Her credence in q is also 0.99. This is unfortunate, since the rational credence<br />

in q given her evidence is 0.01.<br />

• She has a choice between taking and declining a bet with the following payoff<br />

structure. 20 (Assume that the marginal utility of money is close enough to constant<br />

that expected dollar returns correlate more or less precisely with expected<br />

utility returns.)<br />

               p ∧ q    p ∧ ¬q       ¬p<br />
Take bet        $100        $1   −$1000<br />
Decline bet        0         0        0<br />

As can be easily computed, the expected utility of taking the bet given her credences<br />

is positive; it is just over $88. And Coraline takes the bet. She doesn’t compute the<br />

expected utility, but she is sensitive to it. 21 That is, had the expected utility given her<br />

credences been close to 0, she would have not acted until she made a computation.<br />

But from her perspective this looks like basically a free $100, so she takes it. Happily,<br />

this all turns out well enough, since p is true. But it was a dumb thing to do. The<br />

expected utility of taking the bet given her evidence is negative; it is a little under -$8.<br />

So she isn’t warranted, given her evidence, in taking the bet.<br />
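Both expected returns can be written out explicitly. This is a sketch whose only inputs are the stipulated payoffs and probabilities; the helper name is mine.<br />

```python
# Coraline's bet. p and q are independent, so the probability of each column
# factors. Her credences: Pr(p) = Pr(q) = 0.99. The evidential probabilities:
# Pr(p) = 0.99 but Pr(q) = 0.01. Utility is stipulated to track dollars.
payoff_p_and_q = 100
payoff_p_and_not_q = 1
payoff_not_p = -1000

def expected_return(pr_p, pr_q):
    return (pr_p * pr_q * payoff_p_and_q
            + pr_p * (1 - pr_q) * payoff_p_and_not_q
            + (1 - pr_p) * payoff_not_p)

# By her credences the bet looks like a bargain: just over $88.
assert abs(expected_return(0.99, 0.99) - 88.0199) < 1e-6
# By her evidence it is a losing bet: a little under -$8.
assert abs(expected_return(0.99, 0.01) - (-8.0299)) < 1e-6
```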

I also claim the following three things are true of her.<br />

20 I’m more interested in the abstract structure of the case than in whether any real-life situation is<br />

modelled by just this structure. But it might be worth noting the rough kind of case where this situation can arise. So let’s say Coraline has a particular bank account that is uninsured, but which is currently paying 10% interest, and she is deciding whether to deposit another $1000 in it. Then p is the<br />

proposition that the bank will not collapse, and she’ll get her money back, and q is the proposition that<br />

the interest will stay at 10%. To make the model exact, we have to also assume that if the interest rate<br />

on her account doesn’t stay at 10%, it falls to 0.1%. And we have to assume that the interest rate and the<br />

bank’s collapse are probabilistically independent. Neither of these are at all realistic, but a realistic case<br />

would simply be more complicated, and the complications would obscure the philosophically interesting<br />

point.<br />

21 If she did compute the expected utility, then one of the things that would be salient for her is the<br />

expected utility of the bet. And the expected utility of the bet is different to its expected utility given p.<br />

So if that expected utility is salient, she doesn’t believe p. And it’s going to be important to what follows<br />

that she does believe p.



1. p is not justified enough to warrant her in taking the bet.<br />

2. She believes p. 22<br />

3. This belief is rational.<br />

The argument for 1 is straightforward. She isn’t warranted in taking the bet, so p<br />

isn’t sufficiently warranted to justify it. This is despite the fact that p is obviously<br />

relevant. Indeed, given p, taking the bet strictly dominates declining it. But still, p<br />

doesn’t warrant taking this bet, because nothing warrants taking a bet with negative<br />

expected utility. Had the rational credence in p been higher, then the bet would have<br />

been reasonable. Had the reasonable credence in p been, say, 0.9999, then she would<br />

have been reasonable in taking the bet, and using p as a reason to do so. So there’s a<br />

good sense in which p simply isn’t warranted enough to justify taking the bet. 23<br />

The argument for 2 is that she has a very high credence in p, this credence is<br />

grounded in the evidence in the right way, and it leads her to act as if p is true, e.g. by<br />

taking the bet. It’s true that her credence in p is not 1, and if you think credence 1 is<br />

needed for belief, then you won’t like this example. But if you think that, you won’t<br />

think there’s much connection between (JJ) and pragmatic conditions in epistemology<br />

either. So that’s hardly a position a defender of Fantl and McGrath’s position can<br />

hold. 24<br />

The argument for 3 is that her attitude towards p tracks the evidence perfectly.<br />

She is making no mistakes with respect to p. She is making a mistake with respect to<br />

q, but not with respect to p. So her attitude towards p, i.e. belief, is rational.<br />

I don’t think the argument here strictly needs the assumption I’m about to make,<br />

but I think it’s helpful to see one very clear way to support the argument of the last<br />

paragraph. The working assumption of my project on interest-relativity has been that<br />

talking about beliefs and talking about credences are simply two ways of modelling<br />

the very same things, namely minds. If the agent both has a credence 0.99 in p, and<br />

believes that p, these are not two different states. Rather, there is one state of the<br />

agent, and two different ways of modelling it. So it is implausible, if not incoherent,<br />

to apply different valuations to the state depending on which modelling tools we<br />

choose to use. That is, it’s implausible to say that while we’re modelling the agent<br />

with credences, the state is rational, but when we change tools, and start using beliefs,<br />

the state is irrational. Given this outlook on beliefs and credences, premise 3 seems<br />

to follow immediately from the setup of the example.<br />

So that’s the argument that (JJ) is false. And if it’s false, the fact that the interest-relative theory doesn’t entail it is a feature, not a bug. But there are a number of<br />

22 In terms of the example discussed in the previous footnote, she believes that the bank will survive, i.e.,<br />

that she’ll get her money back if she deposits it.<br />

23 I’m assuming here that the interpretation of (JJ) that I gave above is correct, though actually I’m not<br />

entirely sure about this. For present purposes, I plan to simply interpret (JJ) that way, and not return to<br />

exegetical issues.<br />

24 We do have to assume that ¬q is not so salient that attitudes conditional on ¬q are relevant to determining<br />

whether she believes p. That’s because conditional on ¬q, she prefers to not take the bet, but<br />

conditional on ¬q ∧ p, she prefers to take the bet. But if she is simply looking at this as a free $100, then<br />

it’s plausible that ¬q is not salient.



possible objections to that position. I’ll spend the rest of this section, and this paper,<br />

going over them. 25<br />

Objection: The following argument shows that Coraline is not in fact justified in<br />

believing that p.<br />

1. p entails that Coraline should take the bet, and Coraline knows this.<br />

2. If p entails something, and Coraline knows this, and she justifiably believes p,<br />

she is in a position to justifiably believe the thing entailed.<br />

3. Coraline is not in a position to justifiably believe that she should take the bet.<br />

C. So, Coraline does not justifiably believe that p<br />

Reply: The problem here is that premise 1 is false. What’s true is that p entails that<br />

Coraline will be better off taking the bet than declining it. But it doesn’t follow that<br />

she should take the bet. Indeed, it isn’t actually true that she should take the bet,<br />

even though p is actually true. Not only is the entailment claim false, the world of the<br />

example is a counterinstance to it.<br />

It might be controversial to use this very case to reject premise 1. But the falsity<br />

of premise 1 should be clear on independent grounds. What p entails is that Coraline<br />

will be best off by taking the bet. But there are lots of things that will make me better<br />

off that I shouldn’t do. Imagine I’m standing by a roulette wheel, and the thing that<br />

will make me best off is betting heavily on the number that will actually come up. It<br />

doesn’t follow that I should do that. Indeed, I should not do it. I shouldn’t place any<br />

bets at all, since all the bets have a highly negative expected return.<br />

In short, all p entails is that taking the bet will have the best consequences. Only<br />

a very crude kind of consequentialism would identify what I should do with what<br />

will have the best returns, and that crude consequentialism isn’t true. So p doesn’t<br />

entail that Coraline should take the bet. So premise 1 is false.<br />

Objection: Even though p doesn’t entail that Coraline should take the bet, it does<br />

provide inductive support for her taking the bet. So if she could justifiably believe p,<br />

she could justifiably (but non-deductively) infer that she should take the bet. Since<br />

she can’t justifiably infer that, she isn’t justified in taking the bet.<br />

Reply: The inductive inference here looks weak. One way to make the inductive<br />

inference work would be to deduce from p that taking the bet will have the best<br />

outcomes, and infer from that that the bet should be taken. But the last step doesn’t<br />

even look like a reliable ampliative inference. The usual situation is that the best<br />

outcome comes from taking an ex ante unjustifiable risk.<br />

It may seem better to use p combined with the fact that conditional on p, taking<br />

the bet has the highest expected utility. But actually that’s still not much of a reason<br />

to take the bet. Think again about cases, completely normal cases, where the action<br />

25 Thanks here to a long blog comments thread with Jeremy Fantl and Matthew McGrath for making<br />

me formulate these points much more carefully. The original thread is at http://tar.weatherson.org/<br />

2010/03/31/do-justified-beliefs-justify-action/.



with the best outcome is an ex ante unjustifiable risk. Call that action ϕ, and let Bϕ<br />

be the proposition that ϕ has the best outcome. Then Bϕ is true, and conditional<br />

on Bϕ, ϕ has an excellent expected return. But doing ϕ is still running a dumb risk.<br />

Since these kinds of cases are normal, it seems it will very often be the case that this<br />

form of inference leads from truth to falsity. So it’s not a reliable inductive inference.<br />

More generally, we should worry quite a lot about Coraline’s ability to draw<br />

inductive inferences about the propriety of the bet here. Unlike deductive inferences,<br />

inductive inferences can be defeated by a whole host of factors. If I’ve seen a lot of<br />

swans, in a lot of circumstances, and they’ve all been blue, that’s a good reason to<br />

think the next swan I see will be blue. But it ceases to be a reason if I am told by a<br />

clearly reliable testifier that there are green swans in the river outside my apartment.<br />

And that’s true even if I dismiss the testifier because I think he has a funny name, and<br />

I don’t trust people with funny names. Now although Coraline has evidence for p,<br />

she also has a lot of evidence against q, evidence that she is presumably ignoring since<br />

her credence in q is so high. Any story about how Coraline can reason from p to<br />

the claim that she should take the bet will have to explain how her irrational<br />

attraction to q doesn’t serve as a defeater, and I don’t see how that could be done.<br />

Objection: In the example, Coraline isn’t just in a position to justifiably believe p,<br />

she is in a position to know that she justifiably believes it. And from the fact that she<br />

justifiably believes p, and the fact that if p, then taking the bet is the best option,<br />

she can infer that she should take the bet.<br />

Reply: It’s possible at this point that we get to a dialectical impasse. I think this<br />

inference is non-deductive, because I think the example we’re discussing here is one<br />

where the premises are true and the conclusion false. Presumably someone who<br />

doesn’t like the example will think that it is a good deductive inference.<br />

What makes the objection useful is that, unlike the inductive inference mentioned<br />

in the previous objection, this at least has the form of a good inductive inference.<br />

Whenever you justifiably believe p, and the best outcome given p is gained by doing<br />

ϕ, then usually you should ϕ. Since Coraline knows the premises are true, ceteris<br />

paribus that gives her a reason to believe the conclusion is probably true.<br />

But other things aren’t at all equal. In particular, this is a case where Coraline has<br />

a highly irrational credence concerning a proposition whose probability is highly<br />

relevant to the expected utility of possible actions. Or, to put things another way,<br />

an inference from something to something else it is correlated with can be defeated<br />

by related irrational beliefs. (That’s what the swan example above shows.) So if<br />

Coraline tried to infer this way that she should take the bet, her irrational confidence<br />

in q would defeat the inference.<br />

The objector might think I am being uncharitable here. The objection doesn’t say<br />

that Coraline’s knowledge provides an inductive reason to take the bet. Rather, they<br />

say, it provides a conclusive reason to take the bet. And conclusive reasons cannot<br />

be defeated by irrational beliefs elsewhere in the web. Here we reach an impasse. I<br />

say that knowledge that you justifiably believe p cannot provide a conclusive reason<br />

to bet on p because I think Coraline knows she justifiably believes p, but does not



have a conclusive reason to bet on p. That is, I think the premise the objector uses<br />

here is false because I think (JJ) is false. The person who believes in (JJ) won’t be so<br />

impressed by this move.<br />

Having said all that, the more complicated example at the end of <strong>Weatherson</strong><br />

(2005a) was designed to raise the same problem without the consequence that if p is<br />

true, the bet is sure to return a positive amount. In that example, conditionalising on<br />

p means the bet has a positive expected return, but still possibly a negative return.<br />

But in that case (JJ) still failed. If it is too much to accept that there are cases where<br />
<br />
an agent justifiably believes p, and knows this, but can’t rationally bet on p, that<br />

more complicated example might be more persuasive. Otherwise, I concede that<br />

someone who believes (JJ) and thinks rational agents can use it in their reasoning<br />

will not think that a particular case is a counterexample to (JJ).<br />

Objection: If Coraline were ideal, then she wouldn’t believe p. That’s because if<br />

she were ideal, she would have a lower credence in q, and if that were the case, her<br />

credence in p would have to be much higher (close to 0.999) in order to count as a<br />

belief. So her belief is not justified.<br />

Reply: The premise here, that if Coraline were ideal she would not believe that p,<br />

is true. The conclusion, that she is not justified in believing p, does not follow.<br />

It’s always a mistake to identify what should be done with what is done in ideal circumstances.<br />

This is something that has long been known in economics. The locus<br />

classicus of the view that this is a mistake is Lipsey and Lancaster (1956-1957). A similar<br />

point has been made in ethics in papers such as Watson (1977) and Kennett and<br />

Smith (1996a,b). And it has been extended to epistemology by Williamson (1998).<br />

All of these discussions have a common structure. It is first observed that the<br />

ideal is both F and G. It is then stipulated that whatever happens, the thing being<br />

created (either a social system, an action, or a cognitive state) will not be F. It is then<br />

argued that given the stipulation, the thing being created should not be G. That is<br />

not just the claim that we shouldn’t aim to make the thing be G. It is, rather, that in<br />

many cases being G is not the best way to be, given that F-ness will not be achieved.<br />

Lipsey and Lancaster argue (in an admittedly idealised model) that it is actually<br />

quite unusual for G to be best given that the system being created will not be F.<br />

It’s not too hard to come up with examples that fit this structure. Following<br />

(Williamson, 2000a, 209), we might note that I’m justified in believing that there are<br />

no ideal cognitive agents, although were I ideal I would not believe this. Or imagine a<br />

student taking a ten question mathematics exam who has no idea how to answer the<br />

last question. She knows an ideal student would correctly answer an even number<br />

of questions, but that’s no reason for her to throw out her good answer to question<br />

nine. In general, once we have stipulated one departure from the ideal, there’s no<br />

reason to assign any positive status to other similarities to the ideal. In particular,<br />

given that Coraline has an irrational view towards q, she won’t perfectly match up<br />

with the ideal, so there’s no reason it’s good to agree with the ideal in other respects,<br />

such as not believing p.



Stepping back a bit, there’s a reason the interest-relative theory says that the ideal<br />

and justification come apart right here. On the interest-relative theory, like on any<br />

pragmatic theory of mental states, the identification of mental states is a somewhat<br />

holistic matter. Something is a belief in virtue of its position in a much broader<br />

network. But the evaluation of belief is (relatively) atomistic. That’s why Coraline<br />

is justified in believing p, although if she were wiser she would not believe it. If she<br />

were wiser, i.e., if she had the right attitude towards q, the very same credence in p<br />

would not count as a belief. Whether her state counts as a belief, that is, depends<br />

on wide-ranging features of her cognitive system. But whether the state is justified<br />

depends on more local factors, and in local respects she is doing everything right.<br />

Objection: Since the ideal agent in Coraline’s position would not believe p, it follows<br />

that there is no propositional justification for p. Moreover, doxastic justification<br />

requires propositional justification. 26 So Coraline is not doxastically justified in believing<br />

p. That is, she isn’t justified in believing p.<br />

Reply: I think there are two ways of understanding ‘propositional justification’. On<br />

one of them, the first sentence of the objection is false. On the other, the second<br />

sentence is false. Neither way does the objection go through.<br />

The first way is to say that p is propositionally justified for an agent iff that agent’s<br />

evidence justifies a credence in p that is high enough to count as a belief given the<br />

agent’s other credences and preferences. On that understanding, p is propositionally<br />

justified by Coraline’s evidence. For all that evidence has to do to make p justified is<br />

to support a credence a little greater than 0.9. And by hypothesis, the evidence does<br />

that.<br />

The other way is to say that p is propositionally justified for an agent iff that<br />

agent’s evidence justifies a credence in p that is high enough to count as a belief given<br />

the agent’s preferences and the credences supported by that evidence. On this reading, the<br />

objection reduces to the previous objection. That is, the objection basically says that<br />

p is propositionally justified for an agent iff the ideal agent in her situation would believe<br />

it. And we’ve already argued that that is compatible with doxastic justification.<br />

So either the objection rests on a false premise, or it has already been taken care of.<br />

Objection: If Coraline is justified in believing p, then Coraline can use p as a premise<br />

in practical reasoning. If Coraline can use p as a premise in practical reasoning, and<br />

p is true, and her belief in p is not Gettiered, then she knows p. By hypothesis, her<br />

belief is true, and her belief is not Gettiered. So she should know p. But she doesn’t<br />

know p. So by several steps of modus tollens, she isn’t justified in believing p. 27<br />

Reply: Like the previous objection, this one turns on an equivocation, this time over<br />

the neologism ‘Gettiered’. Some epistemologists use this to simply mean that a belief<br />

is justified and true without constituting knowledge. By that standard, the third<br />

sentence is false. Or, at least, we haven’t been given any reason to think that it is true.<br />

26 See Turri (2010) for a discussion of recent views on the relationship between propositional and doxastic<br />

justification. This requirement seems to be presupposed throughout that literature.<br />

27 Compare the ‘subtraction argument’ on page 99 of Fantl and McGrath (2009).



Given everything else that’s said, the third sentence is a raw assertion that Coraline<br />

knows that p, and I don’t think we should accept that.<br />

The other way epistemologists sometimes use the term is to pick out justified true<br />

beliefs that fail to be knowledge for the reasons that the beliefs in the original examples<br />

from Gettier (1963) fail to be knowledge. That is, it picks out a property that<br />

beliefs have when they are derived from a false lemma, or whatever similar property<br />

is held to be doing the work in the original Gettier examples. Now on this reading,<br />

Coraline’s belief that p is not Gettiered. But it doesn’t follow that it is known.<br />

There’s no reason, once we’ve given up on the JTB theory of knowledge, to think<br />

that whatever goes wrong in Gettier’s examples is the only way for a justified true<br />

belief to fall short of knowledge. It could be that there’s a practical defeater, as in this<br />

case. So the second sentence of the objection is false, and the objection again fails.<br />

But note I’m conceding to the objector that Coraline does not know p. That’s<br />

important, because it reveals another way in which knowledge is interest-relative.<br />

We can see that Coraline does not know that p by noting that if she did know p, we<br />

could represent her decision like this:<br />

q ¬q<br />
<br />
Take bet $100 $1<br />
<br />
Decline bet $0 $0<br />

And now she should clearly take the bet, since it is a dominating option. Her irrational<br />

credence in q simply wouldn’t matter. But she can’t rationally take the bet, so<br />

this table must be wrong, so she doesn’t know p.<br />
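The dominance reasoning here can be checked mechanically. The following sketch (the Python framing is mine, not the paper’s) encodes the table above and verifies that, with p taken as given, taking the bet is a dominating option:<br />

```python
# Coraline's decision with p taken as given; payoffs from the table above.
# States are q and not-q; dominance means: at least as good in every state,
# and strictly better in at least one.
payoffs = {
    "take":    {"q": 100, "not-q": 1},
    "decline": {"q": 0,   "not-q": 0},
}

def dominates(a, b, table):
    """Return True iff option a is weakly better than b in every state
    and strictly better in at least one."""
    states = table[a].keys()
    weakly_better = all(table[a][s] >= table[b][s] for s in states)
    strictly_better = any(table[a][s] > table[b][s] for s in states)
    return weakly_better and strictly_better

print(dominates("take", "decline", payoffs))  # → True
```

Because dominance holds whatever probability is assigned to q, Coraline’s irrational credence in q would be idle if she knew p; that is why the table, and with it the knowledge claim, must be rejected.<br />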

This seems odd. Why should an irrational credence in q defeat knowledge of<br />

p? The answer is that beliefs have to sufficiently cohere with our other beliefs to be<br />

knowledge. If I have a very firm belief that ¬p, and also a belief that p, the latter<br />
<br />
belief cannot be knowledge even if it is grounded in the facts in just the right way.<br />
<br />
The contradictory belief that ¬p is a doxastic defeater.<br />

Now in practice almost all of our beliefs are in some tension with some other<br />

subset of our beliefs. Unless we are perfectly coherent, which we never are, there will<br />

be lots of ways to argue that any particular belief does not sufficiently cohere with<br />

our broader doxastic state, and hence is defeated from being knowledge. There must<br />

be some restrictions on when incoherence with some part of the rest of one’s beliefs<br />

defeats knowledge.<br />

I think an interest-relative story here is most likely to succeed. If we said Coraline<br />

knew that p, that would produce a tension between the fact that she does (rationally)<br />

believe that taking the bet is best given p, and that she should believe that declining<br />

the bet is not best simpliciter. And that matters because whether to take the bet or<br />

not is a live decision for her. That is, incoherencies or irrationalities that manifest in<br />

decision problems that are salient can defeat knowledge.



Put another way, an agent only knows p if we can properly model any decision<br />

she’s really facing with a decision table where p is taken as given. 28 Crediting Coraline<br />

with knowing p will mean we get the table for this decision, i.e., whether to take<br />

this very bet, wrong. And that’s a live decision for her. Those two facts together<br />

entail that she doesn’t know p. Neither alone would suffice for defeating knowledge<br />

of p. And that’s a part of the explanation for the interest-relativity of knowledge, a<br />

part I left out of the story in my (2005a).<br />

28 Or, at least, we can model any such decision where standard decision theory applies. Perhaps we’ll<br />

have to exclude cases where one of the options violates a deontological constraint. As noted in section 1,<br />

the argument for interest-relativity of knowledge is neutral on how to handle such cases.


Defending Interest-Relative Invariantism<br />

In recent years a number of authors have defended the interest-relativity of various<br />

epistemological claims, such as claims of the form S knows that p, or S has a justified<br />

belief that p. Views of this form are floated by John Hawthorne (2004b), and endorsed<br />

by Jeremy Fantl and Matthew McGrath (2002; 2009), Jason Stanley (2005) and <strong>Brian</strong><br />

<strong>Weatherson</strong> (2005a). The various authors differ quite a lot in how much interest-relativity<br />
<br />
they allow, and even more in their purported explanations for the interest-relativity,<br />

but what is common is the endorsement of some kind of interest-relativity<br />

in statements like S knows that p, or S has a justified belief that p.<br />

These views have, quite naturally, drawn a range of criticisms. The primary purpose<br />

of this paper is to respond to these criticisms and, as it says on the tin, defend<br />

interest-relative invariantism, or IRI for short. 1 But to defend the view, I first need<br />

to clarify three features of the view. The best version of IRI, and the only one I’m<br />

interested in defending, has these three features:<br />

• Odds, not stakes, are primarily what matter to knowledge.<br />

• Interests create defeaters.<br />

• Interest-relativity is an existential claim; it says that interests sometimes matter,<br />

not that they always do.<br />

In the first three sections, I’ll say more about each of these three points. In the next<br />

six sections, I’ll defend IRI against criticisms from five different authors.<br />

1 Odds and Stakes<br />

It is common to describe IRI as a theory where in ‘high stakes’ situations, more<br />

evidence is needed for knowledge than in ‘low stakes’ situations. But this is at best<br />

misleading. What really matters are the odds on any bet-like decision the agent faces<br />

with respect to the target proposition. More precisely, interests affect belief because<br />

whether someone believes p depends inter alia on whether their credence in p is high<br />

enough that any bet on p they actually face is a good bet. Raising the stakes of any<br />

bet on p does not directly change that, but changing the odds of the bets on p they<br />

face does change it. Now in practice due to the declining marginal utility of material<br />

goods, high stakes situations will usually be situations where an agent faces long odds.<br />

But it is the odds that matter to knowledge, not the stakes.<br />

Some confusion on this point may have been caused by the Bank Cases that Stanley<br />

uses, and the Train Cases that Fantl and McGrath use, to motivate IRI. In those<br />

† Penultimate draft only. Under review at AJP.<br />

1 ‘Interest-relative invariantism’ is Jason Stanley’s term for the view that ‘knows’ is not context-sensitive,<br />

but whether S knows that p might depend on S’s interests. Some critics, such as Michael<br />

Blome-Tillmann (2009a), call IRI ‘subject-sensitive invariantism’. This is an unfortunate moniker. The<br />

only subject-insensitive theory of knowledge has it that for any S and T, S knows that p iff T knows that p.<br />

The view the critics target certainly isn’t defined in opposition to this generalisation.



cases, the authors lengthen the odds the relevant agents face by increasing the potential<br />

losses the agent faces by getting the bet wrong. But we can make the same point<br />

by decreasing the amount the agent stands to gain by taking the bet. Let’s go through<br />

a pair of cases, which I’ll call the Map Cases, that illustrate this.<br />

High Cost Map: Zeno is walking to the Mysterious Bookshop in lower Manhattan.<br />

He’s pretty confident that it’s on the corner of Warren Street and West<br />

Broadway. But he’s been confused about this in the past, forgetting whether<br />

the east-west street is Warren or Murray, and whether the north-south street<br />

is Greenwich, West Broadway or Church. In fact he’s right about the location<br />

this time, but he isn’t justified in having a credence in his being correct<br />

greater than about 0.95. While he’s walking there, he has two options. He<br />

could walk to where he thinks the shop is, and if it’s not there walk around for<br />

a few minutes to the nearby corners to find where it is. Or he could call up<br />

directory assistance, pay $1, and be told where the shop is. Since he’s confident<br />

he knows where the shop is, and there’s little cost to spending a few minutes<br />

walking around if he’s wrong, he doesn’t do this, and walks directly to the<br />

shop.<br />

Low Cost Map: Just like the previous case, except that Zeno has a new phone with<br />

more options. In particular, his new phone has a searchable map, so with a few<br />

clicks on the phone he can find where the store is. Using the phone has some<br />

very small costs. For example, it distracts him a little, which marginally raises<br />

the likelihood of bumping into another pedestrian. But the cost is very small<br />

compared to the cost of getting the location wrong. So even though he is very<br />

confident about where the shop is, he double checks while walking there.<br />

I think the Map Cases are like the Bank Cases, Train Cases etc., in all important<br />

respects. I think Zeno knows where the shop is in High Cost Map, and doesn’t know<br />

in Low Cost Map. And he doesn’t know in Low Cost Map because the location of the<br />

shop has suddenly become the subject matter of a bet at very long odds. You should<br />

think of Zeno’s not checking the location of the shop on his phone-map as a bet on<br />

the location of the shop. If he wins the bet, he wins a few seconds of undistracted<br />

strolling. If he loses, he has to walk around a few blocks looking for a store. The<br />

disutility of the loss seems easily twenty times greater than the utility of the gain,<br />

and by hypothesis the probability of winning the bet is no greater than 0.95. So he<br />

shouldn’t take the bet. Yet if he knew where the store was, he would be justified in<br />

taking the bet. So he doesn’t know where the store is. Now this is not a case where<br />

higher stakes defeat knowledge. If anything, the stakes are lower in Low Cost Map.<br />

But the relevant odds are longer, and that’s what matters to knowledge. 2<br />
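The arithmetic behind Zeno’s bet can be made explicit. In this sketch the utility of winning is set to 1 unit and the disutility of losing to 20 units; the 1:20 ratio and the 0.95 credence are from the case description, but the absolute units are my illustrative assumption:<br />

```python
# Zeno's implicit bet in Low Cost Map: walking straight there without
# double-checking the phone-map. Winning gains a few seconds of undistracted
# strolling; losing costs a walk around several blocks, estimated at twenty
# times the gain. The units are arbitrary; only the ratio matters.
gain = 1.0        # utility of winning the bet
loss = 20.0       # disutility of losing the bet
credence = 0.95   # Zeno's maximum justified credence that he is right

expected_value = credence * gain - (1 - credence) * loss
print(expected_value < 0)  # → True: the bet has negative expected value
```

Even at the most favourable justified credence, 0.95 × 1 − 0.05 × 20 = −0.05 < 0, so declining the bet (i.e., checking the map) maximises expected utility.<br />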

2 Note that I’m not claiming that it is intuitive that Zeno has knowledge in High Cost Map, but not<br />

in Low Cost Map. Nor am I claiming that we should believe IRI because it gets the Map Cases right. In<br />

fact, I don’t believe either of those things. Instead I believe Zeno has knowledge in High Cost Map and not<br />

in Low Cost Map because I believe IRI is correct, and that’s what IRI says about the case. It is sometimes<br />

assumed, e.g., in the experimental papers I’ll discuss in section 4, that pairs of cases like these are meant<br />

to motivate, and not just illustrate, IRI. I can’t speak for everyone’s motivations, but I’m only using these<br />

cases as illustrations, not motivations.



2 Interests are Defeaters<br />

Interests can have a somewhat roundabout effect on knowledge. The IRI story goes<br />

something like this. If the agent has good but not completely compelling evidence<br />

for p, that is sometimes but not always sufficient for knowledge that p if everything<br />

else goes right. It isn’t sufficient if they face a choice where the right thing to do<br />

is different to the right thing to do conditional on p. That is, if adding p to their<br />

background information would make a genuinely bad choice look like a good choice,<br />

they don’t know that p. The reason is that if they did know p, they’d be in one<br />

of two unhappy states. The first such state is that they’d believe the choice is bad<br />

despite believing that conditional on something they believe, namely p, the choice is<br />

good. That’s incoherent, and the incoherence defeats knowledge that p. The second<br />

such state is that they’d believe the choice is good. That’s also irrational, and the<br />

irrationality defeats knowledge that p. 3<br />

Cases where knowledge is defeated because if the agent did know p, that would<br />

lead to problems elsewhere in their cognitive system, have a few quirky features. In<br />

particular, whether the agent knows p can depend on very distant features. Consider<br />

the following kind of case.<br />

Confused Student<br />

Con is systematically disposed to affirm the consequent. That is, if he<br />

notices that he believes both p and q → p, he’s disposed to either infer<br />

q, or if that’s impermissible given his evidence, to ditch his belief in the<br />

conjunction of p and q → p. Con has completely compelling evidence<br />

for both q → p and ¬q. He has good but less compelling evidence for p.<br />

And this evidence tracks the truth of p in just the right way for knowledge.<br />

On the basis of this evidence, Con believes p. Con has not noticed<br />

that he believes both p and q → p. If he did, he’d unhesitatingly drop<br />

his belief that p, since he’d realise the alternatives (given his dispositions)<br />

involved dropping belief in a compelling proposition. Two questions:<br />

• Does Con know that p?<br />

• If Con were to think about the logic of conditionals, and reason<br />

himself out of the disposition to affirm the consequent, would he<br />

know that p?<br />

I think the answer to the first question is No, and the answer to the second question<br />

is Yes. As it stands, Con’s disposition to affirm the consequent is a doxastic defeater<br />

of his putative knowledge that p. Put another way, p doesn’t cohere well enough<br />

with the rest of Con’s views for his belief that p to count as knowledge. To be sure,<br />

p coheres well enough with those beliefs by objective standards, but it doesn’t cohere<br />

3 Note again that I’m not here purporting to argue for IRI, just set out its features. Obviously not<br />

everyone will agree that knowledge that p can be defeated by irrationality or incoherence in closely related<br />

attitudes. But I think it is at least plausible that there are coherence constraints like this on knowledge,<br />

which is all I need for a defence of IRI against critics.



at all by Con’s lights. Until he changes those lights, it doesn’t cohere well enough to<br />

be knowledge.<br />

I don’t expect exactly everyone will agree with those judgments. Some people<br />

will simply reject that this kind of coherence by one’s own lights is necessary for<br />

knowledge. Others might even reject the whole idea of doxastic defeaters. But I think<br />

the picture I’ve just sketched, one which puts reasonably tight coherence constraints<br />

on knowledge, is plausible enough to use in a defence of IRI. It certainly isn’t so<br />

implausible that committing to it amounts to a reductio of one’s views. Yet as we’ll<br />

frequently see below, some criticisms of IRI do suggest that any theory that allows<br />

for these kinds of coherence constraints or doxastic defeaters is thereby shown to be<br />

false. 4 I’m going to take that suggestion to be a reductio of the criticisms.<br />

3 IRI is an Existential Theory<br />

IRI theorists do not typically say that interests are always relevant to knowledge. In<br />

fact, they hardly could be. If p is not true, or the agent has very little evidence for<br />

p, the agent does not know p whatever their interests. But an assumption that seems<br />

to be shared by both some critics and some proponents of IRI is that IRI rests on some<br />

universal epistemic principles, not just on various existential principles. For instance,<br />

Jeremy Fantl and Matthew McGrath use a lot of principles like (JJ) in deriving IRI.<br />

(JJ) If you are justified in believing that p, then p is warranted enough to justify you<br />

in ϕ-ing, for any ϕ. (Fantl and McGrath, 2009, 99)<br />

Now as it turns out, I think (JJ) is false. (I think it fails in cases where the agent is<br />

seriously mistaken about the risks and payoffs involved in doing ϕ, for instance.) But<br />

more importantly, we don’t need anything nearly as strong as this to derive IRI. As<br />

long as there is some sufficiently large range of cases where (JJ) holds, we’ll be able to<br />

establish the existence of some pair of cases which differ in whether the agent knows<br />

that p in virtue of the interests the agent has.<br />

Relatedly, the argument for a version of IRI in <strong>Weatherson</strong> (2005a) makes frequent<br />

appeal to standard Bayesian decision theory. This might suggest that such an<br />

argument stands and falls with the success of consequentialism in decision theory. (I<br />

mean to use ‘consequentialism’ here roughly in the sense that it is used in Hammond<br />

(1988).) But again, this suggestion would be false. If consequentialism is true in some<br />

range of cases, we’ll be able to use similar techniques to the ones used in that paper to<br />

show that there are the kinds of pairs of cases that IRI says exist.<br />

4 This kind of criticism is most pronounced in the arguments I’ll respond to in section 7, but it is<br />

somewhat pervasive.



4 Experimental Objections<br />

As I mentioned in the discussion of the Map Cases, I don’t think the argument for<br />

IRI rests on judgments, or intuitions, about similar cases. Rather, IRI can be independently<br />

motivated by, for instance, reflections on the relationship between belief<br />

and credence. It’s a happy result, in my view, that IRI gets various Bank Cases and<br />

Map Cases right, but not essential to the view. If it turned out that the facts about<br />

the examples were less clear than we thought, that wouldn’t undermine the argument<br />

for IRI, since those facts weren’t part of the best arguments for IRI. But if it turned<br />

out that the facts about those examples were quite different to what IRI predicts, that<br />

may rebut the view, since it would then be shown to make false predictions.<br />

This kind of rebuttal may be suggested by various recent experimental results,<br />

such as the results in May et al. (forthcoming) and Feltz and Zarpentine (forthcoming).<br />

I’m going to concentrate on the latter set of results here, though I think that<br />

what I say will generalise to related experimental work. 5 In fact, I think the experiments<br />

don’t really tell against IRI, because IRI doesn’t make any unambiguous<br />

predictions about the cases at the centre of the experiments. The reason for this is<br />

related to the first point made in section one: it is odds, not stakes, that are most<br />

important.<br />

Feltz and Zarpentine gave subjects related vignettes, such as the following pair.<br />

(Each subject only received one of the pair.)<br />

High Stakes Bridge John is driving a truck along a dirt road in a caravan of trucks.<br />

He comes across what looks like a rickety wooden bridge over a yawning thousand<br />

foot drop. He radios ahead to find out whether other trucks have made it<br />

safely over. He is told that all 15 trucks in the caravan made it over without a<br />

problem. John reasons that if they made it over, he will make it over as well.<br />

So, he thinks to himself, ‘I know that my truck will make it across the bridge.’<br />

Low Stakes Bridge John is driving a truck along a dirt road in a caravan of trucks.<br />

He comes across what looks like a rickety wooden bridge over a three foot<br />

ditch. He radios ahead to find out whether other trucks have made it safely<br />

over. He is told that all 15 trucks in the caravan made it over without a problem.<br />

John reasons that if they made it over, he will make it over as well. So, he<br />

thinks to himself, ‘I know that my truck will make it across the bridge.’ (Feltz<br />

and Zarpentine, forthcoming, ??)<br />

Subjects were asked to evaluate John’s thought. And the result was that 27% of the<br />

participants said that John does not know that the truck will make it across in Low<br />

Stakes Bridge, while 36% said he did not know this in High Stakes Bridge. Feltz<br />

and Zarpentine say that these results should be bad for interest-relativity views. But<br />

it is hard to see just why this is so.<br />

Note that the change in the judgments between the cases goes in the direction that<br />

IRI seems to predict. The change isn’t trivial, even if due to the smallish sample size it<br />

5 Note to editors: Because this work is not yet in press, I don’t have page numbers for any of the quotes<br />

from Feltz and Zarpentine.



isn’t statistically significant in this sample. But should a view like IRI have predicted<br />

a larger change? To figure this out, we need to ask three questions.<br />

1. What are the costs of the bridge collapsing in the two cases?<br />

2. What are the costs of not taking the bet, i.e., not driving across the bridge?<br />

3. What is the rational credence to have in the bridge’s sturdiness given the evidence<br />

John has?<br />

IRI predicts that there is knowledge in Low Stakes Bridge but not in High Stakes<br />

Bridge only if the following equation is true:<br />

C_H / (G + C_H) > x > C_L / (G + C_L)<br />
<br />
where G is the gain the driver gets from taking a non-collapsing bridge rather than<br />
<br />
driving around (or whatever the alternative is), C_H is the cost of being on a collapsing<br />
<br />
bridge in High Stakes Bridge, C_L is the cost of being on a collapsing bridge in Low<br />
<br />
Stakes Bridge, and x is the probability that the bridge will hold. I assume x is<br />

constant between the two cases. If that equation holds, then taking the bridge, i.e.,<br />

acting as if the bridge is safe, maximises expected utility in Low Stakes Bridge but not<br />

High Stakes Bridge. So in High Stakes Bridge, adding the proposition that the bridge<br />

won’t collapse to the agent’s cognitive system produces incoherence, since the agent<br />

won’t (at least rationally) act as if the bridge won’t collapse. So if the equation holds,<br />

the agent’s interest in avoiding C_H creates a doxastic defeater in High Stakes Bridge.<br />
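To illustrate how the inequality divides the two cases, here is a sketch with made-up payoff values (G, C_L, C_H, and the sample credence are my assumptions; the vignettes fix none of them):<br />

```python
# Crossing maximises expected utility iff x * G > (1 - x) * C, i.e. iff the
# credence x that the bridge will hold exceeds the threshold C / (G + C).
def crossing_threshold(gain, cost):
    """Minimum credence that the bridge will hold for crossing to
    maximise expected utility."""
    return cost / (gain + cost)

# Illustrative values only: same gain G in both cases, much larger cost in High.
G, C_L, C_H = 10.0, 5.0, 1000.0
x = 0.9  # a sample credence that the bridge will hold

print(x > crossing_threshold(G, C_L))  # → True: cross in Low Stakes Bridge
print(x > crossing_threshold(G, C_H))  # → False: don't cross in High Stakes Bridge
```

With these numbers C_H/(G + C_H) ≈ 0.99 > 0.9 > C_L/(G + C_L) ≈ 0.33, so the inequality holds and crossing maximises expected utility in the low-stakes case only.<br />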

But does the equation hold? Or, more relevantly, did the subjects of the experiment<br />

believe that the equation holds? None of the four variables has its value<br />

clearly entailed by the story, so we have to guess a little as to what the subjects’ views<br />

would be.<br />

Feltz and Zarpentine say that the costs in “High Stakes Bridge [are] very costly—<br />

certain death—whereas the costs in Low Stakes Bridge are likely some minor injuries<br />

and embarrassment.” (Feltz and Zarpentine, forthcoming, ??) I suspect both of those<br />

claims are wrong, or at least not universally believed. A lot more people survive<br />

bridge collapses than you may expect, even collapses from a great height. 6 And once<br />

the road below a truck collapses, all sorts of things can go wrong, even if the next<br />

bit of ground is only 3 feet away. (For instance, if the bridge collapses unevenly, the<br />

truck could roll, and the driver would probably suffer more than minor injuries.)<br />

We aren’t given any information as to the costs of not crossing the bridge. But<br />

given that 15 other trucks, with less evidence than John, have decided to cross the<br />

bridge, it seems plausible to think they are substantial. If there were an easy way to<br />

avoid the bridge, presumably the first truck would have taken it. If G is large enough,<br />

and C H small enough, then the only way for this equation to hold will be for x to be<br />

6 In the West Gate Bridge collapse in Melbourne in 1970, a large number of the victims were underneath<br />

the bridge; the people on top of the bridge had a non-trivial chance of survival. That bridge was 200 feet<br />

above the water, not 1000, but I’m not sure the extra height would matter greatly. Again from a slightly<br />

lower height, over 90% of people on the bridge survived the I-35W collapse in Minneapolis in 2007.


Defending Interest-Relative Invariantism 147<br />

low enough that we’d have independent reason to say that the driver doesn’t know<br />

the bridge will hold.<br />
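The threshold structure of the inequality can be sketched in a few lines of Python. The numbers below are purely illustrative assumptions, not values fixed by the vignette; the point is just how the window between the two thresholds behaves.

```python
# A sketch of the IRI inequality C_H/(G + C_H) > x > C_L/(G + C_L),
# where x is the rational credence that the bridge will hold.
# All numbers below are illustrative assumptions, not values from the story.

def crossing_threshold(gain, cost):
    """Minimum credence x in the bridge holding at which crossing
    maximises expected utility, i.e. x*gain - (1 - x)*cost > 0."""
    return cost / (gain + cost)

G = 10.0      # hypothetical gain from crossing a sound bridge
C_L = 5.0     # hypothetical cost of collapse in Low Stakes Bridge
C_H = 1000.0  # hypothetical cost of collapse in High Stakes Bridge

low = crossing_threshold(G, C_L)   # cross in Low Stakes iff x exceeds this
high = crossing_threshold(G, C_H)  # cross in High Stakes iff x exceeds this

# Any credence strictly between the two thresholds makes crossing rational
# in Low Stakes Bridge but not in High Stakes Bridge, which is exactly the
# window the inequality marks out.
print(f"{low:.3f} < x < {high:.3f}")  # prints "0.333 < x < 0.990"
```

As C_H grows relative to G, the upper threshold approaches 1, which is the point made in the text: with a large G and small C_H, only an independently knowledge-undermining credence falls inside the window.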

But what is the value of x? John has a lot of information that the bridge will<br />

support his truck. If I’ve tested something for sturdiness two or three times, and<br />

it has worked, I won’t even think about testing it again. Consider what evidence<br />

you need before you’ll happily stand on a particular chair to reach something in the<br />

kitchen, or put a heavy television on a stand. Supporting a weight is the kind of<br />

thing that either fails the first time, or works fairly reliably. Obviously there could<br />

be some strain-induced effects that cause a subsequent failure, 7 but John really has a<br />

lot of evidence that the bridge will support him.<br />

Given those three answers, it seems to me that it is a reasonable bet to cross the<br />

bridge. At the very least, it’s no more of an unreasonable bet than the bet I make<br />

every day crossing a busy highway by foot. So I’m not surprised that 64% of the<br />

subjects agreed that John knew the bridge would hold him. At the very least, that<br />

result is perfectly consistent with IRI, if we make plausible assumptions about how<br />

the subjects would answer the three numbered questions above.<br />

And as I’ve stressed, these experiments are only a problem for IRI if the subjects<br />

are reliable. I can think of two reasons why they might not be. First, subjects tend to<br />

massively discount the costs and likelihoods of traffic related injuries. In most of the<br />

country, the risk of death or serious injury through motor vehicle accident is much<br />

higher than the risk of death or serious injury through some kind of crime or other<br />

attack, yet most people do much less to prevent vehicles harming them than they<br />

do to prevent criminals or other attackers harming them. 8 Second, only 73% of the<br />

subjects in this very experiment said that John knows the bridge will support him in<br />

Low Stakes Bridge. This is just absurd. Unless the subjects endorse an implausible<br />

kind of scepticism, something has gone wrong with the experimental design. Given<br />

the fact that the experiment points broadly in the direction of IRI, and that with some<br />

plausible assumptions it is perfectly consistent with that theory, and that the subjects<br />

seem unreasonably sceptical to the point of unreliability about epistemology, I don’t<br />

think this kind of experimental work threatens IRI.<br />

5 Knowledge By Indifference and By Wealth<br />

Gillian Russell and John Doris (2009) argue that Jason Stanley’s account of knowledge<br />

leads to some implausible attributions of knowledge, and if successful their objections<br />

would generalise to other forms of IRI. I’m going to argue that Russell and<br />

Doris’s objections turn on principles that are prima facie rather plausible, but which<br />

ultimately we can reject for independent reasons. 9<br />

7 As I believe was the case in the I-35W collapse.<br />

8 See the massive drop in the numbers of students walking or biking to school, reported in Ham et al.<br />

(2008), for a sense of how big an issue this is.<br />

9 I think the objections I make here are similar in spirit to those Stanley made in a comments thread<br />

on Certain Doubts, though the details are new. The thread is at http://el-prod.baylor.edu/certain_doubts/?p=616



Their objection relies on variants of the kind of case Stanley uses heavily in his<br />

(2005) to motivate a pragmatic constraint on knowledge. Stanley imagines a character<br />

who has evidence which would normally suffice for knowledge that p, but is faced<br />

with a decision where A is both the right thing to do if p is true, and will lead to a<br />

monumental material loss if p is false. Stanley intuits, and argues, that this is enough<br />

that they cease to know that p. I agree, at least as long as the gains from doing A are<br />

low enough that doing A amounts to a bet on p at insufficiently favourable odds to<br />

be reasonable in the agent’s circumstance.<br />

Russell and Doris imagine two kinds of variants on Stanley’s case. In one variant<br />

the agent doesn’t care about the material loss. As I’d put it, the agent’s indifference to<br />

the material loss shortens the odds of the bet. That’s because costs and benefits of bets<br />

should be measured in something like utils, not something like dollars. Given that,<br />

Russell and Doris object that “you should have reservations ... about what makes<br />

[the knowledge claim] true: not giving a damn, however enviable in other respects,<br />

should not be knowledge-making.” (Russell and Doris, 2009, 432). Their other variant<br />

involves an agent with so much money that the material loss is trifling to them.<br />

Again, this lowers the effective odds of the bet, so by my lights they may still know<br />

that p. But this is somewhat counter-intuitive. As Russell and Doris say, “[m]atters<br />

are now even dodgier for practical interest accounts, because money turns out to be<br />

knowledge making.” (Russell and Doris, 2009, 433) And this isn’t just because wealth<br />

can purchase knowledge. As they say, “money may buy the instruments of knowledge<br />

... but here the connection between money and knowledge seems rather too direct.”<br />

(Russell and Doris, 2009, 433)<br />

The first thing to note about this case is that indifference and wealth aren’t really<br />

producing knowledge. What they are doing is more like defeating a defeater.<br />

Remember that the agent in question had enough evidence, and enough confidence,<br />

that they would know p were it not for the practical circumstances. As I argued<br />

in section 2, practical considerations enter debates about knowledge in part because<br />

they are distinctive kinds of defeaters. It seems that’s what is going on here. And<br />

we have, somewhat surprisingly, independent evidence to think that indifference and<br />

wealth do matter to defeaters.<br />

Consider two variants on Gilbert Harman’s ‘dead dictator’ example (Harman,<br />

1973, 75). In the original example, an agent reads that the dictator has died through<br />

an actually reliable source. But there are many other news sources around, such that<br />

if the agent read them, she would lose her belief. Even if the agent doesn’t read those<br />

sources, their presence can constitute defeaters to her putative knowledge that the<br />

dictator died.<br />

In our first variant on Harman’s example, the agent simply does not care about<br />

politics. It’s true that there are many other news sources around that are ready to<br />

mislead her about the dictator’s demise. But she has no interest in looking them<br />

up, nor is she at all likely to look them up. She mostly cares about sports, and<br />

will spend most of her day reading about baseball. In this case, the misleading news<br />

sources are too distant, in a sense, to be defeaters. So she still knows the dictator<br />

has died. Her indifference towards politics doesn’t generate knowledge - the original



reliable report is the knowledge generator - but her indifference means that a would-be<br />

defeater doesn’t gain traction.<br />

In the second variant, the agent cares deeply about politics, and has masses of<br />

wealth at hand to ensure that she knows a lot about it. Were she to read the misleading<br />

reports that the dictator has survived, then she would simply use some of the<br />

very expensive sources she has to get more reliable reports. Again this suffices for<br />

the misleading reports not to be defeaters. Even before the rich agent exercises her<br />

wealth, the fact that her wealth gives her access to reports that will correct for misleading<br />

reports means that the misleading reports are not actually defeaters. So with<br />

her wealth she knows things she wouldn’t otherwise know, even before her money<br />

goes to work. Again, her money doesn’t generate knowledge – the original reliable<br />

report is the knowledge generator – but her wealth means that a would-be defeater<br />

doesn’t gain traction.<br />

The same thing is true in Russell and Doris’s examples. The agent has quite a bit<br />

of evidence that p. That’s why she knows p. There’s a potential practical defeater for<br />

p. But due to either indifference or wealth, the defeater is immunised. Surprisingly<br />

perhaps, indifference and/or wealth can be the difference between knowledge and<br />

ignorance. But that’s not because they can be in any interesting sense ‘knowledge<br />

makers’, any more than I can make a bowl of soup by preventing someone from<br />

tossing it out. Rather, they can be things that block defeaters, both when the defeaters<br />

are the kind Stanley talks about, and when they are more familiar kinds of defeaters.<br />

6 Temporal Embeddings<br />

Michael Blome-Tillmann (2009a) has argued that tense-shifted knowledge ascriptions<br />

can be used to show that his version of Lewisian contextualism is preferable to IRI.<br />

Like Russell and Doris, his argument uses a variant of Stanley’s Bank Cases. 10 Let<br />

O be that the bank is open Saturday morning. If Hannah has a large debt, she is in<br />

a high-stakes situation with respect to O. In Blome-Tillmann’s version of the example,<br />

Hannah had in fact incurred a large debt, but on Friday morning the creditor<br />

waived this debt. Hannah had no way of anticipating this on Thursday. She has<br />

some evidence for O, but not enough for knowledge if she’s in a high-stakes situation.<br />

Blome-Tillmann says that this means after Hannah discovers the debt waiver,<br />

she could say (2).<br />

(2) I didn’t know O on Thursday, but on Friday I did.<br />

But I’m not sure why this case should be problematic for any version of IRI, and<br />

very unsure why it should even look like a reductio of IRI. As Blome-Tillmann notes,<br />

it isn’t really a situation where Hannah’s stakes change. She was never actually in a<br />

high-stakes situation. At most her perception of her stakes changes; she thought she<br />

was in a high-stakes situation, then realised that she wasn’t. Blome-Tillmann argues<br />

that even this change in perceived stakes can be enough to make (2) true if IRI is<br />

true. Now actually I agree that this change in perception could be enough to make<br />

10 In the interests of space, I won’t repeat those cases yet again here.



(2) true, but when we work through the reason that’s so, we’ll see that it isn’t because<br />

of anything distinctive, let alone controversial, about IRI.<br />

If Hannah is rational, then given her interests she won’t be ignoring ¬O possibilities<br />

on Thursday. She’ll be taking them into account in her plans. Someone who<br />

is anticipating ¬O possibilities, and making plans for them, doesn’t know O. That’s<br />

not a distinctive claim of IRI. Any theory should say that if a person is worrying<br />

about ¬O possibilities, and planning around them, they don’t know O. And that’s<br />

simply because knowledge requires a level of confidence that such a person simply<br />

does not show. If Hannah is rational, that will describe her on Thursday, but not<br />

on Friday. So (2) is true not because Hannah’s practical situation changes between<br />

Thursday and Friday, but because her psychological state changes, and psychological<br />

states are relevant to knowledge.<br />

What if Hannah is, on Thursday, irrationally ignoring ¬O possibilities, and not<br />

planning for them even though her rational self wishes she were planning for them?<br />

In that case, it seems she still believes O. After all, she makes the same decisions as<br />

she would if O were sure to be true. But it’s worth remembering that if Hannah<br />

does irrationally ignore ¬O possibilities, she is being irrational with respect to O.<br />

And it’s very plausible that this irrationality defeats knowledge. That is, you can’t be<br />

irrational with respect to a proposition and know it. Irrationality excludes knowledge.<br />

That’s what we saw in Con’s case in section 2, and it’s all we see here as well.<br />

Note also that Con will know p after fixing his consequent-affirming disposition but<br />

not before. That’s just what happens with Hannah; a distant change in her cognitive<br />

system will remove a defeater, and after it does she gets more knowledge.<br />

There’s a methodological point here worth stressing. Doing epistemology with<br />

imperfect agents often results in facing tough choices, where any way to describe a<br />

case feels a little counterintuitive. If we simply hew to intuitions, we risk being led<br />

astray by just focussing on the first way a puzzle case is described to us. But once we<br />

think through Hannah’s case, we see perfectly good reasons, independent of IRI, to<br />

endorse IRI’s prediction about the case.<br />

7 Problematic Conjunctions<br />

George and Ringo both have $6000 in their bank accounts. They both are thinking<br />

about buying a new computer, which would cost $2000. Both of them also have rent<br />

due tomorrow, and they won’t get any more money before then. George lives in<br />

New York, so his rent is $5000. Ringo lives in Syracuse, so his rent is $1000. Clearly,<br />

(3) and (4) are true.<br />

(3) Ringo has enough money to buy the computer.<br />

(4) Ringo can afford the computer.<br />

And (5) is true as well, though there’s at least a reading of (6) where it is false.<br />

(5) George has enough money to buy the computer.<br />

(6) George can afford the computer.<br />

Focus for now on (5). It is a bad idea for George to buy the computer; he won’t be<br />

able to pay his rent. But he has enough money to do so; the computer costs $2000,<br />

and he has $6000 in the bank. So (5) is true. Admittedly there are things close to (5)<br />

that aren’t true. He hasn’t got enough money to buy the computer and pay his rent.<br />

You might say that he hasn’t got enough money to buy the computer given his other<br />

financial obligations. But none of this undermines (5). The point of this little story<br />

is to respond to another argument Blome-Tillmann offers against IRI. Here is how<br />

he puts the argument. (Again I’ve changed the numbering and some terminology for<br />

consistency with this paper.)<br />

Suppose that John and Paul have exactly the same evidence, while John is<br />

in a low-stakes situation towards p and Paul in a high-stakes situation towards<br />

p. Bearing in mind that IRI is the view that whether one knows p<br />

depends on one’s practical situation, IRI entails that one can truly assert:<br />

(7) John and Paul have exactly the same evidence for p, but only John<br />

has enough evidence to know p, Paul doesn’t.<br />

(Blome-Tillmann, 2009a, 328-9)<br />

And this is meant to be a problem, because (7) is intuitively false.<br />

But IRI doesn’t entail any such thing. Paul does have enough evidence to know<br />

that p, just like George has enough money to buy the computer. Paul can’t know<br />

that p, just like George can’t buy the computer, because of their practical situations.<br />

But that doesn’t mean he doesn’t have enough evidence to know it. So, contra Blome-<br />

Tillmann, IRI doesn’t entail this problematic conjunction.<br />

In a footnote attached to this, Blome-Tillmann tries to reformulate the argument.<br />

I take it that having enough evidence to ‘know p’ in C just means having<br />

evidence such that one is in a position to ‘know p’ in C, rather than having<br />

evidence such that one ‘knows p’. Thus, another way to formulate (7)<br />

would be as follows: ‘John and Paul have exactly the same evidence for p,<br />

but only John is in a position to know p, Paul isn’t.’ (Blome-Tillmann,<br />

2009a, 329n23)<br />

The ‘reformulation’ is obviously bad, since having enough evidence to know p isn’t<br />

the same as being in a position to know it, any more than having enough money to<br />

buy the computer puts George in a position to buy it. But might there be a different<br />

problem for IRI here? Might it be that IRI entails (8), which is false?<br />

(8) John and Paul have exactly the same evidence for p, but only John is in a<br />

position to know p, Paul isn’t.<br />

There isn’t a problem with (8) because almost any epistemological theory will imply<br />

that conjunctions like that are true. In particular, any epistemological theory that<br />

allows for the existence of defeaters to not supervene on the possession of evidence<br />

will imply that conjunctions like (8) are true. Again, it matters a lot that IRI is<br />

suggesting that traditional epistemologists did not notice that there are distinctively<br />

pragmatic defeaters. Once we see that, we’ll see that conjunctions like (8) are not<br />

surprising at all.<br />

Consider again Con, and his friend Mod who is disposed to reason by modus<br />

ponens and not by affirming the consequent. We could say that Con and Mod have<br />

the same evidence for p, but only Mod is in a position to know p. There are only two<br />

ways to deny that conjunction. One is to interpret ‘position to know’ so broadly that<br />

Con is in a position to know p because he could change his inferential dispositions.<br />

But then we might as well say that Paul is in a position to know p because he could get<br />

into a different ‘stakes’ situation. Alternatively, we could say that Con’s inferential<br />

dispositions count as a kind of evidence against p. But that stretches the notion of<br />

evidence beyond a breaking point. Note that we didn’t say Con had any reason to<br />

affirm the consequent, just that he does. Someone might adopt, or change, a poor<br />

inferential habit because they get new evidence. But they need not do so, and we<br />

shouldn’t count their inferential habits as evidence they have.<br />

If that case is not convincing, we can make the same point with a simple Gettier-style<br />

case.<br />

Getting the Job<br />

In world 1, at a particular workplace, someone is about to be promoted.<br />

Agnetha knows that Benny is the management’s favourite choice for the<br />

promotion. And she also knows that Benny is Swedish. So she comes to<br />

believe that the promotion will go to someone Swedish. Unsurprisingly,<br />

management does choose Benny, so Agnetha’s belief is true.<br />

World 2 is similar, except there it is Anni-Frid who knows that Benny<br />

is the management’s favourite choice for the promotion, and that Benny is<br />

Swedish. So she comes to believe that the promotion will go to someone<br />

Swedish. But in this world Benny quits the workplace just before the<br />

promotion is announced, and the management unexpectedly passes over<br />

a lot of Danish workers to promote another Swede, namely Björn. So<br />

Anni-Frid’s belief that the promotion will go to someone Swedish is true,<br />

but not in a way that she could have expected.<br />

In that story, I think it is clear that Agnetha and Anni-Frid have exactly the same<br />

evidence that the job will go to someone Swedish, but only Agnetha is in a position<br />

to know this, Anni-Frid is not. The fact that an intermediate step is false in Anni-<br />

Frid’s reasoning, but not Agnetha’s, means that Anni-Frid’s putative knowledge is<br />

defeated, but Agnetha’s is not. And when that happens, we can have differences in<br />

knowledge without differences in evidence. So it isn’t an argument against IRI that it<br />

allows differences in knowledge without differences in evidence.



8 Holism and Defeaters<br />

The big lesson of the last few sections is that interests create defeaters. Sometimes an<br />

agent can’t know p because adding p to her stock of beliefs would introduce either<br />

incoherence or irrationality. The reason is normally that the agent faces some decision<br />

where it is, say, bad to do ϕ, but good to do ϕ given p. In that situation, if she<br />

adds p, she’ll either incoherently think that it’s bad to do ϕ although it’s good to do<br />

it given what is (by her lights) true, or she’ll irrationally think that it’s good to do<br />

ϕ. Moreover, the IRI theorist says, being either incoherent or irrational in this way<br />

blocks knowledge, so the agent doesn’t know p.<br />

But there are other, more roundabout, ways in which interests can mean that believing<br />

p would entail incoherence or irrationality. One of these is illustrated by an<br />

example alleged by Ram Neta to be hard for interest-relative theorists to accommodate.<br />

Kate needs to get to Main Street by noon: her life depends upon it. She is<br />

desperately searching for Main Street when she comes to an intersection<br />

and looks up at the perpendicular street signs at that intersection. One<br />

street sign says “State Street” and the perpendicular street sign says “Main<br />

Street.” Now, it is a matter of complete indifference to Kate whether she<br />

is on State Street–nothing whatsoever depends upon it. (Neta, 2007, 182)<br />

Let’s assume for now that Kate is rational; dropping this assumption introduces<br />

mostly irrelevant complications. 11 Kate will not believe she’s on Main Street. She<br />

would only have that belief if she took it to be settled that she’s on Main, and hence<br />

not worthy of spending further effort investigating. But presumably she won’t do<br />

that. The rational thing for her to do is to get confirming (or, if relevant, confounding)<br />

evidence for the appearance that she’s on Main. If it were settled that she was on<br />

Main, the rational thing to do would be to try to relax, and be grateful that she had<br />

found Main Street. Since she has different attitudes about what to do simpliciter and<br />

conditional on being on Main Street, she doesn’t believe she’s on Main Street.<br />

So far so good, but what about her attitude towards the proposition that she’s<br />

on State Street? She has enough evidence for that proposition that her credence in it<br />

should be rather high. And no practical issues turn on whether she is on State. So<br />

she believes she is on State, right?<br />

Not so fast! Believing that she’s on State has more connections to her cognitive<br />

system than just producing actions. Note in particular that street signs are hardly<br />

basic epistemic sources. They are the kind of evidence we should be ‘conservative’<br />

about in the sense of Pryor (2004b). We should only use them if we antecedently<br />

believe they are accurate. So for Kate to believe she’s on State, she’d have to believe<br />

street signs around here are accurate. If not, she’d incoherently be relying on a source<br />

11 It means we constantly have to run through both the irrationality horn of the dilemma from the first<br />

paragraph of this section as well as the incoherence horn, but the two horns look very similar in practice.



she doesn’t trust, even though it is not a basic source. 12 But if she believes the street<br />

signs are accurate, she’d believe she was on Main, and that would lead to practical<br />

incoherence. So there’s no way to coherently add the belief that she’s on State Street<br />

to her stock of beliefs. So she doesn’t know, and can’t know, that she’s either on State<br />

or on Main. This is, in a roundabout way, due to the high stakes Kate faces.<br />

Neta thinks that the best way for the interest-relative theorist to handle this case is<br />

to say that the high stakes associated with the proposition that Kate is on Main Street<br />

imply that certain methods of belief formation do not produce knowledge. And he<br />

argues, plausibly, that such a restriction will lead to implausibly sceptical results. But<br />

that’s not the right way for the interest-relative theorist to go. What they should say<br />

is that Kate can’t know she’s on State Street because the only grounds for that belief are<br />

intimately connected to a proposition that, in virtue of her interests, she needs very<br />

large amounts of evidence to believe.<br />

9 Non-Consequentialist Cases<br />

None of the replies yet have leaned heavily on the point from section 3, the fact that<br />

IRI is an existential claim. This reply will make heavy use of that fact.<br />

If an agent is merely trying to get the best outcome for themselves, then it makes<br />

sense to represent them as a utility maximiser. But when agents have to make decisions<br />

that might involve them causing harm to others if certain propositions turn out<br />

to be true, then I think it is not so clear that orthodox decision theory is the appropriate<br />

way to model the agents. That’s relevant to cases like this one, which Jessica<br />

Brown has argued are problematic for the epistemological theories John Hawthorne<br />

and Jason Stanley have recently been defending. 13<br />

A student is spending the day shadowing a surgeon. In the morning he<br />

observes her in clinic examining patient A who has a diseased left kidney.<br />

The decision is taken to remove it that afternoon. Later, the student<br />

observes the surgeon in theatre where patient A is lying anaesthetised<br />

on the operating table. The operation hasn’t started as the surgeon is<br />

consulting the patient’s notes. The student is puzzled and asks one of<br />

the nurses what’s going on:<br />

Student: I don’t understand. Why is she looking at the patient’s records?<br />

She was in clinic with the patient this morning. Doesn’t she even know<br />

which kidney it is?<br />

Nurse: Of course, she knows which kidney it is. But, imagine what it<br />

would be like if she removed the wrong kidney. She shouldn’t operate<br />

before checking the patient’s records. (Brown, 2008, 1144-1145)<br />

12 The caveats here about basic sources are to cancel any suggestion that Kate has to antecedently believe<br />

that any source is reliable before she uses it. As Pryor (2000b) notes, that view is problematic. The<br />

view that we only get knowledge from a street sign if we antecedently have reason to trust it is not so<br />

implausible.<br />

13 The target here is not directly the interest-relativity of their theories, but more general principles<br />

about the role of knowledge in action and assertion. But it’s important to see how IRI handles the cases<br />

that Brown discusses, since these cases are among the strongest challenges that have been raised to IRI.



It is tempting, but I think mistaken, to represent the surgeon’s choice as follows. Let<br />

Left mean the left kidney is diseased, and Right mean the right kidney is diseased.<br />

Left Right<br />

Remove left kidney 1 −1<br />

Remove right kidney −1 1<br />

Check notes 1 − ɛ 1 − ɛ<br />

Here ɛ is the trivial but non-zero cost of checking the chart. Given this table, we<br />

might reason that since the surgeon knows that she’s in the left column, and removing<br />

the left kidney is the best option in that column, she should remove the left<br />

kidney rather than checking the notes.<br />

But that reasoning assumes that the surgeon does not have any obligations over<br />

and above her duty to maximise expected utility. And that’s very implausible, since<br />

consequentialism is a fairly implausible theory of medical ethics. 14<br />

It’s not clear exactly what obligation the surgeon has. Perhaps it is an obligation<br />

to not just know which kidney to remove, but to know this on the basis of evidence<br />

she has obtained while in the operating theatre. Or perhaps it is an obligation<br />

to make her belief about which kidney to remove as sensitive as possible to various<br />

possible scenarios. Before she checked the chart, this counterfactual was false: Had<br />

she misremembered which kidney was to be removed, she would have a true belief about<br />

which kidney was to be removed. Checking the chart makes that counterfactual true,<br />

and so makes her belief that the left kidney is to be removed a little more sensitive to<br />

counterfactual possibilities.<br />

However we spell out the obligation, it is plausible given what the nurse says that<br />

the surgeon has some such obligation. And it is plausible that the ‘cost’ of violating<br />

this obligation, call it δ, is greater than the cost of checking the notes. So here is the<br />

decision table the surgeon faces.<br />

Left Right<br />

Remove left kidney 1 − δ −1 − δ<br />

Remove right kidney −1 − δ 1 − δ<br />

Check notes 1 − ɛ 1 − ɛ<br />

And it isn’t surprising, or a problem for an interest-relative theory of knowledge,<br />

that the surgeon should check the notes, even if she believes and knows that the left<br />

kidney is the diseased one.<br />
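The contrast between the two decision tables can be checked with a quick sketch. The credence x in Left, and the values of eps (ɛ) and delta (δ), are illustrative assumptions made only for the sake of the arithmetic.

```python
# Expected utilities for the surgeon's three options, first without and then
# with the obligation cost delta from the second table. The credence x in
# Left, and the values of eps and delta, are illustrative assumptions.

def expected_utilities(x, eps, delta=0.0):
    """EU of each act when the agent's credence that Left is true is x."""
    return {
        "remove left": x * (1 - delta) + (1 - x) * (-1 - delta),
        "remove right": x * (-1 - delta) + (1 - x) * (1 - delta),
        "check notes": 1 - eps,
    }

# First table (no extra obligation, delta = 0): with a high enough credence,
# removing the left kidney beats checking the notes.
no_duty = expected_utilities(x=0.999, eps=0.01)
assert no_duty["remove left"] > no_duty["check notes"]

# Second table: once delta > eps, checking the notes maximises expected
# utility even though the surgeon's credence in Left is unchanged.
with_duty = expected_utilities(x=0.999, eps=0.01, delta=0.05)
assert with_duty["check notes"] > with_duty["remove left"]
```

Nothing turns on the particular numbers; the point is only that a fixed credence can make different acts maximise expected utility once δ enters the table, which is why her checking the notes is no evidence against her knowing which kidney is diseased.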

There is a very general point here. The best arguments for IRI start with the<br />

role that knowledge plays in a particular theory of decision or reasoning. It’s easiest<br />

to make the arguments for IRI work if that theory is orthodox (consequentialist)<br />

decision theory. That doesn’t mean that the arguments for IRI presuppose that consequentialism<br />

is always the right decision theory. As long as consequentialism is correct<br />

14 I’m not saying that consequentialism is wrong as a theory of medical ethics. But if it is right, so many<br />

intuitions about medical ethics are going to be mistaken that such intuitions have no evidential force. And<br />

Brown’s argument relies on intuitions about this case having evidential value. So I think for her argument<br />

to work, we have to suppose non-consequentialism about medical ethics.



in the case described, the argument for IRI can work. (By consequentialism being correct<br />

in a case, I mean that it can be preferable to choose ϕ over ψ in some case because<br />

ϕ has the higher expected utility. It’s plausible that there are such cases because it’s<br />

plausible that there are choices we face where the options differ in no normatively<br />

salient respect except expected utility.) Remember, the IRI theorist is trying to prove<br />

an existential: there is a pair of cases that differ with respect to knowledge in virtue of<br />

differing with respect to interests. And that just needs consequentialism to be locally<br />

true. The only way medical cases like Brown’s could be counterexamples to IRI is<br />

if we assumed that consequentialism was globally true. But it probably isn’t, so IRI<br />

survives the examples.


Deontology and Descartes’ Demon<br />

1 Digesting Evidence<br />

In his Principles of Philosophy, Descartes says,<br />

Finally, it is so manifest that we possess a free will, capable of giving<br />

or withholding its assent, that this truth must be reckoned among the<br />

first and most common notions which are born with us. (Descartes,<br />

1644/2003, paragraph xxxix)<br />

In this paper, I am going to defend a broadly Cartesian position about doxastic freedom.<br />

At least some of our beliefs are freely formed, so we are responsible for them.<br />

Moreover, this has consequences for epistemology. But the ‘some’ here is crucial.<br />

Some of our beliefs are not freely formed, and we are not responsible for those. And<br />

that has epistemological consequences too. Out of these considerations a concept of<br />

doxastic responsibility arises that is useful to the externalist in responding to several<br />

challenges. I will say at some length how it supports a familiar style of externalist<br />

response to the New Evil Demon problem, and I will note some difficulties in reconciling<br />

internalism with the idea that justification is a kind of blamelessness. The<br />

internalist, I will argue, has to say that justification is a kind of praiseworthiness,<br />

and this idea that praise is more relevant to epistemic concepts than blame will be a<br />

recurring theme of the paper.<br />

While the kind of position I am adopting has been gaining supporters in recent<br />

years, it is still largely unpopular. The arguments of William Alston (1988) have convinced<br />

many that it is a mistake to talk of doxastic freedom, or doxastic responsibility.<br />

The short version of this argument is that our beliefs are involuntary, and freedom<br />

and responsibility require voluntariness. The longer, and more careful, argument<br />

involves drawing some distinctions between ways in which we might come to be in<br />

a state. It helps to start with an example where the normative facts are relatively<br />

uncontroversial, namely digestion.<br />

Imagine that Emma eats a meat pie, and due to a malfunction in her stomach<br />

the pie is not properly digested, leading to some medical complications. Is Emma<br />

responsible for her ill-health? Well, that depends on the back-story. If Emma knew<br />

that she could not properly digest meat pies, but ate one anyway, she is responsible<br />

for the illness via her responsibility for eating the pie. Even if Emma did not know<br />

this, she might be responsible for the state of her stomach. If her stomach could not<br />

digest the pie because it had been damaged by Emma’s dietary habits, and say Emma<br />

knew that her diet could damage her stomach, then Emma is responsible for the state<br />

of her stomach and hence for the misdigestion of the pie and hence for her ill-health.<br />

But if neither of these conditions obtain, if it just happens that her stomach misdigests<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Journal<br />

of Philosophy 105 (2008): 540-569. Thanks to Andrew Chignell, Matthew Chrisman, Richard Holton, Neil<br />

Levy, Clayton Littlejohn, Ishani Maitra and Nicholas Silins.



the pie, then Emma is not responsible for her ill-health. Even though the cause of her<br />

ill-health is something that her stomach does, she is not responsible for that since her<br />

stomach is not under her voluntary control. Put another way, her responsibility for<br />

maintaining her own health means that she is responsible for the type of digester she<br />

is, but she is not responsible for this token digestion.<br />

Simplifying a little, Alston thinks that the case of belief is similar. Say that Emma<br />

has a false belief that p. Is she responsible for this piece of doxastic ill-health? Again,<br />

that depends on the back story. If Emma believes that p because she was careless in<br />

gathering evidence, and the evidence would have pointed to ¬p, then she is responsible<br />

for being a bad gatherer of evidence. If Emma has been negligent in maintaining<br />

her doxastic health, or, worse, if she has been doing things she knows endanger doxastic<br />

health, then she is responsible for being the type of believer she is. But she is<br />

never responsible merely for the token belief that is formed. Her mind simply digests<br />

the evidence she has, and Emma’s responsibility only extends to her duty to gather<br />

evidence for it, and her duty to keep her mind in good working order. She is not<br />

responsible for particular acts of evidential digestion.<br />

But these particular acts of evidential digestion are the primary subject matter of<br />

epistemology. When we say Emma’s belief is justified or unjustified, we frequently<br />

mean that it is a good or bad response to the evidence in the circumstances. (I am obviously<br />

here glossing over enormous disputes about what makes for a good response,<br />

what is evidence, and what relevance the circumstances have. But most theories of<br />

justification can be fit into this broad schema, provided we are liberal enough in interpreting<br />

the terms ‘good’, ‘evidence’ and ‘circumstances’.) If Emma is not responsible<br />

for her response to the evidence, then either we have to divorce justification from<br />

responsibility, or we have to say that the concept of justification being used in these<br />

discussions is defective.<br />

We can summarise these considerations as a short argument. The following formulation<br />

is from Sharon Ryan (2003, 49).<br />

1. If we have any epistemic obligations, then doxastic attitudes must sometimes<br />

be under our voluntary control.<br />

2. Doxastic attitudes are never under our voluntary control.<br />

3. We do not have any epistemic obligations.<br />

Ryan goes on to reject both premises. (And she does so while interpreting “voluntary<br />

control” to mean “direct voluntary control”; the response is not meant to sidestep<br />

Alston’s argument.) Matthias Steup (2000, 2008) also rejects both premises of this<br />

argument. I am more sympathetic to premise 1, but I (tentatively) agree with them,<br />

against what sometimes seems to be orthodoxy, that premise 2 fails. That is, I endorse<br />

a kind of doxastic voluntarism. (Just what kind will become clearer as we go along.)<br />

There are four questions that anyone who endorses voluntarism, and wants to argue<br />

that this matters epistemologically, should, I think, answer. These are:<br />

(A) What is wrong with current arguments against voluntarism?<br />

(B) What does the voluntariness of (some) beliefs consist in?



(C) Which kinds of beliefs are voluntary?<br />

(D) What difference does the distinction between these classes make for epistemology?<br />

My answer to (A) will be similar to Ryan’s, and to Steup’s, but with, I think, enough<br />

differences in emphasis to be worth working through. My answer to (B), however,<br />

will be rather more different. I am going to draw on some work on self-control to<br />

argue that some beliefs are voluntary because they are the result of exercises of, or<br />

failures to exercise, self-control. My answer to (C) is that what I will call inferential<br />

beliefs are voluntary, while perceptual beliefs are not. Ryan and Steup sometimes<br />

seem to suggest that even perceptual beliefs are voluntary, and I do not think this is<br />

true. The consequence of this, I will argue in answering (D), is that inferential beliefs<br />

should be judged by how well they respond to the evidence, while perceptual beliefs<br />

should be judged by how well they reflect reality. When an agent has misleading<br />

evidence, their inferential beliefs might be fully justified, but their perceptual beliefs,<br />

being misleading, are not.<br />

I will detail my answers to those four questions in sections 2, 4, 6 and 7. In<br />

between I will discuss recent work on self-control (section 3) and the contrast between<br />

my answer to (B) and other voluntarist answers (section 5). In section 8 I will say<br />

how my partially voluntarist position gives the externalist a way to avoid the New<br />

Evil Demon problem. And in section 9 I will make a direct argument for the idea<br />

that justification is a kind of praiseworthiness, not a kind of blamelessness.<br />

Before we start, I want to note two ways, other than Ryan’s, of formulating an<br />

argument against doxastic responsibility. These are going to seem quite similar to<br />

Ryan’s formulation, but I think they hide important differences. The first version<br />

uses the idea that some doings (or states) are volitional. That is, we do them (or are<br />

in them) because we formed a volition to do so, and this volition causes the doing (or<br />

state) in the right kind of way.<br />

1. If we have any epistemic obligations, then either the formation or maintenance<br />

of doxastic attitudes must sometimes be volitional.<br />

2. The formation or maintenance of doxastic attitudes is never volitional.<br />

3. We do not have any epistemic obligations.<br />

I will not argue against premise 2 of this argument, though Carl Ginet (1985, 2001)<br />

has done so. But I think there’s little to be said for premise 1. The principle<br />

behind it is that we are only responsible for volitional doings. And that principle<br />

is very dubious. We could run the kind of regress arguments against it that Gilbert<br />

Ryle (1949) offers. But it is simpler to note some everyday counterexamples. Borrowing<br />

an example from Angela M Smith (2005), if I forget a friend’s birthday, that<br />

is something I am responsible and blameworthy for, but forgetting a birthday is not<br />

volitional. (Below I will offer a Rylean argument that we are sometimes praiseworthy<br />

for doings that are not volitional.) So this argument fails. Alternatively, we could run<br />

the argument by appeal to freedom.



1. If we have any epistemic obligations, then doxastic attitudes must sometimes<br />

be free.<br />

2. Doxastic attitudes are never free.<br />

3. We do not have any epistemic obligations.<br />

Premise 1 of this argument is more plausible. But, as we’ll see presently, premise 2 is<br />

not very plausible. Whether or not Descartes was right that premise 2 is obviously false, it<br />

does seem on reflection very hard to defend. So this argument fails. Ryan’s formulation<br />

is interesting because it is not clear just which of the premises fails. As I said,<br />

I am going to suggest that premise 2 fails, and that doxastic attitudes are voluntary.<br />

But this will turn on some fine judgments about the voluntary/involuntary boundary.<br />

If I am wrong about those judgments, then the arguments below will suggest<br />

that premise 1, not premise 2, in Ryan’s formulation fails. Either way though, the<br />

argument is unsuccessful.<br />

2 Responding to the Involuntarists<br />

There are two kinds of argument against the idea that belief is voluntary. One kind,<br />

tracing back to Bernard Williams (1976), holds that the possibility of voluntary belief<br />

can be shown to be incoherent by reflection on the concept of belief. This argument<br />

is no longer widely endorsed. Nishi Shah (2002) provides an excellent discussion of<br />

the problems with Williams’ argument, and I have nothing to add to his work. I will<br />

focus on the other kind, that claims we can see that belief is involuntary by observing<br />

differences between beliefs and paradigm cases of voluntary actions. I will make three<br />

objections to these arguments. First, the argument looks much less plausible once we<br />

distinguish between having a belief and forming a belief. Second, the argument seems<br />

to rely on inferring from the fact that we do not do something (in particular, believe<br />

something that we have excellent evidence is false) to the conclusion that we can not<br />

do it. As Sharon Ryan (2003) points out, this little argument overlooks the possibility<br />

that we will not do it. Third, the argument relies on too narrow a conception of what<br />

is voluntary, and when we get a more accurate grasp on that concept, we’ll give up<br />

the argument. Here is a representative version of the argument from William Alston.<br />

Can you, at this moment, start to believe that the United States is still a<br />

colony of Great Britain, just by deciding to do so? . . . [S]uppose that<br />

someone offers you $500,000,000 to believe it, and you are much more<br />

interested in the money than in believing the truth. Could you do what<br />

it takes to get that reward? . . . Can you switch propositional attitudes<br />

toward that proposition just by deciding to do so? It seems clear to<br />

me that I have no such power. Volitions, decisions, or choosings don’t<br />

hook up with anything in the way of propositional attitude inauguration,<br />

just as they don’t hook up with the secretion of gastric juices or cell<br />

metabolism. (Alston, 1988, 122)



Now Alston does note, just one page earlier, that what is really relevant is whether<br />

our being in a state of belief is voluntary, not whether the activity of belief formation<br />

is voluntary. But he thinks nevertheless that issues about whether we can form<br />

beliefs, any old beliefs, it seems, voluntarily matter to the question about the voluntariness<br />

of belief states.<br />

If we think about what it is to be in a state voluntarily, this all seems beside the<br />

point. We can see this by considering what it is to be in a political state voluntarily.<br />

Consider Shane, who was born into Victoria. His coming to be in Victoria was<br />

hence not, in any way, voluntary. Shane is now a grown man, and he has heard many<br />

travellers’ tales of far away lands. But the apparent attractions of Sydney and other<br />

places have no pull on Shane; he has decided to stay in Victoria. If he has the capacity<br />

to leave Victoria, then Shane’s continued presence in Victoria is voluntary. Similarly,<br />

we are voluntarily in a belief state if we have the capacity to leave it, but choose not to<br />

exercise this capacity. Whether the belief was formed voluntarily is beside the point.<br />

If Shane leaves a state, the natural thing is to leave for another state, perhaps New<br />

South Wales or South Australia. It might be thought that if we leave a belief state, we<br />

have to move into another belief state. So to have this capacity to leave, we need the<br />

ability to form beliefs voluntarily. Not at all. The capacity to become uncertain, i.e.<br />

to not be in any relevant belief state, is capacity enough. (If Shane has a boat, and the<br />

capacity to flourish at sea, then perhaps he too can have the capacity to leave Victoria<br />

without the capacity to go into another state.)<br />

But do we have the capacity to become uncertain? Descartes appeared to think so;<br />

arguably the point of the First Meditation is to show us how to exercise this capacity.<br />

Moreover, this capacity need not be one that we exercise in any particularly nearby<br />

possible worlds. We might exercise our freedom by always doing the right thing. As<br />

Descartes goes on to say in the Fourth Meditation:<br />

For in order to be free, there is no need for me to be capable of going<br />

in each of two directions; on the contrary, the more I incline in one<br />

direction – either because I clearly understand that reasons of truth and<br />

goodness point that way, or because of a divinely produced disposition<br />

of my inmost thoughts – the freer is my choice. (Descartes, 1641/1996,<br />

40)<br />

This seems like an important truth. Someone who is so sure of their own interests<br />

and values, and so strong-willed as to always aim to promote them, cannot in a certain<br />

sense act against their own self-interest and values. But this does not make their<br />

actions in defence of those interests and values unfree. If it did, we might well wonder<br />

what the value of freedom was. And note that even if there’s a sense in which such a character<br />

could not have done otherwise, this in no way suggests their actions are outside their<br />

control. Indeed, a person who systematically promotes the interests and values they<br />

have seems an exemplar of an agent in control. The character I am imagining here is in<br />

important respects unlike normal humans. We know we can, and do, act against our<br />

interests and values. But we can become more or less like them, and it is important to



remember, as Descartes does, that in doing so we do not sacrifice freedom for values<br />

or interests.<br />

John Cottingham (2002) interprets Descartes here as suggesting that there is a<br />

gap between free action and voluntary action, contrasting his “strongly compatibilist<br />

notion of human freedom” (350) with the “doxastic involuntarism” (355) suggested<br />

by the following lines of the Third Meditation.<br />

Yet when I turn to the things themselves which I think I perceive very<br />

clearly, I am so convinced by them that I spontaneously declare: let whoever<br />

can do so deceive me, he will never bring it about that I am nothing,<br />

so long as I continue to think that I am something . . . (Descartes,<br />

1641/1996, 25)<br />

Now there are two questions here. The first is whether Descartes intended to draw<br />

this distinction. That is, whether Descartes thought that the kind of free actions that<br />

he discusses in the Fourth Meditation, the free actions where we are incapable of going<br />

in the other direction, are nevertheless involuntary. I do not have any informed<br />

opinions about this question. The second is whether this kind of consideration supports<br />

the distinction between the free and the voluntary. And it seems to me that it<br />

does not. Just as Descartes says the free person will be moved by reasons in the right<br />

way, it seems natural to say that a person who acts voluntarily will be responsive to<br />

reasons. Voluntary action does require freedom from certain kinds of coercion, but<br />

the world does not coerce us when it gives us reason to believe one thing rather than<br />

another. Even if we have voluntary control over our beliefs, we should still be compelled<br />

by the sight of rain to believe it is raining.<br />

In her discussion of the puzzle of imaginative resistance, Tamar Szabó Gendler<br />

(2000) notes that philosophers have a tendency to read too much into intuitions about<br />

certain cases. What we can tell from various thought experiments is that in certain<br />

circumstances we will not do a certain thing. But getting from what we will not<br />

do to what we can not do is a tricky matter, and it is a bad mistake to infer from<br />

will not to can not too quickly. Matthias Steup (2000) points out that if you or I<br />

try to stick a knife into our hand, we similarly will not do it. (I assume a somewhat<br />

restricted readership here.) But this is no evidence that we cannot do it. And Sharon<br />

Ryan (2003) notes that we will not bring ourselves to run over pedestrians for no<br />

reason. For most of us, our moral sense prevents acting quite this destructively. Yet<br />

our continued avoiding of pedestrians is a series of free, even voluntary, actions. We<br />

could run over the pedestrians, but we will not. Since forming false beliefs is a form<br />

of self-harm, it is not surprising that it has a similar phenomenology, even if it is<br />

genuinely possible.<br />

It might be argued that we will engage in small forms of self-harm<br />

when the financial rewards are great enough. So we should be able to form this<br />

belief about the United States for a large sum of money. But I suspect that<br />

the only way to exercise the capacity to believe the United States is still a colony is<br />

by first suspending my belief that it is no longer a colony. And the only way I can do<br />

that is by generally becoming more sceptical of what I have been told over the years.



Once I get into such a sceptical mood, I will be sceptical of claims that I will get half a<br />

billion dollars should I have this wild political belief. So I will not form the belief in<br />

part because the ‘promisor’ lacks the capacity to sufficiently convince me that I will<br />

be richly rewarded for doing so. This looks like a lack of capacity on their part, not<br />

my part.<br />

The final point to make about this argument, and those like it, is that if we are to<br />

conclude that belief formation is never voluntary, then we need to compare it to all<br />

kinds of voluntary action. And Alston really only ever compares belief formation to<br />

volitional action. If this does not exhaust the range of voluntary action, then belief<br />

formation might be properly analogous to some other voluntary action. Indeed, this<br />

turns out to be the case. To see this, we need to make a small detour through modern<br />

work on self-control.<br />

3 How to Control Your Temper<br />

To start, let’s consider three examples of a person failing to keep a commitment they<br />

have made about what the good life is. The three ways will be familiar from Gary<br />

Watson’s (1977) discussion of recklessness, weakness and compulsion, and the<br />

discussion of these cases by Jeanette Kennett and Michael Smith<br />

(1996a,b). My characterisation of the cases will turn out to differ a little from theirs,<br />

but the cases are similar. Each of the examples concerns a character, Murray, who<br />

has decided that he should not swear around his young son Red. He resolves to do<br />

this, and has been working on curbing his tendency to swear whenever anything bad<br />

happens. But three times over the course of the day he breaks his commitment. 1<br />

The first time comes when Murray puts his hand down on a hot plate that he did<br />

not realise was on. The searing pain undermines his self-control, and he is unable to<br />

stop himself from swearing loudly through the pain.<br />

The second time comes when Murray drops and breaks a wine glass. Murray<br />

does not lose his self-control, but he does not exercise the self-control he has. He<br />

temporarily forgets his commitment and so, quite literally, curses his misfortune.<br />

On doing so he immediately remembers that Red is around, and the commitment he<br />

has made, and regrets what he did.<br />

The third time comes on the tram home, when Murray gets into a disagreement<br />

with a political opponent. Murray can not find the words to express what he feels<br />

about the opponent without breaking his commitment. So he decides, without much<br />

reason, that his need to express what he feels outweighs his commitment, and starts<br />

describing his opponent using language he would, all things considered, not have used<br />

around young Red.<br />

The first and third cases are close to textbook cases of compulsion and recklessness.<br />

Note in the first case that when Murray reflects back on what happened, he<br />

might be irritated that his work on reducing his tendency to swear has not been<br />

more successful. But he will not be upset that he did not exercise more self-control<br />

1 The cases, especially the second, were inspired by Richard Holton’s discussion of resolutions to prevent<br />

‘automatic’ actions like smoking or sleeping in. See Holton (2003, 2004).



on that occasion. He did not have, no normal person would have, the amount of self-control<br />

he would have needed to stop swearing then. All that would help is having<br />

the disposition to say different things when his self-control is defeated. And that is<br />

not a disposition he can acquire on the spot.<br />

I have described the first case as one where Murray’s self-control is undermined.<br />

This is a term taken from recent work by Richard Holton and Stephen Shute (2007),<br />

who carefully distinguish between self-control being undermined by a provocation,<br />

and it being overwhelmed by a provocation. Undermining occurs when the provocation<br />

causes the agent to have less self-control than they usually have; overwhelming<br />

occurs when the provocation is too much for the agent’s control. The difference is<br />

relevant to them, because they are interested in what it is for an agent to lose control.<br />

That seems to be what happens here. After all, the things one would naturally do<br />

afterwards (jumping around, screaming, swearing if one’s so disposed) do not seem<br />

particularly controlled by any measure.<br />

Similarly I have accepted Watson’s description of cases like the third as instances<br />

of recklessness, but we should not think this necessarily contrasts with weakness.<br />

It might be that in this case Murray is both weak and reckless. He is not akratic,<br />

if we stipulatively define akrasia as acting against one’s better judgment. But if we<br />

accept Richard Holton’s view that weakness of will consists in being “too ready to<br />

reconsider their intentions” (Holton, 1999, 241), then in this case Murray is weak-willed.<br />

2 This seems to me to be the right way to talk about the case. With these<br />

details in place, we can talk about what’s crucial to this essay, the contrast with the<br />

second case.<br />

In the second case Murray fails to exercise self-control. He could have prevented<br />

himself from swearing in front of his son. Breaking a wine glass is irritating, but it<br />

neither undermines nor, necessarily, overwhelms self-control. Murray had the capacity<br />

to think about his resolution to not swear in front of Red. And if he had exercised<br />

this capacity, he would not have sworn when he did.<br />

In the first case, Murray will only regret his lack of prior work at changing his<br />

dispositions in cases where his control fails. In the second case he will regret that, but<br />

he will also regret what he did on that occasion, for he could have kept his resolution,<br />

had only he thought of it. This regret seems appropriate, for in the second case he<br />

did something wrong at the time he swore, as well perhaps as having done something<br />

wrong earlier. (Namely, not having worked hard enough on his dispositions.) This<br />

difference in regret does not constitute the difference between compulsion and a case<br />

where self-control fails, but it is pretty good evidence that this is a failure of self-control.<br />

2 Whether Murray is akratic is a slightly more complicated question than I have suggested in the text. If<br />

akrasia is acting against one’s judgment, then he is not; if akrasia is acting against one’s considered judgment,<br />

then he is. ‘Akrasia’ is a technical term, so I do not think a huge amount turns on what we say about this<br />

question.<br />

There is an interesting historical precedent for Holton’s theory of weakness of will. Ryle hints at a<br />

similar position to Holton’s when he says “Strength of will is a propensity the exercise of which consists<br />

in sticking to tasks; that is, in not being deterred or diverted. Weakness of will is having too little of this<br />

propensity.” (1949, 73) But the idea is not well developed in Ryle. We’ll return below to the differences<br />

between Ryle’s and Holton’s theories.



So the second case is not one where Murray was compelled. He had the capacity<br />

to keep his commitment, and nothing was stopping him exercising this control, but<br />

he failed to do so. His failure was a failure of self-control. Murray’s self-control is,<br />

in this case, overwhelmed by the provocation. But it need not have been. Within<br />

some fairly broad limits, how much self-control we exercise is up to us. 3 Murray’s<br />

failure of self-control is culpable because anyone with the capacity for self-control that<br />

Murray has could have avoided breaking his commitment. I am not going to try<br />

to offer an analysis of what it is to have a capacity, but I suspect something like the<br />

complicated counterfactual analysis Kennett and Smith offer, and that Smith offers<br />

elsewhere (Smith, 1997, 2003), is broadly correct. 4<br />

Kennett and Smith stress two things about this capacity that are worth noting<br />

here. First, having this kind of capacity is part of what it is to be rational. That is,<br />

being rational requires thinking of the right thing at the right time. As Ryle says, “Intelligently<br />

reflecting how to act is, among other things, considering what is pertinent<br />

and disregarding what is inappropriate.” (Ryle, 1949, 31) Second, Kennett and Smith<br />

note that exercises of this capacity cannot be volitional. Following Davidson (1963),<br />

they say they cannot be actions. I find this terminology somewhat strained. Catching<br />

a fast moving ball is an action, I would say, but it does not seem to be volitional.<br />

So I will use ‘volitional action’ for this Davidsonian sense of action.<br />

Many recent philosophers have endorsed the idea that some of the mental states<br />

for which we hold people responsible are not voluntary, or at least are not volitional.<br />

Adams (1985); Heller (2000); Owens (2000) and Hieronymi (2008) note ways in which<br />

we appropriately blame people for being in certain states, where being in that state<br />

is not volitional. Something like this idea seems to be behind Ryle’s several regress<br />

arguments against the intellectualist legend. It just is not true that what we do divides<br />

cleanly into outcomes of conscious thought on the one hand, and mere bodily<br />

movements (à la digestion) on the other. 5 Rather, there is a spectrum of cases from<br />

pure ratiocination at one end to pure bodily movement at the other. And some of<br />

the things in the middle of this spectrum are proper subjects of reactive attitudes.<br />

The focus in this literature has been on blame, but some states in the middle of this<br />

spectrum are also praiseworthy.<br />

Consider some action that is strikingly imaginative, e.g. a writer’s apt metaphor<br />

or, say, a cricket captain’s imaginative field placements. It seems that, assuming the<br />

field settings are successful, the captain deserves praise for being so imaginative. But<br />

of course the captain did not, really could not, first intend to imagine such field settings,<br />

then carry out that intention. So something for which the captain deserves<br />

praise, his act of imagination, is not volitional. So not all praiseworthy things we do<br />

are volitional.<br />

There are two responses to this argument that I can imagine, neither of them<br />

particularly plausible. First, we might think that the captain’s imagination is simply<br />

3 Holton (2003) compares self-control to a muscle that we can exercise. We can make a similar point to<br />

the one in the text about physical muscles. If I try to lift a box of books and fail, that does not show I lack<br />

the muscular capacity to lift the box; I might not have been trying hard enough.<br />

4 (Ryle, 1949, 71ff) also offers a counterfactual account of capacities that seems largely accurate.<br />

5 As I read him, Ryle takes this fact to reveal an important weakness in Descartes’ theory of mind.


Deontology and Descartes’ Demon 166<br />

a remarkable feature of nature, as the Great Barrier Reef is. It is God, or Mother<br />

Nature, who should be praised, not the captain. Now it seems fair to react to some<br />

attributes of a person this way. A person does not deserve praise for having great<br />

eyesight, for example. But such a reaction seems grossly inappropriate, almost dehumanising,<br />

in this case. To be sure, we might also praise God or Mother Nature for<br />

yielding such an imaginative person, but we’ll do that as well as, rather than instead<br />

of, praising the person. Second, we might praise the captain for his work in studying<br />

the game, and thinking about possible ways to dismiss batsmen, rather than this<br />

particular action. But if that is what we praise the captain for, we should equally<br />

praise the captain’s opponent, a hard-working dullard. And that does not seem right.<br />

The hard-working dullard deserves praise for his hard work in the lead up, but the<br />

hard-working imaginative skipper deserves praise for what he does in the game too.<br />

So reactive attitudes, particularly praise, are appropriately directed at things people<br />

do even if these things are not volitional.<br />

The key point of this section then is that responsibility outruns volition. Some<br />

actions are blameworthy because they are failures of self-control. Some actions are<br />

praiseworthy because they are wonderful feats of imagination. But neither failing<br />

to exercise self-control, nor exercising imagination, need be volitional in order to be<br />

a locus of responsibility. I will argue in the next section that these considerations<br />

support the idea of responsibility for beliefs.<br />

4 Voluntariness about Belief<br />

Here is a situation that will seem familiar to anyone who has spent time in a student<br />

household. Mark is writing out the shopping list for the weekly grocery shop. He<br />

goes to the fridge and sees a carton of orange juice there. He forms<br />

the belief that there is orange juice in the fridge, and hence that he does not need to<br />

buy orange juice. As it turns out, both of these beliefs are false. One of his housemates<br />

finished off the orange juice, but stupidly put the empty carton back in the fridge.<br />

When Mark finds this out, he is irritated at his housemate, but he is also irritated at<br />

himself. He did not have to draw the conclusion that there was orange juice in the<br />

fridge. He was, after all, living in a student house where people do all sorts of dumb<br />

things. That his housemate might have returned an empty container to the fridge<br />

was well within the range of live possibilities. Indeed had he even considered the<br />

possibility he would have thought it was a live possibility, and checked whether the<br />

container was empty before forming beliefs about what was needed for the shopping.<br />

Examples like this can be easily multiplied. There are all sorts of beliefs that<br />

we form in haste, where we could have stopped to consider the various realistic hypotheses<br />

consistent with the evidence, and doing so would have stopped us forming<br />

the belief. Indeed, unless one is a real master of belief formation, it should not be<br />

too hard to remember many such episodes from one’s everyday life. These conclusions<br />

that we leap to are voluntary beliefs; we could have avoided forming them.<br />

And not only could we have avoided these formations, but we would have if we had<br />

followed the methods for belief formation that we approve of. That seems enough, to<br />

me, to say the formation is voluntary. This is not the only way that voluntary doings,



like calling a relevant possibility to mind, can matter to belief. The next example will<br />

be a little more controversial, but it points at the importance of dismissing irrelevant<br />

possibilities.<br />

Later that evening, Mark is watching his team, Geelong, lose another football<br />

game. Geelong are down by eight goals with fifteen minutes to go. His housemates<br />

are leaving to go see a movie, and want to know if Mark wants to come along. He<br />

says that he is watching the end of the game because Geelong might come back. One<br />

of his housemates replies, “I guess it is possible they’ll win. Like it is possible they’ll<br />

call you up next week to see if you want a game with them.” Mark replies, “Yeah,<br />

you are right. This one’s over. So, which movie?” Mark does not just give in to<br />

his housemates; he forms the belief that Geelong will lose. Later that night, when<br />

asked what the result of the game was, he says that he did not see the final score,<br />

but that Geelong lost by a fair bit. (In a recent paper (Weatherson, 2005a) I go into a<br />

lot more detail on the relation between not taking possibilities seriously, and having<br />

beliefs. The upshot is that what Mark does can count as belief formation, even if his<br />

credence that Geelong will lose does not rise.)<br />

Now it is tempting, or perhaps I should say that I am tempted, to view the housemate<br />

as offering Mark a reason to believe that Geelong will lose. We could view the<br />

housemate’s comments as shorthand for the argument that Geelong’s winning is as<br />

likely as Mark’s playing for Geelong, and since the latter will not happen, neither<br />

will the former. And maybe that is part of what the housemate is doing. But the<br />

larger part is that she is mocking Mark for his misplaced confidence. And the point<br />

of mocking someone, at least the point of constructive mockery like this, is to get<br />

them to change their attitudes. Mark does so, by ceasing to take seriously the possibility<br />

that Geelong will come back. In doing so, he exercises a capacity he had for a<br />

while, the capacity to cease taking this unserious possibility seriously, but needed to<br />

be prompted to use.<br />

In both cases I say Mark’s belief formation is voluntary. In the first case he forms<br />

the belief because he does not exercise his doxastic self-control. He should have hesitated<br />

and not formed a belief until he checked the orange juice. And he would have<br />

done so if only he’d thought of the possibility that the container was empty. But he<br />

did not. And just as things we do because we do not bring the right thing to mind,<br />

like Murray’s swearing in the second case, are voluntary and blameworthy, Mark’s<br />

belief is voluntary and blameworthy. In the second case, he forms the belief by ceasing<br />

to take an unserious possibility seriously. In most cases of non-perceptual, non-testimonial<br />

belief formation, there is a counter-possibility that we could have taken<br />

seriously. Skill at being a believer involves not taking extreme possibilities, from<br />

Cartesian sceptical scenarios to unlikely footballing heroics, seriously. Exercises of<br />

such skill are rarely, if ever, volitional. But just as other mental activities that are<br />

not volitional can be voluntary and praiseworthy, not taking an extreme possibility<br />

seriously can be voluntary and praiseworthy. 6<br />

6 (Ryle, 1949, 29ff) stresses the importance of calling the right things to mind to rational thought and<br />

action. I am using a case here where Mark deliberately casts an option from his mind, but the more general<br />

point is that what possibilities we call to mind is a crucial part of rational action, and can be praiseworthy<br />

or blameworthy, whether or not it is volitional.



I have made two claims for Mark’s beliefs in the above two cases. First, they are<br />

instances of voluntary belief formation. In each case he could have done otherwise,<br />

either by exercising or failing to exercise his capacity to take various hypotheses seriously.<br />

Second, they are appropriate subjects of praise and blame. I imagine some<br />

people will agree with the second point but not the first. They will say that only<br />

volitional actions are voluntary, even though things we do, like bringing relevant considerations<br />

to mind, are praiseworthy or blameworthy. Such people will agree with<br />

most of what I say in this paper. In particular, they’ll agree that the examples involving<br />

Mark undermine Alston’s argument against the applicability of deontological<br />

concepts in epistemology. So I am not going to die in a ditch over just what we call<br />

voluntary. That is, I won’t fuss too much over whether we want to say premise 2 in<br />

Ryan’s formulation of the argument is shown to be false by these examples (as I say)<br />

or premise 1 is shown to be false (as such an objector will say). I will just note that it<br />

is hard for such people to say intuitive things about the second instance of Murray’s<br />

swearing, and this seems like a strong reason to not adopt their position. 7<br />

5 Ryan and Steup<br />

Sharon Ryan has a slightly different view. She thinks that the truth of voluntarism<br />

consists in the fact that we hold certain beliefs intentionally. She does not offer an<br />

analysis of what it is to do something intentionally, except to say that consciously<br />

deciding to do something is not necessary for doing it intentionally, but doing it<br />

purposefully is (Ryan, 2003, 70-71). In a similar vein, she says “When there’s a car<br />

zooming toward me and I believe that there is, I’m believing freely because I’m believing<br />

what I mean to believe.” (Ryan, 2003, 74) This is said to be an intentional, and<br />

I take it a voluntary, belief.<br />

It seems to me that there’s a large difference between things we voluntarily do,<br />

and things we mean to do, or do purposefully. There are several things we do voluntarily<br />

without meaning to do them. Murray’s swearing in the second example above<br />

is one instance. When we misspeak, or (as I frequently do) mistype, we do things<br />

voluntarily without meaning to do them. By ‘mistype’ I do not mean cases where we<br />

simply hit the wrong key, but such cases as where I write in one more negation than<br />

I meant to, or, as I did earlier this evening, write “S is justified in believing that p”<br />

when I meant to write “S is justified in believing that she is justified in believing that<br />

p.” These are voluntary actions because I had the capacity to get it right, but did not<br />

exercise the capacity. But they are not things I meant to do. (I suspect there are also<br />

cases where we do things because we mean to do them, but they are not voluntary.<br />

These include cases where we train ourselves to produce a reflexive response. But I<br />

will not stress such cases here.)<br />

7 Ryle seems to have taken an intermediate position. He holds, I think, the view that voluntary acts<br />

are culpable acts where we had the capacity to do otherwise (71). So Mark’s belief about the orange juice<br />

is voluntary because he had the capacity to retain doubt, and nothing prevented him exercising it. But<br />

the belief about the football is not voluntary because we should not talk about praiseworthy acts being<br />

voluntary or involuntary. The last point is the kind of error that (Grice, 1989, Ch.1) showed us how to<br />

avoid.



Matthias Steup (2008) argues that if compatibilism is true about free action, then<br />

our beliefs are free. His argument consists in running through the most plausible candidates<br />

to be compatibilist notions of freedom, and for each candidate that is plausible,<br />

showing that at least some of our beliefs satisfy the purported conditions on free<br />

actions. I agree with a lot of what Steup says; indeed, this paper has been heavily influenced<br />

by what he says. But one crucial analogy fails, I think. Steup is concerned to<br />

reject the premise that if Φ-ing is free, one Φs because one has formed the intention<br />

to Φ. His response centres around ‘automatic’ actions, such as the things we do when<br />

starting our drive to work: inserting the key, shifting into reverse, etc.<br />

The question is whether they are caused by any antecedently formed<br />

intentions. I don’t think they are. . . . I didn’t form an intention to . . .<br />

shift into reverse. . . . I do things like that automatically, without thinking<br />

about them, and I assume you do too. But one can’t form an intention<br />

to Φ without thinking about Φing . . . Just one more example: I’d like to<br />

see the person who, just before brushing her teeth, forms the intention<br />

to unscrew the cap of the toothpaste tube. (Steup, 2008, 383)<br />

I suspect that Steup simply has to look in the mirror. It is true that we do not<br />

usually form conscious intentions to shift into reverse, or unscrew the cap, but not<br />

all intentions are conscious. If we were asked later, perhaps by someone who thought<br />

we’d acted wrongly, whether we intended to do these things, the natural answer is yes.<br />

The best explanation of this is that we really did have an intention to do them, albeit<br />

an unconscious one. (I am indebted here to Ishani Maitra.)<br />

Steup is right that free actions do not require a prior intention, but his examples<br />

do not quite work. The examples I have used above are the Rylean regress stoppers,<br />

such as acts of imagination, and actions that we do because we did not think, like<br />

Murray’s swearing. If asked later whether he intended to say what he said, Murray<br />

would say yes in the third example, but (I think) no in the first and second. Intuitively,<br />

I think, he did not have such an intention. 8<br />

6 Involuntarism about Perceptual Beliefs<br />

In some early 1990s papers, Daniel Gilbert and colleagues defended a rather startling<br />

thesis concerning the relation of comprehension and belief (Gilbert et al., 1990; Gilbert,<br />

1991; Gilbert et al., 1993). Casual introspection suggests that when one reads<br />

or hears something, one first comprehends it and then, if it is backed by sufficient<br />

reasons, believes it. Gilbert (1991) argues against this seeming separation of comprehension<br />

and belief, and in favour of a view said to derive from Spinoza. When we<br />

8 If so, Murray is not weak-willed according to Holton’s theory of will, but, since he does not keep his<br />

resolution, he is weak-willed according to Ryle’s otherwise similar theory. This seems to be an advantage of<br />

Holton’s theory over Ryle’s. Murray’s problem is not that his will was weak; it is that it was not called on.<br />

More generally, Ryle’s identification of weakness of will with irresoluteness seems to fail for people who<br />

frequently forget their resolutions. These people are surely irresolute, but (in agreement with Holton’s<br />

theory) I think they are not weak-willed.



comprehend a sentence, we add it to our stock of beliefs. If the new belief is implausible<br />

given our old beliefs, then we “unbelieve” it. 9<br />

We may picturesquely compare the two models of belief and comprehension to<br />

two models for security. The way security works at a nightclub is that anyone can<br />

turn up at the door, but only those cleared by the guards are allowed in. On the<br />

other hand, the way security works at a shopping mall is that anyone is allowed in,<br />

but security might remove those it regards as undesirable. Intuitively, our minds<br />

work on the nightclub model. A hypothesis can turn up and ask for admission, but it<br />

has to be approved by our cognitive security before we adopt it as a belief. Gilbert’s<br />

position is that we work on the shopping mall model. Any hypothesis put in front of<br />

us is allowed in, as a belief, and the role of security is to remove troublemakers once<br />

they have been brought inside.<br />

Now I do not want to insist Gilbert’s theory is correct. The experimental evidence<br />

for it is challenged in a recent paper (Hasson et al., 2005). But I do want to<br />

argue that if it is correct, then there is a kind of belief that is clearly involuntary. We<br />

do not have much control over what claims pass in front of our eyes, or to our ears.<br />

(We have some indirect control over this – we could wear eye shades and ear plugs<br />

– but no direct control, which is what’s relevant.) If all such claims are believed,<br />

these are involuntary beliefs. To be sure, nothing Gilbert says implies that we can<br />

not quickly regain voluntary control over our beliefs as we unbelieve the unwanted<br />

inputs. But in the time it takes to do this, our beliefs are out of our control.<br />

Gilbert’s theory is rather contentious, but there are other kinds of mental representations<br />

that it seems clear we can not help forming. In The Modularity of Mind,<br />

Jerry Fodor has a long discussion of how the various input modules that he believes<br />

to exist are not under our voluntary control. 10 If I am sitting on a train opposite<br />

some people who are chatting away, I can not help but hear what they say. (Unless,<br />

perhaps, I put my fingers in my ear.) This is true not just in the sense that I can not<br />

help receive the sound waves generated by their vocalisations. I also can not help<br />

interpreting and comprehending what they are saying. Much as I might like to not<br />

be bothered with the details of their lives, I can not help but hear what they say as a<br />

string of English sentences. Not just hearing, but ‘hearing as’, happens automatically.<br />

This automatic ‘hearing as’ is not under my voluntary control. I do not do it<br />

because I want to do it, or as part of a general plan that I endorse or have chosen to<br />

undertake. It does not reflect any deep features of my character. (Frankly I would<br />

much rather that I just heard most of these conversations as meaningless noise, like<br />

the train’s sound.) But I do it, involuntarily, nonetheless. This involuntariness is<br />

reflected in some of our practices. A friend tells me not to listen to X, because X<br />

is so often wrong about everything. The next time I see the friend, I say that I now believe<br />

that p, and when the friend asks why, I say it is because X said that p. The friend<br />

might admonish me. They will not admonish me for being within hearing range of<br />

X; that might have been unavoidable. And, crucially, they will not admonish me for<br />

interpreting X’s utterances. Taken literally, that might be what they were asking me<br />

9 The evidence for this view is set out in Gilbert et al. (1990, 1993).<br />

10 As he says, they have a mandatory operation. See pages 52-55 in particular, but the theme is central to<br />

the book.



not to do. But they’ll know it was unavoidable. What they were really asking me not<br />

to do was the one relevant thing that I had control over, namely believe what X said.<br />

As Fodor points out at length, both ‘seeing as’ and ‘hearing as’ are generally outside<br />

voluntary control. Our perceptual systems, and by this I am including verbal processing<br />

systems, quickly produce representations that are outside voluntary control<br />

in any sense. If any of these representations amount to beliefs, then there are some<br />

involuntary beliefs that we have. So we might think that in the case above, although<br />

it was up to me to believe that p, it was not up to me to believe that, say, X said that p,<br />

because this belief was produced by a modular system over which I have no control.<br />

This is not the position that Fodor takes. He thinks that beliefs are not produced<br />

by input modules. Rather, the non-modular part of the mind, the central processor,<br />

is solely responsible for forming and fixing beliefs. And the operation of this central<br />

processor is generally not mandatory, at least not in the sense that the operation of<br />

the modules is mandatory. Whether this is right seems to turn (in part) on a hard<br />

question to do with the analysis of belief.<br />

Let us quickly review Fodor’s views on the behaviour of input modules. The<br />

purpose of each module is to, within a specified domain, quickly and automatically<br />

produce representations of the world. These are, as on the nightclub model, then<br />

presented to cognition to be allowed in as beliefs or not. Here is how Fodor puts it.<br />

I am supposing that input systems offer central processes hypotheses<br />

about the world, such hypotheses being responsive to the current, local<br />

distribution of proximal stimulations. The evaluation of these hypotheses<br />

in light of the rest of what one knows is one of the things that central<br />

processes are for; indeed, it is the fixation of perceptual belief. (Fodor,<br />

1983, 136)<br />

But these representations do not just sit there as hypotheses. They can also guide action<br />

prior to being ‘approved’ by the central processes. That, at least, seems to be the point<br />

of Fodor’s discussion of the evolutionary advantages of having fast modules (Fodor,<br />

1983, 70-71). The core idea is that when one is at risk of being eaten by a panther,<br />

there is much to be said for a quick, automatic, panther recognition device. But<br />

there is just as much to be said for acting immediately on one’s panther recognition<br />

capacities rather than, say, searching for possible reasons why this panther appearance<br />

might be deceptive. And browsing reason space for such evidence of deceptions is<br />

just what central processes, in Fodor’s sense, do. So it seems the natural reaction to<br />

seeing a panther should be, and is, guided more-or-less directly by the input modules<br />

not central processes.<br />

So these ‘hypotheses’ are representations with a belief-like direction of fit (i.e., they<br />

are responsive to the world) that guide action in the way that beliefs do. These are<br />

starting to sound a lot like beliefs. Perhaps we should take a Gilbert-style line and<br />

say that we automatically believe what we perceive, and the role of Fodorian central<br />

processes is not to accept or reject mere hypotheses, but to unbelieve undesirable



inputs. 11 There are a number of considerations that can be raised for and against this<br />

idea, and perhaps our concept of belief is not fine enough to settle the matter. But<br />

let’s first look at three reasons for thinking these inputs are not beliefs.<br />

First, if they are beliefs then we are often led into inconsistency. If we are looking<br />

at a scene we know to be illusory, then we might see something as an F when we<br />

know it is not an F. If the outputs of visual modules are beliefs, then we inconsistently<br />

believe both that it is and is not F. Perhaps this inconsistency is not troubling,<br />

however. After all, one of the two inconsistent beliefs is involuntary, so we are not<br />

responsible for it. So this inconsistency is not a sign of irrationality, just a sign of<br />

defective perception. And that is not something we should be surprised by; the case<br />

by definition is one where perception misfires.<br />

Second, the inputs do not, qua inputs, interact with other beliefs in the right kind<br />

of way. Even if we believe that if p then q, and perceive that p, we will not even be<br />

disposed to infer that q unless and until p gets processed centrally. On this point, see<br />

Stich (1978) and (Fodor, 1983, 83-86). The above considerations in favour of treating<br />

inputs as beliefs turned heavily on the idea that they have the same functional characteristics<br />

as paradigm beliefs. But as David Braddon-Mitchell and Frank Jackson (2007,<br />

114-123) stress, functionalism can only be saved from counterexamples if we include<br />

these inferential connections between belief states in the functional characterisation of<br />

belief. So from a functionalist point of view, the encapsulation of input states counts<br />

heavily against their being beliefs.<br />

Finally, if Fodor is right, then the belief-like representations of the central processes<br />

form something like a natural kind. On the other hand, the class consisting of<br />

these representations plus the representations of the input modules looks much more<br />

like a disjunctive kind. Even if all members of the class play the characteristic role<br />

of beliefs, we might think it is central to our concept of belief that belief is a natural<br />

kind. So these inputs should not count as beliefs.<br />

On the other hand, we should not overestimate the role of central processes, even<br />

if Fodor is right that central processes are quite different to input systems. There are<br />

two related features of the way we process inputs that point towards counting some<br />

inputs as beliefs, and hence as involuntary beliefs. The first feature is that we do not<br />

have to put any effort into believing what we see. On the contrary, as both Descartes<br />

and Hume were well aware, we believe what we see by default, and have to put effort<br />

into being sceptical. The second feature is that, dramatic efforts aside, we can only<br />

be so sceptical. Perhaps sustained reflection on the possibility of an evil demon can<br />

make us doubt all of our perceptions at once. But in all probability, at least most of<br />

the time, we can not doubt everything we see and hear. 12 We can perhaps doubt any<br />

perceptual input we receive, but we can not doubt them all.<br />

In the picturesque terms from above, we might think our security system is less<br />

like a nightclub and more like the way customs appears to work at many airports.<br />

11 To be clear, the position being considered here is not that we automatically believe p when someone<br />

says p to us, but that we automatically believe that they said that p.<br />

12 As noted in the last footnote, when I talk here about what we hear, I mean to include propositions of<br />

the form S said that p, not necessarily the p that S says.



(Heathrow Airport is especially like this, but I think it is not that unusual.) Everyone<br />

gets a cursory glance from the customs officials, but most people walk through<br />

the customs hall without even being held up for an instant, and there are not enough<br />

officials to stop everyone even if they wanted to. Our central processes, faced with<br />

the overwhelming stream of perceptual inputs, are less the all-powerful nightclub<br />

bouncer and more the overworked customs official, looking for the occasional smuggler<br />

who should not be getting through.<br />

The fact that inputs turn into fully fledged beliefs by default is some reason to say<br />

that they are beliefs as they stand. It is noteworthy that what Gilbert et al.’s experiments<br />

primarily tested was whether sentences presented to subjects under cognitive<br />

load ended up as beliefs of the subjects. Now this could be because comprehending<br />

a sentence implies, at least temporarily, believing it. But perhaps a more natural<br />

reading in the first instance is that inputted sentences turn into beliefs unless we do<br />

something about it. Gilbert et al. are happy to infer that in this case the inputs are<br />

beliefs until and unless we do that something. This seems to be evidence that the<br />

concept of belief that philosophers and psychologists use includes states that need to be<br />

actively rejected if they are not to acquire all the paradigm features of belief. And<br />

that includes the inputs from Fodorian modules.<br />

That argument is fairly speculative, but we can make more of the fact that subjects<br />

can not stop everything coming through. This implies that there will be some long<br />

disjunctions of perceptual inputs that they will end up believing no matter how hard<br />

they try. Any given input can be rejected, but subjects only have so much capacity to<br />

block the flow of perceptual inputs. So some long disjunctions will turn up in their<br />

beliefs no matter how hard they try to keep them out. I think these are involuntary<br />

beliefs.<br />

So I conclude tentatively that perceptual inputs are involuntary beliefs, at least for<br />

the time it would take the central processes to evaluate them, were they disposed to do so.<br />

And I conclude less tentatively that subjects involuntarily believe long disjunctions<br />

of perceptual inputs. So some beliefs are involuntary.<br />

Space considerations prevent a full investigation of this, but there is an interesting<br />

connection here to some late medieval ideas about evidence. In a discussion of<br />

how Descartes differed from his medieval influences, Matthew L. Jones writes “For<br />

Descartes, the realignment of one’s life came about by training oneself to assent only<br />

to the evident; for the scholastics, assenting to the evident required no exercise, as<br />

it was automatic.” (Jones, 2006, 84) 13 There is much contemporary interest in the<br />

analysis of evidence, with Timothy Williamson’s proposal that our evidence is all of<br />

our knowledge being a central focus (Williamson, 2000a, Ch. 9). I think there’s much<br />

to be said for using Fodor’s work on automatic input systems to revive the medieval<br />

idea that the evident is that which we believe automatically, or perhaps it is those<br />

pieces of knowledge that we came to believe automatically. As I said though, space<br />

prevents a full investigation of these interesting issues.<br />

13 Jones attributes this view to Scotus and Ockham, and quotes Pedro Fonseca as saying almost explicitly<br />

this in his commentary on Aristotle’s Metaphysics.



7 Epistemological Consequences<br />

So some of our beliefs (loosely speaking, the perceptual beliefs) are spontaneous and<br />

involuntary, while other beliefs, the inferential beliefs, are voluntary in that we have<br />

the capacity to check them by paying greater heed to counter-possibilities. (In what<br />

follows it will not matter much whether we take the spontaneous beliefs to include<br />

all the perceptual inputs, or just the long disjunctions of perceptual inputs that are<br />

beyond our capacity to reject. I will note the few points where it matters significantly.)<br />

This has some epistemological consequences, for the appropriate standards<br />

for spontaneous, involuntary beliefs are different to the appropriate standards for<br />

considered, reflective beliefs. I include in the latter category beliefs that were formed<br />

when considered reflection was possible, but was not undertaken.<br />

To think about the standards for spontaneous beliefs, start by considering the<br />

criteria we could use to say that one kind of animal has a better visual system than<br />

another. One dimension along which we could compare the two animals concerns<br />

discriminatory capacity – can one animal distinguish between two things that the<br />

other cannot distinguish? But we would also distinguish between two animals with<br />

equally fine-grained visual representations, and the way we would distinguish is in<br />

terms of the accuracy of those representations. Some broadly externalist, indeed<br />

broadly reliabilist, approach has to be right when it comes to evaluating the visual<br />

systems of different animals.<br />

Things are a little more complicated when it comes to evaluating individual visual<br />

beliefs of different animals, but it is still clear that we will use externalist considerations.<br />

So imagine we are looking for standards for evaluating particular visual beliefs<br />

of again fairly basic animals. One very crude externalist standard we might use is<br />

that a belief is good iff it is true. Alternatively, we might say that the belief is good<br />

iff the process that produces it satisfied some externalist standard, e.g. it is generally<br />

reliable. Or we might, in a way, combine these and say that the belief is good iff it<br />

amounts to knowledge, incorporating both the truth and reliability standards. It is<br />

not clear which of these is best. Nor is it even clear which, if any, animals without<br />

sophisticated cognitive systems can be properly said to have perceptual beliefs. (I will<br />

not pretend to be able to evaluate the conceptual and empirical considerations that<br />

have been brought to bear on this question.) But what is implausible is to say both that<br />

these animals have beliefs and that the relevant epistemic standards for evaluating these<br />

beliefs are broadly internal.<br />

This matters to debates about the justificatory standards for our beliefs because<br />

we too have perceptual beliefs. And the way we form perceptual beliefs is not that<br />

different from the way simple animals do. (If the representations of input processes<br />

are beliefs, then it does not differ in any significant way.) When we form beliefs in<br />

ways that resemble those simple believers, most notably when we form perceptual<br />

beliefs, we too are best evaluated using externalist standards. The quality of our<br />

visual beliefs, that is, seems to directly track the quality of our visual systems. And<br />

the quality of our visual system is sensitive to external matters. So the quality of our<br />

visual beliefs is sensitive to external matters.



On the other hand, when we reason, we are doing something quite different to<br />

what a simple animal can do. A belief that is the product of considered reflection<br />

should be assessed, inter alia, by assessing the standards of the reflection that produced<br />

it. To a first approximation, such a belief seems to be justified if it is well supported<br />

by reasons. Some reasoners will be in reasonable worlds, and their beliefs will be<br />

mostly true. Some reasoners will be in deceptive worlds, and many of their beliefs<br />

will be false. But this does not seem to change what we say about the quality of their<br />

reasoning. This, I take it, is the core intuition behind the New Evil Demon problem,<br />

which we will address at greater length below.<br />

So we’re naturally led to a view where epistemic justification has a bifurcated<br />

structure. A belief that is the product of perception is justified iff the perception is<br />

reliable; a belief that is (or could have been) the product of reflection is justified iff it<br />

is well-supported by reasons. 14 This position will remind many of Ernest Sosa’s view<br />

that there is animal knowledge, and higher knowledge, or scientia (Sosa, 1991, 1997).<br />

And the position is intentionally similar to Sosa’s. But there is one crucial difference.<br />

On my view, there is just one kind of knowledge, and the two types of justification<br />

kick in depending on the kind of knower, or the kind of knowing, that is in question.<br />

If we simply form perceptual beliefs, without the possibility of reconsidering them<br />

(in a timely manner), then if all goes well, our beliefs are knowledge. Not some lesser<br />

grade of animal knowledge, but simply knowledge. To put it more bluntly, if you’re<br />

an animal, knowledge just is animal knowledge. On the other hand, someone who<br />

has the capacity (and time) to reflect on their perceptions, and fails to do so even<br />

though they had good evidence that their perceptions were unreliable, does not have<br />

knowledge. Their indolence defeats their knowledge. Put more prosaically, the more<br />

you are capable of doing, the more that is expected of you.<br />

8 The New Evil Demon Problem<br />

The primary virtue of the above account, apart from its intuitive plausibility, is that<br />

it offers a satisfactory response to the New Evil Demon argument. The response<br />

in question is not new; it follows fairly closely the recent response due to Clayton<br />

Littlejohn (2009), who in turn builds on responses due to Kent Bach (1985) and Mylan<br />

Engel (1992). But I think it is an attractive feature of the view defended in this paper<br />

that it coheres so nicely with a familiar and attractive response to the argument.<br />

The New Evil Demon argument concerns victims of deception who satisfy all<br />

the internal standards we can imagine for being a good epistemic agent. So they are<br />

always careful to avoid making fallacious inferences, they respect the canons of good<br />

inductive and statistical practice, they do not engage in wishful thinking, and so on.<br />

The core intuition of the New Evil Demon argument is that although these victims<br />

14 There is a delicate matter here about individuating beliefs. If I look up, see, and hence believe it<br />

is raining outside, that is a perceptual belief. I could have recalled that it was raining hard a couple of<br />

minutes ago, and around here that kind of rain does not stop quickly, and formed an inferential belief<br />

that it was raining outside. I want to say that that would have been a different belief, although it has the<br />

same content. If I do not say that, it is hard to defend the position suggested here when it comes to the<br />

justificatory status of perceptual beliefs whose contents I could have otherwise inferred.



do not have knowledge (because their beliefs are false), they do have justified beliefs.<br />

Since the beliefs do not satisfy any plausible externalist criteria of justification, we<br />

conclude that no externalist criteria can be correct. The argument is set out by Stewart<br />

Cohen (1984).<br />

A fairly common response is to note that even according to externalist epistemology<br />

there will be some favourable epistemic property that the victim’s beliefs have,<br />

and this can explain our intuition that there is something epistemically praiseworthy<br />

about the victim’s beliefs. My approach is a version of this, one that is invulnerable<br />

to recent criticisms of the move. For both this response and the criticism to it, see<br />

James Pryor (2001). I am going to call my approach the agency approach, because the<br />

core idea is that the victim of the demon is in some sense a good doxastic agent, in<br />

that all their exercises of doxastic agency are appropriate, although their perception<br />

is quite poor and this undermines their beliefs.<br />

As was noted above, the quality of our visual beliefs is sensitive to external matters.<br />

This is true even for the clear-thinking victim of massive deception. Denying<br />

that the victim’s visual beliefs are as good as ours is not at all implausible; indeed intuition<br />

strongly supports the idea that they are not as good. Where they are as good as<br />

we are is in exercising their epistemic agency. That is to say, they are excellent epistemic<br />

agents. But since there is more to being a good believer than being a good epistemic<br />

agent (there is also, for example, the matter of being a good perceiver), they are not as<br />

good at believing as we are.<br />

So the short version of my response to the New Evil Demon problem is this.<br />

There are two things we assess when evaluating someone’s beliefs. We evaluate how<br />

good an epistemic agent they are. And we evaluate how good they are at getting evidence<br />

from the world. Even shorter, we evaluate both their collection and processing<br />

of evidence. Externalist standards for evidence collection are very plausible, as is<br />

made clear when we consider creatures that do little more than collect evidence. The<br />

intuitions that the New Evil Demon argument draws on come from considering how<br />

we process evidence. When we consider beliefs that are the products of agency, such<br />

as beliefs that can only be arrived at by extensive reflection, we naturally consider the<br />

quality of the agency that led to those beliefs. In that respect a victim might do as<br />

well as we do, or even better. But that is no threat to the externalist conclusion that<br />

they are not, all things considered, as good at believing as we are.<br />

As I mentioned earlier, this is similar to a familiar response to the argument that<br />

James Pryor considers and rejects. He considers someone who says that what is in<br />

common to us and the clear-thinking victim is that we are both epistemically blameless.<br />

The objection he considers says that the intuitions behind the argument come<br />

from confusing this notion of being blameless with the more general notion of being<br />

justified. This is similar to my idea that the victim might be a good epistemic agent<br />

while still arriving at unjustified beliefs because they are so bad at evidence collection.<br />

But Pryor argues that this kind of deontological approach cannot capture all of the<br />

intuitions around the problem.<br />

Pryor considers three victims of massive deception. Victim A uses all sorts of<br />

faulty reasoning practices to form beliefs, practices that A could, if they were more<br />

careful, see were faulty. Victim B was badly ‘brought up’, so although they



use methods that are subtly fallacious, there is no way we could expect B to notice<br />

these mistakes. Victim C is our paradigm of good reasoning, though of course C<br />

still has mostly false beliefs because all of their apparent perceptions are misleading.<br />

Pryor says that both B and C are epistemically blameless; C because they are a perfect<br />

reasoner and B because they cannot be blamed for their epistemic flaws. But we<br />

intuit that C is better, in some epistemic respects, than B. So there is some internalist<br />

friendly kind of evaluation that is stronger than being blameless. Pryor suggests that<br />

it might be being justified, which he takes to be an internalist but non-deontological<br />

concept.<br />

The agency approach has several resources that might be brought to bear on this<br />

case. For one thing, even sticking to deontological concepts we can make some distinctions<br />

between B and C. We can, in particular, say that C is epistemically praiseworthy<br />

in ways that B is not. Even if B cannot be blamed for their flaws, C can be<br />

praised for not exemplifying those flaws. It is consistent with the agency approach<br />

to say that C can be praised for many of their epistemic practices while saying that,<br />

sadly, most of C’s beliefs are unjustified because they are based on faulty evidence, or<br />

on merely apparent evidence.<br />

The merits of this kind of approach can be brought out by considering how we<br />

judge agents who are misled about the nature of the good. Many philosophers think<br />

that it is far from obvious which character traits are virtues and which are vices. Any<br />

particular example is bound to be controversial, but I think it should be uncontroversial<br />

that there are some such examples. So I will assume that, as Simon Keller (2005)<br />

suggests, it is true but unobvious that patriotism is not a virtue but a vice.<br />

Now consider three agents D, E and F. D takes patriotism to extremes, developing<br />

a quite hostile strand of nationalism, which leads to unprovoked attacks on<br />

non-compatriots. E is brought up to be patriotic, and lives this way without acting<br />

with any particular hostility to foreigners. F is brought up the same way, but comes<br />

to realise that patriotism is not at all virtuous, and comes to live according to purely<br />

cosmopolitan norms. Now it is natural to say that D is blameworthy in a way that E<br />

and F are not. As long as it seems implausible to blame E for not working through<br />

the careful philosophical arguments that tell against following patriotic norms, we<br />

should not blame E for being somewhat patriotic. But it is also natural to say that F<br />

is a better agent than either D or E. That is because F exemplifies a virtue, cosmopolitanism,<br />

that D and E do not, and does not exemplify a vice, patriotism, that D and E<br />

do exemplify. F is in this way praiseworthy, while D and E are not.<br />

This rather strongly suggests that when agents are misled about norms, a gap<br />

will open up between blamelessness and praiseworthiness. We can say that Pryor’s<br />

victim C is a better epistemic agent than A or B, because they are praiseworthy in a<br />

way that A and B are not. And we can say this even though we do not say that B is<br />

blameworthy and we do not say that being a good epistemic agent is all there is to<br />

being a good believer.<br />

At this point the internalist might respond with a new form of the argument. A<br />

victim of deception is, they might intuit, just as praiseworthy as a regular person,<br />

if they perform the same inferential moves. I think at this point the externalist can<br />

simply deny the intuitions. In general, praiseworthiness is subject to a degree of luck.



(Arguably blameworthiness is as well, but saying so sounds somewhat more counterintuitive<br />

than saying praiseworthiness is a matter of luck.) For example, imagine two<br />

people dive into ponds in which they believe there are drowning children. The first<br />

saves two children. The second was mistaken; there are no children to be rescued in<br />

the pond they dive into. Both are praiseworthy for their efforts, but they are not<br />

equally praiseworthy. The first, in particular, is praiseworthy for rescuing two children.<br />

As we saw in the examples of the writer and the good cricket captain above,<br />

praiseworthiness depends on outputs as well as inputs, and if the victim of deception<br />

produces beliefs that are defective, i.e. false, then through no fault of their own they<br />

are less praiseworthy.<br />

9 Praise and Blame<br />

As Pryor notes, many philosophers have thought that a deontological conception<br />

of justification supports an internalist theory of justification. I rather think that is<br />

mistaken, and that at least one common deontological understanding of what justification<br />

is entails a very strong kind of externalism. This is probably a reason to not<br />

adopt that deontological understanding.<br />

Assume, for reductio, that S’s belief that p is justified iff S is blameless in believing<br />

that p. I will call this principle J=B to note the close connection it posits between<br />

justification and blamelessness. Alston (1988) seems to identify the deontological<br />

conception of justification with J=B, or at least to slide between the two when offering<br />

critiques. But one of Alston’s own examples, the ‘culturally isolated tribesman’,<br />

suggests a principle that can be used to pull these two ideas apart. The example, along<br />

with Pryor’s three brains case, suggests that A1 is true.<br />

A1 It is possible for S to have a justified but false belief that her belief in p is justified.<br />

A1 is a special instance of the principle that justification does not entail truth. Some<br />

externalists about justification will want to reject the general principle, but all internalists<br />

(and indeed most externalists) will accept it. Now some may think that<br />

the general principle is right, but that beliefs about what we are justified in believing<br />

are special, and if they are justified they are true. But such an exception seems<br />

intolerably ad hoc. If we can have false but justified beliefs about some things, then<br />

presumably we can have false but justified beliefs about our evidence, since in principle<br />

our evidence could be practically anything. So the following situation seems<br />

possible; indeed it seems likely that something of this form happens frequently in<br />

real life. S has a false but justified belief that e is part of her evidence. S knows both<br />

that anyone with evidence e is justified in believing p in the absence of defeaters, and<br />

that there are no defeaters present. So S comes to believe, quite reasonably, that she<br />

is justified in believing that p. But S does not have this evidence, and in fact all of her<br />

evidence points towards ¬p. 15 So it is false that she is justified in believing p.<br />

15 I am assuming here that evidence of evidence need not be evidence. This seems likely to be true. In<br />

Bayesian terms, something can raise the probability of e, while lowering the probability of p, even though<br />

the probability of p given e is greater than the probability of p. Bayesian models are not fully general, but<br />

usually things that are possible in Bayesian models are possible in real life.
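The Bayesian claim in this footnote can be made concrete with a toy model. The sketch below (Python) uses an invented joint distribution over three propositions d, e and p; all the numbers are mine, chosen purely for illustration. It exhibits a case where learning d raises the probability of e and lowers the probability of p, even though e itself supports p.

```python
from itertools import product

# A toy joint distribution illustrating the footnote's claim: d can raise
# the probability of e while lowering the probability of p, even though
# P(p | e) > P(p). All numbers are invented for illustration.
# Structure: P(d) = 0.1; given not-d, e is rare but strongly indicates p;
# given d, e is common but p is unlikely either way.

def joint(d, e, p):
    p_d = 0.1 if d else 0.9
    p_e = (0.9 if d else 0.3) if e else (0.1 if d else 0.7)
    if d:
        p_p = 0.2 if p else 0.8
    else:
        base = 0.9 if e else 0.4          # e strongly indicates p when d is absent
        p_p = base if p else 1 - base
    return p_d * p_e * p_p

def prob(pred):
    return sum(joint(d, e, p)
               for d, e, p in product([True, False], repeat=3)
               if pred(d, e, p))

P_e, P_p = prob(lambda d, e, p: e), prob(lambda d, e, p: p)
P_e_given_d = prob(lambda d, e, p: d and e) / prob(lambda d, e, p: d)
P_p_given_d = prob(lambda d, e, p: d and p) / prob(lambda d, e, p: d)
P_p_given_e = prob(lambda d, e, p: e and p) / P_e

assert P_e_given_d > P_e   # d raises the probability of e
assert P_p_given_d < P_p   # ...while lowering the probability of p
assert P_p_given_e > P_p   # even though e itself supports p
```

Summing the joint over all eight worlds gives 1, so this is a genuine probability model, and in it evidence of evidence (d) is not itself evidence for p.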



The following principle seems to be a reasonable principle concerning blameless<br />

inference.<br />

A2 If S blamelessly believes that she is justified in believing that p, and on the basis<br />

of that belief comes to believe that p, then she is blameless in believing that p.<br />

This is just a principle of transfer of blamelessness. The quite natural thought is<br />

that you do not become blameworthy by inferring from I am justified in believing<br />

p to p. This inference is clearly not necessarily truth-preserving, but that is not a<br />

constraint on inferences that transfer blamelessness, since not all ampliative inferences<br />

are blameworthy. (Indeed, many are praiseworthy.) And it is hard to imagine a<br />

less blameworthy ampliative inference schema than this one.<br />

We can see this more clearly with an example of A2. Suzy sees a lot of Fs and<br />

observes they are all Gs. She infers that it is justified for her to conclude that all Fs are<br />

Gs. Now it turns out this is a bad inference. In fact, G is a gruesome predicate in her<br />

world, so that is not a justified inference. But Suzy, like many people, does not have<br />

the concept of gruesomeness, and without it had no reason to suspect that this would<br />

be a bad inference. So she is blameless. If all that is correct, it is hard to imagine that<br />

she becomes blameworthy by actually inferring from what she has so far that all Fs<br />

are in fact Gs. Perhaps you might think her original inference, that it is justified to<br />

believe all Fs are Gs, was blameworthy, but blame cannot kick in for the first time<br />

when she moves to the first order belief.<br />

I am now going to derive a contradiction from A1, A2 and J=B, and a clearly<br />

consistent set of assumptions about a possible case of belief.<br />

1. S justifiedly, but falsely, believes that she is justified in believing p. (Assumption, from A1)<br />

2. On the basis of this belief, S comes to believe that p. (Assumption)<br />

3. S blamelessly believes that she is justified in believing that p. (1, J=B)<br />

4. S blamelessly believes that p. (2, 3, A2)<br />

5. S is justified in believing that p. (4, J=B)<br />

6. It is false that S is justified in believing that p. (1)<br />
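The joint inconsistency this derivation exposes can also be checked mechanically. The sketch below is my own propositional encoding of the argument, not anything from the text: it represents the relevant statuses as booleans and verifies by brute force that no assignment satisfies A1 (as instantiated in the case), A2 and J=B together.

```python
from itertools import product

# Brute-force check that the assumptions of the derivation are jointly
# unsatisfiable. Four booleans describe S's situation:
#   jm: S justifiedly believes "I am justified in believing p"
#   bm: that meta-belief is blameless
#   bp: S's belief that p is blameless
#   jp: S is justified in believing p
# The encodings of A1, A2 and J=B below are my own sketch of the argument.

def consistent(jm, bm, bp, jp):
    a1 = jm and not jp               # A1 case: meta-belief justified but false
    jb = (jm == bm) and (jp == bp)   # J=B: justified iff blameless
    a2 = (not bm) or bp              # A2: blamelessness transfers to belief in p
    return a1 and jb and a2

models = [v for v in product([True, False], repeat=4) if consistent(*v)]
assert models == []   # no assignment satisfies A1, A2 and J=B together
```

The search mirrors steps 3–6 of the derivation: J=B forces the meta-belief to be blameless, A2 passes blamelessness down to the belief that p, and J=B then makes that belief justified, contradicting A1.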

One of A1, A2 and J=B has to go. If you accept J=B, I think it has got to be A1, since<br />

A2 is extremely plausible. But A1 only fails if we accept quite a strong externalist<br />

principle of justification, namely that justification entails truth. More precisely, we’re<br />

led to the view that justification entails truth when it comes to propositions about our<br />

own justification. But as we saw above, that pretty directly implies that justification<br />

entails truth when it comes to propositions about our own evidence. And, on the<br />

plausible assumption that evidence can be practically anything, that leads to there<br />

being a very wide range of cases where justification entails truth. So J=B entails this<br />

strong form of externalism.<br />

This does not mean that internalists cannot accept a deontological conception<br />

of justification. But the kind of deontological conception of justification that is left<br />

standing by this argument is quite different to J=B, and I think to existing deontological<br />

conceptions of justification. Here’s what it would look like. First, we say



that a belief’s being justified is not a matter of it being blameless, but a matter of it<br />

being in a certain way praiseworthy. Second, we say that the inference from I am<br />

justified in believing that p to p is not praiseworthy if the premise is false. So if we<br />

tried to run the above argument against J=P (the principle that justified beliefs are<br />

praiseworthy) it would fail at step 4. So anyone who wants to hold that justification<br />

is (even in large part) deontological, and wants to accept that justification can come<br />

apart from truth, should hold that justification is a kind of praiseworthiness, not a<br />

kind of blamelessness.


Luminous Margins<br />

Abstract<br />

Timothy Williamson has recently argued that few mental states are luminous,<br />

meaning that to be in that state is to be in a position to know that<br />

you are in the state. His argument rests on the plausible principle that<br />

beliefs only count as knowledge if they are safely true. That is, any belief<br />

that could easily have been false is not a piece of knowledge. I argue<br />

that the form of the safety rule Williamson uses is inappropriate, and the<br />

correct safety rule might not conflict with luminosity.<br />

1 Luminosity<br />

In Knowledge and Its Limits Timothy Williamson argues that few conditions are luminous<br />

(Williamson, 2000a, Ch. 4; all references to this book unless otherwise specified).<br />

A condition is luminous iff we know we are in it whenever we are. Slightly<br />

more formally, Williamson defines<br />

A condition C is defined to be luminous if and only if (L) holds:<br />

(L) For every case α, if in α C obtains, then in α one is in a position to<br />

know that C obtains (95).<br />

Intuitively, the argument against this is as follows. The following three conditions<br />

are incompatible.<br />

Gradual Change There is a series of cases, each very similar to adjacent cases, that<br />

starts with a case where C clearly obtains, and ends with a case where C clearly<br />

doesn’t obtain.<br />

Luminosity Whenever C obtains you can know it does.<br />

Safety Only safe beliefs count as knowledge, so whenever you can know that C<br />

obtains, C obtains in all very similar cases.<br />

Luminosity and Safety entail<br />

Tolerance Whenever C obtains, it obtains in all very similar cases.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Australasian<br />

Journal of Philosophy 83 (2004): 373-83. Thanks to Tamar Szabó Gendler, John Hawthorne,<br />

Chris Hill, Ernest Sosa and the AJP’s referees.



But Tolerance is incompatible with Gradual Change, since Tolerance entails that if<br />

the first member of the series is a case where C obtains, then every successive member<br />

is also a case where C obtains. Williamson argues that for any interesting epistemic<br />

condition, Gradual Change is a clear possibility. And he argues that Safety is a general<br />

principle about knowledge. So Luminosity must be scrapped. The counterexamples<br />

to Luminosity we get from following this proof through are always borderline cases<br />

of C obtaining. In these cases Luminosity fails because any belief that C did obtain<br />

would be unsafe, and hence not knowledge.<br />
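The incompatibility of Tolerance with Gradual Change can be illustrated with a toy series. In the sketch below (my own construction; the temperatures, threshold and step size are invented for illustration), Gradual Change holds for the series, so Tolerance must fail at some adjacent pair.

```python
# A toy illustration of why Tolerance is incompatible with Gradual Change.
# Model Mr Davis's morning as temperatures rising one small step at a time,
# with "feels cold" a simple threshold. All numbers are invented.

temps = [10 + i for i in range(11)]          # gradual change: 10, 11, ..., 20 degrees
feels_cold = [t < 15 for t in temps]         # C clearly obtains at the start, clearly not at the end

assert feels_cold[0] and not feels_cold[-1]  # Gradual Change holds for this series

# Tolerance: whenever C obtains, it also obtains in the adjacent (very similar) case.
tolerance = all(not feels_cold[i] or feels_cold[i + 1]
                for i in range(len(temps) - 1))

# Chaining Tolerance from the first case would force C to obtain at the end,
# so Tolerance must fail somewhere in the series:
assert not tolerance
```

The pair at which Tolerance fails is, unsurprisingly, the borderline case, which matches the observation above that the counterexamples to Luminosity generated by the proof are always borderline cases of C obtaining.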

I will argue, following Sainsbury (1996), that Williamson has misinterpreted the<br />

requirement that knowledge be safe. The most plausible safety condition might be<br />

compatible with Gradual Change and Luminosity, if we make certain plausible assumptions<br />

about the structure of phenomenal beliefs.<br />

One consequence of the failure of Luminosity is that a certain historically important<br />

kind foundationalist analysis of knowledge fails. This kind of foundationalist<br />

takes the foundations to be luminous. Although I think Williamson’s argument<br />

against Luminosity does not work, my objections are no help to the foundationalist.<br />

As I said, my objection to Williamson rests on certain assumptions about the<br />

structure of phenomenal beliefs. It is a wide open empirical and philosophical question<br />

whether these assumptions are true. If this kind of foundationalism provided a<br />

plausible analysis of knowledge, then it would be a wide open question whether our<br />

purported knowledge rested on any foundations, and hence a wide open question<br />

whether we really had any knowledge. But this is a closed question. It is a Moorean<br />

fact that we know many things. So while I object to Williamson’s claim that we have<br />

no luminous mental states, I do not object to the weaker claim that we might not have<br />

any luminous mental states, and this claim is enough to do much of the philosophical<br />

work to which Williamson puts Luminosity.<br />

2 Williamson’s Example<br />

Williamson suggests that (L), the formal rendition of Luminosity, fails for all interesting<br />

conditions even if we restrict the quantifier to those that are ‘physically and<br />

psychologically feasible’ (94), and I will assume that is what we are quantifying over.<br />

To argue that (L) fails for any interesting C, Williamson first argues that it fails in a<br />

special case, when C is the condition feeling cold, and then argues that the conditions<br />

that lead to failure here are met for any other interesting C. So I will also focus on<br />

the special case.<br />

Mr Davis’s apartment faces southwest, so while it is often cold in the mornings<br />

it always warms up as the midday and afternoon sun streams in. This morning Mr<br />

Davis felt cold when he awoke, but now at noon he is quite warm, almost hot. But the<br />

change from wake-up time to the present is rather gradual. Mr Davis does not take a<br />

hot bath that morning, nor cook a hot breakfast, but sits reading by the window until<br />

the sun does its daily magic. Assume, for the sake of the argument, that feeling cold<br />

is luminous, so whenever Mr Davis feels cold, he knows he feels cold. Williamson<br />

argues this leads to a contradiction as follows. (I’ve changed names and pronouns to<br />

conform with my example.)



Let t_0, t_1, . . . , t_n be a series of times at one millisecond intervals from<br />

dawn to noon. Let α_i be the case at t_i (0 ≤ i ≤ n). Consider a time t_i<br />

between t_0 and t_n, and suppose that at t_i Mr Davis knows that he feels<br />

cold. . . . Now at t_{i+1} he is almost equally confident that he feels cold,<br />

by the description of the case. So if he does not feel cold at t_{i+1}, then<br />

his confidence at t_i that he feels cold is not reliably based, for his almost<br />

equal confidence on a similar basis one millisecond later that he felt<br />

cold was misplaced . . . His confidence at t_i was reliably based in the way<br />

required for knowledge only if he feels cold at t_{i+1}. In the terminology<br />

of cases. . . :<br />

(I_i) If in α_i he knows that he feels cold, then in α_{i+1} he feels cold. (97)<br />

Given (L), all instances of (I_i), and the fact that Mr Davis feels cold when he awakes,<br />

we get the false conclusion that he now feels cold. So if we accept all instances of<br />

(I_i), we must conclude that (L) is false when C is feeling cold and ‘one’ denotes Mr<br />

Davis. Why, then, accept (I_i)? One move Williamson makes here is purely defensive.<br />

He notes that (I_i) is different from the conditionals that lead to paradox in the Sorites<br />

argument. The antecedent of (I_i) contains the modal operator ‘knows that’ absent from<br />

its consequent, so we cannot chain together instances of (I_i) to produce an implausible<br />

conditional claim. If that operator were absent then from all the instances of (I_i) it<br />

would follow that if Mr Davis feels cold at dawn he feels cold at noon, which is<br />

false. But by strengthening the antecedent, Williamson weakens (I_i) to avoid that<br />

conclusion. But the fact that (I_i) is not paradoxical is not sufficient reason to accept<br />

it.<br />

3 Reliability<br />

It is useful to separate out two distinct strands in Williamson’s argument for (I_i).<br />

One strand sees Williamson arguing for (I_i) by resting on the principle that beliefs<br />

constitute knowledge only if they are reliably based. The idea is that if Mr Davis’s<br />

belief that he feels cold is a bit of knowledge, it is reliable, and if it is reliable it is<br />

true in all similar situations, and hence it is true in α_{i+1}. The other strand sees him<br />

appealing to a vague but undoubtedly real requirement that beliefs must be safely true<br />

in order to be knowledge. Neither argument is successful, though the second kind of<br />

argument is better than the first.<br />

Williamson acknowledges Conee and Feldman’s arguments that no reliabilist epistemologist<br />

has yet solved the generality problem (100). But he takes this to be reason<br />

to abandon not the concept of reliability, but the hope of providing a reductive analysis<br />

of it. Williamson thinks we can get a long way by just resting on the intuitive<br />

concept of reliability. This seems to be a mistake. There are two ordinary ways of<br />

using ‘reliable’ in the context of discussing beliefs, and neither provides support for<br />

(I_i).<br />

First, and this is clearly not what is needed, sometimes ‘reliable’ just means true.<br />

This is the sense of the word in which we can consistently say, “It turned out the<br />

information that old Ronnie provided us about where the gov’nor was eating tonight



was reliable, which was plenty surprising since Ronnie hadn’t been right about anything<br />

since the Nixon administration.” This is the sense in which reliable means just<br />

what the etymology suggests it means, something that can be relied upon. And that<br />

means, in practice, true. But that won’t help at all, for if ‘reliable’ just means true,<br />

then nothing follows from the fact that knowledge is reliable that does not follow<br />

from the fact that it is factive.<br />

Second, there is a distinctively philosophical sense in which reliable means something<br />

more like true in a wide range of circumstances. This is the sense in which<br />

a stopped clock is not even reliable twice a day. At first, this might look to help<br />

Williamson a little more. But if philosophical usage is to be key, the second look<br />

is more discouraging. For in its philosophical usage, reliability does not even entail<br />

truth. And if reliability does not entail truth in the actual situation, it surely does<br />

not entail truth in nearby situations. But Williamson’s argument for (I i ) requires that<br />

reliability in α i entails truth in α i+1 . So on neither of its natural readings does the<br />

concept of reliability seal the argument here, and since we have no unnatural reading<br />

to fall back upon, the argument from reliability for (I i ) fails. To be fair, by chapter<br />

5 of Williamson’s book the concept of reliability that seems to be employed is scarcely<br />

distinguishable from the concept of safety. So let us turn to those arguments.<br />

4 Safety<br />

Williamson at times suggests that the core argument for (I i ) is a straight appeal to<br />

intuition. “[E]ven when we can appeal to rigorous rules, they only postpone the<br />

moment at which we must apply concepts in particular cases on the basis of good<br />

judgement. . . . The argument for (I i ) appeals to such judgement.” (101) The appeal<br />

to intuition is the royal road to scepticism, so we would be justified in being a little<br />

wary of it. Weinberg et al. (2001) discovered that undergraduates from the same social<br />

class as Williamson, Mr Davis and I would frequently judge that a subject could not<br />

know that a mule was a mule unless he could tell it apart from a cleverly painted zebra.<br />

The judgements of that class are not obviously the basis for a sane epistemology.<br />

Williamson undersells his argument by making it an appeal to judgement. For<br />

there is a principle here, if not a rigorous rule, that grounds the judgement. The<br />

principle is something like Ernest Sosa’s safety principle. The idea is that a belief<br />

does not constitute knowledge if it is false in similar situations. “[N]ot easily would<br />

S believe that p without it being the case that p.” (Sosa, 1999, 142) There is much<br />

to be said here about what is a similar situation. (David Lewis (1996b) discusses a<br />

concept of similarity in the context of saying that worlds can be salient, in his sense,<br />

in virtue of being similar to salient worlds.) It might turn out that there is no account<br />

of similarity that makes it plausible that this is a constraint on knowledge. But for<br />

present purposes I am prepared to grant (a) that only safe beliefs count as knowledge,<br />

and (b) that α i+1 is a similar situation to α i .<br />

This might seem like too much of a concession to Williamson, for it already conflicts<br />

with some platitudes about knowledge. Consider a case that satisfies the following<br />

three conditions. Some light reflects off a leopard some distance away and<br />

strikes our eyes. The impact of that light causes, by the normal processes, a belief



that a leopard is nearby to appear in our belief box. Beliefs, including leopard-related<br />

beliefs, that we form by this kind of process are on the whole very reliable. You<br />

might think these conditions are sufficient for our belief to count as knowledge that<br />

a leopard is present. The proponent of Safety denies this. She says that if, for example,<br />

there are several cheetahs around with a particularly rare mutation that makes them look much<br />

like leopards, then had we seen them at a similar distance we would have mistaken<br />

them for leopards. Since we could easily have had the belief that a leopard is nearby<br />

while there were no leopards, only cheetahs, nearby, the belief is not safe and so does<br />

not count as knowledge.<br />

There are two reasons to think that safety is too strong here, neither of which<br />

strike me as completely compelling. (I’m still conceding things to Williamson here.<br />

If there’s a general objection to Safety then his argument against Luminosity does not<br />

get off the ground. That’s not my position. As I’ll soon argue, I think Williamson<br />

has misinterpreted Safety.) The first reason is a worry that if we deny knowledge<br />

in a case of reliable veridical perception, we are conceding too much to the sceptic.<br />

But the proponent of Safety has a very good reason to distinguish this case from my<br />

current veridical perception of a table - my perception is safe and the perception of<br />

a leopard is not. So there is no slippery slope to scepticism here. The second is that<br />

the allegedly similar case is not really that similar, because in that case the belief is<br />

caused by a cheetah, not a leopard. But to regard cases where the evidence is different<br />

in this way as being dissimilar is to make the safety condition impotent, and Sosa<br />

has shown that we need some version of Safety to account for our intuitions about<br />

different cases. 1<br />

So I think some version of Safety should be adopted. I don’t think this gives us<br />

(I i ), for reasons related to some concerns first raised by Mark Sainsbury (1996). The<br />

role for a Safety condition in a theory of knowledge is to rule out knowledge by lucky<br />

guesses. This includes lucky guesses in mathematics. If Mr Davis guesses that 193<br />

plus 245 is 438, he does not thereby know what 193 plus 245 is. Can Safety show<br />

why this is so? Yes, but only if we phrase it in a certain way. Assume that we have a<br />

certain belief B with content p. (As it might be, Mr Davis’s belief with content 193 +<br />

245 = 438.) Then the following two conditions both have claims to being the correct<br />

analysis of ‘safe’ as it appears in Safety.<br />

Content-safety B is safe iff p is true in all similar worlds.<br />

Belief-safety B is safe iff B is true in all similar worlds.<br />

If we rest with content-safety, then we cannot explain why Mr Davis’s lucky guess<br />

does not count as knowledge. For in all nearby worlds, the content of the belief<br />

1 I assume here a relatively conservative epistemological methodology, one that says we should place a<br />

high priority on having our theories agree with our intuitive judgments. I’m in favour of a more radical<br />

methodology that makes theoretical virtues as important as agreement with particular intuitions (Weatherson,<br />

2003c). On the radical view Safety might well be abandoned. But on that view knowledge might<br />

be merely true belief, or merely justified true belief, so the argument for Luminosity will be a non-starter.<br />

But the argument of this paper does not rest on these radical methodological principles. The position I’m<br />

defending is that, supposing a standard methodological approach, we should accept a Safety principle. But<br />

as I’ll argue, the version of Safety Williamson adopts is not appropriate, and the appropriate version does<br />

not necessarily support the argument against Luminosity.



he actually has is true. If we use belief-safety as our condition though, I think we<br />

can show why Mr Davis has not just got some mathematical knowledge. The story<br />

requires following Marian David’s good advice for token physicalists and rejecting<br />

content essentialism about belief (David (2002); see also Gibbons (1993)). The part<br />

of Mr Davis’s brain that currently instantiates a belief that 193 plus 245 is 438 could<br />

easily have instantiated a belief that 193 plus 245 is 338, for Mr Davis is not very good<br />

at carrying hundreds while guessing. If, as good physicalists, we identify his belief<br />

with the part of the brain that instantiates it, we get the conclusion that this very<br />

belief could have had the false content that 193 plus 245 is 338. So the belief is not<br />

safe, and hence it is not knowledge.<br />

This lends some credence to the idea that it’s belief-safety, not content-safety,<br />

that’s the important safety criterion. When talking about Mr Davis’s mathematical<br />

hunches, belief-safety is a stronger condition than content-safety. But when talking<br />

about his feelings, things may be reversed.<br />

Let me tell you a little story about how Mr Davis’s mind is instantiated. Mr<br />

Davis’s phenomenal beliefs do not arise from one part of his brain, his belief box or<br />

mind’s eye, tracking another part, the part whose states constitute his feeling cold.<br />

Rather, when he is in some phenomenal state, the very same brain states constitute<br />

both the phenomena and a belief about the phenomena. Mr Davis’s brain is so wired<br />

that he could not have any sensation of radiant heat (or lack thereof) without his<br />

thereby believing that he is having just that sensation, because he could not have felt<br />

cold without that feeling itself being a belief that he felt cold. In that case, belief-safety<br />

will not entail (I i ). Imagine that at α i Mr Davis feels cold, but at α i+1 he does not.<br />

(I assume here, with Williamson, that there is such an i.) At α i he thereby believes<br />

that he feels cold. The content of that belief is a de se proposition that is false at α i+1 ,<br />

so it violates content-safety. But in α i+1 that part of his brain does not constitute his<br />

feeling cold (for he does not feel cold), and thereby does not constitute his believing<br />

that he feels cold. By hypothesis, by that time no part of his brain constitutes feeling<br />

cold. So the belief in α i that he feels cold is not false in α i+1 ; it either no longer exists,<br />

or now has the true content that Mr Davis does not feel cold. So belief-safety does not<br />

prevent this belief of Mr Davis’s from being knowledge. And indeed, it seems rather<br />

plausible that it is knowledge, for he could not have had just this belief without it<br />

being true. This belief violates content-safety but not belief-safety, and since we have<br />

no reason to think that content-safety rather than belief-safety is the right form of the<br />

safety constraint, we have no reason to reject the intuition that this belief, this more<br />

or less infallible belief, counts as a bit of knowledge.<br />
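The contrast between the two safety conditions can be put in a toy model. The sketch below is only illustrative: the `World` structure and the two worlds standing in for α i and α i+1 are my hypothetical stipulations, not anything in Williamson’s or Sosa’s formulations.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class World:
    feels_cold: bool              # does Mr Davis feel cold in this world?
    token_truth: Optional[bool]   # truth-value of the token belief here,
                                  # or None if the token belief does not exist

def content_safe(similar: List[World]) -> bool:
    # Content-safety: the content the belief actually has ("I feel cold")
    # is true in every similar world.
    return all(w.feels_cold for w in similar)

def belief_safe(similar: List[World]) -> bool:
    # Belief-safety: wherever the token belief exists, it is true.
    return all(w.token_truth for w in similar if w.token_truth is not None)

# In alpha_i Mr Davis feels cold, and that feeling itself is the (true) belief.
# In alpha_i+1 he does not feel cold, so the token belief simply does not exist.
similar = [World(feels_cold=True, token_truth=True),
           World(feels_cold=False, token_truth=None)]

print(content_safe(similar))  # False: the content is false in alpha_i+1
print(belief_safe(similar))   # True: the token belief is nowhere false, merely absent
```

On these stipulations the phenomenal belief fails content-safety but satisfies belief-safety, which is all the argument in the text requires.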

This story about Mr Davis’s psychology might seem unbelievable, so let me clear<br />

up some details. Mr Davis has both phenomenal and judgemental beliefs about his<br />

phenomenal states. The phenomenal beliefs are present when and only when the<br />

phenomenal states are present. The judgemental beliefs are much more flexible, they<br />

are nomically independent of the phenomena they describe. The judgemental beliefs<br />

are grounded in ‘inner perceptions’ of his phenomenal states. The phenomenal<br />

beliefs are not, they just are the phenomenal states. The judgemental beliefs can be<br />

complex, as in a belief that I feel cold iff it is Monday, while the phenomenal beliefs<br />

are always simple. It is logically possible that Mr Davis be wired so that he feel cold



without believing he feels cold, but it is not an accident that he is so wired. Most of<br />

his conspecifics are similarly set up. It is possible that at a particular time Mr Davis<br />

has both a phenomenal belief and a judgemental belief that he feels cold, with the<br />

beliefs being instantiated in different parts of his brain. If he has both of these beliefs<br />

in α i , then Williamson’s argument may well show that the judgemental belief does<br />

not count as knowledge, for it could be false in α i+1 . If he has the judgemental belief<br />

that he is not cold in α i , then the phenomenal belief that he is cold may not be<br />

knowledge, for it is plausible that the existence of a contrary belief defeats a particular<br />

belief’s claim to knowledge. But that does not mean that he is not in a position to<br />

know that he is cold in α i .<br />

Some may object that it is conceptually impossible that a brain state that instantiates<br />

a phenomenal feel should also instantiate a belief. And it is true that Mr Davis’s<br />

phenomenal states do not have some of the features that we typically associate with<br />

beliefs. These states are relatively unstructured, for example. Anyone who thinks<br />

that it is a conceptual truth that mental representations are structured like linguistic<br />

representations will think that Mr Davis could not have the phenomenal beliefs<br />

I have ascribed to him. But it is very implausible that this is a conceptual truth. The<br />

best arguments for the language of thought hypothesis rest on empirical facts about<br />

believers, especially the facts that mental representation is typically productive and<br />

systematic. If there are limits to how productive and systematic Mr Davis’s phenomenal<br />

representations are, then it is possible that his phenomenal states are beliefs.<br />

Certainly those states are sufficiently correlated with inputs (external states of affairs) and outputs<br />

(bodily movements, if not actions) to count as beliefs on some functionalist<br />

conceptions of belief.<br />

A referee noted that we don’t need the strong assumption that phenomenal states<br />

can be beliefs to make the argument here, though it is probably the most illuminating<br />

example. Either of the following stories about Mr Davis’s mind could have<br />

done. First, Mr Davis’s phenomenal belief may be of the form “I feel φ”, where<br />

“I” and “feel” are words in Mr Davis’s language of thought, and φ is the phenomenal<br />

state, functioning as a name for itself. As long as the belief arises whenever Mr<br />

Davis is φ, and it has the phenomenal state as a constituent, it can satisfy belief-safety<br />

even when content-safety fails. The second option involves some more contentious<br />

assumptions. The phenomenal belief may be of the form “I feel thus”, where the<br />

demonstrative picks out the phenomenal state. As long as it is essential to the belief<br />

that it includes a demonstrative reference to that phenomenal state, it will satisfy<br />

belief-safety. This is more contentious because it might seem plausible that a particular<br />

demonstrative belief could have picked out a different state. What won’t work, of<br />

course, is if the phenomenal belief is “I feel F”, where F is an attempted description<br />

of the phenomenal state. That certainly violates every kind of safety requirement. I<br />

think it is plausible that phenomenal states could be belief states, but if you do not<br />

believe that, it is worth noting that the argument could go through without that assumption, as<br />

this paragraph illustrates.<br />

Mr Davis is an interesting case because he shows just how strong a safety assumption<br />

we need to ground (I i ). For Mr Davis is a counterexample to (I i ), but his coldness<br />

beliefs satisfy many plausible safety-like constraints. For example, his beliefs about



whether he feels cold are sensitive to whether he feels cold. Williamson (Ch. 7)<br />

shows fairly conclusively that knowledge does not entail sensitivity, so one might<br />

have thought that in interesting cases sensitivity would be too strong for what is<br />

needed, not too weak as it is here. From this it follows that any safety condition that<br />

is strictly weaker than sensitivity, such as the condition that the subject could not<br />

easily believe p and be wrong, is not sufficient to support (I i ). Williamson slides over<br />

this point by assuming that the subject will be almost as confident that he feels cold<br />

at α i+1 as he is at α i . This is no part of the description of the case, as Mr Davis shows.<br />

My argument above rests on the denial of content essentialism, which might look<br />

like a relatively unsafe premise. So to conclude this section, let’s see how far the<br />

argument can go without that assumption. Sainsbury responds to his example, the<br />

lucky arithmetic guess, by proposing a different version of safety: mechanism-safety.<br />

Mechanism-safety B is safe iff the mechanism that produced B produces true beliefs<br />

in all similar worlds.<br />

I didn’t want to rest on this too much because I think it’s rather hard to say exactly<br />

what the mechanism is that produces Mr Davis’s belief that he feels cold. But if it’s<br />

just his sensory system, then I think it is clear that even at α i , Mr Davis’s belief that<br />

he feels cold satisfies mechanism-safety. The bigger point here is that content-safety<br />

is a very distinctive kind of safety claim, but it’s the only kind that justifies (I i ).<br />
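Sainsbury’s proposal can be sketched in the same toy style. The mechanisms and world descriptions below are invented for illustration; in particular, treating a mechanism as a function from worlds to (content, truth-value) pairs is my gloss, not Sainsbury’s.

```python
def mechanism_safe(mechanism, similar_worlds):
    # Mechanism-safety: the mechanism that produced the belief yields only
    # true beliefs across all similar worlds.
    return all(truth for w in similar_worlds for _, truth in mechanism(w))

# A guessing mechanism: in a similar world Mr Davis drops the carried
# hundred and believes that 193 plus 245 is 338, which is false.
def guessing(world):
    guess = 438 if world["carries_hundreds"] else 338
    return [(f"193 + 245 = {guess}", guess == 193 + 245)]

# His sensory system: it reports cold exactly when he feels cold, so its
# outputs are true in every similar world.
def sensing(world):
    report = "I feel cold" if world["feels_cold"] else "I do not feel cold"
    return [(report, True)]

print(mechanism_safe(guessing, [{"carries_hundreds": True},
                                {"carries_hundreds": False}]))   # False
print(mechanism_safe(sensing, [{"feels_cold": True},
                               {"feels_cold": False}]))          # True
```

So modelled, the lucky arithmetic guess fails mechanism-safety while the sensory belief passes it, even at the borderline.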

5 Retractions<br />

To close, let me stress how limited my criticisms of Williamson here are. Very briefly,<br />

the argument is that there can be some self-presenting mental states, states that are either<br />

token identical with the belief that they exist or are constituents of (the contents<br />

of) beliefs that they exist, and these beliefs will satisfy all the safety requirements we<br />

should want, even in borderline cases. If some conditions are invariably instantiated<br />

by self-presenting states, then those conditions will be luminous. And I think it is a<br />

live possibility, relative at least to the assumptions Williamson makes, that there are<br />

such self-presenting states. But there aren’t very many of them. There is a reason I<br />

picked feels cold as my illustration. It’s not laughable that it is self-presenting.<br />

On the other hand, it is quite implausible that, say, knowing where to buy the<br />

best Guinness is self-presenting. And for states that are not self-presenting, I think<br />

Williamson’s anti-luminosity argument works. That’s because it is very plausible (a)<br />

that for a belief to be knowledge it must satisfy either belief-safety or mechanism-safety,<br />

(b) that a belief about a non-self-presenting state probably satisfies belief-safety or mechanism-safety<br />

only if it satisfies content-safety, and (c) as Williamson showed, if beliefs about<br />

a state must satisfy content-safety to count as knowledge, then that state is not luminous.<br />

So epistemic states, like the state of knowing where to buy the best Guinness,<br />

are not luminous. That is to say, one can know where to buy the best Guinness<br />

without knowing that one knows this. And saying that (for these reasons) is to just<br />

endorse Williamson’s arguments against the KK principle. Those arguments are an<br />

important special case of the argument against luminosity, and I don’t see how any<br />

of my criticisms of the general argument touch the special case.



Williamson describes his attacks on luminosity as an argument for cognitive<br />

homelessness. If a state was luminous, that state would be a cognitive home. Williamson<br />

thinks we are homeless. I think we may have a small home in our phenomenal<br />

states. This home is not a mansion, perhaps just a small apartment with some afternoon<br />

sun, but it may be a home.<br />

Don’t be fooled into thinking this supports any kind of foundationalism about<br />

knowledge, however. It is true that if we have the kind of self-presenting states that<br />

Mr Davis has (under one of the three descriptions I’ve offered), then we have the<br />

self-justifying beliefs that foundationalism needs to get started. But it is at best a<br />

wide-open philosophical and scientific question whether we have any such states,<br />

while it is not a wide-open question whether we have any knowledge, or any justified<br />

beliefs. If these states are the only things that could serve as foundations, it would be<br />

at least conceptually possible that we could have knowledge without self-justifying<br />

foundations. So the kind of possibility exemplified by Mr Davis cannot, on its own,<br />

prop up foundationalism.


Scepticism, Rationalism and Externalism<br />

This paper is about three of the most prominent debates in modern epistemology.<br />

The conclusion is that three prima facie appealing positions in these debates cannot<br />

be held simultaneously.<br />

The first debate is scepticism vs anti-scepticism. My conclusions apply to most<br />

kinds of debates between sceptics and their opponents, but I will focus on the inductive<br />

sceptic, who claims we cannot come to know what will happen in the future by<br />

induction. This is a fairly weak kind of scepticism, and I suspect many philosophers<br />

who are generally anti-sceptical are attracted by this kind of scepticism. Still, even<br />

this kind of scepticism is quite unintuitive. I’m pretty sure I know (1) on the basis of<br />

induction.<br />

(1) It will snow in Ithaca next winter.<br />

Although I am taking a very strong version of anti-scepticism to be intuitively true<br />

here, the points I make will generalise to most other versions of scepticism. (Focussing<br />

on the inductive sceptic avoids some potential complications that I will note<br />

as they arise.)<br />

The second debate is a version of rationalism vs empiricism. The kind of rationalist<br />

I have in mind accepts that some deeply contingent propositions can be known<br />

a priori, and the empiricist I have in mind denies this. Kripke showed that there<br />

are contingent propositions that can be known a priori. One example is Water is<br />

the watery stuff of our acquaintance. (‘Watery’ is David Chalmers’s nice term for the<br />

properties of water by which folk identify it.) All the examples Kripke gave are of<br />

propositions that are, to use Gareth Evans’s term, deeply necessary (Evans, 1979).<br />

It is a matter of controversy presently just how to analyse Evans’s concepts of deep<br />

necessity and contingency, but most of the controversies are over details that are not<br />

important right here. I’ll simply adopt Stephen Yablo’s recent suggestion: a proposition<br />

is deeply contingent if it could have turned out to be true, and could have turned<br />

out to be false (Yablo, 2002) 1 . Kripke did not provide examples of any deeply contingent<br />

propositions knowable a priori, though nothing he showed rules out their<br />

existence.<br />

The final debate is a version of internalism vs externalism about epistemic justification.<br />

The internalist I have in mind endorses a very weak kind of access internalism.<br />

Say that a class of properties (intuitively, a determinable) is introspective<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Oxford<br />

Studies in Epistemology 1 (2005): 311-31. This paper has been presented at Cornell University and the<br />

Inland Northwest Philosophy Conference, and each time I received valuable feedback. Thanks also to<br />

David Chalmers, Harold Hodes, Nicholas Sturgeon and, especially, Tamar Szabó Gendler for very helpful<br />

comments on various drafts of the paper.<br />

1 If you prefer the ‘two-dimensional’ way of talking, a deeply contingent proposition is one that is true<br />

in some possible worlds ‘considered as actual’ and false in others. See Chalmers (2006) for a thorough discussion of ways to<br />

interpret this phrase, and the broader notion of so-called ‘deep’ contingency. Nothing that goes on here<br />

will turn on any of the fine distinctions made in that debate - the relevant propositions will be deeply<br />

contingent in every plausible sense.


Scepticism, Rationalism and Externalism 191<br />

iff any beliefs an agent has about which property in the class (which determinate)<br />

she instantiates are guaranteed to not be too badly mistaken. 2 (Since ‘too badly’ is<br />

vague, ‘introspective’ will be vague too, but as we’ll see this won’t matter to the main<br />

argument.) My internalist believes the following two claims:<br />

• Which propositions an agent can justifiably believe supervenes on which introspective<br />

properties she instantiates, and this is knowable a priori. 3<br />

• There exist some introspective properties and some deeply contingent propositions<br />

about the future such that it’s a priori that whoever instantiates those<br />

properties can justifiably believe those propositions.<br />

My externalist denies one or other of these claims. Typically, she holds that no matter<br />

what introspective properties you have, unless some external condition is satisfied<br />

(such as the reliability of the connection between instantiating those properties and<br />

the world being the way you believe it is) you lack justification. Alternatively, she<br />

holds that the connection between introspective properties and justification is always<br />

a posteriori. (Or, of course, she might deny both.)<br />

My argument will be that the combination of anti-scepticism, empiricism and<br />

internalism is untenable. Since there’s quite a bit to be said for each of these claims<br />

individually, that their combination is untenable means we are stuck with a fairly<br />

hard choice: accept scepticism, or rationalism, or externalism. Of the three, it may<br />

seem that externalism is the best, but given how weak the version of internalism is<br />

that I’m using, I think we should take the rationalist option seriously. 4 In this paper<br />

I’ll just argue against the combination of anti-scepticism, empiricism and internalism,<br />

and leave it to the reader to judge which of the three to reject.<br />

Very roughly, the argument for the trilemma will be as follows. There are some<br />

propositions q such that these three claims are true.<br />

2 That a property is introspective does not mean that whenever a subject instantiates it she is in a<br />

position to form a not too badly mistaken belief about it. Even if the subject instantiates the property she<br />

may not possess sufficient concepts in order to have beliefs about it. And even if she has the concept she<br />

may simply have more pressing cognitive needs than forming certain kinds of belief. Many agents have<br />

no beliefs about the smell in their ordinary environment much of the time, for example, and this does not<br />

show that phenomenal smell properties are not introspective. All that is required is that if she has any<br />

beliefs at all about which determinate she instantiates, the beliefs are immune to massive error.<br />

3 There is a delicate ambiguity in this expression to which a referee drew my attention. The intended<br />

meaning is that for any two agents who instantiate the same introspective properties, belief in the same<br />

propositions is justified. What’s not intended is that if there’s an agent who justifiably believes p, and<br />

the introspective properties they instantiate are F 1 , . . . , F n , then any agent who instantiates F 1 , . . . , F n is<br />

justified in believing p. For there might be some other introspective property F n+1 they instantiate that<br />

justifies belief in q, and q might be a defeater for p. The ‘unintended’ claim would be a very strong, and<br />

very implausible, claim about the subvenient basis for justification.<br />

4 Rationalism is supported by BonJour (1997) and Hawthorne (2002), and my argument owes a lot to<br />

each of their discussions.



(2) If anti-scepticism is true, then I either know q a priori or a posteriori.<br />

(3) If internalism and empiricism are true, I do not know q a priori. 5<br />

(4) If internalism is true, I do not know q a posteriori.<br />

Much of the paper will be spent giving us the resources to find, and state, such a q,<br />

but to a first approximation, think of q as being a proposition like I am not a brain-in-a-vat<br />

whose experiences are as if they were a normal person. 6 The important features of<br />

q are that (a) it is entailed by propositions we take ourselves to know, (b) it is possibly<br />

false and (c) if something is evidence for it, then any evidence is evidence for it. I will<br />

claim that by looking at propositions like this, propositions that say in effect that I<br />

am not being misled in a certain way, it is possible to find a value for q such that (2),<br />

(3) and (4) are all true. From that it follows that anti-scepticism, empiricism and internalism cannot all be true.<br />

For most of the paper I will assume that internalism and anti-scepticism are true,<br />

and use those hypotheses to derive rationalism. The paper will conclude with a detailed<br />

look at the role internalism plays in the argument, and this will give us some<br />

sense of what an anti-sceptical empiricist externalism may look like.<br />
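The propositional skeleton of this argument can be checked mechanically. The sketch below is a hypothetical formalisation, not part of the paper: it reads (2)–(4) as material conditionals over five propositional variables and confirms by brute force that no assignment satisfies all three while anti-scepticism, internalism and empiricism are all true.

```python
from itertools import product

def trilemma_satisfiable() -> bool:
    # A = anti-scepticism, I = internalism, E = empiricism,
    # P = I know q a priori, Q = I know q a posteriori.
    for A, I, E, P, Q in product([False, True], repeat=5):
        p2 = (not A) or P or Q           # (2): anti-scepticism => q known somehow
        p3 = (not (I and E)) or (not P)  # (3): internalism & empiricism => not a priori
        p4 = (not I) or (not Q)          # (4): internalism => not a posteriori
        if p2 and p3 and p4 and A and I and E:
            return True
    return False

print(trilemma_satisfiable())  # False: the three positions cannot be held together
```

The interesting work, of course, is done in finding a q for which (2)–(4) are actually true; the check only shows that the inference from them is valid.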

1 A Sceptical Argument<br />

Among the many things I know about the future, one of the firmest is (1).<br />

(1) It will snow in Ithaca next winter.<br />

I know this on the basis of inductive evidence about the length of meteorological cycles<br />

and the recent history of Ithaca in winter. The inductive sceptic now raises the<br />

spectre of Winter Wonderland, a kind of world that usually has the same meteorological<br />

cycles as ours, and has the same history, but in which it is sunny every day in<br />

Ithaca next winter. 7 She says that to know (1) we must know that (5) is false, and we<br />

do not.<br />

(5) I am living in Winter Wonderland.<br />

Just how does reflection on (5) affect my confidence that I know (1)? The sceptic might<br />

just appeal to the intuition that I don’t know that (5) is false. But I don’t think I have<br />

that intuition, and if I do it is much weaker than my intuition that I know (1) and<br />

that I can infer (5) from (1). James Pryor (2000a, 527-529) has suggested the sceptic is<br />

better off using (5) in the following interesting argument. 8<br />

5 Aesthetically it would be preferable to have the antecedent of this claim be just that empiricism is true,<br />

but unfortunately this does not seem to be possible.<br />

6 I.e. I am not a brain-in-a-vat* in the sense of Cohen (1999).<br />

7 If she is convinced that there is no possible world with the same history as ours and no snow in Ithaca<br />

next winter, the sceptic will change her story so Winter Wonderland’s past differs imperceptibly from the<br />

past in our world. She doesn’t think this issue is particularly relevant to the epistemological debate, no<br />

matter how interesting the scientific and metaphysical issues may be, and I agree with her.<br />

8 Pryor is discussing the external world sceptic, not the inductive sceptic, so the premises here are a little<br />

different to those he provides.



Sceptical Argument 1<br />

(6) Either you don’t know you’re not living in Winter Wonderland; or, if you do<br />

know that, it’s because that knowledge rests in part on your inductive knowledge<br />

that it will snow in Ithaca next winter.<br />

(7) If you’re to know (1) on the basis of certain experiences or grounds e, then for<br />

every q which is “bad” relative to e and (1), you have to be in a position to know<br />

q to be false in a non-question-begging way—i.e., you have to be in a position<br />

to know q to be false antecedently to knowing that it will snow next winter on<br />

the basis of e.<br />

(8) (5) is “bad” relative to any course of experience e and (1).<br />

C. You can’t know (1), that it will snow next winter, on the basis of your current<br />

experiences.<br />

An alternative hypothesis q is “bad” in the sense used here iff (to quote Pryor) “it has<br />

the special features that characterise the sceptic’s scenarios—whatever those features<br />

turn out to be.” (527) To a first approximation, q is bad relative to p and e iff you’re<br />

meant to be able to know p on the basis of e, but q is apparently compatible with e,<br />

even though it is not compatible with p.<br />

Pryor argues that the best response to the external world sceptic is dogmatism.<br />

On this theory you can know p on the basis of e even though you have no prior reason<br />

to rule out alternatives to p compatible with e. Pryor only defends the dogmatic<br />

response to the external world sceptic, but it’s worth considering the dogmatist response<br />

to inductive scepticism. According to this response, I can come to know I’m<br />

not in Winter Wonderland on the basis of my experiences to date, even though I<br />

didn’t know this a priori. So dogmatism is a version of empiricism, and it endorses<br />

(6). 9 The false premise in this argument, according to the dogmatist, is (7). We can<br />

know it will snow even though the Winter Wonderland hypothesis is bad relative to<br />

this conclusion and our actual evidence, and we have no prior way to exclude it.<br />

Pryor notes that the sceptic could offer a similar argument concerning justification,<br />

and the dogmatist offers a similar response.<br />

Sceptical Argument 2<br />

(9) Either you’re not justified in believing that you’re not in Winter Wonderland;<br />

or, if you are justified in believing this, it’s because that justification rests in<br />

part on your justified belief that it will snow in Ithaca next winter.<br />

9 It is a version of the kind of internalism discussed in footnote 2, since according to the dogmatist<br />

seeming to see that p can be sufficient justification for belief in p. Pryor’s preferred version of dogmatism<br />

is also internalist in the slightly stronger sense described in the text, but it seems possible that one could be<br />

a dogmatist without accepting that internalist thesis. One could accept, for instance, that seeming to see<br />

that p justifies a belief that p, but also think that seeming to see that q justifies a belief that p iff there is a<br />

known reliable connection between q and p. As I said, even the weaker version of internalism is sufficient<br />

to generate a conflict with anti-scepticism and empiricism, provided we just focus on the propositions that<br />

can be justifiably believed on the basis of introspective properties.



(10) If you’re to have justification for believing (1) on the basis of certain experiences<br />

or grounds e, then for every q which is “bad” relative to e and (1), you<br />

have to have antecedent justification for believing q to be false—justification<br />

which doesn’t rest on or presuppose any e-based justification you may have for<br />

believing (1).<br />

(11) (5) is “bad” relative to any course of experience e you could have and (1).<br />

C. You can’t justifiably believe it will snow in Ithaca next winter on the basis of<br />

past experiences.<br />

The dogmatist rejects (10), just as she rejects (7). I shall spend most of my time in the<br />

next two sections arguing for (10), returning to (7) only at the end. For it seems there<br />

are compelling reasons to accept (10), and hold that the problem with this argument<br />

is either with (9) or (11). 10<br />

2 Dominance Arguments<br />

The primary argument for (10) will turn on a dominance principle: if you will be<br />

in a position to justifiably believe p whatever evidence you get, and you know this,<br />

then you are now justified in believing p. This kind of reasoning is perfectly familiar<br />

in decision theory: if you know that one of n states obtains, and you know that<br />

in each of those states you should do X rather than Y, then you know now (or at<br />

least you should know) that you should do X rather than Y. This is a very plausible<br />

principle, and equivalent epistemic principles are just as viable. Dominance reasoning<br />

can directly support (10) and hence indirectly support (7). (As Vann McGee (1999)<br />

showed, the dominance principle in decision theory has to be qualified for certain<br />

kinds of agents with unbounded utility functions who are faced with a decision tree<br />

with infinitely many branches. Such qualifications do not seem at all relevant here.)<br />

It will be useful to start with an unsound argument for (10), because although this<br />

argument is unsound, it fails in an instructive way. Before I can present the argument<br />

I need to make an attempt at formalising Pryor’s concept of badness.<br />

q is bad relative to e and p = df q is deeply contingent, you know p entails<br />

¬q, and for any possible evidence e ′ (that you could have had at the time<br />

your total evidence is actually e) there exists a p ′ such that you know p ′<br />

entails ¬q and you are justified in believing p ′ on the basis of e ′ if e ′ is<br />

your total evidence.<br />
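The quantifier structure of this definition can be set out more explicitly. Writing J(p′, e′) for “you are justified in believing p′ when your total evidence is e′”, K for knowledge, and restricting e′ to evidence you could have had at the relevant time (the operator notation is introduced here for convenience; it is not the paper’s own):<br />

```latex
\mathrm{Bad}(q, e, p) \;\leftrightarrow\;
    \mathrm{DeeplyContingent}(q)
    \;\wedge\; K(p \to \neg q)
    \;\wedge\; \forall e'\, \exists p'
       \bigl[\, K(p' \to \neg q) \wedge J(p', e') \,\bigr]
```

Note the quantifier order: a possibly different p′ may do the ruling-out for each e′.<br />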

Roughly, the idea is that a bad proposition is one that would be justifiably ruled<br />

out by any evidence, despite the fact that it could turn out to be true. 11 Using this<br />

10 Just which is wrong then? That depends on how “bad” is defined. On our final definition (8) will fail,<br />

but there are other sceptical arguments, using other sceptical hypotheses, on which (6) fails.<br />

11 Note that there’s a subtle shift here in our conception of badness. Previously we said that bad propositions<br />

are those you allegedly know on the basis of your actual evidence (if you know p) even though<br />

they are logically consistent with that evidence. Now we say that they are propositions you could rule<br />

out on any evidence, even though they are consistent with your actual total evidence. This is a somewhat<br />

narrower class of proposition, but focussing on it strengthens the sceptic’s case appreciably.



definition we can present an argument for rationalism. The argument will use some<br />

fairly general premises connecting justification, evidence and badness. If we were just<br />

interested in this case we could replace q with (5), r with the proposition that (5) is<br />

false, e with my current evidence, and e ′ with some evidence that would undermine<br />

my belief that (5) is false, if such evidence could exist. The intuitions behind the<br />

argument may be clearer if you make those substitutions when reading through the<br />

argument. But because the premises are interesting beyond their application to this<br />

case, I will present the argument in its more general form.<br />

Rationalist Argument 1<br />

(12) If you are justified in believing (1) on the basis of e, and you know (1) entails<br />

¬(5), then you are justified in believing ¬(5) when your evidence is e.<br />

(13) If you are justified in believing r (at time t) on the basis of e, then there is some<br />

other possible evidence e ′ (that you could have at t) such that you would not be<br />

justified in believing r were your total evidence e ′ .<br />

(14) If you are justified in believing r, and there is no evidence e such that e is part<br />

of your evidence and you are justified in believing r on the basis of e, then you<br />

are justified in believing r a priori. 12<br />

(15) By definition, q is bad relative to e and p iff q is deeply contingent, you know<br />

p entails ¬q, and for any possible evidence e ′ (that you could have when your<br />

evidence is e) there exists a p ′ such that you know p ′ entails ¬q and you are<br />

justified in believing p ′ on the basis of e ′ if e ′ is your total evidence.<br />

(16) So, if q is bad relative to e and (1), and you are justified in believing (1) on the<br />

basis of e, then you are justified in believing ¬q a priori.<br />

(The references to times in (13) and (15) are just to emphasise that we are talking about<br />

your current evidence, and ways it could be. That you could observe Winter Wonderland<br />

next winter doesn’t count as a relevant alternative kind of evidence now.)<br />

Our conclusion (16) entails (10), since (10) merely required that for every bad<br />

proposition relative to e and (1), you have ‘antecedent’ justification for believing that<br />

proposition to be false, while (16) says this justification is a priori. (‘Antecedent’<br />

12 David Chalmers noted that (10) and (11) entail that I exist is a priori. He thought this was a bad result,<br />

and a sufficient reason to modify these premises. I’m perfectly happy with saying, following Kaplan, that I<br />

exist is a priori. I don’t think this proves rationalism, because I think it’s also deeply necessary that I exist.<br />

(It’s not deeply necessary that Brian exists, but that’s no objection to what I just claimed, because it’s not<br />

deeply necessary that I’m Brian.)<br />

This position is controversial though, so I don’t want to rest too much weight on it. If you don’t think<br />

that I exist should be a priori, rewrite (11) so that its conclusion is that you would be justified in believing<br />

the material conditional I exist ⊃ r a priori. (Note that since I’m presupposing in the dominance argument<br />

that all the salient possibilities are ones in which I have some evidence, and hence exist, it’s not surprising<br />

that I exist has a special status within the theory.)<br />

On a separate point, note that I make no assumptions whatsoever here about what relationship must<br />

obtain between a justified belief and the evidence on which it is based. Depending on what the right<br />

theory of justification is, that relationship might be entailment or constitution or causation or association<br />

or reliable connection or something else or some combination of these. I do assume that a posteriori<br />

beliefs are somehow connected to evidence, and if the beliefs are justified this relation is properly called<br />

basing.



justification need not be a priori as long as it arrives before the particular evidence<br />

you have for (1). This is why (16) is strictly stronger than (10).) So if (10) is false then<br />

one of these premises must be false. I take (15) to define “bad”, so it cannot be false.<br />

Note that given this definition we cannot be certain that (5) is bad. We will return to<br />

this point a few times.<br />

Which premise should the dogmatist reject? (12) states a fairly mundane closure<br />

principle for justified belief. And (13) follows almost automatically from the notion<br />

of ‘basing’. A belief can hardly be based in some particular evidence if any other<br />

evidence would support it just as well. This does not mean that such a belief cannot<br />

be rationally caused by the particular evidence that you have, just that the evidence<br />

cannot be the rational basis for that belief. The dogmatist objects to (14). There is a<br />

prima facie argument for (14), but as soon as we set it out we see why the dogmatist<br />

is correct to stop us here.<br />

Consider the following argument for (14), which does little more than lay out the<br />

intuition (14) is trying to express. Assume r is such that for any possible evidence<br />

e, one would be justified in believing r with that evidence. Here’s a way to reason<br />

a priori to r. Whatever evidence I get, I will be justified in believing that r. So<br />

I’m now justified in believing that r, before I get the evidence. Compare a simple<br />

decision problem where there is one unknown variable, and it can take one of two values,<br />

but whichever value it takes it is better for one to choose X rather than Y. That is<br />

sufficient to make it true now that one should choose X rather than Y. Put this way,<br />

the argument for (14) is just a familiar dominance argument.<br />

Two flaws with this argument for (14) stand out, each of them arising because of<br />

disanalogies with the decision theoretic case.<br />

First, when we apply dominance reasoning in decision theory, we look at cases<br />

where it would be better to take X rather than Y in every possible case, and this is<br />

known. This point is usually not stressed, because it’s usually just assumed in decision<br />

theory problems that the players know the consequences of their actions given the<br />

value of certain unknown variables. It’s not obviously a good idea to assume this<br />

without comment in applications of decision theory, and it’s clearly a bad idea to<br />

make the same kind of assumption in epistemology. Nothing in the antecedent of<br />

(14) specifies that we can know, let alone know a priori, that if our evidence is e then<br />

we are justified in believing r. Even if this is true, even if it is necessarily true, it may<br />

not be knowable.<br />

Second, in the decision theory case we presupposed it is known that the variable<br />

can take only one of two values. Again, there is nothing in the antecedent of (14)<br />

to guarantee the parallel. Even if an agent knows of every possible piece of evidence<br />

that if she gets that evidence she will be justified in believing r, she may not be in a<br />

position to justifiably conclude r now because she may not know that these are all the<br />

possible pieces of evidence. In other words, she can only use dominance reasoning<br />

to conclude r if she knows de dicto, and not merely de re, of every possible body of<br />

evidence that it justifies r.<br />

So the quick argument for (14) fails. Still, it only failed because (14) left out two<br />

qualifications. If we include those qualifications, and adjust the other premises to



preserve validity, the argument will work. To make this adjustment, we need a new<br />

definition of badness.<br />

q is bad relative to e and p = df<br />

1. q is deeply contingent;<br />

2. p is known to entail ¬q; and<br />

3. it is knowable a priori that for any possible evidence e ′ there exists a<br />

p ′ such that p ′ is known to entail ¬q, and one is justified in believing<br />

p ′ on the basis of e ′ .<br />

The aim still is to find an argument for some claim stronger than (10) in sceptical<br />

argument 2. If we can do that, and if as the sceptic suggests (5) really is bad, then the<br />

only anti-sceptical response to sceptical argument 2 will be rationalism. So the fact<br />

that this looks like a sound argument for a slightly stronger conclusion than (10) is<br />

a large step in our argument that anti-scepticism plus internalism entails rationalism.<br />

(I omit the references to times from here on.)<br />

Rationalist Argument 2<br />

(12) If you are justified in believing (1) on the basis of e, and you know (1) entails<br />

¬(5), then you are justified in believing ¬(5) when your evidence is e.<br />

(10 ′ ) If you are justified in believing r on the basis of e, then there is some other<br />

possible evidence e ′ such that you would not be justified in believing r were<br />

your total evidence e ′ .<br />

(17) If you know you are justified in believing r, and you know a priori that there<br />

is no evidence e you have such that you are justified in believing r on the basis<br />

of e, then you are justified in believing r a priori. 13<br />

(18) By definition, q is bad relative to e and p iff q is deeply contingent, p is known<br />

to entail ¬q, and it is knowable a priori that for any possible evidence e ′ there<br />

exists a p ′ such that p ′ is known to entail ¬q, and one is justified in believing p ′<br />

on the basis of e ′ .<br />

(19) So, if q is bad relative to e and (1), and you are justified in believing (1) on the<br />

basis of e, then you are justified in believing ¬q a priori.<br />

This is a sound argument for (19), and hence for (10), but as noted on this definition<br />

of “bad” (11) may be false. If the Winter Wonderland hypothesis is to be bad it must<br />

be a priori knowable that on any evidence whatsoever, you’d be justified in believing<br />

it to be false. But as we will now see, although no evidence could justify you in<br />

believing the Winter Wonderland hypothesis to be true, it is not at all obvious that<br />

you are always justified in believing it is false.<br />

13 Again, if you don’t think I exist should be a priori, the conclusion should be that I exist ⊃ r is a priori.



3 Hunting the Bad Proposition<br />

A proposition is bad if it is deeply contingent but, if you could justifiably believe it<br />

to be false on the basis of your current evidence, you could justifiably believe it to<br />

be false a priori. If a bad proposition exists, then we are forced to choose between<br />

rationalism and scepticism. To the extent that rationalism is unattractive, scepticism<br />

starts to look attractive. I think Pryor is right that this kind of argument tacitly underlies<br />

many sceptical arguments. The importance of propositions like (5) is not that<br />

it’s too hard to know them to be false. The arguments of those who deny closure<br />

principles for knowledge notwithstanding, it’s very intuitive that it’s easier to know<br />

(5) is false than to know (1) is true. So why does reflection on (5) provide more comfort<br />

to the inductive sceptic than reflection on (1)? The contextualist has one answer,<br />

that thinking about (5) moves the context to one where sceptical doubts are salient.<br />

Pryor’s work suggests a more subtle answer. Reflecting on (5) causes us to think<br />

about how we could come to know it is false, and prima facie it might seem we could<br />

not know that a priori or a posteriori. It’s that dilemma, and not the mere salience of<br />

the Winter Wonderland possibility, that drives the best sceptical argument. But this<br />

argument assumes that (5) could not be known to be false on the basis of empirical<br />

evidence, i.e. that it is bad. If it is not bad, and nor is any similar proposition, then<br />

we can easily deflect the sceptical argument. However, if we assume internalism, we<br />

can construct a bad proposition.<br />

The prima facie case that (5) is bad (relative to (1) and our current evidence e –<br />

I omit these relativisations from now on) looks strong. The negation of (5) is (20),<br />

where H is a proposition that summarises the relevant parts of the history of the<br />

world. 14<br />

(20) Either ¬H or it will snow in Ithaca next winter.<br />

Now one may argue that (5) is bad as follows. Either our evidence justifies believing<br />

¬H or it doesn’t. If it does, then it clearly justifies believing (20), for ¬H trivially<br />

entails it. If it does not, then we are justified in believing H, and whenever we are<br />

justified believing the world’s history is H, we can inductively infer that it will snow<br />

in Ithaca next winter. The problem with this argument, however, is fairly clear: the<br />

step from the assumption that we are not justified in believing ¬H to the conclusion<br />

we are justified in believing H is a modal fallacy. We might be justified in believing<br />

neither H nor its negation. In such a situation, it’s not obvious we could justifiably<br />

infer (20). So (5) may not be bad.<br />
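The fallacious step has the familiar shape of an illicit inference between negated operators; with J read as “we are justified in believing” (notation introduced just for this gloss):<br />

```latex
\neg J(\neg H) \;\not\Rightarrow\; J(H)
\qquad \text{(compare: } \neg\Box\neg H \not\Rightarrow \Box H \text{)}
```

We may be justified in believing neither H nor its negation, just as a proposition may be neither necessarily false nor necessary.<br />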

A suggestion John Hawthorne (2002) makes seems to point to a proposition that<br />

is more plausibly bad. Hawthorne argues that disjunctions like (21) are knowable a<br />

priori, and this suggests that (22), its negation, is bad.<br />

(21) Either my evidence is not e or it will snow in Ithaca next winter.<br />

(22) My evidence is e and it will not snow in Ithaca next winter.<br />

14 I assume H includes a ‘that’s all that’s relevant’ clause to rule out defeaters. That is, it summarises the<br />

relevant history of the world as such.



Hawthorne does not provide a dominance argument that (21) is knowable a priori.<br />

Instead he makes a direct appeal to the idea that whatever kinds of inference we can<br />

draw now on the basis of our evidence we could have drawn prior to getting e as conditional<br />

conclusions, conditional on getting e. So if I can now know it will snow<br />

in Ithaca next winter, prior to getting e I could have known the material conditional<br />

If my evidence is e, it will snow in Ithaca, which is equivalent to (21). It’s not clear<br />

this analogy works, since when we do such hypothetical reasoning we take ourselves<br />

to know that our evidence is e, and this may cause some complications. Could we<br />

find a dominance argument to use instead? One might be tempted by the following<br />

argument.<br />

Rationalist Argument 3<br />

(23) I know a priori that if my evidence is e, then I am justified in believing the<br />

second disjunct of (21).<br />

(24) I know a priori that if my evidence is not e, then I am justified in believing the<br />

first disjunct of (21)<br />

(25) I know a priori that if I am justified in believing a disjunct of (21) I am justified<br />

in believing the disjunction (21).<br />

(26) I know a priori that my evidence is either e or not e.<br />

C. So, I’m justified a priori in believing (21).<br />

The problem here is the second premise, (24). It’s true that if my evidence is not e<br />

then the first disjunct of (21) is true. But there’s no reason to suppose I am justified<br />

in believing any true proposition about my evidence. Timothy Williamson (2000a,<br />

ch. 8) has argued that the problem with many sceptical arguments is that they assume<br />

agents know what their evidence is. I doubt that’s really the flaw in sceptical<br />

arguments, but it certainly is the flaw in the argument that (22) is bad.<br />

The problem with using (22) is that the argument for its badness relied on quite<br />

a strong privileged access thesis: whenever my evidence is not e I am justified in<br />

believing it is not. If we can find a weaker privileged access thesis that is true, we<br />

will be able to find a proposition similar to (22) that is bad. And the very argument<br />

Williamson gives against the thesis that we always know what our evidence is will<br />

show us how to find such a thesis.<br />

Williamson proposes a margin-of-error model for certain kinds of knowledge. On<br />

this model, X knows that p iff (roughly) p is true in all situations within X’s margin-of-error. 15<br />

The intuitive idea is that all of the possibilities are arranged in some metric<br />

space, with the distance between any two worlds being the measure of their similarity<br />

with respect to X. Then X knows all the things that are true in all worlds within some<br />

sphere centred on the actual world, where the radius of that sphere is given by how<br />

accurate she is at forming beliefs.<br />

15 There’s a considerable amount of idealisation here. What’s really true is that X is in a position to<br />

know anything true in all situations within her margin-of-error. Since we’re working out what is a priori<br />

knowable, I’ll assume agents are idealised so they know what they are in a position to know. This avoids<br />

needless complications we get from multiplying the modalities that are in play.



One might think this would lead to the principle B: p → K¬K¬p, that is, if p is<br />

true then X knows that she does not know ¬p. Or, slightly more colloquially, if p is<br />

true then X knows that for all she knows p is true. (I use K here as a modal operator.<br />

KA means that X, the salient subject, knows that A.) On a margin-of-error model<br />

p → K¬K¬p is false only if p is actually true and there is a nearby (i.e. within the<br />

margin-of-error) situation where the agent knows ¬p. But if nearby is symmetric this<br />

is impossible, because the truth of p in this situation will rule out the knowability of<br />

¬p in that situation.<br />
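That line of thought can be reproduced in a toy model (a sketch with invented parameters: situations are integer grid points with a symmetric distance, and knowledge is truth throughout a closed ball of radius M; Williamson’s refined model, on which B fails, is more complicated than this):<br />

```python
# Toy check: on the *simple* symmetric margin-of-error model, the B
# pattern p -> K not-K not-p holds. (Grid, margin and proposition are
# invented for illustration; Williamson's refined model is not captured.)

SITUATIONS = range(0, 101)   # situations as grid points 0..100
M = 10                       # margin of error, in grid units

def K(prop):
    """p is known at s iff p holds throughout the closed M-ball round s."""
    return {s for s in SITUATIONS
            if all(t in prop for t in SITUATIONS if abs(s - t) <= M)}

def neg(prop):
    """Complement of prop over the space of situations."""
    return set(SITUATIONS) - prop

P = set(range(30, 61))       # an arbitrary contingent proposition

# B: wherever p is true, the agent knows she doesn't know not-p.
print(P <= K(neg(K(neg(P)))))  # prints True on this simple model
```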

As Williamson points out, that quick argument is fallacious, since it relies on a<br />

too simplistic margin-of-error model. He proposes a more complicated account: p<br />

is known at s iff there is a distance d greater than the margin-of-error and for any<br />

situation s ′ such that the distance between s and s ′ is less than d, p is true at s ′ . Given<br />

this model, we cannot infer p → K¬K¬p. Indeed, the only distinctive modal principle<br />

we can conclude is Kp → p. However, as Delia Graff Fara (2002) has shown, if we<br />

make certain density assumptions on the space of available situations, we can recover<br />

the principle (27) within this account. 16<br />

(27) p → K¬KK¬p<br />

To express the density assumption, let d(s1, s2) be the ‘distance’ between s1 and s2, and<br />

m the margin-of-error. The assumption then is that there is a k > 1 such that for any<br />

s1, s2 such that d(s1, s2) < km, there is an s3 such that d(s1, s3) < m and d(s3, s2) < m.<br />

And this will be made true if there is some epistemic situation roughly ‘half-way’<br />

between s1 and s2. 17 That is, all we have to assume to recover (27) within the margin-of-error<br />

model is that the space of possible epistemic situations is suitably dense.<br />

Since the margin-of-error model, and Fara’s density assumption, are both appropriate<br />

for introspective knowledge, (27) is true when p is a proposition about the agent’s<br />

own knowledge.<br />
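Fara’s density point can likewise be checked numerically (a sketch with invented parameters; a discrete grid can only approximate the refined model, so the simple closed-ball model is used, and a step size well below the margin supplies the required ‘half-way’ situations):<br />

```python
# Toy check of principle (27), p -> K not-K K not-p, on a dense grid.
# (Grid, margin and proposition are invented for illustration.)

SITUATIONS = range(0, 101)   # grid points 0..100; spacing is well below M
M = 10                       # margin of error, in grid units

def K(prop):
    """p is known at s iff p holds throughout the closed M-ball round s."""
    return {s for s in SITUATIONS
            if all(t in prop for t in SITUATIONS if abs(s - t) <= M)}

def neg(prop):
    """Complement of prop over the space of situations."""
    return set(SITUATIONS) - prop

P = set(range(30, 61))       # an arbitrary contingent proposition

# (27): wherever p is true, K not-K K not-p is true as well.
print(P <= K(neg(K(K(neg(P))))))  # prints True on this dense grid
```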

To build the bad proposition now, let G be a quite general property of evidence,<br />

one that is satisfied by everyone with a reasonable acquaintance with Ithaca’s weather<br />

patterns, but still precise enough that it is a priori that everyone whose evidence is<br />

G is justified in believing it will snow in Ithaca next winter. The internalist, remember,<br />

is committed to such a G existing and it being an introspective property. Now<br />

consider the following proposition, which I shall argue is bad. 18<br />

(28) I know that I know my evidence is G, and it will not snow in Ithaca next<br />

winter.<br />

The negation of (28) is (29).<br />

16 If we translate K as □ and ¬K¬ as ⋄, (27) can be expressed as the modal formula p → □⋄⋄p.<br />

17 Fara actually gives a slightly stronger principle than this, but this principle is sufficient for her purposes,<br />

and since it is weaker than Fara’s, it is a little more plausible. But the underlying idea here, that<br />

we can get strong modal principles out of margin-of-error models by making plausible assumptions about<br />

density, is taken without amendment from her paper.<br />

18 If you preferred the amended version of (11) discussed in footnote 12, the bad proposition is I don’t<br />

exist or (28) is true.



(29) It will snow in Ithaca next winter, or I don’t know that I know my evidence is<br />

G.<br />

It might be more intuitive to read (29) as the material conditional (29a), though since<br />

English conditionals aren’t material conditionals this seems potentially misleading.<br />

(29a) If I know that I know that my evidence is G, then it will snow in Ithaca next<br />

winter.<br />

To avoid confusions due to the behaviour of conditionals, I’ll focus on the disjunction<br />

(29). Assume for now that the margin-of-error model is appropriate for propositions<br />

about my own evidence. I will return below to the plausibility of this assumption.<br />

This assumption implies that principle (27) is always correct when p is a proposition<br />

about my evidence. Given this, we can prove (28) is bad. Note that all my possible<br />

evidential states either are, or are not, G. If they are G then by hypothesis I am<br />

justified in believing that it will snow in Ithaca next winter and hence I am justified<br />

in believing (29). If they are not, then by the principle (27) I know that I don’t<br />

know that I know my evidence is G, so I can come to know (29), so I am justified in<br />

believing (29). So either way I am justified in believing (29). It’s worth noting that<br />

at no point here did I assume that I knew whether my evidence was G, though I do<br />

assume that I know that having evidence that is G justifies belief in snow next winter.<br />
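Schematically, the case analysis just given runs as follows, with J for justification, K for knowledge, and G(e′) for “evidence e′ has property G” (a restatement of the argument, not an addition to it):<br />

```latex
\textbf{Case 1:}\quad G(e') \;\Rightarrow\; J(\text{it will snow})
    \;\Rightarrow\; J(29) \\[2pt]
\textbf{Case 2:}\quad \neg G(e') \;\Rightarrow\; K\neg K K\, G(e')
    \;\;\text{by (27) with } p = \neg G(e')
    \;\Rightarrow\; J(29)
```

Either way (29) is justified; it is the a priori exhaustiveness of the two cases that makes the justification available a priori.<br />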

All of this assumes the margin-of-error model is appropriate for introspective<br />

properties. If it isn’t, then we can’t assume that (27) is true when p is a proposition<br />

about the introspective properties I satisfy, and hence the argument that (29) is knowable<br />

a priori fails. There’s one striking problem with assuming a priori that we can<br />

use the margin-of-error model in all situations. It is assumed (roughly) that anything<br />

that is true in all possibilities within a certain sphere with the subject’s beliefs at the<br />

centre is known. This sphere must include the actual situation, or some propositions<br />

that are actually false may be true throughout the sphere. Since for propositions<br />

concerning non-introspective properties there is no limit to how badly wrong the<br />

subject can be, we cannot set any limits a priori to the size of the sphere. So a priori<br />

the only margin-of-error model we can safely use is the sceptical model that says the<br />

subject knows that p iff p is true in all situations. For introspective properties the<br />

margin-of-error can be limited, because it is constitutive of introspective properties<br />

that the subject’s beliefs about whether she possesses these properties are not too far<br />

from actuality. So there seems to be no problem with using Williamson’s nice model<br />

as long as we restrict our attention to introspective properties.<br />

If belief in (29) can be justified a priori, and it is true, does that mean it is knowable<br />

a priori? If we want to respect Gettier intuitions, then we must not argue directly that<br />

since our belief in (29) is justified, and it is true, then we know it. Still, being justified<br />

and true is not irrelevant to being known. I assume here, far from originally, that it is<br />

a reasonable presumption that any justified true belief is an item of knowledge. This<br />

presumption can be defeated, if the belief is inferred from a false premise, or if the<br />

justification would vanish should the subject acquire some evidence she should have<br />

acquired, or if there is a very similar situation in which the belief is false, but it is a



reasonable presumption. Unless we really are in some sceptical scenario, there is no<br />

“defeater” that prevents our belief in (29) being an item of knowledge. We certainly<br />

did not infer it from a false premise, there is no evidence we could get that would<br />

undermine it, and situations in which it is false are very far from actuality.<br />

Since there are no such defeaters, it is reasonable to infer we can know (29) a priori.<br />

The important premises grounding this inference are an anti-sceptical premise, that<br />

we can know (1) on the basis of our current evidence, and the internalist premise that<br />

we used several times in the above argument. This completes the argument that the<br />

combination of empiricism, internalism and anti-scepticism is untenable.<br />

4 How Externalism Helps<br />

It should be obvious how the rationalist can respond to the above argument - by<br />

simply accepting the conclusion. Ultimately I think that’s the best response to this<br />

argument. As Hawthorne notes, rationalism is the natural position for fallibilists<br />

about knowledge to take, for it is just the view that we can know something a priori<br />

even though we could turn out to be wrong. In other words, it’s just fallibilism about<br />

a priori knowledge. Since fallibilism about a posteriori knowledge seems true, and<br />

there’s little reason to think fallibilism about the a priori would be false if fallibilism<br />

about the a posteriori is true, the rationalist’s position is much stronger than many<br />

have assumed. 19 The inductive sceptic also has an easy response - reject the initial<br />

premise that in my current situation I know that it will snow in Ithaca next winter.<br />

There are other responses that deserve closer attention: first, the inductive sceptic<br />

who is not a universal sceptic, and in particular is not a sceptic about perception, and<br />

second the externalist.<br />

I said at the start that the argument generalises to most kinds of scepticism. One<br />

kind of theorist, the inductive sceptic who thinks we can nonetheless acquire knowledge<br />

through perception, may think that the argument does not touch the kind of<br />

anti-sceptical, internalist, empiricist position she adopts. The kind of theorist I have<br />

in mind says that the objects and facts we perceive are constitutive of the evidence<br />

we receive. So given we are getting the evidence we are actually getting, these objects<br />

must exist and those facts must be true. She says that if I’d started with (30), instead<br />

of (1), my argument would have ended up claiming that (31) is bad for some G.<br />

(30) A hand exists.<br />

(31) A hand exists, or I don’t know that I know that I’m perceiving a hand.<br />

She then says that (31) is not deeply contingent, since in any situation where the first<br />

disjunct is false the second is true, so it cannot be bad. This response is correct as<br />

far as it goes, but it does not go far enough to deserve the name anti-sceptical. For<br />

it did not matter to the above argument, or to this response that (1) is about the<br />

future. All that mattered was that (1) was not entailed by our evidence. So had (1)<br />

19 As BonJour points out, rationalism has fallen into such disrepute that many authors leave it out even<br />

of surveys of the options. This seems unwarranted given the close connection between rationalism and<br />

the very plausible thesis of fallibilism.



been a proposition about the present that we cannot directly perceive, such as that<br />

it is not snowing in Sydney right now, the rest of the argument would have been<br />

unaffected. The summary here is that if one is suitably externalist about perception,<br />

so one thinks the existence of perceptual states entails the existence of the things being<br />

perceived, one can accept this argument, accept internalism, accept empiricism, and<br />

not be an external world sceptic. For it is consistent with such a position that one<br />

knows of the existence of the things one perceives. But on this picture one can know<br />

very little beyond that, so for most practical purposes, the position is still a sceptical<br />

one.<br />

The externalist response is more interesting. Or, to be more precise, the externalist<br />

responses are more interesting. Although I have appealed to internalism a couple of<br />

times in the above argument, it might not be so clear how the externalist can respond.<br />

Indeed, one might worry that by exercising a little more care in various places I could<br />

have shown that everyone must accept either rationalism or scepticism. That is the<br />

conclusion Hawthorne derives in his paper on deeply contingent a priori knowledge,<br />

though as noted above he uses somewhat more contentious reasoning than I do<br />

in order to get there. To conclude, I will argue that internalism is crucial to the<br />

argument I have presented, and I will spell out how the externalist can get out of the<br />

trap I’ve set above.<br />

One easy move that’s available to an externalist is to deny that any facts about<br />

justification are a priori. That blocks the move that says we can find a G such that<br />

it’s a priori that anyone whose evidence is G can know that it will snow in Ithaca<br />

next year. This is not an essential feature of externalism. One can be an externalist<br />

about justification and still think it is a priori that if one's evidence has the property of being<br />

reliably correlated with snow in the near future, then it justifies belief that it will shortly<br />

snow. But the position that all facts about justification are a posteriori fits well with<br />

a certain kind of naturalist attitude, and people with that attitude will find it easy to<br />

block the sceptical argument I’ve presented.<br />

Can we, however, use an argument like mine to argue against an anti-sceptical empiricist<br />

externalist who thinks some of the facts about justification can be discovered<br />

a priori? The strategy I’ve used to build the argument is fairly transparent: find a disjunctive<br />

a priori knowable proposition by partitioning the possible evidence states<br />

into a small class, and adding a disjunct for every cell of the partition. In every case,<br />

the disjunct that is added is one that is known to be known given that evidence. If<br />

one of the items of knowledge is ampliative, if it goes beyond the evidence, then it is<br />

possible the disjunction will be deeply contingent. But the disjunction is known no<br />

matter what.<br />

If internalism is true, then the partition can divide up evidential states according<br />

to the introspective properties of the subject. If externalism is true, then such a partition<br />

may not be that useful, because we cannot infer much about what the subject<br />

is justified in believing from the introspective properties she instantiates. Consider,<br />

for example, the above partition of subjects into the G and the not-G, where G is<br />

some introspective property, intuitively one somewhat connected with it snowing in<br />

Ithaca next year. The subjects that are not-G know that they don’t know they know



they are G, because they aren’t. Externalists need not object to this stage of the argument.<br />

They can, and should, accept that a margin-of-error model is appropriate<br />

for introspective properties. Since it’s part of the nature of introspective properties<br />

that we can’t be too badly wrong about which ones we instantiate, we’re guaranteed<br />

to satisfy some reliability clause, so there’s no ground there to deny the privileged<br />

access principle I defended above.<br />

The problem is what to say about the cases where the subject is G. Externalists<br />

should say that some such subjects are justified in believing it will snow in Ithaca<br />

next winter, and some are not. For simplicity, I’ll call the first group the reliable<br />

ones and the others the unreliable ones. If I’m G and reliable, then I’m justified in<br />

believing it will snow, and hence in believing (29). But if I’m G and unreliable, then<br />

I’m not justified in believing this. Indeed, if I’m G and unreliable, there is no obvious<br />

argument that I’m justified in believing either of the disjuncts of (29). Since this is a<br />

possible evidential state, externalists should think there is no dominance argument<br />

that (29) is a priori knowable.<br />

Could we solve this by adding another disjunct, one that is guaranteed to be<br />

known if I’m G and unreliable? There is no reason to believe we could. If we’re<br />

unreliable, there is no guarantee that we will know we are unreliable. Indeed, we<br />

may well believe we are reliable. So there’s no proposition we can add to our long<br />

disjunction while saying to ourselves, “In the case where the subject is G and unreliable,<br />

she can justifiably believe this disjunct.” If the subject is unreliable, she may not<br />

have any justified beliefs about the external world. But this is just to say the above<br />

recipe for constructing bad propositions breaks down. Externalists should have no<br />

fear that anything like this approach could be used to construct a proposition they<br />

should find bad. This is obviously not a positive argument that anti-sceptical empiricist<br />

externalism is tenable, but it does suggest that such a position is immune to the<br />

kind of argument I have presented here.


Disagreements, Philosophical and Otherwise<br />

This paper started life as a short note I wrote around New Year 2007 while in Minneapolis.<br />

It was originally intended as a blog post. That might explain, if not altogether<br />

excuse, the flippant tone in places. But it got a little long for a post, so I made<br />

it into the format of a paper and posted it to my website. The paper has received a<br />

lot of attention, so it seems like it will be helpful to see it in print. Since a number of<br />

people have responded to the argument as stated, I’ve decided to just reprint the article<br />

warts and all, and make a few comments at the end about how I see its argument<br />

in the context of the subsequent debate.<br />

Disagreeing about Disagreement (2007)<br />

I argue with my friends a lot. That is, I offer them reasons to believe all sorts of<br />

philosophical conclusions. Sadly, despite the quality of my arguments, and despite<br />

their apparent intelligence, they don’t always agree. They keep insisting on principles<br />

in the face of my wittier and wittier counterexamples, and they keep offering their<br />

own dull alleged counterexamples to my clever principles. What is a philosopher to<br />

do in these circumstances? (And I don’t mean get better friends.)<br />

One popular answer these days is that I should, to some extent, defer to my<br />

friends. If I look at a batch of reasons and conclude p, and my equally talented<br />

friend reaches an incompatible conclusion q, I should revise my opinion so I’m now<br />

undecided between p and q. I should, in the preferred lingo, assign equal weight to<br />

my view as to theirs. This is despite the fact that I’ve looked at their reasons for concluding<br />

q and found them wanting. If I hadn’t, I would have already concluded q.<br />

The mere fact that a friend (from now on I'll leave off the qualifier 'equally talented<br />

and informed', since all my friends satisfy that) reaches a contrary opinion should be<br />

reason to move me. Such a position is defended by Richard Feldman (2005; 2006),<br />

David Christensen (2007) and Adam Elga (2007).<br />

This equal weight view, hereafter EW, is itself a philosophical position. And<br />

while some of my friends believe it, some of my friends do not. (Nor, I should add<br />

for your benefit, do I.) This raises an odd little dilemma. If EW is correct, then the<br />

fact that my friends disagree about it means that I shouldn’t be particularly confident<br />

that it is true, since EW says that I shouldn’t be too confident about any position<br />

on which my friends disagree. But, as I’ll argue below, to consistently implement<br />

EW, I have to be maximally confident that it is true. So to accept EW, I have to<br />

inconsistently both be very confident that it is true and not very confident that it is<br />

true. This seems like a problem, and a reason to not accept EW. We can state this<br />

argument formally as follows, using the notion of a peer and an expert. Some people<br />

are peers if they are equally philosophically talented and informed as each other, and<br />

one is more expert than another if they are more informed and talented than the<br />

other.<br />

† In progress. Intended for a volume on disagreement, forthcoming with OUP in 2011/12.



1. There are peers who disagree about EW, and there is no one who is an expert<br />

relative to them who endorses EW.<br />

2. If 1 is true, then according to EW, my credence in EW should be less than 1.<br />

3. If my credence in EW is less than 1, then the advice that EW offers in a wide<br />

range of cases is incoherent.<br />

4. So, the advice EW offers in a wide range of cases is incoherent.<br />

The first three sections of this paper will be used to defend the first three premises.<br />

The final section will look at the philosophical consequences of the conclusion.<br />

1 Peers and EW<br />

Thomas Kelly (2005) has argued against EW and in favour of the view that a peer with<br />

the irrational view should defer to a peer with the rational view. Elga helpfully dubs<br />

this the 'right reasons' view. Ralph Wedgwood (2007, Ch. 11) has argued against EW<br />

and in favour of the view that one should have a modest 'egocentric bias', i.e. a bias<br />

towards one’s own beliefs. On the other hand, as mentioned above, Elga, Christensen<br />

and Feldman endorse versions of EW. So it certainly looks like there are very talented<br />

and informed philosophers on either side of this debate.<br />

Now I suppose that if we were taking EW completely seriously, we would at this<br />

stage of the investigation look very closely at whether these five really are epistemic<br />

peers. We could pull out their grad school transcripts, look at the citation rates for<br />

their papers, get reference letters from expert colleagues, maybe bring one or two of<br />

them in for job-style interviews, and so on. But this all seems somewhat inappropriate<br />

for a scholarly journal. Not to mention a little tactless. 1 So I’ll just stipulate that<br />

they seem to be peers in the sense relevant for EW, and address one worry a reader<br />

may have about my argument.<br />

An objector might say, “Sure it seems antecedently that Kelly and Wedgwood<br />

are the peers of the folks who endorse EW. But take a look at the arguments for<br />

EW that have been offered. They look like good arguments, don’t they? Doesn’t<br />

the fact that Kelly and Wedgwood don’t accept these arguments mean that, however<br />

talented they might be in general, they obviously have a blind spot when it comes to<br />

the epistemology of disagreement? If so, we shouldn’t treat them as experts on this<br />

question.” There is something right about this. People can be experts in one area, or<br />

even many areas, while their opinions are systematically wrong in another. But the<br />

objector’s line is unavailable to defenders of EW.<br />

Indeed, these defenders have been quick to distance themselves from the objector.<br />

Here, for instance, is Elga’s formulation of the EW view, a formulation we’ll return<br />

to below.<br />

Your probability in a given disputed claim should equal your prior conditional<br />

probability in that claim. Prior to what? Prior to your thinking<br />

through the claim, and finding out what your advisor thinks of it. Conditional<br />

on what? On whatever you have learned about the circumstances<br />

of how you and your advisor have evaluated the claim. (Elga, 2007, 490)<br />

1 Though if EW is correct, shouldn’t the scholarly journals be full of just this information?



The fact that Kelly and Wedgwood come to different conclusions can’t be enough<br />

reason to declare that they are not peers. As Elga stresses, what matters is the prior<br />

judgment of their acuity. And Elga is right to stress this. If we declared anyone who<br />

doesn’t accept reasoning that we find compelling not a peer, then the EW view would<br />

be trivial. After all, the EW view only gets its force from cases as described in the<br />

introduction, where our friends reject reasoning we accept, and accept reasons we<br />

reject. If that makes them not a peer, the EW view never applies. So we can’t argue<br />

that anyone who rejects EW is thereby less of an expert in the relevant sense than<br />

someone who accepts it, merely in virtue of their rejection of EW. So it seems we<br />

should accept premise 1.<br />

2 Circumstances of Evaluation<br />

Elga worries about the following kind of case. Let p be that the sum of a certain<br />

series of numbers, all of them integers, is 50. Let q be that the sum of those numbers<br />

is 400e. My friend and I both add the numbers, and I conclude p while he concludes<br />

q. It seems that there is no reason to defer to my friend. I know, after all, that he has<br />

made some kind of mistake. The response, say defenders of EW, is that deference is<br />

context-sensitive. If I know, for example, that my friend is drunk, then I shouldn’t<br />

defer to him. More generally, as Elga puts it, how much I should defer should depend<br />

on what I know about the circumstances.<br />

Now this is relevant because one of the relevant circumstances might be that my<br />

friend has come to a view that I regard as insane. That’s what happens in the case of<br />

the sums. Since my prior probability that my friend is right given that he has an<br />

insane-seeming view is very low, my posterior probability that my friend is right should<br />

also, according to Elga, be low. Could we say that, although antecedently we regard<br />

Wedgwood and Kelly as peers of those they disagree with, the circumstance of<br />

their disagreement is such that we should disregard their views?<br />

It is hard to see how this would be defensible. It is true that a proponent of EW<br />

will regard Kelly and Wedgwood as wrong. But we can’t say that we should disregard<br />

the views of all those we regard as mistaken. That leads to trivialising EW, for reasons<br />

given above. The claim has to be that their views are so outrageous that we wouldn’t<br />

defer to anyone with views that outrageous. And this seems highly implausible. But<br />

that’s the only reason that premise 2 could be false. So we should accept premise 2.<br />

3 A Story about Disagreement<br />

The tricky part of the argument is proving premise 3. To do this, I’ll use a story involving<br />

four friends, Apollo, Telemachus, Adam and Tom. The day before our story<br />

takes place, Adam has convinced Apollo that he should believe EW, and organise<br />

his life around it. Now Apollo and Telemachus are on their way to Fenway Park to<br />

watch the Red Sox play the Indians. There have been rumours flying around all day<br />

about whether the Red Sox injured star player, David Ortiz, will be healthy enough<br />

to play. Apollo and Telemachus have heard all the competing reports, and are comparing<br />

their credences that Ortiz will play. (Call the proposition that he will play



p.) Apollo’s credence in p is 0.7, and Telemachus’s is 0.3. In fact, 0.7 is the rational<br />

credence in p given their shared evidence, and Apollo truly believes that it is. 2 And,<br />

as it turns out, the Red Sox have decided but not announced that Ortiz will play, so<br />

p is true.<br />

Despite these facts, Apollo lowers his credence in p. In accord with his newfound<br />

belief in EW, he changes his credence in p to 0.5. Apollo is sure, after all, that when<br />

it comes to baseball Telemachus is an epistemic peer. At this point Tom arrives,<br />

and with a slight disregard for the important baseball game at hand, starts trying to<br />

convince them of the right reasons view on disagreement. Apollo is not convinced,<br />

but Telemachus thinks it sounds right. As he puts it, the view merely says that the<br />

rational person believes what the rational person believes. And who could disagree<br />

with that?<br />

Apollo is not convinced, and starts telling them the virtues of EW. But a little<br />

way in, Tom cuts him off with a question. “How probable,” he asks Apollo, “does<br />

something have to be before you’ll assert it?”<br />

Apollo says that it has to be fairly probable, though just what the threshold is<br />

depends on just what issues are at stake. But he agrees that it has to be fairly high,<br />

well above 0.5 at least.<br />

“Well,” says Tom, “in that case you shouldn’t be defending EW in public. Because<br />

you think that Telemachus and I are the epistemic peers of you and Adam. And<br />

we think EW is false. So even by EW’s own lights, the probability you assign to<br />

EW should be 0.5. And that’s not a high enough probability to assert it.” Tom’s<br />

speech requires that Apollo regard him and Telemachus as his epistemic peers<br />

with regard to this question. By premises 1 and 2, Apollo should do this, and we’ll<br />

assume that he does.<br />

So Apollo agrees with all this, and agrees that he shouldn’t assert EW any more.<br />

But he still plans to use it, i.e. to have a credence in p of 0.5 rather than 0.7. But now<br />

Telemachus and Tom press on him the following analogy.<br />

Imagine that there were two competing experts, each of whom gave differing<br />

views about the probability of q. One of the experts, call her Emma, said that the<br />

probability of q, given the evidence, is 0.5. The other expert, call her Rae, said that<br />

the probability of q, given the evidence, is 0.7. Assuming that Apollo has the same<br />

evidence as the experts, but he regards the experts as experts at evaluating evidence,<br />

what should his credence in q be? It seems plausible that it should be a weighted<br />

average of what Emma says and what Rae says. In particular, it should be 0.5 only<br />

if Apollo is maximally confident that Emma is the expert to trust, and not at all<br />

confident that Rae is the expert to trust.<br />
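The weighted-average constraint here is simple arithmetic; a sketch, with `w` standing for Apollo's credence that Emma rather than Rae is the expert to trust (the function name and numbers are mine, matching the example):<br />

```python
def deferred_credence(w, emma_says=0.5, rae_says=0.7):
    # Weighted average of the two experts' verdicts, where w is the
    # credence that Emma is the expert to trust.
    return w * emma_says + (1 - w) * rae_says

print(deferred_credence(1.0))            # 0.5: full trust in Emma
print(round(deferred_credence(0.5), 2))  # 0.6: equal trust in each
print(deferred_credence(0.0))            # 0.7: full trust in Rae
```

So a credence of 0.5 in q coheres only with w = 1, which is exactly the point pressed against Apollo below.<br />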

The situation is parallel to the one Apollo actually faces. EW says that his credence<br />

in p should be 0.5. The right reason view says that his credence in p should<br />

be 0.7. Apollo is aware of both of these facts. So his credence in p should be 0.5 iff<br />

he is certain that EW is the theory to trust, just as his credence in q should be 0.5<br />

2 This is obviously somewhat of an idealisation, since there won’t usually be a unique precise rational<br />

response to the evidence. But I don’t think this idealisation hurts the argument to follow. I should note<br />

that the evidence here excludes their statements of their credences, so I really mean the evidence that they<br />

brought to bear on the debate over whether p.



iff he is certain that Emma is the expert to trust. Indeed, a credence of 0.5 in p is<br />

incoherent unless Apollo is certain EW is the theory to trust. But Apollo is not at<br />

all certain of this. His credence in EW, as is required by EW itself, is 0.5. So as long<br />

as Apollo keeps his credence in p at 0.5, he is being incoherent. But EW says to keep<br />

his credence in p at 0.5. So EW advises him to be incoherent. That is, EW offers<br />

incoherent advice. We can state this more carefully in an argument.<br />

5. EW says that Apollo’s credence in p should be 0.5.<br />

6. If 5, then EW offers incoherent advice unless it also says that Apollo’s credence<br />

in EW should be 1.<br />

7. EW says that Apollo’s credence in EW should be 0.5.<br />

8. So, EW offers incoherent advice.<br />

Since Apollo’s case is easily generalisable, we can infer that in a large number of cases,<br />

EW offers advice that is incoherent. Line 7 in this argument is hard to assail given<br />

premises 1 and 2 of the master argument. But I can imagine objections to each of the<br />

other lines.<br />

Objection: Line 6 is false. Apollo can coherently have one credence in p while being<br />

unsure about whether it is the rational credence to have. In particular, he can coherently<br />

have his credence in p be 0.5, while he is unsure whether his credence in p<br />

should be 0.5 or 0.7. In general there is no requirement for agents who are not omniscient<br />

to have their credences match their judgments of what their credences should<br />

be.<br />

Replies: I have two replies to this, the first dialectical and the second substantive.<br />

The dialectical reply is that if the objector’s position on coherence is accepted,<br />

then a lot of the motivation for EW fades away. A core idea behind EW is that Apollo<br />

was unsure before the conversation started whether he or Telemachus would have the<br />

most rational reaction to the evidence, and hearing what each of them says does not<br />

provide him with more evidence. (See the 'bootstrapping' argument in Elga (2007)<br />

for a more formal statement of this idea.) So Apollo should have equal credence in<br />

the rationality of his judgment and of Telemachus’s judgment.<br />

But if the objector is correct, Apollo can do that without changing his view on<br />

EW one bit. He can, indeed should, have his credence in p be 0.7, while being uncertain<br />

whether his credence in p should be 0.7 (as he thinks) or 0.3 (as Telemachus<br />

thinks). Without some principle connecting what Apollo should think about what<br />

he should think to what Apollo should think, it is hard to see why this is not the<br />

uniquely rational reaction to Apollo’s circumstances. In other words, if this is an objection<br />

to my argument against EW, it is just as good an objection to a core argument<br />

for EW.<br />

The substantive argument is that the objector’s position requires violating some<br />

very weak principles concerning rationality and higher-order beliefs. The objector<br />

is right that, for instance, in order to justifiably believe that p (to degree d), one<br />

need not know, or even believe, that one is justified in believing p (to that degree).<br />

If nothing else, the anti-luminosity arguments in Williamson (2000a) show that to



be the case. But there are weaker principles that are more plausible, and which the<br />

objector’s position has us violate. In particular, there is the view that we can’t both<br />

be justified in believing that p (to degree d), while we know we are not justified in<br />

believing that we are justified in believing p (to that degree). In symbols, if we let Jp<br />

mean that the agent is justified in believing p, and take box and diamond to be epistemic<br />

modals, we have the principle MJ (for Might be Justified).<br />

MJ: Jp → ◊JJp<br />

This seems like a much more plausible principle, since if we know we aren’t justified<br />

in believing we’re justified in believing p, it seems like we should at least suspend<br />

judgment in p. That is, we shouldn’t believe p. That is, we aren’t justified in believing<br />

p. But the objector’s position violates principle MJ, or at least a probabilistic<br />

version of it, as we’ll now show.<br />

We aim to prove that the objector is committed to Apollo being justified in believing<br />

p to degree 0.5, while he knows he is not justified in believing he is justified<br />

in believing p to degree 0.5. The first part is trivial; it’s just a restatement of the<br />

objector’s view, so it is the second part that we must be concerned with.<br />

Now, either EW is true, or it isn’t true. If it is true, then Apollo is not justified<br />

in having a greater credence in it than 0.5. But his only justification for believing p<br />

to degree 0.5 is EW. He’s only justified in believing he’s justified in believing p if he<br />

can justify his use of EW in it. But you can’t justify a premise in which your rational<br />

credence is 0.5. So Apollo isn’t justified in believing he is justified in believing p. If<br />

EW isn’t true, then Apollo isn’t even justified in believing p to degree 0.5. And he<br />

knows this, since he knows EW is his only justification for lowering his credence in<br />

p that far. So he certainly isn’t justified in believing he is justified in believing p to<br />

degree 0.5. Moreover, every premise in this argument has been a premise that Apollo<br />

knows to obtain, and he is capable of following all the reasoning. So he knows that<br />

he isn’t justified in believing he is justified in believing p to degree 0.5, as required.<br />
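The two-horned reasoning just given can be compressed into a schema (my reconstruction, using the Jp notation above together with K for Apollo's knowledge and the diamond as an epistemic modal):<br />

```latex
\begin{align*}
\textbf{If EW is true:}\quad & cr(\mathrm{EW}) \le 0.5 \text{, so the use of EW cannot be justified, so } \neg JJp.\\
\textbf{If EW is false:}\quad & \neg Jp \text{ (to degree } 0.5\text{), and Apollo knows this, so } \neg JJp.\\
\textbf{Either way:}\quad & K\neg JJp \text{, hence } \neg\Diamond JJp \text{, which with MJ } (Jp \to \Diamond JJp) \text{ yields } \neg Jp.
\end{align*}
```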

The two replies I’ve offered to the objector complement one another. If someone<br />

accepts MJ, then they’ll regard the objector’s position as incoherent, since we’ve just<br />

shown that MJ is inconsistent with that position. If, on the other hand, someone<br />

rejects MJ and everything like it, then they have little reason to accept EW in the<br />

first place. They should just accept that Apollo’s credence in p should be, as per<br />

hypothesis the evidence suggests, 0.7. The fact that an epistemic peer disagrees, in<br />

the face of the same evidence, might give Apollo reason to doubt that this is in fact<br />

that uniquely rational response to the evidence. But, unless we accept a principle<br />

like MJ, that’s consistent with Apollo retaining the rational response to the evidence,<br />

namely a credence of 0.7 in p. So it is hard to see how someone could accept the<br />

objector’s argument, while also being motivated to accept EW. In any case, I think<br />

MJ is plausible enough on its own to undermine the objector’s position. 3<br />

3 Added in 2010: I still think there’s a dilemma here for EW, but I’m less convinced than I used to be<br />

that MJ is correct.



Objection: Line 5 is false. Once we’ve seen that the credence of EW is 0.5, then<br />

Apollo’s credence in first-order claims such as p should, as the analogy with q suggests,<br />

be a weighted average of what EW says it should be, and what the right reason<br />

view says it should be. So, even by EW’s own lights, Apollo’s credence in p should<br />

be 0.6.<br />

Replies: Again I have a dialectical reply, and a substantive reply.<br />

The dialectical reply is that once we make this move, we really have very little motivation<br />

to accept EW. There is, I’ll grant, some intuitive plausibility to the view that<br />

when faced with a disagreeing peer, we should think the right response is half way<br />

between our competing views. But there is no intuitive plausibility whatsoever to the<br />

view that in such a situation, we should naturally move to a position three-quarters<br />

of the way between the two competing views, as this objector suggests. Much of the<br />

argument for EW, especially in Christensen, turns on intuitions about cases, and the<br />

objector would have us give all of that up. Without those intuitions, however, EW<br />

falls in a heap.<br />

The substantive reply is that the idea behind the objection can't be coherently sustained. The idea is that we should first apply EW to philosophical questions to work out the probability of different theories of disagreement, and then apply those probabilities to first-order disagreements. The hope is that in doing so we'll reach a stable point at which EW can be coherently applied. But there is no such stable point. Consider the following series of questions.

Q1 Is EW true?

Two participants say yes, two say no. We have a dispute, leading to our next question.

Q2 What is the right reaction to the disagreement over Q1?

EW answers this by saying our credence in EW should be 0.5. But that's not what the right reason proponents say. They don't believe EW, so they have no reason to move their credence in EW away from 0. So we have another dispute, and we can ask

Q3 What is the right reaction to the disagreement over Q2?

EW presumably says that we should again split the difference. Our credence in EW might now be 0.25, half-way between the 0.5 it was after considering Q2, and what the right reasons folks say. But, again, those who don't buy EW will disagree, and won't be moved to adjust their credence in EW. So again there's a dispute, and again we can ask

Q4 What is the right reaction to the disagreement over Q3?

This could go on for a while. The only 'stable point' in the sequence is when we assign a credence of 0 to EW. That's to say, the only way to coherently defend the idea behind the objection is to assign credence 0 to EW. But that's to give up on EW. As with the previous objection, we can't hold on to EW and object to the argument.



4 Summing Up

The story I’ve told here is a little idealised, but otherwise common enough. We often<br />

have disagreements both about first-order questions, and about how to resolve this<br />

disagreement. In these cases, there is no coherent way to assign equal weight to all<br />

prima facie rational views both about the first order question and the second order,<br />

epistemological, question. The only way to coherently apply EW to all first order<br />

questions is to put our foot down, and say that despite the apparent intelligence of<br />

our philosophical interlocutors, we’re not letting them dim our credence in EW. But<br />

if we are prepared to put our foot down here, why not about some first-order question<br />

or other? It certainly isn’t because we have more reason to believe an epistemological<br />

theory like EW than we have to believe first order theories about which there is<br />

substantive disagreement. So perhaps we should hold on to those theories, and let go<br />

of EW.<br />

Afterthoughts (2010)

I now think that the kind of argument I presented in the 2007 paper is not really an argument against EW as such, but an argument against one possible motivation for EW. I also think that alternate motivations for EW are no good, so I still think it is an important argument. But I think its role in the dialectic is a little more complicated than I appreciated back then.

Much of my thinking about disagreement problems revolves around the following table. The idea behind the table, and much of the related argument, is due to Thomas Kelly (2010). In the table, S and T antecedently had good reasons to take themselves to be epistemic peers, and they know that their judgments about p are both based on E. In fact, E is excellent evidence for p, but only S judges that p; T judges that ¬p. Now let's look at what seems to be the available evidence for and against p.

Evidence for p            Evidence against p
S's judgment that p       T's judgment that ¬p
E

Now that doesn’t look to me like a table where the evidence is equally balanced for<br />

and against p. Even granting that the judgments are evidence over and above E, and<br />

granting that how much weight we should give to judgments should track our ex ante<br />

judgments of their reliability rather than our ex post judgments of their reliability,<br />

both of which strike me as false but necessary premises for EW, it still looks like<br />

there is more evidence for p than against p. 4 There is strictly more evidence for p<br />

than against it, since E exists. If we want to conclude that S should regard p and<br />

4 By ex ante and ex post I mean before and after we learn about S and T's use of E to make a judgment about p. I think that should change how reliable we take S and T to be, and that this should matter to what use, if any, we put their judgments to, but it is crucial to EW that we ignore this evidence. Or, at least, it is crucial to EW that S and T ignore this evidence.



¬p as equally well supported for someone in her circumstance, we have to show that the table is somehow wrong. I know of three possible moves the EW defender could make here.

David Christensen (2010), as I read him, says that the table is wrong because when we are representing the evidence S has, we should not include her own judgment. There's something plausible to this. Pretend for a second that T doesn't exist, so it's clearly rational for S to judge that p. It would still be wrong of S to say, "Since E is true, p. And I judged that p, so that's another reason to believe that p, because I'm smart." By hypothesis, S is smart, and the fact that smart people judge things to be so is a reason to believe those things are true. But this doesn't work when the judgment is one's own. This is something that needs explaining in a full theory of the epistemic significance of judgment, but let's just take it as a given for now.5 Now the table, or at least the table as is relevant to S, looks as follows.

Evidence for p            Evidence against p
E                         T's judgment that ¬p

But I don’t think this does enough to support EW, or really anything like it. First,<br />

it won’t be true in general that the two sides of this table balance. In many cases,<br />

E is strong evidence for p, and T ’s judgment won’t be particularly strong evidence<br />

against p. In fact, I’d say the kind of case where E is much better evidence for p than<br />

T ’s judgment is against p is the statistically normal kind. Or, at least, it is the normal<br />

kind of case modulo the assumption that S and T have the same evidence. In cases<br />

where that isn’t true, learning that T thinks ¬ p is good evidence that T has evidence<br />

against p that you don’t have, and you should adjust accordingly. But by hypothesis,<br />

S knows that isn’t the case here. So I don’t see why this should push us even close to<br />

taking p and ¬ p to be equally well supported.<br />

The other difficulty for defending EW by this approach is that it seems to undermine the original motivations for the view. As Christensen notes, the table above is specifically for S. Here's what the table looks like for T.

Evidence for p            Evidence against p
S's judgment that p
E

It’s no contest! So T should firmly believe p. But that isn’t the intuition anyone gets,<br />

as far as I can tell, in any of the cases motivating EW. And the big motivation for EW<br />

comes from intuitions about cases. Once we acknowledge that these intuitions are<br />

unreliable, as we’d have to do if we were defending EW this way, we seem to lack any<br />

reason to accept EW.<br />

The second approach to blocking the table is to say that T's judgment is an undercutting defeater for the support E provides for p. This looks superficially promising. Having a smart person say that your evidence supports something other than you

5 My explanation is that evidence screens any judgments made on the basis of that evidence, in the sense of screening to be described below.



thought it did seems like it could be an undercutting defeater, since it is a reason to think the evidence supports something else, and hence doesn't support what you thought it did. And, of course, if E is undercut, then the table just has one line on it, and the two sides look equal.

But it doesn’t seem like it can work in general, for a reason that Kelly (2010) makes<br />

clear. We haven’t said what E is so far. Let’s start with a case where E consists of the<br />

judgments of a million other very smart people that p. Then no one, not even the<br />

EW theorist, will think that T ’s judgment undercuts the support E provides to p.<br />

Indeed, even if E just consists of one other person’s judgment, it won’t be undercut<br />

by T ’s judgment. The natural thought for an EW-friendly person to have in that<br />

case is that since there are two people who think p, and one who thinks ¬ p, then<br />

S’s credence in p should be 2/3. But that’s impossible if E, i.e., the third person’s<br />

judgment, is undercut by T ’s judgment. It’s true that T ’s judgment will partially<br />

rebut the judgments that S, and the third party, make. It will move the probability<br />

of p, at least according to EW, from 1 to 2/3. But that evidence won’t be in any way<br />

undercut.<br />

And as Kelly points out, evidence is pretty fungible. Whatever support p gets from other people's judgments, it could get very similar support from something other than a judgment. We get roughly the same evidence for p by learning that a smart person predicts p as by learning that a successful computer model predicts p. So the following argument looks sound to me.

1. When E consists of other people's judgments, the support it provides to p is not undercut by T's judgment.
2. If the evidence provided by other people's judgments is not undercut by T's judgment, then some non-judgmental evidence is not undercut by T's judgment.
3. So, not all non-judgmental evidence is undercut by T's judgment.

So it isn't true in general that the table is wrong because E has been defeated by an undercutting defeater.

There’s another problem with the defeat model in cases where the initial judgments<br />

are not full beliefs. Change the case so E provides basically no support to<br />

either p or ¬ p. In fact, E is just irrelevant to p, and the agent’s have nothing to base<br />

either a firm or a probabilistic judgment about p on. For this reason, S declines to<br />

form a judgment, but T forms a firm judgment that p. Moreover, although both S<br />

and T are peers, that’s because they are both equally poor at making judgments about<br />

cases like p. Here’s the table then:<br />

Evidence for p Evidence against p<br />

T ’s judgment that p<br />

Since E is irrelevant, it doesn’t appear, either before or after we think about defeaters.<br />

And since T is not very competent, that’s not great evidence for p. But EW says that<br />

S should ‘split the difference’ between her initial agnositicism, and T ’s firm belief in<br />

p. I don’t see how that could be justified by S’s evidence.



So that move doesn't work either, and we're left with the third option for upsetting the table. This move is, I think, the most promising of the lot. It is to say that S's own judgment screens off the evidence that E provides. So the table is misleading, because it 'double counts' evidence.

The idea of screening I'm using here, at least on behalf of EW, comes from Reichenbach's The Direction of Time, and in particular from his work on deriving a principle that lets us infer events have a common cause. The notion was originally introduced in probabilistic terms. We say that C screens off the positive correlation between B and A if the following two conditions are met.

1. A and B are positively correlated probabilistically, i.e. Pr(A|B) > Pr(A).
2. Given C, A and B are probabilistically independent, i.e. Pr(A|B ∧ C) = Pr(A|C).

I’m interested in an evidential version of screening. If we have a probabilistic analysis<br />

of evidential support, the version of screening I’m going to offer here is identical to<br />

the Reichenbachian version just provided. But I want to stay neutral on whether<br />

we should think of evidence probabilistically. 6 When I say that C screens off the<br />

evidential support that B provides to A, I mean the following. (Both these clauses, as<br />

well as the statement that C screens off B from A, are made relative to an evidential<br />

background. I’ll leave that as tacit in what follows.)<br />

1. B is evidence that A.<br />

2. B ∧ C is no better evidence that A than C is. 7<br />

Here is one stylised example of where screening helps conceptualise things. Detective Det is trying to figure out whether suspect Sus committed a certain crime. Let A be that Sus is guilty, B be that Sus was seen near the crime scene near the time the crime was committed, and C be that Sus was at the crime scene when the crime was committed. Then both clauses are satisfied. B is evidence for A; that's why we look for witnesses who place the suspect near the crime scene. But given the further evidence C, B is neither here nor there with respect to A. We're only interested in finding out if Sus was near the crime scene because we want to know whether he was at the crime scene. If we know that he was there, then learning he was seen near there doesn't move the investigation along. So both clauses of the definition of screening are satisfied.

When there is screened evidence, there is the potential for double counting. It would be wrong to say that if we know B ∧ C we have two pieces of evidence against Sus. Similarly, if a judgment screens off the evidence it is based on, then the table 'double counts' the evidence for p. Removing the double counting, by removing E, makes the table symmetrical. And that's just what EW needs.

6In general I’m sceptical of always treating evidence probabilistically. Some of my reasons for scepticism<br />

are in <strong>Weatherson</strong> (2007).<br />

7Branden Fitelson pointed out to me that the probabilistic version entails one extra condition, namely<br />

that ¬B ∧ C is no worse evidence for A than C is. But I think that extra condition is irrelevant to disagreement<br />

debates, so I’m leaving it out.



So the hypothesis that judgments screen the evidence they are based on, or JSE for short, can help EW respond to the argument from this table. But JSE is vulnerable to regress arguments. I now think that the argument in 'Disagreeing about Disagreement' is a version of the regress argument against JSE. So really it's an argument against the most promising response to a particularly threatening argument against EW.

Unfortunately for EW, those regress arguments are actually quite good. To see this, let's say an agent makes a judgment on the basis of E, and let J be the proposition that that judgment was made. JSE says that E is now screened off, and the agent's evidence is just J. But with that evidence, the agent presumably makes a new judgment. Let J′ be the proposition that that judgment was made. We might ask now, does J′ sit alongside J as extra evidence, is it screened off by J, or does it screen off J? The picture behind JSE, the picture that says that judgments on the basis of some evidence screen that evidence, suggests that J′ should in turn screen J. But now it seems we have a regress on our hands. By the same token, J′′, the proposition concerning the new judgment made on the basis of J′, should screen off J′, and the proposition J′′′ about the fourth judgment made should screen off J′′, and so on. The poor agent has no unscreened evidence left! Something has gone horribly wrong.

I think this regress is ultimately fatal for JSE. But to see this, we need to work through the possible responses that a defender of JSE could make. There are really just two moves that seem viable. One is to say that the regress does not get going, because J is better evidence than J′, and perhaps screens it. The other is to say that the regress is not vicious, because all these judgments should agree in their content. I'll end the paper by addressing these two responses.

The first way to avoid the regress is to say that there is something special about the first level. So although J screens E, it isn't the case that J′ screens J. That way, the regress doesn't start. This kind of move is structurally like the move Adam Elga (2010) has recently suggested. He argues that we should adjust our views about first-order matters in (partial) deference to our peers, but we shouldn't adjust our views about the right response to disagreement in this way.

It’s hard to see what could motivate such a position, either about disagreement<br />

or about screening. It’s true that we need some kind of stopping point to avoid these<br />

regresses. But the most natural stopping point is the very first level. Consider a toy<br />

example. It’s common knowledge that there are two apples and two oranges in the<br />

basket, and no other fruit. (And that no apple is an orange.) Two people disagree<br />

about how many pieces of fruit there are in the basket. A thinks there are four, B<br />

thinks there are five, and both of them are equally confident. Two other people, C<br />

and D, disagree about what A and B should do in the face of this disagreement. All<br />

four people regard each other as peers. Let’s say C ’s position is the correct one (whatever<br />

that is) and D’s position is incorrect. Elga’s position is that A should partially<br />

defer to B, but C should not defer to D. This is, intuitively, just back to front. A<br />

has evidence that immediately and obviously entails the correctness of her position.<br />

C is making a complicated judgment about a philosophical question where there are<br />

plausible and intricate arguments on each side. The position C is in is much more<br />

like the kind of case where experience suggests a measure of modesty and deference



can lead us away from foolish errors. If anyone should be sticking to their guns here, it is A, not C.

The same thing happens when it comes to screening. Let's say that A has some evidence that (a) she has made some mistakes on simple sums in the past, but (b) tends to massively over-estimate the likelihood that she's made a mistake on any given puzzle. What should she do? One option, in my view the correct one, is that she should believe that there are four pieces of fruit in the basket, because that's what the evidence obviously entails. Another option is that she should be not very confident there are four pieces of fruit in the basket, because she makes mistakes on these kinds of sums. Yet another option is that she should be pretty confident (if not completely certain) that there are four pieces of fruit in the basket, because if she were not very confident about this, this would just be a manifestation of her over-estimation of her tendency to err. The 'solution' to the regress we're considering here says that the second of these three reactions is the uniquely rational reaction. The idea behind the solution is that we should respond to the evidence provided by first-order judgments, and correct that judgment for our known biases, but that we shouldn't in turn correct for the flaws in our self-correcting routine. I don't see what could motivate such a position. Either we just rationally respond to the evidence, and in this case just believe there are four pieces of fruit in the basket, or we keep correcting for errors we make in any judgment. It's true that the latter plan leads either to regress or to the kind of ratificationism we're about to critically examine. But that's not because the disjunction is false, it's because the first disjunct is true.

A more promising way to avoid the regress is suggested by some other work of Elga's, in this case a paper he co-wrote with Andy Egan (Egan and Elga, 2005). Their idea, as I understand them, is that for any rational agent, any judgment they make must be such that when they add the fact that they made that judgment to their evidence (or, perhaps better given JSE, replace their evidence with the fact that they made that judgment), the rational judgment to make given the new evidence has the same content as the original judgment. So if you're rational, and you come to believe that p is likely true, then the rational thing to believe given you've made that judgment is that p is likely true.

Note that this isn’t as strong a requirement as it may first seem. The requirement<br />

is not that any time an agent makes a judgment, rationality requires that they say<br />

on reflection that it is the correct judgments. Rather, the requirement is that the<br />

only judgments rational agents make are those judgments that, on reflection, she<br />

would reflectively endorse. We can think of this as a kind of ratifiability constraint<br />

on judgment, like the ratifiability constraint on decision making that Richard Jeffrey<br />

uses to handle Newcomb cases Jeffrey (1983b).<br />

To be a little more precise, a judgment is ratifiable for agent S just in case the rational judgment for S to make conditional on her having made that judgment has the same content as the original judgment. The thought then is that we avoid the regress by saying rational agents always make ratifiable judgments. If the agent does do that, there isn't much of a problem with the regress; once she gets to the first level, she has a stable view, even once she reflects on it.



It seems to me that this assumption, that only ratifiable judgments are rational, is what drives most of the arguments in Egan and Elga's paper on self-confidence, so I don't think this is a straw-man move. Indeed, as the comparison to Jeffrey suggests, it has some motivation behind it. Nevertheless it is false. I'll first note one puzzling feature of the view, then one clearly false implication of the view.

The puzzling feature is that in some cases there may be nothing we can rationally do which is ratifiable. One way this can happen involves a slight modification of Egan and Elga's example of the directionally-challenged driver. Imagine that when I'm trying to decide whether p, for any p in a certain field, I know (a) that whatever judgment I make will usually be wrong, and (b) if I conclude my deliberations without making a judgment, then p is usually true. If we also assume JSE, then it follows there is no way for me to end deliberation. If I make a judgment, I will have to retract it because of (a). But if I think of ending deliberation, then because of (b) I'll have excellent evidence that p, and it would be irrational to ignore this evidence. (Nicholas Silins (2005) has used the idea that failing to make a judgment can be irrational in a number of places, and those arguments motivated this example.)

This is puzzling, but not obviously false. It is plausible that there are some epistemic dilemmas, where any position an agent takes is going to be irrational. (By that, I mean it is at least as plausible that there are epistemic dilemmas as that there are moral dilemmas, and I think the plausibility of moral dilemmas is reasonably high.) That a case like the one I've described in the previous paragraph is a dilemma is perhaps odd, but no reason to reject the theory.

The real problem, I think, for the ratifiability proposal is that there are cases where unratifiable judgments are clearly preferable to ratifiable judgments. Assume that I'm a reasonably good judge of what's likely to happen in baseball games, but I'm a little over-confident. And I know I'm over-confident. So the rational credence, given some evidence, is usually a little closer to 1/2 than I admit. At risk of being arbitrarily precise, let's say that if p concerns a baseball game, and my credence in p is x, the rational credence in p, call it y, for someone with no other information than this is given by:

y = x + sin(2πx)/50

To give you a graphical sense of how that looks, the dark line in this graph is y, and the lighter diagonal line is y = x.



Note that the two lines intersect at three points: (0,0), (1/2, 1/2) and (1,1). So if my credence in p is either 0, 1/2 or 1, then my judgment is ratifiable. Otherwise, it is not. So the ratifiability constraint says that for any p about a baseball game, my credence in p should be either 0, 1/2 or 1. But that's crazy. It's easy to imagine that I know (a) that in a particular game, the home team is much stronger than the away team, (b) that the stronger team usually, but far from always, wins baseball games, and (c) I'm systematically a little over-confident about my judgments about baseball games, in the way just described. In such a case, my credence that the home team will win should be high, but less than 1. That's just what the ratificationist denies is possible.
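The fixed-point claim is easy to verify numerically. This is my own quick check, not part of the paper: the curve y = x + sin(2πx)/50 agrees with the diagonal y = x only where sin(2πx) = 0, which on [0, 1] is exactly at 0, 1/2 and 1.

```python
import math

def ratified(x: float) -> float:
    """The toy calibration curve from the text: y = x + sin(2*pi*x)/50."""
    return x + math.sin(2 * math.pi * x) / 50

# A credence x is ratifiable iff ratified(x) == x, i.e. sin(2*pi*x) == 0.
# Scanning a fine grid over [0, 1] finds exactly x = 0, 1/2 and 1.
fixed = [i / 1000 for i in range(1001)
         if abs(ratified(i / 1000) - i / 1000) < 1e-12]
print(fixed)  # [0.0, 0.5, 1.0]

# Any other credence drifts when "corrected": a confident 0.9, say,
# gets pulled down, and so is not ratifiable.
print(ratified(0.9) < 0.9)  # True
```

So a high-but-imperfect credence like 0.9 is exactly the kind of reasonable attitude the ratifiability constraint rules out.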

This kind of case proves that it isn't always rational to have ratifiable credences. It would take us too far afield to discuss this in detail, but it is interesting to think about the comparison between the kind of case I just discussed, and the objections to backwards induction reasoning in decision problems that have been made by Pettit and Sugden (1989), and by Stalnaker (1996; 1998; 1999). The backwards induction reasoning they criticise is, I think, a development of the idea that decisions should be ratifiable. And the clearest examples of when that reasoning fails concern cases where there is a unique ratifiable decision, and it is guaranteed to be one of the worst possible outcomes. The example I described in the last few paragraphs has, quite intentionally, a similar structure.

The upshot of all this is that I think these regress arguments work. They aren't, I think, directly an argument against EW. What they are is an argument against the most promising way the EW theorist has for arguing that the table I started with misstates S's epistemic situation. Given that the regress argument against JSE works, though, I don't see any way of rescuing EW from this argument.


Do Judgments Screen Evidence?

1 Screening

Suppose a rational agent S has some evidence E that bears on p, and on that basis makes a judgment about p. For simplicity, we'll normally assume that she judges that p, though we're also interested in cases where the agent makes other judgments, such as that p is probable, or that p is well-supported by the evidence. We'll also assume, again for simplicity, that the agent knows that E is the basis for her judgment. Finally, we'll assume that the judgment is a rational one to make, though we won't assume the agent knows this. Indeed, whether the agent can always know that she's making a rational judgment when in fact she is will be of central importance in some of the debates that follow.

Call the proposition that the agent has made this judgment J. The agent is, we'll assume, aware that J is true. She's also aware that she's rational. The fact that a rational person judges p seems to support p. So it might look like J is a new piece of evidence for her, one that tells in favour of p. Here then is an informal version of the question I'll discuss in this paper: How many pieces of evidence does the agent have that bear on p? Three options present themselves.

1. Two - Both J and E.
2. One - E subsumes whatever evidential force J has.
3. One - J subsumes whatever evidential force E has.

This paper is about option 3. I'll call this option JSE, short for Judgments Screen Evidence. I'm first going to say what I mean by screening here, and then say why JSE is interesting. Ultimately I want to defend three claims about JSE.

1. JSE is sufficient, given some plausible background assumptions, to derive a number of claims that have become prominent in recent epistemology (meaning approximately 2004 to the present day).
2. JSE is necessary to motivate at least some of these claims.
3. JSE is false.

This section will largely be about saying what JSE is, and then defending 1 and 2. I'll say a bit more about 1 and 2 in the following section, focussing in detail on the role JSE plays in the Equal Weight View of disagreement. Then in sections 3 and 4, I'll develop two distinct objections to JSE.

† In progress. References not even started, let alone complete, though there are hyperlinks to some of the papers I discuss. Draft only. Thanks to John Collins, Shamik Dasgupta, Adam Elga, Tom Kelly, Ishani Maitra, Ted Sider and audiences at Arché for comments on earlier drafts of this paper and of its constituent parts.



1.1 Screening

The idea of screening I'm using here comes from Reichenbach's The Direction of Time, and in particular from his work on deriving a principle that lets us infer events have a common cause. The notion was originally introduced in probabilistic terms. We say that C screens off the positive correlation between B and A if the following two conditions are met.

1. A and B are positively correlated probabilistically, i.e. Pr(A|B) > Pr(A).
2. Given C, A and B are probabilistically independent, i.e. Pr(A|B ∧ C) = Pr(A|C).

I’m interested in an evidential version of screening. If we have a probabilistic analysis<br />

of evidential support, the version of screening I’m going to offer here is identical to<br />

the Reichenbachian version just provided. But I want to stay neutral on whether<br />

we should think of evidence probabilistically. In general I’m somewhat sceptical<br />

of probabilistic treatments of evidence for reasons Jim Pryor goes through in his<br />

Uncertainty and Undermining. I mention some of these in my The Bayesian and the<br />

Dogmatist. But I won’t lean on those points in this paper.<br />

When I say that C screens off the evidential support that B provides to A, I mean the following. (Both these clauses, as well as the statement that C screens off B from A, are made relative to an evidential background. I'll leave that as tacit in what follows.)

1. B is evidence that A.
2. B ∧ C is no better evidence that A than C is, and ¬B ∧ C is no worse evidence for A than C is.

Here is one stylised example, and one real-world example.<br />

Detective Det is trying to figure out whether suspect Sus committed a certain<br />

crime. Let A be that Sus is guilty, B be that Sus’s fingerprints were found at the crime<br />

scene, and C be that Sus was at the crime scene when the crime was committed. Then<br />

both clauses are satisfied. B is evidence for A; that’s why we dust for fingerprints. But<br />

given the further evidence C, B is neither here nor there with respect to A.<br />

We’re only interested in finding fingerprints because they are evidence that Sus was<br />

there. If we know Sus was there, then the fingerprint evidence isn’t useful one way<br />

or the other. So both clauses of the definition of screening are satisfied.<br />

The real-world example is fairly interesting. Imagine that we know Vot is an<br />

American voter in last year’s US Presidential election, and we know Vot is either<br />

from Alabama or Massachusetts, but don’t know which. Let A be that Vot voted<br />

for Barack Obama, let B be that Vot is from Massachusetts, and let C be that Vot is<br />

pro-choice. Then, somewhat surprisingly, both conditions are met. Since voters in<br />

Massachusetts were much more likely to vote for Obama than voters in Alabama, B<br />

is good evidence for A. But, at least according to the polls linked to the state names<br />

above, pro-choice voters in the two states voted for Obama at roughly the same rate.<br />

(In both cases, a little under two to one.) So C screens off B as evidence for A, and<br />

both clauses are satisfied.
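The structure of the voter case can be reproduced with made-up numbers. The figures below are hypothetical, chosen only to match the qualitative description above (Massachusetts much more pro-Obama overall, pro-choice voters in both states favouring Obama at roughly the same rate); they are not the actual poll results:<br />

```python
# Hypothetical two-state electorate. All numbers are illustrative assumptions.
# A = voted for Obama, B = from Massachusetts, C = pro-choice.
#           P(state), P(C|state), P(A|state, C), P(A|state, not C)
groups = [
    ("MA", 0.5, 0.7, 0.65, 0.45),
    ("AL", 0.5, 0.4, 0.63, 0.15),
]

def prob(event):
    """Total probability of event(state, prochoice, obama)."""
    total = 0.0
    for state, p_s, p_c, p_a_pc, p_a_pl in groups:
        for prochoice, p_ch in ((True, p_c), (False, 1 - p_c)):
            p_ob = p_a_pc if prochoice else p_a_pl
            for obama, p_v in ((True, p_ob), (False, 1 - p_ob)):
                if event(state, prochoice, obama):
                    total += p_s * p_ch * p_v
    return total

def cond(event, given):
    return prob(lambda *x: event(*x) and given(*x)) / prob(given)

A = lambda s, c, a: a          # voted for Obama
B = lambda s, c, a: s == "MA"  # from Massachusetts
C = lambda s, c, a: c          # pro-choice

# Condition 1: being from Massachusetts is evidence for an Obama vote.
assert cond(A, B) > prob(A)

# Condition 2 (approximately): among pro-choice voters, the state adds
# almost nothing, so C screens off B as evidence for A.
assert abs(cond(A, lambda s, c, a: B(s, c, a) and c) - cond(A, C)) < 0.02
```

On these assumed figures Pr(A|B) is 0.59 against a prior of about 0.47, while Pr(A|B ∧ C) and Pr(A|C) differ by under one percentage point; screening here is approximate rather than exact, as the "roughly the same rate" wording suggests.<br />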



1.2 The Idea Behind JSE<br />

When we think about the relation between J and E, there are three conflicting pressures<br />

we immediately face. First, it seems J could be evidence for p. To see this, note<br />

that if someone else comes to know that S has judged that p, and they know that S<br />

is as rational as them, and as well informed as them, then that could be a good reason<br />

for them to believe that p. Or, at the very least, it could be evidence for them to take<br />

p to be a little more likely than they previously thought. Second, it seems like ‘double<br />

counting’ for S to take both E and J to be evidence. After all, she only formed<br />

judgment J because of E. Yet third, it seems wrong for S to simply ignore E, since<br />

by stipulation, she has E, and it is in general wrong to ignore evidence that one has.<br />

The simplest argument for JSE is that it lets us accommodate all three of these<br />

ideas. S can treat J just like everyone else does, i.e. as some evidence for p without<br />

either double counting or ignoring E. She can do that because she can take E to be<br />

screened off by J . That’s a rather nice feature of JSE.<br />

To be sure, it is a feature that JSE shares with a view we might call ESJ, or evidence<br />

screens judgments. That view says that S shouldn’t take J to be extra evidence for p,<br />

for while it is indeed some evidence for p, its evidential force is screened off by E.<br />

This view also allows for S to acknowledge that J has the same evidential force for<br />

her as it has for others, while also avoiding double counting. So we need some reason<br />

to prefer JSE to ESJ.<br />

One reason (and I don’t think this is what anyone would suggest is the strongest<br />

reason) is from an analogy with the fingerprint example. In that case we look for<br />

one kind of evidence, fingerprints, because it is evidence for something that is very<br />

good evidence of guilt, namely presence at the crime scene. But the thing that we are<br />

collecting fingerprint evidence for screens off the fingerprint evidence. Similarly, we<br />

might hold that we collect evidence like E because it leads to judgments like J. So the<br />

latter claim, J, should screen off E, if this analogy holds up.<br />

1.3 JSE and Disagreement<br />

My main concern in this section isn’t with any particular argument for JSE, but with<br />

the role that JSE might play in defending contemporary epistemological theories.<br />

I’m going to argue later that JSE is false, but first I’ll argue that it is significant. I’ll<br />

discuss several different ways in which JSE is implicated in contemporary work. The<br />

various theses JSE supports might not have seemed to have a lot in common, though<br />

it is notable that they have a number of proponents in common. So one of the<br />

things I’ll argue is that JSE unifies some potentially disparate strands in contemporary<br />

epistemology. The primary case I'll be interested in concerns disagreement.<br />

Here is Adam Elga’s version of the Equal Weight View of peer disagreement, from<br />

his Reflection and Disagreement.<br />

Upon finding out that an advisor disagrees, your probability that you are<br />

right should equal your prior conditional probability that you would be<br />

right. Prior to what? Prior to your thinking through the disputed issue,<br />

and finding out what the advisor thinks of it. Conditional on what? On<br />

whatever you have learned about the circumstances of the disagreement.



It is easy to see how JSE could lead to some kind of equal weight view. If your<br />

evidence that p is summed up in your judgment that p, and another person who you<br />

regard as equally likely to be right has judged that ¬ p, then you have exactly the same<br />

kind of evidence for p as against it. So you should suspend judgment about whether<br />

p is true or not. In section 2 I’ll discuss how to turn this informal idea into a full<br />

argument.<br />

But for now I want to focus on the role that JSE can play in the clause about<br />

priority. Here is one kind of situation that Elga wants to rule out. S has some<br />

evidence E that she takes to be good evidence for p. She thinks T is an epistemic<br />

peer. She then learns that T , whose evidence is also E, has concluded ¬ p. She decides,<br />

simply on that basis, that T must not be an epistemic peer, because T has got this case<br />

wrong. This decision violates the Equal Weight View, because it uses S’s probability<br />

that T is a peer after thinking through the disputed issue, not prior to this, in forming<br />

her judgment about how likely it is that she was right, i.e., how likely it is that p is<br />

true.<br />

Now at first it might seem that S isn’t doing anything wrong here. If she knows<br />

how to apply E properly, and can see that T is misapplying it, then she has good<br />

reason to think that T isn’t really an epistemic peer after all. She may have thought<br />

previously that T was a peer, indeed she may have had good reason to think that.<br />

But she now has excellent evidence, gained from thinking through this very case, to<br />

think that T is not a peer, and so not worthy of deference.<br />

Since Elga thinks that there is something wrong with this line of reasoning, there<br />

must be some way to block it. I think by far the best option for blocking it comes<br />

from ruling that E is not available evidence for S once she has made the judgment that J reports.<br />

That is, the best block available seems to me to come from JSE. For once we have JSE<br />

in place, we can say very simply what is wrong with S here. She is like the detective<br />

who says that we have lots of evidence that Sus is guilty: not only was she at the crime<br />

scene, but her fingerprints were there. To make the case more analogous, we might<br />

imagine that there are detectives with competing theories about who is guilty in this<br />

case. If we don’t know who was at the crime scene, then fingerprint evidence may<br />

favour one detective’s theory over the other. If we do know that both suspects were<br />

known to be at the crime scene, then fingerprint evidence isn’t much help to either.<br />

So I think that if JSE is true, we have an argument for Elga’s strong version of<br />

the Equal Weight View, one which holds that agents are not allowed to use the dispute<br />

at issue as evidence for or against the peerhood of another. And if JSE is not true,<br />

then there is a kind of reasoning which undermines Elga’s Equal Weight View, and<br />

which seems, to me at least, unimpeachable. So I think Elga’s version of the Equal<br />

Weight View requires JSE, and JSE is at least arguably sufficient for Elga’s version of<br />

the Equal Weight View.<br />

That last claim might look too strong in a couple of respects. On the one hand, we<br />

might worry that we could accept JSE and still reject the Equal Weight View because<br />

of epistemic partiality. Here's a way to do that. Say we thought that S should give<br />

more weight to T1's judgment than to T2's judgment if S stands in a special relationship<br />

to T1 and not to T2, even if S has no reason independent of the relationship to believe<br />

that T1 is more reliable. And say that S stands in that relationship to herself. Then S's



own judgment that p might be better evidence for her that p than a peer’s judgment.<br />

The view I’ve just sketched is a schema; it becomes more precise when we fill in<br />

what the relationship is. Ralph Wedgwood has proposed a version of this view where<br />

the special relationship is identity. Sarah Stroud has proposed a version of this view<br />

where the special relationship is friendship. For what it’s worth, I’m a little sceptical<br />

of such views, but arguing against them would take us too far away from our main<br />

goal. Instead I’ll just note that if you do like such views, you should agree with<br />

me that JSE is necessary to motivate the Equal Weight View, and disagree that it’s<br />

sufficient. Put another way, the falsity of such views is a needed extra premise to get<br />

that JSE is necessary and sufficient for the Equal Weight View.<br />

For somewhat different reasons, considering the details of JSE might make us<br />

worry that JSE is not strong enough to support a full-blooded version of the Equal<br />

Weight View. After all, JSE was restricted to the case where the agent’s judgment<br />

is rational. So all it could support is a version of the Equal Weight View restricted<br />

to agents who initially make a rational judgment. But I think this isn’t actually a<br />

problem, since we need to put some kind of restriction on Equal Weight in any case.<br />

We need to put such a restriction on because the alternative is to allow a kind of<br />

epistemic laundering.<br />

Consider an agent who makes an irrational judgment. And assume her friend,<br />

who she knows to be a peer, makes the same irrational judgment. What does the<br />

Equal Weight View say she should do? It would be bad for it to say that she should<br />

regard her and her friend as equally likely to be right, so she should keep this judgment.<br />

After all, it was irrational! There are a couple of moves the friend of the Equal<br />

Weight View can make at this point. But I think the simplest one will be to put some<br />

kind of restriction on Equal Weight. If that restriction is to agents who have initially<br />

made rational judgments, then it isn’t a problem that JSE is restricted in the same<br />

way.<br />

1.4 White on Permissiveness<br />

In his 2005 Philosophical Perspectives paper, Epistemic Permissiveness, Roger White<br />

argues that there cannot be a case where it could be epistemically rational, on evidence<br />

E, to believe p, and also rational, on the same evidence, to believe ¬ p. One of<br />

the central arguments in that paper is an analogy between two cases.<br />

Random Belief: S is given a pill which will lead to her forming a belief<br />

about p. There is a 1/2 chance it will lead to the true belief, and a 1/2<br />

chance it will lead to the false belief. S takes the pill, forms the belief, a<br />

belief that p as it turns out, and then, on reflecting on how she formed<br />

the belief, maintains that belief.<br />

Competing Rationalities: S is told, before she looks at E, that some<br />

rational people form the belief that p on the basis of E, and others form<br />

the belief that ¬ p on the basis of E. S then looks at E and, on that basis,<br />

forms the belief that p.<br />

White claims that S is no better off in the second case than in the first. As he says,



Supposing this is so, is there any advantage, from the point of view of<br />

pursuing the truth, in carefully weighing the evidence to draw a conclusion,<br />

rather than just taking a belief-inducing pill? Surely I have no better<br />

chance of forming a true belief either way.<br />

But it seems to me that there is all the advantage in the world. In the second case, S<br />

has evidence that tells on p, and in the first she does not. Indeed, I long found it<br />

hard to see how we could even think the cases are any kind of analogy. But I now<br />

think JSE holds the key to the argument.<br />

Assume that JSE is true. Then after S evaluates E, she forms a judgment, and<br />

J is the proposition that she formed that judgment. Now it might be true that E<br />

itself is good evidence for p. (The target of White’s critique says that E is also good<br />

evidence for ¬ p, but that’s not yet relevant.) But given JSE, that fact isn’t relevant<br />

to S’s current state. For her evidence is, in its entirety, J. And she knows that, as a<br />

rational agent, she could just as easily have formed some other judgment, in which<br />

case J would have been false. Indeed, she could have formed the opposite judgment.<br />

So J is no evidence at all, and she is just like the person who forms a random belief,<br />

contradicting the assumption that believing p could, in this case, be rational, and that<br />

believing ¬ p could be rational.<br />

Without JSE, I don’t see how White’s analogy holds up. There seems to be a<br />

world of difference between forming a belief via a pill, and forming a belief on the<br />

basis of the evidence, even if you know that other rational agents take the evidence to<br />

support a different conclusion. In the former case, you have violated every epistemic<br />

rule we know of. In the latter, you have reasons for your belief, you can defend it<br />

against challenges, you know how it fits with other views, you know when and why<br />

you would give it up, and so on. The analogy seems worse than useless by any of<br />

those measures.<br />

1.5 Christensen on Higher-Order Evidence<br />

Next, I’ll look at some of the arguments David Christensen brings up in his Higher-Order<br />

Evidence. Christensen imagines a case in which we are asked to do a simple<br />

logic puzzle, and are then told that we have been given a drug which decreases logical<br />

acumen in the majority of people who take it. He thinks that we have evidence<br />

against the conclusions we have drawn.<br />

Let’s consider a particular version of that, modelled on Christensen’s example of<br />

Ferdinand the bull. S knows that ∀x(Fx → Gx), and knows that ¬(Fa ∧ Ga). S<br />

then infers deductively that ¬Fa. S is then told that she’s been given a drug that<br />

dramatically impairs abilities to draw deductive conclusions. Christensen’s view is<br />

that this testimony is evidence against ¬Fa, which I assume implies that it is evidence<br />

that Fa.<br />
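The entailment S relies on here is easily checked mechanically. Restricting attention to the individual a (all the inference needs), a brute-force sweep over truth-value assignments confirms that the premises rule out Fa; this is just an illustration of the logical step, not part of Christensen's case:<br />

```python
from itertools import product

# Premises, instantiated for the individual a:
#   (1) Fa -> Ga            (from the universal premise)
#   (2) not (Fa and Ga)
# Every truth-value assignment satisfying both premises makes Fa false,
# so the premises entail that Fa is false.
models = []
for Fa, Ga in product((True, False), repeat=2):
    premise1 = (not Fa) or Ga
    premise2 = not (Fa and Ga)
    if premise1 and premise2:
        models.append((Fa, Ga))
        assert not Fa  # Fa is false in every model of the premises

# Exactly the two assignments with Fa false survive.
assert models == [(False, True), (False, False)]
```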

This looks quite surprising. S has evidence which entails that ¬Fa, and her evidence<br />

about the drug doesn’t rebut that evidence. It does, says Christensen, undermine<br />

her evidence for ¬Fa. But not because it undermines the entailment; it isn’t as if<br />

the evidence gives her reason to believe that some non-classical logic, in which this<br />

entailment does not go through, is correct. So how could it be an underminer?



Again, JSE seems to provide an answer. If S’s evidence that ¬Fa is ultimately just<br />

her judgment that it is entailed by her other evidence, and that judgment is revealed to<br />

be unreliable because of her recent medication, then S does lose evidence that ¬Fa.<br />

But if we thought the original evidence, i.e., ∀x(Fx → Gx) and ¬(Fa ∧ Ga), was<br />

still available to S, then there is a good reason to say that her evidence conclusively<br />

establishes that ¬Fa.<br />

Note that I’m not saying here that Christensen argues from JSE to his conclusion.<br />

Rather, I’m arguing that JSE delivers the conclusion Christensen wants, and without<br />

JSE there seems to be a fatal flaw in his argument. So Christensen’s view needs JSE<br />

as well.<br />

1.6 Egan and Elga on Self-Confidence<br />

Finally, I’ll look at some conclusions that Andy Egan and Adam Elga draw about self-confidence<br />

in their paper I Can’t Believe I’m Stupid. I think many of the conclusions<br />

they draw in that paper rely on JSE, but I’ll focus just on the most prominent use of<br />

JSE in the paper.<br />

One of the authors of this paper has horrible navigational instincts. When<br />

this author—call him “AE”—has to make a close judgment call as to which<br />

of two roads to take, he tends to take the wrong road. If it were just AE’s<br />

first instincts that were mistaken, this would be no handicap. Approaching<br />

an intersection, AE would simply check which way he is initially<br />

inclined to go, and then go the opposite way. Unfortunately, it is not<br />

merely AE’s first instincts that go wrong: it is his all things considered<br />

judgments. As a result, his worse-than-chance navigational performance<br />

persists, despite his full awareness of it. For example, he tends to take the<br />

wrong road, even when he second-guesses himself by choosing against<br />

his initial inclinations.<br />

Now: AE faces an unfamiliar intersection. What should he believe about<br />

which turn is correct, given the anti-reliability of his all-things-considered<br />

judgments? Answer: AE should suspend judgment. For that is the only<br />

stable state of belief available to him, since any other state undermines<br />

itself. For example, if AE were at all confident that he should turn left,<br />

that confidence would itself be evidence that he should not turn left. In<br />

other words, AE should realize that, were he to form strong navigational<br />

opinions, those opinions would tend to be mistaken. Realizing this, he<br />

should refrain from forming strong navigational opinions (and should<br />

outsource his navigational decision-making to someone else whenever<br />

possible).<br />

I think that this reasoning goes through iff JSE is assumed. I’ll argue for this by first<br />

showing how the reasoning could fail without JSE, and then showing how JSE could<br />

fix the argument.<br />

Start with a slightly different case. I am trying to find out whether p, where this<br />

is something I know little about. I ask ten people whether p is true, each of them



being someone I have good reason to believe is an expert. The experts have a chance<br />

to consult before talking to me, so each of them knows what the others will advise.<br />

Nine of them confidently assure me that p is true. The tenth is somewhat equivocal,<br />

but says that he suspects it is not, although he cannot offer any reasons for this that<br />

the other nine have not considered. It seems plausible in such a case that I should, or<br />

at least may, simply accept the supermajority’s verdict, and believe p.<br />

Now change the case a little. The first nine are experts, but the tenth is something<br />

of an anti-expert. He is wrong considerably more often than not on these matters.<br />

Again, the first nine confidently assert that p. In this case, the tenth is equally confident<br />

that p. My epistemic situation looks much like it did in the previous paragraph.<br />

I have a lot of evidence for p, and a little evidence against it. The evidence against<br />

has changed a little; it is now the confident verdict of a sometimes anti-expert, rather<br />

than the equivocal anti-verdict of an expert, but this doesn’t look like a big difference<br />

all-things-considered. So I still should, or at least may, believe p.<br />

Now make one final change. I am the tenth person consulted. I ask the first nine<br />

people, who of course all know each other’s work, and they all say p. I know that<br />

I have a tendency to make a wrong judgment in this type of situation – even when<br />

I’ve had a chance to consult with experts. (Perhaps p is the proposition that the right<br />

road is to the left, and I am AE, for example. It does require some amount of hubris<br />

to continue to be an anti-expert even once you know you are one, and the judgments<br />

are made in the presence of expert advice. But I don’t think positing delusionally<br />

narcissistic agents makes the case unrealistic.) After listening to the experts, I judge<br />

that p. This is some evidence that ¬ p, since I’m an anti-expert. But, as in the last two<br />

paragraphs, it doesn’t seem that it should override all the other evidence I have. So,<br />

even if I know that I’m in general fairly anti-reliable on questions like p, I need not<br />

suspend judgment. On those (presumably rare) occasions where my judgment tracks<br />

the evidence, I should keep it, even once I acknowledge I have made the judgment.<br />

The previous paragraph assumed that JSE did not hold. It assumed that I could<br />

still rely on the nine experts, even once I’d incorporated their testimony into a judgment.<br />

That’s what JSE denies. According to JSE, the arguments of the previous<br />

paragraph rely on illicitly basing belief on screened-off evidence. That’s bad. If JSE<br />

holds, then once I make a judgment, it’s all the evidence I have. Now assume JSE is<br />

true, and that I know myself to be something of an anti-expert. Then any judgment<br />

I make is fatally self-undermining, just like Egan and Elga say. When I make a judgment,<br />

I not only have evidence it is false, I have undefeated evidence it is false. So if<br />

I know I’m an anti-expert, I must suspend judgment. That’s the conclusion Egan and<br />

Elga draw, and it seems to be the right conclusion iff JSE is true. So the argument<br />

here relies on JSE.<br />

2 Why JSE matters<br />

I’ve argued that Adam Elga’s version of the Equal Weight View of disagreement,<br />

Roger White’s view of permissiveness, David Christensen’s view of higher-order evidence,<br />

and Andy Egan and Adam Elga’s view of self-confidence, all stand or fall with



JSE. Not surprisingly, Christensen also has a version of the Equal Weight View of disagreement,<br />

and, as Tom Kelly notes in his Peer Disagreement and Higher-Order Evidence,<br />

there is a strong correlation between holding the Equal Weight View, and rejecting<br />

epistemic permissiveness. So I don’t think it is a coincidence that these views stand<br />

or fall with JSE. Rather, I think JSE is a common thread to the important work done<br />

by these epistemologists on disagreement, permissiveness and higher-order evidence.<br />

This isn’t surprising; in fact it is hard to motivate these theories, especially the<br />

Equal Weight View, without JSE. (The arguments about permissiveness are a little<br />

distinct from the arguments about disagreement and higher-order evidence, since<br />

there I’m only responding to one kind of argument against permissiveness, and there<br />

may be other arguments against permissiveness that have a very different structure,<br />

and hence are not connected to JSE.) I already noted that there’s a very strong response<br />

to the Equal Weight View available if JSE is false. But even if you didn’t<br />

like that response, without JSE the Equal Weight View doesn’t really seem to<br />

have much motivation at all. Let’s consider what happens in a situation of peer disagreement<br />

without JSE, remembering that we argued earlier that JSE was sufficient<br />

to ground the Equal Weight View.<br />

In a typical situation of peer disagreement, agent A has evidence E, and on that<br />

basis comes to a judgment that p. (Perhaps she isn’t so decisive, but we’ll work with<br />

this kind of case for simplicity.) As always, let J be that that judgment was made.<br />

And let’s assume that it was a rational judgment to make on the basis of E, and that A<br />

knows both that she’s made the judgment and that she’s a generally rational person.<br />

Then B, a peer of A, makes a conflicting judgment, say that ¬ p, and A comes to know<br />

about this. What should A do?<br />

If JSE is false, then A has two pieces of evidence in favour of p. She has E, and she<br />

has the fact that she, a generally rational person, judged p. And she has one piece of<br />

evidence against p, namely the fact that B made a conflicting judgment ¬ p. Two of<br />

these pieces of evidence look like they might cancel each other out, namely the two<br />

judgments. But there’s one more piece of evidence available for p, namely E. If JSE<br />

is false, that means A has a reason to be more confident in p than in ¬ p. And that<br />

means that the Equal Weight View is false, since the Equal Weight View says that A<br />

has no reason to be more confident in p than in ¬ p.<br />

The argument here is intended to complement the earlier discussion of disagreement<br />

with and without JSE. Earlier I argued that without JSE agents can use the fact<br />

of disagreement to conclude that someone who seemed to be a peer is not in fact a<br />

peer. If that’s right, then they might use E to simply not change their mind at all.<br />

Here I’m arguing that even if they accept that the other person is a peer, without JSE<br />

that’s compatible with not being ‘balanced’ between p and ¬ p. Surprisingly, this is<br />

so even if they give their own judgment and their peer’s judgment ‘equal weight’; the<br />

tie-breaker is E itself, which is not a judgment.<br />

To be sure, on this line of reasoning, it isn’t obvious that A should stay firmly<br />

wedded to p. There’s a difference between having one uncontested reason to judge<br />

p and having two reasons to judge p and one to judge ¬ p. In the latter case,<br />

perhaps it is reasonable to have a more tentative attitude towards p. But what seems<br />

to come through clearly is that with JSE we have a strong argument for the Equal



Weight View, and without JSE the Equal Weight View is not motivated, while an<br />

objection to it is motivated. So the JSE is tied very closely to the Equal Weight View.<br />

2.1 Varieties of Defeaters<br />

It might be worried that the argument of the previous section ignores the distinction<br />

between rebutting and undercutting defeaters. The proponent of the Equal Weight<br />

View might hold that in the situation I’ve described, B’s judgment ¬ p is both a rebutting<br />

defeater with respect to A’s judgment p, since it directly provides reason for<br />

an alternative judgment, and an undercutting defeater with respect to the support E<br />

provides for p, since it provides reason to think that E does not really support p. If<br />

that is the right way to think about things, then it is misleading to simply count up<br />

the two reasons in favour of p versus one reason against, since A’s judgment and B’s<br />

judgment have different effects.<br />

I think the right thing to do at this point is to get a little clearer on just what exactly<br />

undercutting defeaters are supposed to do. In general, an undercutting defeater<br />

purports to tell us that some evidence E is not good evidence for a conclusion H.<br />

This suggests a picture of how undercutting defeaters work. If D undercuts the inference<br />

from E to H, but does not rebut that inference, that’s because (a) D is a rebutting<br />

defeater for the agent’s previous belief that E supports H, and (b) the inference from<br />

E to H is reasonable only if that belief is not sufficiently rebutted.<br />

It’s important to draw a distinction between first-order and second-order justification<br />

here. The picture I’m suggesting is that undercutting defeaters are in the first<br />

instance rebutting defeaters for the claim that the agent is justified in believing H.<br />

That is, when the agent gets an undercutting defeater, the first thing that happens<br />

is that the second-order claim that the agent is justified in believing she is justified in<br />

believing H becomes false. What effect does this have on the first-order claim that<br />

the agent is justified in believing H? That, I’m going to argue, depends on the nature<br />

of the relationship between E and H.<br />

In normal cases, if you aren’t justified in believing that E supports H, then you<br />

aren’t justified in inferring H from E. If I don’t know that a particular kitchen scale<br />

is reliable, and so don’t know whether an agent is justified in believing what it says,<br />

then plausibly I’m not justified in inferring p from the fact that it says that p.<br />

But that’s not because there’s a general rule that first-order justification, in this case<br />

believing p, requires second-order justification, in this case being justified in believing<br />

I’m justified in believing p. Rather, it’s because for non-basic inferential methods, we<br />

are only justified in using them if we are justified in believing they are sound. Regress<br />

arguments, of the kind Lewis Carroll alludes to in “What the Tortoise Said to Achilles”, suggest<br />

that can’t be right for basic inferential methods.<br />

The regress arguments suggest that at some level we must have unjustified justifiers.<br />

One natural place to find such things, indeed the place Carroll suggests they<br />

are located, is in basic inferential rules. For instance, if classical logic is the correct<br />

logic, then plausibly the rule that says an agent is justified in inferring H from ¬¬H<br />

is basic. And by this, I mean that an agent can infer H from ¬¬H even if she has no<br />

antecedent reason to believe that this inference is justified.



Now we have to be a little careful here. What the regress arguments show is that<br />

we need unjustified justifiers. They don’t show that we need indefeasible justifiers. So<br />

even if for our agent the inference from ¬¬H to H is basic, that doesn’t mean it can’t<br />

be defeated by some evidence. There’s a useful comparison to be made here to James<br />

Pryor’s dogmatist theory of perception. Pryor says that we can trust our senses in<br />

the absence of reason to believe that they are reliable. But he doesn’t say, and it would<br />

be a somewhat implausible thing to say, that we can trust our senses in the presence<br />

of reason to believe they are unreliable. A kind of dogmatism about basic inference,<br />

one that says basic inferences do not need to be justified but can be defeated, seems<br />

like a plausible and natural parallel.<br />

This theory about basic inferences has some consequences for the theory of undercutting defeaters. Assume that the inference from E to H is basic. And assume that an agent who has E, and who had previously inferred H, gets a relatively weak undercutting defeater for this inference. I claim that in some such cases, she might still be justified in believing H, because the original inference was basic. What I mean by a relatively weak undercutting defeater is that when the agent gets the defeater, she is no longer justified in believing that she is justified in inferring H from E, but nor is she justified in believing she is unjustified in inferring H from E. Now if the inference from E to H can only be properly made by someone who knows it is a good inference, then the agent can’t properly make that inference. So in that case the undercutting defeater does defeat the agent’s justification. But I’ve argued that this extra premise, about the inference requiring that the agent know it is a good one, is not true when the inference is basic. So if the inference from E to H is basic, and the agent gets this undercutting defeater, then she is still justified in making the inference. So she is justified in believing H. But she is no longer justified in believing that she has got to H by a justified route, so she is not justified in believing that she is justified in believing H.

I think that at least some of the time, that’s what happens in disagreement. If we infer p from some evidence, and our friend refuses to make this inference, that undercuts our belief in p. That is, it rebuts our prior belief that p is supported by this evidence. In most cases, that will be enough to make our belief in p unjustified. But at least some of the time, we can infer p from some evidence even if we are not justified in believing p is supported by that evidence. In those cases, the Equal Weight View is unmotivated.

The treatment of undercutting defeaters I’ve just offered, as rebutting defeaters of claims about justification, assumes that what we should believe can come apart from what we should believe that we should believe. Some people might find that assumption intolerable. So the rest of this section is devoted to arguing that it is true.

2.2 Misleading Evidence in Ethics and Epistemology

Among the many things we don’t know are many truths about ethics. The existence of many intelligent philosophers promoting false theories doesn’t help much.

¹ I’m grateful to several people, including Shamik Dasgupta, Thomas Kelly and Ted Sider, for pointing out how I was not being quite so careful in an earlier draft of this paper.



(Since ethicists disagree with one another, I’m pretty sure many of them are saying false things!) There have been several recent works on what the moral significance of moral uncertainty is (e.g., Sepielli, Lockhart). I’m inclined to think that it isn’t very great, although moral uncertainty might have some very odd epistemological consequences in cases like the following.

Kantians: Frances believes that lying is morally permissible when the purpose of the lie is to prevent the recipient of the lie performing a seriously immoral act. In fact she’s correct; if you know that someone will commit a seriously immoral act unless you lie, then you should lie. Unfortunately, this belief of Frances’s is subsequently undermined when she goes to university and takes courses from brilliant Kantian professors. Frances knows that the reasons her professors advance for the immorality of lying are much stronger than the reasons she can advance for her earlier moral beliefs. After one particularly brilliant lecture, Frances is at home when a man comes to the door with a large axe. He says he is looking for Frances’s flatmate, and plans to kill him, and asks Frances where her flatmate is. If Frances says, “He’s at the police station across the road”, the axeman will head over there, and be arrested. But that would be a lie. Saying anything else, or saying nothing at all, will put her flatmate at great risk, since in fact he’s hiding under a desk six feet behind Frances. What should she do?

That’s an easy one! The text says that if someone will commit a seriously immoral act unless you lie, you should lie. So Frances should lie. The trickier question is what she should believe. I think she should believe that she’d be doing the wrong thing if she lies. After all, she has excellent evidence for that, from the testimony of the ethical experts, and she doesn’t have compelling defeaters for that testimony. So she should do something that she believes, and should believe, is wrong. That’s OK; by hypothesis her Kantian professors are wrong about what’s right and wrong.

So I think the immediate questions about what Frances should do and believe are easy. But the result is that Frances is in a certain way incoherent. For her to be as she should, she must do something she believes is wrong. That is, she should do something even though she should believe that she should not do it. So I conclude that it is possible that sometimes what we should do is the opposite of what we should believe we should do.

If we’ve gone this far, we might be tempted by an analogy between ethics and epistemology. The conclusion of this analogy is that sometimes what we should believe is different from what we should believe that we should believe. This is what would follow if we believe the conclusion of the previous paragraph, and we think that whatever is true of action is (or at least probably is) true of belief as well.

It might be objected here that there is a big disanalogy between ethics and epistemology. The objector says that since actions are voluntary, but beliefs are involuntary, we shouldn’t expect the norms for the two to be the same. I think this objection fails twice over. I’ve argued elsewhere that beliefs are much more like actions in terms of their normative status than is usually acknowledged (Deontology and Descartes’ Demon). But I won’t rely on that here, because I think the simpler point to notice is that if there is a disanalogy between ethics and epistemology, it works in favour of the position I’m taking.

I’m trying to argue that it’s possible to justifiably believe p on the basis of E even though you don’t have reason to believe that E supports p, indeed even if you have reason to believe that E does not support p. That is, I’m trying to argue that which first-order beliefs it is rational to have can come apart from which beliefs it is rational to have about which first-order beliefs it is rational to have. And I’ve argued so far that which actions you have most reason to do can come apart from which beliefs it is rational to have about which actions you have most reason to do.

Now if there’s a reason that these ‘first-order’ and ‘second-order’ states should line up, it’s presumably because you should, on reflection, try to get your relations to the world into some kind of broad coherence. But that’s a kind of consideration that applies much more strongly to things that we do after considered reflection, like lie to the murderer, than it does to things we do involuntarily, like form beliefs.² Put another way, when we are considering the norms applicable to involuntary bodily movements, we generally settle on fairly externalist considerations. A good digestive system, for instance, is simply one that digests food well, not one that digests in accord with reason, or in a coherent manner given other bodily movements, or anything of the sort. So if beliefs are involuntary, we should think they are to be judged more on how they line up with reality, and less on how well they cohere with our broader worldview. But that’s to say we should care less about coherence between beliefs and beliefs about doxastic normativity than we do about coherence between action and beliefs about norms of action. So if there’s a disanalogy between ethics and epistemology here, it’s one that makes my position stronger, not weaker.

There is one disanalogy between the ethical case and the epistemological case that’s a little harmful to my case, though I don’t think it’s a particularly strong disanalogy. In both cases there are two modal operators. In the epistemological case, they are both epistemic modals: I claim the agent can be justified in believing p although she’s not justified in believing she’s justified in believing p. In the moral case, one is deontic and one is epistemic: I claim the agent can be justified in doing φ even though she’s not justified in believing she’s justified in doing φ. This doesn’t seem like a particularly serious difference between the cases to me, but if it does to you, you’ll be less impressed by this argument from analogy.

The upshot of this is that I think that the argument by analogy is a good argument for my preferred way of treating undermining defeaters. But even if you think the argument can’t bear all that weight, the analogy does help with deflecting some arguments against the position I’m defending. The arguments I have in mind are those proffered by Richard Feldman in “Respecting the Evidence”. He thinks that we can’t rationally be in a position where we believe p on the basis of E, and we know E is
² As I mentioned earlier, I don’t really think this is the right picture of belief formation, but I’m taking it on board for the sake of considering this objection.



our basis for p, and yet we have good (if misleading) grounds for believing E does not support p. Here’s his setup of the debate.

Consider a person who has some evidence, E, concerning a proposition, p, and also has some evidence about whether E is good evidence for p. I will say that the person is respecting the evidence about E and p when the person’s belief concerning p corresponds to what is indicated by the person’s evidence about E’s support for p. That is, a person respects the evidence about E and p by believing p when his or her evidence indicates that this evidence supports p or by not believing p when the evidence indicates that this evidence does not support p. (Feldman 2005: 95-6)

Feldman goes on to argue that we should respect the evidence. Obviously I disagree. If E is all the agent’s evidence, then the agent’s attitude towards p should be determined by how strongly E supports p. One case that strongly suggests this is when the agent has no evidence whatsoever about how E supports p, so the agent’s evidence indicates nothing about E’s support for p. If that implies that the agent can’t have any attitudes about p, then we are led quickly into an unpleasant regress.

But the main point I want to make here concerns the arguments that Feldman makes for respecting the evidence. It is true that sometimes disrespecting the evidence has counterintuitive consequences. See, for example, Feldman’s example of the baseball website (Feldman 2005: 112) or, for that matter, the discussion of kitchen scales in the previous subsection. But other cases might point the other way. Is it really intuitive that the student can’t soundly infer p from ¬¬p after hearing some lectures from a talented but mistaken intuitionistic logician? I think intuition is sufficiently confused here that we should rely on other evidence. Feldman’s primary argument concerns the oddity of the position I’m defending. In his taxonomy of views, ‘View 1’ is the view that when an agent has evidence E, which is in fact good evidence for p, and gets misleading evidence that E does not in fact support p, then she can know p, but not know that she knows p. That’s what I think happens at least some of the time, at least when the connection between E and p is basic, so his arguments are meant to tell against my position.

View 1... has the implication that a person could be in a situation in which she justifiably denies or suspends judgment about whether her basis for believing a proposition is a good one, but nevertheless justifiably believes the proposition. Imagine such a person reporting her situation: “p, but of course I have no idea whether my evidence for p is any good.” At the very least, this sounds odd. (Feldman 2005: 105)

He later goes into more detail on this point.

View 1 leads to the conclusion that our student can correctly believe things such as

6. T, but my overall evidence does not support T.

... As noted, (6) seems odd. No doubt things like (6) can be true. And it may be that (6) does not have quite the paradoxical air that “T, but I don’t believe T” has. Still, when the student acknowledges in the second conjunct that her evidence does not support T, she says that she does not have good reason to assert the first conjunct. So, if she is right about the second conjunct, her evidence does not support the first conjunct. Thus, if reasonable belief requires evidential support, it is impossible for the student’s belief in (6) to be both true and reasonable. And, if knowledge requires truth and reasonable belief, it also follows that she cannot know (6). While it does not follow that belief in (6) cannot be reasonable, it does make it peculiar. One wonders what circumstances could make belief in it reasonable. (Feldman 2005: 108-9)

The last line is easy to reply to. The circumstances that could make it reasonable are ones where the agent has good evidence that T, but also has misleading evidence about the quality of her evidence.

The more substantial point concerns the reasonableness of believing (6). As Feldman notes, the issue is not whether the subject could truly believe (6). In the circumstances we’re interested in, the second conjunct of (6) is false, although well supported by evidence. So the issue is just whether it could be a justified belief. To make the argument that it could not be justified work, Feldman needs to thread a very tight normative needle. If we say that a belief must actually constitute knowledge to be justified, then my view does not have the consequence that an agent could justifiably believe (6). That’s simply because the second conjunct is false. On the other hand, if justification does not require knowability, then it isn’t clear why the belief in (6) is not justified. By hypothesis, each conjunct is well supported by the evidence. So we only get a problem if justification requires knowability, but not knowledge. And there’s no reason, I think, to hold just that position; it makes the relation between justification and knowledge just too odd.

And while there is something disarming about the conjunction, reflection on cases like Kantians should remind us that there’s nothing too odd about the two conjuncts. Frances should lie to the murderer at the door, and she should believe that she has most reason to not lie to the murderer at the door. That doesn’t seem too different from it being the case that she should believe p, and she should believe that she has most reason to not believe p. And, assuming a natural connection between evidence and reason, that in turn isn’t very different from it being the case that she should believe p, and she should believe that her evidence does not support p. Given that cases like Frances’ exist, it isn’t odd that someone could be in a position to reasonably believe each conjunct of (6). So, contra Feldman, it isn’t odd that they could be in a position to reasonably believe (6).

So I conclude that undercutting defeaters give us reason to think our evidence does not support a particular hypothesis, and this sometimes, but not all the time, gives us reason to lower our confidence in that hypothesis. And as I’ve argued above, on that picture of how undercutting defeaters work, there is no JSE-free argument for the Equal Weight View. Indeed, if that’s how undercutting works, then if JSE is not assumed, it seems the Equal Weight View really requires that the peer’s judgment do double duty, both overriding our judgment and overriding the initial evidence. And that’s not a very intuitive picture.

3 JSE and Practical Action

As we noted earlier, Christensen’s position on higher-order evidence is closely related to JSE. Indeed, some of the examples he uses to motivate his claim about higher-order evidence seem also to provide motivation for JSE. In a number of cases that he offers, evidence about the circumstances in which a judgment is made provides reason, he says, for the judger to lower her confidence in a certain proposition, even if that initial judgment was correct. I’m going to argue that if Christensen is right, we should also expect to find cases where evidence about the circumstances in which the judgment is made provides reason for the judger to raise her confidence in a certain proposition, even if once again that initial judgment was correct. And, I’ll argue, that’s not what investigation of the cases reveals. But first let’s look at two of Christensen’s examples. (I’ve slightly changed the numbers in the second case.)

Sleepy Hospital: I’m a medical resident who diagnoses patients and prescribes appropriate treatment. After diagnosing a particular patient’s condition and prescribing certain medications, I’m informed by a nurse that I’ve been awake for 36 hours. I reduce my confidence in my diagnosis and prescription, pending a careful recheck of my thinking.

Tipping: My friend and I have been going out to dinner for many years. We always tip 20% and divide the bill equally, and we always do the math in our heads. We’re quite accurate, but on those occasions on which we’ve disagreed in the past, we’ve been right equally often. This evening seems typical, in that I don’t feel unusually tired or alert, and neither my friend nor I have had more wine or coffee than usual. I get $42 in my mental calculation, and become quite confident of this answer. But then my friend says she got $45. I dramatically reduce my confidence that $42 is the right answer, and dramatically increase my confidence that $45 is correct, to the point that I have roughly equal confidence in each of the two answers.
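To fix ideas, the mental arithmetic at issue in Tipping can be spelled out in a quick sketch (the $70 bill figure matches the worked example in the text; the function name is my own):

```python
def share_with_tip(bill, people=2, tip_rate=0.20):
    """Each diner's share: split the bill equally, then add the tip."""
    return (bill / people) * (1 + tip_rate)

# On a $70 bill split two ways with a 20% tip, the narrator's
# calculation comes out at $35 + $7 = $42 per person:
print(share_with_tip(70))  # 42.0
```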

Christensen thinks the things the narrator does in each case are, all things considered, the right thing to do. We should note to start with that there’s something a little odd about this. This is easiest to see in Tipping. Let’s say the bill was $70. Then the narrator’s share was $70 ÷ 2, i.e. $35, plus 20% for the tip, so plus $7, so it is $42. The narrator is competent with simple arithmetic, so he has access to all of this evidence, and not believing something that is clearly entailed by one’s evidence is bad, especially when you are trying to figure out whether it is true and have worked through the computation. But Christensen thinks it is even worse to, immodestly, take oneself to be the one who is correct in a dispute like this. His primary motivation for this, I think, comes from the following variant on Sleepy Hospital.



Or consider the variant where my conclusion concerns the drug dosage for a critical patient, and ask yourself if it would be morally acceptable for me to write the prescription without getting someone else to corroborate my judgment. Insofar as I’m morally obliged to corroborate, it’s because the information about my being drugged should lower my confidence in my conclusion. (Christensen, pg 11)

I think the last line here isn’t correct. Indeed, I think the last line reveals a crucial premise connecting belief and action that is at the heart of his argument, and which is mistaken. In the example, the medical evidence suggests that the prescription should be, let’s say, 100µg, and that’s what the narrator at first intends to prescribe. But the narrator also has evidence that he’s unreliable at the moment, since he’s been awake so long. Christensen thinks that this evidence is evidence against the claim that the prescription should be 100µg, and that the narrator should not believe that the prescription should be 100µg. The alternative view, the one I ultimately want to defend, is that the narrator should believe that the prescription should be 100µg, although he shouldn’t, perhaps, believe that he should believe that. The second conjunct holds because he has good reason to think that his actual judgment is clouded, given how long he has been awake.

Christensen’s argument against my position, as I understand it, is that if that were so, then the narrator should make the 100µg prescription. But I think that relies on too simplistic an understanding of the relation between norms of belief and norms of action. It might be that given the duty of care a doctor has, he must not only know, but know that he knows, that the drug being prescribed is appropriate before the prescription is made. I think it’s crucial here that the Hippocratic Oath puts different requirements on actions by doctors than it puts on omissions. After all, the oath is “Do no harm”. A natural conclusion to draw from that is that when there is a chance to double-check before acting, the doctor should act only if both the medical evidence, and the evidence about the doctor’s own reliability, point in the direction of acting. Here’s a case that supports that explanation of the case.

Cautious Hospital: A doctor has been on duty for 12 hours. In the world of the story, at that stage in a shift, doctors are typically excessively cautious about their diagnosis. The initial feelings of drowsiness cause them to second-guess themselves even though they are capable of making reliable confident judgments. Helen, a doctor, knows these facts, and has been on duty for 12 hours. Helen is in fact immune to this general tendency of over-caution, though she does not have any prior reason to believe this. She looks at the symptoms of a patient who is in some discomfort, and concludes that probably he should be given 100µg of drug X, although more tests would confirm whether this is really the best action. That’s the right reaction to the medical evidence; there are realistic explanations of the symptoms according to which 100µg of X would be harmful, and the tests Helen considers would rule out these explanations. Had she only just come on duty, she would order the tests, because the risk of harming the patient if the probably correct diagnosis is wrong is too great. But Helen now has reason to worry that if she does this, she is being excessively cautious, and is making the patient suffer unnecessarily. What should Helen do?

If Christensen is right that agents should act on ‘higher-order’ beliefs about what they should believe, then Helen should prescribe 100µg of X. She should think, “I think probably 100µg of X is the right treatment here, and people in my position are generally a little cautious, so there’s excellent reason to hold 100µg is the right treatment. So I’ll do that.” But that’s a horrible breach of good medical practice. She has great evidence that X might be harmful, and general consideration about the credence-forming practices of slightly drowsy doctors isn’t the kind of thing that could defeat that evidence.

So when the regular medical evidence, and the evidence about our cognitive capacities, point towards different actions, it isn’t that we should always do the action suggested by the evidence about our cognitive capacities. Rather, we should do the cautious action, especially if we have a duty of care to the people who will be harmed by the action should it all go awry. So Christensen doesn’t have an argument here for taking this ‘second-order’ evidence to be ruling when it comes to what we should, all-things-considered, do.

That has two consequences for Christensen’s argument for JSE. First, it has an undermining consequence. If what we should do depends on what we should believe and on what we should believe about what we should believe, then Christensen doesn’t have a good reason for thinking that the doctor should be less confident in Sleepy Hospital. But second, it has a rebutting consequence. If JSE were true, then plausibly the doctor in Cautious Hospital should prescribe 100µg of drug X without running more tests. After all, the underlying evidence has been screened off by the judgment, and the judgment plus the background information about the circumstance in which the judgment is made strongly suggest that this is the right prescription. Since this is not what the doctor should do, Cautious Hospital is good evidence that JSE is in fact false.

It’s worth noting too that JSE seems to lead us into probabilistic incoherence. Standard Bayesian theory says that rational agents should have credence 1 in all mathematical truths. Now consider an agent who judges that 70 × 0.6 = 42, and who gets evidence that her arithmetic judgments are unreliable. Given JSE, that suggests her credence that 70 × 0.6 = 42 should be less than 1. But that, given Bayesianism, is incoherent. Now perhaps this is just as good an argument that sometimes probabilistic incoherence is desirable. Or perhaps there’s a way to make JSE probabilistically coherent. I won’t press the point, though it seems there are challenges around here to JSE.
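Since the identity in question is exact, it can be checked directly (a trivial sketch; exact rationals are used to sidestep binary floating-point rounding):

```python
from fractions import Fraction

# "70 × 0.6 = 42" is a mathematical truth, so standard Bayesianism
# assigns it credence 1; a JSE-driven credence below 1 would violate
# the probability axioms.
print(Fraction(70) * Fraction(6, 10) == 42)  # True
```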

4 Regress Arguments

In my “Disagreeing about Disagreement”, I argued that the Equal Weight View has an uncomfortable asymmetry in how it treats different ‘levels’ of disagreement. If two peers disagree about first-order facts, it recommends that they adjust their views so as to take each other’s original position as equally likely to be true. If they disagree about how to respond to disagreement, it recommends that the one who has the incorrect view defer to the one who has the correct view.

A similar kind of level asymmetry arises with JSE. Let’s say an agent makes a judgment on the basis of E, and let J be the proposition that that judgment was made. JSE says that E is now screened off, and the agent’s evidence is just J. But with that evidence, the agent presumably makes a new judgment. Let J′ be the proposition that that judgment was made. We might ask now: does J′ sit alongside J as extra evidence, is it screened off by J, or does it screen off J? The picture behind JSE, the picture that says that judgments on the basis of some evidence screen that evidence, suggests that J′ should in turn screen J. But now it seems we have a regress on our hands. By the same token, J′′, the proposition concerning the new judgment made on the basis of J′, should screen off J′, and the proposition J′′′ about the fourth judgment made should screen off J′′, and so on. The poor agent has no unscreened evidence left! Something has gone horribly wrong.

I think this regress is ultimately fatal for JSE. But to see this, we need to work through the possible responses that a defender of JSE could make. There are really just two moves that seem viable. One is to say that the regress is not vicious, because all these judgments should agree in their content. The other is to say that the regress does not get going, because J is better evidence than J′, and perhaps screens it. The final two subsections of the paper will address these two responses.

4.1 A Virtuous Regress?

An obvious way to avoid the regress is to say that for any rational agent, any judgment they make must be such that when they add the fact that they made that judgment to their evidence (or, perhaps better given JSE, replace their evidence with the fact that they made that judgment), the rational judgment to make given the new evidence has the same content as the original judgment. So if you’re rational, and you come to believe that p is likely true, then the rational thing to believe given you’ve made that judgment is that p is likely true.

Note that this isn’t as strong a requirement as it may first seem. The requirement is not that any time an agent makes a judgment, rationality requires that they say on reflection that it is the correct judgment. Rather, the requirement is that the only judgments rational agents make are those judgments that, on reflection, they would endorse. We can think of this as a kind of ratifiability constraint on judgment, like the ratifiability constraint on decision making that Richard Jeffrey uses to handle Newcomb cases.

To be a little more precise, a judgment is ratifiable for agent S just in case the rational judgment for S to make conditional on her having made that judgment has the same content as the original judgment. The thought then is that we avoid the regress by saying rational agents always make ratifiable judgments. If the agent does do that, there isn’t much of a problem with the regress; once she gets to the first level, she has a stable view, even once she reflects on it.
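The constraint can be put schematically in code (a toy model with names of my own; `rational_response` stands in for whatever judgment rationality recommends conditional on the judgment already made):

```python
def is_ratifiable(judgment, rational_response):
    """A judgment is ratifiable iff the rational judgment to make,
    conditional on having made it, has the same content as the original."""
    return rational_response(judgment) == judgment

# A responder that endorses whatever was judged makes every judgment
# ratifiable; one that always recommends the opposite makes none so.
def flip(j):
    return j[4:] if j.startswith('not-') else 'not-' + j

print(is_ratifiable('p', lambda j: j))  # True
print(is_ratifiable('p', flip))         # False
print(is_ratifiable('not-p', flip))     # False
```

The `flip` responder previews the structure of the modified driver case discussed below, where no judgment the agent can make is ratifiable.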



It seems to me that this assumption, that only ratifiable judgments are rational, is what drives most of the arguments in Egan and Elga’s paper on self-confidence, so I don’t think this is a straw-man move. Indeed, as the comparison to Jeffrey suggests, it has some motivation behind it. Nevertheless it is false. I’ll first note one puzzling feature of the view, then one clearly false implication of the view.

The puzzling feature is that in some cases there may be nothing we can rationally do which is ratifiable. One way this can happen involves a slight modification of Egan and Elga’s example of the directionally-challenged driver. Imagine that when I’m trying to decide whether p, for any p in a certain field, I know (a) that whatever judgment I make will usually be wrong, and (b) if I conclude my deliberations without making a judgment, then p is usually true. If we also assume JSE, then it follows there is no way for me to end deliberation. If I make a judgment, I will have to retract it because of (a). But if I think of ending deliberation, then because of (b) I’ll have excellent evidence that p, and it would be irrational to ignore this evidence. (Nico Silins has used the idea that failing to make a judgment can be irrational in a number of places, and those arguments motivated this example.)

This is puzzling, but not obviously false. It is plausible that there are some epistemic<br />

dilemmas, where any position an agent takes is going to be irrational. (By that,<br />

I mean it is at least as plausible that there are epistemic dilemmas as that there are<br />

moral dilemmas, and I think the plausibility of moral dilemmas is reasonably high.)<br />

That a case like the one I’ve described in the previous paragraph is a dilemma is perhaps<br />

odd, but no reason to reject the theory.<br />

The real problem, I think, for the ratifiability proposal is that there are cases<br />

where unratifiable judgments are clearly preferable to ratifiable judgments. Assume<br />

that I’m a reasonably good judge of what’s likely to happen in baseball games, but<br />

I’m a little over-confident. And I know I’m over-confident. So the rational credence,<br />

given some evidence, is usually a little closer to 1/2 than I admit. At risk of being<br />

arbitrarily precise, let’s say that if p concerns a baseball game, and my credence in p<br />

is x, the rational credence in p, call it y, for someone with no other information than<br />

this is given by:<br />

y = x + sin(2πx)/50<br />

To give you a graphical sense of how that looks, the dark line in this graph is y, and<br />

the lighter diagonal line is y = x.
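The fixed points of this correction function are easy to verify numerically. Here is a minimal Python sketch; only the formula above is assumed, and the function and variable names are mine:<br />

```python
import math

def corrected_credence(x):
    # The toy over-confidence correction from the text: y = x + sin(2*pi*x)/50
    return x + math.sin(2 * math.pi * x) / 50

# A credence is ratifiable just in case the corrected credence equals the
# original one, i.e. x is a fixed point of the correction function.
for x in [0.0, 0.25, 0.5, 0.75, 1.0]:
    y = corrected_credence(x)
    print(f"x = {x:.2f}, y = {y:.4f}, ratifiable: {abs(y - x) < 1e-9}")
```

Only 0, 1/2 and 1 come back as fixed points; every other credence gets nudged, which is why the ratifiability constraint collapses all credences about baseball games to those three values.<br />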



Note that the two lines intersect at three points: (0,0), (1/2,1/2) and (1,1). So if my<br />

credence in p is either 0, 1/2 or 1, then my judgment is ratifiable. Otherwise, it is not.<br />

So the ratifiability constraint says that for any p about a baseball game, my credence<br />

in p should be either 0, 1/2 or 1. But that’s crazy. It’s easy to imagine that I know<br />

(a) that in a particular game, the home team is much stronger than the away team,<br />

(b) that the stronger team usually, but far from always, wins baseball games, and (c)<br />

I’m systematically a little over-confident about my judgments about baseball games,<br />

in the way just described. In such a case, my credence that the home team will win<br />

should be high, but less than 1. That’s just what the ratificationist denies is possible.<br />

This kind of case proves that it isn’t always rational to have ratifiable credences.<br />

It would take us too far afield to discuss this in detail, but it is interesting to think<br />

about the comparison between the kind of case I just discussed, and the objections to<br />

backwards induction reasoning in decision problems that have been made by Pettit<br />

and Sugden, and by Stalnaker. The backwards induction reasoning they criticise is, I<br />

think, a development of the idea that decisions should be ratifiable. And the clearest<br />

examples of when that reasoning fails concern cases where there is a unique ratifiable<br />

decision, and it is guaranteed to be one of the worst possible outcomes. The example<br />

I described in the last few paragraphs has, quite intentionally, a similar structure.<br />

4.2 A Privileged Stopping Point<br />

The other way to avoid the regress is to say that there is something special about<br />

the first level. So although J screens E, it isn’t the case that J′ screens J. That way,<br />

the regress doesn’t start. This kind of move is structurally like the move Adam Elga<br />

makes in How to Disagree about How to Disagree, where he argues that we should<br />

adjust our views about first-order matters in (partial) deference to our peers, but we<br />

shouldn’t adjust our views about the right response to disagreement in this way.<br />

It’s hard to see what could motivate such a position, either about disagreement<br />

or about screening. It’s true that we need some kind of stopping point to avoid these



regresses. But the most natural stopping point is the very first level. Consider a toy<br />

example. It’s common knowledge that there are two apples and two oranges in the<br />

basket, and no other fruit. (And that no apple is an orange.) Two people disagree<br />

about how many pieces of fruit there are in the basket. A thinks there are four, B<br />

thinks there are five, and both of them are equally confident. Two other people, C<br />

and D, disagree about what A and B should do in the face of this disagreement. All<br />

four people regard each other as peers. Let’s say C’s position is the correct one (whatever<br />

that is) and D’s position is incorrect. Elga’s position is that A should partially<br />

defer to B, but C should not defer to D. This is, intuitively, just back to front. A<br />

has evidence that immediately and obviously entails the correctness of her position.<br />

C is making a complicated judgment about a philosophical question where there are<br />

plausible and intricate arguments on each side. The position C is in is much more<br />

like the kind of case where experience suggests a measure of modesty and deference<br />

can lead us away from foolish errors. If anyone should be sticking to their guns here,<br />

it is A, not C.<br />

The same thing happens when it comes to screening. Let’s say that A has some<br />

evidence that (a) she has made some mistakes on simple sums in the past, but (b)<br />

tends to massively over-estimate the likelihood that she’s made a mistake on any given<br />

puzzle. What should she do? One option, in my view the correct one, is that she<br />

should believe that there are four pieces of fruit in the basket, because that’s what the<br />

evidence obviously entails. Another option is that she should be not very confident<br />

there are four pieces of fruit in the basket, because she makes mistakes on these kinds<br />

of sums. Yet another option is that she should be pretty confident (if not completely<br />

certain) that there are four pieces of fruit in the basket, because if she were not very<br />

confident about this, this would just be a manifestation of her over-estimation of her<br />

tendency to err. The ‘solution’ to the regress we’re considering here says that the<br />

second of these three reactions is the uniquely rational reaction. The idea behind the<br />

solution is that we should respond to the evidence provided by first-order judgments,<br />

and correct that judgment for our known biases, but that we shouldn’t in turn correct<br />

for the flaws in our self-correcting routine. I don’t see what could motivate such<br />

a position. Either we just rationally respond to the evidence, and in this case just<br />

believe there are four pieces of fruit in the basket, or we keep correcting for errors<br />

we make in any judgment. It’s true that the latter plan leads either to regress or to<br />

the kind of ratificationism we dismissed in the previous subsection. But that’s not<br />

because the disjunction is false, it’s because the first disjunct is true.<br />

5 Conclusion<br />

We started with five related theses that are all closely tied to JSE. These are:<br />

• The Equal Weight View of disagreement;<br />

• Roger White’s anti-permissiveness epistemology;<br />

• David Christensen’s conception of self-information as ‘higher-order evidence’;<br />

• Andy Egan and Elga’s view of the effects of knowledge of self-unreliability; and<br />

• Richard Feldman’s view that we should ‘Respect the evidence’.



Not all of these views depend on JSE, but they’re all supported in ways that I think<br />

rely on JSE. And some of them, most clearly the first, seem to be false if JSE is false.<br />

So JSE is deeply implicated in an important branch of contemporary epistemology.<br />

Yet JSE is false, as the last three sections have shown. This suggests that many of the<br />

positions on this branch have to be rethought.<br />

The failure of JSE suggests a kind of externalism, though not of the traditional<br />

kind. It does not suggest, or at least does not require, that evidence be individuated<br />

in ways in principle inaccessible to the agent. It does not suggest, or at least does<br />

not require, that the force of evidence be determined by contingent matters, such as<br />

the correlation between evidence of this type and various hypotheses. But it does<br />

suggest that there are facts about which hypotheses are supported by which pieces<br />

of evidence, and that rational agents do well when they respond to these epistemic<br />

facts. Moreover, it suggests these facts retain their normative significance even if the<br />

agent has reason to believe that she’s made a mistake in following them. That is, if<br />

an agent’s judgment conforms to the correct norms of judgment, then even if she has<br />

evidence that she is not good at judging, she should stick to her judgment. In such a<br />

case she could not defend her judgment without appeal to the evidence that judgment<br />

is based on. But that’s not a bad position to be in; judgments should be defensible by<br />

appeal to the evidence they’re based on.


Easy Knowledge and Other Epistemic Virtues<br />

This paper has three aims. First, I’ll argue that there’s no good reason to accept any<br />

kind of ‘easy knowledge’ objection to externalist foundationalism. It might be a little<br />

surprising that we can come to know that our perception is accurate by using our<br />

perception, but any attempt to argue this is impossible seems to rest on either false<br />

premises or fallacious reasoning. Second, there is something defective about using<br />

our perception to test whether our perception is working. What this reveals is that<br />

there are things we aim for in testing other than knowing that the device being tested<br />

is working. I’ll suggest that testing aims for sensitive knowledge that the device is<br />

working. Testing a device, such as our perceptual system, by using its own outputs<br />

may deliver knowledge, but it can’t deliver sensitive knowledge. So it’s a bad way to<br />

test the system. The big conclusion here is that sensitivity is an important epistemic<br />

virtue, although it is not necessary for knowledge. Third, I’ll argue that the idea<br />

that sensitivity is an epistemic virtue can provide a solution to a tricky puzzle about<br />

inductive evidence. This provides another reason for thinking that the conclusion of<br />

section two is correct: not all epistemic virtues are to do with knowledge. 1<br />

1 Responding to the Easy Knowledge Objection<br />

I often hear it argued that scepticism is very intuitive. It seems to me that this is<br />

completely wrong. I’m currently looking out the window and seeing the rain. It<br />

seems to me that I thereby come to know that it is raining. The sceptic says that this<br />

isn’t at all true. Put that bluntly, it’s hard to imagine anything more counterintuitive.<br />

What is intuitive about scepticism is that there are intuitively plausible principles<br />

that entail, fairly directly, sceptical conclusions. So although the sceptic’s conclusions<br />

are counterintuitive, her premises are very plausible. And the implication of her conclusion<br />

by her premises is so immediate that it sometimes seems that her conclusion<br />

is intuitive as well.<br />

You might think that if this was all that was going on with scepticism, it would<br />

be easy to argue against the sceptic. We just have to identify those premises, show<br />

that they are false, and declare scepticism defeated. The problem is that the sceptic<br />

doesn’t have to be pinned down to any particular set of premises. She could<br />

use premises related to infallibilism, as Descartes does. She could use<br />

anti-circular-reasoning premises, as Hume does. Or she could use premises related to sensitivity,<br />

as Nozick does. Or, if she’s a confused philosophical neophyte, she could bounce<br />

between these three arguments without clearly distinguishing them. The result is<br />

that any objection to one of the arguments will just leave the sceptic, or the<br />

sceptical-friendly neophyte, feeling that the core reasons for being sceptical have not been<br />

touched.<br />

† In progress.<br />

1 The references here are obviously incomplete – this should be thought of more as a long blog post<br />

than a draft article.



I think we see a similar dialectic in the debate over the ‘Easy Knowledge’ objection<br />

to certain kinds of externalist foundationalism. The worry is easy enough to<br />

state. Let’s assume our target theory says that for some privileged class of cognitive<br />

processes, if that process reliably produces true beliefs, then it produces justified beliefs.<br />

We’ll assume, for ease of exposition, that the privileged class includes visual<br />

perception and introspection of what one is perceiving.<br />

Then we imagine an agent who doesn’t antecedently know that her vision is<br />

working, though in fact it is. She looks at a bunch of things, and (a) sees what is<br />

there, and (b) introspects what she is seeing. For instance, she looks out the window,<br />

sees it is raining, and introspects that she is seeing this. She believes it is raining on<br />

the basis of the perception, and that she sees it is raining on the basis of the introspection.<br />

By a simple entailment, she comes to believe her visual system is working<br />

correctly. And she can repeat this trick indefinitely many times. So she concludes<br />

that her visual system is highly reliable.<br />

Many people intuit directly that something has gone wrong here, and so there is<br />

something wrong with our foundationalist externalist. 2 I think we should be very<br />

careful about such intuitions. After all, the following principle seems very plausible.<br />

If we know what the output of a machine is, and we can see that this output matches<br />

reality, and our visual system is working correctly, then we have good reason to believe<br />

that the machine is accurately reflecting reality. And that implies that our agent<br />

can know that the machine that is her visual system is working. The reasoning of the<br />

last few sentences might fail, but we need a good reason to say why it fails.<br />

Now there are several such reasons that we might propose. In fact, I’ll identify six<br />

such reasons below. I don’t think any of these reasons work. But I think they mostly<br />

fail for independent reasons. And that’s part of what makes the easy knowledge<br />

problem so difficult. Just like with scepticism, a not particularly intuitive conclusion<br />

is supported by a wide variety of distinct, and intuitive, considerations. When you<br />

knock down one, it is easy for the anti-foundationalist, or anti-externalist, to reply<br />

using one of the other reasons. So it’s best to try to deal with all the reasons at once.<br />

I’ll make a start on that here.<br />

First, a small disclaimer. Several of the points I make will be familiar from the<br />

literature. A chunk of this section is my collecting my thoughts on easy knowledge<br />

arguments, not advancing the debate. And since this is basically a blog post in PDF<br />

form, not a scholarly article, I haven’t cited anyone. But that doesn’t mean I’m claiming<br />

much of this is original - except perhaps for the broad framing. Having said that,<br />

here are the six reasons I think one might find easy knowledge problematic. The last<br />

of these is the segue to the discussion of testing in section 2.<br />

2 It’s tempting to think that the easy knowledge problem is a distinctive problem for externalists, or for<br />

those who assign a special role for perception in epistemology. But as is well known, the structure of the<br />

problem is so general that it can arise for anyone. We could use the same structure to raise a problem for a<br />

theorist who gives introspection a special role in epistemology. Right now it seems to me that I’m looking<br />

at a red carpet, and it seems to me that it seems to me that I’m looking at a red carpet, so my introspection<br />

tells me that my introspective access to how things seem to me is accurate. This seems like just as bad a<br />

way to check on the accuracy of introspection as is checking on perception using perception.



1.1 Sensitivity<br />

Objection: If you use perception to test perception, then you’ll come to<br />

believe perception is accurate whether it is or not. So if it weren’t accurate,<br />

you would still believe it is. So your belief that it is accurate will be<br />

insensitive, in Nozick’s sense. And insensitive beliefs cannot constitute<br />

knowledge.<br />

The obvious reply to this is that the last sentence is false. As has been argued at great<br />

length, e.g. in Williamson (2000: Ch 7), sensitivity is not a constraint on knowledge.<br />

We can even see this by considering other cases of testing.<br />

Assume Smith is trying to figure out whether Acme machines are accurate at<br />

testing concrete density. She has ten Acme machines in her lab, and proceeds to test<br />

each of them in turn by the standard methods. That is, she gets various samples of<br />

concrete of known density, and gets the machine being tested to report on its density.<br />

For each of the first nine machines, she finds that it is surprisingly accurate, getting<br />

the correct answer under a very wide variety of testing conditions. She concludes<br />

that Acme is very good at making machines to measure concrete density, and that<br />

hence the tenth machine is accurate as well.<br />

We’ll come back to the question of whether this is a good way to test the tenth<br />

machine. It seems that Smith has good inductive grounds for knowing that the tenth<br />

machine is accurate. Yet the nearest world in which it is not accurate is one in which<br />

there were some slipups made in its manufacture, and so it is not accurate even though<br />

Acme is generally a good manufacturer. In that world, she’ll still believe the tenth<br />

machine is accurate. So her belief in its accuracy is insensitive, although she knows<br />

it is accurate. So whatever is wrong with testing a machine against its own outputs,<br />

if the problem is just that the resulting beliefs are insensitive, then that problem does<br />

not preclude knowledge of the machine’s accuracy.<br />

1.2 One-Sidedness<br />

Objection: If you use perception to test perception, then you can only<br />

come to one conclusion; namely that perception is accurate. Indeed, the<br />

test can’t even give you any reason to believe that perception is inaccurate.<br />

But any test that can only come to one conclusion, and cannot give<br />

you a reason to believe the negation of that conclusion, cannot produce<br />

knowledge.<br />

Again, the problem here is that the last step of the reasoning is mistaken. There are<br />

plenty of tests that can produce knowledge in one direction only. Here are four<br />

such examples.<br />

Brown is an intuitionist, so she does not believe that instances of excluded middle<br />

are always true. She does, however, know that they can never be false. She is unsure<br />

whether Fa is decidable, so she does not believe Fa ∨ ¬Fa. She observes a closely, and<br />

observes it is F. So she infers Fa ∨ ¬Fa. Her test could not have given her a reason<br />

to believe ¬(Fa ∨ ¬Fa), but it does ground knowledge that Fa ∨ ¬Fa.<br />



Jones is trying to figure out which sentences are theorems of a particular modal<br />

logic she is investigating. She knows that the logic is not decidable, but she also<br />

knows that a particular proof-evaluator does not validate invalid proofs. She sets the<br />

evaluator to test whether random strings of characters are proofs. After running<br />

overnight, the proof-evaluator says that there is a proof of S in the logic. Jones comes<br />

to know that S is a theorem of the logic, even though a failure to deliver a proof of S would<br />

not have given her any reason to believe it is not a theorem.<br />

Grant has a large box of Turing machines. She knows that each of the machines<br />

in the box has a name, and that its name is an English word. She also knows that<br />

when any machine halts, it says its name, and that it says nothing otherwise. She<br />

does not know, however, which machines are in the box, or how many machines<br />

are in the box. She listens for a while, and hears the words ‘Scarlatina’, ‘Aforetime’<br />

and ‘Overinhibit’ come out of the box. She comes to believe, indeed know, that<br />

Scarlatina, Aforetime and Overinhibit are Turing machines that halt. Had those machines<br />

not halted, she would not have been in the right kind of causal contact with<br />

those machines to have singular thoughts about them, so she could not have believed<br />

that they are not halting machines. So listening for what words come out of the box<br />

is one-sided in the way described in the objection, but still produced knowledge.<br />

Adams is a Red Sox fan in Australia in the pre-internet era. Her only access to<br />

game scores comes from one-line score reports in the daily newspaper. She doesn’t know<br />

how often the Red Sox play. She notices that some days there are 2 games reported,<br />

some days there is 1 game reported, and on many days there are no games reported.<br />

She also knows that the paper’s editor is also a Red Sox fan, and only prints the score<br />

when the Red Sox win. When she opens the newspaper and sees a report of a Red<br />

Sox win (i.e. a line score like “Red Sox 7, Royals 3”) she comes to believe that the Red<br />

Sox won that game. But when she doesn’t see a score, she has little reason to believe<br />

that the Red Sox lost any particular game. After all, she has little reason to believe<br />

that any particular game even exists, or was played, let alone that it was lost. So the<br />

newspaper gives her reasons to believe that the Red Sox win games, but never reason<br />

to believe that the Red Sox didn’t win a particular game.<br />

So we have four counterexamples to the principle that you can only know p if<br />

you use a test that could give you evidence that ¬p. The reader might notice that<br />

many of our examples involve cases from logic, or cases involving singular propositions.<br />

Both of those kinds of cases are difficult to model using orthodox Bayesian<br />

machinery. That’s not a coincidence. There’s a well-known Bayesian argument in<br />

favour of the principle I’m objecting to, namely that getting evidence for p presupposes<br />

the possibility of getting evidence for ¬p. (See ... for further discussion of<br />

this.) I haven’t discussed that objection here, because I think it’s irrelevant. When<br />

dealing with foundational matters, like logical inference, Bayesian modelling is inappropriate.<br />

We can see that by noting that in any field where Bayesian modelling is<br />

appropriate, the objection currently being considered works! 3 It’s crucial to the defence<br />

of foundationalist externalism I’m offering here that this objection not work,<br />

so it’s crucial that we not be able to model some features of learning by perception<br />

3 Or see the large literature on the Problem of Old Evidence.



using orthodox Bayesian techniques. If perception is genuinely foundational, like<br />

logical reasoning, that shouldn’t be too much of a surprise, given the well-known<br />

troubles Bayesians have with representing logical reasoning.<br />
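The Bayesian argument gestured at here is the ‘conservation of expected evidence’ result: in the orthodox framework the probability-weighted average of the possible posteriors equals the prior, so a test that could only ever raise your credence in p is impossible. A toy illustration in Python, with made-up numbers:<br />

```python
# Conservation of expected evidence, with hypothetical numbers.
prior = 0.4                        # prior credence in hypothesis h
p_obs_h, p_obs_not_h = 0.9, 0.5    # chance of observation o given h / given not-h

# Total probability of the observation
p_obs = prior * p_obs_h + (1 - prior) * p_obs_not_h

# Posterior after seeing o, and after seeing not-o (Bayes' theorem)
post_obs = prior * p_obs_h / p_obs
post_not_obs = prior * (1 - p_obs_h) / (1 - p_obs)

# The expected posterior is the prior, so if one outcome raises
# credence in h, the other must lower it.
expected_posterior = p_obs * post_obs + (1 - p_obs) * post_not_obs
print(round(expected_posterior, 10))
```

So within that model, any possible outcome that pushes credence up requires another that pushes it down; the claim in the text is that foundational cases like logical inference fall outside the model.<br />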

1.3 Generality<br />

Objection: Assume we can use perception to come to know on a particular<br />

occasion that perception is reliable. Since we can do this in arbitrary<br />

situations where perception is working, then anyone whose perception<br />

is working can come to know, by induction on a number of successful<br />

cases, that their perception is generally reliable. And this is absurd.<br />

I’m not sure that this really is absurd, but the cases already discussed should make it<br />

clear that it isn’t a consequence of foundationalist externalism. It is easily possible to<br />

routinely get knowledge that a particular F is G, never get knowledge that any F is<br />

not G, and in no way be in a position to infer, or even regard as probable, that all Fs are<br />

Gs.<br />

For instance, if we let F be is a Turing machine in the box Grant is holding, and G<br />

be halts, then for any particular F Grant comes to know about, it is G. But it would<br />

be absurd for her to infer that all Fs are Gs. Similarly, for any Red Sox game that<br />

Adams comes to know about, the Red Sox win. But it would be absurd for her to<br />

come to believe on that basis that they win every game.<br />

There’s a general point here, namely that whenever we can only come to know<br />

about the Fs that are also Gs, then we are never in a position to infer inductively that<br />

all, or even most, Fs are Gs. Since even the foundationalist externalist doesn’t think<br />

we can come to know by perception that perception is not working on an occasion,<br />

this means we can never know, by simple induction on perceptual knowledge, that<br />

perception is generally reliable.<br />

1.4 Circularity<br />

Objection: It is impossible in principle to come to know that a particular<br />

method delivers true outputs on an occasion by using that very method.<br />

Since the foundationalist externalist says you can do this, her view must<br />

be mistaken.<br />

The first sentence of this objection seems to be mistaken. Consider the way we might<br />

teach an intelligent undergraduate that (1) is a theorem.<br />

((p → q) ∧ p) → q (12.1)<br />

We might first get the student to assume the antecedent, perhaps reminding them of<br />

other uses of assumption in everyday reasoning. Then we’ll note that both p → q<br />

and p follow from this assumption. If we’re ambitious we’ll say that they follow<br />

by ∧-elimination, though that might be too ambitious. And then we’ll draw their<br />

attention to the fact that these two claims together imply q, again noting that this<br />

rule is called →-elimination if we’re aiming high pedagogically. Finally, we’ll note<br />

that since assuming the antecedent of (1) has let us derive its consequent, then it



seems that if the antecedent of (1) holds, so does the consequent. But that’s just<br />

what (1) says, so (1) is true. For a final flourish, we might note the generalisation of<br />

that reasoning to the →-introduction rule. Or we might call it quits at proving (1);<br />

undergraduates can only handle so much logic at one time.<br />
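For what it’s worth, the tautology the student is being taught can also be checked semantically. Here is a brute-force truth-table verification in Python; this is illustration only, since the passage’s point concerns the proof rather than the table:<br />

```python
from itertools import product

def implies(a, b):
    # Material conditional: a -> b is false only when a is true and b is false
    return (not a) or b

# (1): ((p -> q) and p) -> q, checked on all four truth-value assignments
assignments = list(product([False, True], repeat=2))
results = [implies(implies(p, q) and p, q) for p, q in assignments]
print("tautology:", all(results))
```

The check confirms (1) holds on every assignment, which is what the derivation via →-elimination establishes proof-theoretically.<br />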

That’s a pretty good way, I think, of teaching someone that (1) is true, and indeed<br />

why it is true. That is, it is a way the student can learn (1), i.e., come to know (1).<br />

Indeed, I think it’s more or less the way that I came to learn (1). But note something<br />

special about (1). It’s obviously closely related to one of the rules we used to prove<br />

it, namely →-elimination. Indeed, when Achilles first tries to do without that rule in<br />

Lewis Carroll’s famous dialogue, he appeals to an instance of (1). Given how closely<br />

related the two are, you might think using →-elimination to come to learn (1) would<br />

be viciously circular. But in fact it isn’t; the argument we just gave is a perfectly<br />

good way of learning (1). I don’t know whether we should say it is circular but not<br />

viciously so, or simply non-circular, but however we classify the argument it isn’t a<br />

bad argument. So I conclude that this rule can be used to come to learn that the rule<br />

is truth-preserving.<br />

It might be objected at this point that the undergraduate must have already known<br />

(1) if they consent to →-elimination. I think this turns on an unrealistic picture of<br />

what undergraduates are capable of. It looks to me that (1) is in fact a very complicated<br />

proposition. It has a conditional embedded in the antecedent of a conditional!<br />

Such sentences are not familiar in everyday life, and most introductory logic students<br />

have problems parsing such sentences, let alone knowing they are true. So I<br />

don’t think this kind of Socratic objection works.<br />

It might also be objected that perception is different to logical reasoning. Such an<br />

objection might even be made to work. But it isn’t a way of defending the objection<br />

we opened this subsection with; it relies on a principle that’s simply false.<br />

1.5 A Priority<br />

Objection: Assume it is possible to come to know that perception is reliable<br />

by using perception. Then before we even perceive anything, we<br />

can see in advance that this method will work. So we can see in advance<br />

that perception is reliable. That means we don’t come to know that perception<br />

is reliable using perception, we could have known it all along. In<br />

other words, it is a priori knowable that perception is reliable.<br />

This objection misstates the foundationalist externalist’s position. If perception is<br />

working, then we get evidence for this every time we perceive something, and reflect<br />

on what we perceive. But if perception is not working well, we don’t get any such<br />

evidence. The point is not merely that if perception is unreliable, then we can’t<br />

possibly know that perception is unreliable since knowledge is factive. Rather, the<br />

point is that if perception is unreliable, then using perception doesn’t give us any<br />

evidence at all about anything at all. So it doesn’t give us evidence that perception is<br />

reliable. Since we don’t know antecedently whether perception is reliable, we don’t<br />

know if we’ll get any evidence about its reliability prior to using perception, so we<br />

can’t do the kind of a priori reasoning imagined by the objector.



This response relies heavily on an externalist treatment of evidence. An internalist<br />

foundationalist is vulnerable to this kind of objection. As I’ve argued elsewhere,<br />

internalists have strong reasons to think we can know a priori that foundational<br />

methods are reliable. Some may think that this is a reductio of internalism. (I<br />

don’t.) But the argument crucially relies on internalism, not just on foundationalism.<br />

1.6 Testing<br />

Objection: It’s bad to test a belief forming method using that very method.<br />

The only way to learn that a method is working is to properly test it. So<br />

we can’t learn that perception is reliable using perception.<br />

This objection is, I think, the most interesting of the lot. And it’s interesting because<br />

in some ways I think the first premise, i.e. the first sentence in it, is true. Testing perception<br />

using perception is bad. What’s surprising is that the second premise is false.<br />

The short version of my reply is that in testing, we aim for more than knowledge. In<br />

particular, we aim for sensitive knowledge. A test can be bad because it doesn’t deliver<br />

sensitive knowledge. And that implies that a bad test can deliver knowledge, at<br />

least assuming that not all knowledge is sensitive knowledge. Defending these claims<br />

is the point of the next section.<br />

2 The Virtues of Tests<br />

I think the focus on knowledge obscures some features of the easy knowledge problem.<br />

The easy knowledge problem arises in the context of testing measuring devices.<br />

The most common versions of the problem have us testing our own perceptual faculties,<br />

but the problem can arise for testing an external device. We can’t properly test<br />

whether a scale is working by comparing its measurement of the weight of an object<br />

to the known weight of the object if the only way we know the weight of the object is<br />

by using that very device. Strikingly, this is true even if we do know that the scale is<br />

working, and so knowledge isn’t even at issue. So I’m going to argue that the problem<br />

of easy knowledge really should tell us something about testing, not something about<br />

knowledge.<br />

Although I think the easy knowledge problem doesn’t, after all, have much to tell<br />

us about knowledge, I think it has a lot to tell us about epistemic virtues. The story<br />

I’m going to tell about testing is driven by the tension between these two theses.<br />

Fallibilism We can know p even though p could, in some sense, be false.<br />

Exhaustion Knowledge exhausts the epistemic virtues, so if S knows that p, S’s belief<br />

that p could not be made more virtuous.<br />

I think Fallibilism is true, and that suggests Exhaustion must be false. After all, Fallibilism<br />

implies that S’s belief that p, which actually amounts to knowledge, could be<br />

strengthened in some ways. It could, for example, be made infallible. So Exhaustion<br />

is false.



The falsity of Exhaustion opens up a conceptual possibility that I don’t think<br />

has been adequately explored. It is commonly agreed these days that a belief need<br />

not be sensitive to be known, as we noted in section 1. But this doesn’t mean, and in<br />

fact doesn’t even suggest, that sensitivity is not an epistemic virtue. Indeed, it seems<br />

to me that sensitivity is an epistemic virtue, just as infallibility is. Neither of these<br />

virtues is necessary for knowledge, but they are virtues nonetheless.<br />

The core hypothesis of this section is that tests of measuring devices aim at sensitive<br />

beliefs about their accuracy. That is, we aim to draw a true conclusion about the<br />

accuracy of the device, and arrive at that conclusion through a method that would<br />

have yielded a different verdict were the device’s level of accuracy different. And<br />

this, I think, explains at least a part of the appeal of the easy knowledge objection.<br />

The way the objection is usually set up, someone is testing a measuring device by a<br />

method that clearly cannot yield sensitive conclusions. We conclude, correctly, that<br />

this means it is a bad test. I suspect that at least some people infer from that that<br />

the test cannot deliver knowledge that the device is accurate. This inference isn’t,<br />

in most cases, conscious, but I suspect it is part of what drives people who find easy<br />

knowledge implausible. If what I’ve said so far is correct, that last step may be a non<br />

sequitur. Good tests deliver sensitive conclusions, and knowledge doesn’t require sensitivity,<br />

so it is conceptually possible that a bad test can produce knowledge. To be<br />

sure, our knowledge is usually sensitive, so if our belief that the device is accurate is<br />

insensitive, that is some evidence that that belief does not amount to knowledge. But<br />

it is not a proof that the belief does not amount to knowledge. So a belief grounded<br />

in an insensitive, and therefore bad, test may amount to knowledge.<br />

A lot of what I want to say about testing comes from reflection on the following<br />

case. It is a slightly idealised case, but I hope not terribly unrealistic.<br />

In a certain state, the inspection of scales used by food vendors has two<br />

components. Every two years, the scales are inspected by an official and<br />

a certificate of accuracy issued. On top of that, there are random inspections,<br />

where each day an inspector must inspect a vendor whose biennial<br />

inspection is not yet due. Today one inspector, call her Ins, has to inspect<br />

a store run by a shopkeeper called Sho. It turns out Sho’s store was<br />

inspected just last week, and passed with flying colours. Since Sho has a<br />

good reputation as an honest shopkeeper, Ins knows that his scales will<br />

be working correctly.<br />

Ins turns up and before she does her inspection watches several people<br />

ordering caviar, which in Sho’s shop goes for $1000 per kilogram. The<br />

first customer’s purchase gets weighed, and it comes to 242g, so she hands<br />

over $242. The second customer’s purchase gets weighed, and it comes<br />

to 317g, so she hands over $317. And this goes on for a while. Then<br />

Ins announces that she’s there for the inspection. Sho is happy to let<br />

her inspect his scales, but one of the customers, call him Cus, wonders<br />

why it is necessary. “Look,” he says, “you saw that the machine said my<br />

purchase weighed 78g, and we know it did weigh 78g since we know it’s<br />

a good machine.” At this point the customer points to the certificate



authorising the machine that was issued just last week. “And that’s been<br />

going on for a while. Now all you’re going to do is put some weights<br />

on the scale and see that it gets the correct reading. But we’ve done that<br />

several times. So your work here is done.”<br />

There is something deeply wrong with Cus’s conclusion, but it is surprisingly hard<br />

to see just where the argument fails. Let’s lay out his argument a little more carefully.<br />

1. The machine said my caviar weighed 78g, and we know this, since we could all<br />

see the display.<br />

2. My caviar did weigh 78g, and we know this, since we all know the machine is<br />

working correctly.<br />

3. So we know that the machine weighed my caviar correctly. (From 1, 2)<br />

4. By similar reasoning we can show that the machine has weighed everyone’s<br />

caviar correctly. (Generalising 3)<br />

5. All we do in testing a machine is see that it weighs various weights correctly.<br />

6. So just by watching the machine all morning we get just as much knowledge as<br />

we get from a test. (From 4, 5)<br />

7. So there’s no point in running Ins’s tests. (From 6)<br />

Cus’s summary of how testing scales works is obviously a bit crude, 4 but we can<br />

imagine that the spot test Ins plans to do isn’t actually any more demanding than<br />

what the scale has been put through while she’s been standing there. So we’ll let<br />

premise 5 pass. 5 If 3 is true, it does seem 4 follows, since Cus can simply repeat his<br />

reasoning to get the relevant conclusions. And if 4 and 5 are true, then it does seem<br />

6 follows. To finish up our survey of the uncontroversial steps in Cus’s argument, it<br />

seems there isn’t any serious dispute about step 1. 6<br />

So the contentious steps are:<br />

• Step 2 - we may deny that everyone gets knowledge of the caviar’s weight from<br />

the machine.<br />

• Step 3 - we may deny the relevant closure principle that Cus is assuming<br />

here.<br />

• Step 7 - we may deny that the aim of the test is (merely) to know that the<br />

machine is working.<br />

4 For a more accurate account of what should be done, see Edwin C. Morris and Kitty M. K. Fen,<br />

The Calibration of Weights and Measures, third edition, Sydney: National Measurement Institute, 2002, or<br />

David B. Prowse, The Calibration of Balances, Lindfield: Commonwealth Scientific and Industrial Research<br />

Organization, 1985.<br />

5 If you’d prefer more realism in the testing methodology, at the cost of less realism in the purchasing<br />

pattern of customers, imagine that the purchases exactly follow the pattern of weights that a calibrator<br />

following the guidelines of Prowse (1985) or Morris and Fen (2002) would put on the machine.<br />

6 It may be epistemologically significant that human perception does not, in general, work the same<br />

way as balances in this respect. We may come to know p through perception before we know that p<br />

was the output of a perceptual module. Indeed, an infant may come to know p through perception even<br />

though she lacks the concept of a perceptual module. This may be of significance to the applicability of<br />

the problem of easy knowledge to human perception, but it isn’t immediately relevant here.



One way to deny step 2 is to just be an inductive sceptic, and say that no one can<br />

know that the machine is working merely given that it worked, or at least appeared to<br />

work, last week. But that doesn’t seem very promising. It seems that the customers<br />

do know, given that the testing regime is a good one, and that the machine was properly<br />

tested, that the machine is working. Perhaps there is some reason to think that<br />

although the customers know this, the inspector does not. We’ll come back to that.<br />

But for now it seems like step 2 is good.<br />

In recent years there has been a flood of work by philosophers denying that what<br />

we know obeys either single-premise closure, e.g. Dretske (2005), or multi-premise<br />

closure, e.g. Christensen (2005). But it is hard to see how any kind of anti-closure<br />

view could help here. We aren’t inferring some kind of heavyweight proposition like<br />

that there is an external world. And Dretske’s kind of view is motivated by avoidance<br />

of that kind of inference. And Christensen’s view is that knowledge of a conjunction<br />

might fail when the amount of risk involved in each conjunct is barely enough to<br />

sustain knowledge. But we can imagine that our knowledge of both 1 and 2 is far<br />

from the borderline.<br />

A more plausible position is that the inference from 1 and 2 to 3 fails to transmit<br />

justification in the sense that Crispin Wright (2000, 2004) has described. But that just<br />

means that Ins, or Cus, can’t get an initial warrant, or extra warrant, for believing<br />

the machine is working by going through this reasoning. And Cus doesn’t claim<br />

that you can. His argument turns entirely on the thought that we already know<br />

that the machine is reliable. Given that background, the inference to 3 seems pretty<br />

uncontroversial.<br />

That leaves step 7 as the only weak link. I want to conclude that Cus’s inference<br />

here fails; even if Ins knows that the machine is working, it is still good for her to test<br />

it. But I imagine many people will think that if we’ve got this far, i.e., if we’ve agreed<br />

with Cus’s argument up to step 6, then we must also agree with step 7. I’m going to<br />

offer two arguments against that, and claim that step 7 might fail, indeed does fail in<br />

the story I’ve told, even if what Cus says is true up through step 6.<br />

First, even if Ins won’t get extra knowledge through running the tests on this<br />

occasion, it is still true that this kind of randomised testing program is an epistemic<br />

good. We have more knowledge, or at least more reliable knowledge, through having<br />

randomised checks of machines than we would get from just having biennial tests. So<br />

there is still a benefit to conducting the tests even in cases where the outcome is not<br />

in serious doubt. The benefit is simply that the program, which is a good program,<br />

is not compromised.<br />

We can compare this reason Ins has for running the tests to reasons we have for<br />

persisting in practices that will, in general, maximise welfare. Imagine a driver, called<br />

Dri, is stopped at a red light in a quiet part of town in the middle of the night. Dri<br />

can see that there is no other traffic around, and that there are no police or cameras<br />

who will fine her for running the red light. But she should, I think, stay stopped at<br />

the light. The practice of always stopping at red lights is a better practice than any<br />

alternative practice that Dri could implement. I assume she, like most drivers, could<br />

not successfully implement the practice Stay stopped at red lights unless you know no<br />

harm will come from running the light. In reality, a driver who tries to occasionally



slip through red lights will get careless, and one day run a serious risk of injury to<br />

themselves or others. The best practice is simply to stay stopped. So on this particular<br />

occasion Dri has a good reason to stay stopped at the red light: that’s the only way to<br />

carry out a practice which it is good for her to continue.<br />

Now Ins’s interest is not primarily in welfare, it is in epistemic goods. (She cares<br />

about those epistemic goods because they are related to welfare, but her primary<br />

interest is in epistemic goods.) But we can make the same kind of point. There are<br />

epistemic practices which are optimal for us to follow given what we can plausibly<br />

do. And this kind of testing regime may be the best way to maximise our epistemic<br />

access to facts about scale reliability, even if on this occasion it doesn’t lead to more<br />

knowledge. Indeed, it seems to me that this is quite a good testing regime, and it is<br />

a good thing, an epistemically good thing, for Ins to do her part in maintaining the<br />

practice of randomised testing that is part of the regime.<br />

The second reason relates to the tension between Fallibilism and Exhaustion<br />

I mentioned at the top. It may be that the aims of the test are not exhausted by<br />

the aim of getting knowledge that the machine is working. We might also want a<br />

sensitive belief that the machine is working. Indeed, we may want a sensitive belief<br />

that the machine has not stopped working since its last inspection. That would be an<br />

epistemic good. In some sense, our epistemic standing improves if our belief that the<br />

machine has not stopped working since its last inspection becomes sensitive to the<br />

facts.<br />

This idea, that tests aim for sensitivity, is hardly a radical one. It is a very natural<br />

idea that good tests produce results that are correlated with the attribute being tested.<br />

When we look at the actual tests endorsed in manuals like Prowse (1985) or Morris and<br />

Fen (2002), that seems to be one of their central aims. But ‘testing’ the machine by<br />

using its own readings cannot produce results that are correlated with the accuracy<br />

of the machine. If the machine is perfectly accurate, the test will say it is perfectly<br />

accurate. If the machine is somewhat accurate, the test will say it is perfectly accurate.<br />

And if the machine is quite inaccurate, the test will say that it is perfectly accurate.<br />

The test Ins plans to run, as opposed to the ‘test’ that Cus suggests, is sensitive to the<br />

machine’s accuracy. Since it’s good to have sensitive beliefs, it is good for Ins to run<br />

her tests.<br />
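The contrast between the two procedures can be put in a toy computational model. In this sketch (the scale model, bias values and tolerance are all invented for illustration), Cus’s procedure checks the scale’s reading against a weight we believe only on the basis of that very reading, while Ins’s checks it against an independently known reference weight:

```python
def scale_reading(true_weight_g, bias_g):
    # A toy scale whose readings are off by a fixed bias (in grams).
    return true_weight_g + bias_g

def self_test(true_weight_g, bias_g):
    # Cus's 'test': compare the reading with the weight we believe the
    # object has -- a belief we got from the very same scale.
    believed_weight_g = scale_reading(true_weight_g, bias_g)
    return scale_reading(true_weight_g, bias_g) == believed_weight_g

def reference_test(true_weight_g, bias_g, tolerance_g=1.0):
    # Ins's test: compare the reading with an independently known weight.
    return abs(scale_reading(true_weight_g, bias_g) - true_weight_g) <= tolerance_g

for bias_g in (0.0, 5.0, 50.0):  # accurate, somewhat off, badly off
    print(bias_g, self_test(100.0, bias_g), reference_test(100.0, bias_g))
```

However large the bias, the self-check delivers the same verdict, so its verdict is not correlated with the scale’s accuracy; the reference check’s verdict covaries with the bias, which is just the sensitivity that, on the view defended here, good tests aim at.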

So I conclude that step 7 in Cus’s argument fails. There are reasons, both in terms<br />

of the practice Ins is part of, and in terms of what epistemic goods she’ll gain on<br />

this occasion by running the test, for Ins to test the machine. That’s true even if she<br />

knows that the machine is working. The epistemic goods we get from running tests<br />

are not restricted to knowledge.<br />

3 Evidence and Epistemic Virtues<br />

Timothy Williamson has argued in several places that our evidence consists of all<br />

and only the things we know. This view, usually called E=K, has become quite<br />

popular in recent years. It’s tempting to think it counts much too much as evidence.<br />

Here’s one quick objection to E=K that doesn’t work. Some of our knowledge is



not self-evident. But any time a piece of knowledge is part of our evidence, it is self-evident.<br />

So E=K is false. But the middle premise there fails. For something to be<br />

self-evident, it isn’t just true that it must be part of our evidence, it must also be that<br />

our knowledge of it is not based on anything else. And E=K does not imply anything<br />

about basing.<br />

There are other ways to argue for a similar conclusion. If I observe that E, and<br />

inductively infer H, and thereby come to know H, it seems a little odd to say that<br />

H is part of my evidence. If we ask ourselves in such a case, what possibilities are<br />

inconsistent with my evidence, then intuitively they include the ¬E possibilities, but<br />

not the ¬H possibilities. On the other hand, consider an extension of the case where<br />

H provides some support for a further conclusion H′. If we say that H′ is probable,<br />

and someone asks us what evidence we have for this, it is often tempting to reply that<br />

our evidence is H. These examples have been rather abstract, but I think that<br />

if you fill in real propositions for E, H and H′, you’ll find that both intuitions have<br />

some support. Sometimes we intuitively treat inductively drawn conclusions as part<br />

of our evidence, sometimes we don’t. Intuition here doesn’t unequivocally support<br />

E=K, but it doesn’t totally undermine it either.<br />

Still, the idea that inductive inferences form part of our evidence does create a<br />

puzzle. Here’s one way to draw out the puzzling nature of E=K.<br />

Jack is inspecting a new kind of balance made by Acme Corporation.<br />

He thoroughly inspects the first 10 (out of a batch of 1,000,000) that<br />

come off the assembly line. And each of them passes the inspection with<br />

flying colours. Each of them is more accurate than any balance Jack had<br />

tested before that day. So Acme is making good balances. He knows,<br />

by observation, that the first 10 balances are reliable. He also knows, by<br />

induction, that the next balance will be reliable. It’s not obvious that he<br />

knows the next one will be phenomenal, like the ones he has tested, but<br />

he knows it will be good enough for its intended usage. But he doesn’t<br />

know they all will be that good. Surprisingly, it will turn out that every<br />

balance made in this assembly line will be reliable. But you’d expect,<br />

given what we know about assembly lines, that a badly made<br />

machine would turn up somewhere along the way.<br />

So Jack knows the first 11 are reliable, and doesn’t know the first 1,000,000<br />

are reliable. Let n be the largest number such that Jack knows the first<br />

n are reliable. (I’m assuming such an n exists; those who want to hold<br />

on to E=K by giving up the least number theorem are free to ignore everything<br />

that follows.) For any x, let R(x) be the proposition that the<br />

first x are reliable. So Jack knows R(n). Hence by E=K R(n) is part<br />

of his evidence. But he doesn’t know R(n + 1). This is extremely odd.<br />

After all, R(n) is excellent evidence for R(n + 1), assuming it is part of<br />

his evidence. And R(n + 1) is true. Indeed, by many measures it is safely<br />

true. So why doesn’t Jack know it?



It seems to me there is a mystery here that, given E=K, we can’t explain. If we<br />

have a more restrictive theory of evidence, then it is easy to explain what’s going<br />

on. If, for instance, evidence is perceptual knowledge, then Jack’s evidence is simply<br />

R(10). And it might well be true, given the correct theory of what hypotheses are<br />

supported by what evidence, that R(10) supports R(84) but not R(85). That explanation<br />

isn’t available to the E=K theorist. And we might well wonder what explanation<br />

could be available. Similarly, we can’t say that it is too much of a stretch to infer R(85)<br />

from Jack’s evidence. After all, his evidence includes R(84). If that’s not a sufficient<br />

evidential basis to conclude R(85), we’re getting very close to scepticism about enumerative<br />

induction. If this is not to appear to be a concession to the sceptic, we need<br />

a story about why this case is special.<br />

I think the best explanation of why the case is special turns on the idea that there<br />

are different epistemic virtues, and some of these cross-cut the presence of absence<br />

of knowledge. I don’t think this is the only explanation of what’s going on, but I<br />

think it tracks the phenomenon better than the alternative. 7 Let’s say that evidence,<br />

like knowledge, can be of better or worse quality. If you don’t know p, then p is of<br />

no evidential use to you. That’s the part E=K gets right. But even if you do know<br />

p, how much evidential use it has might depend on how you know it. For instance,<br />

if you infallibly know p, then p is extremely useful evidence. More relevantly for<br />

current purposes, if you have sensitive knowledge that p, then p is more useful than<br />

if you have insensitive knowledge that p. That’s because, as we argued in the previous<br />

section, sensitive knowledge has epistemic qualities over and above insensitive<br />

knowledge.<br />

Let’s go through how this plays out in Jack’s case. Although he knows R(11), this<br />

knowledge is insensitive. If R(11) were false, he would still believe it. Had the production<br />

system malfunctioned when making the 11th balance, for instance, then the<br />

11th machine would have been unreliable, but Jack would have still believed it. The<br />

only sensitive evidence he has is R(10). By the time he gets to R(n), his knowledge is<br />

extremely insensitive. There are all sorts of ways that R(n) could have been false, in<br />

many fairly near worlds, and yet he would still have believed it.<br />

This suggests a hypothesis, one that dovetails nicely with the conclusions of the<br />

previous section. The more insensitive your evidence is, the less inductive knowledge<br />

it grounds. If Jack had sensitive knowledge that R(n), he would be in a position<br />

to infer, and thereby know, R(n + 1). The reason he can’t know R(n + 1) is not<br />

that he doesn’t have enough evidence, but rather that the evidence he has is not of<br />

a high enough quality. That’s an explanation for why Jack can’t infer R(n + 1) that<br />

neither leads to inductive scepticism, nor violates the letter of E=K. Sometimes the<br />

way Williamson presents E=K suggests he wants something stronger, namely that<br />

7 The kind of alternatives I have in mind are ones where we appeal to other knowledge that Jack has.<br />

For instance, we might note that he knows he knows R(10), but doesn’t know that he knows R(n). This<br />

explanation will have to explain why this second-order knowledge is relevant to inductive inference. After<br />

all, why should facts about what Jack knows be relevant to whether the (n + 1)th machine will function<br />

well? I think similar explanations will suffer similar defects. I’m grateful here to conversations with<br />

Jonathan Weinberg.



what we know determines not only what our evidence is, but which things are supported<br />

by our evidence. I think that isn’t true. Indeed, the best explanation of Jack’s<br />

predicament is that it isn’t true. But I also don’t think it is something that the original<br />

arguments for E=K support, so we shouldn’t feel any hesitation about dropping it.


Induction and Supposition<br />

Here’s a fairly quick argument that there is contingent a priori knowledge. Assume<br />

there are some ampliative inference rules. Since the alternative appears to be inductive<br />

scepticism, this seems like a safe enough assumption. Such a rule will, since it is<br />

ampliative, licence some particular inference From A infer B where A does not entail<br />

B. That’s just what it is for the rule to be ampliative. Now run that rule inside suppositional<br />

reasoning. In particular, first assume A, then via this rule infer B. Now do<br />

a step of →-introduction, inferring A → B and discharging the assumption A. Since A<br />

does not entail B, this will be contingent, and since it rests on a sound inference with<br />

no (undischarged) assumptions, it is a priori knowledge.<br />

This argument is hardly new. It is part of the argument in some recent papers<br />

promoting contingent a priori knowledge, such as Hawthorne (2002) and Weatherson<br />

(2005b). But it is an intriguingly quick argument for a stunning philosophical<br />

conclusion, one that seems to rely on few dubious steps. I’m going to argue that it<br />

fails for a quite interesting reason. At least in natural deduction systems, some inferential<br />

rules (such as ∀-introduction) have restrictions on when they can be applied.<br />

I’m going to argue that ampliative reasoning rules cannot, in general, be applied inside<br />

the scope of suppositions, and that is why the above argument fails.<br />

I’ll argue for this conclusion by showing that a very weak ampliative rule leads,<br />

when combined with some other plausible principles, to absurd conclusions if it is<br />

applied inside the scope of suppositions. If even a weak ampliative rule cannot be<br />

used suppositionally, then it plausibly follows that no ampliative rule can be used<br />

suppositionally. The construction I’m going to use to show this is quite similar to<br />

one used by Sinan Dogramaci in Dogramaci (2010), though as we’ll see at the end<br />

Dogramaci and I have different views about what to take away from these arguments.<br />

Some people might think we have already seen an argument that ampliative inference<br />

rules fail in suppositional reasoning. If these rules are allowed, then we have<br />

contingent a priori knowledge, and this is implausible. I don’t believe this argument<br />

works, since I think there are other arguments for contingent a priori knowledge.<br />

(Some of them are in the above cited papers.) So it is a live question whether this<br />

quick argument for contingent a priori knowledge works.<br />

Here’s the main argument. If any ampliative inference is justified, I think the<br />

following rule, called ‘R99’, is justified, since this is a very weak form of an inductive<br />

inference.<br />

R99 From Over 99% of Xs are Ys and a is X infer a is Y unless there is some Z such<br />

that it is provable from the undischarged assumptions that a is X and Z and Less<br />

than 99% of things that are both Xs and Zs are Ys.<br />

† In progress.


Induction and Supposition 258<br />

Note that the rule does not say that 99% of observed Xs are Ys, but that 99% of all<br />

Xs are Ys. So this seems like a very plausible inference; it really is just making an<br />

inference within the distribution, not outside it. And it is explicitly qualified to deal<br />

with defeaters. And yet even this rule, when applied inside the scope of suppositions,<br />

can lead to disaster.<br />

In the following proof, we’ll write ‘99(F,G)’ for Over 99% of Fs are Gs to shorten<br />

the presentation. And to make the rule a little weaker, we’ll say that 99(F,G) is false<br />

when there are no Fs. We’ll write FH for the predicate formed by conjoining F and<br />

H. So 99(FH,¬G) means, in English, that over 99% of things that are both F and H<br />

are ¬G, and that at least one thing is both F and H. Finally, we’ll use second-order<br />

variables X, Y, Z with the obvious introduction and elimination rules and let I be the<br />

predicate is self-identical.<br />

99(F,G) ∧ Fa    assumption    (13.1)<br />

99(F,G)    ∧-elimination, (1)    (13.2)<br />

Fa    ∧-elimination, (1)    (13.3)<br />

Ga    R99, (2), (3)    (13.4)<br />

(99(F,G) ∧ Fa) → Ga    (1)–(4), discharging (1)    (13.5)<br />

99(FH,¬G) ∧ (Fa ∧ Ha)    assumption    (13.6)<br />

99(FH,¬G)    ∧-elimination, (6)    (13.7)<br />

Fa ∧ Ha    ∧-elimination, (6)    (13.8)<br />

¬Ga    R99, (7), (8)    (13.9)<br />

(99(FH,¬G) ∧ (Fa ∧ Ha)) → ¬Ga    (6)–(9), discharging (6)    (13.10)<br />

¬((99(FH,¬G) ∧ (Fa ∧ Ha)) ∧ (99(F,G) ∧ Fa))    classical logic, (5), (10)    (13.11)<br />

¬((99(FH,¬G) ∧ 99(F,G)) ∧ (Fa ∧ Ha))    simplifying (11)    (13.12)<br />

(99(FH,¬G) ∧ 99(F,G)) → ¬(Fa ∧ Ha)    classical logic, (12)    (13.13)<br />

(99(FH,¬G) ∧ 99(F,G)) → ∀x¬(Fx ∧ Hx)    ∀-introduction, (13)    (13.14)<br />

99(FH,¬G) → ∃x(Fx ∧ Hx)    definition of ‘99’    (13.15)<br />

(99(FH,¬G) ∧ 99(F,G)) → ∃x(Fx ∧ Hx)    classical logic, (15)    (13.16)<br />

¬(99(FH,¬G) ∧ 99(F,G))    classical logic, (14), (16)    (13.17)<br />

99(F,G) → ¬99(FH,¬G)    classical logic, (17)    (13.18)<br />
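One way to see concretely that (13.18) is no logical truth is that it fails in small finite models. Here is a sketch of one such check (the 200-element domain and the extensions of F, G and H are invented for illustration):

```python
# A 200-element model in which 99(F,G) and 99(FH,not-G) are both true,
# so (13.18), 99(F,G) -> not-99(FH,not-G), fails in this model.
domain = set(range(200))
F = domain          # F applies to everything
G = domain - {199}  # G applies to 199 of the 200 Fs (99.5%)
H = {199}           # H applies only to the single non-G individual

def over_99(xs, ys):
    # 99(X,Y): over 99% of Xs are Ys, and there is at least one X.
    return bool(xs) and len(xs & ys) / len(xs) > 0.99

FH = F & H               # the conjunctive predicate
not_G = domain - G
print(over_99(F, G))     # 199/200 = 0.995, which is over 99%
print(over_99(FH, not_G))  # the one FH individual is a non-G
```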

This is already a bad result, but worse is to come. Since F, G, H are arbitrary we can<br />

infer...<br />

∀X, Y, Z (99(X,Y) → ¬99(XZ,¬Y))    second-order ∀-introduction, (18)    (13.19)<br />



Now substitute I for X and ¬Y for Z.<br />

∀Y (99(I,Y) → ¬99(¬Y,¬Y))    second-order ∀-elimination, (19)    (13.20)<br />

And that can only be true in worlds with fewer than 101 individuals. For if there were<br />

101 (or more) individuals, there would be a predicate Y that applied to all but one of<br />

them, and then both 99(I,Y) and 99(¬Y,¬Y) would be true.<br />
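The threshold arithmetic here is easy to check; a minimal sketch (the numbers simply track the argument in the text):

```python
# With 101 individuals, and Y applying to all but one of them:
n = 101
print((n - 1) / n)         # proportion of the Is that are Y: 100/101, about 0.9901
print((n - 1) / n > 0.99)  # so 99(I,Y) holds
print(1 / 1 > 0.99)        # and the sole non-Y is a non-Y, so 99(not-Y,not-Y) holds
# With only 100 individuals the first proportion is exactly 0.99,
# which is not *over* 99%, so the counterexample needs at least 101.
print(99 / 100 > 0.99)
```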

Note that the only assumptions made in the proof are discharged. So we have<br />

an a priori reason to believe whatever conclusions are drawn. As we saw in the first<br />

paragraph, these may be contingent conclusions, but this is supposed to be a method<br />

for deriving contingent a priori conclusions. So if we can use R99 in the scope of<br />

suppositional reasoning, and every other rule in the argument is correctly used, then<br />

we have an a priori reason to believe that there are fewer than 101 individuals in the<br />

world. Obviously this couldn’t be a priori knowledge since it isn’t true, but it is an a<br />

priori justified belief.<br />

And that’s obviously crazy. We don’t have any reason, a priori, to believe the<br />

world has that few things in it. Maybe some diehard Occamists will think that a<br />

priori we should think the world has 1 thing in it, or 0 if that’s possible, but I suspect<br />

most will agree that this is a misuse of Occam’s Razor. Really, if the proof goes<br />

through, we have an argument that R99 cannot be used inside the scope of a supposition.<br />

And it seems the proof does work. At the time we use R99, the only live assumption<br />

is that a is from a particular group, and something about the distribution of<br />

Gness in that group. The assumptions couldn’t prove anything stronger. The only<br />

remotely controversial step after that is the ∀-introduction in step 14. But since there<br />

are no undischarged assumptions involving a, indeed there are no undischarged assumptions<br />

at all, at this step, it is hard to see why this would fail. For a more positive<br />

reason, note that we could replicate the proof for any other name. That is, by repeating<br />

steps (1)-(13), we could easily prove (99(F H,¬G) ∧ 99(F,G)) → ¬(F b ∧ H b) or<br />

(99(F H,¬G)∧99(F,G)) → ¬(F c ∧ H c), or anything else we wanted to prove. That’s<br />

the usual defence of ∀-introduction, so we should be able to infer the universal here.<br />

(Let me set aside one small point. I’ve gone from p → φa to p → ∀xφx rather<br />

than ∀x( p → φx) simply because (a) it’s easier to interpret, and (b) it shortens the<br />

proof. But the latter two sentences here are classically equivalent, so this shouldn’t<br />

make a difference to the cogency of the proof.)<br />

Perhaps it will be objected that line (14) is a mistake because although we can<br />

prove every instance of the universal quantifier, inferring the universal version creates<br />

an undue aggregation of risks. 1 Thinking about this probabilistically, even if line<br />

(13) is very probable, and it would still be probable if a were replaced with b, c or<br />

any other name, it doesn’t follow that the universal is very probable. But I think<br />

1 Dogramaci (2010) blames the ∀-introduction step in the version of the similar proof that he uses. One<br />

of his objections is the probabilistic objection I’m making here. Another is, I think, a general sense that<br />

rules we learn in logic class, like ∀-introduction, are less plausible than intuitive modes of reasoning like<br />

statistical inference inside a conditional proof. I discuss both of these objections in turn.



this is to confuse defeasible reasoning with probabilistic reasoning. The only way<br />

to implement this restriction on making inferences that aggregate risk would be to<br />

prevent us from making any inference where the conclusion was less probable than the<br />

premises. That will rule out uses of ∀-introduction as at (14). But it will also rule out<br />

∧-introduction, and indeed any other inference with more than one input step. If we’re<br />

worried about risk aggregation, we shouldn’t even allow the inference to (11), which<br />

could be less probable than each of the steps that preceded it. To impose such a<br />

restriction would be to cripple natural deduction.<br />
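The probabilistic worry behind the objection can be made concrete with a toy calculation (our sketch; the independence assumption is ours, not the text's): even if each instance of the universal is 99% probable, the universal itself can be quite improbable.<br />

```python
# Toy model of risk aggregation (illustrative only): if each of n
# independent instances has probability 0.99, the conjunction
# (the universal claim) has probability 0.99 ** n.
def prob_conjunction(p_instance, n):
    return p_instance ** n

# Find the smallest n at which the universal drops below 1/2.
n = 1
while prob_conjunction(0.99, n) >= 0.5:
    n += 1
print(n)  # 69
```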

A determined Bayesian might agree at this point. Such a Bayesian will say that<br />

the problem here is that we haven’t reasoned probabilistically all along. Perhaps<br />

that’s right. But if that’s so, then R99, and all other ampliative rules like it, must be<br />

scrapped. Really the Bayesian should want us to infer from Over 99% of Xs are Ys and<br />

a is X that the probability that a is Y is over 0.99. And that doesn’t lead to disaster.<br />

But that just is to deny the existence of ampliative inference rules. What we’ve ended<br />

up with by following through this objection is a much more negative position than<br />

the one taken in this paper. I’ve argued that ampliative inference rules can’t be applied<br />

inside the scope of suppositions. The Bayesian, or at least the Bayesian who objects<br />

to the use of ∀-introduction at (14), thinks they can’t be used anywhere. Now such<br />

a Bayesian may go on to say why this isn’t incompatible with inductive knowledge,<br />

or they might entreat us to do away with traditional notions like ‘knowledge’ and<br />

replace them with notions like ‘high probability’. Either way, it isn’t a threat to the<br />

position argued here.<br />

We can make the argument of the last two paragraphs clearer by considering five<br />

distinct positions about ampliative inference.<br />

1. There is no cogent ampliative inference, and hence all knowledge consists of deductive<br />

consequences of facts of which we are ‘directly’ aware. Depending on how<br />

liberal we are with this notion of directness, this kind of position will allow<br />

quite a bit of knowledge gained through perception, testimony, memory and<br />

other sources, but it does not allow non-trivial knowledge about the future.<br />

2. There is no cogent ampliative inference, but we can gain knowledge about the<br />

future. That’s because we know (not via prior ampliative inference) various<br />

conditionals of the form If the past is this way then the future will be that way,<br />

and via such conditionals and non-ampliative inferential rules we can deduce<br />

facts about the future.<br />

3. There is cogent ampliative inference, but it is not rule-governed the way non-ampliative<br />

inference appears to be. This position is a kind of particularism<br />

about ampliative inference.<br />

4. There is cogent and rule-governed ampliative inference, but ampliative inferential<br />

rules behave differently inside and outside the scope of suppositions. In<br />

this respect, the rules are like ∀-introduction and necessitation, which have<br />

constraints on when they can be applied, and unlike, say, ∧-elimination.<br />

5. There is cogent and rule-governed ampliative inference, and ampliative inferential<br />

rules do not behave differently inside and outside the scope of suppositions.



In particular, we can use ampliative inferential rules inside the scope of suppositions<br />

in order to generate contingent a priori knowledge of conditionals.<br />

My aim here has been to argue against option 5. I take option 1 to be highly implausible,<br />

though it isn’t entirely without adherents. The overall tenor of my remarks<br />

has been to push towards option 4, but I haven’t said anything against options 2 and<br />

3. Now if we try to fit the Bayesian into this framework, I think it is clear that they<br />

have a version of option 2. Updating by conditionalisation is just the probabilistic<br />

equivalent of updating by →-elimination; both the person who believes in option 2<br />

and the devotee of conditionalisation think the conditional structure of our thought<br />

is epistemologically prior to empirical evidence, and the role of evidence is to move<br />

us within this structure. I have deep doubts about this position, but those doubts are<br />

irrelevant to this paper. The point here is to argue against option 5. And a probabilistic,<br />

or Bayesian, objection to my argument isn’t really of any help, because once<br />

we take the Bayesian position on board we end up with a more radical objection to<br />

option 5 than mine, i.e., we end up with option 2.<br />
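The analogy between conditionalisation and →-elimination can be sketched in a few lines (a toy model of ours; the worlds and numbers are invented for illustration): the prior fixes the conditional structure, and evidence about the past simply moves us within it.<br />

```python
# Toy conditionalisation (illustrative): a prior over four "worlds",
# each a (past, future) pair. Learning which way the past went plays
# the role of ->-elimination on an antecedently known conditional.
prior = {
    ("past-this-way", "future-that-way"): 0.4,
    ("past-this-way", "future-otherwise"): 0.1,
    ("past-other-way", "future-that-way"): 0.1,
    ("past-other-way", "future-otherwise"): 0.4,
}

def conditionalise(prior, past):
    # Keep the worlds consistent with the evidence, then renormalise.
    kept = {w: p for w, p in prior.items() if w[0] == past}
    total = sum(kept.values())
    return {w: p / total for w, p in kept.items()}

post = conditionalise(prior, "past-this-way")
print(post[("past-this-way", "future-that-way")])  # 0.8
```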

It might be argued though that this defence of line (14) is too theoretical. The<br />

problem is not that (14) makes some particular probabilistic error. Rather, the problem<br />

is that the conclusion is absurd, and one of the rules must be false. Since the<br />

steps of ∀-introduction at lines (14) and (19) are the least plausible steps intuitively,<br />

we should locate the error there. This is an important objection, but I think it is a<br />

misdiagnosis of the problem.<br />

For one thing, dropping ∀-introduction in these cases is very odd as well. It’s<br />

quite counterintuitive to say that for any given object o we can derive that it satisfies<br />

λx.(99(FH, ¬G) ∧ 99(F,G)) → ¬(Fx ∧ Hx), and we can know this, but we can’t go<br />

on to infer ∀x((99(FH, ¬G) ∧ 99(F,G)) → ¬(Fx ∧ Hx)). We might ask what we’re<br />

waiting for? For another thing, this seems to locate the mistake too late in the proof.<br />

It’s very odd that we can infer, for any predicates F, G, H and any object o, that o<br />

satisfies λx.(99(FH, ¬G) ∧ 99(F,G)) → ¬(Fx ∧ Hx). Even if we are barred for some<br />

reason from collecting these judgments into a universal claim, the fact that we can make<br />

each of them seems already too quick. 2<br />

And, at risk of blatantly begging questions, it seems to me we should be very<br />

suspicious of the quick argument for contingent a priori justification from the first<br />

paragraph. This is not to say that there aren’t any arguments for the contingent a<br />

priori. Perhaps reflections on the nature of natural kinds, and our relation to them,<br />

could motivate the contingent a priori à la Kripke (1980). Or perhaps reflections<br />

we see in papers such as Wright (2004) or White (2006) could drive us to think that, to<br />

avoid external world scepticism, we need some kind of contingent a priori. But the<br />

contingent a priori has always been controversial in philosophy, so a view that makes<br />

2 The oddity of these conclusions is why the big proof early in the paper goes via H rather than inferring<br />

99(F,G) → ∀x(F x → Gx) directly from line (5), and then, via a second-order universal introduction,<br />

inferring that there must be at most 100 objects. Many people will think that if there is a problem here,<br />

it is in one of the steps of universal quantification. Indeed, some people I’ve spoken to think the intuitive<br />

problem with that argument is merely the second-order generalisation, since 99(F,G) → ∀x(F x → Gx)<br />

is a priori justified. I think that’s wrong, but I wanted an argument where the first line before a universal<br />

introduction was more clearly unintuitive.



one side of the controversy obviously correct is counterintuitive. Since the ability<br />

to use rules like R99 inside the scope of suppositional reasoning would make one<br />

side of the debate trivially correct—line (5) is already an example of the contingent a<br />

priori—that suggests the rule is counterintuitive.<br />

An alternative objection is that R99 is too strong, because it doesn’t restrict its<br />

scope to projectable predicates. It isn’t immediately obvious such a restriction is<br />

needed. After all, R99 is really a form of injection not projection, since we are inferring<br />

from things we know about the class to things we don’t antecedently know about<br />

the individual. But perhaps this argument shows that, despite this fact, we need a<br />

projectability-like restriction on statistical rules like R99.<br />

It would be too restrictive to say that R99 applies only when F and G are projectable.<br />

Consider a case where F and ¬G are projectable, but G is not. 3 And assume<br />

we know that 99(F,G). Then we know that less than 1% of F s are ¬Gs. And, since F<br />

and ¬G are projectable, that presumably means we have good reason to believe that<br />

the next F will not be ¬G, i.e., it will be a G. So if F and ¬G are projectable, then<br />

R99 looks like a good rule.<br />

And this is enough to lead to disaster. Let F be is an animal. Let G be any<br />

predicate of the form is not an S, where S is a species, and let H be S. I assume that<br />

is an animal and S are projectable; in any case, they are predicates that we project<br />

with all the time. Then the kind of reasoning above lets us get to line (18), which says<br />

(after substitutions) 99(F, ¬S) → ¬99(FS, S). But ¬99(FS, S) is true only if there are<br />

no FSs, i.e., there are no Ss. So there can’t be an extant species such that more than<br />

99% of animals are from other species. And from that it follows immediately that<br />

there can be at most 100 species. But it is absurd to have an a priori argument that<br />

there are at most 100 species of animal in the world. So even a restricted version of<br />

R99, one that is sensitive to projectability considerations, still yields an absurd result.<br />

I claim the absurdity is from applying R99 inside suppositional reasoning.<br />
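The arithmetic behind the final step is a simple pigeonhole point, which can be checked exactly (our illustration): if no species can have over 99% of animals outside it, each species must contain at least 1% of all animals, and at most 100 such shares fit into a whole.<br />

```python
from fractions import Fraction

# Pigeonhole check (illustrative): how many species, each with a share of
# at least 1% of the animals, can coexist? Fraction keeps the arithmetic
# exact, avoiding floating-point slack at the boundary.
min_share = Fraction(1, 100)
count = 0
while (count + 1) * min_share <= 1:
    count += 1
print(count)  # 100
```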

Finally, some discussants have argued that it is counterintuitive if we can’t, in<br />

everyday situations, know claims of the form A → B, where in our actual situation A<br />

would be outstanding, if non-conclusive, evidence for B. But the fact that we can’t use<br />

ampliative rules in suppositional reasoning hardly entails that conclusion. For one<br />

thing, often A is outstanding evidence for B because we antecedently know A → B.<br />

For another, we can sometimes deduce that it is rational to believe A → B. If it is<br />

generally acceptable to infer A → B from the rationality of believing A → B, and<br />

I think it is, then we can derive A → B while only using an ampliative rule in a<br />

non-suppositional context. That is the form of argument that’s at the centre of the<br />

reasoning in Weatherson (2005b), and it isn’t threatened here. So the restriction I’m<br />

suggesting doesn’t yield any kind of invidious scepticism.<br />

The upshot of these reflections is that there is no plausible position which holds<br />

that rules like R99 can be applied inside the scope of a supposition. Either the argument<br />

here shows that such a use of R99 leads to absurdity, or it is a mistake to<br />

think of rules like R99 as rules of inference, rather than shorthands for probabilistic<br />

3 In general I think the negations of projectable predicates are not projectable.



rules. And if that’s right, then the quick argument for contingent a priori knowledge<br />

discussed in the first paragraph can’t succeed.


Part III<br />

Language


Epistemic Modals in Context<br />

Andy Egan, John Hawthorne, Brian Weatherson<br />

Abstract<br />

A very simple contextualist treatment of a sentence containing an epistemic<br />

modal, e.g. a might be F, is that it is true iff for all the contextually<br />

salient community knows, a is F. It is widely agreed that the simple theory<br />

will not work in some cases, but the counterexamples produced so<br />

far seem amenable to a more complicated contextualist theory. We argue,<br />

however, that no contextualist theory can capture the evaluations<br />

speakers naturally make of sentences containing epistemic modals. If we<br />

want to respect these evaluations, our best option is a relativist theory<br />

of epistemic modals. On a relativist theory, an utterance of a might be<br />

F can be true relative to one context of evaluation and false relative to<br />

another. We argue that such a theory does better than any rival approach<br />

at capturing all the behaviour of epistemic modals.<br />

In the 1970s David Lewis argued for a contextualist treatment of modals (Lewis,<br />

1976a, 1979f). Although Lewis was primarily interested in modals connected with<br />

freedom and metaphysical possibility, his arguments for contextualism could easily<br />

be taken to support contextualism about epistemic modals. In the 1990s Keith<br />

DeRose argued for just that position (DeRose, 1991, 1998).<br />

In all contextualist treatments, the method by which the contextual variables get<br />

their values is not completely specified. For contextualist treatments of metaphysical<br />

modality, the important value is the class of salient worlds. For contextualist<br />

treatments of epistemic modality, the important value is which epistemic agents are<br />

salient. In this paper, we start by investigating how these values might be generated,<br />

and conclude that it is hard to come up with a plausible story about how they are<br />

generated. There are too many puzzle cases for a simple contextualist theory to be<br />

true, and a complicated contextualist story is apt to be implausibly ad hoc.<br />

We then look at what happens if we replace contextualism with relativism. On<br />

contextualist theories the truth of an utterance type is relative to the context in which<br />

it is tokened. On relativist theories, the truth of an utterance token is relative to the<br />

context in which it is evaluated. Many of the puzzles for contextualism turn out to<br />

have natural, even elegant, solutions given relativism. We conclude by comparing<br />

two versions of relativism.<br />

We begin with a puzzle about the role of epistemic modals in speech reports.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in<br />

Gerhard Preyer and Georg Peter (eds) Contextualism in Philosophy, OUP 2005, pp. 131-69. Thanks to<br />

Keith DeRose, Kai von Fintel, Ernie Lepore, Jason Stanley and especially Tamar Szabó Gendler and John<br />

MacFarlane for helpful discussions and suggestions for improvement.


Epistemic Modals in Context 266<br />

1 A Puzzle<br />

The celebrity reporter looked discomforted, perhaps because there were so few celebrities<br />

in Cleveland.<br />

“Myles”, asked the anchor, “where are all the celebrities? Where is Professor<br />

Granger?”<br />

“We don’t know,” replied Myles. “She might be in Prague. She was planning<br />

to travel there, and no one here knows whether she ended up there or whether she<br />

changed her plans at the last minute.”<br />

This amused Professor Granger, who always enjoyed seeing how badly wrong<br />

CNN reporters could be about her location. She wasn’t sure exactly where in the<br />

South Pacific she was, but she was certain it wasn’t Prague. On the other hand,<br />

it wasn’t clear what Myles had gotten wrong. His first and third sentences surely<br />

seemed true: after all, he and the others certainly didn’t know where Professor<br />

Granger was, and she had been planning to travel to Prague before quietly changing<br />

her destination to Bora Bora.<br />

The sentence causing all the trouble seemed to be the second: “She might be in<br />

Prague.” As she wiggled her toes in the warm sand and listened to the gentle rustling<br />

of the palm fronds in the salty breeze, at least one thing seemed clear: she definitely<br />

wasn’t in Prague – so how could it be true that she might be? But the more she<br />

thought about it, the less certain she became. She mused as follows: when I say<br />

something like x might be F, I normally regard myself as speaking truly if neither I<br />

nor any of my mates know that x is not F. And it’s hard to believe that what goes for<br />

me does not go for this CNN reporter. I might be special in many ways, but I’m not<br />

semantically special. So it looks like Myles can truly say that I might be in Prague<br />

just in case neither he nor any of his mates knows that I am not. And I’m sure none<br />

of them knows that, because I’ve taken great pains to make them think that I am, in<br />

fact, in Prague – and reporters always fall for such deceptions.<br />

But something about this reasoning rather confused Professor Granger, for she<br />

was sure Myles had gotten something wrong. No matter how nice that theoretical<br />

reasoning looked, the fact was that she definitely wasn’t in Prague, and he said that<br />

she might be. Trying to put her finger on just where the mistake was, she ran through<br />

the following little argument.<br />

(1) When he says, “She might be in Prague” Myles says that I might be in Prague. 1<br />

(2) When he says, “She might be in Prague” Myles speaks truly iff neither he nor<br />

any of his mates know that I’m not in Prague.<br />

(3) Neither Myles nor any of his mates know that I’m not in Prague.<br />

(4) If Myles speaks truly when he says that I might be in Prague, then I might be<br />

in Prague.<br />

(5) I know I’m not in Prague.<br />

(6) It’s not the case that I know I’m not in Prague if I might be in Prague.<br />

1 Some of Professor Granger’s thoughts sound a little odd being in the present tense, but as we shall see,<br />

there are complications concerning the interaction of tense with epistemic modals, so for now it is easier<br />

for us to avoid those interactions.



There must be a problem here somewhere, she thought – for (1) – (6) are jointly<br />

inconsistent. (Quick proof: (2) and (3) entail that Myles speaks truly when he says,<br />

“She might be in Prague”. From that and (1) it follows he speaks truly when he says<br />

Professor Granger might be in Prague. From that and (4) it follows that Professor<br />

Granger might be in Prague. And that combined with (5) is obviously inconsistent<br />

with (6).) But wherein lies the fault? Unless some fairly radical kind of scepticism is<br />

true, Professor Granger can know by observing her South Pacific idyll that she’s not<br />

in Prague – so (5) looks secure. And it seems pretty clear that neither Myles nor any<br />

of his mates know that she’s not in Prague, since they all have very good reason to<br />

think that she is – so it looks like (3) is also OK. But the other four premises are all<br />

up for grabs.<br />

Which exactly is the culprit is a difficult matter to settle. While the semantic<br />

theory underlying the reasoning in (1)-(6) is mistaken in its details, something like<br />

it is very plausible. The modal ‘might’ here is, most theorists agree, an epistemic<br />

modal. So its truth-value should depend on what someone knows. But who is this<br />

someone? If it is Myles, or the people around him, then the statement “she might<br />

be in Prague” is true, and it is unclear where to block the paradox. If it is Professor<br />

Granger, or the people around her, then the statement is false, but now it is unclear<br />

why a competent speaker would ever use this kind of epistemic modal. Assuming<br />

the someone is Professor Granger, and assuming Professor Granger knows where she<br />

is, then “Granger might be in Prague” will be true iff “Granger is in Prague” is true.<br />

But this seems to be a mistake. Saying “Granger might be in Prague” is a way to<br />

weaken one’s commitments, which it could not be if the two sentences have the same<br />

truth conditions under plausible assumptions. So neither option looks particularly<br />

promising.<br />

To make the problem even more pressing, consider what happens if a friend of<br />

Professor Granger’s who knows she is in the South Pacific overhears Myles’s comment.<br />

Call this third party Charles. It is prima facie very implausible that when<br />

Myles says that Professor Granger might be in Prague he means to rule out that<br />

Charles knows that she is not. After all, Charles is not part of the conversation, and<br />

Myles need not even know that he exists. So if Myles knows what he is saying, what<br />

he is saying could be true even if Charles knows Professor Granger is not in Prague.<br />

But if Charles knows this, Charles cannot regard Myles’s statement as true, else he<br />

will conclude that Professor Granger might be in Prague, and he knows she is not.<br />

So things are very complicated indeed.<br />

In reasoning as we have been, we have been assuming that the following inferences<br />

are valid.<br />

(7) A competent English speaker says It might be that S; and<br />

(8) S, on that occasion of use, means that p; entail<br />

(9) That speaker says that it might be that p<br />

Further, (9) plus<br />

(10) That speaker speaks truly; entail



(11) It might be that p<br />

If Charles accepts the validity of both of these inferences, then he is under considerable<br />

pressure to deny that Myles speaks truly. And it would be quite natural<br />

for him to do so – for instance, by interrupting Myles to say, “That’s wrong.<br />

Granger couldn’t be in Prague, since she left on the midnight flight to Tahiti.” But<br />

it’s very hard to find a plausible semantic theory that backs up this intervention, although<br />

such reactions are extremely common. (To solidify intuitions, here is another<br />

example: I overhear you say that a certain horse might have won a particular race. I<br />

happen to know that the horse is lame. I think: you are wrong to think that it might<br />

have won.) 2<br />

Our solutions to this puzzle consist in proposed semantic theories for epistemic<br />

modals. We start with contextualist solutions, look briefly at invariantist solutions,<br />

and conclude with relativist solutions. Although we will look primarily at the costs<br />

and benefits of these theories with respect to intuitions about epistemic modals, it<br />

is worth remembering that they differ radically in their presuppositions about what<br />

kind of theory a semantic theory should be. Solving the puzzles to do with epistemic<br />

modals may require settling some of the deepest issues in philosophy of language.<br />

2 Contextualist Solutions<br />

In his (1991), Keith DeRose offers the following proposal:<br />

S’s assertion “It is possible that P” is true if and only if (1) no member of<br />

the relevant community knows that P is false, and (2) there is no relevant<br />

way by which members of the relevant community can come to know<br />

that P is false. (593-4)<br />

DeRose intends ‘possible’ here to be an epistemic modal, and the proposal is meant<br />

to cover all epistemic modals, including those using ‘might’. 3 We will not discuss<br />

2 Note that it also seems implausible to say that this is an instance of metalinguistic negation, as discussed<br />

in Horn (1989). When Charles interrupts Myles to object, the objection isn’t that the particular form of<br />

words that Myles has chosen is inappropriate. The form of words is fine, and Myles’ utterance would be<br />

completely unobjectionable if Charles’s epistemic state were slightly different. What’s wrong is that Myles<br />

has used a perfectly acceptable form of words to say something that’s false (at least by Charles’ lights—more<br />

on this later). We also think it’s implausible to understand the ‘might’ claims in question here as claims of<br />

objective chance or objective danger.<br />

3 We take the puzzle to be a puzzle about sentences containing epistemic modal operators, however they<br />

are identified. We are sympathetic with DeRose’s (1998) position that many sentences containing ‘might’<br />

and ‘possible’ are unambiguously epistemic, but do not wish to argue for that here. Rather, we simply take<br />

for granted that a class of sentences containing epistemic modal operators has been antecedently identified.<br />

There are two differences between ‘possible’ and ‘might’. The first seems fairly superficial. Sentences<br />

where might explicitly takes a sentence, rather than a predicate, as its argument are awkward at best, and<br />

may be ungrammatical. It is possible that Professor Granger is in Prague is much more natural than It might<br />

be the case that Professor Granger is in Prague, but there is no felt asymmetry between Professor Granger is<br />

possibly in Prague and Professor Granger might be in Prague. We will mostly ignore these issues here, and<br />

follow philosophical orthodoxy in treating epistemic modals as being primarily sentence modifiers rather<br />

than predicate modifiers. The syntactic features of epistemic modals are obviously important, but we’re<br />

fairly confident that the assumption that epistemic modals primarily operate on sentences does not bear



here the issues that arise under clause (2) of DeRose’s account, since we’ll have quite<br />

enough to consider just looking at whether clause (1) or anything like it is correct. 4<br />
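Clause (1) can be rendered as a toy truth condition (a schematic sketch of ours, not DeRose's formalism): an utterance of a might be F is true iff no member of the relevant community knows that a is not F.<br />

```python
# Toy version of clause (1) (illustrative): truth of "a might be F"
# relative to a choice of community.
def might_true(community, knows_a_not_F):
    # True iff no community member knows that a is not F.
    return not any(member in knows_a_not_F for member in community)

# The paper's case: Granger knows she is not in Prague; Myles's crew does not.
knows_not_in_prague = {"Granger", "Charles"}
print(might_true({"Myles", "CNN crew"}, knows_not_in_prague))  # True
print(might_true({"Myles", "Granger"}, knows_not_in_prague))   # False
```

The puzzle is then visible in the code: the same utterance comes out true or false depending on which community the context supplies.<br />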

In our discussion below, we consider three promising versions of contextualist<br />

theory. What makes the theories contextualist is that they all say that Myles spoke<br />

truly when he said “She might be in Prague”, but hold that if Professor Granger<br />

had repeated his words she would have said something false. 5 And the reason for<br />

the variation in truth-value is just that Myles and Professor Granger are in different<br />

contexts, which supply different relevant communities. Where the three theories<br />

differ is in which constraints they place on how context can supply the community<br />

in question.<br />

The first is the kind of theory that DeRose originally proposed. On this theory,<br />

there is a side constraint that the relevant community always includes the speaker:<br />

whenever S truly utters a might be F, S does not know that a is not F. We’ll call this the<br />

speaker-inclusion constraint, or sometimes just speaker-inclusion. There is some quite<br />

compelling evidence for speaker-inclusion. Consider, for example, the following sort<br />

of case: Whenever Jack eats pepperoni pizza, he forgets that he has ten fingers, and<br />

thinks “I might only have eight fingers.” Jill (who knows full well that Jack has ten<br />

fingers) spots Jack sitting all alone finishing off a pepperoni pizza, and says, “He<br />

might have eight fingers.” Jill has said something false. And what she’s said is false<br />

because it’s not compatible with what she knows that Jack has eight fingers. But if the<br />

relevant community could ever exclude the speaker, one would think it could do so<br />

here. After all, Jack is clearly contextually salient: he’s the referent of ‘he,’ the fingers<br />

in question are on his hand, and no one else is around. 6 Now, a single case does not<br />

any theoretical load here, and could be replaced if necessary.<br />

The other difference will be relevant to some arguments that follow. ‘Might’ can interact with tense<br />

operators in a way that ‘possible’ does not. It might have rained could either mean MIGHT (WAS it rains)<br />

or WAS (MIGHT it rains), while It possibly rained unambiguously means POSSIBLY (WAS it rains). It is<br />

often hard in English to tell just which meaning is meant when a sentence contains both tense operators<br />

and epistemic modals, but in Spanish these are expressed differently: Puede haber llovido; Podría haber<br />

llovido.<br />

4 There are four kinds of cases where something like DeRose’s clause (2) could be relevant.<br />

First, Jack and Jill are in a conversation, and Jack knows p while Jill knows p → ¬Fa. In this case<br />

intuitively neither could truly say a might be F even though neither knows a is not F.<br />

Second, there are infinitely many mathematicians discussing Fermat’s Last Theorem. The first knows<br />

just that it has no solutions for n=3, the second just that it has no solutions for n=4, and so on. Intuitions<br />

are (unsurprisingly) weaker here, but we think none of them could say Fermat’s Last Theorem might have<br />

solutions, because the group’s knowledge rules this out.<br />

Third, if S was very recently told that a is not F, but simply forgot this, then intuitively she speaks falsely<br />

if she says a might be F.<br />

Fourth, if S has the materials for easily coming to know P from her current knowledge, but has not<br />

performed the relevant inference, then we might be inclined (depending on how easy the inferential steps<br />

were to see and so on) to say that she is wrong to utter ‘It might be that not P’.<br />

Rather than try and resolve the issues these cases raise, we will stick to cases where the only thing that<br />

could make a might be F false is that someone knows that a is not F.<br />

5 She would also have violated some pragmatic principles by knowingly using a third-person pronoun<br />

to refer to herself, but we take it those principles are defeasible, and violation of them does not threaten<br />

the truth-aptness of her utterance.<br />

6 Notice that intuitions do not change if we alter the case in such a way that Jack has a strange disorder<br />

that makes it very hard for him to come to know how many fingers he has. Thus clause (2) of DeRose’s


Epistemic Modals in Context 270<br />

prove a universal 7 – but the case does seem to provide good prima facie evidence for<br />

DeRose’s constraint.<br />

One implication of DeRose’s theory is that (1) is false, at least when Professor<br />

Granger says it. For when Professor Granger reports that Myles says “She might be<br />

in Prague,” she is reporting a claim he makes about his epistemic community – that<br />

her being in Prague is compatible with the things that they know. But when she says<br />

(in the second clause) that this means he is saying that she might be in Prague, she<br />

speaks falsely. For in her mouth the phrase “that I might be in Prague” denotes the<br />

proposition that it’s compatible with the knowledge of an epistemic community that<br />

includes Professor Granger (as the speaker) that Professor Granger is in Prague. And<br />

that is not a proposition that Myles assented to. So DeRose’s theory implies that the<br />

very intuitive (1) is false when uttered by Granger.<br />

(1) When he says, “She might be in Prague” Myles says that I might be in Prague.<br />

It is worth emphasizing how counterintuitive this consequence of speaker-inclusion<br />

is. If the speaker-inclusion constraint holds universally then in general speech involving<br />

epistemic modals cannot be reported disquotationally. But notice how natural it<br />

is, when telling the story of Jack and Jill, to describe the situation (as we ourselves<br />

did in an earlier draft of this paper) as being one where “Whenever Jack eats pepperoni<br />

pizza, he forgets that he has ten fingers, and thinks he might only have eight.”<br />

Indeed, it is an important generalization about how we use language that speakers<br />

usually do not hesitate to disquote in reporting speeches using epistemic modals. So<br />

much so that exceptions to this general principle are striking – as when the tenses of<br />

the original speech and the report do not match up, and the tense difference matters<br />

to the plausibility of the attribution.<br />

One might try to explain away the data just presented by maintaining a laxity for<br />

‘says that’ reports. A chemist might say ‘The bottle is empty’ meaning it is empty<br />

of air, while a milkman might utter the same sentence, meaning that<br />

it is empty of milk. Nevertheless, the milkman might be slightly ambivalent about<br />

denying:<br />

When the chemist says ‘The bottle is empty’, she says that the bottle is<br />

empty.<br />

And this is no doubt because the overt ‘says that’ construction frequently deploys<br />

adjectives and verbs in a rather quotational way. After all, the chemist could get away<br />

with the following speech in ordinary discourse: “I know the milkman said that the<br />

bottle is empty. But he didn’t mean what I meant when I said that the bottle is empty.<br />

When he said that the bottle was empty he meant that it was empty of milk.” 8 Thus<br />

the conventions of philosophers for using ‘say that’ involve regimenting ordinary use<br />

analysis cannot do the work of the relevant side constraint.<br />

7 And see the case of Tom and Sally in the maze below for some countervailing evidence.<br />

8 Notice that this use prohibits the inference from: The speaker said that the bottle was empty, to, The<br />

speaker expressed the proposition/said something that meant that the bottle was empty.



in a certain direction. 9 But the disquotational facts that we are interested in cannot<br />

be explained away simply by invoking these peculiarities of ‘says that’ constructions,<br />

for the same disquotational ease surrounds the relevant belief reports. In the case just<br />

considered, while we might argue about whether it was acceptable for the chemist<br />

to say, in her conversational context, “The milkman said that the bottle was empty”,<br />

it is manifestly unacceptable for her to say “The milkman believes that the bottle is<br />

empty”. This contrasts with the case of ‘might’: If someone asked Professor Granger<br />

where Myles thought she was, she could quite properly have replied with (12).<br />

(12) He thinks that/believes that I might be in Prague.<br />

Indeed, we in general tend to find the following inference pattern – a belief-theoretic<br />

version of (7) to (9) above – compelling:<br />

(i) A competent English speaker sincerely asserts It might be that S<br />

(ii) S, in that context of use, means that p; therefore,<br />

(iii) That speaker believes that it might be that p<br />

Our puzzle cannot, then, be traced simply to a laxity in the ‘says that’ construction. 10<br />

Whatever the puzzle comes to, it certainly runs deeper than that.<br />

Notice that (12) does not suggest that Myles thinks that for all Professor Granger<br />

knows, she is in Prague; it expresses the thought that Myles thinks that for all he<br />

knows, that is where she is. Moreover, this is hardly a case where Granger’s utterance<br />

is of doubtful appropriateness: (12) is one of the ways canonically available for<br />

Granger to express that thought. But if we assume that what is reported in a belief<br />

report of this kind is belief in the proposition the reporter expresses by I might be in<br />

Prague, and we assume a broad-reaching speaker-inclusion constraint, we must concede<br />

that the proposition Granger expresses by uttering (12) is that Myles believes<br />

that for all Professor Granger knows, Professor Granger is in Prague.<br />

If the speaker-inclusion constraint holds universally, then anyone making such a<br />

report is wrong. There are two ways for this to happen—either they know what the<br />

sentences they’re using to make the attributions mean, and they have radically false<br />

views about what other people believe, or they have non-crazy views about what<br />

people believe, but they’re wrong about the meanings of the sentences they’re using.<br />

The first option is incredibly implausible. So our first contextualist theory needs<br />

to postulate a widespread semantic blindness; in general speakers making reports<br />

are mistaken about the semantics of their own language. In particular, it requires<br />

that such speakers are often blind to semantic differences between sentence tokens<br />

involving epistemic modals. It is possible that some theories that require semantic<br />

blindness are true, but other things being equal we would prefer theories that do<br />

9 We are grateful for correspondence with John MacFarlane here.<br />

10 For what it’s worth, we also note that ‘S claimed that P’ has less laxity (of the sort being discussed)<br />

than ‘S said that P’.



not assume this. 11 In general the burden of proof is on those who think that the<br />

folk don’t know the meaning of their own words. More carefully: the burden of<br />

proof is on those who think that the folk are severely handicapped in their ability to<br />

discriminate semantic sameness and difference in their home language.<br />

So the plausibility of (1) counts as evidence against the first contextualist theory,<br />

and provides a suggestion for our second contextualist theory. The cases that provide<br />

the best intuitive support for the speaker-inclusion constraint, including the case we used<br />

above, involved unembedded epistemic modals. Perhaps this constraint is true for<br />

epistemic modals in simple sentences, but not for epistemic modals in ‘that’ clauses.<br />

Perhaps, that is, when S sincerely asserts X Vs that a might be F, she believes that<br />

X Vs that for all X (and her community) knows, a is F. (This is not meant as an<br />

account of the logical form of X Vs that a might be F, just an account of its truth<br />

conditions. We defer consideration of what hypothesis, if any, about the underlying<br />

syntax could generate those truth conditions.) To motivate this hypothesis, note how<br />

we introduced poor Jack, above. We said that he thinks he might have eight fingers.<br />

We certainly didn’t mean by that that Jack thinks something about our epistemic<br />

state.<br />

The other problem with the speaker-inclusion constraint is that it does not seem<br />

to hold when epistemic modals are bound by temporal modifiers, as in the following<br />

example. A military instructor is telling his troops about how to prepare for jungle<br />

warfare. He says, “Before you walk into an area where there are lots of high trees, if<br />

there might be snipers hiding in the branches, clear away the foliage with flamethrowers.”<br />

Whatever the military and environmental merits of this tactic, the suggestion<br />

is clear. The military instructor is giving generic conditional advice: in any situation<br />

of type S, if C then do A. The situation S is easy to understand: it is when the<br />

troops are advancing into areas where there are high trees. And A, too, is clear: blaze<br />

’em. But what about C? What does it mean to say that there might be snipers in the<br />

high branches? Surely not that it’s compatible with the military instructor’s knowledge<br />

that there are snipers in the high branches – he’s sitting happily in West Point,<br />

watching boats sail lazily along the Hudson. What he thinks about where the snipers<br />

are is neither here nor there. Intuitively, what he meant was that the troops should<br />

use flamethrowers if they don’t know whether there are snipers in the high branches.<br />

(Or if they know that there are.) So as well as leading to implausible claims about<br />

speech reports, the speaker-inclusion constraint seems clearly false when we consider<br />

temporal modifiers.<br />

Here is a way to deal with both problems at once. There are constraints on the<br />

application of the speaker-inclusion constraint. It does not apply when the epistemic<br />

modal is in the scope of a temporal modifier (as the flamethrower example shows) and<br />

it does not apply when the epistemic modal is in a ‘that’ clause. 12 Our second con-<br />

11 Note that the negation of semantic blindness concerning some fragment of the language is not the<br />

theory that speakers know all the semantic equivalences that hold between terms in that fragment. All we<br />

mean by the denial of semantic blindness is that speakers do not have false beliefs about the semantics of their<br />

terms.<br />

12 This theory looks like one in which propositional attitude operators become monsters, since the<br />

content of Jack thinks that Jill might be happy is naturally generated by applying the operator Jack thinks



textualist theory then accepts the speaker-inclusion constraint, but puts constraints<br />

on its application.<br />

This kind of theory, with a speaker-inclusion constraint only applying to relatively<br />

simple epistemic modals, allows us to accept (1). The problematic claim on<br />

this theory turns out to be (4):<br />

(4) If Myles speaks truly when he says that I might be in Prague, then I might be<br />

in Prague.<br />

When Myles said that Professor Granger might be in Prague, he was speaking truly.<br />

That utterance expressed a true proposition. So the antecedent of (4) is true. But<br />

the consequent is false: the “might” that appears there is not in a that-clause or in<br />

the scope of a temporal modifier; so the speaker-inclusion constraint requires that<br />

Professor Granger be included in the relevant community; and since she knows that<br />

she is not in Prague, it’s not true that she might be. We would similarly have to reject:<br />

(4 ′ ) If Myles has a true belief that I might be in Prague, then I might be in Prague.<br />

But there are reasons to be worried about this version of contextualism, beyond the<br />

uneasiness that attaches to denying (4), and, worse still, (4 ′ ). For one, this particular<br />

version of the speaker-inclusion constraint seems a bit ad hoc: why should there be<br />

just these restrictions on the relevant community? More importantly, the theory<br />

indicts certain inferential patterns that are intuitively valid. Suppose a bystander in<br />

our original example reasoned 13 :<br />

(13) [Myles] believes that it might be that [Professor Granger is in Prague].<br />

(14) [Myles]’s belief is true; therefore,<br />

(15) It might be that [Professor Granger is in Prague].<br />

But this version of contextualism tells us that while (13) and (14) are true, (15) is<br />

false. In general, there are going to be counter-intuitive results whenever we reason<br />

from cases where the speaker-inclusion constraint does not apply to cases where it<br />

does.<br />

Finally, the theory is unable to deal with certain sorts of puzzle cases. The first<br />

kind of case directly challenges the speaker-inclusion constraint for simple sentences,<br />

although we are a little sceptical about how much such a case shows. 14 Tom is stuck<br />

to the proposition that that Jill might be happy denotes when it is expressed in Jack’s context. But this is<br />

not the easiest, or obviously the best, way to look at the theory. For one thing, that way of looking at<br />

things threatens to assign the wrong content to Jack thinks that Jill might have stolen my car. The content<br />

of Jill might have stolen my car in Jack’s context is that for all Jack knows, Jill stole Jack’s car, which is<br />

not what is intended. That is to say, thinking of propositional attitude operators as monsters here ignores<br />

the special status of epistemic modals in the semantics. It is better, we think, to hold that on this theory<br />

epistemic modals are impure indexicals whose value is fixed, inter alia, by their location in the sentence as<br />

well as their location in the world. But even if this theory does not officially have monsters, the similarity<br />

to monstrous theories is worth bearing in mind as one considers the pros and cons of the theory.<br />

Thanks to Ernest Lepore for helpful discussions here.<br />

13 What follows is a belief theoretic version of Charles’ reasoning.<br />

14 A similar case to the following appears in (Hawthorne, 2004b, 27).



in a maze. Sally knows the way out, and knows she knows this, but doesn’t want to<br />

tell Tom. Tom asks whether the exit is to the left. Sally says, “It might be. It might<br />

not be.” Sally might be being unhelpful here, but it isn’t clear that she is lying. Yet if<br />

the speaker-inclusion constraint applies to unembedded epistemic modals, then Sally<br />

is clearly saying something that she knows to be false, for she knows that she knows<br />

which way is out.<br />

This case is not altogether convincing, for there is something slightly awkward<br />

about Sally’s speech here. For example, if Sally knows the exit is not to the left, then<br />

even if she is prepared to utter, “It might be [to the left],” she will not normally self-ascribe<br />

knowledge that it might be to the left. And normally speakers don’t sincerely<br />

assert things they don’t take themselves to know. So it is natural to suppose that a<br />

kind of pretense or projection is going on in Sally’s speech that may well place it<br />

beyond the purview of the core semantic theory.<br />

The following case makes more trouble for our second contextualist theory, though<br />

it too has complications. Ann is planning a surprise party for Bill. Unfortunately,<br />

Chris has discovered the surprise and told Bill all about it. Now Bill and Chris are<br />

having fun watching Ann try to set up the party without being discovered. Currently<br />

Ann is walking past Chris’s apartment carrying a large supply of party hats. She sees<br />

a bus on which Bill frequently rides home, so she jumps into some nearby bushes to<br />

avoid being spotted. Bill, watching from Chris’s window, is quite amused, but Chris<br />

is puzzled and asks Bill why Ann is hiding in the bushes. Bill says<br />

(16) I might be on that bus.<br />

It seems Bill has, somehow, conveyed the correct explanation for Ann’s dive—he’s said<br />

something that’s both true and explanatory. But in his mouth, according to either<br />

contextualist theory we have considered, it is not true (and so it can’t be explanatory)<br />

that he might have been on the bus. He knows that he is in Chris’s apartment, which<br />

is not inside the bus.<br />

Chris’s question, like most questions asking for an explanation of an action, was<br />

ambiguous. Chris might have been asking what motivated Ann to hide in the bushes,<br />

or he might have been asking what justified her hiding in the bushes. This ambiguity<br />

is often harmless, because the same answer can be given for each. This looks to<br />

be just such a case. Bill seems to provide both a motivation and a justification for<br />

Ann’s leap by uttering (16). That point somewhat undercuts a natural explanation<br />

of what’s going on in (16). One might think that what he said was elliptical for<br />

She believed that I might be on the bus. And on our second contextualist theory,<br />

that will be true. If Bill took himself to be answering a question about motivation,<br />

that might be a natural analysis. (Though there’s the underlying problem that Ann<br />

presumably wasn’t thinking about her mental states when she made the leap. She<br />

was thinking about the bus, and whether Bill would be on it.) But that analysis is less<br />

natural if we think that Bill was providing a justification of Ann’s actions. 15 And it<br />

15 Though the theory will allow for the truth of, “I might have been on that bus” (since the epistemic<br />

modal clause doesn’t occur on its own, but in the scope of a temporal operator). So if we think that (i)<br />

that’s enough to do the justificatory and explanatory work, and (ii) Bill’s utterance of “I might be on that



seems plausible that he could utter (16) in the course of providing such a justification.<br />

This suggests that (16) simply means that for all Ann knew, Bill was on that bus.<br />

Alternatively, we could say that (16) is elliptical for Because I might be on that bus, and<br />

that the speaker-inclusion constraint does not apply to an epistemic modal connected<br />

to another sentence by ‘because’. This may be right, but by this stage we imagine<br />

some will be thinking that the project of trying to find all the restrictions on the<br />

speaker-inclusion constraint is a degenerating research program, and a paradigm shift<br />

may be in order.<br />

So our final contextualist theory is that DeRose’s original semantic theory, before<br />

the addition of any sort of speaker-inclusion constraint, was correct and complete.<br />

So ‘might’ behaves like ‘local’ and ‘nearby’. If Susie says “There are snipers nearby,”<br />

the truth condition for that might be that there are snipers near Susie, or that there<br />

are snipers near us, or that there are snipers near some other contextually salient<br />

individual or group. Similarly, if she utters “Professor Granger might be in Prague”<br />

the truth condition for that might be that for all she knows Professor Granger is in<br />

Prague, or that for all we know Professor Granger is in Prague, or that for all some<br />

other community knows, Professor Granger is in Prague. There are no universal<br />

rules requiring or preventing the speaker from being included in the class of salient<br />

epistemic agents.<br />

According to the third version of contextualism, if Professor Granger does not<br />

equivocate when working through her paradox, then the problem lies with (6):<br />

(6) It’s not the case that I can know I’m not in Prague if I might be in Prague.<br />

At the start of her reasoning process, Professor Granger’s use of ‘might’ means<br />

(roughly) ‘is compatible with what Myles and his friends know’. And if it keeps that<br />

meaning to the end, then the antecedent of (6) is true, because Professor Granger<br />

might (in that sense) be in Prague, even though she knows she is not. Any attempt to<br />

show that (1) through (6) form an inconsistent set will commit a fallacy of equivocation. 16<br />

bus” is best understood as a clumsy stab at “I might have been on that bus”, then perhaps we can account<br />

for this kind of case using our second contextualist theory. Two worries: First, it is a cost of the theory that<br />

we have to reinterpret Bill’s utterance in this way, as a clumsy attempt to say something that the theory<br />

can accommodate. Second, there might be cases where the interpretation is less plausible: As a response<br />

to, “Why is Ann getting ready to jump over the hedge?”, “I might have been on that bus” sounds worse to<br />

us than “I might be on that bus”.<br />

16 The same kind of equivocation can be seen in other arguments involving contextually variable terms.<br />

Assume that Nomar lives in Boston, Derek lives in New York, and Nomar, while talking about Fenway<br />

Park in Boston says, “I live nearby.” Derek, at home in New York, hears this on television and runs<br />

through the following argument.<br />

1. In saying “I live nearby” Nomar says that he lives nearby. (Plausible disquotational premise about<br />

‘nearby’)<br />

2. Nomar speaks truly when he says “I live nearby” (Follows from the setup)<br />

3. If Nomar speaks truly when he says “I live nearby” and in saying “I live nearby” he says that he<br />

lives nearby, then he lives nearby. (I.e. if he speaks truly then what he says is true.)<br />

4. If Nomar lives nearby, then he lives in New York (Since everywhere that’s nearby to Derek’s home<br />

is in New York.); therefore<br />

5. Nomar lives in New York



But (6) as uttered by Professor Granger sounds extremely plausible. And there<br />

are other, more general problems as well. It is difficult on such a theory to explain<br />

why it is so hard to get the relevant community to exclude the speaker in present<br />

tense cases: Why, for instance, can’t Jill’s statement about Jack, “He might have eight<br />

fingers,” be a statement about Jack’s epistemic state rather than her own? The third<br />

theory offers us no guidance. 17<br />

We’ll close this section with a discussion of the interaction between syntax and<br />

semantics in these contextualist theories. As is well known, in the last decade many<br />

different contextualist theories have been proposed for various philosophically interesting<br />

terms. Jason Stanley (2000) has argued that the following two constraints<br />

should put limits on when we posit contextualist semantic theories.<br />

The right thing to say about this argument is that it equivocates. Every premise has a true reading. Perhaps<br />

every premise is true on its most natural reading, but the denotation of ‘nearby’ has to change throughout<br />

the argument for every premise to be true. The current view is that ‘might’ behaves like ‘nearby’, and that<br />

Professor Granger’s argument equivocates, like Derek’s.<br />

17 There also seems to be a past/future asymmetry about epistemic modals which the third contextualist<br />

theory will have trouble explaining. Consider this case involving past tense epistemic modals. Romeo<br />

sees Juliet carrying an umbrella home on a sunny afternoon. When he asks her why she is carrying an<br />

umbrella, she replies “It might have rained today.” There’s a scope ambiguity in Juliet’s utterance. If the<br />

epistemic modal takes wide scope with respect to the tense operator, Juliet would be claiming that she<br />

doesn’t know whether it has rained today (implicating, oddly, that this is why she now has an umbrella.)<br />

Or, as Juliet presumably intends, the temporal operator could take wide scope with respect to the epistemic<br />

modal. In that case Juliet says that it was the case at some earlier time (presumably when she left for work<br />

this morning) that it was compatible with her knowledge that it would rain today. And that seems both<br />

true and a good explanation of her umbrella-carrying.<br />

It is much harder, if it is even possible, to find cases involving future tense operators where the temporal<br />

operator takes wide scope with respect to the epistemic modal. If S says, “It might rain tomorrow”, that<br />

seems to unambiguously mean that it’s compatible with S’s current knowledge (and her community’s) that<br />

it rains tomorrow. For a more dramatic case, consider a case where two people, Othello and Desdemona,<br />

have discovered that a giant earthquake next week will destroy humanity. No one else knows this yet, but<br />

there’s nothing that can be done about it. This rather depresses them, so they decide to take memory-wiping<br />

drugs so that when they wake up tomorrow, they won’t know about the earthquake. Othello<br />

can’t say, “Tomorrow, humanity might survive,” even though it is true that tomorrow, for all anyone<br />

will know, humanity will survive. If the temporal modifier could take wide scope with respect to the<br />

epistemic modal, Othello’s utterance could have a true reading. But it does not. It’s possible at this point<br />

that our policy, announced in footnote 2, of ignoring issues relating to DeRose’s second clause will come<br />

back to haunt us. One possibility here is that tomorrow it will still be false that humanity might survive<br />

because it’s not compatible with what people tomorrow know and knew that humanity survives. We don’t<br />

think that’s what is going on, but it’s possible. Here are two quick reasons to think that the problem is not<br />

so simple. First, if Othello and Desdemona commit suicide rather than take the memory-wiping drugs, it<br />

will be compatible tomorrow with all anyone ever knew that humanity survives. But still Othello’s speech<br />

seems false. Second, it’s not obviously right that what people ever knew matters for what is epistemically<br />

possible now. Presumably at one stage Bill Clinton knew what he had for lunch on April 20, 1973. (For<br />

example, when he was eating lunch on April 20, 1973.) But unless he keeps meticulous gastronomical<br />

records, this bit of knowledge is lost to humanity forever. So there will be true sentences of the form Bill<br />

Clinton might have eaten x for lunch on April 20, 1973 even though someone once knew he did not. Now<br />

change the earthquake case so that it will happen in thirty years not a week, and no one will then know<br />

about it (because Othello and Desdemona took the memory-wiping drugs and destroyed the machines<br />

that could detect it). Still it won’t be true if Othello says, “In thirty years, humanity might survive.” This<br />

suggests to us that some kind of constraints on epistemic modals will be required. The existence of these<br />

constraints seems to refute the ‘no constraints’ version of contextualism. It also undermines the argument<br />

that the second version of contextualism is too ad hoc. Once some constraints are in place, others may be<br />

appropriate.



VARIABLE Any contextual effect on truth-conditions that is not traceable to an indexical,<br />

pronoun, or demonstrative in the narrow sense must be traceable to a<br />

structural position occupied by a variable. (Stanley, 2000, 401) 18<br />

SYNTACTIC EVIDENCE The only good evidence for the existence of a variable in<br />

the semantic structure corresponding to a linguistic string is that the string, or<br />

another that we have reason to believe is syntactically like it, has interpretations<br />

that could only be accounted for by the presence of such a variable.<br />

If any contextualist theory of epistemic modals is to be justifiably believed, then<br />

VARIABLE and SYNTACTIC EVIDENCE together entail the existence of sentences<br />

where the ‘relevant community’ is bound by some higher operator. So ideally we<br />

would have sentences like (17) with interpretations like (18).<br />

(17) Everyone might be at the party tonight.<br />

(18) For all x, it is consistent with all x knows that x will be at the party tonight.<br />

Now (17) cannot have this interpretation, which might look like bad news for the<br />

contextualist theory. It’s natural to think that if ‘might’ includes a variable whose<br />

value is the relevant community, that variable could be bound by a quantifier ranging<br />

over it. But if such a binding were possible, it’s natural to think that it would be<br />

manifested in (17). So VARIABLE and SYNTACTIC EVIDENCE together entail that<br />

we ought not to endorse contextualism about epistemic modals.<br />

This argument against contextualism fails in an interesting way, one that bears on<br />

the general question of what should count as evidence for or against a contextualist<br />

theory. The reason that any variable associated with ‘might’ in (17) cannot be bound<br />

by ‘everyone’ is that ‘might’ takes wider scope than ‘everyone’. Note that (17) does<br />

not mean (19), but rather means (20).<br />

(19) For all x, it is consistent with what we know that x will be at the party tonight.<br />

(20) It is consistent with what we know that for all x, x will be at the party tonight.<br />

As Kai von Fintel and Sabine Iatridou (2003) have shown, in any sentence of the form<br />

Every F might be G, the epistemic modal takes wide scope. For instance, (21) has no<br />

true reading if there is at most one winner of the election, even if there is no candidate<br />

that we know is going to lose.<br />

(21) Every candidate might win.<br />

18 We assume here, following Stanley, a ‘traditional syntax involving variables’ (Stanley, 2000, fn. 13). At least one of us would prefer a variable-free semantics along the lines of Jacobson (1999).<br />

Adopting such a semantics would involve, as Stanley says, major revisions to the presentation of this<br />

argument, but would not clearly involve serious changes to the argument. Most contextualists happily<br />

accept the existence of variables so we do not beg any questions against them, but see Pagin (2005) for an<br />

important exception.



More generally, epistemic modals take wide scope with respect to a wide class of<br />

quantifiers. 19 This fact is called the Epistemic Containment Principle by von Fintel<br />

and Iatridou. Even if there is a variable position for the relevant community in the<br />

lexical entry for ‘might’, this might be unbindable because the epistemic modal always<br />

scopes over a quantifier that could bind it. If that’s true then the requirement<br />

imposed by SYNTACTIC EVIDENCE is too strong. If the evidence from binding is<br />

genuinely neutral between the hypothesis that this variable place exists and the hypothesis<br />

that it does not, because there are no instances of epistemic modals that take<br />

narrow scope with respect to quantifiers, it seems reasonable to conclude that there<br />

are these variable places on the basis of other evidence.<br />

Having said all that, there still may be direct evidence for the existence of a variable<br />

position for relevant communities. Consider again our example of the military<br />

instructor, reprinted here as (22).<br />

(22) Before you walk into an area where there are lots of high trees, if there might be<br />

snipers hiding in the branches use your flamethrowers to clear away the foliage.<br />

As von Fintel and Iatridou note, it is possible for epistemic modals to take narrow<br />

scope with respect to generic quantifiers. That’s exactly what happens in (22). And<br />

it seems that the best interpretation of (22) requires a variable attached to ‘might’.<br />

Intuitively, (22) means something like (23).<br />

(23) Generally in situations where you are walking into an area where there are lots<br />

of high trees, if it’s consistent with your party’s knowledge that there are snipers<br />

hiding in the branches use your flamethrowers to clear away the foliage.<br />

The italicised your party seems to be the semantic contribution of the unenunciated<br />

variable. We are not saying that the existence of sentences like (23) shows that there<br />

are such variables in the logical form of sentences involving epistemic modals. 20 We<br />

just want to make two points here. First, if you are a partisan of SYNTACTIC EV-<br />

IDENCE, then (22) should convince you not to object to semantic accounts of epistemic<br />

modals that appeal to variables, as our contextualist theories do. Second, we<br />

note a general concern that principles like SYNTACTIC EVIDENCE presuppose that<br />

a certain kind of construction, where the contextually variable term is bound at a<br />

level like LF, is always possible. Since there are principles like the Epistemic Containment<br />

Principle, we note a mild concern that this presupposition will not always be<br />

satisfied.<br />

19 It is not entirely clear what the relevant class of quantifiers is, although von Fintel and Iatridou have<br />

some intriguing suggestions about what it might be.<br />

20 As previously noted, we are not all convinced that semantics ever needs to appeal to such variables, let<br />

alone that it does to account for the behaviour of epistemic modals.



3 Invariantist Solutions<br />

The most plausible form of invariantism about epistemic modals is that DeRose’s<br />

semantics is broadly correct, but the relevant community is not set by context; it is<br />

invariably the world. We will call this position universalism. Of course when we say a<br />

might be F we don’t normally communicate the proposition that no one in the world<br />

knows whether a is F. The analogy here is to pragmatic theories of quantifier domain<br />

restriction, according to which when we say Everyone is F, we don’t communicate the<br />

proposition that everyone in the world is F, even though that is the truth condition<br />

for our utterance.<br />

The universalist position denies (2) in Professor Granger’s argument. Myles did<br />

not speak truly when he said “Professor Granger might be in Prague” because someone,<br />

namely Professor Granger, knew she was not in Prague. Although (2) is fairly<br />

plausible, it probably has weaker intuitive support than the other claims, so this is a<br />

virtue of the universalist theory.<br />

The big advantage (besides its simplicity) of the universalist theory is that it explains<br />

some puzzle cases involving eavesdropping. Consider the following kind of<br />

case. Holmes and Watson are using a primitive bug to listen in on Moriarty’s discussions<br />

with his underlings as he struggles to avoid Holmes’s plan to trap him. Moriarty<br />

says to his assistant,<br />

(24) Holmes might have gone to Paris to search for me.<br />

Holmes and Watson are sitting in Baker Street listening to this. Watson, rather inexplicably,<br />

says “That’s right” on hearing Moriarty utter (24). Holmes is quite<br />

perplexed. Surely Watson knows that he is sitting right here, in Baker Street, which<br />

is definitely not in Paris. But Watson’s ignorance is semantic, not geographic. He<br />

was reasoning as follows. For all Moriarty (and his friends) know, Holmes is in Paris<br />

searching for him. If some kind of contextualism is true, then it seems that (24) is<br />

true in Moriarty’s mouth. And, thought Watson, if someone says something true,<br />

it’s OK to say “That’s right.”<br />

Watson’s conclusion is clearly wrong. It’s not OK for him to say “That’s right,”<br />

in response to Moriarty saying (24). So his reasoning must fail somewhere. The<br />

universalist says that where the reasoning fails is in saying the relevant community<br />

only contains Moriarty’s gang members. If we include Holmes and Watson, as the<br />

universalist requires, then Moriarty speaks falsely when he says (24).<br />

There are a number of serious (and fairly obvious) problems with the universalist<br />

account. According to universalism, the following three claims are inconsistent.<br />

(25) x might be F.<br />

(26) x might not be F.<br />

(27) Someone knows whether x is F.



Since these don’t look inconsistent, universalism looks to be false.<br />
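The inconsistency the universalist is committed to can be made explicit with a rough formalisation (our notation, a simplifying sketch rather than anything in the paper: $K_y\phi$ abbreviates ‘$y$ knows that $\phi$’, and we set aside complications about group knowledge and closure).<br />

```latex
% Universalist truth condition (sketch): "x might be F" is true
% iff no one in the world knows that x is not F.
\[
(25)\colon\ \neg\exists y\, K_y\,\neg Fx \qquad
(26)\colon\ \neg\exists y\, K_y\, Fx \qquad
(27)\colon\ \exists y\,(K_y\, Fx \vee K_y\,\neg Fx)
\]
% (25) and (26) jointly entail  \forall y\,(\neg K_y Fx \wedge \neg K_y \neg Fx),
% which contradicts (27).
```

On this reading (25) and (26) say that no one knows either way, while (27) says that someone does; hence the universalist must treat the apparent consistency of the trio as a pragmatic illusion.<br />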

The universalist’s move here has to be to appeal to the pragmatics. If (27) is true<br />

then one of (25) and (26) is false, although both might be appropriate to express in<br />

some contexts. But if we can appropriately utter sentences expressing false propositions<br />

in some contexts, then presumably we can inappropriately utter true sentences<br />

in other contexts. (Indeed, the latter possibility seems much more common.) So one<br />

could respond to the universalist’s main argument, their analysis of eavesdropping<br />

cases like Watson’s, by accepting that Watson can’t appropriately say “That’s right”<br />

but he can truly say this. The universalist will have a hard time explaining why such<br />

a theory cannot work, assuming, of course, that her own pragmatic<br />

theory can explain all the data.<br />

The major problem here is one common to all appeals to radical pragmatics in<br />

order to defend semantic theories. If universalism is true then speakers regularly, and<br />

properly, express propositions they know to be false. 21 (We assume here that radical<br />

scepticism is not true, so sometimes people know some things.) Myles knows full<br />

well that someone knows whether Professor Granger is in Prague, namely Professor<br />

Granger. But if he’s a normal English speaker, this will not seem like a reason for<br />

him to not say, “Professor Granger might be in Prague.” Some might not think this<br />

is a deep problem for the universalist theory, for speakers can be mistaken in their<br />

semantic views in ever so many ways. But many will regard it as a serious cost of the<br />

universalist claim.<br />

This problem becomes more pressing when we look at what universalism says<br />

about beliefs involving epistemic modals. Myles does not just say that Professor<br />

Granger might be in Prague, he believes it. And he believes Professor Granger might<br />

not be in Prague. If he also believes that Professor Granger knows where she is,<br />

these beliefs are inconsistent given universalism. Perhaps the universalist can once<br />

again invoke pragmatics. It is not literally true in the story that Myles believes that<br />

Granger might be in Prague. But in describing the situation we use “Myles believes<br />

that Granger might be in Prague,” to pragmatically communicate truths by a literal<br />

falsehood. This appeal to a pragmatic escape route seems even more strained than the<br />

previous universalist claims.<br />

In general, the universalism under discussion here seems to run up against a constraint<br />

on semantic theorising imposed by Kripke’s Weak Disquotation Principle.<br />

The principle says that if a speaker sincerely accepts a sentence, then she believes its<br />

semantic value. 22 If we have some independent information about what a speaker<br />

believes, then we can draw certain conclusions about the content of the sentences she<br />

accepts, in particular that she only accepts sentences whose content she believes. The<br />

universalist now has two options. 23 First, she can say that Myles here does accept<br />

inconsistent propositions. Second, she can deny the Weak Disquotation Principle,<br />

21 By “express” we will always mean “semantically express”. We’re not concerned with, and hope not to<br />

commit ourselves to any views about, for example, what’s conveyed via various pragmatic processes.<br />

22 Note that something like this had better be true if what it is to believe p is to have a sentence that<br />

means p in one’s ‘belief box’.<br />

23 We assume that it is not a serious option to deny that we ever accept unnegated epistemic modal<br />

sentences.



and say that although Myles sincerely asserts, and accepts, “Professor Granger might<br />

be in Prague” he doesn’t really believe that Professor Granger might be in Prague.<br />

Generally, it’s good to have options. But it’s bad to have options as unappealing as<br />

these. 24<br />

4 Reporting Epistemic Modals<br />

Our third class of solutions will be relatively radical, so it’s worth pausing to look<br />

at the evidence for it. Consider again the dialogue between Moriarty, Holmes and<br />

Watson. Moriarty, recall, utters (24).<br />

(24) Holmes might have gone to Paris to search for me.<br />

Watson knows that Holmes is in Baker Street, as of course does Holmes. In the above<br />

case we imagined that both Watson and Holmes heard Moriarty say this. Change the<br />

story a little so Holmes does not hear Moriarty speak, instead when he comes back<br />

into the room he asks Watson what Moriarty thinks. Watson, quite properly, replies<br />

with (30).<br />

(30) He thinks that you might have gone to Paris to search for him.<br />

This is clearly not direct quotation because Watson changes the pronouns in Moriarty’s<br />

statement. It is not as if Watson said “He sincerely said, ‘Holmes might<br />

have gone to Paris to search for me.”’ This might have been appropriate if Holmes<br />

suspected Moriarty was speaking in code so the proposition he expressed was very<br />

sensitive to the words he used.<br />

Nor was Watson’s quote a ‘mixed’ quote, in the sense of what happens in (31). 25<br />

The background is that Arnold always uses the phrase ‘my little friend’ to denote<br />

24 There is also a technical problem with universalism that mirrors one of the problems Stanley and Szabó<br />

(2000) raise for pragmatic theories of quantifier domain restriction. Normally (28) would be used to express<br />

a proposition like (29).<br />

(28) Every professor enjoys every class.<br />

(29) Every salient professor enjoys every class that s/he teaches.<br />

Intuitively, by uttering (28) we express a proposition that contains two restricted quantifiers. Let’s accept,<br />

for the sake of the argument, that a pragmatic theory of quantifier domain restriction can sometimes<br />

explain why the quantifiers in the propositions we express are more restricted than the quantifiers in<br />

the truth conditions for the sentences we use. Stanley and Szabó argue that such an explanation will<br />

not generalise to cover embedded quantifiers where the quantifier domain in the proposition expressed<br />

is bound to the outer quantifier. One such quantifier is the quantifier over classes in (28). We will not<br />

repeat their arguments here, but simply note that if they are correct, the universalist faces a problem<br />

in explaining how we use sentences with embedded epistemic modals that are (intuitively) defined with<br />

respect to a community that is bound by a higher level quantifier. As we saw, (22) provides an example of<br />

this kind of epistemic modal.<br />

25 Earlier we used speech reports to illustrate the oddities of epistemic modals inside propositional attitude<br />

ascriptions. There are well-known difficulties with connecting appropriate speech reports to the<br />

semantic content of what is said, as opposed to merely communicated. (For some discussion of these, see<br />

Soames (2002) and Cappelen and Lepore (1997).) We don’t think those difficulties affect the above arguments,<br />

where the evidence is fairly clear, and fairly overwhelming. But matters get a little more delicate<br />

in what follows, so we move to belief reports because they are more closely tied to the content of what is<br />

believed.



his Hummer H2, despite that vehicle being neither little nor friendly. No one else,<br />

however, approves of this terminology.<br />

(31) Arnold: My little friend could drive up Mt Everest.<br />

Chaz: Arnold believes his little friend could drive up Mt Everest. 26<br />

We’ve left off the punctuation here so as to not beg any questions, but there is a<br />

way this could be an acceptable report if the fourth and fifth words, and those two<br />

words only, are part of a quotation. This is clearly not ordinary direct quotation, for<br />

Arnold did not think, in English or Mentalese, “His little friend could drive up Mt<br />

Everest.” Nevertheless, this is not ordinary indirect quotation. In ordinary spoken<br />

English Chaz’s report will be unacceptable unless ‘little friend’ is stressed. The stress<br />

here seems to be just the same stress as is used in metalinguistic negation, as described<br />

in Horn (1989). Note the length of the pause between ‘his’ and ‘little’. With an<br />

ordinary pause it sounds as if Chaz is using, not mentioning, ‘little friend’. So it is<br />

possible in principle to have belief reports, like this one, that are neither strictly direct<br />

nor strictly indirect. 27 Nevertheless, it does not seem that (30) need be such a case. In<br />

particular, there need be no distinctive metalinguistic stress on ‘might’ in Watson’s<br />

utterance of (30), and such stress seems to be mandatory for this mixed report.<br />

Assuming Moriarty was speaking ordinary English, Watson’s report seems perfectly<br />

accurate. This is despite the fact that the relevant community one would naturally<br />

associate with Watson’s use of ‘might’ is quite different to the community we<br />

would associate with Moriarty’s use. When reporting speeches involving epistemic<br />

modals – and the beliefs expressed by sincere instances of such speeches – speakers can<br />

simply disquote the modal terms.<br />

As is reasonably well known, there are many terms for which this kind of disquoting<br />

report is impermissible. In every case, Guildenstern’s report of Ophelia’s<br />

utterance is inappropriate.<br />

(32) Ophelia: I love Hamlet.<br />

. . .<br />

Guildenstern: *Ophelia thinks that I love Hamlet.<br />

(33) Guildenstern: What think you of Lord Hamlet?<br />

Ophelia: He is a jerk.<br />

. . .<br />

Rosencrantz: What does Ophelia think of the King?<br />

Guildenstern: *She thinks that he is a jerk.<br />

26 In this case, as with all the belief reports discussed below, the only evidence the reporter has for the<br />

report is given by the speech immediately preceding it. We assume there is good reason from the context<br />

to assume that the speakers are sincere.<br />

27 There are somewhat delicate questions about what a direct belief report means, but we assume the<br />

notion is well enough understood, even if we could not formally explicate what is going on in all such<br />

reports.



(34) Guildenstern: Are you ready to teach the class on contextualism?<br />

Ophelia: I’m ready.<br />

. . .<br />

Rosencrantz: Does Ophelia think she is ready to defend her dissertation?<br />

Guildenstern: *She thinks she is ready.<br />

(35) (Guildenstern and Ophelia are on the telephone, Guildenstern is in Miami, and<br />

Ophelia is in San Francisco)<br />

Guildenstern: What do you like best about San Francisco?<br />

Ophelia: There are lots of wineries nearby.<br />

. . .<br />

Rosencrantz: Is it possible to grow wine in south Florida?<br />

Guildenstern: *Ophelia thinks that there are lots of wineries nearby. 28<br />

Even when the contextualist claim is not obviously true, as with ‘local’ and ‘enemy’,<br />

disquotational reports are unacceptable after context shifts.<br />

(36) (Brian is calling from Providence, Hud and Andy are in Bellingham)<br />

Brian: When I get all this work done, I’ll head off to a local bar for some drinks.<br />

Andy: How much work is there?<br />

Brian: Not much. I should get to the bar in a couple of hours.<br />

Hud: Hey, is <strong>Brian</strong> in town? Where’s he going tonight?<br />

Andy: *He thinks he’ll be at a local bar in a couple of hours.<br />

(37) The Enemy, speaking of us: The enemy have the advantage.<br />

One of us: How are we doing?<br />

Another of us: Someone just informed me that the enemy have the advantage.<br />

(38) (Terrell is an NFL player, and Dennis is his coach.)<br />

Terrell: Why are you cutting me coach?<br />

Dennis: Because you are old and slow.<br />

(After this Terrell returns to academia. Kate and Leopold are students in his<br />

department.)<br />

Kate: Do you think Terrell would do well on our department ultimate frisbee<br />

team?<br />

Leopold: ??I’m not sure. Someone thinks he’s old and slow.<br />

This data provides us with the penultimate argument against the contextualist theory<br />

of epistemic modals. We have already seen several such arguments.<br />

28 We do not say that ‘nearby’ in a speech report could never refer to the area near the location of the<br />

original speaker. Had Rosencrantz asked a question about San Francisco, and Guildenstern given the same<br />

response, that is presumably what it would have done. We just say that it does not automatically refer<br />

back to that area, and in some cases, like (35), it can refer to a quite different area. ‘Nearby’ behaves quite<br />

differently in this respect to ‘near here’, which always refers to the area near the reporter.



First, as seen through the difficulties with each of the options discussed in section<br />

2, any version of contextualism faces serious problems, though by altering the version<br />

of contextualism we are using, we can alter what problems we have to face.<br />

Second, there is nothing like the speaker-inclusion constraint for terms like ‘local’<br />

and ‘enemy’ for which contextualism is quite plausible. This disanalogy tells against<br />

the contextualist theory of ‘might’. With the right stage setting (and it doesn’t usually<br />

take very much), we can get ‘local’ and ‘enemy’ to mean local to x and enemy of x<br />

for pretty much any x we happen to be interested in talking about. At least for<br />

‘bare’ (unembedded) epistemic modals, the situation is markedly different. We can’t,<br />

just by making Jack salient, make our own knowledge irrelevant to the truth of our<br />

utterance of, for example, “Jack might have eight fingers.” The only way we can<br />

make our knowledge irrelevant is if we are using this sentence in an explanation or<br />

justification of Jack’s actions. 29<br />

Third, there is a difference in behaviour between embedded and unembedded<br />

occurrences of epistemic modals. When epistemic modals are embedded in belief<br />

contexts, conditionals, etc., they behave differently—the speaker inclusion constraint<br />

seems to be lifted. (Think about belief reports and that military instructor case.)<br />

‘Local’ and ‘enemy’ don’t seem to show any analogous difference in their behaviour<br />

between their bare and embedded occurrences.<br />

Fourth, ‘local’ and ‘enemy’ don’t generate any of the peculiar phenomena about<br />

willingness to agree. If Myles (still in Cleveland) says<br />

(39) Many local bars are full of Browns fans.<br />

Professor Granger (still in the South Pacific) will not hesitate to say “that’s right” (as<br />

long as she knows that many bars in Cleveland really are, as usual, full of Browns<br />

fans). The fact that the relevant bars aren’t local to her doesn’t interfere with her<br />

willingness to agree with (39) in the way that the fact that she knew that she wasn’t in<br />

Prague interfered with her willingness to agree with Myles’ claim that she might be in<br />

Prague, or in the way that Watson’s knowledge that Holmes was in London (should<br />

have) interfered with his willingness to assent to Moriarty’s claim that Holmes might<br />

be in Paris.<br />

Fifth, when there is a context shift, we are generally hesitant to produce belief<br />

reports by disquoting sincerely asserted sentences involving contextually variable<br />

terms. This is what the examples (32) through (36) show. For a wide range of contextually<br />

variable terms, speakers will quite naturally hesitate to make disquotational<br />

reports unless they are in the same context as the original speaker. Such hesitation is<br />

not shown by speakers reporting epistemic modals.<br />

The sixth argument, that there is an alternative theory that does not have these<br />

flaws, will have to wait until the next section. For now, let’s note that there are other<br />

words that seem at first to be contextually variable, but for which disquotational<br />

reports seem acceptable.<br />

29 And then it would probably be more natural to say “He might have eight fingers,” but that’s possibly<br />

for unrelated reasons.



(40) Vinny the Vulture: Rotting flesh tastes great.<br />

John: Vinny thinks that rotting flesh tastes great.<br />

(41) Ant Z: He’s huge (said of 5 foot 3, 141 lb NBA player Muggsy Bogues).<br />

Andy: Ant Z thinks that Muggsy’s huge.<br />

(42) Marvin the Martian: These are the same colour (said of two colour swatches<br />

that look alike to Martians but not to humans.)<br />

Brian: Marvin thinks that these are the same colour.<br />

In all three cases the report is accurate, or at least extremely natural. And in all three<br />

cases it would have been inappropriate for the reporter to continue “and he’s right”.<br />

But crucially, in none of the three cases is it clear that the original speaker made a<br />

mistake. In his context, it seems Vinny utters a truth by uttering, “Rotting flesh<br />

tastes great”, for rotting flesh does taste great to vultures. From Ant Z’s perspective,<br />

Muggsy Bogues is huge. We assume here, a little controversially, that there is a use<br />

of comparative adjectives that is not relativised to a comparison class, but rather to<br />

a perspective. Ant Z does not say that Muggsy is huge for a human, or for an NBA<br />

player, but just relative to him. And he’s right. Even Muggsy is huge relative to an<br />

ant. Note the contrast with (36) here. There’s something quite odd about Leopold’s<br />

statement, which intuitively means that someone said Terrell is old and slow for a<br />

graduate student, when all that was said was that he is old and slow for an NFL<br />

player. 30 And, relative to the Martian’s classification of objects into colours, the two<br />

swatches are the same colour. So there’s something very odd going on here.<br />

The following very plausible principle looks like it is being violated.<br />

TRUTH IN REPORTING If X has a true belief, then Y’s report X believes that S<br />

accurately reports that belief only if in the context Y is in, S expresses a true<br />

proposition. 31<br />

Not only do our three reports here seem to constitute counterexamples to TRUTH<br />

IN REPORTING, Watson’s report in (30) is also such a counterexample, if Moriarty<br />

speaks truly (and sincerely). One response here would be to give up TRUTH IN<br />

REPORTING, but that seems like a desperate measure. And we would still have the<br />

puzzle of why we can’t say “and he’s right” at the end of an accurate report.<br />

Another response to these peculiar phenomena would be to follow the universalist<br />

and conclude that Moriarty, Vinny, Ant Z and Marvin all believe something false.<br />

It should be clear how to formulate this kind of position: something tastes great iff<br />

30 Or perhaps something more specific than that, such as that he is old and slow for a player at his<br />

position.<br />

31 One might also consider a ‘says that’ version of TRUTH IN REPORTING: If X speaks truly, then Y’s<br />

report X says that S is accurate only if in the context Y is in, S expresses a true proposition. This is more<br />

questionable, since it is questionable whether ‘says that’ constructions must report what is semantically<br />

expressed by a speech, as opposed to what is merely communicated. See again the papers mentioned in<br />

footnote 25.



every creature thinks it tastes great; something is huge iff it is huge relative to all observers;<br />

and two things are the same colour iff they look alike (in a colour kind of<br />

way) to every observer (in conditions that are normal for them). As we saw, there<br />

are problems for the universalist move for epistemic modals. And the attractiveness<br />

of the other universalist moves seems to dissipate when we consider the cases from a different<br />

perspective.<br />

(43) Brian: Cognac tastes great.<br />

Vinny: Brian believes that cognac tastes great.<br />

(44) Andy: He’s huge (said of Buggsy Mogues, the shortest ever player in the Dinosaur<br />

Basketball Association).<br />

Tyrone the T-Rex: Andy believes that Buggsy’s huge.<br />

(45) John: These are the same colour (said of two colour swatches that look alike to<br />

humans but not to pigeons).<br />

Pete the Pigeon: John believes that these are the same colour.<br />

Again, every report seems acceptable, and in every case it would seem strange for the<br />

reporter to continue “and he’s right.” The universalist explanation in every case is<br />

that the original utterance is false. That certainly explains the data about reports, but<br />

look at the cost! All of our utterances about colours and tastes will turn out false,<br />

as will many of our utterances about sizes. It seems we have to find a way to avoid<br />

both contextualism and universalism. Our final suggestions for how to think about<br />

epistemic modals attempt to explain all this data.<br />

5 Relativism and Centred Worlds<br />

John MacFarlane (2003a) has argued that believers in a metaphysically open future<br />

should accept that the truth of an utterance is relative to a context of evaluation. 32<br />

For example, if on Thursday Emily says, “There will be a sea battle tomorrow”,<br />

the believer in the open future wants to say that at the time her utterance is neither<br />

determinately true nor determinately false. One quick objection to this kind of theory<br />

is that if we look back at Emily’s statement while the sea battle is raging on Friday, we<br />

are inclined to say that she got it right. From Friday’s perspective, it looks like what<br />

Emily said is true. The orthodox way to reconcile these intuitions is that the only<br />

sense in which Emily’s statement is indeterminate on Thursday is an epistemic sense –<br />

we simply don’t know whether there will be a sea battle. MacFarlane argues instead<br />

that we should simply accept the intuitions as they stand. From Friday’s perspective,<br />

32 We are very grateful in this section to extensive conversations with John MacFarlane. His (2003a)<br />

was one of the main inspirations for the relativist theory discussed here. His (ms), which he was kind<br />

enough to show us a copy of while we were drafting this paper, develops the argument for a relativist<br />

approach to epistemic modals in greater detail than we do here. Mark Richard also has work in progress<br />

that develops a relativist view on related matters, which he has been kind enough to show us, and which<br />

has also influenced our thinking.



Emily’s statement is determinately true, from Thursday’s it is not. Hence the truth<br />

of statements is relative to a context of evaluation.<br />

There is a natural extension of this theory to the cases described above. Moriarty’s<br />

statement is true relative to a context C iff it is compatible with what the<br />

people in C know that Holmes is in Paris. So in the context he uttered it, the statement<br />

is true, because it is consistent with what everyone in his context knows that<br />

Holmes is in Paris. But in the context of Watson’s report, it is false, because Watson<br />

and Holmes know that Holmes is not in Paris.<br />
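The contrast with contextualism can be displayed schematically (our notation, a simplification rather than an official semantics; $G(c)$ is the relevant community supplied by context $c$): the contextualist lets the context of utterance fix the community once and for all, while the relativist lets each context of evaluation supply its own.<br />

```latex
% Contextualism: the community is fixed by the context of utterance c_U.
\[
\llbracket \text{might } \phi \rrbracket^{c_U} = \text{True iff } \phi
\text{ is compatible with what the members of } G(c_U) \text{ know}
\]
% Relativism: truth is also relative to a context of evaluation c_E,
% and it is c_E that supplies the community.
\[
\llbracket \text{might } \phi \rrbracket^{c_U,\, c_E} = \text{True iff } \phi
\text{ is compatible with what the members of } G(c_E) \text{ know}
\]
```

On the second schema, Moriarty's utterance is true when $c_E$ is his own context but false when $c_E$ is the context of Holmes and Watson's report.<br />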

We will call any such theory of epistemic modals a relativist theory, because it says<br />

that the truth of an utterance containing an epistemic modal is relative to a context of<br />

evaluation. As we will see, relativist theories do a much better job than contextualist<br />

theories of handling the data that troubled contextualist theories. Relativist theories<br />

are also plausible for the predicates we discussed at the end of the last section: ‘huge’,<br />

‘colour’ and ‘tastes’. On such a theory, any utterance of x tastes F is true iff x tastes<br />

F to us. Similarly, an utterance of x is huge that doesn’t include a comparison class, as in<br />

(41) or (44), is true iff x is huge relative to us. And Those swatches are the same colour is<br />

true iff they look the same colour to us. The reference to us in the truth conditions<br />

of these sentences isn’t because there’s a special reference to us in the lexical entry<br />

for any of these words. Rather, the truth of any utterance involving these terms is<br />

relative to a context of evaluation, and when that is our context of evaluation, we get<br />

to determine what is true and what is false. If the sentences were being evaluated in<br />

a different context, it would be the standards of that context that mattered to their<br />

truth.<br />

So far we have not talked about the pragmatics of epistemic modals, assuming that<br />

their assertability conditions are given by their truth conditions plus some familiar<br />

Gricean norms. But it is not obvious how to apply some of those norms if utterance<br />

truth is contextually relative, because one of the norms is that one should say only<br />

what is true.<br />

One option is to say that utterance appropriateness is, like utterance truth, relative<br />

to a context of evaluation. This is consistent, but it does not seem to respect<br />

the data. Watson might think that Moriarty’s utterance is false, at least relative to<br />

his context of evaluation, 33 but if he is aware of Moriarty’s epistemic state he should<br />

think it is appropriate. So if something like truth is a norm of assertion, it must be<br />

truth relative to one or other context. But which one?<br />

We could say that one should only say things that are true relative to all contexts.<br />

But that would mean John’s statement about the two swatches being the same colour<br />

would be inappropriate, and that seems wrong.<br />

We could say that one should only say things that are true relative to some contexts.<br />

But then Brian could have said, “Rotting carcases taste great” and he would<br />

have said something appropriate, because that’s true when evaluated by vultures.<br />

33 We do not assume here that ordinary speakers, like Watson, explicitly make judgments about the<br />

truth of utterances relative to a context of evaluation, as such. They do make judgments about the truth of<br />

utterances, and those judgments are made in contexts, but they don’t explicitly make judgments of truth<br />

relative to context of evaluation. One of the nice features, however, of the relativist account is that it is<br />

possible to do an attractive rational reconstruction of most of their views in terms of contexts.


Epistemic Modals in Context 288<br />

The correct norm is that one should only say something that’s true when evaluated<br />

in the context you are in. We assume here that contexts can include more than<br />

just the speaker. If Vinny the Vulture is speaking to a group of humans he arguably<br />

cannot say Rotting flesh tastes great. The reason is that rotting flesh does not taste<br />

great to the group of speakers in the conversation, most of whom are humans. This<br />

norm gives us the nice result that Myles’s statement is appropriate, as is Moriarty’s,<br />

even though in each case their most prominent audience member knows they speak<br />

falsely. 34<br />

This helps explain, we think, the somewhat ambivalent attitude we have towards<br />

speakers who express epistemic modals that are false relative to our context, but true<br />

relative to their own. What the speaker said wasn’t true, so we don’t want to endorse<br />

what they said. Still, there’s a distinction between such a speaker and someone<br />

who says that the sky is green or that grass is blue. That speaker would violate the<br />

properly relativised version of the only say true things rule, and Myles and Moriarty<br />

do not violate that rule.<br />

As MacFarlane notes, relativist theories deny ABSOLUTENESS OF UTTERANCE<br />

TRUTH, the claim that if an utterance is true relative to one context of evaluation it<br />

is true relative to all of them. It is uncontroversial, of course, that the truth value of an<br />

utterance type can be contextually variable; the interesting claim that relativists make<br />

is that the truth value of utterance tokens can also be different relative to different<br />

contexts. So they must deny one or more premises in any argument for ABSOLUTENESS<br />

OF UTTERANCE TRUTH, such as this one.<br />

1. ABSOLUTENESS OF PROPOSITIONAL CONTENT: If an utterance expresses<br />

the proposition p relative to some context of evaluation, then it expresses that<br />

proposition relative to all contexts of evaluation.<br />

2. ABSOLUTENESS OF PROPOSITIONAL TRUTH VALUE: If a proposition p is<br />

true relative to one context in a world it is true relative to all contexts in that<br />

world; therefore,<br />

3. ABSOLUTENESS OF UTTERANCE TRUTH<br />
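The argument can be rendered semi-formally as follows. This is our reconstruction, not notation from the original; the predicate names are ours, with u ranging over utterance tokens, p over propositions, C and C′ over contexts of evaluation, and w over worlds:<br />

```latex
% Premise 1 (ABSOLUTENESS OF PROPOSITIONAL CONTENT):
\forall u\, \forall p\, \forall C, C' :\;
  \mathrm{Expresses}(u, p, C) \rightarrow \mathrm{Expresses}(u, p, C')

% Premise 2 (ABSOLUTENESS OF PROPOSITIONAL TRUTH VALUE):
\forall p\, \forall w\, \forall C, C' :\;
  \mathrm{True}(p, C, w) \rightarrow \mathrm{True}(p, C', w)

% Conclusion (ABSOLUTENESS OF UTTERANCE TRUTH): an utterance true
% relative to one context of evaluation is true relative to all of them.
\forall u\, \forall C, C' :\;
  \mathrm{True}(u, C) \rightarrow \mathrm{True}(u, C')
```

On this rendering, content relativism rejects the first premise and truth relativism rejects the second, which matches the classification given below.<br />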

This argument provides a nice way of classifying relativist theories. One relativist<br />

approach is to say that Moriarty (or anyone else who utters an epistemic modal)<br />

says something different relative to each context of evaluation. Call this approach<br />

content relativism. Another approach is to say that there is a single proposition that<br />

he expresses with respect to every context, but the truth value of that proposition<br />

is contextually variable. Call this approach truth relativism. (So that the meaning<br />

of ‘proposition’ is sufficiently understood here, let us stipulate that we understand<br />

propositions to be the things that are believed and asserted and thus, relatedly, the<br />

semantic values of ‘that’-clauses.)<br />

It might look like some of our behaviour is directly inconsistent with any sort of<br />

relativism. Consider the following dialogue.<br />

34 Can we even say that someone speaks falsely here now that truth and falsity are always relative to a<br />

context of evaluation? It turns out we can, indeed we must, although the matter is a little delicate. We<br />

return to this point below.


(46) Vinny: Rotting flesh tastes great<br />

Vinny’s brother: That’s true.<br />

John: That (i.e. what Vinny’s brother said) is not true.<br />

If what Vinny’s brother is saying is that Vinny’s utterance Rotting flesh tastes great<br />

is true in his context, then John is wrong in saying that what Vinny’s brother said<br />

isn’t true. For it is true, we claim, that Rotting flesh tastes great is true in Vinny’s<br />

context. 35 But this prediction seems unfortunate, because John’s utterance seems<br />

perfectly appropriate in his context.<br />

The solution here is to recognise a disquotational concept of truth, to go alongside<br />

the binary concept of truth that is at the heart of the relativist solution. 36 The binary<br />

concept is a relation between an utterance and a context of evaluation. Call this true B .<br />

So Vinny’s utterance is true B relative to his context, and to his brother’s context, and<br />

false B relative to John’s context. One crucial feature of the binary concept is that<br />

it is not a relativist concept. If it is true relative to one context that an utterance is<br />

true B relative to context C, it is true relative to all contexts that the utterance is true B<br />

relative to context C. The disquotational concept is unary. Call this true T . As far<br />

as is permitted by the semantic paradoxes, it claims that sentences of the form S is<br />

true T iff S will be true B relative to any context (note here the primacy of truth B for<br />

semantic explanation). True T is a relative concept. An utterance can be true T relative<br />

to C and not true T relative to C ′ . When an utterance is given the honorific true in<br />

ordinary discourse, it is the unary relative concept true T that is being applied. That<br />

explains what is going on in (46). Vinny’s brother says that Vinny’s utterance is true T .<br />

Relative to his context, that’s right, since Vinny’s utterance is true in his context. But<br />

relative to John’s context, that’s false, because an utterance is true T relative to John’s<br />

context iff it is true relative to John’s context. So John spoke truly relative to his own<br />

context, so he spoke correctly. The important point is that assignments of truth T are<br />

relative rather than contextually rigid, so they might be judged true relative to some<br />

contexts and false relative to others.<br />
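The relationship between the two concepts can be summarised in a schema. The notation is ours, since the text introduces the concepts informally; the second line records the disquotational equivalence stated above, which holds as far as the semantic paradoxes permit:<br />

```latex
% true_B: binary and non-relative -- a relation between an utterance u
% and a context of evaluation C.  Whether True_B(u, C) holds is the
% same question from every context of assessment.
\mathrm{True}_B(u, C)

% true_T: unary and relative.  Relative to a context C:
\mathrm{True}_T(u) \text{ at } C \;\leftrightarrow\; \mathrm{True}_B(u, C)

% So a single utterance can be true_T at one context and not at another:
\mathrm{True}_T(u) \text{ at } C \;\wedge\; \neg\,\mathrm{True}_T(u) \text{ at } C'
```

This is why, in the dialogue (46), Vinny’s brother and John can both speak correctly: each ascribes true T (or its negation) from his own context.<br />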

Although both truth relativism and content relativism can explain (46) if they<br />

help themselves to the distinction between truth B and truth T , there are four major<br />

problems for content relativism that seem to show it is not the correct theory.<br />

The first problem concerns embeddings of “might” clauses in belief contexts. Suppose<br />

Watson says,<br />

(47) Moriarty believes that Holmes might be in Paris.<br />

On the content relativist view, (47) will say, relative to Watson, that Moriarty believes<br />

that, as far as Watson knows, Holmes is in Paris. That would be a crazy thing for Watson<br />

to assert. Suppose Watson is talking to Holmes. Then, relative to Holmes, Watson<br />

will have claimed that Moriarty believes that, as far as Holmes knows, Holmes is<br />

in Paris. That would also be a crazy thing for Watson to assert. But, given what he’s<br />

35 We assume here that the vultures are talking mainly to other vultures, and John is talking mainly to other<br />

humans.<br />

36 We are grateful to John MacFarlane for helpful correspondence that influenced what follows.


just overheard, it would be perfectly natural—and pretty clearly correct, so long as<br />

nothing funny is going on behind the scenes—for Watson to assert (47). A view that<br />

tells us that Watson is saying something crazy relative to everybody who’s likely to be<br />

a member of his audience is in pretty serious conflict with our pretheoretical judgements<br />

about the case. (Enlarging the context to include both Holmes and Watson<br />

obviously doesn’t help, either.)<br />

The second problem concerns the social function of assertion. In particular, it<br />

causes difficulties for an attractive part of the Stalnakerian story about assertion,<br />

that the central role of an assertion is to add the proposition asserted to the stock<br />

of conversational presuppositions (Stalnaker, 1978). On the content relativist view,<br />

it can’t be that the essential effect of assertion is to add the proposition asserted to<br />

the stock of common presuppositions, because there’s no such thing as the proposition<br />

asserted. There will be a different proposition asserted relative to each audience<br />

member. That’s not part of an attractive theory. And it’s not terribly clear what<br />

the replacement story about the essential effect of assertion—about the fundamental<br />

role of assertion in communication—is going to be. It may be that there’s a story to<br />

be told about assertability—about when Moriarty is entitled to assert, for example,<br />

“it might be that Holmes is in Paris”—but there’s no obvious story about what he’s<br />

up to when he’s making that assertion—about what the assertion is supposed to accomplish.<br />

(And if you think that appropriateness of assertion’s got to be tied up with<br />

what your assertion’s supposed to accomplish, then you’ll be sceptical about even the<br />

first part.)<br />

The third problem concerns epistemic modals in the scope of temporal modifiers.<br />

The content relativist has difficulties explaining what’s going on with sentences like<br />

(48).<br />

(48) The Trojans were hesitant in attacking because Achilles might have been with<br />

the Greek army.<br />

On the content relativist view, (48) will be false relative to pretty much everybody—<br />

certainly relative to everybody alive today. It’s certainly false that the Trojans were<br />

hesitant because, as far as we know, Achilles was with the Greek army. (Or worse,<br />

because, as far as we knew then, Achilles was with the Greek army.) But, depending<br />

on how the Trojan war went, (48) could be true relative to everybody. 37<br />

Finally, content relativism has a problem with commands. Keith’s Mom says:<br />

(49) For all days d, you should carry an umbrella on d if and only if it might rain<br />

on d.<br />

37 We don’t take any stand here on just how the war went, if it happened at all. The important point is<br />

that whether (48) is true when said of a particular battle is a wide-open empirical question, not one that<br />

can be settled by appeal to the semantics of might. The content relativist says, falsely, that it can be thus<br />

settled.


Suppose on Monday Keith checks the forecast and it says there’s a 50% chance of<br />

rain. So he takes an umbrella. It doesn’t rain, and on Tuesday he wonders whether<br />

what he did on Monday was what his Mom said he should. On the content relativist<br />

view, we get the following strange result: on Monday, it would have been true to<br />

say that he was doing what his Mom said he should, since at the time, the embedded<br />

clause expressed a proposition that was true relative to him. Looking back on Tuesday,<br />

though, it looks like he did what his Mom said he shouldn’t, because now the<br />

embedded clause expresses a proposition that’s false relative to him. But that’s not<br />

right. He just plain did what his Mom told him to do.<br />

The same thing happens with the soldiers trying to follow the imperative issued<br />

as (22). Assume one of them attempts to follow the command by burning down some<br />

trees that seem to contain snipers. Relative to the time she is doing the burning, she<br />

will be complying with the command. But later, when it turns out the trees were<br />

sniper-free, she will not have been following the command. If we assume there’s an<br />

overarching command to not use flamethrowers unless explicitly instructed to do so,<br />

then it will turn out that, as of now, she violated her orders then. But that’s not right.<br />

She just plain followed her orders.<br />

There’s a similar problem with the other terms about which relativism seems<br />

plausible. Consider the following commands:<br />

(50) Don’t pick fights with huge opponents.<br />

(51) Stack all of the things that are the same color together.<br />

(52) If it tastes lousy, spit it out.<br />

It’s possible to sensibly issue these commands, even in relevantly mixed company.<br />

And if we’re going to get the right compliance conditions, we don’t want content<br />

relativism about great-tastingness, hugeness, and same-coloredness here. When we<br />

hear a command like (52), we take (a) the same command to have been issued to<br />

everybody, and (b) everybody to be following it if we all spit out the things that taste<br />

lousy to us. On the content relativist view, we’ve each gotten different commands,<br />

and the philosopher who spits out the chunk of week-old antelope hasn’t complied<br />

with the command that Vinny was given. This seems wrong.<br />

So the content relativist theory has several problems. The truth relativist theory<br />

does much better. Let us begin with the familiar notion of a function from worlds to<br />

truth values. Call any such function a Modal Profile. On the standard way of looking<br />

at things, propositions – the objects of belief and assertion, the semantic values<br />

of ‘that’-clauses – are, or at least determine a Modal Profile. The truth relativist denies<br />

this. According to the truth-relativist, the relevant propositions are true or false<br />

not relative to worlds, but relative to positions within worlds—that is, they’re true or<br />

false relative to centered worlds. (A centered world is a triple of a possible world, an<br />

individual, and a time.) There are a few ways<br />

to replace talk of Modal Profiles with Centring Profiles, i.e. functions from centred<br />

worlds to truth values. Another is to say that a centred world and proposition combine<br />

to determine a Modal Profile, so propositions determine functions from centred<br />

worlds to Modal Profiles. Each of these proposals has some costs and benefits, and


we postpone discussion of their comparative virtues to an appendix. For now we are<br />

interested in the idea, common to these proposals, that propositions only determine<br />

truth values relative to something much more fine-grained than a world. (We take no<br />

stand here on whether propositions should be identified with either Modal Profiles<br />

or Centring Profiles or functions from Centred Worlds to Modal Profiles).<br />
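The two formal options just mentioned can be stated as type assignments. This is a sketch in our notation, with W the set of worlds, D the set of individuals, T the set of times, and 2 = {true, false}:<br />

```latex
% A centred world is a triple of a world, an individual and a time:
CW \;=\; W \times D \times T

% Option 1: replace Modal Profiles with Centring Profiles.
\text{Modal Profile}:\; W \to \mathbf{2}
\qquad
\text{Centring Profile}:\; CW \to \mathbf{2}

% Option 2: a centred world and a proposition together determine a
% Modal Profile, so a proposition determines a function from centred
% worlds to Modal Profiles:
\text{Proposition}:\; CW \to (W \to \mathbf{2})
```

Either way, propositions determine truth values only relative to something more fine-grained than a world, which is the point that matters for what follows.<br />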

Truth relativism is not threatened by the four problems that undermine content<br />

relativism.<br />

According to truth relativism, Watson and Moriarty express the very same proposition<br />

by the words Holmes might be in Paris, so it is no surprise that Watson can report<br />

Moriarty’s assertive utterance by using the very same words. Similarly, it is no<br />

surprise that if Moriarty has a belief that he would express by saying Holmes might<br />

be in Paris, Watson can report that by (53).<br />

(53) Moriarty believes that Holmes might be in Paris.<br />

Above we noted that it’s unlikely that Watson could use this to express the proposition<br />

that for all Watson knows Holmes is in Paris. We used that fact to argue that<br />

DeRose’s constraint did not apply when an epistemic modal is inside a propositional<br />

attitude report. The truth relativist theory predicts not only that DeRose’s constraint<br />

should not apply, but that a different constraint should apply. When one says that a<br />

believes that b might be F, one says that a believes the proposition b might be F. And<br />

a believes that proposition iff a believes it is consistent with what they know that b<br />

is F. And that prediction seems to be entirely correct. It is impossible for Watson to<br />

use (53) to mean that Moriarty believes that for all Holmes knows he is in Paris, or<br />

that for all Watson knows Holmes is in Paris. This seems to be an interesting generalisation,<br />

and while it falls out nicely from the truth relativist theory, it needs to be<br />

imposed as a special constraint on contextualist theories.<br />

Since there is a proposition that is common to speakers and hearers when an<br />

epistemic modal is uttered, we can keep Stalnaker’s nice idea that the role of assertion<br />

is to add propositions to the conversational context. Since propositions are no<br />

longer identified with sets of possible worlds we will have to modify other parts of<br />

Stalnaker’s theory, but those parts are considerably more controversial.<br />

The truth relativist can also explain how (48) can be true, though the explanation<br />

requires a small detour through the nature of psychological explanations involving<br />

relativist expressions.<br />

(48) The Trojans were hesitant in attacking because Achilles might have been with<br />

the Greek army.<br />

All of the following could be true, and not because the things in question are rude,<br />

huge or great tasting for us.<br />

(54) Marvin the Martian dropped his pants as the Queen passed by because it would<br />

have been rude not to.<br />

(55) Children are scared of adults because they are huge.<br />

(56) Vultures eat rotting flesh because it tastes great.


In general it seems that the truth of an explanatory claim of the form, X ϕed because<br />

p depends only on whether p is true in X’s context (plus whether the truth of p in X’s<br />

context bears the right relation to X’s ϕing). Whether or not p is true in our context<br />

is neither here nor there. Adults are not huge, rotting flesh does not taste great, and<br />

it is rude to drop one’s pants as the Queen passes by, but (54)-(56) could still be true,<br />

and could all count as good explanations. Similarly, (48) can be true because Achilles<br />

might have been with the Greek army could be true relative to the Trojans.<br />

Similarly, what it is to comply with a command is to act in a way that makes the<br />

command true in the context of action. This is not a particular feature of epistemic<br />

modals, but just a general property of how commands involving propositions with<br />

centered-worlds truth conditions behave. If Don picks a fight with Pedro after Don<br />

has shrunk so much that Pedro is now relatively huge, he violates (50), even if Pedro<br />

was not huge when the command was issued. And he still violates it from a later<br />

perspective when Pedro and Don are the same size. The general point is that whether<br />

the command is violated depends on the applicability of the salient terms from the<br />

perspective of the person to whom the command applies. Similarly, Keith does not<br />

violate his Mom’s command if he takes an umbrella where It might rain is true in<br />

the context in which the action is performed. And this, of course, matches up perfectly with<br />

intuitions about the case.<br />

It’s a little tricky to say just which statement in Professor Granger’s original<br />

hexalemma gets denied by the truth relativist. It all depends on what we mean by spoke<br />

truly. If Myles spoke truly means that Myles said something true T , then (2) is false (relative<br />

to Granger’s context), for its right-hand-side is true but its left-hand-side is false.<br />

If, on the other hand, it means he said something true B relative to his own context,<br />

then (4) is false, for he did speak truly B relative to his context, but it’s not the case<br />

that Professor Granger might be in Prague. This is awkward, but we might expect<br />

that any good solution to the paradox will be awkward.<br />

6 Objections to Truth Relativism<br />

It might be thought that the truth relativist has to deny TRUTH IN REPORTING, but<br />

in fact this can be retained in its entirety provided we understand it the right way. The<br />

following situation is possible on the truth relativist theory. X has a belief that is true<br />

in her context, and Y properly reports this by saying X believes that S, where S in Y ’s<br />

mouth expresses a proposition that is false in her context. But this is no<br />

violation of TRUTH IN REPORTING. What would be a violation is if X’s belief was<br />

true in Y ’s context, and still Y could report it as described here. But there’s no case<br />

where, intuitively, we properly report an epistemic modal but violate that constraint.<br />

And the same holds for reports of uses of huge, color or tastes. Even if Vinny (truly)<br />

believes that rotting flesh tastes great, and the words “Rotting flesh tastes great” in<br />

John’s mouth express a false proposition, John’s report, “Vinny believes that rotting<br />

flesh tastes great” would only violate TRUTH IN REPORTING if Vinny’s belief is still<br />

true in John’s context. And it is not.<br />

Given that the relativist has the concept of truth T , or as we might put it truth<br />

simpliciter, what should be done with it? The answer seems to be not much. We


certainly shouldn’t restate the norms of assertion in terms of it, because that will<br />

lead to the appropriateness of assertion being oddly relativised. Whether it was appropriate<br />

for Vinny to say “Rotting flesh tastes great,” is independent of the context<br />

of evaluation, even if the truth of what he uttered is context-relative. (It would not<br />

at all be appropriate for him to have said “Rotting flesh tastes terrible” even though<br />

we should think he would have said something true by that remark, and something<br />

false by what he actually said.) And the same thing seems to hold for generalisations<br />

about truth as the end of belief. It is entirely appropriate for Myles to believe that<br />

Granger might be in Prague, because it’s true B relative to his context. Relatedly, if<br />

knowledge is tied to truth T rather than truth B , knowledge can’t be the norm of assertion<br />

or end of belief. 38 On the other hand, using truth T we can say that TRUTH IN<br />

REPORTING is true in the truth relativist theory without reinterpreting it in terms<br />

of relative truth concepts. Moreover, we can invoke truth T to explain why we got<br />

confused when thinking about the original puzzle: It is arguable that, even if we<br />

should distinguish truth T from truth B in our semantic theorizing, we aren’t unreflectively<br />

as clear about that distinction as we might be. No wonder then that we get<br />

a little confused as we think about the Granger case. We want to say Myles doesn’t<br />

make a mistake. And we also want to say “That’s wrong” speaking of the object of<br />

his assertion and belief, and what’s more, when we say that, we don’t seem to be<br />

making a binary claim about the relation between ourselves and what he believes.<br />

Once we clearly distinguish truth T from truth B things become clearer. Using the disquotational<br />

notion, we can say ‘That is false T ’, which is a monadic claim, and not a<br />

binary one. The binary truth B explains why that claim is assertable (it is assertable<br />

because ‘That is false T ’ is true B at my context), but doesn’t figure in the proposition<br />

believed. Meanwhile, the relevant notion of mistake – that of an agent believing a<br />

proposition that is not true B at her context – can only be properly articulated once the<br />

more explanatory truth T is carefully distinguished from the<br />

(arguably) conceptually more basic truth B.<br />

One final expository point. In general, truth relativism makes for irresolvable disputes.<br />

Let us say that two conversational partners are in deadlock concerning a claim<br />

when the following situation arises: There is a pair of conversational participants, x<br />

and y, and a sentence S, under dispute, such that each expresses the same proposition<br />

(in the sense explained) by S but S is true B at each of the contexts x is in during<br />

the conversation, and false B at each of the contexts y is in during the conversation.<br />

Neither speaks past the other in alternately asserting and denying the same sentence,<br />

since each expresses the same proposition by it. And each asserts what they<br />

should be asserting when each says: What I say is true T and what the other says is<br />

false T, since each makes a speech that is true T at the respective contexts. In general,<br />

truth relativism about a term will lead one to predict deadlock for certain conversations,<br />

traceable to the truth relativity of the term. But in the case of ‘might’, it is<br />

38 Arguably, then, one will have to distinguish (and posit an ordinary conflation between) knowledge T<br />

and knowledge B, the latter being needed to make good on the normative importance of knowledge, the<br />

former being needed to make sense of the validity of the inference from knowing that p to p. Is trouble<br />

lurking here for the truth relativist, especially given the link between the truth B of ‘might’ claims and facts<br />

about knowledge? We shall not pursue the matter further here.


arguable that conversation tends to force a situation where, even if, at the outset, a<br />

‘might’ sentence was true relative to x and not to y (on account of the truth-relativity<br />

of the ‘might’ sentence), x and y will, in the course of engagement and dispute, be<br />

quickly put into a pair of contexts which do not differ with respect to truth B (unless<br />

the ‘might’ sentence contained other terms that themselves made for deadlock). This<br />

is not merely because the conversational participants will, through testimony, pool<br />

knowledge about the sentence embedded in the ‘might’ claim. It is in any case arguable<br />

that the relevant community whose body of knowledge determines whether<br />

a ‘might’ claim is true B at a context always includes not just the person at that<br />

context but also his conversational partners. In the special case of ‘might’,<br />

then, Truth Relativism may well generate far less by way of deadlock than in other<br />

cases.<br />

There are two primary objections to the truth relativist theory: that it doesn’t quite<br />

handle all the cases, and that it is too radical.<br />

There are some cases that seem to tell directly against the truth relativist position.<br />

Consider the case again of Tom and Sally stuck in a maze. Sally knows the way out,<br />

but doesn’t want to tell Tom. She says, inter alia, (57), and does not seem to violate<br />

any semantic norms in doing so, even though she knows the exit is some other way.<br />

(57) The exit might be that way.<br />

This seems to directly contradict the relativist claim that the norm for assertion is<br />

speaking truly in one’s own context. We suspect that what’s going on here is that<br />

Sally is projecting herself into Tom’s context. She is, we think, merely trying to verbalise<br />

thoughts that are, or should be, going through Tom’s head, rather than making<br />

a simple assertion. As some evidence for this, note (as was mentioned above) that<br />

it would be wrong to take (57) as evidence that Sally believes the exit might be that<br />

way, whereas when a speaker asserts that p that is usually strong evidence that she<br />

believes that p. It is unfortunate for the relativist to have to appeal to something like<br />

projection, but we think it is the simplest explanation of these cases that any theorist<br />

can provide.<br />

The idea that utterances have their truth value absolutely is well-entrenched in<br />

contemporary semantics, so it should only be overturned with caution. And it might<br />

be worried that once we add another degree of relativisation, it will be open to relativise<br />

in all sorts of directions. We are sensitive to these concerns, but we think<br />

the virtues of the relativist theory, and the vices of the contextualist and invariantist<br />

theories, provide a decent response to them. Invariantist theories are simply implausible,<br />

and any contextualist theory will have to include so many ad hoc conditions,<br />

conditions that seem to be natural consequences of relativism, that there are methodological<br />

considerations telling in favour of relativism. (Let us be clear: we are not<br />

recommending a general preference for relativism over contextualism in semantic<br />

theory. As we have been trying to make clear, for example, the case of ‘might’ is very<br />

different from, say, the case of ‘ready’.) It is (as always) hard to tell which way the<br />

balance tips when all these methodological considerations are weighed together, but<br />

we think the relativist has a good case.


Appendix on Types of Content<br />

Robert Stalnaker has long promoted the idea that the content of an assertoric utterance<br />

is a set of possible worlds, or a function from worlds to truth values. This<br />

idea has been enormously influential in formal semantics, although it has come in for<br />

detailed criticism by various philosophers. (See especially Soames (1987) and King<br />

(1994, 1995, 1998).) But even philosophers who think that there is more to content<br />

than a set of possible worlds would agree that propositions determine a function<br />

from worlds to truth values. Some would agree that such a function exhausts the ‘discriminatory<br />

role’ of a proposition, although this depends on the (highly contestable)<br />

assumption that the role of propositions is to discriminate amongst metaphysical possibilities.<br />

Still, even philosophers who disagree with what Stalnaker says about the<br />

nature of propositions could agree that if all we wanted from a proposition was to divide<br />

up some metaphysical possibilities, propositions could be functions from worlds<br />

to truth values, but they think some propositions that divide up the metaphysical<br />

possibilities the same way should be distinguished.<br />

We don’t want to take sides in that debate, because our truth relativism means<br />

we are in conflict with even the idea that a proposition determines a function from<br />

worlds to truth values. To see this, consider a sentence whose truth value is relative<br />

to a context of evaluation, such as Vegemite tastes great. The truth relativist says that<br />

this sentence should be evaluated as true from a context where people like the taste<br />

of Vegemite (call this the Australian context) and should be evaluated as false from<br />

a context where people dislike this taste (call that the American context) and both<br />

evaluations are correct (from their own perspective) even though the Australians and<br />

Americans agree about what the content of Vegemite tastes great is, and they are in<br />

the same world. So there’s just no such thing as the truth value of Vegemite tastes great<br />

in the actual world, so it does not determine a function from worlds to truth values.<br />

What kind of function does it determine then?<br />

One option, inspired by Lewis’s work on de se belief, is to say that it determines<br />

a function from centred worlds to truth values. The idea is that we can identify a<br />

context of evaluation with a centred world, and then Vegemite tastes great will be true<br />

relative to a centred world iff it is properly evaluated as true within that context. Alternatively,<br />

the content of Vegemite tastes great will determine a set of centred worlds,<br />

the set of contexts from which that sentence would be evaluated as true. Just as propositions<br />

were traditionally thought to determine (or be) sets of possible worlds, properties<br />

were traditionally thought to determine (or be) functions from worlds to sets<br />

of individuals. 39 Now if we identify centred worlds with 〈individual, world〉 pairs,<br />

39 Lewis preferred the theory on which properties were sets of individuals, potentially from different<br />

worlds. This theory has difficulties accounting for individuals that exist in more than one world. And<br />

since properties exist in more than one world, and properties have to be treated as individuals in some<br />

contexts (e.g. when they are the subjects of predication) this is a serious problem. Treating properties as<br />

functions from worlds to sets of individuals removes this problem without introducing any other costs.<br />

(See Egan (2004) for more details.)


Epistemic Modals in Context 297<br />

a function from worlds to sets of individuals just is a set of centred worlds. 40 So the<br />

content of Vegemite tastes great could just be a property, very roughly the property<br />

of being in a context where most people are disposed to find Vegemite great-tasting.<br />
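The identification of a function from worlds to sets of individuals with a set of centred worlds can be made concrete in a toy model. This is a sketch only; the worlds, individuals, and the sample property are invented for illustration:

```python
# Toy model: a property, as a function from worlds to sets of individuals,
# carries exactly the same information as a set of <individual, world> pairs.
worlds = {"w1", "w2"}

# A hypothetical property: who, at each world, is in a Vegemite-liking context.
prop_as_function = {"w1": {"Bruce"}, "w2": {"Alice", "Bruce"}}

def to_centred_worlds(f):
    """Convert a world -> set-of-individuals function to a set of <i, w> pairs."""
    return {(i, w) for w, inds in f.items() for i in inds}

def to_function(pairs, worlds):
    """Convert a set of <i, w> pairs back to a world -> set-of-individuals function."""
    return {w: {i for (i, w2) in pairs if w2 == w} for w in worlds}

centred = to_centred_worlds(prop_as_function)
assert centred == {("Bruce", "w1"), ("Alice", "w2"), ("Bruce", "w2")}
# The round trip recovers the original function, so nothing is lost either way.
assert to_function(centred, worlds) == prop_as_function
```

Since each representation determines the other, nothing in the argument turns on which of the two we take as basic.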

This proposal has three nice features. First, even though the content of Vegemite<br />

tastes great is not, and does not even determine, a proposition as Stalnaker conceived<br />

of propositions, it does determine a property. So the proposal is not as radical as<br />

it might at first look. Second, properties are the kind of thing that divide up possibilities.<br />

The possibilities they divide are individuals, not worlds, but the basic idea<br />

that to represent is to represent yourself as being in one class of possible states rather<br />

than another is retained. The only change is that instead of representing yourself as<br />

being in one class of worlds rather than another, you represent yourself as being in<br />

one class of 〈individual, world〉 pairs rather than another. Third, the proposal links<br />

up nicely with David Lewis’s account of de se belief, and offers some prospects for<br />

connecting the contents of beliefs with the contents of assertions, even when both of<br />

these contents have ceased to be propositions in Stalnaker’s sense. 41<br />

But there’s a problem for this account. Consider what we want to say about<br />

Possibly Vegemite tastes great, where context makes it clear that the ‘possibly’ is a<br />

metaphysical modal. There’s a trivial problem and a potentially deep problem for<br />

this account. The trivial problem is that we know what the meaning of possibly is.<br />

It’s a function that takes propositions as inputs and delivers as output a proposition<br />

that is true iff the input proposition is true at some accessible world. If the content of<br />

Vegemite tastes great is a property rather than a proposition, then we have a type-mismatch.<br />

This is a trivial problem because it’s a fairly routine exercise to convert<br />

the meanings of words like possibly so they are the right kind of things to operate on<br />

what we now take the meaning of Vegemite tastes great to be.<br />

The deep problem is that when we go through that routine exercise, we get the<br />

wrong results. We don’t want Possibly Vegemite tastes great to be true in virtue of<br />

there being an accessible world where the people there like the taste of Vegemite. We<br />

want it to be true in virtue of there being a world where Vegemite’s taste is a taste that<br />

in this context we’d properly describe as great. And it’s not clear how to get that on<br />

the current story. To see how big a problem this is, consider (58), where the modal is<br />

meant to be metaphysical and have wide scope.<br />

(58) Possibly everyone hates Vegemite but it tastes great.<br />

That’s true, on its most natural reading. But the content of Everyone hates Vegemite<br />

but it tastes great will be the empty set of centred worlds, for there is no centred world<br />

on which this is true. Now it’s not clear just what the meaning of possibly could be<br />

that delivers the correct result that (58) is true.<br />

40 Matters are a little more complicated when we introduce times into the story. For purposes of this<br />

appendix we ignore all matters to do with tense. As you’ll see, the story is complicated enough as it is, and<br />

this omission doesn’t seriously affect the dialectic to follow.<br />

41 It might be that propositions just are whatever things are the contents of assertions and beliefs, so we<br />

shouldn’t say that the contents of sentences like Vegemite tastes great are not propositions. But they will<br />

be very different kinds of propositions to what we are used to. Thanks here to John MacFarlane.



So we are tempted to consider an alternative proposal. Start with a very natural<br />

way of thinking about why the relativist has to modify the Stalnakerian story about<br />

content. The problem is that (even given a context of utterance) tastes great does<br />

not determine a property. Rather, relative to any context of evaluation, i.e. centred<br />

world, it determines a property. That is, its content is (or at least determines) a function<br />

from centred worlds to properties. So given our actual context, it determines<br />

the property of having a taste that people around here think is great. Now properties<br />

combine with individuals to form Stalnakerian propositions. So tastes great is a<br />

function from centred worlds to functions from individuals to sets of worlds. Hence<br />

Vegemite tastes great is a function from centred worlds to sets of worlds, the previous<br />

function with the value for the ‘individual’ being fixed as Vegemite.<br />

Our second option, then, is that in general sentences containing ‘relative’<br />

terms like ‘tastes’ or ‘huge’ or ‘might’ determine a function from centred worlds<br />

to sets of worlds. This makes it quite easy to understand how (58) could work. Possibly<br />

type-shifts so that it is now a function from functions from centred worlds to<br />

sets of worlds to functions from centred worlds to sets of worlds. It’s fairly easy to<br />

say what this function is. If the content of p is (or determines) f, a function from<br />

centred worlds to sets of worlds, then the content of ◊p is (or determines) g, the function<br />

such that for any centred world c, w ∈ g(c) iff for some w′ accessible from w, w′<br />

∈ f(c). The core idea is just that we ignore the role of the centred worlds until the end<br />

of our semantic evaluation, and otherwise just treat ◊ as we’d treated it in traditional<br />

semantics. This is a rather nice position in many ways, but there are two issues to be<br />

addressed.<br />
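The type-shifted entry for ◊ just given can be sketched in a toy model. The worlds, accessibility relation, and contexts below are invented for illustration, and the model is finite, but the clause for g is exactly the one in the text:

```python
# Sketch of the type-shifted 'possibly'. A content is a function from centred
# worlds (contexts of evaluation) to sets of worlds; ◊ ignores the centre and
# existentially quantifies over accessibility.
worlds = {"w1", "w2", "w3"}
access = {"w1": {"w1", "w2"}, "w2": {"w2"}, "w3": {"w3"}}  # toy accessibility
centres = {"c_aus", "c_us"}

def possibly(f):
    """If p determines f, ◊p determines g, where w ∈ g(c) iff some w'
    accessible from w is in f(c)."""
    return {c: {w for w in worlds if access[w] & f[c]} for c in centres}

# A toy content: true only at w2, whichever context of evaluation we pick.
f = {"c_aus": {"w2"}, "c_us": {"w2"}}
g = possibly(f)
# w1 can see w2, so ◊p holds at w1 and w2 (but not w3) relative to either context.
assert g == {"c_aus": {"w1", "w2"}, "c_us": {"w1", "w2"}}
```

The point of the construction is visible in the code: the centre c is threaded through untouched, and the modal quantification happens entirely over worlds.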

First, it is not clear that functions from centred worlds to sets of worlds are really<br />

kinds of content. They are not things that divide up intuitive possibilities, in the<br />

way that sets of individuals and sets of 〈individual, world〉 pairs do. It’s no good<br />

to say that relative to a centred world a content is determined. That would be fine<br />

if we were content relativists, and we said the content was meant to be determined<br />

relative to a centred world. But as argued in the text the content of Vegemite tastes<br />

great should be the same across various contexts of evaluation. A better response is to<br />

say functions from centred worlds to sets of worlds do determine a kind of content.<br />

For any such function f, we can determine the set of centred worlds 〈i, w〉 such that<br />

w ∈ f(〈i, w〉). These will be the centred worlds that the proposition is true at. It’s not<br />

necessarily a problem that the proposition does more than determine this set. (It’s<br />

not an objection to King’s account of propositions that on his theory propositions<br />

do more than determine a set of possibilities.)<br />
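The diagonal construction just described can be sketched directly. The particular function f below is invented for illustration:

```python
# From a function f mapping centred worlds to sets of worlds, recover a set of
# centred worlds: the <i, w> pairs such that w ∈ f(<i, w>).
f = {
    ("i1", "w1"): {"w1"},          # true at its own world
    ("i1", "w2"): {"w1"},          # true at w1, but not at its own world w2
    ("i2", "w1"): set(),
    ("i2", "w2"): {"w1", "w2"},
}

def diagonal(f):
    """The centred worlds at which the content is true."""
    return {(i, w) for (i, w) in f if w in f[(i, w)]}

assert diagonal(f) == {("i1", "w1"), ("i2", "w2")}
```

As the example shows, distinct functions can share a diagonal, which is why the proposition does more than determine this set.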

Second, it isn’t exactly clear how to fill out these functions when we get back<br />

to our core case: epistemic modals. It’s easy to say what it is for Vegemite tastes<br />

great to be true in a world relative to our context of evaluation; indeed we did so<br />

above. It’s a lot harder to say what it is for Granger might be in Prague to be true<br />

in an arbitrary world w relative to an arbitrary context of evaluation c. As a first<br />

pass, we might say this is true in w iff for all the people in c know, it is true in w<br />

that Granger is in Prague. But the problem is that whenever c is not a centre in<br />

w, it’s very hard to say just what the people in c know about w. Under different<br />

descriptions of w they will know different things about it. If w is described as a



nearby world in which Granger is in Cleveland, they will know Granger is not in<br />

Prague in w. If it is described as a nearby world in which Myles knows where Granger<br />

is, they may not know anything about whether Granger is in Prague in w, even if<br />

those descriptions pick out the same world. Ideally we would cut through this by<br />

talking about their de re knowledge about w, but most folks have very little de re<br />

knowledge about other possible worlds. It’s not clear this is a huge problem though.<br />

Remember that a sentence containing an epistemic modal is meant to determine a<br />

function from centred worlds to functions from worlds to truth values. Provided<br />

we have a semantics that allows for semantic indeterminacy, we can just say that the<br />

functions from worlds to truth values are partial functions, and they simply aren’t<br />

determined when it’s unclear what the people in c know about w. Or we can say<br />

there’s a default semantic rule such that w is not in f(c) (where f is the function<br />

determined by the sentence) whenever this is unclear. Since the sentences whose<br />

meanings are determined by these values of the function, like Possibly Granger might<br />

be in Prague, are similarly vague, it does no harm if the function is a little vague.<br />

So we have two options on the table for what kind of functions sentences might<br />

determine if they don’t determine functions from worlds to truth values. One option<br />

is that they determine functions from centred worlds to truth values, another that<br />

they determine functions from centred worlds to functions from worlds to truth<br />

values. Neither is free from criticism, and the authors aren’t in agreement about<br />

which is the best approach, so it isn’t entirely clear what the best way to formally<br />

implement truth relativism is. But it does not look like there are no possible moves<br />

here. Moving to truth relativism does not mean that we will have to totally abandon<br />

the fruitful approaches to formal semantics that are built on ideas like Stalnaker’s,<br />

although it does mean that those semantic theories will need to be modified in places.


Attitudes and Relativism<br />

Data about attitude reports provide some of the most interesting arguments for, and<br />

against, various theses of semantic relativism. This paper is a short survey of three<br />

such arguments. First, I’ll argue (against recent work by von Fintel and Gillies) that<br />

relativists can explain the behaviour of relativistic terms in factive attitude reports.<br />

Second, I’ll argue (against Glanzberg) that looking at attitude reports suggests that<br />

relativists have a more plausible story to tell than contextualists about the division<br />

of labour between semantics and meta-semantics. Finally, I’ll offer a new argument<br />

for invariantism (i.e. against both relativism and contextualism) about moral terms.<br />

The argument will turn on the observation that the behaviour of normative terms<br />

in factive and non-factive attitude reports is quite unlike the behaviour of any other<br />

plausibly context-sensitive term. Before that, I’ll start with some taxonomy, just so<br />

it’s clear what the intended conclusions below are supposed to be.<br />

1 How Not to be a Strawman<br />

Here are three mappings that we, as theorists about language, might be interested in.<br />

• Physical Movements → Speech Acts. I’m currently moving my fingers across<br />

a keyboard. In doing so, I’m making various assertions. It’s a hard question to<br />

say just how I manage to make an assertion by moving my fingers in this way.<br />

Relatedly, it’s a hard question to say just which assertions, requests, questions,<br />

commands etc people make by making physical movements. This mapping is<br />

the answer to that question.<br />

• Speech Acts → Contents. For some speech acts, their content is clear. If<br />

I assert that grass is green, the content of my assertion is that grass is green.<br />

For other speech acts, including perhaps other assertions, it is not immediately<br />

obvious what their content is. This mapping answers that question.<br />

• Contents → Truth Values. Given a content, it isn’t always clear whether it<br />

is true or false (or perhaps something else). This mapping provides the truth<br />

values of every content of every speech act.<br />

The details of each of these mappings are interesting, to say the least. Indeed, a full<br />

description of those mappings would arguably include an answer to every question<br />

ever asked. We’re not likely to know that any time soon. But we can ask, and perhaps<br />

answer, interesting questions about the topology of the mappings. Here are two<br />

distinct questions we can ask about each of the three mappings.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical<br />

Perspectives 22 (2008): 527-44. Thanks to Herman Cappelen, Andy Egan, Carrie Jenkins, Thony<br />

Gillies, Michael Glanzberg, Ishani Maitra and the participants at the Contextualism and Relativism workshop<br />

at Arché, St Andrews, May 2008.


Attitudes and Relativism 301<br />

• Is it one-one or one-many? The mapping from a natural number to its natural<br />

predecessor is one-one. That’s to say, every number is mapped to at most<br />

one other number. The mapping from a person to the children they have is<br />

one-many. That’s not to say that everyone has many children, or indeed that<br />

everyone has any children. It’s merely to say that some things in the domain<br />

are mapped to many things in the range.<br />

• Is the mapping absolute or relative, and if relative, relative to what? Consider<br />

a mapping that takes a person A as input, and relative to any person B,<br />

outputs the first child that A has with B. This mapping is relative; there’s no<br />

such thing as the output of the function for any particular input A. All there<br />

is, is the output relative to B1, relative to B2, and so on. But note that in a good<br />

sense the mapping is one-one. Relative to any B, A is mapped to at most one<br />

person.<br />

A very simple model of context-sensitivity in language says that all three of these<br />

mappings are one-one and absolute. It will be helpful to have a character who accepts<br />

that, so imagine we have a person, called Strawman, who does. There are six basic<br />

ways to reject Strawman’s views. For each of the three mappings, we can say that it<br />

is one-many or we can say that it is relative.<br />

This way of thinking about Strawman’s opponents gives us a nice taxonomy. The<br />

first mapping is about speech acts, the second about contents and the third about truth.<br />

Someone who disagrees with Strawman disagrees about one of these three mappings.<br />

They might disagree by being a pluralist, i.e. thinking that the mapping is one-many.<br />

Or they might disagree by being a relativist, i.e. thinking the mapping is relative to<br />

some thing external. The respective defenders of Strawman on these questions are<br />

the monists and the absolutists.<br />

So the speech act pluralist is the person who thinks the first mapping is one-many.<br />

The content relativist is the person who thinks the second is relative to some other<br />

variable (perhaps an assessor). And the truth monist is the person who thinks that<br />

contents have (at most) one truth value. All these terms are a little bit stipulative, but<br />

I think the terminology here somewhat matches up with regular use. And it’s the<br />

terminology I’ll use throughout this paper.<br />

One other nice advantage of this taxonomy is that it helps clarify just what is at<br />

issue between various opponents of Strawman. So Andy Egan (2009) has some data<br />

about uses of “you” in group settings that suggest such utterances pose a problem<br />

for Strawman. But it’s one thing to have evidence that Strawman is wrong, another<br />

altogether to know which of his views is, on a particular occasion, wrong. I think separating<br />

out Strawman’s various commitments helps clarify what is needed to isolate<br />

Strawman’s mistake on an occasion.<br />

It is, I think, more or less common ground that the first of Strawman’s commitments,<br />

speech act monism, is false. The King can, by uttering “It’s cold in here”, both<br />

assert that it’s cold in here, and command his lackey to close the window. Those look<br />

like two distinct speech acts that he’s made with the one physical movement. Herman<br />

Cappelen and Ernest Lepore (Cappelen and Lepore, 2005) have many more<br />

examples to show that Strawman is wrong here. Once we go beyond that though, it’s



less clear that Strawman is mistaken. Perhaps by thinking about cases where, by the<br />

one physical movement, we intend to communicate p to one audience member, and<br />

q to another, we can try to motivate speech-act relativism. That’s an issue I’ll leave<br />

for another day. In contrast to what he says about speech acts, what Strawman says<br />

about content and truth is, if not universally accepted, at least popular. So I’ll call<br />

orthodox contextualism the view that Strawman is right about the content mapping<br />

and the truth mapping; each mapping is both one-one and absolute.<br />

It is worthwhile noting two very separate models for content that lead to two<br />

quite distinct ways in which we can reject Strawman’s last two absolutist views.<br />

John MacFarlane’s paper on Non-Indexical Contextualism (MacFarlane, 2009) was<br />

extremely useful in setting up the relevant distinctions here, but the particular models<br />

for content I’m describing here are both set out in greatest detail in recent work<br />

by Andy Egan.<br />

The first is the centred worlds model for content. This is the idea that for some<br />

utterance types, any token of that type expresses the same content. But that content<br />

is a set of centred worlds, one that is true at some centres and false at other centres in<br />

the same world. So we might think that the content of “Beer is tasty” is, roughly,<br />

the set of possibilia who have pro-attitudes to the taste of beer. More precisely, it is<br />

the set of world-centre pairs such that the agent at (or perhaps closest to) the centre<br />

has pro-attitudes towards the taste of beer. On this view, content monism will be<br />

maintained – what an utterance of “Beer is tasty” says is invariant across assessors.<br />

(I’m assuming here that assessor-relativity is the only relativity we’re interested in.)<br />

But truth absolutism will fail, since whether that content is true for two assessors a₁ and a₂ will<br />

depend on what their attitudes are towards beer. This kind of centred worlds model<br />

for content is what Egan has developed in (Egan, 2007a).<br />
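The combination of content monism and truth relativism on this model can be sketched in a toy example (the agents and their attitudes are invented for illustration):

```python
# Centred worlds model: the content of "Beer is tasty" is a single set of
# <agent, world> pairs, shared across assessors, but its truth varies with
# the assessor.
likes_beer = {("a1", "w"): True, ("a2", "w"): False}

# Content monism: one content for everyone -- the pairs where the agent at
# the centre has pro-attitudes towards the taste of beer.
content = {cw for cw, likes in likes_beer.items() if likes}

def true_for(content, assessor, world):
    """Truth relativism: truth of the one shared content, relative to an assessor."""
    return (assessor, world) in content

assert true_for(content, "a1", "w")      # true for a1, who likes beer
assert not true_for(content, "a2", "w")  # false for a2: same content, same world
```

The content assigned never varies; only the membership test, and hence the truth value, is relative.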

The second model lets assessors get into the content-fixing mechanism, but says<br />

the content that is fixed is a familiar proposition whose truth is not assessor relative.<br />

This is easiest to explain with an example involving second-person pronouns. For<br />

some utterances of “You are a fool”, the content of that utterance, relative to x, is that<br />

x is a fool. Now whether x is indeed a fool is a simple factual question, and whether it<br />

is true isn’t assessor relative. But if some people are fools and others are not, whether<br />

the utterance is true or false depends on who is assessing it. So content relativism is<br />

true, while truth absolutism is preserved. This is a view Egan has defended for some<br />

tokens of second person pronouns (Egan, 2009).<br />

In “Conditionals and Indexical Relativism” (Weatherson, 2009), I called the combination<br />

of content relativism and truth absolutism “indexical relativism”, and defended<br />

such a view about indicative conditionals. I called something similar to the<br />

combination of truth relativism and content absolutism “non-indexical contextualism”.<br />

More precisely, I followed MacFarlane in using that phrase for the combination<br />

of truth relativism and content absolutism and the view that whether a speaker’s utterance<br />

is true (relative to an assessor) is a matter of whether the proposition they express<br />

is true relative to their context. I like the name “indexical relativism”, but it has<br />

also been used for theories that aren’t even heterodox in the above sense, so perhaps<br />

persisting with it would just invite confusion. (And the name implies a particular



view about how the relativity works; namely that there is something like an indexical<br />

element in what’s asserted that gets its value from the context of assessment.) In<br />

other contexts I’ve used “relativism” as the label for all and only heterodox views, but<br />

this label is potentially quite confusing. Indeed, it’s a possible worry about the arguments<br />

in my “Conditionals and Indexical Relativism” that they really just support<br />

heterodoxy; a separate argument would be needed (and might not be easy to supply)<br />

against pluralist alternatives to content relativism about indicative conditionals.<br />

2 Factive Verbs and Relativism<br />

In “CIA Leaks” (von Fintel and Gillies, 2008), Kai von Fintel and Thony Gillies raise<br />

a problem for heterodox theories about ‘might’. (Actually they raise several, but<br />

I’m only going to deal with one of them here.) Their primary target is what I called<br />

truth relativist theories, but the argument they raise is interesting to consider from all<br />

heterodox perspectives. The problem concerns embedding of ‘might’-clauses under<br />

factive attitude verbs. They argue as follows:<br />

1. S realises that p presupposes that p.<br />

2. This presupposition is carried over when the sentence is used as the antecedent<br />

of a conditional. So, for instance, If S realises that p, then q presupposes that p.<br />

3. But, on standard heterodox proposals, we can properly say If S realises that it<br />

might be that p, then q, even though it isn’t true that it might be that p.<br />

4. So heterodox proposals are false.<br />

Here is the example they use to make the case.<br />

Bond planted a bug and some misleading evidence pointing to his being<br />

in Zurich and slipped out. Now he and Leiter are listening in from<br />

London. As they listen, Leiter is getting a bit worried: Blofeld hasn’t<br />

yet found the misleading evidence that points to Bond’s being in Zurich.<br />

Leiter turns to Bond and says:<br />

(34) If Blofeld realizes you might be in Zurich, you can breathe easy—<br />

he’ll send his henchman to Zurich to find you. (93)<br />

Now the problem is that for the heterodox theorist, “You might be in<br />

Zurich”, as uttered by Leiter to Bond, expresses (relative to Bond) a<br />

proposition that is true iff for all Bond knows, Bond is in Zurich.<br />

Just how it does this will differ for different heterodox theorists, but so<br />

far they all agree. But that isn’t the case, since Bond knows he is in London.<br />

So (34) should sound defective, since it contains a presupposition<br />

failure. But it isn’t defective, so heterodoxy is mistaken.<br />

Before we look at how heterodox theorists might respond to this case, it’s worth<br />

thinking about how Strawman might respond to it. The simplest idea is to<br />

say that in It might be that p, there is a hidden variable X. The value of X is set by



context. And the sentence expresses the proposition that for all X knows, p is true.<br />

(Perhaps we might use some epistemic relation other than ‘knows’, but that’s not<br />

relevant here.)<br />

Now, and this is crucial, the variable X might be either free or bound. If there<br />

is nothing around to bind it, as in a simple utterance of It might be that p, then it<br />

will be free. And typically if it is free, X denotes a group consisting of the speaker<br />

and perhaps those in the same conversation. But when the might-clause is embedded<br />

under a propositional attitude ascription (factive or not), the variable X will be bound<br />

to the subject of the attitude ascription. So in Y believes that it might be that p, the<br />

value of X will simply be Y. So in Blofeld realises you might be in Zurich, the value of<br />

X is Blofeld. And hence the embedded might claim is true, since that claim is simply<br />

that for all Blofeld knows, Bond is in Zurich. Which, in the story, is true.<br />
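Strawman’s bound/free story can be sketched in a toy model. The worlds and knowledge states below are invented, following the Bond example, and pooling a group’s knowledge by intersecting each member’s epistemic possibilities is one simple assumption about how group readings work:

```python
# Hidden-variable story for 'might': "It might be that p" means
# "for all X knows, p", where X is free (the conversational group) or
# bound by an embedding attitude verb.

# Each agent's knowledge: the set of worlds compatible with what they know.
knowledge = {
    "Bond":    {"w_london"},               # Bond knows he is in London
    "Blofeld": {"w_london", "w_zurich"},   # for all Blofeld knows, Bond is in Zurich
}

bond_in_zurich = {"w_zurich"}  # worlds where the embedded claim holds

def might(p_worlds, group):
    """True iff p is compatible with the pooled knowledge of everyone in group."""
    pooled = set.intersection(*(knowledge[x] for x in group))
    return bool(pooled & p_worlds)

# Unembedded, X free: the conversational group includes Bond, so it is false.
assert not might(bond_in_zurich, {"Bond", "Blofeld"})
# Embedded under "Blofeld realises ...", X is bound to Blofeld, so it is true,
# and the factive presupposition is satisfied.
assert might(bond_in_zurich, {"Blofeld"})
```

The two calls correspond to the two readings: the standalone utterance with X free, and the embedded clause in (34) with X bound to Blofeld.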

The reason for going through all of this is that the theorist who accepts truth<br />

absolutism, but rejects content absolutism, can say exactly the same thing. There is<br />

a variable X in the structure of what’s asserted. Strawman thinks that you only get a<br />

determinate assertion when you fill in the value of X. We can disagree with that; we<br />

can say that an assertion can literally contain a variable, one that potentially gets its<br />

value from assessors. That way the content of a particular assertion can be different<br />

for different assessors. Once we’ve made this move, we can rejoin Strawman’s story.<br />

This variable is either free or bound. If it is bound, we tell exactly the same story as<br />

Strawman. But, we insist, if it is free, the value of X is sometimes set by contextual<br />

features of the hearer as well as of the assessor. In the standard case, X is a group<br />

consisting of the speaker, the hearer, and perhaps some people who get in the group<br />

in virtue of their proximity to the speaker or hearer.<br />

So we end up saying the same thing about the acceptability of (34) as Strawman.<br />

The content of You might be in Zurich, as embedded in (34), is quite different to the<br />

content those words would have if uttered as a standalone sentence, because the value<br />

that a key variable takes is different. For us, the value that variable takes differs for<br />

different assessors, but that’s completely irrelevant to the explanation of the acceptability<br />

of (34).<br />

For the truth relativist (who is also a content absolutist) things are a little more<br />

interesting. Such a theorist will typically reject the presence of a variable like X in<br />

the structure of what is said. So they cannot appeal to the kind of explanation that<br />

we’ve offered (twice over) for how (34) may be acceptable. The solution is to simply<br />

reject the generalisation about factive verbs. Let’s start with some seemingly distant<br />

examples, in particular examples about fiction. It seems that (5) doesn’t have any false<br />

presuppositions.<br />

(5) Watson realised that private detectives were (in late 19th Century London) better<br />

at solving murder mysteries than police.<br />

I’ll leave off the parenthetical in what follows. Now I take it that it simply isn’t true<br />

that private detectives were better at solving murder mysteries than police. But it<br />

doesn’t matter; this was true in the fiction and that’s what is relevant. Note that<br />

neither (6) nor (7) has a false presupposition.



(6) Had Watson realised earlier that private detectives were better at solving murder<br />

mysteries than police, he would have liked Holmes more than he did.<br />

(7) Had Watson realised earlier that private detectives were better at solving murder<br />

mysteries than police, the early chapters of the book would have been more<br />

interesting.<br />

What’s interesting about (7) is that it’s clearly meant to be a claim about this world.<br />

When we say (6), it’s naturally interpreted as making a claim about the world of the<br />

Holmes fiction. But that’s not how we interpret (7); what matters is that the book<br />

would have been more interesting to us.<br />

Note that this isn’t anything particular to do with subjunctive conditionals. Imagine<br />

that we are settling down to watch a new adaptation of the Holmes stories that<br />

we are told won’t be particularly faithful to the books in detail. I might properly say<br />

(8).<br />

(8) If Watson realises that private detectives were better at solving murder mysteries<br />

than police, the early scenes will be more interesting.<br />

The lesson we take away from sentences like (8) is that the generalisation about factives<br />

and presupposition that von Fintel and Gillies rely upon isn’t strictly true.<br />

When the subject of the attitude ascription is in another possible world, all that is<br />

presupposed is that the proposition they believe is true in their world. We should all<br />

agree to that restriction to the principle.<br />

But now it is easy to see the way out of the argument for the proponent of the<br />

centred world view. The crucial thing about Watson isn’t, such a theorist will say, that<br />

he’s in another possible world. The crucial thing is that some propositions that are<br />

false relative to us (e.g. the proposition that private detectives were better at solving<br />

murder mysteries than police) are true relative to him. The true generalisation seems<br />

to be that S Vs that p, where V is factive, presupposes that p is true relative to S. And<br />

that’s true in the cases that von Fintel and Gillies describe. So it’s not true that the<br />

centred world theorist should predict that these utterances have false presuppositions.<br />

And that’s all to the good, because of course they don’t.<br />

3 Glanzberg on Metasemantics<br />

Papers arguing for relativism about some term t frequently, perhaps typically, start<br />

with a survey of reasons why (orthodox) contextualism about t cannot be correct.<br />

And such a survey will frequently include a sojourn through some quite specific<br />

contextualist theories, with some fairly obvious counterexamples. Egan et al. (2005)<br />

sticks to the script as far as this goes. We note that a might be F can’t, for instance,<br />

mean that For all S knows, a is F, where S is the speaker. And we note that some other<br />

simple theories along the same lines can’t be true.<br />

It’s interesting to think through what kind of force such tours through the possible<br />

contextualist theories could have. We might think that there’s a tacit argument<br />

that if some contextualist theory were true, it would be one of these simple ones, and<br />

none of the simple ones is true, so no contextualist theory is true. I’m not going to


Attitudes and Relativism 306<br />

take a stand on exegetical debates here, so I’m not going to consider who may or may<br />

not have been making such an argument. I don’t think that it was the intended argument<br />

in Egan et al. (2005), but that’s beside the point, because it is an argument that’s<br />

now being debated. The argument under consideration isn’t, or at least isn’t directly,<br />

that the contextualist’s semantic proposal is mistaken. Rather, the argument is that<br />

the accompanying metasemantic theory, i.e. the theory of how semantic values get<br />

fixed, is intolerably complicated. Slightly more formally, we can argue as follows.<br />

1. If contextualism is true, the metasemantic theory of how a particular use of<br />

“might” gets its semantic value is hideously complicated.<br />

2. Metasemantic theories about how context-sensitive terms get their values on<br />

particular occasions are never hideously complicated.<br />

3. So, contextualism is false.<br />

The problem with this argument, as Glanzberg (2007) has argued, is that premise 2<br />

seems to be false. If we just look at some so-called ‘automatic’ indexicals, like “I” or<br />

“here” or “now”, it may be plausible. (But even in those cases things are tricky<br />

when we look at recordings, as in Weatherson (2002).) Once we widen our gaze<br />

though, we see that there are examples of uncontroversially context-sensitive terms,<br />

like “that”, for which the accompanying metasemantic theory is, by any standard,<br />

hideously complicated. So the prospects of getting to relativism from metasemantic<br />

complexity are not, I think, promising.<br />

That isn’t the only kind of metasemantic argument for relativism though. A<br />

better argument for relativism turns on the fact that metasemantics is generally complicated.<br />

The contextualist, I think, has to make metasemantics too systematic at a<br />

key point. (Again, I’m not doing exegesis, but I do think something like this argument<br />

was intended in Egan et al. (2005). I’m largely here highlighting something that<br />

I think has been thus far overlooked in the commentaries on that piece.) Consider<br />

the following pair of sentences.<br />

(9) Those guys are in trouble, but they don’t know that they are.<br />

(10) ??Those guys are in trouble, but they might not be.<br />

Something has gone wrong in (10). I conclude that (10) can’t be used to express (9).<br />

That is, there’s no good interpretation of (10) where those guys can be the denotation<br />

of X in the theory I attributed to Strawman in the previous section. That’s to<br />

say, the value of X can’t just be the most salient individual(s) in the context, since<br />

the guys are being made rather salient. And nor can it be anaphoric on previously<br />

mentioned people, unless they are the subjects of a propositional attitude ascription.<br />

We’ll investigate this exception in what follows.<br />

A natural move at this stage is to adopt what in Egan et al. (2005) we called the<br />

Speaker Inclusion Constraint (hereafter SIC). That is, in unembedded uses of “might”<br />

the group X always includes the speaker. Now the explanation of the problem with<br />

(10) is that for the speaker to assert the first clause, she must know that the guys are<br />

in trouble, but if that’s the case, and she’s in group X, then the second clause is false.


If the SIC really holds, it looks like it should hold in virtue of the meaning (in<br />

some sense of “meaning”) of “might”. As a rule, tidy generalisations like this should<br />

be part of semantics, not metasemantics. Compare two possible theories about “we”.<br />

Both theories say that “we” is a plural pronoun. One theory goes on to say that it is<br />

part of the meaning of “we” that it picks out a group that includes the speaker. That<br />

is, it puts this version of the SIC into the semantics. Another theory says that the SIC<br />

for “we” is just a nice metasemantic generalisation. I take it that the second position is<br />

very unattractive; it’s part of the meaning of “we” that it picks out a group including<br />

the speaker. And I think the relevant point generalises. At least it generalises as far as<br />

another SIC, namely the one that holds for “might”.<br />

Semantic constraints on indexical terms hold, as a rule, for both embedded and<br />

unembedded uses of those indexicals. You can’t use “she”, even as a bound variable, to<br />

denote (relative to any variable assignment) a male human. There’s something badly<br />

wrong with Every student thinks she will win, if some students are female and others<br />

male. As Michael Glanzberg pointed out to me, complex demonstratives headed by<br />

“this” have to pick out an object that is in some way proximal, and this applies to<br />

bound complex demonstratives as well. So we have to use “that” rather than “this”<br />

in sentences like Every connoisseur remembers that/*this first great wine they drank.<br />

For a more familiar example, you can’t interpret O’Leary thinks that I am a fool, as<br />

uttered by Daniels, to mean that O’Leary self-ascribes foolishness. In short, it seems<br />

that there are three related conclusions we can draw here. First, there are semantic<br />

constraints on the possible values of context-sensitive expressions. Second, any<br />

interesting generalisation about the possible value of a context-sensitive expression<br />

is traceable to such a semantic constraint. Third, these constraints remain in force<br />

when the expression is used in an attitude report, or as a bound variable.<br />

The problem for contextualists about “might” is that it doesn’t behave at all this<br />

way. The SIC holds for unembedded uses. That implies that it is part of the meaning<br />

of “might”. So it should hold for embedded uses. But it does not. Indeed, for many<br />

embedded uses of “might”, a reading compatible with the SIC is simply unavailable.<br />

For instance, we can’t interpret Every student thinks that they might have failed as<br />

meaning that every student thinks that, for all I know, they failed. My knowledge<br />

just doesn’t matter; we’re talking about those students’ fears. This all suggests the<br />

following argument.<br />

1. If contextualism is true, then the explanation of the SIC is that it is part of the<br />

meaning of “might” that the relevant group X includes the speaker.<br />

2. If it is part of the meaning of “might” that the relevant group X includes the<br />

speaker, then this must be true for all uses of “might”, including embedded uses.<br />

3. When “might” is used inside the scope of an attitude ascription, the relevant<br />

group need not include the speaker.<br />

4. So, contextualism is not true.<br />

Glanzberg argued, correctly, that it’s no problem for the contextualist if, according to<br />

their theory, metasemantics was complicated and messy. It’s not a problem because,<br />

well, metasemantics is complicated and messy. But this cuts both ways. And it is a


problem for the contextualist that they have to put the SIC into metasemantics. It’s<br />

just not messy enough to go there.<br />

There are two objections to this argument (both of which were pressed when this<br />

paper was presented at Arché) that are worth considering together.<br />

Objection One: There are other generalisations that do go into metasemantics<br />

It’s very odd, to say the least, to use third-person pronouns to denote<br />

oneself. But this doesn’t seem to go into the meaning of the pronoun.<br />

Relatedly, it is possible to use bound third-person pronouns that take<br />

(relative to some variable assignments) oneself as value. For instance, an<br />

Australian boy can say “Whenever an Australian boy goes to the cricket,<br />

he cheers for Australia.” So probably premise 1 of the above argument,<br />

requiring that generalisations be semantic, is false. If not, premise 2,<br />

requiring that semantic constraints on unbound pronouns also constrain<br />

bound pronouns is false.<br />

Objection Two: The SIC is false<br />

Egan et al. (2005) note that the SIC seems to fail in ‘game-playing’ and<br />

‘advice’ contexts. So in a game of hide and seek, where Billy is looking<br />

for something Suzy has hidden, if he asks “Is it in the basement?”, Suzy<br />

can truly say “It might be”, even if she knows it isn’t true. And a parent<br />

can tell their child “Wash that strawberry before you eat it; it might be<br />

contaminated” even if the parent knows that the strawberry has been<br />

washed.<br />

The simplest response to the first objection is that the purported generalisation, a<br />

third-person pronoun does not denote the speaker, isn’t really a universal generalisation<br />

at all. It’s possible to refer to oneself by name; certain people in the news do it<br />

on a regular basis. For example, a famous footballer Smith might say “Smith deserves<br />

a pay raise”. In such a context, it isn’t at all odd (or at least any odder than it already<br />

is) to use third-person pronouns, e.g. by continuing the above utterance with “and if<br />

he doesn’t get one, he’s not going to play”.<br />

The second objection is a little trickier, but I think it’s possible to understand<br />

these utterances as a kind of projection. The speaker is speaking from the perspective<br />

of the hearer. This isn’t an unattested phenomenon. Something like it must be going<br />

on when speakers use “we” to denote the hearer. Imagine, for instance, a policeman<br />

coming across a person staggering out of a pub and saying “We’ve had a bit much to<br />

drink it seems”. The policeman certainly isn’t confessing to dereliction of duty. Nor<br />

is this case sufficient to throw out the idea that “we” is a first person plural pronoun.<br />

Rather, the policeman is speaking from the drunk’s perspective. I suspect the same<br />

thing is going on in both of the examples above.<br />

So I think both objections can be answered. But neither answer is completely<br />

convincing. And the two responses undermine each other. If we accept that projection is a<br />

widespread phenomenon, then perhaps self-denotation with a third person pronoun


is a kind of projection. We should then restate the argument, without assuming<br />

there’s a response to this pair of objections.<br />

To do so, let’s step back from the details of the SIC. What we started with was<br />

a simple fact, that (10) can’t be used to express (9). That’s not threatened by counterexamples<br />

to the SIC, and it still needs explanation. The SIC is a proposed semantic<br />

explanation, and perhaps, if it has counterexamples, it fails. I suspect something like<br />

it is correct, but I don’t plan to rely on that suspicion here. That’s because we can be<br />

independently confident that the explanation here will be semantic, not pragmatic.<br />

We can be confident of this because there just doesn’t look to be anything like a pragmatic<br />

explanation available.<br />

Compare the discussion of third-person pronouns. Even if we can’t use third-person<br />

pronouns to pick out ourselves (when not speaking projectively), there is an<br />

obvious pragmatic explanation for this. We have first-person pronouns available,<br />

and if we mean to denote ourselves, using a first-person pronoun will do so in the<br />

clearest possible way. Since it is good to be clear, when we pass up the chance to use<br />

a first-person pronoun, the obvious conclusion for a hearer to draw is that we don’t<br />

mean to denote ourselves. The details of this explanation could use some filling out,<br />

but it at least has the form of an explanation. It simply doesn’t seem that any such<br />

explanation will be available for why (10) can’t be used to express (9). It’s not that<br />

(9) isn’t a thought that we might be interested in expressing. And it’s not that if we<br />

wanted to express it, we would have had an obviously preferable form of words to<br />

(10). It’s true that we have the words in (9) itself, but (a) they are longer, and (b)<br />

on the contextualist view whenever we use an epistemic modal there is some such<br />

paraphrase available, a paraphrase that typically does not defeat the acceptability of<br />

epistemic modals.<br />

If there isn’t a pragmatic explanation of why (10) can’t be used to express (9),<br />

then there must be a semantic explanation. But the only semantic explanation that<br />

looks plausible from a contextualist perspective is a semantic restriction on X. And<br />

we know, from considering the data about embedded epistemic modals, that there is<br />

no such restriction. So we have a problem for contextualism. Slightly more formally,<br />

we can offer the following argument against orthodox contextualism about epistemic<br />

modals.<br />

1. Whatever acceptability data can’t be explained pragmatically must be explained<br />

semantically.<br />

2. There is no pragmatic explanation for why (10) can’t be used to express (9).<br />

3. If 1 and 2 then the meaning of “might” must explain why (10) can’t be used to<br />

express (9).<br />

4. If the meaning of “might” must explain why (10) can’t be used to express (9),<br />

and contextualism is true, there must be a restriction, provided by the meaning<br />

of “might”, on the relevant group X that excludes groups like those guys.<br />

5. If there is a restriction, provided by the meaning of “might”, on the relevant<br />

group X that excludes groups like those guys, then when “might” is embedded<br />

under an attitude verb, the group X still can’t be those guys.


6. In “Those guys believe that they might be in trouble”, the relevant group X<br />

just is those guys.<br />

7. So, contextualism is false.<br />

This argument is just an argument against contextualism about “might”. It doesn’t<br />

obviously generalise very far. It’s crucial to the argument that (10) can’t be used to<br />

express (9), even when the relevant guys are made pretty salient. A similar argument<br />

against contextualism about, say, taste claims, would have to start with the premise<br />

that a clause like “but it’s tasty”, at the end of a sentence about a, couldn’t be used<br />

to express the thought that it is tasty to a. And such a premise wouldn’t be true. As<br />

Tamina Stephenson (2007a) points out, make a particular dog salient and “It’s tasty”<br />

can express the thought that it is tasty to the dog. So I’m rather sceptical that the<br />

considerations here generalise to an argument against contextualism about predicates<br />

of personal taste.<br />

4 Against Moral Relativism<br />

I’m going to close with an argument here that any kind of contextualism, whether orthodox<br />

or heterodox, about moral terms, especially “wrong”, does not fit our usage<br />

of those terms. The argument is going to be that in order to offer a contextualist-friendly<br />

account of the behaviour of “wrong” in belief ascriptions and knowledge<br />

ascriptions, we have to suppose that it behaves quite differently in those two settings.<br />

But other context-sensitive terms do not behave that way, and we have good theoretical<br />

reasons to believe that this is not in general how context-sensitive terms behave.<br />

So “wrong” is not context-sensitive, either to contexts of usage or assessment.<br />

At first glance there seems to be very little pattern to the way that contextually<br />

sensitive terms behave in attitude ascriptions. We can find terms, like we, whose<br />

denotation inside a belief ascription is not particularly sensitive to the context of the<br />

ascribee. So in (11), we is naturally interpreted as denoting the speaker and those<br />

around her.<br />

(11) Otto believes that we are fools.<br />

We can find terms, like tasty, whose denotation inside a belief ascription seems to<br />

vary quite a bit depending on the sentence being used. Sometimes tasty seems to<br />

mean tasty to the ascribee, as in (12).<br />

(12) Vinny the Vulture believes that rotting carcasses are tasty.<br />

As we mentioned above, Stephenson (2007a) notes that sometimes it seems to denote<br />

something like tasty to a contextually salient taster. This is illustrated in (13).<br />

(13) Suzy believes that this kind of dog food is tasty.


It is easy to set up a circumstance where this means that Suzy thinks that the salient<br />

dog food is tasty for dogs. So the relevant taster can be the ascribee, but could also be<br />

given by context. It’s not impossible, but not as easy as perhaps it should be, to get<br />

the relevant taster to be the speaker, or the speaker and those around them.<br />

On the other hand, when we use epistemic modals in belief reports, the relevant<br />

‘knower’ is always the ascribee. Consider, for example, (14).<br />

(14) Jack believes that Smith might be happy.<br />

That can only mean that Jack believes that for all Jack (and perhaps his friends)<br />

knows, Smith is happy. It can’t, for instance, mean that Jack believes that for all the<br />

speaker knows, Smith is happy. (Unless the speaker is Jack or one of his friends.)<br />

So we have a progression of cases, where in (11) the contextually sensitive term<br />

‘we’ has to get its denotation from the context of utterance, in (14) the contextually<br />

sensitive term ‘might’ gets its denotation from the context of the ascribee, and (12)<br />

and (13) show that ‘tasty’ can behave in either of these ways. I’ve been putting this<br />

all in terms that will make most sense if we are accepting truth absolutism, but the<br />

same points can be made without that assumption if we so desire.<br />

As I said, at first it might look like there is no pattern here at all. But if we look at<br />

other attitudes, we see that there is an interesting pattern. The way that ‘we’, ‘tasty’<br />

and ‘might’ behave in belief reports is just the same as they behave in knowledge<br />

reports. We can see this in the following examples.<br />

(11a) Otto knows that we are fools.<br />

(12a) Vinny the Vulture knows that rotting carcasses are tasty.<br />

(13a) Suzy knows that this dog food is tasty.<br />

(14a) Jack knows that Smith might be happy.<br />

In each of these sentences, the contextually sensitive term behaves just as it does in<br />

the parallel belief report. This isn’t too surprising. It would be a real shock if some<br />

term t behaved quite differently in belief and knowledge reports. If that were the case<br />

it would be possible in principle to find a passage of the form of (15) that’s true.<br />

(15) S believes that . . . t . . . . Indeed S knows it. But S doesn’t know that . . . t . . . .<br />

It’s impossible to survey every instance of (15) to see whether they all sound contradictory.<br />

But I suspect that they will sound contradictory. So we’ll assume in what<br />

follows that context-sensitive terms behave the same way in belief ascriptions and<br />

knowledge ascriptions, whether or not the kind of context-sensitivity at issue is orthodox.<br />

The problem for contextualism about “wrong” is that it is forced to violate this<br />

principle. Assume that X is wrong means that X is wrong relative to the standards<br />

of some salient group G. We’ll leave aside for now the question of whether G is<br />

determined by the speaker’s context or the assessor’s context, as well as the question<br />

of whether the sentence expresses a proposition involving G, or a proposition that<br />

is true or false relative to groups. We’ll also leave aside the question of just what


it means for something to be wrong relative to the standards. (Does it mean that<br />

G actually disapproves of it, or would disapprove of it under reflection, or that it<br />

doesn’t have properties that G wants to promote, or something else?) We’ll simply<br />

assume that there have been people whose standards are different to ours in ways<br />

that make a difference for the wrongness of actions. If that isn’t the case, we hardly<br />

have a relativism worthy of the name. It’s obviously controversial just what could be<br />

an example of this, but I’ll take as my example Jefferson Davis’s belief that helping<br />

fugitive slaves was wrong. It seems true to say Davis had this belief, so (16) is true.<br />

(16) Davis believed that helping fugitive slaves was wrong.<br />

Now (16) clearly doesn’t mean that Davis believed that helping fugitive slaves was<br />

wrong by my standards. And that’s not just because he didn’t have any de re attitudes<br />

towards me. Whatever (false) proposition I would (according to the contextualist)<br />

express by saying “Helping fugitive slaves was wrong” is not what Davis believed.<br />

He believed something that was made true (if it was) by his moral standards. Now<br />

compare (17).<br />

(17) Davis knew that helping fugitive slaves was wrong.<br />

It seems to me that that’s just false. And it’s false because helping fugitive slaves<br />

wasn’t, in fact, wrong. That’s not to deny that it was wrong by Davis’s standards.<br />

I suspect that by Davis’s standards, helping fugitive slaves was wrong. Maybe his<br />

dispositions to accept various general claims about moral standing, in appropriate<br />

conditions, combined with some facts about the nature of fugitive slaves, meant that<br />

he wouldn’t in ideal conditions reject helping fugitive slaves. But I doubt it. It seems<br />

to me he had deeply ingrained, deeply misguided standards, and by those standards<br />

it was wrong to help fugitive slaves. Perhaps he even knew that about his own standards.<br />

But that’s neither here nor there to the truth of the English sentence (17),<br />

which sounds simply false in virtue of the rightness of helping fugitive slaves.<br />

Now neither the truth of (16), nor the falsity of (17) is, on its own, sufficient to<br />

undermine contextualism about “wrong”. The truth of (16) is consistent with the<br />

claim that “wrong” behaves like “might”. So in attitude ascriptions, what matters is<br />

the ascribee’s context. And the falsity of (17) is consistent with the claim that “wrong”<br />

behaves like “we”, and (17) is false because what helping fugitive slaves was wrong<br />

expresses in our context is false.<br />

Rather, the problem is that an adequate account of “wrong” has to account both<br />

for the truth of (16) and the falsity of (17). And that doesn’t seem to be possible. At<br />

least it isn’t possible without supposing that “wrong” behaves differently in knowledge<br />

reports and belief reports. And we’ve seen some reasons above to believe that<br />

that’s not how context-sensitive terms behave, whether the term is one like “we”, for<br />

which an orthodox theory seems best, or like “might”, for which a heterodox theory<br />

seems best.<br />

I’ll end with some objections that I’ve encountered since I’ve started discussing<br />

this argument, with my replies to each of them.


Objection: This is question-begging against the moral relativist.<br />

This is the most common reaction I’ve heard, but I do find it hard to make sense of. It<br />

is hard to see just which premise is question-begging. Nothing in moral relativism as<br />

such prevents us accepting the truth of Davis believed that helping fugitive slaves was<br />

wrong, and nothing in moral relativism prevents us from rejecting Davis knew that<br />

helping fugitive slaves was wrong. There is, I say, a problem with doing both of these<br />

things, as we should want to do. But if an argument is going to be rejected as question-begging<br />

because the other side can’t simultaneously accept all of its premises, well, we<br />

won’t have many arguments left to work with.<br />

A little more seriously, the relativist theories that I’m opposing here are semantic<br />

theories. If we can’t reject them because they commit us to endorsing sentences that<br />

we (the opponents of the view) can see to be false, then it is hard to know what could<br />

count as an argument in semantics. It’s no defence of a view to say that its proponents<br />

cannot see it is false, if the rest of us can see it.<br />

Objection: We would see the knowledge claim (17) is true, if only we didn’t<br />

have anti-relativist prejudices.<br />

This might well be right; it’s certainly hard to know when one is prejudice-free.<br />

Perhaps all that’s going on is that we don’t want to be committed in any way to<br />

saying that it’s wrong to help fugitive slaves, and we’re worried that accepting (17)<br />

would, in some way, so commit us.<br />

But note how much I’ve done to stack the deck in favour of pro-relativist intuitions,<br />

and to dissipate this worry. The argument is coming at the end of a whole paper<br />

defending relativism. Earlier in this very paper I defended some relativist views<br />

from arguments using factive attitude verbs by noting that it is tricky to state just<br />

what factivity comes to. In particular, I noted that we can sometimes say A knows<br />

that S in circumstances where we would not, indeed could not truthfully, utter S.<br />

And I repeated this observation at the start of this section. I think I’ve done as much<br />

as possible to (a) overcome anti-relativist prejudice, and (b) frame the argument in<br />

such a way as to make it as easy as possible to accept (17). But even in this frame, I<br />

still say we can see that it is false.<br />

Objection: This is only an objection to one kind of context-sensitivity in<br />

moral terms, the kind we associate with traditional moral relativism. But it<br />

doesn’t show that moral terms are in no way context-sensitive. We’d expect<br />

that they are, since some moral terms are gradable adjectives, and gradable<br />

adjectives are context-sensitive.<br />

There’s a really interesting philosophical position around here. Start with the idea<br />

that we should be invariantists, perhaps realists in some quite strong sense, about<br />

moral comparatives. Perhaps this could be tied to the fairly intuitive idea that comparatives<br />

are what’s crucial to morality. Then say that terms like “right” and “wrong”<br />

pick out, in a context-sensitive way, points on the moral scale. So some kind of contextualism,<br />

presumably orthodox, is right for those terms. This position is immune


to the objection given above, because (16) turns out to be true, and (17) false, if we<br />

interpret “wrong” to mean above the actually contextually-salient level of wrongness,<br />

on the objectively correct wrongness scale.<br />

But I think a similar pair of examples shows that this won’t work. Assume that<br />

we’re talking about various people’s charitable giving in a context where we don’t<br />

hold people to super-high standards. So the charitable actions of, say, Bill Gates<br />

count as laudable in our circumstances. (I assume that on the merits Gates’s donations<br />

are for the good; determining whether this is true is well outside the scope of<br />

this paper.) Now consider a philosopher, call him Peter, who doesn’t believe in the<br />

moral supererogatory, so he thinks anything less than the best you can do is morally<br />

wrong. It seems to me that, as uttered in our context, (18) expresses a truth and (19)<br />

a falsehood.<br />

(18) Peter believes that Bill Gates’s level of charitable donation is wrong.<br />

(19) Peter knows that Bill Gates’s level of charitable donation is wrong.<br />

And I don’t think there’s a good contextualist explanation for this pair of judgments.<br />

If “wrong” was just a simple context-sensitive term in the way suggested, then (18)<br />

should be false, because Peter doesn’t believe that Bill Gates’s level of charitable donations<br />

do rise to the level of wrongness that, as it happens, is salient in our context.<br />

But intuitively, (18) is true.<br />

The same kind of objection can be raised to a more prominent kind of theory<br />

that takes a certain kind of normative standard to be set by context, namely classical<br />

contextualism about “knows”. Assume it is common ground that George has excellent,<br />

but not irrefutable, evidence that he has hands. Assume also that we’re in a low<br />

standards context for knowledge. And assume that René thinks knowledge requires<br />

objective certainty. Then it seems that we can truly say (20), but not (21).<br />

(20) René believes that George doesn’t know he has hands.<br />

(21) René knows that George doesn’t know he has hands.<br />

Again, the pattern here is very hard to explain on any kind of contextualist theory,<br />

be it orthodox or heterodox.<br />

Objection: Normative terms might be sui generis. Perhaps they are the only<br />

counterexamples to the pattern in (15).<br />

Anything could be the exception to any rule we like. But it’s bad practice to assume<br />

that we have an exception on our hands. If we heterodox theorists simply responded<br />

to von Fintel and Gillies’ argument from factive verbs by saying that we had an exception<br />

to a general pattern here, we would, quite rightly, not be taken seriously.<br />

Contextualists and relativists about normative terms should be held to the same standard.


Objection: Relativism does have counterintuitive consequences, but all theories<br />

have some counterintuitive consequences. Arguably everyone is going<br />

to have to accept some kind of error theory, and this is a relatively harmless<br />

kind of error to attribute to the folk.<br />

If we were convinced, perhaps by one or other of the contemporary developments<br />

of Mackie’s argument from queerness (Mackie, 1977), that no non-relativistic moral<br />

theory is possible (apart from a Mackie-style moral error theory), that would be an<br />

interesting argument for moral relativism. Certainly I’d be more willing to accept<br />

that (16) and (17) don’t have the same kind of context-sensitivity than I would be to<br />

accept that, say, it’s not wrong to torture babies.<br />

It is well beyond the scope of this paper to adjudicate such debates in any detail,<br />

but I am a little sceptical that we will ever face such a choice. Generalising<br />

wildly, most of the time our choice is between (a) accepting a moral error theory, (b)<br />

accepting some odd semantic consequences, as outlined here, or (c) rejecting some<br />

somewhat plausible claim about the nature of moral judgment, such as motivational<br />

internalism. (Without internalism there’s nothing to make moral properties “queer”<br />

in Mackie’s sense, for example.) Arguments that present a trilemma such as this deserve<br />

to be judged on their merits, but my feeling is that we should normally take<br />

option (c). That’s not to say this objection is obviously flawed, or that the argument<br />

I’ve offered is a knock-down refutation of relativism. It clearly is not. But I think it<br />

is a challenge that moral relativists and contextualists should face up to.


Conditionals and Indexical Relativism<br />

This paper is about a class of conditionals that Anthony Gillies (ms) has dubbed<br />

‘open indicatives’, that is, indicative conditionals “whose antecedents are consistent<br />

with our picture of the world” (1). I believe that what I say here can eventually be<br />

extended to all indicative conditionals, but indicatives that aren’t open raise special<br />

problems, so I’ll set them aside for today. In <strong>Weatherson</strong> (2001a) I argued for an epistemic<br />

treatment of open indicatives, and implemented this in a contextualist semantics. In<br />

this paper I want to give another argument for the epistemic approach, but retract<br />

the contextualism. Instead I’ll put forward a relativist semantics for open indicatives.<br />

The kind of relativism I’ll defend is what I’ll dub ‘indexical relativism’.<br />

I’ve changed my mind since <strong>Weatherson</strong> (2001a) largely because of developments<br />

since I wrote that paper. There have been six primary influences on this paper, listed<br />

here in the order in which they become relevant.<br />

1. The arguments that Jason Stanley (along with co-authors) puts forward in his<br />

(2007) for the view that all effects of context on semantic content are syntactically<br />

triggered, and in particular involve context setting the value for a tacit or<br />

overt variable.<br />

2. John MacFarlane’s defences of semantic relativism, starting with MacFarlane (2003a).<br />

3. John MacFarlane’s recent work, including MacFarlane (2009), aimed at distinguishing<br />

the view that propositional truth can vary between different contexts in<br />

the same world from the view that the truth of an utterance can be assessor-sensitive.<br />

4. Tamina Stephenson’s (2007a) arguments in favour of a variable PRO_J whose<br />

value is set by assessors.<br />

5. Philippe Schlenker’s (2003) idea, modelled on some examples from Barbara<br />

Partee (1989), that plural variables can be ‘partially bound’.<br />

6. Anthony Gillies’s (2009) suggestions for how to explain the acceptability of the<br />

‘import-export’ schema in an epistemic theory of indicative conditionals.<br />

I think it is noteworthy, in light of the claims one sometimes hears about philosophy<br />

not making progress, that most of the building blocks of the theory defended here<br />

weren’t even clearly conceptualised at the time I wrote the earlier paper.<br />

This paper is in seven sections. It starts with an argument in favour of an<br />

epistemic treatment of open indicatives, namely that only the epistemic theory can<br />

explain our judgments about inferences involving open indicatives. The argument<br />

isn’t completely original; indeed much of what I’ll say here can be found in Stalnaker<br />

(1975), but I don’t think the scope of this argument has been sufficiently appreciated.<br />

There is a large family of epistemic theories, and in section two I’ll set out some of<br />

the choice points that an epistemic theorist faces. I’ll also introduce a fairly simple<br />

epistemic theory, not actually the one I favour, that I’ll focus on in what follows.<br />

My preferred theory has several more bells and whistles, but I don’t think those are<br />

relevant to the issues about relativism and indexicalism that I’ll focus on here, and<br />

including them would just complicate the discussion needlessly.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Synthese<br />

166 (2009): 333-357. Thanks to Herman Cappelen, Andy Egan, Anthony Gillies, John Hawthorne,<br />

Jonathan Ichikawa, Jeff King, Ishani Maitra, Jason Stanley, Zoltan Szabó, Crispin Wright and audiences at<br />

the 2005 LOGOS workshop in Barcelona, and at Michigan, Rutgers and St Andrews.<br />

In section three I look at four ways a theory could say that the truth of an utterance<br />

type is sensitive to context. 1 The four ways are generated by the ways the theory<br />

answers two questions. First, is the truth of the utterance type sensitive to facts about<br />

the context of utterance, as contextualists say, or to facts about the context of evaluation,<br />

as relativists say? Second, does the utterance type express different propositions<br />

in different contexts, as indexicalists say, or does it express a proposition that takes<br />

different truth values in different contexts, as non-indexicalists say? In using the term<br />

‘indexicalist’, I’m implicitly assuming the theory, most associated with Jason Stanley<br />

(2007), that the way an utterance type can express different propositions in different<br />

contexts is that it has a variable in its semantic structure, and different contexts assign<br />

different values to this variable.<br />

Three of the four options generated by the two questions, indexical contextualism,<br />

non-indexical relativism, and non-indexical contextualism, have received some<br />

coverage in the literature. The fourth option, indexical relativism, has not been as<br />

widely discussed. In section four I say a little about its motivations, including its<br />

connection to recent work by Tamina Stephenson. The variable in the semantics for<br />

open indicatives is a plural variable; roughly it takes as values all those propositions<br />

that are known by the salient people in the context. In section five I note some odd<br />

properties of bound plural pronouns that will become relevant to the story that<br />

follows. The short version is that some plural pronouns can have their values set<br />

partially by antecedent linguistic material, and partially by context. So if a pronoun<br />

v refers to the Xs, it might be that a is one of the Xs because v is bound to a term that<br />

denotes a, and b is one of the Xs because b is contextually salient in the right kind of<br />

way to be one of the things that v denotes.<br />

In section six I put forward the arguments against contextualism, and in particular<br />

against indexical contextualism. I start with some arguments from ‘faultless disagreement’,<br />

and go through four reasons why these might not be too strong. I then discuss<br />

some arguments from facts about when two people have said the same thing, or have<br />

said different things. These arguments tell against simple forms of indexical contextualism,<br />

but not against more sophisticated versions. But I conclude with a somewhat<br />

simpler argument, an argument from what I’ll call easy agreement, that does seem to<br />

undermine indexical contextualism.<br />

1 It is actually a little tricky to say just what the relevant types are. I mean to use the term so that two<br />

people make an utterance of the same type if their utterances use the same words, with the same syntax,<br />

and the same elided material, and the same meaning. So two utterances of “I am Australian” could be of<br />

the same type, even if offered by different people. In some cases, e.g. “Steel is strong enough”, it might<br />

be controversial whether two utterances that intuitively have different contents are either (a) of the same<br />

type, or (b) use terms with different meanings, or (c) have different elided material. I’ll try to stay neutral<br />

on this point.



Finally in section seven I’ll argue against non-indexical theories of open indicatives.<br />

The primary argument will be that the indexicalist has a good explanation<br />

of what’s going on in McGee’s ‘counterexamples to modus ponens’, an explanation<br />

borrowed from some recent work by Anthony Gillies, but the non-indexicalist does<br />

not. The indexicalist’s explanation is that these arguments contain fallacies of equivocation;<br />

on the non-indexicalist position, it is hard to see how they equivocate. The<br />

upshot of the final two sections is that indexical relativism is correct.<br />

1 Inferences Involving Conditionals<br />

I’m going to start by offering an argument for an epistemic treatment of conditionals.<br />

The argument isn’t particularly new; I’m basically just offering an extension of the argument<br />

in Stalnaker (1975), but I don’t think the force of it has been fully appreciated.<br />

The argument starts with the observation that any instance of any of the following<br />

inference schema seems acceptable when the conclusion is an open indicative.<br />

(1.1) Not A or B; so, If A then B<br />

(1.2) All Fs are Gs; so, If Fa then Ga<br />

(1.3) f(a) = f(b); so, If f(a) = c then f(b) = c<br />

Here are some instances of each of these inferences.<br />

(1.4) Either Jack won’t come to the party or Jill will; so if Jack comes to the party,<br />

so will Jill.<br />

(1.5) All of Kim’s students failed; so if Alex was one of Kim’s students, then Alex<br />

failed.<br />

(1.6) Peter’s mother is Paul’s mother; so if Peter’s mother is Mary, Paul’s mother is<br />

Mary.<br />

Quite a lot has been written about (1.1)/(1.4) and I don’t propose to add to it. It<br />

is arguable that part of the explanation for its attractiveness comes from pragmatic<br />

properties of disjunctions, and if that’s right it would complicate the story I want to<br />

tell here. Instead I’ll focus on the other two inferences.<br />

Jonathan Ichikawa pointed out to me that (1.3) is not particularly compelling in<br />

cases where it is common ground in the conversation that f (a) is not c. For instance,<br />

it is at least odd to say “Peter’s mother is Jane, and she’s also Paul’s mother. So Peter’s<br />

mother is Paul’s mother; so if Peter’s mother is Mary, Paul’s mother is Mary.” I think<br />

intuitions differ on these cases. I don’t find the inferences as bad as many informants<br />

do, and generally speaking intuitions about knowledge-contravening indicatives are<br />

rather fuzzy. So I’m just going to focus on the case where the conclusions are open.<br />

Ernest Adams (1998) has used the following example to object to the claim that<br />

instances of (1.2) always seem like acceptable inferences.<br />

(1.7) Everyone who was at the party is a student. So if the Chancellor was at the<br />

party, the Chancellor is a student. (Adams 1998: 289)



I actually think this sounds like a perfectly fine inference. If I say that everyone<br />

at the party was a student, and someone takes me thereby to be committed to<br />

the claim that if the Chancellor was at the party, she too is a student, then I won’t<br />

complain. But perhaps my intuitions here are odd. Here is an intuition that I feel<br />

more comfortable with. If the conclusion of (1.7) is an open indicative, that is if it<br />

isn’t ruled out that the Chancellor is a student, then the inference in (1.7) sounds<br />

perfectly fine to me.<br />

Adams has to object to (1.2) because it provides counterexamples to a thesis he<br />

defends at some length. This thesis is that an inference from a single premise to a<br />

conditional If p then q is a good inference iff necessarily the probability of q given p<br />

is not lower than the probability of the premise. Schema (1.3) is a problem here<br />

as well. In each case it isn’t too hard to find instances where the probability of the<br />

premise is arbitrarily high, but the probability of the conclusion’s consequent given<br />

its antecedent is arbitrarily low. For instance, let the salient probabilities be as below:<br />

Pr(f(a) = f(b) = d) = 1 − x<br />

Pr(f(a) = c ∧ f(b) = e) = x<br />

If we let x be arbitrarily small, then the probability of the premise f(a) = f(b) will<br />

be arbitrarily high. But the conditional probability of f(b) = c given f(a) = c will be<br />

arbitrarily low. So the probability preservation property Adams has highlighted isn’t<br />

what is always preserved in good inferences.<br />
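The arithmetic behind this counterexample can be checked directly. Below is a minimal sketch, assuming a toy two-outcome probability space and a sample value for x; the space and the variable names are my own illustration, not Adams’s.<br />

```python
# Numerical check of the counterexample sketched above. The two-outcome
# space and the particular value of x are illustrative assumptions.

x = 0.01  # make this as small as you like

# Two exclusive, exhaustive possibilities (c, d, e are distinct values):
#   w1: f(a) = f(b) = d,        probability 1 - x
#   w2: f(a) = c and f(b) = e,  probability x
pr_w1, pr_w2 = 1 - x, x

# The premise f(a) = f(b) holds only in w1.
pr_premise = pr_w1

# Conditional probability of the conclusion's consequent, f(b) = c, given its
# antecedent, f(a) = c. The antecedent holds only in w2, where f(b) = e, not c.
pr_consequent_given_antecedent = 0.0 / pr_w2

print(pr_premise)                      # arbitrarily high as x shrinks
print(pr_consequent_given_antecedent)  # 0.0
```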

Each of the inferences in (1.1) to (1.3) is, in some sense, a good inference. It<br />

is easy to prove from that fact, and the assumption that good inferences are valid<br />

implications, that the conditional If p then q is true if p is false or q is true. If we<br />

assume Modus Ponens is valid (as I will throughout) then we can strengthen this<br />

conditional to a biconditional. It is obviously easy to prove this using the goodness<br />

of (1.1). Here is a proof that uses just (1.3), and some weak assumptions about truth.<br />

Line 1 is the only assumption, and every line seems to follow from the one before it.<br />

(1) The truth value of p is the truth value of p & q<br />

(2) So if the truth value of p is true, then the truth value of p & q is true<br />

(3) So if p is true, then p & q is true<br />

(4) So if p, then p & q<br />

(5) So if p, then q<br />

I’ve assumed here that we can treat The truth value of p is true, p is true, and p as equivalent,<br />

but this seems uncontroversial. And on pretty much any conditional logic there<br />

is, the move from (4) to (5) will be valid. Assuming bivalence, (1) is equivalent to the<br />

disjunction p is false or q is true. So we can infer from that disjunction to If p, then q<br />

without using the schema (1.1).<br />

This might all suggest that open indicatives should be interpreted as material implications.<br />

But there is some data that tells against that. This suggestion from Richard<br />

Bradley (2000) seems correct.



[O]ne cannot be certain that B is not the case if one thinks that it is possible<br />

that if A then B, unless one rules out the possibility of A as well. You<br />

cannot, for instance, hold that we might go to the beach, but that we certainly<br />

won’t go swimming and at the same time consider it possible that<br />

if we go to the beach we will go swimming! To do so would reveal a misunderstanding<br />

of the indicative conditional (or just plain inconsistency).<br />

(Bradley, 2000, 220)<br />

More generally, someone who regards A as an epistemic possibility, but knows<br />

that B is false, should regard If A, B also as something they know to be false. Bradley<br />

puts this in probabilistic terms as follows.<br />

Preservation Condition: If Pr(A) > 0 but Pr(B) = 0, then Pr(A → B) = 0<br />

This isn’t obviously the best formulation of his principle. In the example, what matters<br />

is not that A has non-zero probability, but that it is something that might be<br />

true. (These are different. The probability that the average temperature in Ithaca on<br />

January 1 next year will be exactly 32 degrees Fahrenheit is 0, but that might be the<br />

exact temperature.) The structure of the inference looks to be what is given in (1.8),<br />

where Kp means that the relevant agent knows that p.<br />

(1.8) ¬K¬A<br />

K¬B<br />

So, K¬(If A, B)<br />

But this is not valid if the conditional is a material implication. So it is impossible<br />

to hold that all of the intuitively plausible principles of inference involving<br />

conditionals are truth-preserving. There must be some other explanation of the reasonableness<br />

of all these inferences other than their being valid implications.<br />
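The invalidity claim can be made concrete in a two-world model. This is only a toy formalisation of my own, assuming knowledge is modelled as truth at every epistemic possibility and the conditional is read materially:<br />

```python
# Toy possible-worlds check that (1.8) is invalid when "If A, B" is read as
# the material conditional. The worlds and valuations are illustrative.

worlds = [
    {"A": True,  "B": False},  # w1: A holds, B fails
    {"A": False, "B": False},  # w2: neither holds
]
epistemic = worlds  # both worlds are live epistemic possibilities

def K(prop):
    """The agent knows prop iff it holds at every epistemic possibility."""
    return all(prop(w) for w in epistemic)

A = lambda w: w["A"]
material = lambda w: (not w["A"]) or w["B"]  # A ⊃ B

premise1 = not K(lambda w: not A(w))       # ¬K¬A: A is epistemically possible
premise2 = K(lambda w: not w["B"])         # K¬B
conclusion = K(lambda w: not material(w))  # K¬(A ⊃ B)

print(premise1, premise2, conclusion)  # True True False
```

The premises hold but the conclusion fails, because the material conditional is true at w2, where A is false.<br />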

The best explanation I know of this ‘reasonableness’ is the one endorsed by<br />

Daniel Nolan (2003) as an explanation of inferences like (1.1). Nolan says that given<br />

an epistemic theory of the indicative, we can say that each of the inferences has the<br />

following property. Any speaker who knows the premise is in a position to truly<br />

assert the conclusion. Call an inference like this, where knowledge of the premise<br />

implies truth of the conclusion, epistemically acceptable. If we are confusing valid<br />

implications with epistemically acceptable inferences, this could explain why all of<br />

(1.1) through (1.3) seem reasonable. More impressively, this hypothesis of Nolan’s<br />

explains why (1.8) seems reasonable, given an epistemic theory of indicatives. If we<br />

know that A is true in some epistemic possibilities, but B is false in all of them,<br />

then all the epistemically salient alternatives where A is true will be ones where B is<br />

false. So (1.8) will turn out to be a good inference, by Nolan’s criteria. So (1.8), like<br />

(1.1) through (1.3), is epistemically acceptable. So given Nolan’s epistemic account<br />

of reasonable inference, and an epistemic theory of indicative conditionals, we can<br />

explain the reasonableness of all five problematic inferences. In the absence of any<br />

other good explanation of this reasonableness, this seems to me to be a good reason<br />

to accept both Nolan’s account and an epistemic theory of indicative conditionals.
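The claim that (1.8) comes out good on the epistemic reading can be checked by brute force over small models. Here is a sketch assuming the simple reading on which If A, B is true relative to an epistemic state iff every A-world in the state is a B-world; the formalisation is mine, not Nolan’s.<br />

```python
import itertools

def if_then(state, antecedent, consequent):
    """Epistemic reading: the conditional is true relative to the state iff
    every antecedent-world in the state is a consequent-world."""
    return all(consequent(w) for w in state if antecedent(w))

A = lambda w: w["A"]
B = lambda w: w["B"]

# All four A/B valuations, and every non-empty epistemic state over them.
valuations = [{"A": a, "B": b} for a in (True, False) for b in (True, False)]

holds_everywhere = True
for n in range(1, len(valuations) + 1):
    for state in itertools.combinations(valuations, n):
        possible_A = any(A(w) for w in state)       # ¬K¬A
        known_not_B = all(not B(w) for w in state)  # K¬B
        if possible_A and known_not_B:
            # (1.8)'s conclusion: the conditional is false relative to the state
            holds_everywhere &= not if_then(state, A, B)

print(holds_everywhere)  # True
```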



2 The Simple Epistemic Theory of Conditionals<br />

For concreteness, I’ll work in this paper with a very simple theory of conditionals. I<br />

assume that in general a conditional If p, q has the logical form C(p, q, X), where C<br />

is the conditional relation, and X is a plural variable that denotes some propositions<br />

taken as fixed in the context. The simple epistemic theory makes two additions to<br />

this basic assumption.<br />

First, there is some epistemic relation R such that a proposition s is among the X iff<br />

some salient individual i stands in relation R to s. We’ll use R(i) to represent those<br />

propositions. It will become important later that X is genuinely a plural variable, so<br />

R(i) is not a set of propositions, or a fusion of propositions (whatever that would be).<br />

Rather, I just mean to be plurally referring to the propositions that stand in relation<br />

R to i. (Note that I’m not saying anything here about how i is determined; my<br />

preferred theory is that it is the evaluator of any conditional utterance, but nothing<br />

in the simple epistemic theory turns on this.)<br />

A very conservative version of the theory says that R is the knowledge relation.<br />

One can liberalise the theory in two respects. First, we can say that R is the ‘position<br />

to know’ relation. Second, we can say that sRi iff someone salient to i knows that<br />

s. A maximally liberal version of the theory says that sRi iff someone salient to i is<br />

in a position to know that s. I’m not going to argue for this here, but I think this<br />

maximally liberal option is the way to proceed, so that’s what I’m going to adopt for<br />

the sake of exposition. Nothing turns on this adoption in what follows. Indeed the<br />

arguments against contextualism and for relativism are stronger the more constrained<br />

R is, so this is tilting the playing field away from my preferred outcome.<br />

Second, the simple theory I have in mind says that C is basically just the a priori<br />

entailment relation. So C(p, q, X) is true iff p plus X a priori entail q. If you want to<br />

say that entailment is a relation between a set and a proposition, the claim is that the<br />

union of {p} and {s: s is among the X} a priori entails q.<br />
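To fix ideas, the simple theory can be modelled with propositions as sets of worlds. This is only an illustrative sketch of my own: it flattens a priori entailment into truth-preservation across a stipulated space of worlds, which the official theory does not require.<br />

```python
# A toy model of the simple epistemic theory: C(p, q, X) holds iff p together
# with the background propositions X entails q. Entailment is modelled as
# truth at every world compatible with p and X; this is a simplifying
# assumption, since the paper's relation is a priori entailment.

def C(p, q, X, worlds):
    live = [w for w in worlds if p(w) and all(x(w) for x in X)]
    return all(q(w) for w in live)

# Illustrative worlds: each assigns truth values to two atoms.
worlds = [{"rain": r, "wet": s} for r in (True, False) for s in (True, False)]
rain = lambda w: w["rain"]
wet = lambda w: w["wet"]

# Background X: it is known that if it rains, the ground gets wet.
X = [lambda w: (not w["rain"]) or w["wet"]]

print(C(rain, wet, X, worlds))   # True: "If it rains, it's wet" comes out true
print(C(rain, wet, [], worlds))  # False: without the background, it does not
```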

There are several ways in which one might want to complicate the simple epistemic<br />

theory. My preferred theory involves some of these complications. Here are<br />

some complications that have been proposed in various ways.<br />

First, we might change C to stipulate that whenever p and q are true, C(p, q, X)<br />

is true. This is equivalent to endorsing strong centring in the sense of Lewis (1973b).<br />

Assuming every proposition in X is true, as I’ve done above, means that we’ve already<br />

guaranteed that C(p, q, X) is false when p is true and q is false.<br />

Second, we might deny bivalence in the following way. Say that C(p, q, X) is<br />

true iff p and X a priori entail that q, false if it is not true and also p and X entail<br />

¬q, and indeterminate in truth value otherwise. Going down this path allows one<br />

to endorse conditional excluded middle, as supported by Stalnaker (1981). Denying<br />

bivalence does not compel acceptance of conditional excluded middle, but it becomes<br />

an interesting possibility once you go down this path.<br />

Third, we might say that R is a disjunctive relation, so some propositions are<br />

among the R(i) because they stand in an epistemic relation to i, and others are included<br />

because they are in some sense ‘fixed facts’ of i’s world. Nolan (2003) uses a quite



different formalism, but if we wanted to translate his theory into this formalism,<br />

that’s what we’d do.<br />

Fourth, we could make C a more complicated relation. In particular, we could<br />

make it in a sense non-monotonic, e.g. by saying that C(p, q, X) holds iff the epistemic<br />

probability of q given p and X is sufficiently high. If C is non-monotonic in this sense,<br />

then we can have a conditional logic that looks like some of the conditional logics in<br />

Lewis (1973b).<br />

For what it’s worth, I favour the first and (a version of) the second of these complications,<br />

but not the third and fourth. Defending those preferences would take us<br />

too far afield however. What I mostly want to show in this paper is that whatever<br />

form of epistemic theory we adopt, we should adopt what I’ll call an indexical relativist<br />

version of that theory. So I’ll just presuppose the simple epistemic theory<br />

throughout, because the general form of the argument should be easily adaptable<br />

whichever complications one adds on. The task of the next section is to introduce<br />

indexical relativism.<br />

3 Four Kinds of Sensitivity<br />

Let’s say that one is tempted towards a kind of moral relativism. So when old Horace,<br />

way back when, said “Premarital sex is morally worse than driving drunk”, he said<br />

something true in some sense, and when modern Kayla now says “Driving drunk<br />

is morally worse than premarital sex”, she also says something true in a sense. How<br />

might we formalise these intuitions? (Not, I might add, intuitions that I share.) There<br />

are a few simple options, breaking down along two distinct axes. To save space, I’ll<br />

write P for premarital sex, D for driving drunk, and < for the relation is morally<br />

worse than, in what follows. 2<br />

The first axis concerns the nature of propositions about the moral. One option is<br />

to say that moral codes are part of the propositions that are the content of Horace’s<br />

and Kayla’s utterances. For example, we might say that when old Horace makes his<br />

utterance, its content is the proposition P < D in M_O, where M_O is Horace’s old<br />

moral code. Conversely, when Kayla makes her utterance, its content is the proposition<br />

D < P in M_N, where M_N is Kayla’s new moral code. This option, as Sayre-<br />

McCord (1991) notes, treats ‘moral’ as being like ‘legal’. When we say “Insulting the<br />

Thai monarch is illegal”, the content of our utterance is the proposition Insulting<br />

the Thai monarch is illegal in L, where L is some salient legal code. That’s why typical<br />

utterances of that sentence in Bangkok are true, but typical utterances of it in St<br />

Andrews are false. Call this option indexicalism, since it thinks there is an indexical<br />

element in the semantic structure of what Horace and Kayla say. Because this will<br />

become crucial later, what I’m taking to be essential to indexicalism is simply the<br />

view that there is a moral code in the proposition expressed, not that it is the moral<br />

code of the speaker.<br />

2 I don’t mean to suggest that these are the only options. I’m leaving off options on which there are<br />

contextual effects on semantic content that are not syntactically triggered, for example. My reason for<br />

doing that is that there are, I think, good reasons for thinking that the context-sensitivity of indicative<br />

conditionals is syntactically triggered, so I don’t need to investigate non-syntactic triggers here.



A quite different option is to say that the content of Horace’s utterance is simply<br />

the proposition P < D. The relativism comes in because it turns out that propositions<br />

are true or false relative to, inter alia, moral codes. The proposition P < D is true in<br />

M_O, and false in M_N. The analogy here is to the way the very same proposition can<br />

be true in one world and false in another. This option, which I’ll call non-indexicalism,<br />

says that moral codes function much like worlds; they are things relative to which<br />

propositions are true or false. The non-indexicalist takes Kaplan (1989a) to be on the<br />

right track in saying that propositions are true or false relative to world-time pairs,<br />

but thinks that the indices relative to which propositions are true or false are even<br />

more fine-grained than that.<br />

The second axis concerns which context is relevant to the truth of the utterance.<br />

One option is to say that it is the context of utterance. A second option is to say<br />

that it is the context of evaluation. Following MacFarlane (2007) and MacFarlane (2009),<br />

I’ll call the first option contextualism, and the second option relativism. The point<br />

that’s worth focussing on here is that what choice we make here cuts across the choice<br />

we make on the first axis. So there are four options available. To set these out, we<br />

need to introduce a third character (call him Deval) who assesses Horace’s and Kayla’s<br />

utterances. For concreteness, call Deval’s moral code M_A, and say that it agrees with<br />

M_N on the point at issue. Then here are the four options we have.<br />

• Indexical Contextualism. The propositions that are the content of Horace’s and<br />

Kayla’s utterances include moral codes, and which code that is is determined by<br />

features of their utterance. So the content of Horace’s utterance is the proposition<br />

P < D in M_O, and Kayla’s the proposition D < P in M_N. Deval should<br />

assess each of them as having uttered truths.<br />

• Non-indexical Relativism. The propositions that are the content of their utterances<br />

do not include moral codes, and their utterances are only true or false<br />

relative to a moral code provided by an assessor. So the content of Horace’s<br />

utterance is simply P < D, and Kayla’s D < P. Since D < P is true in M_A, Deval<br />

should assess Horace’s utterance as false, and Kayla’s as true.<br />

• Non-indexical contextualism. The propositions that are the content of their<br />

utterances do not include moral codes, but the truth-value of a moral utterance is<br />

the truth-value of its content in the context it is expressed in. So the content<br />

of Horace’s utterance is simply P < D, and Kayla’s D < P. Since Horace makes<br />

his utterance in a context where P < D is part of the prevailing moral code,<br />

his utterance is true. So Deval and Kayla should assess it as true, even though<br />

they think the proposition Horace expressed is false. This isn’t contradictory;<br />

a person in another possible world can make a true utterance by expressing<br />

a proposition that is (actually) false, and for the non-indexical contextualist,<br />

moral codes are in a way like worlds. Kayla’s utterance is true as well, since it<br />

is made in a different context.<br />

• Indexical relativism. The propositions that are the content of Horace’s and<br />

Kayla’s utterances include moral codes, and which code that is is determined<br />

by features of the context of assessment. So when Deval hears of these two<br />

utterances, he should interpret the content of Horace’s utterance to be P < D



in M_A, and of Kayla’s to be D < P in M_A. Since the latter proposition is true,<br />

he should interpret Kayla’s utterance as true, and Horace’s as false.<br />

It might make it easier to picture these positions in a small table, as follows.<br />

| | The speaker’s moral code matters to utterance truth (contextualism) | The assessor’s moral code matters to utterance truth (relativism) |<br />

| Propositions include moral codes (indexicalism) | Indexical Contextualism | Indexical Relativism |<br />

| Propositions are true or false relative to moral codes (non-indexicalism) | Non-indexical Contextualism | Non-indexical Relativism |<br />

The modern discussion of non-indexical relativism, though not under that name,<br />

traces to MacFarlane (2003a). The modern discussion of non-indexical contextualism,<br />

under that name, traces to MacFarlane (2007) and MacFarlane (2009). Much of this paper,<br />

and all of this section, is about setting out the distinctions that MacFarlane makes in<br />

the latter papers between indexicalism and contextualism. But once we do that, we<br />

see that there is a position, indexical relativism, that hasn’t had much attention. I<br />

plan to change that.<br />

Before we get on to the content of indexical relativism, a small note on nomenclature<br />

is in order. I’ve picked the names I have so (a) we’ll have a compositional naming<br />

scheme and (b) we get ‘non-indexical contextualism’ to denote what it denotes<br />

in MacFarlane’s terminology. This does mean using the term ‘indexical relativism’<br />

in a slightly different way to how it has been used in the past. Einheuser (2008)<br />

and López de Sa (2007a) each use ‘indexical relativism’ to mean just what I’ve meant<br />

by ‘indexical contextualism’. Kölbel (2004) also uses the term ‘indexical relativism’,<br />

though López de Sa (2007b) argues that he too just means contextualism, and I’m<br />

inclined to agree. So though the name had been previously used, it had not been used<br />

to express a distinctive view.<br />

On the other hand, there had been some discussions of the position I’m calling<br />

‘indexical relativism’. In Egan et al. (2005) we call such a position ‘content relativism’,<br />

though Cappelen (2008) uses that term for a slightly different position. MacFarlane<br />

(2005b) discusses ‘assessment indexicality’, a property sentences have if they<br />

express different propositions in different contexts. So there doesn’t seem to be a<br />

settled terminology for this corner of the table, and I propose to take ‘indexical relativism’<br />

for it.



4 Indexical Relativism<br />

In “Judge Dependence, Epistemic Modals, and Predicates of Personal Taste”, Tamina<br />

Stephenson proposes a variant on Peter Lasersohn’s (2005) relativist account of predicates<br />

of personal taste. Stephenson proposes that predicates of personal taste always<br />

encode relations between an object and an assessor. So when we say “Warm beer is<br />

tasty” we express some proposition of the form Warm beer is tasty to X. So far, this is<br />

not particularly new. What is interesting is Stephenson’s suggestion that some of the<br />

time (but not always) there is a ‘silent nominal’ PRO_J, whose value is the ‘judge’. So<br />

the utterance will be true as judged by y iff warm beer is tasty to y. There are several<br />

advantages to positing a tacit parameter. One that Stephenson stresses is that in some<br />

cases, e.g. when we are talking about the tastiness of various brands of cat food, we<br />

can let the value of this parameter be the cat rather than any human. But, by letting<br />

it by default take the value PRO_J, Stephenson shows that we can accommodate most<br />

of the intuitions that motivate Lasersohn’s relativism.<br />
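Stephenson’s proposal can be put schematically. This is my reconstruction, not her notation:<br />

```latex
% An utterance u of "Warm beer is tasty", containing the silent nominal
% PRO_J, is true as judged by y iff the taste relation holds between
% the object and that judge:
\mathrm{True}(u, y) \iff \mathrm{tasty}(\mathrm{warm\ beer},\, y)

% When context makes another assessor salient (e.g. the cat, in the
% cat food case), that assessor fills the argument place instead:
\mathrm{tasty}(\mathrm{cat\ food},\, \mathrm{the\ cat})
```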

Now Stephenson is not an indexical relativist, as I’ve defined that position. For<br />

according to the indexical relativist, propositions are only true or false relative to<br />

worlds. And Stephenson has propositions true or false relative to world-time-judge triples. But I think<br />

we can adopt her idea to set out a kind of indexical relativism. I’ll first say how this<br />

could go in the moral case, then apply it to conditionals.<br />

The moral indexical relativist says that the context-neutral content of an utterance<br />

like Kayla’s is not a complete proposition. Rather, it is a propositional frame that we<br />

might express as D < P in M(PRO_J), where M(x) is x’s moral code, and PRO_J is (as<br />

always) the judge. Relative to any judge, the content of her utterance is that D < P<br />

in that judge’s moral code. So relative to Horace, the content of her utterance is the<br />

false proposition D < P in M_O, and relative to Deval it is the true proposition D < P<br />

in M_A. That’s why it is fine for Horace to say “I disagree”, or “That’s false”, or “She<br />

speaks falsely”, and fine for Deval to say “I agree”, or “That’s true”, or “She speaks<br />

truly”. Now I reject any kind of moral relativism, so this isn’t my theory for moral<br />

language, but it’s a theory that could in principle work.<br />

What I will defend is indexical relativism for indicative conditionals. In general,<br />

the content of an indicative conditional If p, q is C(p, q, X), where the propositions in<br />

X are the ‘background’ propositions relative to which the conditional is assessed, and<br />

C is the conditional relation. 3 An epistemic theory of indicatives says that the value<br />

of X is (by default) R(x), where R is some epistemic relation (on a broad construal of<br />

‘epistemic’) and x is a salient individual. The indexical relativist position is that the<br />

content of an utterance of a conditional is (by default) a propositional frame that we<br />

might express as C(p, q, R(PRO_J)). Relative to an assessor a, the content is C(p, q,<br />

R(a)).<br />
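The two views just distinguished can be set side by side. This is a schematic sketch in the paper’s own notation, where ⋀X is the conjunction of the background propositions:<br />

```latex
% Indexical contextualism: the context of utterance c supplies a salient
% individual i_c, and the utterance expresses a complete proposition.
\mathrm{Content}(\mathit{If\ p,\ q} \mid c) \;=\; C(p,\, q,\, R(i_c))

% Indexical relativism: the context-neutral content is a propositional
% frame; each assessor a saturates the judge variable PRO_J.
\mathrm{Content}(\mathit{If\ p,\ q}) \;=\; C(p,\, q,\, R(\mathrm{PRO}_J))
\mathrm{Content}(\mathit{If\ p,\ q} \mid a) \;=\; C(p,\, q,\, R(a))

% Sufficiency assumption (footnote 3): C(p, q, X) is true if it is
% a priori that p together with the X entail q.
\mathit{apriori}\bigl((p \land \textstyle\bigwedge X) \rightarrow q\bigr) \;\Rightarrow\; C(p,\, q,\, X)
```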

The argument for this position will come in sections 6 and 7. In section 6 I’ll argue<br />

against indexical contextualism. The argument will be that if indexical contextualism<br />

were true, it should be harder to agree with an utterer of a conditional than it actually<br />

3 I’m assuming throughout that it is sufficient for the truth of C(p, q, X) that it is a priori that p plus X<br />

entail q. I also think that’s necessary, but I won’t lean on this assumption.



is. Then in section 7, I’ll argue for indexicalism. The argument will be that we need<br />

to posit the third argument place in the conditional relation to explain what goes<br />

wrong in some arguments that are alleged to be both instances of modus ponens and<br />

invalid. I’ll argue (not particularly originally) that these arguments involve a shift of a<br />

tacit parameter, namely X. This suggests that X exists. Between those two arguments,<br />

we can conclude that indexical relativism is true. Before that, I want to look at some<br />

arguments against indexical relativism.<br />

In Egan et al. (2005) we mention two arguments against this position. One is that<br />

indexical relativism is incompatible with a Stalnakerian account of the role of assertion.<br />

Assertions, we said, are proposals to add something, namely their content, to<br />

the context set. But if the content of an assertion is different in different contexts,<br />

then it is impossible to add it to the context set. And that, we thought, was a problem.<br />

I now think there’s a relatively simple way around this. 4 If you want to add a<br />

proposition to the context set, then there has to be a context. And relative to any<br />

context, a conditional does have a content. So given any context, the content of the<br />

conditional (relative to that context) can be added to the context set. And that’s all<br />

the Stalnakerian account requires.<br />
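The reply can be written as a sketch of an update rule; s_c here is my label for the context set of a context c, and j_c for its judge, neither of which is notation from the original:<br />

```latex
% Relative to any particular context c there is a determinate content,
% so an assertion of "If p, q" proposes the determinate update:
s_c' \;=\; s_c \cap \{\, w : C(p,\, q,\, R(j_c)) \text{ is true at } w \,\}
% The content varies from context to context, but each context has
% something definite to intersect with its context set, which is all
% the Stalnakerian account requires.
```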

Perhaps a stronger version of this objection is that even if you can figure out,<br />

given the rules, what move a speaker is making according to this theory, this isn’t<br />

a move that sensible speakers should want to make. So imagine that A says that<br />

if p, then q, and says this because they have just discovered something no one else<br />

knows, namely ¬(p & ¬q). Now B hears this, not because A tells her, but because<br />

of a fortuitous echo. B takes A to be expressing the proposition C(p, q, R(B)), and<br />

proposing that it be added to the context set. But that’s a terrible proposal, we might<br />

object, because A has no reason to know that C(p, q, R(B)), since she knows nothing<br />

about B’s knowledge. Since there is nothing wrong with A’s utterance, and the theory<br />

interprets her as making an indefensible proposal to add something to the context set,<br />

the theory is wrong.<br />

This objection is potentially a powerful one, and any version of indexical relativism<br />

must say something about such an objection. What I say is that the objection<br />

misconstrues R(B). If B is considering an utterance by A, even if B does not know<br />

that A is the author of that utterance, then any proposition that A knows is among<br />

the R(B). I think this holds quite generally. If A knowledgeably asserts that p, and B considers<br />

it and says “That might not be true”, what B says is false, even if B does not<br />

know whether p is true. The reason is that B is taking A’s knowledge to be, for the<br />

time being, relevant to the context of her utterance. So in short, the knowledge on<br />

which A relies for her utterance is carried into every context in which that very utterance<br />

is assessed. That’s why it is acceptable for A to make such a sweeping proposal,<br />

namely that for every x who evaluates her utterance, C(p, q, R(x)) should be added to<br />

the common ground of x’s context. This response relies heavily on specific features<br />

of the interaction between conditionals and context, and I don’t think it generalises<br />

very far. It may be that this style of objection does defeat some prima facie plausible<br />

4 The move I’m about to make bears at least a family resemblance to some moves López de Sa (2008)<br />

makes in defending what he calls ‘indexical relativism’, though he means something different by that<br />

phrase.



versions of indexical relativism, though it does not touch indexical relativism about<br />

open indicatives.<br />

A related objection, to a quite different proposal, is made in King and Stanley<br />

(2005). They oppose the theory that the semantic content of an utterance is something<br />

like a character. They say the content has to be a proposition, and the reason<br />

for this is that “Our understanding of a sentence in a context is due to a compositional<br />

procedure that calculates the content of the whole sentence from the referential contents<br />

of its parts.” This seems like a good reason for not taking the semantic content<br />

of a sentence in general to be its character. And we might worry that it could be<br />

extended to an argument against the view that the content of a conditional is a function<br />

from contexts of assessment to propositions. But on closer inspection it seems<br />

like no such generalisation is possible. After all, someone interpreting a conditional<br />

can assign a value to the variable X, even though X takes different values in different contexts of<br />

assessment. They can just ‘replace’ PRO_J with themselves when interpreting the conditional.<br />

The important point here is that the view that the utterance does not have<br />

a context-neutral semantic content is consistent with it having a content relative to<br />

any interpreter, and hence to interpreters discovering its content (relative to them).<br />

The other objection we made to indexical relativism in the earlier paper was that<br />

it left as unexplained some phenomena about the behaviour of epistemic modals in<br />

propositional attitude reports that the non-indexical theory could explain. I still<br />

think this is an advantage of the non-indexical theory, but I don’t think it is decisive.<br />

(In general, it isn’t a deal breaker that one theory has to take as a brute fact something<br />

that a rival theory explains, although it does count in favour of the rival.) I’ll say more<br />

about this objection when we discuss propositional attitude reports in more detail<br />

in section 6. But first, I need to introduce some facts about the behaviour of plural<br />

pronouns that my indexical relativist theory will exploit.<br />

5 Partial Binding<br />

Philosophical orthodoxy has it that all pronouns fall into one of two broad categories.<br />

On the one hand, there are deictic pronouns, whose job it is to refer (presumably directly)<br />

to a contextually salient object. On the other hand, there are pronouns whose<br />

job it is to denote (one way or another) an object denoted earlier in the discourse.<br />

Examples of the latter kind include the tokens of ‘she’ in (5.1) to (5.3).<br />

(5.1) If Suzy enters the competition, she will win.<br />

(5.2) Every student will get the grade that she deserves.<br />

(5.3) If a dictator has a daughter, she is pampered by the state.<br />

I’m not going to go into the (very interesting) debates about how many different<br />

kinds of pronouns are represented by the three tokens of ‘she’ above, nor about which<br />

of these pronouns are directly referential, which are quantifiers, and which are neither<br />

of these. All I want to note is that the pronouns represented here fall into a different<br />

category from simple deictic pronouns, which refer to a contextually salient individual.<br />

Thomas McKay (2006, Ch. 9) has argued that the behaviour of plural pronouns<br />

mirrors the behaviour of singular pronouns. He shows that for every different kind



of singular pronoun we can find, or even purport to find, we can find plural pronouns<br />

behaving the same way. The following three sentences, which are about a film school<br />

where girls make films in large groups, have pronouns that behave just like the three<br />

tokens of ‘she’ above.<br />

(5.4) If some girls enter the competition, they will win.<br />

(5.5) Some students will produce a better film than we expect them to.<br />

(5.6) If a student dislikes some girls, their work suffers.<br />

What is quite noteworthy about plural pronouns, however, is that they need not<br />

fall into one of the two major categories I mentioned at the start of the section. 5<br />

It is possible to have a plural pronoun whose denotation is determined partially by<br />

context, and partially by the denotation of earlier parts of the discourse. Consider,<br />

for example, (5.7), as uttered by Jason.<br />

(5.7) If Ted comes over, we’ll go and get some beers.<br />

It seems the ‘we’ there denotes Ted and Jason. It denotes Jason because it’s a first-person<br />

plural pronoun, and Jason is the speaker, and the speaker is always among the<br />

denotata of a first-person plural pronoun. Arguably, the ‘we’ is anaphoric on ‘Ted’,<br />

but this does not mean it denotes only Ted. Rather, it means that Ted is among the<br />

denotata of ‘we’, the others being determined by context. One might object that<br />

really ‘we’ in (5.7) is deictic, and Ted is among its denotata because he has been made<br />

salient. I think that’s probably a mistake, but I don’t want to press the point. Rather,<br />

I’ll note some other cases where such an explanation is unacceptable. The following<br />

example, due to Jeff King (p.c.), shows that ‘we’ can behave like a donkey pronoun.<br />

(5.8) If any friend comes over, we’ll go and get some beers.<br />

Intuitively that’s true, as uttered by Jason, just in case for some salient class of friends,<br />

if any member of that class comes over, Jason and that friend will go and get beers.<br />

Now it is controversial just how to account for donkey pronouns in general, and<br />

I’m not going to take a side on that. But however they work, donkey pronouns<br />

seem to fall on the second side of the divide I mentioned at the top of the section.<br />

And first person singular pronouns are paradigmatic instances of pronouns that get<br />

their reference from context. What’s notable is that first-person plural pronouns can<br />

display both kinds of features.<br />

This is not a particularly new point. Example (5.9) was introduced by Barbara<br />

Partee in 1989, and there is a longer discussion of the phenomena by Philippe Schlenker<br />

in his (2003). The latter paper is the source for (5.10) and (5.11).<br />

5 I thought this was a way in which plural pronouns were unlike singular pronouns, but Zoltán Szabó<br />

suggested persuasively that singular pronouns could also be partially bound in the sense described below.<br />

The interesting cases concern pronouns that seem to refer to objects made from multiple parts, with<br />

circumstances of utterance determining some parts of the referent, and the other parts being the denotata<br />

of the terms to which the pronoun is partially bound. I’m not going to take a stand on whether such<br />

pronouns exist, but if Szabó’s suggestion is correct, then the need to take X to be a plural variable is<br />

lessened.



(5.9) John often comes over for Sunday brunch. Whenever someone else comes over<br />

too, we (all) end up playing trios. Partee (1989)<br />

(5.10) Each of my colleagues is so difficult that at some point or other we’ve had an<br />

argument. Schlenker (2003)<br />

(5.11) [Talking about John] Each of his colleagues is so difficult that at some point or<br />

other they’ve had an argument. Schlenker (2003)<br />

Schlenker describes what is going on here as ‘partial binding’, and I’ll follow his lead<br />

here. The ‘we’ in (5.10) is bound to the earlier quantifier phrase ‘Each of my colleagues’,<br />

but, as above, this does not mean that it merely denotes (relative to a variable<br />

assignment) one of the speaker’s colleagues. Rather, it denotes a plurality that<br />

includes the colleague, and includes the speaker. And the speaker is supplied as one<br />

of the denotata by context.<br />

The reason for mentioning all this here is that my theory of how conditionals<br />

behave involves, among other things, partial binding. The general semantic structure<br />

of a conditional is C(p, q, X), where X is a plural variable that denotes some propositions.<br />

I think X can, indeed often is, partially bound. In simple cases a proposition<br />

is among the X just in case it stands in relation R to a salient individual i. In more<br />

complex cases, X is partially bound to an earlier phrase, and in virtue of that some<br />

proposition s is among the X. But the propositions that stand in relation R to i are<br />

also among the X, because X is only partially bound to the earlier proposition. If<br />

there were no other instances of partial binding in natural language, this would be a<br />

fairly ad hoc position to take. But there shouldn’t be any theoretical problem with<br />

assuming that tacit variables can behave the way that overt pronouns behave.<br />
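The role of partial binding for X can be sketched as follows (the set notation is mine, not the paper’s):<br />

```latex
% Simple case: X is supplied wholly by context, via R and a salient
% individual i.
X \;=\; \{\, r : i \,R\, r \,\}

% Partially bound case: X is bound to an earlier phrase denoting the
% proposition s, but the contextually supplied propositions remain
% among the X, just as the speaker remains among the denotata of a
% partially bound 'we'.
X \;=\; \{ s \} \cup \{\, r : i \,R\, r \,\}
```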

6 Against Indexical Contextualism<br />

One usual way to argue for relativist theories is to appeal to instances of faultless<br />

disagreement. It is natural to think that such arguments could work in the case of<br />

open indicatives. Since Gibbard (1981) there has been a lot of discussion over cases<br />

where A knows ¬(p & ¬q), and B knows ¬(p & q). It seems A can truly, even knowledgeably,<br />

say If p, q, and B can truly, even knowledgeably, say If p, ¬q. And, in the<br />

right context, it might seem that this is a case where A and B disagree. One might try<br />

and argue that the only way to explain this faultless disagreement between A and B is<br />

through some variety of relativistic semantics. I think that will be a hard argument<br />

to make out for four reasons.<br />

First, some people hold that the notion of a faultless disagreement is incoherent.<br />

I suspect that’s wrong, and the concept is coherent, but making this argument stick<br />

would require showing that faultless disagreement is indeed coherent. I want the<br />

argument for indexical relativism about open indicatives to not rely on the coherence<br />

of faultless disagreement.<br />

Second, two people can disagree without there being any proposition that one<br />

says is true and the other is false. (This should be familiar from debates about noncognitivism<br />

in ethics.) If A says “I like ice cream” and B says “I don’t like ice cream”,



then there is a natural sense in which they are disagreeing, for instance. But arguments<br />

from disagreement for relativism generally require that when two people disagree,<br />

there is a proposition that one accepts and the other rejects, and that may not<br />

be true.<br />

Third, there is some special reason to think that this is what happens in the conditionals<br />

case. In this case A and B are having a conditional disagreement. Perhaps we<br />

intuit that A and B are disagreeing merely because of this conditional disagreement.<br />

For comparison, if A and B had made a conditional bet, we would describe them as<br />

having made a bet in ordinary discourse, even if the bet is not realised because the<br />

condition is not satisfied.<br />

Finally, as Grice (1989) showed, there can be cases where we naturally describe<br />

A and B as disagreeing in virtue of two utterances, even though (a) those utterances<br />

are simple assertions, and (b) the assertions are consistent. Grice’s case is where A<br />

says that p or q is true, and B says that p or r is true, with stress on r, where q and r<br />

are obviously incompatible. Perhaps the natural thing to say here too is that A and<br />

B have a conditional disagreement; conditional on ¬p, A thinks that q and B thinks<br />

that r. So this argument seems to need a lot of work.<br />

An apparently stronger argument comes from indirect speech reports. It seems<br />

that in any case where a speaker, say Clarke, says “If the doctor didn’t do it, the lawyer<br />

did”, then in any other context, we can report that by saying “Clarke said that if the<br />

doctor didn’t do it, the lawyer did.” This might look to pose a problem for indexical<br />

contextualism.<br />

Assume that the content of If p, q is C(p, q, R(i)), where i is some person made<br />

salient by the context of utterance. Let p be The doctor didn’t do it, and q be The<br />

lawyer did it, c be Clarke, and h be the person who reports what Clarke said. Then<br />

it seems that what Clarke said is C(p, q, R(c)). But it seems that the content of what<br />

comes after the that in the report is C(p, q, R(h)). But since R(c) might not be the<br />

same as R(h), this should look like a bad report.<br />

I think this is something of a problem for the indexical contextualist, but it isn’t<br />

beyond repair. It could be that the variable i in the speech report is bound to the<br />

name at the start of the report, so the value of i in Clarke said C(p, q, R(i)) is simply<br />

Clarke herself. This is a slightly odd kind of binding, but it isn’t impossible, so this<br />

doesn’t quite rule out a contextualist theory.<br />

As MacFarlane (2009) argues, the felicity of homophonic reports does not raise a<br />

problem for either kind of non-indexical theory. I’ll argue that it also doesn’t pose a<br />

problem for indexical relativism.<br />

The indexical relativist thinks that, on its most natural interpretation, the content<br />

of Clarke’s utterance is C(p, q, R(PRO_J)). Similarly, it might be thought that the<br />

natural interpretation of what comes after the that in the report is C(p, q, R(PRO_J)).<br />

So it isn’t surprising that the report is acceptable.<br />

I can imagine an objector making the following speech. “Assume that Clarke’s<br />

utterance was sincere. Then it seems natural to say that Clarke believes that if the<br />

doctor didn’t do it, the lawyer did. But it is odd to think that Clarke believes C(p,<br />

q, R( PRO J )). What she believes is C(p, q, R(c)). The only way to get belief reports



to work on the indexical relativist theory is to insist that the i is bound to the<br />

subject of the report. But once you say that, it is ad hoc to deny that the i is also<br />

bound to the subject in a speech report. And not just ad hoc, it implies that (relative<br />

to our context) Clarke doesn’t believe what she says, for she says C(p, q, R(PRO_J))<br />

but believes C(p, q, R(c)). On the other hand if the i is bound to the speaker in a<br />

speech report, so what Clarke is said to have said is C(p, q, R(c)), then (a) you have no<br />

advantage over the indexical contextualist, and (b) you can’t explain why the report<br />

is felicitous, since you say she says something else, namely C(p, q, R(PRO_J)).”<br />

I have three responses to this critic. The first is that I’m not convinced that having<br />

a different treatment of belief reports and speech reports, letting i be PRO_J in speech<br />

reports and the believer in belief reports, is too terrible. The argument that it would<br />

be ad hoc to treat speech reports and belief reports separately seems weak. It is worse<br />

if we end up, because of the structure of the theory, accusing Clarke of insincerity.<br />

One way of avoiding that result is to accept the binding proposal. But it isn’t the<br />

only way. If we make two assumptions about C and R, we can sidestep the danger.<br />

The first assumption is that C is monotonic in the sense that C(p, q, X) entails<br />

C(p, q, X + Y). The second is that xRp is true just in case someone salient to x bears R to<br />

p, and the utterer of the judged sentence is salient (in this sense) to the judge. (Note<br />

this is exactly the assumption that I made earlier to defend my proposal about what<br />

effect uttering a conditional has on the Stalnakerian context.) Now not all indexical<br />

relativists will want to make these assumptions, but I’m happy to do so. Now if<br />

Clarke believes If p, q, she believes C(p, q, R(c)). So she either believes she knows ¬(p<br />

& ¬q), or believes she knows some things that (perhaps unbeknownst to her) entail<br />

it. That means that, whoever the judge of her utterance is, she believes that R(PRO_J)<br />

either includes or entails ¬(p & ¬q). So she believes C(p, q, R(PRO_J)), as required.<br />
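The two assumptions and the reasoning they licence can be laid out as a sketch:<br />

```latex
% (M) Monotonicity:  C(p, q, X) \;\Rightarrow\; C(p, q, X + Y)
% (S) Salience:      xRp holds if someone salient to x bears R to p,
%                    and an utterer is salient to any judge of her utterance.
%
% 1. Clarke believes C(p, q, R(c)).                          [sincerity]
% 2. So she believes R(c) includes or entails \neg(p \land \neg q).
% 3. She is salient to any judge j of her utterance, so by (S) she
%    believes R(j) includes or entails \neg(p \land \neg q).
% 4. By (M) and the sufficiency assumption of footnote 3, she believes
%    C(p, q, R(j)), i.e. C(p, q, R(\mathrm{PRO}_J)), as required.
```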

So I don’t think the indexical relativist has to concede to the critic that speech<br />

reports involve binding in this way. But I might be wrong about this, so my second<br />

and third responses concede this point, and argue that it doesn’t harm the indexical<br />

relativist position.<br />

The second response is that even with this concession, the indexical relativist has<br />

a small advantage over the indexical contextualist. Drawing on Stephenson’s work,<br />

we could argue that (a) PRO_J is often the value of a tacit variable, and (b) whenever<br />

it is the default value of a variable, then that variable is bound to the subject of a<br />

propositional attitude report. If that is the case, then the indexical relativist could<br />

unify a number of different cases that would have to be treated separately by the<br />

indexical contextualist. Still, it is true that the non-indexicalist has an even larger<br />

advantage here, since they can explain why this (apparent) binding holds, but I don’t<br />

think this advantage is decisive. This is the one argument for non-indexicalism that,<br />

as I mentioned in section 4, might still have some force, though not I think enough to<br />

override the argument for indexicalism in the next section.<br />

The third response is that the indexical relativist has a simple explanation of why<br />

the reports are natural, even on the assumption that the i is bound to the speaker.<br />

First consider a similar case. Imagine Clarke had simply said “The lawyer did it”, i.e.<br />

q. It would be natural to report her as having said that the lawyer actually did it. Now



one can imagine being surprised at this. Clarke said something contingently true, but<br />

we reported her using a proposition, that the lawyer actually did it, which is necessarily<br />

true. How is this possible? Well, it is because in saying q, she immediately and<br />

obviously commits herself to Actually q, and if a speaker immediately and obviously<br />

commits themselves to a proposition in virtue of an utterance, then it is natural to<br />

report them as having said that proposition. Speakers are generally committed to the<br />

truth of their utterances from their own perspective, so Clarke is committed to C(p, q,<br />

R(c)). (Arguably that is all she is committed to, as opposed to C(p, q, R(PRO_J)).) So<br />

we can report her as having said C(p, q, R(PRO_J)). And if i is bound to the speaker,<br />

that is what we do report her as having said by saying “Clarke said that if the doctor<br />

didn’t do it, the lawyer did.” 6<br />

So both the argument from disagreement and the argument from speech report<br />

against indexical contextualism have run up against some blocks. There is another<br />

argument, however, that is effective against it. This is an argument from easy agreement.<br />

Assume again that Clarke said “If the doctor didn’t do it, the lawyer did.”<br />

Assume that an arbitrary person, call him Rebus, knows that Clarke made this utterance,<br />

and knows that either the doctor or the lawyer did it, that is, knows that ¬(p<br />

& ¬q). On that basis alone, it will be natural for Rebus to make any of the following<br />

utterances: “I agree”; “That’s right”; “That’s true”; “What she said is true”; “She<br />

spoke truly”. Any of these are hard to explain on the indexical contextualist view,<br />

according to which agreement should be harder to get than this.<br />

On the indexical contextualist view, Clarke said C(p, q, R(c)). Now on most<br />

accounts of what R is, Rebus need not know that ¬(p & ¬q) is one of the propositions<br />

in R(c). He need not know that Clarke knows this, or that Clarke could have known<br />

this, or really anything else. As long as he knows that Clarke made this utterance, it<br />

seems acceptable for him to agree with it using any of the above formulations.<br />

There are two ways that the indexical contextualist might try to explain this agreement.<br />

First, they could try to argue that it is acceptable for Rebus to agree with<br />

Clarke’s utterance despite not agreeing with the propositional content of it. Second,<br />

they could try to argue that it is the case, in any case fitting the above description,<br />

that he agrees with the propositional content of what Clarke said.<br />

The first approach does not seem particularly attractive. Not only does it seem<br />

theoretically implausible, it is hard to find independent reason to believe that this is<br />

how agreement works. Generally if a speaker utters a sentence with a contextually<br />

sensitive term in it, then another speaker will not agree with the utterance unless<br />

they agree with the proposition we get by filling in the appropriate value for the<br />

contextually sensitive term. Or, perhaps more precisely, they will not accept all five<br />

of the above forms of agreement.<br />

This is how agreement works when the utterance contains an explicit indexical<br />

like ‘I’. Note that if Clarke had said “I like Hibs”, Rebus could say “I agree” if he<br />

too likes Hibs. But he couldn’t have said, for instance, “What she said is true” unless<br />

Clarke liked Hibs. This is why the variety of forms of agreement matters. Perhaps<br />

6 This response is similar to some of the arguments for speech act pluralism in Cappelen and Lepore<br />

(2005).



more contentiously, I think this is also what happens when the original utterance<br />

involves quantifiers with tacit domain restriction, or comparative adjectives with tacit<br />

comparison classes, or modals, or any other kind of context sensitive language. So<br />

the indexical contextualist should look at a second option.<br />

The second option is to say that the proposition Rebus knows, ¬(p & ¬q), will be<br />

one of the R(c). There are two ways to do this. First, we could say that R(c) includes<br />

all propositions that are known by anyone, so as long as Rebus knows ¬(p & ¬q),<br />

it is one of the R(c). But this just about reduces back to the material implication<br />

theory of indicatives, since any conditional will be true as long as anyone knows the<br />

corresponding material implication. And that is implausible. Second, we could say<br />

that R(c) includes any proposition known by anyone who hears c’s utterance. That<br />

would again ensure that ¬(p & ¬q) is one of the R(c). But again, it is fairly implausible.<br />

For one thing, it doesn’t seem that the truth of the conditionals I’m writing in this<br />

paper depends on how wide a readership the paper has. For another, under some<br />

assumptions this again collapses into the material implication theory. Assume that<br />

there is an omniscient deity. Then for any conditional If p, q, the deity’s knowledge<br />

is among R(c), and if ¬(p & ¬q) is true, then it is one of the R(c). But then If p, q<br />

will be true, which was not what we wanted. Now we don’t know that there is an<br />

omniscient deity, but it seems reasonable to require that our semantic theories be at<br />

least consistent with the existence of such a deity.<br />

So I think the indexical contextualist has no explanation of these agreement phenomena.<br />

But the indexical relativist has no such problem. When Clarke’s utterance<br />

is being judged by Rebus, it expresses the proposition C(p, q, R(Rebus)), so ¬(p &<br />

¬q) is clearly one of the propositions in the third clause. That’s why agreement with<br />

another’s utterance of a conditional is so easy.<br />
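The relativist explanation can be written out as a one-step check, using the sufficiency assumption from footnote 3:<br />

```latex
% Rebus knows the corresponding material-implication fact, so it is
% among the propositions R relates him to:
\neg(p \land \neg q) \in R(\mathrm{Rebus})
% And it is a priori that p together with \neg(p \land \neg q) entails q,
% so the content Clarke's utterance has relative to Rebus is true:
C(p,\, q,\, R(\mathrm{Rebus}))
```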

Similarly there is no problem for a non-indexical relativist. It is a little trickier to<br />

know whether there is a problem here for the non-indexical contextualist. It is easy to<br />

see why many of the locutions Rebus could use are acceptable. Rebus does, after all,<br />

accept the proposition that Clarke expresses, namely C(p, q). The only complication<br />

concerns “She spoke truly.” There is a sense in which that’s not really true. After<br />

all, the non-indexical contextualist thinks that Clarke’s utterance was false, since they<br />

think that an utterance is true iff the proposition it expresses is true in its context.<br />

If we think, though it probably isn’t compulsory, that “She spoke truly” means what the<br />

theorist means by saying the utterance is true, then there is a problem for the non-indexical<br />

contextualist.<br />

Setting aside those complications, what is clear is that the phenomenon of agreement raises a problem for the indexical contextualist, and not for the relativist. We

can put the problem another way. If indexical contextualism is true, it should be possible<br />

for Rebus to say “Clarke’s utterance ‘If the doctor didn’t do it, the lawyer did’ was not true, but if the doctor didn’t do it, the lawyer did.” Again, this seems like

it should be possible on theoretical grounds, and it is possible for most contextually<br />

sensitive sentences. But this doesn’t seem to be a coherent speech on Rebus’s part.<br />

This is of course just another manifestation of the phenomenon that agreement with

conditionals is easy. If Rebus accepts that if the doctor didn’t do it, the lawyer did,


Conditionals and Indexical Relativism 334<br />

then he accepts that Clarke’s utterance was true. The indexical contextualist can’t<br />

explain this, so indexical contextualism is false.<br />

7 Against Non-Indexicalism<br />

The argument for an indexicalist account of indicatives is that they allow an elegant<br />

account of what is going on in apparent counterexamples to modus ponens, such as<br />

the cases due to Vann McGee (1985). What these cases turn on is that right-embedded<br />

conditionals, like If p, then if q, r, seem equivalent, in some sense, to conditionals with

conjunctive antecedents, in this case If p and q, r. Given this equivalence, and the<br />

triviality of If p and q, p and q, we get the result that (7.1) is trivial.<br />

(7.1) If p, then if q, p and q<br />

And if (7.1) is genuinely trivial, then a number of awkward consequences follow.<br />

Perhaps the worst of these consequences is that we seem to get counterexamples to<br />

modus ponens. Let p be some truth that isn’t knowable. Since (7.1) is trivial, it is<br />

true. And by hypothesis p is true. But on pretty much any epistemic theory of<br />

conditionals, If q, p and q will not be true. So we have counterexamples to modus<br />

ponens. (This is more like McGee’s ‘lungfish’ example than the more widely cited<br />

example about the 1980 election, but the structure I think is basically the same, and<br />

the solution I offer will generalise to all these cases.)<br />

What I’m going to say about these cases borrows heavily from some remarks by<br />

Anthony Gillies (2009). Gillies makes two observations that point towards a solution<br />

to the puzzle McGee’s cases raise.<br />

First, we cannot in general assert both of the premises, namely (7.1) and p, in<br />

contexts where the conclusion, namely If q, p and q is not assertable. This might<br />

need to be qualified in cases where people don’t know what they can assert, but it<br />

is largely right. As Gillies demonstrates by close attention to the cases, some kind<br />

of context shift between the premises and conclusion is needed in order to assert the<br />

conclusion after the premises have been asserted.<br />

Second, there are many reasons to believe that part of why (7.1) seems trivial

is that we evaluate its consequent relative to a context in which p is taken to be part of<br />

the evidence. Gillies formalises this by having the antecedent play two separate roles,<br />

first as a constituent of the conditional uttered, and second as a context-modifier<br />

relative to which the consequent is interpreted. The formal theory I’m building here<br />

is quite different to Gillies’ because of very different starting assumptions, but I will<br />

adapt Gillies’ idea to the framework I’m using.

Despite Gillies’ first observation, there are still three reasons to take seriously<br />

the challenge McGee’s cases raise. Two of these involve using modus ponens under<br />

the scope of a supposition, and the third involves agents who don’t know what they<br />

know. The first problem concerns the following implication.



(1) If p, if q, p and q Premise<br />

(2) If ¬p, if q, ¬p and q Premise<br />

(3) p or ¬p Logical truth<br />

(4) p Assumption for argument by cases<br />

(5) If q, p and q Modus Ponens, 1, 4<br />

(6) (If q, p and q) or (If q, ¬p and q) Or introduction, 5<br />

(7) ¬p Assumption for argument by cases<br />

(8) If q, ¬p and q Modus Ponens, 2, 7<br />

(9) (If q, p and q) or (If q, ¬p and q) Or introduction, 8<br />

(10) (If q, p and q) or (If q, ¬p and q) Argument by cases, 3, 4-6, 7-9<br />

But on the simple epistemic theory we’ve been using here, (10) will not be true in<br />

cases where the truth value of p is unknown, even though it seems to follow from two<br />

trivialities and a logical truth. (I’m assuming here either that classical logic is correct,<br />

or that p is decidable.) Now it might be noted here that on some theories, particularly<br />

those that follow Stalnaker (1981) in accepting conditional excluded middle,<br />

(10) will be true. But even on those Stalnakerian theories, there will be cases where<br />

neither disjunct of (10) will be determinately true. And we can rerun a version of this<br />

argument, taking as premises that (1), (2) and (3) are determinately true, to derive as<br />

a conclusion that one or other disjunct is determinately true.<br />

The second reason is similar to the first. We can use modus ponens in the scope

of a reductio proof. Or, more colloquially, we can use modus tollens. But the following<br />

argument does not look to be particularly compelling.<br />

(1) If p, if q, p and q Premise<br />

(2) It is not the case that if q, p and q Premise<br />

(3) Not p Modus Tollens, 1, 2<br />

It may be objected that modus tollens is more controversial than modus ponens.<br />

But since we can derive it using just modus ponens and reductio ad absurdum, this<br />

objection looks weak. So this would be a bad result.<br />

It might be thought best to say here that modus ponens doesn’t preserve truth,<br />

but it does preserve knowledge. If a subject knows each premise, they can know

the conclusion. But that doesn’t seem right either, though the cases are slightly obscure.<br />

Assume a perfectly rational S knows that p, but does not know that she knows<br />

that p, and in fact, for all she knows that she knows, q and ¬p is true. Again assuming (7.1)

is trivial, she knows it, and she knows that p, but on an epistemic interpretation of<br />

the conditional, she won’t know If q, p and q, since she doesn’t know she knows that<br />

p.<br />

So there is a serious problem here. Once we accept that (7.1) is trivial, a lot<br />

of unfortunate consequences follow for the epistemic theory of conditionals. Any<br />

explanation of why (7.1) seems trivial will, I think, have to start with Gillies’ insight<br />

that when we interpret (7.1), we evaluate its consequent relative to a context where p<br />

is taken as given. How might we do this? Three options spring to mind.



The first option, which is Gillies’ theory, is that it is part of the meaning of the conditional

that its consequent be interpreted relative to a context where its antecedent is<br />

part of the background information. That has the nice result that (7.1) is indeed trivial.<br />

It seems, however, to lead to all the problems mentioned above. Gillies’ response<br />

to these is to develop a new theory of validity, which has the effect that while modus<br />

ponens is itself valid, it can’t be used inside the scope of suppositions, as I frequently<br />

did above. This is a very interesting theory, and it may well work out, but I’m going<br />

to try to develop a more conservative approach.<br />

The second option is to say that just uttering a conditional, If A, B, adds A to the<br />

background information. This seems like a bad option. For one thing, there is no<br />

independent reason to believe that this is true. For another, it can’t explain what is<br />

wrong with the following kind of argument.<br />

(1) Burns knows that if p, then if q, p and q<br />

(2) Burns knows that it is not the case that if q, p and q<br />

(3) Burns is logically perfect, and knows the logical consequences of everything he<br />

knows<br />

(4) So, Burns knows not p<br />

In a case where Burns doesn’t know whether p is true, and Burns is indeed logically<br />

perfect, then intuitively (1), (2) and (3) are true, but (4) is false. And since no conditionals

were asserted, it is hard to see how the context was shifted.<br />

The third, and best, option is to say that the variable in the semantics of an embedded<br />

conditional is partially bound to the antecedent. Normally when we say If<br />

q, p and q, the content of that is C(q, p and q, X), and normally X is R(PRO_J). The

view under consideration says that when that conditional is itself the consequent of<br />

a conditional, the variable X is partially bound to the antecedent of the conditional.<br />

So the value of X is p plus whatever is supplied by context.<br />

The contextualist says that that value is R(i), where i is usually the speaker. So the<br />

semantic content of (7.1) is C(p, C(q, p and q, p + R(i)), R(i)). And that will be trivial<br />

since the middle term is trivial. The relativist says that the contribution of context<br />

to X is R( PRO J ). So the semantic content of (7.1) is C(p, C(q, p and q, p + R( PRO J )),<br />

R( PRO J )). And again, that is trivial.<br />

This gives us a natural explanation of what is going on in the McGee cases. There<br />

is simply an equivocation between premise and conclusion in all of the cases. What<br />

follows from C(p, C(q, p and q, p + R(x)), R(x)) and p is C(q, p and q, p + R(x)).<br />

But that’s not what we normally express by If q, p and q. At least, it isn’t what we<br />

express once we’ve made it clear that p is not part of the background information.<br />

(Here is where Gillies’ observation that the McGee cases seem to require a context<br />

shift between stating the premises and stating the conclusion becomes relevant.) So<br />

although modus ponens is valid, the McGee cases are simply not instances of modus<br />

ponens, since there is an equivocation in the value of a tacit variable.<br />
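The equivocation can be made concrete in a toy model of an epistemic reading of the conditional, on which If A, B is true relative to an evidence set iff B holds at every A-world in that set. The worlds, the evidence sets, and the function names below are my own illustrative stipulations, not the semantics defended in the text: when the embedded conditional’s evidence is partially bound to the antecedent, premises (1) and (2) of the argument by cases come out true, but conclusion (10), with the variable unbound, comes out false when p is unknown.

```python
from itertools import product

# Toy strict-epistemic conditional: If(A, B, E) is true iff B holds at every
# A-world in the evidence set E. Worlds are (p, q) truth-value pairs.
WORLDS = list(product([True, False], repeat=2))

def If(antecedent, consequent, E):
    return all(consequent(w) for w in E if antecedent(w))

p        = lambda w: w[0]
q        = lambda w: w[1]
not_p    = lambda w: not w[0]
p_and_q  = lambda w: w[0] and w[1]
np_and_q = lambda w: (not w[0]) and w[1]

E = WORLDS  # the truth value of p (and of q) is unknown

# Premises (1) and (2): the embedded conditional's evidence is partially
# bound to the antecedent, i.e. it is E restricted to antecedent-worlds.
premise1 = If(p,     lambda w: If(q, p_and_q,  [v for v in E if p(v)]),     E)
premise2 = If(not_p, lambda w: If(q, np_and_q, [v for v in E if not_p(v)]), E)

# Conclusion (10): both disjuncts are evaluated against the unbound evidence.
conclusion10 = If(q, p_and_q, E) or If(q, np_and_q, E)

print(premise1, premise2, conclusion10)  # True True False
```

Running the sketch shows the premises true and the conclusion false, which is just the equivocation diagnosis: the tacit evidence variable takes different values in the premises and the conclusion.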

It might be argued that this is too much of a concession to McGee. Some people<br />

have the judgment that (7.1) is not always trivial, in particular that conditionals If<br />

A, then if B, A are not always trivially true. Personally I don’t get these readings,



but I note that the theory allows for their possibility. After all, binding need not<br />

be compulsory. We can interpret the ‘she’ in If Suzy enters the race, she will win<br />

deictically, if that’s what makes the best sense in the context. Perhaps in cases where<br />

people are hearing the false readings of If A, then if B, A, all that is going on is that the<br />

tacit indexical in the embedded conditional is unbound. Similarly, if one’s reaction<br />

to seeing the McGee arguments is to interpret the embedded conditionals as false, I<br />

suspect what is going on is that one is hearing the variables here as unbound. As I<br />

said, I don’t get these readings, but I can explain where these readings come from.<br />

The story I’m telling about the McGee cases is hardly new. Indeed, the view that<br />

the McGee cases are not strictly speaking instances of Modus Ponens is old enough<br />

to have been disparaged by William Lycan in his attacks on Modus Ponens.<br />

But this very strict sense of ‘instance’ is neither specific nor intended<br />

in logic textbooks ... What students and professional philosophers have<br />

always been told is that barring equivocation or overt indexicals, arguments<br />

of the sentential form If A, B; A; therefore, B are valid arguments,<br />

period ... One can continue to insist that Modus Ponens is valid for the<br />

strict sense of ‘instance’, but at the price of keeping us from telling easily<br />

and uncontroversially when a set of ordinary English sentences is an ‘instance’<br />

of an argument form. (Lycan, 1993, 424, notation slightly altered)<br />

But why should we give any privilege to overt indexicals? Tacit variables can be<br />

just as important in determining which form an argument takes. For example, the<br />

following argument is, on the most natural interpretation of each sentence, invalid.<br />

(1) No foreigner speaks a foreign language.<br />

(2) Ségolène is a foreigner.

(3) French is a foreign language.<br />

(4) Ségolène does not speak French.

That is invalid on its most natural reading because the tacit variable attached to ‘foreign’<br />

in premises 1 and 3 takes a different value. No one would reasonably say that, on this account, we should rewrite the logic books so that the argument form No F Rs a G; Fa; Gb; so ¬Rab is not valid. Lycan is right about the downside of this point. There is

no way to tell easily and uncontroversially what the form of an argument in natural<br />

language is. But we should never have believed such careful matters of interpretation<br />

would be easy. (They say life wasn’t meant to be.)<br />

Having said that, on the indexical relativist proposal offered here, it isn’t that hard<br />

to tell what the value of X in a typical indicative is. It is usually R(PRO_J), and there

might be a very short list of circumstances where it takes any other value. Any indexical<br />

account faces a potential cost that it makes interpretation more difficult than<br />

it might otherwise be, since the hearer has to determine the value for the indexical.<br />

The fact that X is usually R(PRO_J) minimises that cost. What is new to my proposal

is that X might be partially bound in the McGee cases. But that only helps the interpretative<br />

task, since it reduces the task to a familiar problem interpreters face when<br />

the speaker uses a partially bound plural pronoun.



But the primary point of this proposal is not to offer a new solution to the McGee<br />

cases. Rather it is to note one of the requirements of this kind of (relatively familiar)<br />

solution. An equivocation solution requires that there be something in the semantic<br />

content of the conditional that takes different values in the consequent of premise 1<br />

and in the conclusion. And non-indexical theories, by definition, can’t say that there<br />

is any such thing. For the whole point of such theories is to deny that the content of<br />

a conditional is always different in contexts with different information sets. So they<br />

cannot say the McGee arguments (or the other arguments I surveyed above that use<br />

Modus Ponens in embedded contexts) involve equivocation. But then it is hard to<br />

say what is wrong with those arguments. So these theories seem, implausibly, to be<br />

committed to denials of Modus Ponens. That’s a sufficient reason, I think, to be an<br />

indexicalist.<br />

Let’s take stock. In section 6 I argued that the indexical contextualist has no<br />

explanation of why it is so easy to agree with another’s utterance of a conditional. In<br />

this section I argued that only the indexicalist can offer a satisfactory explanation of<br />

what is going on in the McGee argument. The upshot of these two arguments is that<br />

we should be indexical relativists. For only the indexical relativist can (a) explain the<br />

agreement data and (b) explain what goes wrong in the McGee arguments.<br />

As a small coda, let me mention one other benefit of the partial binding account.<br />

When I presented an earlier version of this paper at the LOGOS workshop on Relativising<br />

Utterance Truth, the following objection was pressed to the argument in<br />

section 1 for an epistemic treatment of indicatives. It is true that when we know that<br />

f(a) = f(b), then we are prepared to assert If f(a) = x, then f(b) = x. But it is also true that when we merely suppose that f(a) = f(b), then we are prepared to infer, inside the scope of the supposition, that If f(a) = x, then f(b) = x. The epistemic account

cannot satisfactorily explain this. At the time I didn’t know how to adequately explain<br />

these intuitions, but now it seems the partial binding story can do the work.<br />

It seems that inside the scope of a supposition that p, the value of X is p + Y, where<br />

Y is the value X would otherwise have had. That is, the variable in the conditional<br />

is partially bound to the supposition that governs the discourse. That explains why<br />

all the inferences mentioned in section one are acceptable, even when the premise is<br />

merely a supposition.<br />

Coda: Methodological Ruminations<br />

The version of relativism defended here is conservative in a number of respects.<br />

Three stand out.<br />

First, it is conservative about what propositions are. The propositions that are<br />

the content of open indicatives (relative to contexts of assessment) are true or false<br />

relative to worlds, not to judges, or epistemic states, or anything of the sort.<br />

Second, it is (somewhat) conservative about how the sentences get to have those<br />

propositions as content. The standing meaning of the sentence contains a variable<br />

place that gets filled by context. To be sure, it is a plural variable that can be partially<br />

bound, but there is independent evidence that plural variables can be partially bound.<br />

And of course, and this is a radical step, its value can be different for different assessors



of the one utterance. But from the indexical relativist perspective, the contextualist<br />

theory that values for variables are set by contexts of utterance is an overly hasty<br />

generalisation from the behaviour of a few simple indexicals. (It isn’t even clear that the contextualist theory can account for simple pronouns, like ‘you’ or ‘now’

as they appear in sentences like the one you are now reading, so this generalisation<br />

might have been very poorly motivated in the first place.)<br />

Third, it is conservative about the motivation for relativism. I haven’t relied on<br />

intuitions about faultless disagreement, which is an inherently controversial topic.<br />

Rather, I’ve argued that we can motivate relativism well enough by just looking at<br />

the grounds on which people agree with earlier utterances. I think there is a general<br />

methodological point here; most of the time when theorists try to motivate<br />

relativism using cases of disagreement, they could derive most of their conclusions<br />

from careful studies of cases of agreement. This method won’t always work; I don’t<br />

think you can replicate the disagreement-based arguments for moral relativism with<br />

arguments from agreement for example. But I think that is a weakness with moral<br />

relativism, rather than a weakness with the methodology of focussing on agreement rather than disagreement when arguing for relativism.

Now one shouldn’t fetishise epistemic conservativeness. But a relativism that<br />

requires less of a revision of our worldview should be more plausible to a wider range<br />

of people than a more radical relativist view. And that’s what I’ve provided with the<br />

indexical relativist theory defended here.


Indicatives and Subjunctives<br />

This paper presents a new theory of the truth conditions for indicative conditionals.<br />

The theory allows us to give a fairly unified account of the semantics for indicative<br />

and subjunctive conditionals, though there remains a distinction between the two<br />

classes. Put simply, the idea behind the theory is that the distinction between the<br />

indicative and the subjunctive parallels the distinction between the necessary and the<br />

a priori. Since that distinction is best understood formally using the resources of<br />

two-dimensional modal logic, those resources will be brought to bear on the logic of<br />

conditionals.<br />

1 A Grand Unified Theory?<br />

Our primary focus is the indicative conditional ‘If A, B’, written as A → B. Most<br />

theorists fail to distinguish between this conditional and ‘If A, then B’, and for the<br />

most part I will follow this tradition. The most notable philosophical exception is<br />

Grice, who suggested that only the latter says that B follows from A in some relevant<br />

way (1989: 63). Theorists do distinguish between this conditional and the subjunctive<br />

‘If it were the case that A, it would be the case that B’, written as A □→ B. There

is some debate about precisely where to draw the line between these two classes,<br />

which I’ll discuss in section three, but for now I’ll focus on cases far from the borderline.<br />

One important tradition in work on conditionals holds that the semantics of<br />

indicatives differs radically from the semantics of subjunctives. According to David<br />

Lewis (1973b, 1976b) and Frank Jackson (1987) for example, indicatives are truth-functional, but subjunctives are not. This makes a mystery of some of the data. For

example, as Jackson himself writes:<br />

Before the last presidential election commentators said ‘If Reagan loses,<br />

the opinion polls will be totally discredited’, afterwards they said ‘If Reagan<br />

had lost, the opinion polls would have been totally discredited’, and<br />

this switch from indicative to subjunctive counterfactual did not count<br />

as a change of mind (Jackson, 1987, 66).<br />

The point can be pushed further. To communicate the commentators’ pre-election<br />

opinions using indirect speech we would say something like (1).<br />

(1) Commentators have said that if Reagan were to lose the opinion polls would<br />

be totally discredited.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical<br />

Quarterly 51 (2001): 200-216. Thanks to Stephen Barker, John Bigelow, Richard Holton, Lloyd<br />

Humberstone, Frank Jackson, Europa Malynicz, Daniel Nolan, Graham Oppy, Ted Sider and an anonymous<br />

referee for Philosophical Quarterly for helpful comments and suggestions.



Yet it is possible on Jackson’s view that what the commentators said was true, since<br />

Reagan won, yet the words after ‘that’ in (1) form a false sentence. So we can accurately<br />

report someone speaking truly by using a false sentence. Jackson’s response<br />

plays on the connections between A → B and the disjunction ‘Not-A or B’. That disjunction has undeniably different truth conditions to A □→ B. Pushing the truth conditions of A → B closer to those of A □→ B will move them away from ‘Not-A or B’.

One gain in similarity and theoretical simplicity is bought at the cost of another.<br />

Jackson’s account, by making A → B have similar truth conditions to ‘Not-A or B’ but similar assertibility conditions to A □→ B, tries to have the best of both worlds.

How great the similarity between indicative conditionals and disjunctions really is,<br />

and hence how great the cost of linking indicatives and subjunctives, might well be<br />

questioned. After all, we don’t report an utterance of an indicative using a disjunction.<br />

Two types of cases seem to threaten the success of a unified theory. First, rigidifying<br />

expressions like ‘actually’ behave differently in indicatives and subjunctives.<br />

Secondly, some conditionals differ in intuitive truth value when we transpose them<br />

from the indicative to the subjunctive. The most famous examples of this phenomenon<br />

involve various presidential assassinations. The effects of rigidity on conditionals<br />

are less explored, so we will first look at that. Consider the following example, from<br />

page 55 of Naming and Necessity.<br />

(2) If heat had been applied to this stick S at t0, then at t0 stick S would not have been one meter long.

The background is that we have stipulated that a meter is the length of stick S at time t0. (2) contrasts with (3), which seems false.

(3) If heat was applied to this stick S at t0, then at t0 stick S was not one meter long.

If we have stipulated that to be a meter long is to be the length of S at t0, then whatever conditions S was under at t0, it was one meter long. As Jackson points out, we

can get the same effect with explicit rigidifiers like ‘actually’. We could, somewhat<br />

wistfully, say (4). It may even be true. But (5) seems barely coherent, and certainly<br />

not something we could ever say.<br />

(4) If Hillary Clinton were to become the next U.S. President, things would be<br />

different from the way they actually will be.<br />

(5) If Hillary Clinton becomes the next U.S. President, things will be different<br />

from the way they actually will be.<br />

It looks like any theory of conditionals will have to account for a difference between<br />

the behaviour of rigid designators in indicatives and subjunctives. We may avoid the<br />

conclusion by showing that the difference only appears in certain types of conditionals,<br />

and we already have an explanation for those cases. For example, it is well known<br />

that usually one cannot say A → B if it is known that not-A. As Dudman (1994)<br />

points out, (6) is clearly infelicitous on its most obvious reading.



(6) *Granny won, but if she lost she was furious.<br />

To complete the diagnosis, note that the most striking examples of the different behaviour of rigid designators in different types of conditionals come up in cases where the antecedent is almost certainly false. The effect is that the subjunctive can be asserted, but not the indicative. So this phenomenon may be explainable by some other part of the theory of conditionals.¹ These are the most striking exemplars of

the difference I am highlighting, but not the only examples. Hence, this point cannot<br />

explain all the data, though it may explain why pairs like (2)/(3) and (4)/(5) are<br />

striking. For instance, in the following pairs, the indicative seems appropriate and<br />

intuitively true, and the subjunctive seems inappropriate and intuitively false.<br />

(7) If C-fibres firing is what causes pain sensations, then C-fibres firing is what<br />

actually causes pain sensations.<br />

(8) If C-fibres firing were what caused pain sensations, then C-fibres firing would<br />

be what actually causes pain sensations.<br />

(9) If the stuff that plays the gold role has atomic number 42, then gold has atomic<br />

number 42.<br />

(10) If the stuff that played the gold role had atomic number 42, gold would have<br />

atomic number 42.<br />

In (9) and (10) I assume that to play the gold role one must play it throughout a large<br />

part of the world, and not just on a small stage. Something may play the gold role in<br />

a small part of the world without being gold. Since there are pairs of conditionals like<br />

these where the indicative is appropriate, but the subjunctive is not, the explanation<br />

of the behaviour of rigid terms cannot rely on the fact that the antecedents of indicatives must not be known to be false. We will also need a more traditional example of

the differences between indicatives and subjunctives, as in (11) and (12).<br />

(11) If Hinckley didn’t shoot Reagan, someone else did.<br />

(12) If Hinckley hadn’t shot Reagan, someone else would have.<br />

I have concentrated on the examples involving rigidity because they seem to pose a<br />

deeper problem for unifying the theory of conditionals than the presidential examples.<br />

As Jackson (1987, 75) points out, one can presumably explain (11) and (12) on<br />

a possible worlds account by varying the similarity metric between indicatives and<br />

subjunctives, or on a probabilistic account by varying the background evidence. It is<br />

unclear, however, how this will help with the rigidity examples. Assume, for example,<br />

that C-fibres firing is not what causes pain sensations. Still, (7) seems true, but<br />

its consequent is false in all possible worlds. Therefore, the nearest world in which<br />

its antecedent is true is a world in which its consequent is false, and on a simple possible<br />

worlds theory it should turn out false. On a simple probabilistic account, the<br />

probability that C-fibres firing actually cause pain sensations given that they do is<br />

1, whatever the background evidence, so (8) should turn out true, contrary to our<br />

1 An anonymous reviewer for Philosophical Quarterly suggested this point.



intuitions. So while the details deal with the presidential examples, the structure of<br />

the theory must deal with the rigidity examples.<br />

I will follow that strategy here. In section two I set out the framework of a unified<br />

possible worlds account of indicatives and subjunctives. In section three I present my<br />

preferred way of filling out the details of that framework. The framework deals with<br />

the differing behaviour of rigid designators in indicatives and subjunctives; the details<br />

deal with examples like (11) and (12). One reason for dividing the presentation in this<br />

way is to highlight the option of accepting the framework and filling in the details in<br />

different ways.<br />

2 The New Theory<br />

2.1 Actually<br />

As Kripke (1980) showed, the reference for some terms is fixed by what plays a particular<br />

role in the actual world. Even if it were the case that XYZ fills the ocean, falls<br />

from the sky, is drinkable and transparent and so on (for short, is watery), it would

still be the case that water is H2O, not XYZ. For it would still be that H2O actually<br />

is watery. Whatever were the case, this world would be actual.<br />

Yet, we want to have a way to talk about what would have happened had some<br />

other world been actual. In particular, had the actual world been one in which XYZ is<br />

watery, it would be true, indeed necessarily true, that water is XYZ. Throughout the<br />

1970s a number of methods for doing this were produced. The following presentation<br />

is indebted to Davies and Humberstone (1980), but other approaches might have been used. The notation ${}^{x}_{y}A$ is interpreted as ‘A is true in world y from the perspective of world x as actual’. So, letting @ be the actual world and w be a world in which only XYZ is watery, we can represent what was said informally above as follows.

${}^{@}_{@}$ H2O is watery and H2O is water.
${}^{@}_{w}$ XYZ is watery and H2O is water.
${}^{w}_{@}$ H2O is watery and XYZ is water.
${}^{w}_{w}$ XYZ is watery and XYZ is water.

Now as Kripke noted, it is necessary but a posteriori that water is H 2 O. Conversely,<br />

it is a priori but contingent that water is watery. This is a priori because we knew<br />

before we determined what water really is that it would be whatever plays the watery<br />

role in this world, the actual world. In general A is necessary iff, given this is the<br />

actual world, it is true in all worlds. And A is a priori iff, whatever the actual world<br />

turns out to be like, it makes A true. So we get the following definitions.<br />

A is a priori iff for all worlds w, ⟨w, w⟩ ⊨ A.<br />

A is necessary iff for all worlds w, ⟨@, w⟩ ⊨ A.


Indicatives and Subjunctives 344<br />

The connection between actuality and the a priori is important. It is a priori that we<br />

are in the actual world. Something is a priori iff it is true whenever the two indices<br />

are the same. If we regard possible worlds as sets of sentences, we can think of the sets<br />

{A: ⟨x, x⟩ ⊨ A} for each possible world x as the epistemically possible worlds. Note that<br />

I don’t make the set of epistemically possible worlds relative to an evidence set, as<br />

others commonly do. Rather they are just the sets of sentences consistent with what<br />

we know a priori. More accurately, identify a world pair ⟨x, y⟩ with the set {A: ⟨x, y⟩ ⊨<br />

A}. Then ⟨x, y⟩ is an epistemically possible world pair iff x = y.<br />

To finish this formal excursion, we note the definition of ‘Actually A’. Given<br />

what has been said so far, this needs no explanation.<br />

⟨x, y⟩ ⊨ Actually A iff ⟨x, x⟩ ⊨ A.<br />
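The two-dimensional apparatus of this subsection can be put in a small toy model. The encoding below is mine, not the paper's: a world simply fixes which stuff plays the watery role, and ‘water’ rigidly denotes the watery stuff of the world taken as actual (the first index).<br />

```python
# A toy model of two-dimensional evaluation. A sentence is a function of
# two world indices: the world taken as actual, and the evaluation world.

worlds = {"@": "H2O", "w": "XYZ"}  # world -> the stuff that is watery there

def a_priori(sentence):
    # A is a priori iff <w, w> |= A for every world w (diagonal pairs)
    return all(sentence(w, w) for w in worlds)

def necessary(sentence):
    # A is necessary iff <@, w> |= A for every world w
    return all(sentence("@", w) for w in worlds)

# 'Water is H2O': 'water' denotes the watery stuff of the actual index.
water_is_H2O = lambda actual, ev: worlds[actual] == "H2O"
# 'Water is watery': the actual index's watery stuff is watery at ev.
water_is_watery = lambda actual, ev: worlds[actual] == worlds[ev]

print(necessary(water_is_H2O), a_priori(water_is_H2O))        # True False
print(necessary(water_is_watery), a_priori(water_is_watery))  # False True
```

As the output indicates, the toy model reproduces Kripke's pattern: ‘water is H2O’ comes out necessary a posteriori, and ‘water is watery’ comes out contingent a priori.<br />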

2.2 The Analysis of Indicatives<br />

Now we have the resources for my theory of the truth conditions for indicatives. I<br />

also give the parallel truth condition for subjunctives to show the similarities.<br />

⟨@, @⟩ ⊨ A → B iff the nearest possible world x such that ⟨x, x⟩ ⊨ A is such that ⟨x, x⟩ ⊨ B.<br />

⟨@, @⟩ ⊨ A □→ B iff the nearest possible world x such that ⟨@, x⟩ ⊨ A is such that ⟨@, x⟩ ⊨ B.<br />

These only cover the special case of what is true here from the perspective of this<br />

world as actual. We can partially generalise the analysis of indicatives in one dimension<br />

as follows.<br />

⟨w, w⟩ ⊨ A → B iff the nearest possible world x to w such that ⟨x, x⟩ ⊨ A is such that<br />

⟨x, x⟩ ⊨ B.<br />

I will make some comments below about how we might fully generalise the analysis,<br />

but for now, I want to focus on these simpler cases. Note that straight away this makes<br />

A → Actually A come out true, by the definition of ‘Actually’. If we allow ourselves<br />

quantification over propositions, we can give an analysis of ‘things are different from<br />

the way they actually are’, as follows:<br />

(⟨x, y⟩ ⊨ Things are different from the way they actually are) iff (∃p: ⟨x, y⟩ ⊨ p and<br />

not ⟨x, x⟩ ⊨ p)<br />

Since nothing both is and is not the case in x from the perspective of x as actual, this<br />

can never be true when y is x. This explains why it can never serve as the consequent<br />

of an indicative conditional.



2.3 Motivations<br />

The theory outlined here is reasonably unified, and accounts for the rigidity phenomena,<br />

but without any further justification, the resort to two-dimensional modal<br />

logic is ad hoc. This subsection responds to that problem with some independent<br />

motivations for the theory. In particular I argue that this theory best captures the<br />

well-known epistemic feel of the indicative conditional.<br />

Ever since Ramsey (1929) most theorists have held that there is an epistemic element<br />

to indicatives. Here is Ramsey’s sketch of an analysis of indicatives.<br />

If two people are arguing ‘If p will q?’ and are both in doubt as to p,<br />

they are adding p hypothetically to their stock of knowledge and arguing<br />

on that basis about q; so that in a sense ‘If p, q’ and ‘If p, ¬q’ are<br />

contradictories (Ramsey, 1929, 247n).<br />

Nothing of the sort could be true about subjunctives. What is in our ‘stock of knowledge’,<br />

or the contextually relevant knowledge, makes at most an indirect contribution<br />

to the truth-value of a subjunctive. It makes an indirect contribution because the<br />

common knowledge might affect the context, which in turn determines the similarity<br />

measure. But given a context, a subjunctive makes a broadly metaphysical claim,<br />

an indicative a broadly epistemic claim. Hence, the relationship between the indicative<br />

and subjunctive should parallel the relationship between the necessary and the a<br />

priori. As should be clear, this is exactly what happens on this theory.<br />

The close similarity between the indicative/subjunctive distinction and the a priori/necessary<br />

distinction can be demonstrated in other ways. For example, corresponding<br />

to the contingent a priori (13) the indicative (14) is true, but the subjunctive<br />

(15) is false. And corresponding to the necessary a posteriori (16) the subjunctive (17)<br />

is true but the indicative (18) is false. (I am assuming that it is part of the definitions<br />

of the water role and the fire role that nothing can play both roles.)<br />

(13) Water is what plays the water role.<br />

(14) If XYZ plays the water role, XYZ is water.<br />

(15) If XYZ played the water role, it would be water.<br />

(16) Water is H2O.<br />

(17) If all H2O played the fire role, all water would be fire.<br />

(18) If all H2O plays the fire role, all water is fire.<br />

This suggests the analysis sketched here is not ad hoc at all, but follows naturally from<br />

considerations about the necessary and a priori. These sketchy considerations might<br />

not provide much positive support for my theory. The main evidence for the theory,<br />

however, is the way it manages the hard cases, particularly cases involving rigid<br />

designation. What these considerations show is that the correct theory of indicatives<br />

may invoke the resources of two-dimensional modal logic without automatically renouncing<br />

any claim to systematicity.
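The contrast between (14) and (15) can be checked in a toy model. The two-world setup is an assumption of mine for illustration: XYZ plays the water role in w but not in @, and ‘water’ rigidly denotes the watery stuff of the first index.<br />

```python
# Toy check of the indicative (14) versus the subjunctive (15).
# An indicative shifts both indices together (worlds considered as actual);
# a subjunctive holds @ fixed as actual and shifts only the second index.

worlds = {"@": "H2O", "w": "XYZ"}  # world -> the stuff playing the water role

def xyz_plays_water_role(actual, ev):
    return worlds[ev] == "XYZ"

def xyz_is_water(actual, ev):
    return worlds[actual] == "XYZ"

def indicative(ante, cons):
    # nearest world x with <x, x> |= ante must satisfy <x, x> |= cons
    near = [x for x in worlds if ante(x, x)]
    return all(cons(x, x) for x in near)

def subjunctive(ante, cons):
    # nearest world x with <@, x> |= ante must satisfy <@, x> |= cons
    near = [x for x in worlds if ante("@", x)]
    return all(cons("@", x) for x in near)

print(indicative(xyz_plays_water_role, xyz_is_water))   # True, like (14)
print(subjunctive(xyz_plays_water_role, xyz_is_water))  # False, like (15)
```

The only difference between the two evaluation functions is which index gets shifted, yet that difference alone reproduces the divergent verdicts on (14) and (15).<br />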



3 The Details<br />

In this section, I want to look at four questions. First, what can we say about the<br />

similarity measure at the core of this account? Secondly, how should we generalise<br />

the theory to cover cases where the definite description in the analysis appears to<br />

denote nothing? Thirdly, how should we generalise the theory to cover cases where<br />

the two indices differ? Finally, how should we draw the line between indicatives<br />

and subjunctives? If what I said in the previous section is correct, there should be<br />

something to say about each of these questions, and what is said should be motivated.<br />

While it is not important that what I say here is precisely true, I do hope that it is.<br />

3.1 Nearness<br />

Ideally, we could use exactly the same similarity metric for both indicatives and subjunctives.<br />

The existence of pairs like (11) and (12) suggests this is impossible. So<br />

we must come up with a pair of measures on the worlds satisfying three constraints.<br />

First, the measure for subjunctives must deliver plausible verdicts for most subjunctive<br />

conditionals. Secondly, the measure for indicatives must deliver plausible verdicts<br />

for most indicative conditionals. Thirdly, the measures must be similar enough that<br />

we can explain the close relationship between indicatives and subjunctives set out in<br />

section one. The theory of section two requires that these objectives be jointly satisfiable.<br />

I will attempt to demonstrate that they are by outlining a pair of measures<br />

satisfying all three.<br />

Lewis (1979b) provides the measure for subjunctives. He suggests the following<br />

four rules for locating the nearest possible world in which A is true.<br />

(1) It is of the first importance to avoid big, widespread, diverse violations of law.<br />

(2) It is of the second importance to maximise the spatio-temporal region throughout<br />

which perfect match of particular fact prevails.<br />

(3) It is of the third importance to avoid even small, localized, simple violations of<br />

law.<br />

(4) It is of little or no importance to secure approximate similarity of particular<br />

fact, even in matters that concern us greatly. (Lewis, 1979b, 47-8)<br />

The right measure for indicatives is somewhat simpler. Notice that whenever we<br />

know that A ⊃ B and don’t know whether A, A → B seems true. More generally, if I<br />

know some sentence S such that A and S together entail B, and I would continue to<br />

know S even were I to come to doubt B, then A → B will seem true to me. No matter<br />

how good a card cheat I know Sly Pete to be, if I know that he has the worse hand,<br />

and that whenever someone with the worse hand calls they lose, it will seem true to<br />

me that If Sly Pete calls, he will lose. Further, if someone else knows these background<br />

facts and tells me that If Sly Pete calls, he will lose, she speaks truthfully.<br />

This data suggests that whenever there is a true S such that A and S entail B,<br />

A → B is true. But this would mean A → B is true whenever A ⊃ B is true, which<br />

seems incredible. On this theory it is true that If there is a nuclear war tomorrow,<br />

life will go on as normal. There are some very subtle attempts to make this palatable.<br />

The ‘Supplemented Equivalence Theory’ in Jackson (1987) may even be successful.



But two problems remain for all theories saying A → B has the truth value of A ⊃ B.<br />

First, they make some apparently true negated conditionals turn out false, such as It<br />

is not true that if there is a nuclear war tomorrow, life will go on as normal. It is hard<br />

to see how an appeal to Gricean pragmatics will avoid this problem. Secondly, such<br />

theories fail the third task we set ourselves at the start of the section: explaining the<br />

close connections between indicatives and subjunctives.<br />

So we might be tempted to try a different path. Let’s take the data at face value<br />

and say that A → B is true in a context if there is some S such that some person in<br />

that context knows S, and A and S together entail B. We can formalise this claim as<br />

follows. Let d(x, y) be the ‘distance’ from x to y. This function will satisfy few of<br />

the formal properties of a distance relationship, so remember this is just an analogy.<br />

Let K be the set of all propositions S known by someone in the context, W the set<br />

of all possible worlds, and i the impossible world, where everything is true. Then d:<br />

W × (W ∪ {i}) → ℝ is as follows:<br />

If y = x then d(x, y) = 0<br />

If y ∈ W, y ≠ x, and ∀S: S ∈ K ⊃ ⟨y, y⟩ ⊨ S, then d(x, y) = 1<br />

If y = i then d(x, y) = 2<br />

Otherwise, d(x, y) = 3<br />

Less formally, the nearest world to a world is itself. The next closest worlds are<br />

any compatible with everything known in the context, then the impossible world,<br />

then the possible worlds incompatible with something known in the context. It may<br />

seem odd to have the impossible world closer than some possible worlds, but there<br />

are two reasons for doing this. First, in the impossible world everything known to<br />

any conversational participant is true. Secondly, putting the impossible world at this<br />

position accounts for some examples. This is a variant on a well known case; see for<br />

example Gibbard (1981) and Barker (1997).<br />

Jack and Jill are trying to find out how their local representative Kim, a Democrat<br />

from Texas, voted on a resolution at a particular committee meeting. So far, they<br />

have not even found out whether Kim was at the meeting. Jack finds out that all<br />

Democrats at the meeting voted against the resolution; Jill finds out that all Texans<br />

at the meeting voted for it. When they return to compare notes, Jack can truly say If<br />

Kim was at the meeting, she voted against the resolution, and Jill can truly say If Kim<br />

was at the meeting, she voted for the resolution. If i is further from the actual world<br />

than some possible world where Kim attended the meeting, these statements cannot<br />

both be true.<br />
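The distance function and the Jack and Jill case can be sketched in code. The encoding is illustrative only: a possible world is a frozenset of the atomic sentences true at it, K is the set of atoms known by someone in the context, and "i" marks the impossible world.<br />

```python
# Sketch of the coarse-grained 'distance' d of this section, and of the
# resulting truth conditions for indicatives over atomic sentences.

IMPOSSIBLE = "i"  # the impossible world, where everything is true

def d(x, y, K):
    if y == x:
        return 0
    if y == IMPOSSIBLE:
        return 2
    if all(s in y for s in K):  # y is compatible with everything known
        return 1
    return 3

def indicative(ante, cons, x, worlds, K):
    """A -> B: every nearest antecedent-world is a consequent-world.
    Everything, the consequent included, counts as true at the impossible world."""
    candidates = [w for w in worlds if w == IMPOSSIBLE or ante in w]
    best = min(d(x, w, K) for w in candidates)
    return all(w == IMPOSSIBLE or cons in w
               for w in candidates if d(x, w, K) == best)

# Jack and Jill: no possible world where Kim attended is compatible with
# the pooled knowledge, so the impossible world is the nearest candidate.
K = {"dems_against", "texans_for"}
attended_against = frozenset({"attended", "dems_against", "voted_against"})
attended_for = frozenset({"attended", "texans_for", "voted_for"})
absent = frozenset({"dems_against", "texans_for"})
worlds = [attended_against, attended_for, absent, IMPOSSIBLE]

print(indicative("attended", "voted_against", absent, worlds, K))  # True
print(indicative("attended", "voted_for", absent, worlds, K))      # True
```

Because the impossible world sits at distance 2 while the attendance worlds sit at distance 3, both of the apparently conflicting conditionals come out true, just as the text requires.<br />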

It may be thought the distance function needs to be more fine-grained to account<br />

for the following phenomena.2 It seems possible that in each of the following pairs,<br />

the first sentence is true and the second false.<br />

(19) (a) If Anne goes to the party, so will Billy.<br />

2 Lewis (1973b) makes this objection to a similar proposal for subjunctives; the objection has just as<br />

much force here as it does in the original case.



(b) If Anne goes to the party, Billy will not go.<br />

(20) (a) If Anne and Carly go to the party, Billy will not go.<br />

(b) If Anne and Carly go to the party, so will Billy.<br />

(21) (a) If Anne, Carly and Donna go to the party, so will Billy.<br />

(b) If Anne, Carly and Donna go to the party, Billy will not.<br />

Assume, as seems plausible, it is necessary and sufficient for A → B to be true that<br />

the nearest A ∧ B world is closer than the nearest A ∧ ¬B world. (This does not immediately<br />

follow from the analysis in section 2, but is obviously compatible with it.)<br />

Given this, there is no context in which the first conditional in each pair is true, and<br />

the second false. McCawley (1996) points out a way to accommodate these intuitions.<br />

Every time a conditional is uttered, or considered in a private context, the context<br />

shifts so as to accommodate the possibility that its antecedent is true. So at first we<br />

don’t consider worlds where Carly or Donna turn up, and agree that (19a) is true and<br />

(19b) false because in those worlds Billy loyally follows Anne to the party. When<br />

(20a) or (20b) is uttered, or considered, we have to allow some worlds where Carly<br />

goes to the party into the context set. In some of these worlds Anne goes to the party<br />

and Billy doesn’t, namely the worlds where Carly goes to the party. A similar story explains how<br />

(21a) can be true despite (20b) being false. 3<br />

This move does seem to save the theory from potentially troubling data, but without<br />

further support it may seem rather desperate. There are two independent motivations<br />

for it. First, it explains the inappropriateness of (6).<br />

(6) *Grannie won, but if she lost she was furious.<br />

If assertion narrows the contextually relevant worlds to those where the assertion<br />

is true, as Stalnaker (1978) suggests, and uttering a conditional requires expanding<br />

the context to include worlds where the antecedent is true, it follows that utterances<br />

like (6) will be defective. The speech acts performed by uttering each clause give<br />

the hearer opposite instructions regarding how to amend the context set. Secondly,<br />

McCawley’s assumption explains why we generally have little use for indicative conditionals<br />

whose antecedents we know are false. To interpret an indicative we first<br />

have to expand the context set to include a world where the antecedent is true, but<br />

if we know the antecedent is false we usually have little reason to want to do that. If<br />

there is a dispute over the size of the context set, we may want to expand it so as to<br />

avoid miscommunication, which explains why we will sometimes assert conditionals<br />

with antecedents we know to be false when trying to convince someone else that the<br />

antecedent really is false.<br />

So we have a pair of measures that give plausible answers on a wide range of cases.<br />

Such a pair should also validate the close connection between indicatives and subjunctives<br />

we saw earlier. The data set out in section one suggests that this connection<br />

may be close to synonymy, as in (1), but in some cases, as in (11) and (12), the connection<br />

is much looser. The differing behaviour of rigid designators in indicatives<br />

3 There is an obvious similarity between this argument and some of the uses of contextual dependence<br />

in Lewis’s theory of knowledge (Lewis, 1996b). Indeed, McCawley credits Lewis (1979f) as an inspiration<br />

for his ideas.



and subjunctives reveals a further difference, but the two-dimensional nature of the<br />

analysis, not the particulars of the similarity metric, accounts for that. I propose to<br />

explain the data by looking at which facts we hold fixed when trying to determine<br />

the nearest possible world. The facts we hold fixed in evaluating indicatives and subjunctives,<br />

according to the two metrics outlined above, are the same in just the cases<br />

we feel that the indicatives and subjunctives say the same thing.<br />

When evaluating an indicative we hold fixed all the facts known by any member<br />

of the conversation. When evaluating a subjunctive we hold fixed (a) all facts about<br />

the world up to some salient time t and (b) the holding of the laws of nature at all<br />

times after t. The time t is the latest time such that some worlds fitting this description<br />

make A true and contain no large miracles. The two sets of facts held fixed match<br />

when we know all the salient facts about times before t, and know no particular facts<br />

about what happens after t.<br />

In the opinion poll case, when evaluating the original indicative our knowledge at<br />

the earlier time was held fixed. We knew that the polls predicted a Reagan landslide,<br />

that when one makes spectacularly false predictions one is discredited, and so on.<br />

When we turn to evaluating the subjunctive, we hold fixed the facts about the world<br />

before the election (presumably the relevant time t) and some laws. Therefore, we<br />

hold fixed the polls predictions, and the law that when one makes spectacularly false<br />

predictions one is discredited. So the same facts are held fixed. And in general, this<br />

will happen whenever all we know is all the specific facts up to the relevant time, and<br />

some laws that allow us to extrapolate from those facts.<br />

In the case where indicatives and subjunctives come apart, as in (11) and (12), the<br />

relevant knowledge differs from the first case. By hypothesis, we do not know who<br />

pulled the trigger, but we do know that a trigger was pulled. Our knowledge of the<br />

relevant facts does not consist in knowledge of all the details up to a salient time, and<br />

knowledge that the world will continue in a law-governed way after this. Therefore,<br />

we would predict that the indicatives and subjunctives would come apart, because<br />

what is held fixed when evaluating the two conditionals differs. We find exactly that.<br />

So the pair of measures can explain the close connection between indicatives and<br />

subjunctives when it exists, and explain why the two come apart when they do come<br />

apart.<br />

3.2 No Nearest Possible World<br />

Generally, there are three kinds of problems under this heading. First, there may be<br />

no A-worlds, and so no nearest A-world. Secondly, there may be an infinite sequence<br />

of ever-nearer A-worlds without a nearest A-world. Thirdly, there may be several<br />

worlds in a tie for nearest A-world. If the measure suggested in the previous section is<br />

correct, the first two problems do not arise here. The third problem, however, arises<br />

almost all the time, so we need to say something about it.<br />

The approach I favour is set out in Stalnaker (1981). The comparative similarity<br />

measure is a partial order on the possible worlds. Stalnaker recommends we assess<br />

conditionals using supervaluations, taking the precisifications to be the complete extensions<br />

of this partial order. In particular, if several possible worlds tie for being the



closest A-worlds,4 then A → B will be true if they are all B-worlds, false if they are all<br />

¬B-worlds, and not truth-valued otherwise. For consistent A, this makes ¬(A → B)<br />

equivalent to A → ¬B. Since we generally deny A → B just when we would be prepared<br />

to assert A → ¬B, this seems like a good outcome. 5 Further, this account makes<br />

A → B generally come out gappy when A is false. Many theorists hold that indicative<br />

conditionals, especially those with false antecedents, lack truth values. 6 This can’t be<br />

right in general, since it is a platitude that A → A is true for every A, but the position<br />

has some attraction. Happily, our theory respects the motivations behind such<br />

positions without violating the platitude.<br />

In any case, these details are not important to the overall analysis. If someone<br />

favours a resolution of ties along the lines Lewis suggested this could easily be appended<br />

onto the basic theory.<br />
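The supervaluational treatment of ties can be stated in a few lines, again assuming the illustrative representation of worlds as frozensets of their true atomic sentences.<br />

```python
# Minimal sketch of Stalnaker-style supervaluation over tied nearest
# A-worlds: True if the consequent holds at all of them, False if at
# none, and None (a truth-value gap) otherwise.

def supervaluate(nearest_A_worlds, consequent):
    verdicts = [consequent in w for w in nearest_A_worlds]
    if all(verdicts):
        return True
    if not any(verdicts):
        return False
    return None

tied = [frozenset({"A", "B"}), frozenset({"A", "C"})]
print(supervaluate(tied, "A"))  # True: it holds at every tied world
print(supervaluate(tied, "B"))  # None: a gap, since the tied worlds disagree
```

On this rule A → B and its negation can both fail to be true, which is how the account delivers the equivalence of ¬(A → B) and A → ¬B noted above.<br />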

3.3 The General Theory<br />

So far, I have just defined what it is for A → B to be true in this world from the<br />

perspective of this world as actual. To have a fully general theory I need to say when<br />

A → B is true in an arbitrary world from the perspective of another (possibly different)<br />

world as actual. And that general theory must yield the theory above as a special<br />

case when applied to our world. As with the special theory above, the general theory<br />

will mostly be derived from Twin Earth considerations.<br />

In general, ⟨x, y⟩ ⊨ A → B iff the nearest world pair ⟨z, v⟩ such that ⟨z, v⟩ ⊨ A is such that ⟨z, v⟩ ⊨<br />

B. Nearness is again defined epistemically, but what we know about x and y matters.<br />

In particular if ⟨z, v⟩ ⊨ C for all sentences C such that someone in the context knows that<br />

⟨x, y⟩ ⊨ C, but not ⟨u, w⟩ ⊨ C for some such C, then ⟨z, v⟩ is closer to ⟨x, y⟩ than is ⟨u, w⟩. As<br />

should be clear from this, nearness is context-dependent, and the context it depends<br />

on is the actual speaker’s context. For conditionals as for quantified sentences, the<br />

same words will express different propositions in different contexts.<br />
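The sufficient condition for pair-nearness just stated can be rendered as follows. The representation (a world pair mapped to the frozenset of sentences true at it) is my assumption for illustration, not part of the paper's formalism.<br />

```python
# Illustrative rendering of the pair-nearness condition: <z, v> counts as
# closer to the target pair than <u, w> when <z, v> verifies every
# sentence known (in the speaker's context) to hold at the target pair,
# while <u, w> fails to verify at least one of them.

def verifies(pair, sentences, truths):
    # truths: dict mapping a world pair to the frozenset of truths at it
    return sentences <= truths[pair]

def closer(zv, uw, known, truths):
    return verifies(zv, known, truths) and not verifies(uw, known, truths)

truths = {("z", "v"): frozenset({"C1", "C2"}),
          ("u", "w"): frozenset({"C1"})}
known = frozenset({"C1", "C2"})  # what is known to hold at <x, y>
print(closer(("z", "v"), ("u", "w"), known, truths))  # True
```

Note that this gives only a sufficient condition, as in the text; it induces at most a partial ordering of pairs, not a total one.<br />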

Let’s draw out some consequences of this definition. First, for any x we know<br />

that ⟨x, x⟩ ⊨ C for all a priori propositions C. In particular, we know that ⟨x, x⟩ ⊨ D ≡ (Actually<br />

D) for any proposition D, where ‘≡’ represents the material biconditional. So the<br />

nearest world pair 〈z, v〉 to 〈x, x〉 must be one in which z = v, even if that means z is<br />

the impossible world i. Hence the general theory of indicatives reduces to the special<br />

theory set out above when applied to epistemically possible worlds: when assessing<br />

the truth value of an indicative in an epistemically possible world pair we need only<br />

look at other epistemically possible world pairs.<br />

Secondly, when evaluating conditionals with respect to epistemically impossible<br />

world pairs 〈x, y〉, we need to use other epistemically impossible world pairs. For<br />

example, imagine some explorers are wandering around Twin Australia, a dry continent<br />

to the south of Twin Earth. As explorers of such lands are wont to do, they are<br />

4 Of course in this context x is an A-world iff ⟨x, x⟩ ⊨ A.<br />

5 Edgington (1996) furnishes some nice examples against the view that A � B should be false when<br />

there are several equally close A-worlds in a tie for closest and some are B-worlds but some are ¬B-worlds.<br />

6 See Edgington (1995) for an endorsement of this position and discussion of others who have held it.



dying of thirst, so they are seeking some watery stuff to save themselves. Without<br />

knowing whether they succeed, we know (22) is false.<br />

(22) If the explorers find some watery stuff, they will find some water.<br />

This theory can explain the falsity of (22). We know, from the way Twin Earth is<br />

stipulated, that all the watery stuff of the explorers’ acquaintance is not water. So we<br />

know any watery stuff they find will not be water. And we know that water is scarce<br />

on Twin Earth, even scarcer than watery stuff in Twin Australia, so it is unlikely they<br />

will find some watery stuff and simultaneously stumble across some water.<br />

This theory also explains occurrences of indicatives embedded in subjunctives.<br />

These are very odd, as should be expected if indicatives are about epistemic connections<br />

and subjunctives about metaphysical connections, but we can just make sense<br />

of them some of the time. For example, it seems possible to make sense of (23) and<br />

that it is true.<br />

(23) If the bullet that actually killed JFK had instead killed Jackie Kennedy, then it<br />

would be true that if Oswald didn’t kill Jackie Kennedy, someone else did.<br />

On our theory, to evaluate this we first find the nearest world pair ⟨@, w⟩ such that ⟨@, w⟩ ⊨<br />

The bullet that actually killed JFK instead killed Jackie Kennedy, and then evaluate<br />

the indicative relative to it. Now one thing we know about this world pair is that<br />

in it, someone killed Jackie Kennedy. So this must hold in all nearby world pairs.<br />

Hence in any such world pair that Oswald did not kill Jackie Kennedy, someone else<br />

did, so (23) turns out true.<br />

It might be thought that such embeddings do not make particularly good sense. I<br />

have some sympathy for such a view. If one adopts the ‘special theory’ developed in<br />

the previous section, and rejects the general theory developed in this subsection, one<br />

may have an explanation for the impossibility of such embeddings. However, even if<br />

we cannot make sense of such embeddings, we still need to account for the truth conditions<br />

of indicatives relative to epistemically impossible world pairs to make sense<br />

of claims such as Necessarily (A → A). 7<br />

3.4 Classifying Conditionals<br />

In recent years, there has been extensive debate over where the line between indicatives<br />

and subjunctives falls. This debate focuses on whether ‘future indicatives’ like<br />

(24) are properly classified with indicatives or subjunctives.<br />

(24) If Booth doesn’t shoot Lincoln, someone else will.<br />

Jackson (1990) and Bennett (1995) argue that this should go with ordinary indicatives.<br />

Dudman (1994) and Bennett (1988) argue that it should go with ordinary subjunctives,<br />

though this is not how Dudman would put it. This theory of indicatives<br />

appears to favour Jackson and (the later) Bennett, because of the apparent triviality<br />

of conditionals like (25).<br />

(25) If it will rain then it will actually rain.<br />

7 I am indebted to Lloyd Humberstone for pointing this out to me.



4 Conclusion<br />

Despite its lack of attention in the literature, data about the role of rigid designators<br />

in indicatives deserve close attention. Any plausible theory of indicatives must be<br />

able to deal with it, and it isn’t clear how existing possible worlds theories could do<br />

so. The easiest way to build a semantics for indicatives is to say that “If A then C ”<br />

is true just in case the nearest world in which A is true is a world where C is true.<br />

Even before the hard questions about the meaning of ‘nearest’ here start to be asked,<br />

we know a theory of this form is wrong because it makes mistaken predictions about<br />

the role of rigid designators. A conditional like “If the stuff in the rivers, lakes and<br />

oceans really is XYZ, then water is XYZ” is true, even though the consequent is true<br />

in no possible worlds. The simplest way to solve this difficulty is to revisit the idea<br />

of ‘true in a world’. Rather than looking for a nearby world in which A is true, and<br />

asking whether C is true in it, we look for a nearby world w such that A is true under<br />

the supposition that w is actual, and ask whether C is true under the supposition that<br />

w is actual. In the terminology of Jackson (1998), we look at worlds considered as<br />

actual, rather than worlds considered as counterfactual. This simple change makes<br />

an important difference to the way rigid designators behave. There is no world in<br />

which water is XYZ. However, under the supposition that the stuff in the rivers,<br />

lakes and oceans really is XYZ, and the H 2 O theory is just a giant mistake, that is,<br />

under the supposition that we are in the world known as Twin Earth, water is XYZ.<br />

In short, “water is XYZ” is true in Twin Earth considered as actual, even though it<br />

is false in Twin Earth considered as counterfactual. So the data about behaviour of<br />

rigid designators in indicatives, data like the truth of “If the stuff in the rivers, lakes<br />

and oceans really is XYZ, then water is XYZ”, does not refute the hypothesis that “If<br />

A then C ” is true iff the nearest world such that A is true in that world considered as<br />

actual is a world where C is true in that world considered as actual.<br />

In section two we looked at how the formal structure of a theory built around<br />

that hypothesis might look. In section three we looked at how some of the details<br />

may be filled in. The most pressing task is to provide a similarity metric so we can<br />

have some idea about which worlds will count as being nearby. The theory I defended<br />

has three important features. First, it is epistemic. Which worlds are nearby depends<br />

on what is known by conversational participants. Secondly, it is contextualist in two<br />

respects. The first respect is that it is the knowledge of the audience that matters, not<br />

just the knowledge of the speaker and the intended audience. The second respect is<br />

that it allows that what is known by the audience may be affected by the utterance<br />

of the conditional. In particular, if the utterance of “If A, B” causes the audience to<br />

consider A to be possible, and hence cease to know that ¬A, then A is not part of<br />

what is known for purposes of determining which worlds are nearby. (I assume here<br />

a broadly contextualist account of knowledge, as in Lewis (1996b), but this is inessential.<br />

If you do not like Lewis’s theory, replace all references to knowledge here, and<br />

in section 3.1, with references to epistemic certainty. I presume that what is epistemically<br />

certain really is contextually variable in the way Lewis suggests.) Thirdly, it is<br />

coarse-grained: whether a world is nearby depends only on whether it is consistent<br />

with what is known, not ‘how much’ it agrees with what is known. The resultant



theory seems to capture all the data, to explain the generally close connection between<br />

indicatives and subjunctives, and to explain the few differences which do arise<br />

between indicatives and subjunctives.<br />

The other detail to be filled in concerns embeddings of indicatives inside subjunctives.<br />

The formalism here requires that we use the full resources of two-dimensional<br />

modal logic, but the basic idea is very simple. Consider a sentence of the form “If it<br />

were the case that A, it would be the case that if B, C .” Roughly, this will be true<br />

iff the metaphysically nearest world in which A is true, call it wA, is a world where<br />

B → C is true. And that will be true iff the epistemically nearest world to wA in<br />

which B is true is a world where C is true. Less roughly, we have to quantify not<br />

over worlds, but over pairs of worlds, where the first element of the pair determines<br />

the reference for rigid designators, and the second element determines the truth of<br />

sentences given those references. But this only adds to the formal complexity; the<br />

underlying idea is still the same. The important philosophical point to note is that<br />

when we are trying to find the epistemically nearest world to w A (or, more strictly,<br />

the nearest world pair to 〈@, w A 〉) the facts that have to be held fixed are the facts that<br />

we know about w A , not what our counterparts in w A , or indeed what any inhabitant<br />

of w A knows about their world. These embeddings may be rare in everyday speech,<br />

but since they are our best guide to the truth values of indicatives in other possible<br />

worlds, they are theoretically very important.
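The two-step truth condition just described can be set out schematically. The notation below is ours, not the paper’s formalism: f_m selects the metaphysically nearest world in which its first argument is true, and f_e selects the epistemically nearest world-pair in which it is true.<br />

```latex
% Schematic sketch only (our notation). Write A \boxright D for the
% subjunctive and B \to C for the indicative; @ is the actual world.
\langle @, @ \rangle \models A \boxright (B \to C)
  \iff \langle @, f_m(A, @) \rangle \models B \to C
  \iff f_e\bigl(B, \langle @, f_m(A, @) \rangle\bigr) \models C
% The first element of each pair fixes the reference of rigid designators;
% the second fixes the truth of sentences given those references. The
% facts held fixed in applying f_e are what we actually know about
% w_A = f_m(A, @), not what anyone in w_A knows.
```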


Assertion, Knowledge and Action<br />

Ishani Maitra, Brian Weatherson<br />

It is widely believed that the mere truth of p is insufficient for p to be properly assertable,<br />

even if p is relevant to current conversation. If a speaker simply guessed that<br />

p is true, then she shouldn’t say p, for example. There is some dissent from this view<br />

(e.g., Weiner (2005)), but it is something close to orthodoxy in the current literature<br />

on assertion that something further is needed. The most common ‘something else’<br />

is knowledge: a speaker shouldn’t say p unless they know p. This view is nowadays<br />

commonly associated with Timothy Williamson (1996; 2000a), but it has historical<br />

antecedents tracing back at least to Max Black’s 1952 paper “Saying and Disbelieving”.¹<br />

Call Williamson’s position The Knowledge Rule.<br />

The Knowledge Rule Assert that p only if you know that p.<br />

This paper aims to raise trouble for The Knowledge Rule, and several related positions,<br />

by focussing on a particular kind of assertion. We’ll be looking at assertions<br />

about what is to be done. The boldest statement of our position is that if an agent<br />

should do X, then that agent is in a position to say that they should do X. (We’ll<br />

qualify this a little below, but it’s helpful to start with the bold position.) We argue,<br />

following Williamson’s ‘anti-luminosity’ arguments, that its being true that X is the<br />

thing to do for an agent doesn’t entail that that agent knows it’s the thing to do.² If<br />

both these claims are true, then there will be cases where it is fine to assert that X<br />

is what to do, even though the agent doesn’t know this. So, The Knowledge Rule is<br />

mistaken. Slightly more formally, we’ll be interested in arguments of this structure.<br />

† Penultimate draft only. Please cite published version if possible. Final version forthcoming in Philosophical<br />

Studies. We’d like to thank Matthew Benton, Jessica Brown, Andy Egan, and Susanna Schellenberg,<br />

as well as an audience at the Bellingham Summer Philosophy Conference, for helpful discussion of<br />

earlier drafts of this paper.<br />

1 Timothy Williamson (Williamson, 2000a, Ch. 11) has the clearest statement of the view we’re considering<br />

here. It is also defended by Keith DeRose (2002). Both DeRose and John Hawthorne (2004b) deploy<br />

it extensively as a constraint on theories of knowledge. Jason Stanley (2008) argues for an even stronger<br />

constraint: that we should only assert p if we are certain that p. Igor Douven (2006) argues that truth is<br />

neither sufficient nor necessary, so the norm should be assert only what is rationally credible. Kent Bach<br />

(2010) and Frank Hindriks (2007) both suggest that the only real norm governing assertion is belief, but<br />

that since knowledge is a norm of belief, we shouldn’t generally assert what we do not know. In this paper<br />

we’re not concerned with the question of whether the rule Assert only what you know holds solely in virtue<br />

of the normative nature of assertion itself, as Williamson thinks, or partly in virtue of norms applying to<br />

related states like belief, as Bach and Hindriks suggest, but rather whether the rule is even a good rule.<br />

2 We’ll use the expressions ‘thing to do’ and ‘what to do’ interchangeably throughout the paper. By X is<br />

what to do, we mean X ought to be done, all things considered. We take no position on whether X’s being<br />

what to do entails its being the morally right thing to do. That may be the case, but nothing we say in this<br />

paper depends on its being so.


Master Argument (First Attempt)<br />

(1) If act X is what to do for agent S, then S can properly assert that X<br />

is what to do (assuming that this assertion is relevant to the current<br />

conversation).<br />

(2) It is possible that X is what to do for S, even though S is not in a<br />

position to know this.<br />

(3) So, it is possible that S can properly assert that X is what to do even<br />

though she does not know, and is not even in a position to know,<br />

that X is what to do.<br />

In section 1, we’ll motivate premise 1 with a couple of vignettes. In section 2, we’ll<br />

qualify that premise and make it more plausible. In section 3, we’ll motivate premise<br />

2. In section 4, we’ll look at one of the positive arguments for The Knowledge Rule,<br />

the argument from Moore’s paradox, and conclude that it is of no help. In section<br />

5, we’ll look at what could be put in place of The Knowledge Rule, and suggest two<br />

alternatives.<br />

The Evidence Responsiveness Rule Assert that p only if your attitude towards p is<br />

properly responsive to the evidence you have that bears on p.<br />

The Action Rule Assert that p only if acting as if p is true is the thing for you to do.<br />

We’re not going to argue for these rules in detail; that would take a much longer<br />

paper. Nor are we going to decide between them. What we are going to suggest<br />

is that these rules have the virtues that are commonly claimed for The Knowledge<br />

Rule, but lack The Knowledge Rule’s problematic consequences when it comes to<br />

assertions about what to do.<br />

1 Speaking about What to Do<br />

We start by motivating premise 1 of the Master Argument with a couple of examples.<br />

Both cases are direct counterexamples to The Knowledge Rule, but we’re interested in<br />

the first instance in what the cases have in common. After presenting the vignettes,<br />

we offer three distinct arguments to show that, in such cases, it is proper for the<br />

speakers to assert what they do assert, even though they don’t know it to be true.<br />

Going to War<br />

Imagine that a country, Indalia, finds itself in a situation in which the thing for it to<br />

do, given the evidence available to its leaders, is to go to war against an enemy. (Those<br />

pacifists who think it is never right to go to war won’t like this example, but we think<br />

war can at least sometimes be justified.) But it is a close call. Had the evidence been a<br />

bit weaker, had the enemy been a little less murderous, or the risk of excessive civilian<br />

casualties a little higher, it would have been preferable to wait for more evidence, or<br />

use non-military measures to persuade the enemy to change its ways. So, while going<br />

to war is the thing to do, the leaders of Indalia can’t know this. We’ll come back to


this in section 2, but the crucial point here is that knowledge has a safety constraint,<br />

and any putative knowledge here would violate this constraint.<br />

Our leaders are thus in a delicate position here. The Prime Minister of Indalia<br />

decides to launch the war, and gives a speech in the House of Commons setting out<br />

her reasons. All the things she says in the speech are true, and up to her conclusion<br />

they are all things that she knows. She concludes with (1).<br />

(1) So, the thing to do in the circumstances is to go to war.<br />

Now (1) is also true, and the Prime Minister believes it, but it is not something she<br />

knows. So, the Prime Minister violates The Knowledge Rule when she asserts (1).<br />

But it seems to us that she doesn’t violate any norms in making this assertion. We’ll<br />

have a lot more to say about why this is so in a few paragraphs. But first, here’s a less<br />

dramatic case that is also a counterexample to The Knowledge Rule, one that involves<br />

prudential judgments rather than moral judgments.<br />

Buying Flood Insurance<br />

Raj and Nik are starting a small business. The business is near a river that hasn’t<br />

flooded in recent memory, but around which there isn’t much flood protection. They<br />

could buy flood insurance which would be useful in a flood, naturally, but would be<br />

costly in the much more likely event that there is not a flood. Raj has done the<br />

calculations of the likelihood of a flood, the amount this would damage the business,<br />

the utility loss of not having this damage insured, and the utility loss of paying flood<br />

insurance premiums. He has concluded that buying flood insurance is the thing to<br />

do. As it happens, this was a good conclusion to draw: it does, in fact, maximise<br />

his (and Nik’s) expected utility over time. (It doesn’t maximise their actual utility,<br />

as there actually won’t be a flood over the next twelve months. So, the insurance<br />

premium is an expense they could have avoided. But that doesn’t seem particularly<br />

relevant for prudential evaluation. Prudential buyers of insurance should maximise<br />

expected utility, not actual utility. Or so we must say unless we want to be committed<br />

to the view that everyone who buys an insurance policy and doesn’t make a claim on<br />

it is imprudent.)<br />
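Raj’s reasoning can be sketched as a simple expected-utility comparison. The figures below are entirely hypothetical, chosen only so that insuring narrowly maximises expected utility while a flood does not actually occur, as in the vignette:<br />

```python
# Hypothetical figures only; nothing here comes from the paper. They are
# chosen so that buying insurance narrowly wins in expectation, even
# though no flood actually occurs.

p_flood = 0.02          # estimated yearly probability of a flood
flood_loss = 400_000    # uninsured damage to the business, in dollars
premium = 7_500         # annual flood-insurance premium

# Treat utility as linear in dollars for simplicity; a risk-averse
# utility function would favour insuring even more strongly.
exp_cost_uninsured = p_flood * flood_loss   # expected loss without cover
exp_cost_insured = premium                  # damage covered; premium is the only cost

thing_to_do = ("buy insurance" if exp_cost_insured < exp_cost_uninsured
               else "skip insurance")

# Actual outcome: no flood occurs this year, so the premium is an expense
# Raj could have avoided. Insuring maximises expected, not actual, utility.
actual_cost_insured, actual_cost_uninsured = premium, 0
print(thing_to_do)
```

The gap between 7,500 and 8,000 in expectation is deliberately small: had the inputs been slightly different, skipping insurance would have been the thing to do, which is exactly the closeness the vignette turns on.<br />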

But again, it’s a close call. If there had been a little less evidence that a flood<br />

was a realistic possibility, or the opportunity cost of using those dollars on insurance<br />

premiums had been a little higher, or the utility function over different outcomes a<br />

little different, it would have been better to forego flood insurance. That suggests<br />

that safety considerations make it the case that Raj doesn’t know that buying flood<br />

insurance is the thing to do, though in fact it is.<br />

Let’s now assume Raj has done everything he should do to investigate the costs<br />

and benefits of flood insurance. We can imagine a conversation between him and Nik<br />

going as follows.<br />

Nik: Should we get flood insurance?<br />

Raj: I don’t know. Hold on; I’m on the phone.<br />

Nik: Who are you calling?<br />

Raj: The insurance agent. I’m buying flood insurance.


There is clearly a pragmatic tension in Raj’s actions here. But given The Knowledge<br />

Rule, there’s little else he can do. It would be a serious norm violation to say nothing<br />

in response to Nik’s question. And given that he can’t say “Yes” without violating<br />

The Knowledge Rule, he has to say “I don’t know”. Moreover, since by hypothesis<br />

buying flood insurance is the thing to do in his situation, he can’t not buy the insurance<br />

without doing the wrong thing. So, given The Knowledge Rule, he’s doing the<br />

best he can. But it’s crazy to think that this is the best he can do.<br />

We think that these cases are problems for The Knowledge Rule. In particular,<br />

we think that in each case, there is a non-defective assertion of something that is not<br />

known. It seems to us intuitively clear that those assertions are non-defective, but<br />

for those who don’t share this intuition, we have three independent arguments. The<br />

arguments focus on Going to War, but they generalize easily enough to Buying<br />

Flood Insurance.<br />

Argument One: “That was your first mistake”<br />

Imagine that the Prime Minister has a philosophical advisor. And the advisor’s job<br />

is to inform the Prime Minister whenever she violates a norm, and stay silent otherwise.<br />

If The Knowledge Rule is correct, then the advisor should stay silent as the<br />

Prime Minister orders the invasion, silent as the Prime Minister sets out the reasons<br />

for the invasion, then speak up at the very last line of the speech. That strikes us as<br />

absurd. It’s particularly absurd when you consider that the last line of the speech is<br />

supported by what came earlier in the speech, and the Prime Minister believes it, and<br />

asserts it, because it is well supported by what came earlier in the speech. Since we<br />

think this couldn’t be the right behaviour for the advisor, we conclude that there’s<br />

no norm violation in the Prime Minister asserting (1).<br />

We’ve heard two replies to this kind of argument. According to one sort of reply,<br />

The Knowledge Rule is not meant to be an ‘all-things-considered’ norm. The<br />

defender of The Knowledge Rule can say that the Prime Minister’s assertion is defective<br />

because it violates that rule, but allow that it is nevertheless all-things-considered<br />

proper, because some other norm outweighs The Knowledge Rule on this occasion.<br />

We agree that The Knowledge Rule is not intended to be an all-things-considered<br />

norm. But even keeping clearly in mind the distinction between being defective in<br />

some respect and being defective all-things-considered, it is still deeply unintuitive<br />

to say that the Prime Minister’s assertion is defective in a respect. That is, we don’t<br />

think the philosophical advisor should speak up just at the very end of the Prime<br />

Minister’s speech even if she’s meant to observe all the norm violations (rather than<br />

just the all-things-considered norm violations).<br />

Perhaps the defender of The Knowledge Rule needn’t just appeal to an intuition<br />

here. Another reply we’ve heard starts from the premise that the Prime Minister’s<br />

assertion would be better, in a certain respect, if she knew that it was true. Therefore,<br />

there is a respect in which that assertion is defective, just as The Knowledge<br />

Rule requires. To this second reply, our response is that the premise is true, but the<br />

reasoning is invalid. Saying why requires reflecting a bit on the nature of norms.<br />

There are lots of ways for assertions to be better. It is better, ceteris paribus, for<br />

assertions to be funny rather than unfunny. It is better for assertions to be sensitive


rather than insensitive. (We mean this both in the Nozickian sense, i.e., an assertion<br />

is sensitive iff it wouldn’t have been made if it weren’t true, and in the Hallmark<br />

greeting card sense.) It is better for speakers to be certain of the truth of their assertions<br />

than for them to be uncertain. But these facts don’t imply that humour,<br />

sensitivity, or certainty are norms of assertion, for it doesn’t follow that assertions<br />

that lack humour (or sensitivity or certainty) are always defective. Similarly, the fact<br />

that it is better to know what you say than not doesn’t imply that asserting what you<br />

don’t know is always defective. In slogan form: Not every absence of virtue is a vice.<br />

We think knowledge is a virtue of assertions. (In fact, we think that pretty much<br />

every norm of assertion that has been proposed in the literature picks out a virtue of<br />

assertion.) What we deny is that the absence of knowledge is (always) a vice. Since<br />

not every absence of virtue is a vice, one can’t argue that the Prime Minister’s assertion<br />

is defective by arguing it could have been better. And that’s why the argument<br />

being considered is invalid.<br />

Argument Two: “Actions speak louder than words”<br />

It’s a bit of folk wisdom that actions speak louder than words. It isn’t crystal clear<br />

just what this wisdom amounts to, but we think one aspect of it is that an agent<br />

incurs more normative commitments by doing X than by talking about X. But if<br />

The Knowledge Rule is right, then this piece of wisdom is, in this aspect, back-to-front.<br />

According to that rule, an agent incurs a greater normative commitment by<br />

saying that X is what to do than they do by just doing X. If they do X, and X is indeed<br />

what to do, then they’ve satisfied all of their normative commitments. If, by contrast,<br />

they say that X is what to do, then not only must X be what to do, but they must<br />

know this fact as well. This strikes us as completely back-to-front. We conclude that<br />

there is nothing improper about asserting that X is what to do (as the Prime Minister<br />

does), when X is in fact what to do.<br />

Argument Three: “What else could I do?”<br />

Here’s a quite different argument that Going to War is a counterexample to The<br />

Knowledge Rule.<br />

(1) If ending the speech the way she did was a norm violation, there is a better way<br />

for the Prime Minister to end her speech.<br />

(2) There is no better way for the Prime Minister to end the speech without saying<br />

something that she does not know to be true.<br />

(3) So, ending the speech the way she did was not a norm violation.<br />

(4) So, The Knowledge Rule is subject to counterexample.<br />

Premise 1 is a kind of ‘ought-implies-can’ principle, and as such, it isn’t completely<br />

obvious that it is true. But when we’ve presented this argument to various groups,<br />

the focus has always been on premise two. The common complaint has been that the<br />

Prime Minister could have ended the speech in one of the following ways, thereby<br />

complying with The Knowledge Rule.<br />

• I’ve decided that going to war is the thing to do in the circumstances.


• I believe that going to war is the thing to do in the circumstances.<br />

• It seems to me that going to war is the thing to do in the circumstances.<br />

Our first reply to this suggestion is that we’d fire a speechwriter who recommended<br />

that a Prime Minister end such a speech in such a weaselly way, so this hardly counts<br />

as a criticism of premise 2. Our more serious reply is that even if the Prime Minister<br />

ended the speech this way, she’d still violate The Knowledge Rule. To see why this is<br />

so, we need to pay a little closer attention to what The Knowledge Rule says.<br />

Note that The Knowledge Rule is not a rule about what kind of declarative utterance<br />

you can properly make. An actor playing Hamlet does not violate The Knowledge<br />

Rule if he fails to check, before entering the stage, whether something is indeed<br />

rotten in the state of Denmark. The rule is a rule about what one asserts. And just<br />

as you can assert less than you declaratively utter (e.g., on stage), you can also assert<br />

more than you declaratively utter.³ For instance, someone who utters The F is G in a<br />

context in which it is common ground that a is the F typically asserts both that the<br />

F is G, and that a is G. Similarly, someone who utters I think that S typically<br />

both asserts that they have a certain thought, and asserts the content of that thought.<br />

We can see this is so by noting that we can properly challenge an utterance of I think<br />

that S by providing reasons that S is false, even if these are not reasons that show<br />

that the speaker does not (or at least did not) have such a thought. In the context of<br />

her speech to the House of Commons, even if the Prime Minister were to end with<br />

one of the options above, she would still assert the same thing she would assert by<br />

uttering (1) in the circumstances, and she’d still be right to make such an assertion.<br />

2 Bases for Action and Assertion<br />

One might worry that premise 1 in our master argument is mistaken, in the following<br />

way. We said that if X is the thing to do for S, then S can say that X is what to do.<br />

But one might worry about cases where S makes a lucky guess about what is to be<br />

done. Above we imagined that Raj had taken all of the factors relevant to buying<br />

flood insurance into account. But imagine a different case, one involving Raj*, Raj’s<br />

twin in a similar possible world. Raj* decides to buy flood insurance because he<br />

consults his Magic 8-Ball. Then, even if buying flood insurance would still maximize<br />

his expected utility, it doesn’t seem right for Raj* to say that buying flood insurance<br />

is what to do.<br />

Here is a defence of premise 1 that seems initially attractive, though not, we think,<br />

ultimately successful. The Magic 8-Ball case isn’t a clear counterexample to premise 1,<br />

it might be argued, because it isn’t clear that buying flood insurance for these reasons<br />

is the thing for Raj* to do. On one hand, we do have the concept of doing the right<br />

thing for the wrong reasons, and maybe that is the right way to describe what Raj*<br />

does if he follows the ball’s advice. But it isn’t clearly a correct way to describe Raj*.<br />

It’s not true, after all, that he’s maximising actual utility. (Remember that there will<br />

3 The points we’re about to make are fairly familiar by now, but for more detail, see Cappelen and<br />

Lepore (2005), which played an important role in reminding the philosophy of language community of<br />

their significance.


be no claims on the policy he buys.) And it isn’t clear how to think about expected<br />

utility maximisation when the entrepreneur in question relies on the old Magic 8-<br />

Ball for decision making. And we certainly want to say that there’s something wrong<br />

about this very decision when made using the Magic 8-Ball. So, perhaps we could say<br />

that buying flood insurance isn’t what to do for Raj* in this variant example, because<br />

he has bad reasons.<br />

But this seems like a tendentious defence of the first premise. Worse still, it is<br />

an unnecessary defence. What we really want to focus on are cases where people do<br />

the right thing for the right reasons. Borrowing a leaf from modern epistemology,<br />

we’ll talk about actions having a basis. As well as there being a thing to do in the<br />

circumstances (or, more plausibly, a range of things to do), there is also a correct basis<br />

for doing that thing (or, more plausibly, a range of correct bases). What we care about<br />

is when S does X on basis B, and doing X on basis B is the thing to do in S’s situation.<br />

Using this notion of a basis for action, we can restate the main argument.<br />

Master Argument (Corrected)<br />

(1) If doing X on basis B is what to do for agent S, then S can properly,<br />

on basis B, assert that X is what to do (assuming this is relevant to<br />

the conversation).<br />

(2) It is possible that doing X on basis B is what to do for S, even though<br />

S is not in a position to know, and certainly not in a position to<br />

know on basis B, that X is what to do.<br />

(3) So, it is possible that S can properly assert that X is what to do, even<br />

though she does not know, and is not even in a position to know,<br />

that X is what to do.<br />

We endorse this version of the master argument. Since its conclusion is the denial<br />

of The Knowledge Rule, we conclude that The Knowledge Rule is mistaken. But we<br />

perhaps haven’t said enough about premise 2 to seal the argument. The next section<br />

addresses that issue.<br />

3 Marginal Wars<br />

The argument for premise 2 is just a simple application of Williamson’s anti-luminosity<br />

reasoning. (The canonical statement of this reasoning is in Williamson (2000a,<br />

Ch. 4).) Williamson essentially argues as follows, for many different values of p.<br />

There are many ways for p to be true, and many ways for it to be false. Some of the<br />

ways in which p can be true are extremely similar to ways in which it can be false. If<br />

one of those ways is the actual way in which p is true, then to know that p we have to<br />

know that situations very similar to the actual situation do not obtain. But in general<br />

we can’t know that. So, some of the ways in which p can be true are not compatible<br />

with our knowing that p is true. In Williamson’s nice phrase, p isn’t luminous, where<br />

a luminous proposition is one that can be known (by a salient agent) whenever it is<br />

true. The argument of this paragraph is called ‘an anti-luminosity argument’, and we<br />

think that many instances of it are sound.
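The structure of the argument can be set out schematically. The notation here (a series of cases α_0, …, α_n, and an operator K for “is in a position to know”) is ours, not Williamson’s own presentation:<br />

```latex
% Luminosity, as glossed in the text: p is luminous iff whenever p
% obtains, the salient agent is in a position to know that it obtains.
\text{Luminous}(p) \;:=\; \forall \alpha \, \bigl( p(\alpha) \rightarrow K_{\alpha}\, p \bigr)
% Anti-luminosity sketch: take cases \alpha_0, \dots, \alpha_n running
% from clearly-p to clearly-not-p, each barely different from the next.
% A safety constraint on knowledge gives, for each i:
K_{\alpha_i}\, p \rightarrow p(\alpha_{i+1})
% If p were luminous, p(\alpha_0) would yield K_{\alpha_0} p, hence
% p(\alpha_1), hence K_{\alpha_1} p, and so on, ending with p(\alpha_n)
% -- contradicting the choice of \alpha_n as a clearly-not-p case.
% So somewhere in the series p is true without being known.
```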


There is a crucial epistemic premise in the middle of that argument: that we can’t<br />

know something if it is false in similar situations. There are two ways that we could<br />

try to motivate this premise. First, we could try to motivate it with the help of<br />

conceptual considerations about the nature of knowledge. That’s the approach that<br />

Williamson takes. But his approach is controversial. It is criticised by Sainsbury<br />

(1996) and Weatherson (2004b) on the grounds that his safety principle goes awry in<br />

some special cases. Sainsbury focuses on mathematical knowledge, Weatherson on<br />

introspective knowledge. But the cases in which we’re most interested in this paper<br />

– Indalia going to war, Raj and Nik buying flood insurance – don’t seem to fall into<br />

either of these problem categories. Nevertheless, rather than pursue this line, we’ll<br />

consider a different approach to motivating this premise.<br />

The second motivation for the epistemic premise comes from details of the particular<br />

cases. In the two cases on which we’re focusing, the agents simply lack fine<br />

discriminatory capacities. They can’t tell some possibilities apart from nearby possibilities.<br />

That is, they can’t know whether they’re in one world or in some nearby<br />

world. That’s not because it’s conceptually impossible to know something that fine,<br />

but simply an unfortunate fact about their setup. If they can’t know that they’re not<br />

in a particular nearby world in which ¬p, they can’t know p. Using variants of Going<br />

to War, we’ll describe a few ways this could come about.<br />

The simplest way for this to come about is if war-making is the thing to do given<br />

what we know, but some of the crucial evidence consists of facts that we know, but<br />

don’t know that we know. Imagine that a crucial piece of Indalia’s case for war comes<br />

from information from an Indalian spy working behind enemy lines. As it turns out,<br />

the spy is reliable, so the leaders of Indalia can acquire knowledge from her testimony.<br />

But she could easily enough have been unreliable. She could, for instance,<br />

have been bought off by the enemy’s agents. As it happens, the amount of money<br />

that would have taken was outside the budget the enemy has available for counterintelligence.<br />

But had the spy been a little less loyal, or the enemy a little less frugal<br />

with the counterintelligence budget, she could easily have been supplying misinformation<br />

to Indalia. So, while the spy is a safe knowledge source, the Indalian leaders<br />

don’t know that she is safe. They don’t, for instance, know the size of the enemy’s<br />

counterintelligence budget, or how much it would take to buy off their spy, so for all<br />

they know, she is very much at risk of being bought off.<br />

In this case, if the spy tells the Indalian leaders that p, they come to know that p,<br />

and they can discriminate p worlds from ¬p worlds. But they don’t know that they<br />

know that p, so for all they know, they don’t know p. And for some p that they learn<br />

from the spy, if they don’t know p, then going to war isn’t the thing for them to do<br />

in the circumstances. So, given that they don’t know the spy is reliable, they don’t<br />

know that going to war is the thing for them to do. But the spy really is reliable, so<br />

they do know p, so going to war is indeed the thing for them to do.<br />

Or consider a slightly less fanciful case, involving statistical sampling. Part of the<br />

Prime Minister’s case for starting the war was that the enemy was killing his own<br />

citizens. Presumably she meant that he was killing them in large numbers. (Every<br />

country with capital punishment kills its own citizens, but arguably that isn’t a sufficient<br />

reason to invade.) In practice, our knowledge of the scope of this kind of


governmental killing comes from statistical sampling. And this sampling has a margin<br />

of error. Now imagine that the Indalian leaders know that a sample has been<br />

taken, and that it shows that the enemy has killed n of his citizens, with a margin of<br />

error of m. So, assuming there really are n killings, they know that the enemy has<br />

killed between n - m and n + m of his citizens. Since knowing that he’s killed n -<br />

m people is sufficient to make going to war the thing to do, the war can be properly<br />

started.<br />

But now let’s think about what the Indalian leaders know that they know in this<br />

case. The world where the enemy has killed n - m people is consistent with their<br />

knowledge. And their margin of error on estimates of how many the enemy has<br />

killed is m. So, if that world is actual, they don’t know the enemy has killed more<br />

than n - 2m of his citizens. And that knowledge might not be enough to make going<br />

to war the thing to do, especially if m is large. (Think about the case where m =<br />

n/2, for instance.) So, there’s a world consistent with their knowledge (the n - m<br />

killings world), in which they don’t know enough about what the enemy is doing to<br />

make going to war the thing to do. In general, if there’s a world consistent with your<br />

knowledge where p is false, you don’t know p. Letting p be Going to war is what to<br />

do, it follows then that they don’t know that going to war is what to do, even though<br />

it actually is the thing to do.<br />
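The margin-of-error step can be made concrete with hypothetical figures (ours, not the paper’s): a survey estimates n killings with margin of error m, and we stipulate a threshold number of known killings above which war is the thing to do.<br />

```python
# Hypothetical figures illustrating the margin-of-error argument; the
# threshold is a stipulation for the example, not a claim of the paper.

n, m = 10_000, 4_000
threshold = 6_000  # suppose knowing at least this many killings suffices

# Actual world: the estimate supports a known lower bound of n - m.
known_lower_bound = n - m           # 6,000: meets the threshold

# A world consistent with their knowledge in which only n - m killings
# occurred: there, the same margin of error supports only n - 2m.
nearby_lower_bound = n - 2 * m      # 2,000: falls short of the threshold

war_is_thing_to_do = known_lower_bound >= threshold            # True
known_in_nearby_world = nearby_lower_bound >= threshold        # False
print(war_is_thing_to_do, known_in_nearby_world)
```

Since a world consistent with the leaders’ knowledge is one in which they would not know enough to make war the thing to do, they do not know that war is the thing to do, even though (given the actual figures) it is.<br />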

Another way we could have a borderline war is a little more controversial. Imagine<br />

a case where the leaders of Indalia know all the salient descriptive facts about the<br />

war. They know, at least well enough for present purposes, what the costs and benefits<br />

of the war might be. But it is a close call whether the war is the thing to do given<br />

those costs and benefits. Perhaps different plausible moral theories lead to different<br />

conclusions. Or perhaps the leaders know what the true moral theory is, but that<br />

theory offers ambiguous advice. We can imagine a continuum of cases where the<br />

true theory says war is clearly what to do at one end, clearly not what to do at another,<br />

and a lot of murky space between. Unless we are willing to give up on classical<br />

logic, we must think that somewhere there is a boundary between the cases where<br />

it is and isn’t what to do, and it seems in cases near the boundary even a true belief<br />

about what to do will be unsafe. That is, even a true belief will be based on capacities<br />

that can’t reliably discriminate situations where going to war is what to do from cases<br />

where it isn’t.<br />

We’ve found, when discussing this case with others, that some people find this<br />

outcome quite intolerable. They think that there must be some epistemic constraints<br />

on war-making. And we agree. They go on to think that these constraints will be<br />

incompatible with the kind of cases we have in mind that make premise 2 true. And<br />

here we disagree. It’s worth going through the details here, because they tell us quite<br />

a bit about the nature of epistemic constraints on action.<br />

Consider all principles of the form<br />

(KW) Going to war is N1 only if the war-maker knows that going to war is N2.<br />

where N1 and N2 are normative statuses, such as being the thing to do, being right,<br />

being good, being just, being utility increasing, and so on. All such principles look


Assertion, Knowledge and Action 363<br />

like epistemic constraints on war-making, broadly construed. One principle of this<br />

form would be that going to war is right only if the war-maker knows that going to<br />

war is just. That would be an epistemic constraint on war-making, and a plausible<br />

one. Another principle of this form would be that going to war is the thing to do<br />

only if the war-maker knows that going to war increases actual utility. That would<br />

be a very strong epistemic constraint on war-making, one that would rule out pretty<br />

much every actual war, and one that is consistent with the anti-luminosity argument<br />

with which we started this section. So, the anti-luminosity argument is consistent<br />

with there being quite strong epistemic constraints on war-making.<br />

What the anti-luminosity argument is not consistent with is there being any true<br />

principle of the form (KW) where N1 equals N2. In particular, it isn’t consistent<br />

with the principle that going to war is the thing to do only if the war maker knows<br />

that it is the thing to do. But that principle seems quite implausible, because of cases<br />

where going to war is, but only barely, the thing to do. More generally, the following<br />

luminosity of action principle seems wrong for just about every value of X.<br />

(LA) X is the thing for S to do only if S knows that X is the thing for her to do.<br />

Not only is (LA) implausible, things look bad for The Knowledge Rule if it has to rely<br />

on (LA) being true. None of the defenders of The Knowledge Rule has given us an<br />

argument that (LA) is true. One of them has given us all we need to show that (LA)<br />

is false! It doesn’t look like the kind of principle that The Knowledge Rule should<br />

have to depend upon. So, defending The Knowledge Rule here looks hopeless.<br />

Note that given premise 1 of the Master Argument, as corrected, every instance<br />

of (LA) has to be true for The Knowledge Rule to be universally true. Let’s say that<br />

you thought (LA) was true when X is starting a war, but not when X is buying flood<br />

insurance. Then we can use the case of Raj and Nik to show that The Knowledge<br />

Rule fails, since Raj can say that buying flood insurance is what to do in a case where<br />

it is what to do, but he doesn’t know this.<br />

One final observation about the anti-luminosity argument. Given the way Williamson<br />

presents the anti-luminosity argument, it can appear that in all but a few cases,<br />

if p, the salient agent can know that p. After all, the only examples Williamson gives<br />

are cases that are only picked out by something like the Least Number Theorem. So,<br />

one might think that while luminosity principles are false, they are approximately<br />

true. More precisely, one might think that in all but a few weird cases near the borderline,<br />

if p, then a salient agent is in a position to know p. If so, then the failures of<br />

luminosity aren’t of much practical interest, and hence the failures of The Knowledge<br />

Rule we’ve pointed out aren’t of much practical interest.<br />

We think this is all mistaken. Luminosity failures arise because agents have less<br />

than infinite discriminatory capacities. The worse the discriminatory capacities, the<br />

greater the scope for luminosity failures. When agents have very poor discriminatory<br />

capacities, there will be very many luminosity failures. This is especially marked<br />

in decision-making concerning war. The fog of war is thick. There is very much<br />

that we don’t know, and what we do know is based on evidence that is murky and<br />

ephemeral. There is very little empirical information that we know that we know.



If there are certain actions (such as starting a war) that are proper only if we know<br />

a lot of empirical information, the general case will be that we cannot know that<br />

these actions are correct, even when they are. This suggests that luminosity failures,<br />

where an action is correct but not known to be correct, or a fact is known but not<br />

known to be known, are not philosophical curiosities. In epistemically challenging<br />

environments, like a war zone, they are everyday facts of life.<br />

4 Moore’s Paradox<br />

There is a standard argument for The Knowledge Rule that goes as follows. First, if<br />

the Knowledge Rule did not hold, then certain Moore paradoxical assertions would<br />

be acceptable. In particular, it would be acceptable to assert q, but I don’t know<br />

that q. 4 But second, Moore paradoxical assertions are never acceptable. Hence, The<br />

Knowledge Rule holds. We reject both premises of this argument.<br />

To reject the first premise, it suffices to show that some rule other than The<br />

Knowledge Rule can explain the unacceptability of Moore paradoxical assertions.<br />

Consider, for example, The Undefeated Reason rule.<br />

The Undefeated Reason Rule Assert that p only if you have an undefeated reason<br />

to believe that p.<br />

The Undefeated Reason Rule says that q but I don’t know that q can be asserted only<br />

if the speaker has an undefeated reason to believe it. That means the speaker has<br />

an undefeated reason to believe each conjunct. That means that the speaker has an<br />

undefeated reason to believe that they don’t know q. But in every case where it is unacceptable<br />

to both assert q and assert that you don’t know q, the speaker’s undefeated<br />

reason to believe they don’t know q will be a defeater for her belief that q. If you<br />

have that much evidence that you don’t know q, that will in general defeat whatever<br />

reason you have to believe q.<br />

We don’t claim that The Undefeated Reason Rule is correct. (In fact, we prefer the<br />

rules we’ll discuss in section 5.) We do claim that it provides an alternative explanation<br />

of the unacceptability of instances of q but I don’t know that q. So, we claim that<br />

it undermines the first premise of Williamson’s argument from that unacceptability<br />

to The Knowledge Rule.<br />

We also think that Williamson’s explanation of Moore paradoxicality over-generates.<br />

There is generally something odd about saying q but I don’t know that q. We<br />

suspect that the best explanation for why this is odd will be part of a broader explanation<br />

that also explains, for instance, why saying I promise to do X, but I’m not actually<br />

going to do X is also defective. Williamson’s explanation isn’t of this general form.<br />

He argues that saying q but I don’t know that q is defective because it is defective in<br />

every context to both assert q and assert that you don’t know that q. But we don’t<br />

think that it is always defective to make both of these assertions. 5 In particular, if a<br />

4 (Williamson, 2000a, §11.3), for instance, shows the strength of this argument.<br />

5 This is why we hedged a little two paragraphs ago about what precisely The Undefeated Reason Rule<br />

explains. We suspect that many in the literature have misidentified the explanandum.



speaker is asked whether q is true, and whether they know that q, it can be acceptable<br />

to reply affirmatively to the first question, but negatively to the second one. If so,<br />

then the second premise of Williamson’s argument from Moore paradoxicality is also<br />

false.<br />

Imagine that the Indalian Prime Minister is a philosopher in her spare time. After<br />

the big speech to Parliament she goes to her Peninsula Reading Group. It turns out<br />

Michael Walzer and Tim Williamson are there, and have questions about the speech.<br />

TW: Do you agree that knowledge requires safety?<br />

PM: Yes, yes I do.<br />

TW: And do you agree that your belief that going to war is the thing to<br />

do is not safe?<br />

PM: Right again.<br />

TW: So, you don’t know that going to war is the thing to do?<br />

PM: You’re right, I don’t.<br />

MW: But is it the thing to do?<br />

PM: Yes.<br />

The Prime Minister’s answers in this dialogue seem non-defective to us. But if Williamson’s<br />

explanation of why Moore paradoxical utterances are defective is correct,<br />

her answers should seem defective. So, Williamson’s explanation over-generates.<br />

Whether or not it is true that all assertions of sentences of the form q but I don’t<br />

know that q are defective, it isn’t true that there is a defect in any performance that includes<br />

both an assertion of q and an assertion of the speaker’s ignorance as to whether<br />

q. The Prime Minister’s performance in her reading group is one such performance.<br />

So, the explanation of Moore paradoxicality cannot be that any such performance<br />

would violate a norm governing assertion.<br />

To sum up, then, we’ve argued that The Knowledge Rule (a) fails to be the only<br />

explanation of Moore paradoxicality, and (b) misclassifies certain performances that<br />

are a little more complex than simple conjunctive assertions as defective. So, there’s<br />

no good argument from Moore paradoxicality to The Knowledge Rule.<br />

5 Action and Assertion<br />

If we’re right, there’s a striking asymmetry between certain kinds of assertions. In<br />

the war example, early in her speech, the Prime Minister says (2).<br />

(2) The enemy has been murdering his own civilians.<br />

That’s not the kind of thing she could properly say if it could easily have been false<br />

given her evidence. And like many assertions, this is not an assertion whose appropriateness<br />

is guaranteed by its truth. Asserting (2) accuses someone of murder, and<br />

you can’t properly make such accusations without compelling reasons, even if they<br />

happen to be true. On the other hand, we say, the truth of (1) does (at least when it<br />

is accepted on the right basis) suffice to make it properly assertable.<br />

There’s a similar asymmetry in the flood insurance example. In that example, (3)<br />

is true, but neither Raj nor Nik knows it.



(3) Raj and Nik’s business will not flood this year.<br />

Again, in these circumstances, this isn’t the kind of thing Raj can properly say. Even<br />

though (3) is true, it would be foolhardy for Raj to make such a claim without very<br />

good reasons. By contrast, again, we say that Raj can properly assert that the thing to<br />

do, in their circumstances, is to buy flood insurance, even though he does not know<br />

this.<br />

There are two directions one could go at this point. If we’re right, any proposed<br />

theory of the norms governing assertion must explain the asymmetry. Theories that<br />

cannot explain it, like The Knowledge Rule, or the Certainty Rule proposed by Jason<br />

Stanley (2008), or the Rational Credibility Rule proposed by Igor Douven (2006), are<br />

thereby refuted.<br />

The Certainty Rule Assert only what is certain.<br />

The Rational Credibility Rule Assert only what is rationally credible.<br />

The Certainty Rule fails since the Prime Minister is not certain of (1). And the<br />

Prime Minister can’t be certain of (1), since certainty requires safety just as much as<br />

knowledge does.<br />

It’s a little harder to show our example refutes The Rational Credibility Rule.<br />

Unlike knowledge, a safety constraint is not built into the concept of rational credibility.<br />

(Since rational credibility does not entail truth, in Douven’s theory, it can<br />

hardly entail truth in nearby worlds.) But we think that safety constraints may still<br />

apply to rational credibility in some particular cases. If you aren’t very good at judging<br />

the heights of tall buildings to a finer grain than 10 meters, then merely<br />

looking at a building that is 84 meters tall does not make it rationally credible for<br />

you that the building is more than 80 meters tall. In general, if your evidence does<br />

not give you much reason to think you are not in some particular world where p is<br />

false, and you didn’t have prior reason to rule that world out, then p isn’t rationally<br />

credible. So, when evidence doesn’t discriminate between nearby possibilities, and p<br />

is false in nearby possibilities, p isn’t rationally credible.<br />

And that, we think, is what happens in our two examples. Just as someone looking<br />

at an 84 meter building can’t rationally credit that it is more than 80 meters tall,<br />

unless they are abnormally good at judging heights, agents for whom X is just barely<br />

the thing to do can’t rationally credit that X is the thing to do. By The Rational<br />

Credibility Rule, they can’t say X is the thing to do. But they can say that; that’s<br />

what our examples show. So, The Rational Credibility Rule must be wrong.<br />

But we can imagine someone pushing in the other direction, perhaps with the<br />

help of this abductive argument.<br />

(1) A speaker can only assert things like (2) or (3) if they know them to be true.<br />

(2) The best explanation of premise 1 of this argument is The Knowledge Rule.<br />

(3) So, The Knowledge Rule is correct.



This isn’t a crazy argument. Indeed, it seems to us that it is implicit in some of the<br />

better arguments for The Knowledge Rule. But we think it fails. And it fails because<br />

there are alternative explanations of the first premise, explanations that don’t make<br />

mistaken predictions about the Prime Minister’s speech. For instance, we might have<br />

some kind of Evidence Responsiveness Rule.<br />

The Evidence Responsiveness Rule Assert that p only if your attitude towards p is<br />

properly responsive to the evidence you have that bears on p.<br />

Given how much can be covered by ‘properly’, this is more of a schema than a rule.<br />

Indeed, it is a schema that has The Knowledge Rule as one of its precisifications. In<br />

Knowledge and Its Limits, Williamson first argues that assertion is “governed by a<br />

non-derivative evidential rule” (249), and then goes on to argue that the proper form<br />

of that rule is The Knowledge Rule. We agree with the first argument, and disagree<br />

with the second one. 6<br />

Note that even a fairly weak version of The Evidence Responsiveness Rule would<br />

explain what is going on with cases like (1) and (2). Starting a war is a serious business.<br />

You can’t properly do it unless your views about the war are evidence responsive in<br />

the right way. You can’t, that is, correctly guess that starting the war is the thing<br />

to do. You can correctly guess that starting the war will be utility maximizing. And<br />

you can correctly guess that starting the war would be what to choose if you reflected<br />

properly on the evidence you have, and the moral significance of the choices in front<br />

of you. But you simply can’t guess that starting the war is what to do, and be right. If<br />

you’re merely guessing that starting a war is the thing to do, then you’re wrong to start<br />

that war. So, if (1) is true, and the Prime Minister believes it, her belief simply must<br />

be evidence responsive. Then, by The Evidence Responsiveness Rule, she can assert<br />

it.<br />

For most assertions, however, this isn’t the case. Even if it’s true that it will rain<br />

tomorrow, the Prime Minister could believe that without her belief being evidence<br />

responsive. In general, p does not entail that S even believes that p, let alone that this<br />

belief of S’s is evidence responsive. But in cases like (1), this entailment does hold,<br />

and that’s what explains the apparent asymmetry that we started this section with.<br />

The Evidence Responsiveness Rule also handles so-called ‘lottery propositions’<br />

nicely. If you know that the objective chance of p being true is c, where c is less than<br />

1, it will seem odd in a lot of contexts to simply assert p. In his arguments for The<br />

Knowledge Rule, Williamson makes a lot of this fact. In particular, he claims that the<br />

best explanation for this is that we can’t know that p on purely probabilistic grounds.<br />

This has proven to be one of the most influential arguments for The Knowledge Rule<br />

in the literature. But some kind of Evidence Responsiveness Rule seems to handle<br />

6 Actually, our agreement with Williamson here is a bit more extensive than the text suggests. Williamson<br />

holds that part of what makes a speech act an assertion as opposed to some other kind of act is<br />

that it is governed by The Knowledge Rule. Although many philosophers agree with Williamson that<br />

The Knowledge Rule is true, this fascinating claim about the metaphysics of speech acts has been largely<br />

ignored. Translating Williamson’s work into the terminology of this paper, we’re inclined to agree that a<br />

speech act is an assertion partly in virtue of being responsive to evidence in the right way. But filling in<br />

the details on this part of the story would take us too far from the main storyline of this paper.



lottery cases even more smoothly. In particular, an Evidence Responsiveness Rule<br />

that allows for what constitutes ‘proper’ responsiveness to be sensitive to the interests<br />

of the conversational participants will explain some odd features concerning lottery<br />

propositions and assertability.<br />

In the kind of cases that motivate Williamson, we can’t say p where it is objectively<br />

chancy whether p, and the chance of p is less than 1. But there’s one good sense<br />

in which such an assertion would not be properly responsive to the evidence. After<br />

all, in such a case there’s a nearby world, with all the same laws, and with all the same<br />

past facts, and in which the agent has all the same evidence, in which p is false. And<br />

the agent knows all this. That doesn’t look like the agent is being properly responsive<br />

to her evidence.<br />

On the other hand, we might suspect that Williamson’s arguments concerning<br />

lottery propositions overstate the data. Consider this old story from David Lewis<br />

(1996b). 7<br />

Pity poor Bill! He squanders all his spare cash on the pokies, the races,<br />

and the lottery. He will be a wage slave all his days . . . he will never be<br />

rich. (Lewis, 1996b, 443 in reprint)<br />

These seem like fine assertions. One explanation of the appropriateness of those assertions<br />

combines The Knowledge Rule with contextualism about assertion. 8 But<br />

contextualism has many weaknesses, as shown in Hawthorne (2004) and Stanley<br />

(2005). A less philosophically loaded explanation of Lewis’s example is that proper<br />

responsiveness comes in degrees, and for purposes of talking about Bill, knowing<br />

that it’s overwhelmingly likely that he’s doomed to wage slavery is evidence enough<br />

to assert that he’ll never be rich. The details of this explanation obviously need to be<br />

filled in, but putting some of the sensitivity to conversational standards, or practical<br />

interests, into the norms of assertion seems to be a simpler explanation of the data<br />

than a contextualist explanation. (It would be a priori quite surprising if the norms<br />

of proper assertion were not context-sensitive, or interests-sensitive. The norms of<br />

appropriateness for most actions are sensitive to context and interests.) So The Evidence<br />

Responsiveness Rule seems more promising here than The Knowledge Rule.<br />

A harder kind of case for The Knowledge Rule concerns what we might call ‘academic<br />

assertions’. This kind of case is discussed in Douven (2006) and in Maitra<br />

(2010). In academic papers, we typically make assertions that we do not know. We<br />

don’t know that most of the things we’ve said here are true. (Before the last sentence<br />

we’re not sure we knew that any of the things we said were true.) But that’s because<br />

knowledge is a bad standard for academic discourse. Debate and discussion would<br />

atrophy if we had to wait until we had knowledge before we could present a view. So,<br />

it seems that assertion can properly outrun knowledge in academic debate.<br />

7 We’ve slightly modified the case. Lewis says we can say that we know Bill will never be rich. That<br />

seems to us to be much more controversial than what we’ve included here.<br />

8 The combination is slightly trickier to state than would be ideal. The explanation we have in mind is<br />

that S can properly assert p only if S can truly say I know that p, where ‘know’ in this utterance is context<br />

sensitive.



Again, a context-sensitive version of The Evidence Responsiveness Rule explains<br />

the data well. Although you don’t need to know things to assert them in philosophy<br />

papers, you have to have evidence for them. We couldn’t have just spent this<br />

paper insisting louder and louder that The Knowledge Rule is false. We needed to<br />

provide evidence, and hopefully we’ve provided a lot of it. In some contexts, such<br />

as testifying in court, you probably need more evidence than what we’ve offered to<br />

ground assertions. But in dynamic contexts of inquiry, where atrophy is to be feared<br />

more than temporary mistakes, the standards are lower. Good evidence, even if not<br />

evidence beyond any reasonable doubt, or even if not enough for knowledge, suffices<br />

for assertion. That’s the standard we typically hold academic papers to. As with<br />

lotteries, we think the prospects of explaining these apparently variable standards in<br />

terms of a norm of assertion that is context-sensitive are greater than the prospects<br />

for explaining them in terms of contextually sensitive knowledge ascriptions.<br />

Here’s a different and somewhat more speculative proposal for a rule that<br />

also explains the asymmetry we started this section with. We call it the Action Rule.<br />

The Action Rule Assert that p only if acting as if p is true is the thing for you to do.<br />

We take the notion of acting as if something is true from Stalnaker (1973). Intuitively,<br />

to act as if p is true is to build p into one’s plans, or to take p for granted when<br />

acting. This, note, is not the same as using p as a basis for action. When Raj buys<br />

flood insurance, he acts as if buying flood insurance is the thing to do. But the fact<br />

that buying flood insurance is the thing to do isn’t the basis for his action. (Since he<br />

does not know this, one might suspect it wouldn’t be a good basis.) Instead his basis<br />

is what he knows about the river, and his business, and its vulnerability to flooding.<br />

When an agent is trying to maximise the expected value of some variable (e.g., utility,<br />

profit, etc.), then to act as if p is true is simply to maximise the conditional expected<br />

value of that variable, in particular, to maximise the expected value of that variable<br />

conditional on p. Even when one is not maximising any expected value, we can still<br />

use the same idea. To act as if p is to take certain conditional obligations or permissions<br />

you have – in particular, those obligations or permissions that are conditional<br />

on p – to be actual obligations or permissions.<br />
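Stalnaker’s gloss can be illustrated with a small numerical sketch. The probabilities and utilities below are hypothetical numbers of our own, chosen only to show how maximising expected utility conditional on p (acting as if p) can come apart from maximising unconditional expected utility in the flood case.<br />

```python
# Illustrative sketch (hypothetical numbers): "acting as if p" as maximising
# expected utility conditional on p, following Stalnaker's gloss above.

P_FLOOD = 0.10  # assumed probability of a flood this year

# utility[action][outcome]
utility = {
    "buy insurance":  {"flood": 95, "no flood": 95},   # premium paid, losses covered
    "skip insurance": {"flood": 20, "no flood": 100},  # uninsured loss if it floods
}

def expected_utility(action, p_flood=P_FLOOD):
    u = utility[action]
    return p_flood * u["flood"] + (1 - p_flood) * u["no flood"]

def best_action(p_flood=P_FLOOD):
    # The action maximising expected utility at the given flood probability.
    return max(utility, key=lambda a: expected_utility(a, p_flood))

# Unconditional expected utility: buying insurance is the thing to do.
print(best_action())             # buy insurance

# Acting as if "the business will not flood": condition on no flood (p_flood = 0).
print(best_action(p_flood=0.0))  # skip insurance
```

On these numbers, buying insurance maximises unconditional expected utility (95 vs. 92), while acting as if the business will not flood recommends skipping it, even though, as it happens, there will be no flood.<br />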

To see how The Action Rule generates the intended asymmetry, we’ll need a bit<br />

of formalism. Here are the terms that we will use.<br />

• X denotes an action, agent, circumstance triple 〈X_Action, X_Agent, X_Circumstance〉.<br />

We take such triples to have a truth value: X is true iff X_Agent performs X_Action<br />

in X_Circumstance.<br />

• ThingToDo(X) means that X is the thing to do for X_Agent in X_Circumstance.<br />

• Act(S, p) means that agent S acts as if p is true.<br />

• Assert(S, p) means that agent S can properly assert that p.<br />

So, The Action Rule is this.<br />

Assert(S, p) → ThingToDo(Act(S, p))



In our derivations, the following equivalence will be crucial.<br />

Act(X_Agent, ThingToDo(X)) ↔ X<br />

That is, acting as if X is what to do (in your circumstances) is simply to do X (in those<br />

circumstances). And in doing X, you’re acting as if X is what to do (in your circumstances).<br />

We take this equivalence to be quite resilient; in particular, it holds under<br />

operators like ‘ThingToDo’. So, adding that operator to the previous equivalence, we<br />

get another equivalence.<br />

ThingToDo(Act(X_Agent, ThingToDo(X))) ↔ ThingToDo(X)<br />

If we substitute ThingToDo(X) for p in The Action Rule, we get this.<br />

Assert(X_Agent, ThingToDo(X)) → ThingToDo(Act(X_Agent, ThingToDo(X)))<br />

But by the equivalence we derived earlier, that’s equivalent to the following.<br />

Assert(X_Agent, ThingToDo(X)) → ThingToDo(X)<br />

So, we get the nice result that The Action Rule is trivially satisfied for any true claim<br />

about what is to be done. That is, for the special case where p is X is the thing for<br />

you to do, The Action Rule just reduces to something like the Truth Rule. And so we<br />

get a nice explanation of why the Prime Minister and Raj can properly make their<br />

assertions about what to do in their respective circumstances. 9<br />
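As a sanity check on the substitution step in this derivation, the reasoning can be mimicked with a toy term-rewriting sketch. The encoding below is ours and purely illustrative: formulas are nested tuples, and the equivalence Act(X_Agent, ThingToDo(X)) ↔ X becomes a rewrite rule.<br />

```python
# Toy symbolic check of the derivation (our own illustrative encoding).
# Formulas are nested tuples: ("Act", agent, phi), ("ThingToDo", phi), or atoms.

X = ("triple", "go to war", "PM", "now")  # hypothetical action/agent/circumstance triple
AGENT = "PM"

def simplify(phi):
    """Rewrite using the equivalence Act(X_Agent, ThingToDo(X)) <-> X, recursively."""
    if isinstance(phi, tuple) and phi[0] == "Act":
        _, agent, body = phi
        body = simplify(body)
        if body == ("ThingToDo", X) and agent == AGENT:
            return X  # acting as if X is what to do just is doing X
        return ("Act", agent, body)
    if isinstance(phi, tuple) and phi[0] == "ThingToDo":
        return ("ThingToDo", simplify(phi[1]))
    return phi

# Consequent of The Action Rule with p = ThingToDo(X):
consequent = ("ThingToDo", ("Act", AGENT, ("ThingToDo", X)))
print(simplify(consequent) == ("ThingToDo", X))  # True
```

Simplifying the consequent ThingToDo(Act(X_Agent, ThingToDo(X))) yields ThingToDo(X), which is exactly the reduction to a Truth-Rule-like principle for claims about what to do.<br />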

To explain the other side of the asymmetry with which we began this section,<br />

note that these biconditionals do not hold where p is an arbitrary proposition, and S<br />

an arbitrary agent.<br />

ThingToDo(Act(S, p)) ↔ p<br />

Act(S, ThingToDo(Act(S, p))) ↔ p<br />

To see this, let p be the proposition expressed by (3). To act as if this is true is to, inter<br />

alia, not buy flood insurance. If there won’t be a flood, buying flood insurance is<br />

throwing away money, and when you’re running a business, throwing away money<br />

isn’t the thing to do. In symbols, Act(Raj and Nik, p) is equivalent to Raj and Nik<br />

don’t buy flood insurance. But not buying flood insurance is not the thing to do. The<br />

prudent plan is to buy flood insurance. So, ThingToDo(Act(Raj and Nik, p)) is false,<br />

even though p is true. So, the first biconditional fails. Since Raj and Nik do go on to<br />

buy flood insurance, i.e., since they don’t act as if ThingToDo(Act(Raj and Nik, p)), the<br />

left-hand-side of the second biconditional is also false. But again, the right-hand-side<br />

is true. So, that biconditional is false as well. And without those biconditionals, The<br />

Action Rule doesn’t collapse into Assert(S, p) → p.<br />

9 The derivation here is deliberately simplified in one way. We haven’t included anything about the bases<br />

for action or assertion. We don’t think being sensitive to bases in the formalism would make a material<br />

change, but it would obscure the structure of the argument.



We have thus far argued that The Action Rule can provide an explanation for the<br />

asymmetry we noted at the beginning of this section. 10 This is not, however, meant<br />

to be anything like a complete defence of that rule. That would require a lot more<br />

than we’ve provided here. But we do think that the Action Rule can explain a lot<br />

of the phenomena that are meant to motivate The Knowledge Rule, as well as some<br />

phenomena The Knowledge Rule struggles with. We’ll close with a discussion of how it explains the two kinds of<br />

cases that we argued that The Evidence Responsiveness Rule handles well.<br />

To see this, consider first ‘lottery propositions’. If you know that the objective<br />

chance of p being true is c, where c is less than 1, it will seem odd in a lot of contexts<br />

to simply assert p. In his arguments for The Knowledge Rule, Williamson makes a<br />

lot of this fact. In particular, he claims that the best explanation for this is that we<br />

can’t know that p on purely probabilistic grounds. This has proven to be one of the<br />

most influential arguments for The Knowledge Rule in the literature.<br />

We suggest that The Action Rule offers a nice alternative explanation<br />

for why it’s often defective to assert lottery propositions. Note first that in a<br />

lot of cases, it isn’t rational for us to act on p when we have only purely probabilistic<br />

evidence for it, especially when acting on p amounts to betting on p at sufficiently<br />

unfavourable odds. This point is something of a staple of the ‘interest-relative invariantism’<br />

literature on knowledge. 11 To take a mundane case, imagine that you’re<br />

cleaning up your desk, and you come across some lottery tickets. Most are for lotteries<br />

that have passed, that you know you lost. One ticket, however, is for a future<br />

lottery, which you know you have very little chance of winning. In such a case, to act<br />

as if the ticket for the future lottery would lose would be to throw it out along with<br />

the other tickets. But that would be irrational, and not at all how we’d act in such a<br />

case. That is to say, in such a case, we don’t (and shouldn’t, rationally speaking) act<br />

as if the ticket for the future lottery will lose, even though we take that outcome to<br />

be highly probable.<br />

If acting as if a lottery proposition is true isn’t the thing to do, then The Action<br />

Rule will say that asserting such a proposition is defective. Therefore, we think that<br />

The Action Rule can capture why in many cases you can’t assert lottery<br />

propositions.<br />

A harder kind of case for The Knowledge Rule concerns what we might call ‘academic<br />

assertions’. This kind of case is discussed in Douven (2006) and in Maitra<br />

10 This explanation makes some interestingly different predictions from the explanation in terms of The<br />

Evidence Responsiveness Rule. Suppose that for relatively trivial decisions, like where to go for a walk on<br />

a nice summer day, one can correctly guess that X is the thing to do. Then the Evidence Responsiveness<br />

Rule would suggest that the truth of claims about where to go for a walk is not sufficient grounds for their<br />

assertability, while the Action Rule would still imply that truth is sufficient grounds for assertability.<br />

We’re not sure that this supposition – that for relatively trivial decisions, one can correctly guess that<br />

X is the thing to do – is coherent, nor what to say about assertability judgments in (imagined) cases where<br />

the supposition holds. So, we’re not sure we can really use this to discriminate between the two proposed<br />

explanations. Nevertheless, it is interesting to note how the explanations come apart. Thanks here to<br />

Susanna Schellenberg.<br />

11 See, for instance, Fantl and McGrath (2002), Hawthorne (2004b), Stanley (2005), and Weatherson<br />

(2005a).



(2010). In academic papers, we typically make assertions that we do not know. We<br />

don’t know that most of the things we’ve said here are true. (Before the last sentence<br />

we’re not sure we knew that any of the things we said were true.) But that’s because<br />

knowledge is a bad standard for academic discourse. Debate and discussion would<br />

atrophy if we had to wait until we had knowledge before we could present a view. So,<br />

it seems that assertion can properly outrun knowledge in academic debate.<br />

Academic assertions raised a problem for The Knowledge Rule because proper<br />

assertion in the context of inquiry can outrun knowledge. But note that action in<br />

such a context can also properly outrun knowledge. It would slow down learning<br />

dramatically if people didn’t engage in various projects that really only make sense if<br />

some hypothesis is true. So, academics will study in archives, conduct experiments,<br />

write papers, etc. etc., and do so on the basis of reasons they no more know than we<br />

know the truth of the speculative claims of this paper. And this is all to the good;<br />

the alternative is vastly inferior to academia as we know it. So, in some<br />

fields, action requires much less than knowledge. Happily, in those fields, assertion<br />

also requires much less than knowledge. Indeed, the shortfalls in the two cases seem<br />

to parallel nicely. And this parallel is neatly captured by The Action Rule.<br />

As we said, none of this is a knockdown case for The Action Rule. Our primary<br />

purpose is to argue against The Knowledge Rule. As long as the Action Rule<br />

is plausible, we have defeated the abductive argument for The Knowledge Rule that<br />

was discussed at the start of this section, and we think we’ve done enough to show<br />

it is plausible. We also hope we’ve made a successful case for moving the study of<br />

assertability away from rules like The Knowledge Rule, and instead have it be more<br />

tightly integrated with our best theories about evidence and action.


No Royal Road to Relativism<br />

Relativism and Monadic Truth is a sustained attack on ‘analytical relativism’, as it has<br />

developed in recent years. The attack focusses on two kinds of arguments. One is<br />

the argument from the behaviour of operators, as developed by David Lewis (1980a)<br />

and David Kaplan (1989b). The other kind of argument takes off from phenomena<br />

concerning speech reports and disagreements. Such arguments play central roles in<br />

arguments by, among others, Andy Egan (2007a), Max Kölbel (2009), Peter Lasersohn<br />

(2005), John MacFarlane (2003b, 2007), Mark Richard (2004) and Tamina Stephenson<br />

(2007b). These arguments also play a role in a paper that I co-authored with Andy<br />

Egan and John Hawthorne (Egan et al., 2005).<br />

As the reader of Relativism and Monadic Truth can tell, John Hawthorne no<br />

longer much likes the arguments of that paper, nor its conclusions. And I think<br />

he’s right to be sceptical of some of the arguments we advanced. The objections that<br />

he and Herman Cappelen raise to arguments for relativism from speech reports and<br />

from disagreement are, I think, telling. But I don’t think those are the best arguments<br />

for relativism. (For what it’s worth, I don’t think they’re even the best arguments in<br />

the paper we co-authored.) The primary purpose of this note will be to say a little about<br />

what some of these better arguments are. The core idea will be that although there is<br />

some data that is consistent with non-relativist theories, the best explanation of this<br />

data is that a kind of relativism is true. In short, we should be looking for inductive,<br />

not deductive, arguments for relativism. I’m going to fill in some details of this<br />

argument, and say a little about how it seemed to slip out of the main storyline of<br />

Relativism and Monadic Truth.<br />

In Chapter 2 of Relativism and Monadic Truth, Cappelen and Hawthorne attempt<br />

to develop diagnostics for when an utterance type S has invariant content. They note<br />

that some relativist arguments presuppose a diagnostic based on speech reports. The<br />

idea behind the presupposed diagnostic is that if we can invariably report an utterance<br />

of S by A by saying A said that S, then S is semantically invariant. And they note that<br />

this diagnostic isn’t particularly reliable.<br />

What they aim to replace it with is a diagnostic based on agreement reports. Cappelen<br />

and Hawthorne are more careful on the details than I’ll be, but the rough idea<br />

is easy enough to understand. The diagnostic says that if whenever A and B utter S,<br />

we can report that by saying A and B agree that S, and the basis for our saying this is<br />

that they made those utterances, then S’s content is invariant. The idea behind the<br />

test is that if there isn’t a single proposition that A and B endorse, then it would be<br />

odd to say that they agree.<br />

I don’t think the diagnostic is particularly plausible. The next couple of paragraphs<br />

won’t come as much of a surprise to the authors of Relativism and Monadic<br />

Truth, since the ideas come from a talk Herman Cappelen did at the Arché Summer<br />

School in July 2009. But they are central enough to the story I’m telling that they<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Analysis<br />

71(1), 2011, 133-43.



are worth including here. The core problem for this agreement based diagnostic is<br />

that sometimes we can report parties as agreeing even though they don’t agree on the<br />

truth value of any proposition. So while (1) has a disambiguation where it is true<br />

only if something like (2) is true, it also has a disambiguation where it is true as long<br />

as something like (3) is true.<br />

(1) Alec, Pierre and Franz agree that they were lucky to be born where they were<br />

actually born.<br />

(2) Alec, Pierre and Franz each consider each of Alec, Pierre and Franz lucky to<br />

be born where they were actually born.<br />

(3) Alec, Pierre and Franz each consider themselves lucky to be born where they<br />

were actually born.<br />

Given that sentences like (1) could mean something like (3), there is little reason to<br />

think that agreement diagnostics will provide us clear evidence of sameness of content.<br />

Indeed, Cappelen and Hawthorne should hope that this diagnostic doesn’t always<br />

work, because the diagnostic seems to entail relativism about epistemic modals.<br />

Imagine that a detective and a psychic are both investigating a murder. They both<br />

conclude that their evidence entails that Tatort did it, and that their evidence is consistent<br />

with Tatort being dead. They are, however, ignorant of each other’s evidence,<br />

and indeed of the fact that the other is working on the investigation. Still, if each<br />

utters (4), it seems we are in a position to endorse (5).<br />

(4) Tatort must be guilty, and might be dead.<br />

(5) The detective and the psychic agree that Tatort must be guilty, and that he<br />

might be dead.<br />

This will be very hard to explain on a contextualist theory of epistemic modals, if we<br />

accept the agreement diagnostic. That’s because there’s no proposition (other than<br />

the proposition that Tatort is guilty) that they agree about. I think this is some evidence<br />

in favour of relativism, but if the contextualist wanted to argue that we should<br />

understand (4) the same way we understand (1) (on its ‘distributive’ disambiguation),<br />

it would be hard to conclusively show they were wrong.<br />

In any case, it is hard to see why we should expect there to be a diagnostic of the<br />

kind Cappelen and Hawthorne are aiming for. Such diagnostics are the exception, not<br />

the rule, in social sciences. There’s no simple diagnostic for whether a particular state<br />

is democratic or not. (Is modern-day Afghanistan a democracy? What about modern-day<br />

Alabama?) Nor is there a simple diagnostic for whether a particular rule is a<br />

law. (Are internal revenue regulations laws?) But political science and jurisprudence<br />

don’t collapse in the absence of such diagnostics. Nor should philosophical semantics<br />

collapse in the absence of a simple test for context-sensitivity.<br />

Indeed, the situation in political science and in jurisprudence is in one sense worse<br />

than it is in semantics. We can state, admittedly in theory-laden terms, what it is for<br />

the content of a sentence type to be context-invariant or context-sensitive. It is much<br />

harder to state, even in theory-laden terms, what it is for a state to be democratic, or<br />

for a rule to be a law. The problem with thinking about the questions I asked in the



previous paragraph isn’t that there’s some hidden piece of evidence we haven’t yet<br />

uncovered. It’s that the concepts do not have clear application conditions, and the<br />

hard cases fall between the clear instances and non-instances of the relevant property.<br />

In semantics we have, to a first approximation, a mere epistemic challenge.<br />

Even if the hunt for a diagnostic for context-sensitivity is bound to be futile, as<br />

I think it is, that doesn’t mean it is harmless. I think the structure of Cappelen and<br />

Hawthorne’s inquiry, which starts by looking for a test and then goes on to apply it,<br />

pushes us towards the wrong kind of argument. The effect of this structure is that we<br />

end up looking for deductive arguments for or against relativism, and the absence of<br />

deductive arguments for relativism is taken to be a big problem for the relativist. But<br />

we should have been looking for inductive arguments. The best case for relativism,<br />

I think, will be a kind of inference to the best explanation. For instance, a relativist<br />

might try to clean up this argument.<br />

(1) Our best theory of mental content is that the contents of beliefs and desires do<br />

not satisfy Simplicity. 1<br />

(2) The role of language is to express thoughts, so if the contents of belief and desire<br />

do not satisfy Simplicity, the contents of sentences and utterances probably<br />

don’t either.<br />

(3) Simplicity is false as a theory of linguistic content.<br />

This argument clearly isn’t valid. That’s by design; it’s meant to be an abductive<br />

argument against Simplicity about linguistic content. And of course both premises<br />

are controversial. There’s one argument for premise 1 in Lewis (1979a), and another<br />

in Perry (1979). Both arguments are controversial. Indeed Cappelen and Hawthorne<br />

spend some time (pages 50 to 54) responding to the Lewisian arguments, though they<br />

spend less time on Perry’s arguments.<br />

I’m not going to try to advance the debate here over whether premise 1 is true<br />

or not. I suspect the solution will turn on much bigger issues than can be covered<br />

in a note of this length. And that’s because I think the judgment about whether<br />

premise 1 is true will turn on quite global features of our best theory of mental content.<br />

For instance, Daniel Nolan (2006) argues that there are certain desires that<br />

we cannot understand on the model Lewis offers. That doesn’t entail that Lewis is<br />

wrong about the nature of belief, but it does make Lewis’s theory of belief look less<br />

attractive. From the other direction, many authors working on the Sleeping Beauty<br />

problem, dating back to the problem’s introduction to the philosophical community<br />

in Elga (2000a), have felt that the problem was best approached in Lewis’s Simplicity-unfriendly<br />

framework. That doesn’t entail Simplicity is wrong, but it is I think evidence<br />

against it. On the other hand, Robert Stalnaker (2008b) has recently argued<br />

that this is not the best framework for thinking about the Sleeping Beauty problem,<br />

and I’ve argued (Weatherson, forthcoming) that Stalnaker’s approach lets us see<br />

1 Simplicity is Cappelen and Hawthorne’s name for the conjunction of theses they want to defend<br />

against the relativist. For our purposes, Simplicity about mental content is the view that the contents<br />

of beliefs and desires are propositions, and these propositions are simply true or simply false, not merely<br />

true or false relative to some or other parameter. Simplicity about linguistic content is the view that these<br />

same propositions, the ones that are simply true or false, are the contents of declarative utterances.



things about the Sleeping Beauty puzzle that are hidden on the standard, Lewisian,<br />

approach. So if we’re going to evaluate this kind of argument for relativism, the issues<br />

are going to get far removed from familiar disputes about distributions of words<br />

and phrases. That’s not too surprising. In general, the hard thing about abductive<br />

reasoning in philosophy is that we have to start looking at all sorts of different kinds<br />

of evidence. But that’s no reason to think that the most telling arguments won’t, at<br />

the end of the day, be abductive arguments.<br />

A quite different kind of argument comes from thinking about property ascription<br />

and ignorance. It’s a somewhat frequent occurrence that modern science discovers<br />

that some of our thoughts seem to depend for their truth on more variables than<br />

we realised. So it isn’t true that two accelerating objects simply have the same mass<br />

or different masses; rather, how their masses compare might differ relative to different<br />

inertial frames. Or two colour patches might not be simply the same colour or<br />

simply different colours. If the colours are metamers (relative to human vision) then<br />

they will be in a good sense the same colour relative to human vision, and different<br />

colours relative to more discriminating detectors. Such cases raise challenges for the<br />

project of interpreting a language.<br />

Assume that the community uses terms like ‘mass’. Indeed, assume they are sophisticated<br />

enough to distinguish mass from weight, for they know that weight is<br />

relative to a gravitational field, and gravitational fields vary in strength. But they are<br />

not sophisticated enough to know that masses are relative to inertial frames. The<br />

members of this community frequently go around saying things like “Those two objects<br />

have the same mass,” referring to a and b. Call that sentence M. We assume that<br />

the members are in a particular inertial frame, call it F. Let’s assume (just for a few<br />

paragraphs) that the propositions that satisfy Simplicity are structured, and assume<br />

that we can represent the relation has the same mass as by a somewhat unstructured<br />

relation SameMass. (In other words, ignore whatever internal structure SameMass<br />

has, since it won’t be relevant to this example.) Then it seems to me that there are<br />

three live options around.<br />

(1) By M, the speakers express the pseudo-proposition SameMass(a, b), and this<br />

pseudo-proposition is not capable of being true or false, since SameMass is a<br />

three-place relation (between two objects and an inertial frame) and only two<br />

places are specified.<br />

(2) By M, the speakers express the proposition SameMass(a, b, F), and this proposition<br />

is (capable of being) true.<br />

(3) By M, the speakers express the proposition SameMass(a, b), and this is (capable<br />

of being) true relative to F, although it might be false relative to some other<br />

inertial frame F′.<br />

If option 3 is correct, then it seems Simplicity fails. 2 So if there are compelling arguments<br />

against options 1 and 2, and those are all the options, then Simplicity is in<br />

2 I say ‘seems’ since I’m not sure exactly what it takes for there to be a notion of truth simpliciter. The<br />

argument on page 96 against the conjunction of Simplicity, Eternalism and Temporalism suggests that<br />

Cappelen and Hawthorne believe the following principle: If p is true in C1, and false in C2, and C1 and<br />

C2 both exist, then p is not either simply true or simply false. It’s not obvious to me why p couldn’t, in



trouble. And it seems the relativist might make progress by pushing back against<br />

both of those options.<br />

The simplest argument against option 1 is that it violates even a very weak form<br />

of the Principle of Charity. Obviously there are very many different kinds of charity<br />

principles. For instance, there are three different versions endorsed in Davidson<br />

(1970), Lewis (1974a) and (Williamson, 2007, Ch. 8). But any kind of Charity will<br />

imply that options 2 or 3 are preferable to option 1, since option 1 will imply that<br />

the subjects don’t even have beliefs about the relative masses of objects, whereas the<br />

other options will imply that their beliefs may well be true, and rational, and even in<br />

some cases amount to knowledge. An alternative argument against option 1 is that<br />

the members of that community would have been right to take it as a Moorean fact<br />

that some things have the same mass. 3 So option 1 doesn’t look overly plausible.<br />

One argument against option 2 is that it is impossible for the members of the<br />

community, given their powers of individuation, to make singular reference to such<br />

a thing as an inertial frame. If they don’t know what an inertial frame is, then we<br />

might be sceptical of claims that they can refer to it. (Note that the thought here<br />

isn’t merely that some individuals don’t know what inertial frames are; the imagined<br />

case is that even experts don’t know about the kind of things that we would need<br />

to put into the propositions to give them simple truth values.) Another argument<br />

is that competent speakers of the language should be able to identify the number of<br />

argument places in the properties they use.<br />

Neither of the arguments just offered is completely compelling, though I think<br />

both are at least promising avenues for research. But both arguments do look notably<br />

weaker if we drop the assumption that the relevant propositions are structured. In<br />

an unstructured propositions framework, we perhaps don’t need to worry about the<br />

members making singular reference to things like inertial frames. We just need to<br />

have the speakers pick out (in a perhaps imperfect way) the worlds in which their<br />

beliefs are true. And in an unstructured propositions framework it isn’t clear that<br />

being unable to identify the number of argument places in the properties they use<br />

is any more of a sign of linguistic incompetence than not knowing the individuals to<br />

which they refer. But it is a commonplace of semantic externalism that speakers can<br />

refer without knowing who it is they are referring to.<br />

The arguments in the previous two paragraphs have been sketchy, to say the least.<br />

But if they can be developed into compelling arguments, then it might turn out that<br />

the case against option 2 succeeds iff propositions are structured. In that case the argument<br />

for Simplicity will turn on a very large question about the nature of propositions,<br />

namely whether they are structured or not. Again, the take home lesson is<br />

that debates in this area are not susceptible to easy resolution.<br />

I’ll end with a more narrowly linguistic abductive argument for relativism and<br />

against Simplicity. I think you can find the core ingredients of this argument in Egan<br />

et al. (2005), though it isn’t as well individuated as it might have been. The argument<br />

takes off from what looks like a somewhat misleading claim in Cappelen and<br />

C1, be simply true, but I take it Cappelen and Hawthorne are using ‘simply’ in such a way as to exclude<br />

that. So option 3 is inconsistent with Simplicity.<br />

3 Compare the discussion of Moorean facts in (Lewis, 1994a, 489).



Hawthorne’s book. The context is a discussion of autocentric and exocentric uses<br />

of predicates. 4 The distinction between autocentric and exocentric uses is important<br />

for thinking about the way various predicates are used, though it isn’t easy to<br />

give a theory-neutral characterisation of it. Assuming contextualism, Cappelen and<br />

Hawthorne note that it is easy to explain the distinction: “a use of a taste predicate is<br />

autocentric iff its truth conditions are given by a completion that indexes the predicate<br />

to the subject” and exocentric iff “its truth conditions are given by a completion<br />

that indexes it to a person or group other than the speaker, which may, however, include<br />

the speaker.” (104) The core idea here is clear enough, I hope, though as they<br />

say it requires a slightly different gloss if we assume relativism. 5<br />

The problem is what they go on to say about epistemic modals. In a footnote<br />

they say,<br />

[I]t is worth noting that there is a similar contrast between autocentric<br />

and exocentric uses of epistemic modals. If I see Sally hiding on a bus<br />

then I might in a suitable context say ‘She is hiding because I might be<br />

on the bus’ even though I know perfectly well that I am not on the bus.<br />

(‘Must’ is harder to use exocentrically, though we shall not undertake to<br />

explain this here.) (104n7)<br />

The parenthetical remark seems mistaken, or at least misleading, and for an important<br />

reason. It is true that it is very hard to use ‘must’ exocentrically in a sentence of<br />

the form a must be F. But that doesn’t mean that it is hard to use ‘must’ exocentrically.<br />

In fact it’s very easy. Almost any sentence of the form S believes that a must be F<br />

will have an exocentric use of ‘must’. That’s because almost any use of an epistemic<br />

modal in the scope of a propositional attitude report will be ‘bound’ to the subject of<br />

that report. (I put ‘bound’ in scare quotes because although contextualists will think<br />

of this as literally a case of binding, non-contextualists may think something else is<br />

going on.) This suggests an argument for relativism about epistemic modals, one that<br />

seems to me to be quite a bit stronger than the arguments for relativism discussed in<br />

Relativism and Monadic Truth.<br />

(1) Unembedded uses of epistemic modals are generally autocentric (except in the<br />

context of explanations, like ‘because I might be on the bus’).<br />

(2) Epistemic modals embedded in propositional attitude reports are generally exocentric.<br />

(3) There is a good, simple relativist explanation of these two facts.<br />

(4) There is no good, simple explanation of these facts consistent with Simplicity.<br />

(5) So, relativism is true, and Simplicity is false.<br />

4 The terminology is from Lasersohn (2005).<br />

5 As they also go on to note, things get very complicated in cases where the truth conditions turn on<br />

the nature of an idealised version of the speaker. An example of such a theory is the theory of value in<br />

Lewis (1989b). One of the key points that Cappelen and Hawthorne make, and I think it is a very good<br />

point against a lot of claims for relativist theories concerning predicates of personal taste, is that this kind<br />

of case is very common when it comes to evaluative language.



Note that I’m not for a minute suggesting that there is no Simplicity-friendly explanation<br />

of the facts to be had; just that it won’t be a very good explanation. Nor am<br />

I suggesting that the phenomena obtain universally, rather than just in most cases.<br />

But they obtain often enough to need explanation, and the best explanation will be<br />

relativist. And that, I think, is a reason to like relativism.<br />

The simplest relativist explanation of premises 1 and 2 uses the idea, derived from<br />

Lewis (1979a), that contents are λ-abstracts. So the content of a must be F is roughly<br />

λx.(x’s evidence entails that a is F). A content λx.φ(x) is true relative to a person<br />

iff they are φ, and believed by a person iff they consider themselves to be φ, under a<br />

distinctively first-personal mode of presentation. Then a typical utterance of a must<br />

be F will be autocentric because if the asserter thinks it is true, they must take themselves<br />

to satisfy λx.(x’s evidence entails that a is F). So assuming they are speaking<br />

truly, the hearer can infer that the speaker’s evidence does indeed entail that a is F.<br />

But a typical utterance of S believes that a must be F will be true just in case S takes<br />

themselves to satisfy λx.(x’s evidence entails that a is F), and hence will be true as<br />

long as S’s evidence, or at least what S takes to be their evidence, entails that a is F.<br />

There’s no reference there to the speaker’s evidence, so the use of ‘must’ is exocentric.<br />
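
The explanation just given can be put in schematic form. What follows is only a rough formalisation of the above, with E(x) used as ad hoc shorthand for x’s evidence; the notation is introduced here for illustration and is not drawn from Lewis or from Relativism and Monadic Truth.<br />

```latex
% Content of "a must be F", on the lambda-abstract view (shorthand only):
\[
  \textsf{Content}(\text{$a$ must be $F$}) \;=\; \lambda x.\, \bigl(E(x) \vDash Fa\bigr)
\]
% Unembedded, hence autocentric: an utterance by speaker s is true iff
\[
  E(s) \vDash Fa
\]
% Embedded, hence exocentric: "S believes that a must be F" is true iff
% S self-ascribes the abstract, i.e. (roughly) iff
\[
  E(S) \vDash Fa
\]
```

The point of the sketch is that the same content, evaluated relative to different subjects, delivers the autocentric reading unembedded and the exocentric reading under the attitude verb.<br />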

There are, to be sure, many details of this explanation that could use filling in, but<br />

what is clear is that there is a natural path from the view that contents are λ-abstracts<br />

to the data to be explained. And that explanation is inconsistent with Simplicity.<br />

Is there a good, simple Simplicity-friendly explanation of the data around? I suspect<br />

there is not. There are two obvious places to look for a Simplicity-friendly explanation.<br />

We could look for an explanation that turns on the meaning of ‘must’, or<br />

we could look for an explanation in terms of salience. On closer inspection, neither<br />

avenue is particularly promising.<br />

It does seem likely that there is an available explanation of premise 1 in terms of<br />

the meaning of ‘must’. The contextualist about pronouns has an easy explanation of<br />

why ‘we’ almost always picks out a group that includes the speaker. The explanation<br />

is just that it is part of the meaning of ‘we’ that it is a first-personal plural pronoun,<br />

so it is part of the meaning of ‘we’ that the group it picks out includes the speaker.<br />

We could argue that something similar goes on for ‘must’. So a must be F means,<br />

roughly, that x’s evidence entails that a is F, and it is part of the meaning of ‘must’<br />

that x either is the speaker, or is a group that includes the speaker. The problem with<br />

this explanation is that it won’t extend to premise 2. And that’s because meanings (in<br />

the relevant sense) don’t change when we move into embedded contexts. For example<br />

if Jones says “Smith thinks that we will all get worse grades than she will get”, Jones<br />

isn’t accusing Smith of having the inconsistent belief that she will get lower grades<br />

than what she gets. Rather, the reference of ‘we’ is still a group that includes the<br />

speaker, not the subject of the propositional attitude report. On this model, you’d<br />

expect the truth condition of S believes that a must be F to be that S believes that x’s<br />

evidence entails that a is F, where x is the speaker, or a group containing the speaker.<br />

But that’s typically not at all what it means. So this kind of explanation fails.<br />

The problem for salience based explanations of premises 1 and 2 is that salience is<br />

too fragile an explanatory base to explain the data. Let’s say that in general we think<br />

a must be F means, roughly, that x’s evidence entails that a is F, and x is generally<br />



the most salient knower in the context. Then we’d expect that it would not be too<br />

hard to read (6) in such a way that its truth condition is (6a) rather than (6b), and (7)<br />

in such a way that its truth condition is (7a) rather than (7b).<br />

(6) Jones’s evidence must settle who the killer is.<br />

(a) Jones’s evidence entails that Jones’s evidence settles who the killer is.<br />

(b) Our evidence entails that Jones’s evidence settles who the killer is.<br />

(7) Smith believes that Jones’s evidence must settle who the killer is.<br />

(a) Smith believes that Jones’s evidence entails that Jones’s evidence settles<br />

who the killer is.<br />

(b) Smith believes that her evidence entails that Jones’s evidence settles who<br />

the killer is.<br />

After all, (6) and (7) make Jones’s evidence really salient. That evidence settles who<br />

the killer is! But, it seems, that isn’t salient enough to make (6a) or (7a) the preferred<br />

interpretation. That seems to be bad news for a salience-based explanation of the way<br />

we interpret epistemic modals.<br />

Like all abductive arguments, this argument is far from conclusive. One way for<br />

a proponent of Simplicity to respond to it would be to come up with a neater explanation<br />

of premises 1 and 2 in our abductive argument, without giving up Simplicity.<br />

Another way would be to argue that although there is no nice Simplicity-friendly explanation<br />

of the data, the costs of relativism are so high that we should shun the relativist<br />

explanation on independent grounds. I don’t pretend to have ready responses<br />

to either of these moves. All I want to stress is that these abductive arguments are<br />

generally stronger arguments for relativism than the arguments that are, correctly,<br />

dismissed in Relativism and Monadic Truth. Those arguments try to take a quick path<br />

to relativism, claiming that some data about reports, or disagreement, or syntax, entails<br />

relativism. I doubt any such argument works, in part because of the objections<br />

that Cappelen and Hawthorne raise. There is, as my title says, no royal road to relativism.<br />

But I doubt there’s a quick road away from relativism either. If the relativist<br />

can explain with ease patterns that perplex the contextualist, we have good reason to<br />

believe that relativism is in fact true.


Epistemic Modals and Epistemic Modality<br />

Brian Weatherson and Andy Egan<br />

1 Epistemic Possibility and Other Types of Possibility<br />

There is a lot that we don’t know. That means that there are a lot of possibilities that<br />

are, epistemically speaking, open. For instance, we don’t know whether it rained in<br />

Seattle yesterday. So, for us at least, there is an epistemic possibility where it rained<br />

in Seattle yesterday, and one where it did not. It’s tempting to give a very simple<br />

analysis of epistemic possibility:<br />

• A possibility is an epistemic possibility if we do not know that it does not<br />

obtain.<br />

But this is problematic for a few reasons. One issue, one that we’ll come back to,<br />

concerns the first two words. The analysis appears to quantify over possibilities. But<br />

what are they? As we said, that will become a large issue pretty soon, so let’s set it<br />

aside for now. A more immediate problem is that it isn’t clear what it is to have de<br />

re attitudes towards possibilities, such that we know a particular possibility does or<br />

doesn’t obtain. Let’s try rephrasing our analysis so that it avoids this complication.<br />

• A possibility is an epistemic possibility if for every p such that p is true in that<br />

possibility, we do not know that p is false.<br />

If we identify possibilities with metaphysical possibilities, this seems to rule out too<br />

much. Let p be any contingent claim whose truth value we don’t know. We do<br />

know, since it follows from the meaning of actually, that p iff actually p is true. But<br />

that biconditional isn’t true in any world where p’s truth value differs from its actual<br />

truth value. So the only epistemic possibilities are ones where p’s truth value is the<br />

same as it actually is. But p was arbitrary in this argument, so the only epistemic<br />

possibilities are ones where every proposition has the same truth value as it actually<br />

does. This seems to leave us with too few epistemic possibilities!<br />
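The collapse argument above can be run in a toy model. The following sketch is our own illustration (the world set, atoms and function names are not from the literature): worlds assign truth values to two atomic claims, ‘actually p’ is rigidly fixed by the actual world, and treating each known biconditional ‘p iff actually p’ as a constraint leaves only the actual world as an epistemic possibility.<br />

```python
# Toy model of the collapse argument (illustrative names, not from the paper).
from itertools import product

ATOMS = ["p", "q"]  # contingent claims whose truth values we don't know

# A world assigns a truth value to each atomic claim.
worlds = [dict(zip(ATOMS, vals)) for vals in product([True, False], repeat=len(ATOMS))]
actual = worlds[0]  # one of them is actual

def known_biconditional_holds(world, atom):
    # "atom iff actually atom": 'actually atom' is fixed by the actual world,
    # so the biconditional is true at a world iff it agrees with actuality.
    return world[atom] == actual[atom]

# Since each biconditional is known, a world survives as an "epistemic
# possibility" only if every biconditional holds there.
epistemic = [w for w in worlds
             if all(known_biconditional_holds(w, a) for a in ATOMS)]

print(epistemic)  # only the actual world is left
```

With two atoms there are four worlds, and the filter discards every world that differs from actuality on either atom.<br />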

A natural solution is to drop the equation of possibilities here with metaphysical<br />

possibilities. We’ve motivated this by using a proposition that is easy to know to<br />

be true, though it isn’t true in many metaphysical possibilities. There are many<br />

problems from the other direction; that is, there are many cases where we want to say<br />

that there is a certain kind of epistemic possibility, even though there is no matching<br />

metaphysical possibility. We’ll go through five such examples.<br />

† Penultimate draft only. Please cite published version if possible. Final version forthcoming in Andy<br />

Egan and Brian Weatherson (eds.), Epistemic Modality, Oxford University Press. This paper is the introduction<br />

to the volume, so it discusses a lot of the papers in the volume, as well as making some observations<br />

about the state of the literature.



First, there are necessary a posteriori claims that arise from the nature of natural<br />

kinds. The standard example here is Water is atomic. That couldn’t be true; necessarily,<br />

anything atomic is not water. But until relatively recently, it was an epistemic<br />

possibility.<br />

Second, there are claims arising from true, and hence metaphysically necessary,<br />

identity and non-identity statements. A simple example here is Hesperus is not Phosphorus.<br />

This could not be true; by necessity, these celestial bodies are identical. But<br />

it was an epistemic possibility.<br />

Third, there are claims about location. It isn’t quite clear what proposition one<br />

expresses by saying It’s five o’clock, but, plausibly, the speaker is saying of a particular<br />

time that that very time is five o’clock. It’s plausible that if that’s true, it’s true as a<br />

matter of necessity. (Could this very time have occurred earlier or later? It doesn’t<br />

seem like it could have.) So a false claim about what time it is will be necessarily false.<br />

But often there will be a lot of epistemic possibilities concerning what time it is.<br />

Temporal location raises further matters beyond the necessary a posteriori. We<br />

want there to be epistemic possibilities in which it is four o’clock, five o’clock and so<br />

on. But it isn’t altogether clear whether claims like that can be true in metaphysical<br />

possibilities. If we identify a metaphysical possibility with a possible world, then it<br />

isn’t clear what would make it the case that it is four o’clock in a possible world.<br />

(What time is it in this possible world?) This might suggest that different kinds<br />

of facts obtain at a metaphysical possibility than at an epistemic possibility.<br />

Fourth, there are issues about mathematics. Actually, there are two kinds of puzzle<br />

cases here. One concerns propositions that are logical consequences of our mathematical<br />

beliefs, but which we haven’t figured out yet. Twenty years ago, it certainly<br />

seemed to be an epistemic possibility that the equation a^n + b^n = c^n had positive integer<br />

solutions with n > 2. Now we know that there are no such solutions. Moreover,<br />

if mathematics is necessarily true, then there isn’t even a metaphysical possibility in<br />

which there are such solutions. So we shouldn’t think that there was some metaphysical<br />

possibility that twenty years ago we hadn’t ruled out. Rather, we were just<br />

unsure what metaphysical possibilities there are.<br />
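To see why instance-checking could never have closed this possibility, consider a brute-force search over small integers (our illustration, not the authors’): for n = 2 solutions appear immediately, while for n > 2 an empty finite search proves nothing; it took Wiles’s proof to rule the possibility out.<br />

```python
def solutions(n, bound):
    """All positive-integer triples (a, b, c) up to bound with a^n + b^n = c^n."""
    return [(a, b, c)
            for a in range(1, bound + 1)
            for b in range(a, bound + 1)   # b >= a avoids duplicate pairs
            for c in range(1, bound + 1)
            if a ** n + b ** n == c ** n]

print(solutions(2, 20))  # the n = 2 case has solutions, e.g. (3, 4, 5)
print(solutions(3, 50))  # prints []; no solutions in range, though that alone proves nothing
```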

Finally, there are issues about logic. (Some views on the nature of logic and mathematics<br />

will deny that our fourth and fifth categories are different.) Getting the metaphysics<br />

of taste right is hard. One option that we think at least can’t be ruled out is<br />

that intuitionist logic is the correct logic of taste talk. That is, when it comes to taste,<br />

we don’t even know that it’s true that everything either is or is not tasty. But that<br />

doesn’t mean we’re committed to the existence of a possibility where it isn’t true that<br />

everything is tasty or not tasty; if such a state isn’t actual, it probably isn’t possible.<br />

The liar paradox is even harder than the metaphysics of taste. Anything should be<br />

on the table, even the dialetheist view that the Liar is both true and false. That is, the<br />

Liar might be true and false. In saying that, we certainly don’t mean to commit to<br />

the existence of some possibility where the Liar is true and false. We’re pretty sure<br />

(but not quite certain!) that no such possibility exists.<br />

The last two cases might be dealt with by being more careful about what an epistemic<br />

possibility is. There are quite simple cases in which we want to resist the identification<br />

of epistemic possibilities with what we don’t know to be the case. For



discussion of several such cases, see Hacking (1967), Teller (1972) and DeRose (1991).<br />

If we could very easily come to know that p does not obtain, perhaps because p is<br />

clearly ruled out by things we do know, then intuitively it isn’t the case that p is<br />

epistemically possible. If we know that if q then not p, and we know q, then p is<br />

not possible, even if we haven’t put conditional and antecedent together to conclude<br />

that p is false. So we need to put some constraints on the epistemically possible beyond<br />

what we know to be false. Perhaps those constraints will go so far as to rule out<br />

anything inconsistent with what we know. In that case, it wasn’t possible all along<br />

that Fermat’s Last Theorem was false. And, assuming the non-classical approaches to<br />

taste and alethic paradoxes are incorrect, those approaches aren’t even possibly correct.<br />

We’re not endorsing this position, just noting that it is a way to rescue the idea<br />

that all epistemic possibilities are metaphysical possibilities.<br />
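The constraint being floated here, that epistemic possibility is closed under easy deductions from what is known, can be sketched as follows. This is our own toy encoding (the tuple representation of conditionals and the function names are assumptions), closing the known claims under one step of modus ponens, as in the ‘if q then not p, and we know q’ case above.<br />

```python
# Toy constraint: p is epistemically possible only if it isn't ruled out by
# what's known together with easy (one-step) deductions from it.
def easy_consequences(known):
    """Known claims plus anything reachable by one modus ponens step.
    Conditionals are encoded as ('if', antecedent, consequent) tuples."""
    derived = set(known)
    for claim in known:
        if isinstance(claim, tuple) and claim[0] == "if" and claim[1] in known:
            derived.add(claim[2])
    return derived

def epistemically_possible(p, known):
    # p is ruled out if its negation ('not', p) is an easy consequence.
    return ("not", p) not in easy_consequences(known)

known = {"q", ("if", "q", ("not", "p"))}
print(epistemically_possible("p", known))   # False: ruled out by an easy deduction
print(epistemically_possible("r", known))   # True: nothing known bears on r
```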

The papers in this volume that most directly address these issues are by Frank<br />

Jackson, David Chalmers and Robert Stalnaker. Jackson argues against the view that<br />

accounting for epistemic possibilities requires us to think that there is a kind of possibility,<br />

conceptual possibility, that is broader than metaphysical possibility. He briefly<br />

reviews the reasons some people have had for taking this position, including those<br />

we’ve just discussed, and some of the reasons he rejected it in “From Metaphysics to<br />

Ethics”. But he adds some new arguments as well against this position, what he calls<br />

the ‘two space’ view of possibility. One argument says that if there is a possibility of<br />

any kind where water is not H₂O, then being water and being H₂O must be different<br />

properties by Leibniz’s Law. But then we have an implausible necessary connection<br />

between distinct properties. Another argument turns on the difficulty of identifying<br />

the water in these supposed conceptual possibilities that are not metaphysically<br />

possible.<br />

David Chalmers discusses what kind of thing epistemic possibilities, or as he calls<br />

them, ‘scenarios’, might be. He discusses the strengths, weaknesses and intricacies of<br />

two proposals: what he calls the ‘metaphysical’ and ‘epistemic’ constructions. The<br />

metaphysical construction is fairly familiar: it takes epistemic possibilities to be centered<br />

possible worlds. The epistemic construction takes epistemic possibilities to be<br />

maximally possible sentences of a specially constructed language. The metaphysical<br />

construction requires several assumptions before it matches up with the intuitive notion<br />

of epistemic possibility, while the epistemic construction requires a primitive<br />

notion of epistemic possibility. But both constructions seem to illuminate the elusive<br />

notion of an epistemic possibility. Chalmers ends with a discussion of several<br />

applications of his constructions in semantics, in formal epistemology and in moral<br />

psychology.<br />

Another place where one finds an important role for a distinctively epistemic<br />

(or at least doxastic) sort of possibility is in theorizing about indicative conditionals.<br />

In Robert Stalnaker’s contribution, he examines two types of accounts of indicative<br />

conditionals, which differ in where they locate the conditionality. One view analyzes<br />

assertions of indicative conditionals as a special sort of conditional assertion,<br />

and another analyzes them as an ordinary assertion of a special sort of conditional<br />

proposition. Stalnaker argues that the two views are not so different as we might<br />

initially have thought.



2 Three Approaches to Epistemic Modals<br />

Even when we settle the issue of what epistemic possibilities are, we are left with<br />

many issues about how to talk about them. Speakers will often say that something<br />

is (epistemically) possible, or that it might be true. (It’s plausible that claims that<br />

p must be true, or that p is probable, are closely related to these, but we’ll stick to<br />

claims about (epistemic) possibility at least for this introduction.) It’s plausible to<br />

think that a proposition isn’t possible or impossible simpliciter, it’s rather that it is<br />

possible or impossible relative to some person, some group, some evidence or some<br />

information. Yet statements of epistemic possibility in plain English do not make<br />

any explicit reference to such a person, group, evidence set or information state. One<br />

of the key issues confronting a semanticist attempting to theorise about epistemic<br />

modals is what to do about this lack of a reference. We’ll look at three quite different<br />

approaches for dealing with this lack: contextualist, relativist and expressivist.<br />

2.1 Contextualism<br />

Consider a particular utterance, call it u, made by speaker s, of the form a might be F,<br />

where the might here is intuitively understood as being epistemic in character. To a<br />

first approximation, the sentence is saying a’s being F is consistent with, or not ruled<br />

out by, a certain body of knowledge. But whose body of knowledge? Not God’s,<br />

presumably, for then a might be F would be true iff a is F is true, and that’s implausible.<br />

The contextualist answer is that the relevant body of knowledge is supplied by<br />

context.<br />

When discussing the ways in which context fills in content, some writers will<br />

start with the pronoun I as an example. And to some extent it’s a useful example.<br />

The sentence I am a fool doesn’t have truth-conditional content outside of a context<br />

of utterance. But any utterance of that sentence does express something truth-conditional.<br />

Which truth-conditional content it expresses depends on facts about the<br />

context of its use. In fact, it is dependent on just one fact, namely who utters it. So<br />

when Andy utters I am a fool he expresses the proposition that Andy is a fool. And<br />

when Brian utters I am a fool he expresses the proposition that Brian is a fool.<br />

So far I is a useful example of a context-sensitive expression. But in many ways<br />

it is an unusual example of context-sensitivity, and focussing too much on it can lead<br />

to an overly simplistic view of how context-sensitive terms work. In particular, I has<br />

three properties that are unusual for a context-sensitive expression.<br />

• Its content in a context is computable from the context by a simple algorithm:<br />

the content is the speaker.<br />

• Its content does not depend on any properties of the intended audience of the<br />

utterance.<br />

• It behaves exactly the same way in embedded and unembedded contexts.<br />

Some terms have none of these properties. Consider, for example, we.<br />

There isn’t any obvious algorithm for computing the content of a particular use<br />

of we. The content may depend on the intentions of the speaker. It may depend on<br />

which people have been talked about. In sentences of the form We are F, different



values of F might constrain what values can be rationally assigned to we. And when<br />

that is so, the interpretation of we will (usually) be constrained to those groups.<br />

Perhaps most notably, it depends a lot on the audience. If S is talking to H, and<br />

says We should grab some lunch, the content is that S and H should grab some lunch.<br />

And that’s the content because H is the intended audience of the utterance. Intended<br />

audiences can change quickly. If Andy says We will finish the paper this afternoon,<br />

then we will go for a walk, talking to Brian when he utters the first conjunct, and Fido<br />

when he utters the second, the content is that Andy and <strong>Brian</strong> will finish the paper<br />

this afternoon, then Andy and Fido will go for a walk.<br />

That we has neither of the first two properties is uncontroversial. What is perhaps<br />

a little more controversial is that it does not have the third either. When we<br />

is in an unembedded context it (usually) behaves like a free (plural) variable. Under<br />

certain embeddings, it can behave like a bound variable. Barbara Partee and Phillipe<br />

Schlenker offer the following examples.<br />

(5.9) John often comes over for Sunday brunch. Whenever someone else comes over<br />

too, we (all) end up playing trios. (Partee, 1989)<br />

(5.10) Each of my colleagues is so difficult that at some point or other we’ve had an<br />

argument. (Schlenker, 2003)<br />

In neither case does we contribute a group consisting of the speaker plus some salient<br />

individuals. Indeed, in neither case does it so much as contribute a group, since it is<br />

(or at least behaves like) a bound variable. There’s nothing in the contextualist story<br />

about we that prevents this.<br />

It’s worthwhile reviewing these facts about we, because on the most plausible<br />

contextualist stories about might, it too has these three properties. The contextualist<br />

theory we have in mind says that the content of u is For all that group X could<br />

know using methods M, a is F. The group X will usually consist of the speaker and<br />

some salient others, perhaps including the intended audience. The salient methods<br />

might include little more than easy deduction from what is currently known, or may<br />

include some wider kinds of investigation. (See DeRose (1991) for arguments that<br />

the relevant methods include more than deduction, and that they are contextually<br />

variable.)<br />

Now it isn’t part of the contextualist theory that there is an easy method for<br />

determining who is in X, or what methods are in M. So in that respect might is like<br />

we. But, just as the group denoted by we typically includes the intended audience of<br />

the utterance, the group X will typically include the intended audience of u. And<br />

the methods M will typically include any method that can be easily carried out. This<br />

can be used to explain some phenomena about disagreement. So if Andy says, to<br />

Brian, a might be F, and Brian knows that a is not F (or can easily deduce this from<br />

what he knows), Brian can disagree with what Andy says. That is, he can disagree<br />

with the proposition that it is consistent with what members of the conversation<br />

know that a is F. And, the contextualist says, that’s just what Andy did say. If Brian<br />

presents Andy with his grounds for disagreement, Andy might well retract what he<br />

said. Since arguments about disagreeing with utterances like u have been prominent



in the literature, it is worth noting that the contextualist theory can explain at least<br />

some facts about disagreement.<br />

Nor is it part of the contextualist theory that might behaves exactly the same<br />

way in embedded and unembedded contexts. Indeed, like we, might can behave like<br />

a bound variable. On the most natural reading of Every pedestrian fears that they<br />

might be being watched, there is no single group X such that every pedestrian fears<br />

that for all X (could easily) know, that pedestrian is being watched. Rather, every<br />

pedestrian fears that for all they themselves know, they are being watched. The naturalness<br />

of this reading is no embarrassment to the contextualist theory, since it is a<br />

commonplace that terms that usually get their values from context can also, in the<br />

right setting, behave like bound variables.<br />

Indeed, thinking about these parallels between context-sensitive expressions and<br />

epistemic modals seems to provide some support for contextualism. In his contribution<br />

to the volume, Jonathan Schaffer argues that various features of the way epistemic<br />

modals behave in larger sentences support the idea that an evaluator place must<br />

be realised in the syntax. For instance, consider the natural interpretation of “Anytime<br />

you are going for a walk, if it might rain, you should bring an umbrella.” We<br />

interpret that as saying that whenever you go for a walk, you should bring an umbrella<br />

if your evidence at that time is consistent with rain. Schaffer interprets that as<br />

evidence that there is hidden syntactic structure in epistemic modals, and argues that<br />

the contextualist offers the best account of how the hidden structure gets its semantic<br />

values.<br />

So the contextualist has a lot of explanatory resources, and a lot of flexibility in<br />

their theory, which are both clear virtues. But there are some limits to the flexibility.<br />

There are some things that the contextualist, at least as we’re using the term ‘contextualist’,<br />

is committed to. In particular, the contextualist is committed to the content of a particular<br />

speech act (or at least of a particular assertion) being absolute, not assessor-relative.<br />

And they’re committed to the truth value of those contents being the same<br />

relative to any assessor. Let’s give those two commitments names.<br />

(C) The semantic content of an assertion is the same relative to any assessors.<br />

(T) The truth value of the semantic content of an assertion is the same relative to<br />

any assessors.<br />

The first of these rules out the possibility that the semantic content of an assertion<br />

differs with respect to different groups. The second rules out the possibility that semantic<br />

contents have assessor relative truth values. Modern relativists have proposed<br />

theories that dispense with these dogmas, and we’ll investigate those in the next section,<br />

after going over some of the motivations for relativism.



2.2 Relativism<br />

In many fields, relativism is motivated by instances of “faultless disagreement”, and<br />

epistemic modals are not left out of this trend. Here is the kind of case that we used<br />

in Egan et al. (2005) to motivate relativism.<br />

Consider the following kind of case. Holmes and Watson are using a<br />

primitive bug to listen in on Moriarty’s discussions with his underlings<br />

as he struggles to avoid Holmes’s plan to trap him. Moriarty says to his<br />

assistant<br />

(24) Holmes might have gone to Paris to search for me.<br />

Holmes and Watson are sitting in Baker Street listening to this. Watson,<br />

rather inexplicably, says “That’s right” on hearing Moriarty uttering (24).<br />

Holmes is quite perplexed. Surely Watson knows that he is sitting right<br />

here, in Baker Street, which is definitely not in Paris.<br />

Here we have Watson somewhat surprisingly agreeing with Moriarty. In some sense,<br />

it seems wrong for him to have done so. He should have disagreed. Well, imagine<br />

that he did, by saying “That’s not right”. The quick argument for relativism is that<br />

the contextualist cannot make sense of this. Whatever group’s knowledge Moriarty<br />

intended to be talking about when he spoke, it presumably didn’t include Holmes<br />

and Watson; it just included him and his intended audience, i.e. the underlings. And<br />

it’s true that for all they know, Holmes is in Paris. So the content of Moriarty’s<br />

utterance is true. But it seems that Watson can properly disagree with it (and can’t<br />

properly agree with it). That, we thought, was a problem.<br />

There are three kinds of response to this argument on behalf of the contextualist<br />

that we think look promising. All of these responses are discussed in von Fintel and<br />

Gillies (2008). We might look harder at the denotation of the ‘that’ in Watson’s reply,<br />

we might think again about what the relevant group is, and we might look at other<br />

cases where the contextualist story is more promising, as a way of motivating the first<br />

two responses. Let’s look at these in turn.<br />

Above we said that Watson disagreed with Moriarty by saying “That’s not right”.<br />

But that’s potentially reading too much into the data. What seems correct is that<br />

Watson can say “That’s not right”. But that’s only to disagree with Moriarty if the<br />

‘that’ denotes what Moriarty said. And that might not be true. It’s possible that it<br />

picks out, say, the embedded proposition that Holmes has gone to Paris. And it’s fine<br />

for Watson to disagree with that.<br />

Even if Watson is disagreeing with the semantic content of Moriarty’s utterance,<br />

it might be that he’s doing so properly, because what Moriarty said is false. That<br />

might be the case because it might be that, in virtue of hearing the utterance, Watson<br />

became part of the relevant group X . Typically speaker intentions, particularly<br />

singular speaker intentions, are not the final word in determining the content of a<br />

context-sensitive expression. If Brian points over his shoulder, thinking a nice glass<br />

of shiraz is behind him, and says That is tasty, while in fact what he is pointing at is a



vile confection of Vegemite-infused chardonnay, he’s said something false. The simplest<br />

thing to say about a case like this is that Brian intended the denotation of ‘That’<br />

to be the thing he was pointing at, whatever it is. Similarly, Moriarty might have<br />

intended the relevant group X to be whoever heard the utterance at that time, even if<br />

he didn’t know Watson was in that group. (Or it might be that, whatever Moriarty’s<br />

intentions, the semantic rules and conventions for ‘might’ in English determine that<br />

the relevant group includes everybody who heard the utterance at the time.)<br />
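The second contextualist response turns on who ends up in the group X. A minimal sketch of the canonical truth conditions (our illustration; the function and variable names are assumptions, not the authors’ notation) shows how enlarging X to include overhearers like Watson flips the truth value of Moriarty’s utterance.<br />

```python
# Canonical contextualism, toy version: "a might be F" is true in a context
# iff a's being F is not ruled out by what the context's group X knows.
def might_true(group, knows_not_p):
    """knows_not_p: the set of people who know that a is not F."""
    return not any(member in knows_not_p for member in group)

knows_holmes_not_in_paris = {"Holmes", "Watson"}

# Moriarty's intended group: himself and his underlings. None of them can
# rule out Holmes's being in Paris, so the utterance comes out true.
print(might_true({"Moriarty", "underlings"}, knows_holmes_not_in_paris))  # True

# On the second response, everyone who heard the utterance is in X, including
# Watson, who can rule it out, so the utterance comes out false.
print(might_true({"Moriarty", "underlings", "Watson"}, knows_holmes_not_in_paris))  # False
```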

This second response would seem somewhat ad hoc were it not for a class of<br />

examples von Fintel and Gillies describe concerning assessors from radically different<br />

contexts. Typically the anti-contextualist commentary on cases like these suggests that<br />

any hearer who knows that a is not F can disagree with u. But that doesn’t seem to<br />

be true in general.<br />

Or consider the case of Detective Parker. He has been going over some<br />

old transcripts from Al Capone’s court case in the 1920s–Capone is being<br />

asked about where some money is in relation to a particular safe:<br />

(20) a. Capone: The loot might be in the safe.<br />

b. Parker: ??Al was wrong/What Al said is false. The safe was<br />

cracked by Geraldo in the 80s and there was nothing inside.<br />

(2008, 86)<br />

The knowledge of at least some hearers, such as Detective Parker, does not seem<br />

to count for assessing the correctness of Capone’s speech. A contextualist might<br />

suggest that’s because contemporaneous hearers are in the relevant group, and later<br />

reviewers are not.<br />

So there are definitely some contextualism-friendly lines of response available to<br />

the argument for relativism from disagreement. But interestingly, some of these contextualist<br />

responses do not work as well as a response to a similar argument from<br />

agreement. Imagine that Andy, after doing some reading on the publicly available evidence,<br />

correctly concludes that it doesn’t rule out Prince Albert Victor. He doesn’t<br />

think this is very likely, but thinks it is possible. Andy hears someone on TV talking<br />

about the Ripper who says “Prince Albert Victor might have been Jack the Ripper”,<br />

and Andy says “That’s true”. Intuitively Andy is right to agree with the TV presenter,<br />

but this is a little hard to explain on the contextualist theory.<br />

Note that here we can’t say that Andy is agreeing because he is agreeing with<br />

the embedded proposition, namely that Prince Albert Victor was the Ripper. That’s<br />

because he doesn’t agree with that; he thinks it is an open but unlikely possibility.<br />

Nor does it particularly matter that Andy, as one of the people watching the<br />

TV show, is part of the relevant group X. All that would show is that if Andy<br />

knew Prince Albert Victor wasn’t the Ripper, the presenter’s assertion would be false. But<br />

unless Andy is the group X, the fact that Andy’s knowledge, or even what is available<br />

to Andy, does not rule out the Prince does not mean Andy should agree with the<br />

statement. For all Andy knows, someone else watching, perhaps even someone else<br />

the presenter intends to include in her audience, has evidence exculpating the Prince.



If that’s right, then he does not know that the proposition the contextualist says the<br />

speaker asserted is true. And yet he seems justified in agreeing with the presenter. This<br />

seems like a real problem for contextualism.<br />

A quite different objection to contextualism comes from metasemantic considerations.<br />

The most casual reflection on the intuitive content of utterances like u suggests<br />

there is staggeringly little rhyme or reason to which group X or method M might be<br />

relevant. The argument here isn’t that the contextualist’s semantic proposal is mistaken<br />

in some way. Rather, the argument is that the accompanying metasemantic<br />

theory, i.e. the theory of how semantic values get fixed, is intolerably complicated.<br />

Slightly more formally, we can argue as follows.<br />

(1) If contextualism is true, the metasemantic theory of how a particular use of<br />

‘might’ gets its semantic value is hideously complicated.<br />

(2) Metasemantic theories about how context-sensitive terms get their values on<br />

particular occasions are never hideously complicated.<br />

(3) So, contextualism is false.<br />

The problem with this argument, as Michael Glanzberg (2007) has argued, is that<br />

premise 2 seems to be false. There are examples of uncontroversially context-sensitive<br />

terms, like ‘that’, for which the accompanying metasemantic theory is, by any standard,<br />

hideously complicated. So the prospects of getting to relativism from metasemantic<br />

complexity are not, we think, promising.<br />

But there is a different metasemantic motivation for relativism that we think is a<br />

little more promising. Compare the difference between (1) and (2).<br />

(1) Those guys are in trouble, but they don’t know that they are.<br />

(2) ??Those guys are in trouble, but they might not be.<br />

Something has gone wrong in (2). This suggests that (2) can’t be used to express<br />

(1). That is, there’s no good interpretation of (2) where those guys are the group X .<br />

This is a little surprising, since we’ve made the guys pretty salient. Cases like this<br />

have motivated what we called the Speaker Inclusion Constraint (hereafter SIC) in<br />

“Epistemic Modals in Context”. That is, in unembedded uses of ‘might’ the group X<br />

always includes the speaker. Now the explanation of the problem with (2) is that for<br />

the speaker to assert the first clause, she must know that the guys are in trouble, but<br />

if that’s the case, and she’s in group X, then the second clause is false.<br />

Now a generalisation like this looks like it should be grounded in the meaning<br />

(in some sense of ‘meaning’) of ‘might’. For comparison, it seems to be part of the<br />

meaning of ‘we’ that it is a first-person plural pronoun. It isn’t just a metasemantic<br />

generalisation that the speaker is always one of the group denoted by ‘we’. By analogy,<br />

it is part of the meaning of ‘might’ that the speaker is always part of the group<br />

X.<br />

Further, when the meaning of a context-sensitive expression constrains its value,<br />

those constraints still hold when the term is used as a bound variable. For instance,<br />

it is part of the meaning of ‘she’ that it denotes a female individual. If Smith is male,<br />

then the semantic content of She is happy can’t be that Smith is happy. Similarly,



when ‘she’ is behaving like a bound variable, the only values it can take are female<br />

individuals. So we can’t use Every student fears she will fail the test to quantify over<br />

some students, some of whom are male. And there’s no interpretation of Every class<br />

hopes we will win where it means that every class hopes that that class will win. Even<br />

when under a quantifier and an attitude ascribing verb, ‘we’ must still pick out a<br />

group that includes the speaker. The natural generalisation is that constraints on<br />

context supplied by meaning do not get overridden by other parts of the sentence.<br />

The problem for contextualists about ‘might’ is that it doesn’t behave as you’d<br />

expect given these generalisations. In particular, the SIC doesn’t hold when ‘might’<br />

is in certain embeddings. So there is a reading of Every student fears they might have<br />

failed where it means that every student fears that, for all they know, they failed. The<br />

knowledge of the speaker isn’t relevant here. Indeed, even if the speaker knows that<br />

many students did not fail, this sentence can be properly uttered. This suggests the<br />

following argument.<br />

(1) If contextualism is true, then the explanation of the SIC is that it is part of the<br />

meaning of ‘might’ that the relevant group X includes the speaker.<br />

(2) If it is part of the meaning of ‘might’ that the relevant group X includes the<br />

speaker, then this must be true for all uses of ‘might’, including embedded uses.<br />

(3) When ‘might’ is used inside the scope of an attitude ascription, the relevant<br />

group need not include the speaker.<br />

(4) So, contextualism is not true.<br />

Premise 1 would be false if the metasemantics was allowed to be systematic enough<br />

to explain why the SIC holds even though it is not part of the meaning. Premise<br />

2 would be false if we allowed ‘might’ to have a systematically different meaning<br />

inside and outside the scope of attitude ascriptions. And premise 3 would be false<br />

if attitude ascriptions are, contrary to intuition, tacitly about the<br />

speaker’s knowledge. Since none of these seems particularly plausible, there does<br />

seem to be a problem for contextualism here.<br />

In their contribution to this volume, Kai von Fintel and Thony Gillies reject one<br />

of the presuppositions of the argument we’ve just presented. Classical contextualism,<br />

what they call ‘the canon’, says that context picks out a particular group, and<br />

an utterance of ‘It might be that p’ is true iff that group’s information is consistent<br />

with p. That’s what we’ve taken as the stalking horse in this section, and von Fintel<br />

and Gillies are certainly right that it is the canonical version of contextualism. Von<br />

Fintel and Gillies agree that the broad outline of this contextualist story is correct.<br />

But they deny that context picks out a determinate group, or a determinate body<br />

of information. Rather, uttering an epistemic modal will ‘put into play’ a number<br />

of propositions of the form ‘For all group G knows, p’. This ambiguity, or perhaps<br />

better indeterminacy, is crucial they argue to the pragmatic role that epistemic<br />

modals play. And once we are sensitive to it, they claim, we see that contextualism<br />

has more explanatory resources than we’d previously assumed, and so the motivation<br />

for relativism fades away.<br />

In summary, there are four motivations for relativism that have been floated in<br />

the literature. These are:



• Intuitions about disagreement;<br />

• Intuitions about agreement;<br />

• Arguments from metasemantic complexity; and<br />

• Arguments from semantic change in attitude ascriptions.<br />

As noted, the third argument doesn’t seem very compelling, and it is a fairly open<br />

question whether the first works. But the second and fourth do look like good<br />

enough arguments to motivate alternatives.<br />

2.3 Two Kinds of Relativism<br />

We said above that contextualism is characterised by two theses, repeated here for<br />

convenience.<br />

(C) The semantic content of an assertion is the same relative to any assessors.<br />

(T) The truth value of the semantic content of an assertion is the same relative to<br />

any assessors.<br />

So there are two ways to be a relativist: deny (C) or deny (T). One might deny both, but we’ll leave that option out of our survey.<br />

What we call content relativism denies (C). The picture is that contextualists were<br />

right to posit a variable X in the structure of an epistemic modal claim. But the<br />

contextualists were wrong to think that X gets its value from the context of utterance.<br />

Rather, the value of X is fixed in part by the context of assessment. In the simplest<br />

(plausible) theory, X is the speaker and the assessor. So if Smith asserts that Jones<br />

might be happy, the content of that assertion, relative to Andy, is that for all Smith<br />

and Andy know, Jones is happy, while relative to Brian its content is that for all Smith and Brian know, Jones is happy.<br />

The primary motivation for content relativism is that it keeps quite a bit of the<br />

contextualist picture, while allowing enough flexibility to explain the phenomena<br />

that troubled contextualism. So for the content relativist, contents are exactly the<br />

same kinds of propositions as the contextualist thinks they are. So we don’t need<br />

to tell a new kind of story about what it is for a content to be true, to be accepted,<br />

etc. Further, because we keep the variable X, we can explain the ‘bound variable’<br />

readings of epistemic modals discussed in the first section.<br />

A worry about content relativism is that the ‘metasemantic’ argument against<br />

contextualism might equally well tell against it. The worry there was that the constraints<br />

on X seemed to depend, in an unhappy way, on where in the sentence it<br />

appeared. The content relativist has a move available here. She can say that as a rule,<br />

whenever there’s a variable like X attached to a term, and that term is in an attitude<br />

ascription, then the variable is bound to the subject of the ascription. This might be<br />

an interesting generalisation. For instance, if she is a content relativist about both<br />

epistemic modals and predicates of personal taste, she has a single explanation for<br />

why both types of terms behave differently inside and outside attitude ascriptions.<br />

There are two interesting ‘near cousins’ of content relativism. One is a kind of<br />

content pluralism. We might hold (a) that an assertion’s content is not relative to an



assessor, but (b) some assertions have many contents. So if s says a might be F, and this is assessed by many hearers, s asserts For all s and h know, a is F, for each h who hears and assesses the speech. Now when a hearer h₁ does this, she’ll probably focus on one particular content of s’s assertion, namely that For all s and h₁ know, a is F. But the content pluralist accepts (while the content relativist denies) that even relative to h₁, s’s assertion also had the content For all s and h₂ know, a is F, where h₂ is a distinct assessor.<br />

Another near cousin is the view, defended in this volume by Kent Bach, that the<br />

semantic content of an epistemic modal is typically not a complete proposition. In the case just described, it might be that the semantic content of what s says is For all ____ knows, a is F, and that’s not a proposition. Now a given hearer, h, might take s to have communicated to them that For all s and h know, a is F, but that’s not because that’s the semantic content of what s says. It’s not the absolute content (à la contextualism), the content relative to h (à la content relativism), or one of the contents (à la content pluralism).<br />

It’s a very big question how we should discriminate between these theories. Some readers may even worry that there is no substantive difference between the theories; that they are, in some sense, saying the same thing in different words. One big task for future research is to state clearly the competing theories in this vicinity, and<br />

find arguments that discriminate between them.<br />

A quite different kind of relativism denies (T). This view says that the content<br />

itself of an assertion can be true for some assessors, and false for others. Such a view<br />

is not unknown in recent philosophy. In the 1970s and 1980s (and to a lesser extent<br />

in subsequent years) there was a debate between temporalists and eternalists about<br />

propositions. The temporalists thought that a tensed proposition, i.e. the content<br />

of a tensed assertion, could be true at one time and false at another. The eternalists<br />

denied this, either taking truth to be invariant across times, or in some cases denying<br />

that it even made sense to talk about truth being relative to something, e.g. a time.<br />

Contemporary forms of truth relativism generalise the temporalist picture. The<br />

temporalists thought that propositions are true or false relative to a world-time pair.<br />

Modern relativists think that propositions are true or false relative to a world-assessor<br />

pair, or what loosely following Quine (1969) we might call a centered world. (Quine<br />

used this to pick out any world-place-time triple, but since most times and places<br />

don’t have assessors at them, world-assessor pairs, or even world-assessor-time triples,<br />

are more restricted.) For example, as a first pass at a truth-relativism about predicates<br />

of personal taste, one might propose that the proposition expressed by a typical utterance<br />

of ‘beer is tasty’ will be true at any centered world where the person at the<br />

center of the world likes the taste of beer.<br />

The truth relativist has an easy explanation of the data that motivated the rejection<br />

of contextualism. Recall two puzzles for the contextualist about terms like<br />

‘tasty’: that it is so easy to agree with claims about what’s tasty, and that reports of<br />

the form X thinks that beer is tasty are always about X’s attitude towards beer, not about X’s beliefs about how the speaker finds beer.<br />

On the first puzzle, note that if to agree with an assertion is to agree with its<br />

propositional content, and that content is true at the center of your world iff you



find beer tasty, then to agree with an assertion that beer is tasty, you don’t have to launch an inquiry into the sincerity of the speaker; you just have to check whether you like beer. If you’re in a world full of insincere speakers, and abundant beer, that’s<br />

you like beer. If you’re in a world full in insincere speakers, and abundant beer, that’s<br />

relatively easy.<br />

On the second puzzle, if propositional attitude ascriptions report the subject’s<br />

attitude towards a proposition, and if a proposition is a set of centered worlds, then<br />

the subject’s attitude towards ‘Beer is tasty’ should be given by their attitude towards<br />

whether that proposition is true in their centered world. That is, it should be given<br />

by their attitude towards beer. And that’s just what we find.<br />

The extension of all this to epistemic modals is more or less straightforward. The<br />

simplest truth relativist theory says that an utterance of the form a might be F is true<br />

iff, for all the assessor at the center of the world knows, a is F. As Richard Dietz (2008) has pointed out, this won’t do as it stands. If the speaker knows a is not F, then their utterance seems like it should be false relative to everyone. (Conversely, a speaker who knows a is F speaks truly, relative to any assessor, when they say a must be F.) If we’re convinced of this, the solution is a mild complication of the theory. The utterance is both somewhat context-sensitive, and somewhat relative. So S’s utterance of a might be F is true at a centered world iff for all S plus the person at the center of the world know, a is F. We might want to add more complications<br />

(is it knowledge that matters or available information, for example?) but that’s one<br />

candidate truth relativist theory.<br />
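As a toy illustration of the amended clause, one can model a knowledge state as the set of possibilities its bearer hasn’t ruled out, and check a ‘might’ claim against the pooled knowledge of the speaker and the assessor at the center. This is only a sketch under simplifying assumptions (atomic claims, knowledge as a set of worlds); the names are illustrative, not from any contributor’s formalism.

```python
# Toy model of the amended truth-relativist clause: S's utterance of
# "a might be F" is true at a centered world iff "a is F" is compatible
# with what S plus the assessor at the center jointly know.
# A world is a frozenset of the atomic claims true at it; a knowledge
# state is the set of worlds its bearer hasn't ruled out.

def might_true(claim, speaker_knows, assessor_knows):
    pooled = speaker_knows & assessor_knows  # worlds neither has ruled out
    return any(claim in w for w in pooled)

w_spy = frozenset({"Brown is a spy"})
w_not = frozenset()

smith = {w_spy, w_not}   # Smith has ruled nothing out
andy = {w_spy, w_not}    # neither has Andy
brian = {w_not}          # Brian knows Brown is not a spy

print(might_true("Brown is a spy", smith, andy))   # True
print(might_true("Brown is a spy", smith, brian))  # False
```

The second call illustrates Dietz’s point in reverse: relative to an assessor who has ruled the possibility out, the ‘might’ claim comes out false, whatever the speaker’s state.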

There are three worries we might have about truth relativism. One is a very big<br />

picture worry that the very notion of truth being relative is misguided. This is a<br />

theme of Herman Cappelen and John Hawthorne’s Relativism and Monadic Truth. Another is that it overgenerates ‘explanations’; we can’t, for example, explain cases like the Capone/Parker example. And a third is that, by making propositions so different<br />

from what we thought they were, we’ll have to redo a lot of philosophy of language<br />

that presupposed propositions have the same truth value for everyone. In particular,<br />

we’ll have to rethink what an assertion is. (That challenge is addressed – in different<br />

ways – in recent work by John MacFarlane and by Andy Egan.)<br />

The strongest defence of relativism in this volume comes from John MacFarlane.<br />

His work on tense (MacFarlane, 2003b), and on knowledge attributions (MacFarlane,<br />

2005a), and on the broader philosophical status of relativism and other rivals to<br />

classical contextualism (MacFarlane, 2005b, 2009), has been immensely influential<br />

in the contemporary debates. Here he develops a relativistic semantics for epistemic<br />

modals, along the lines of the proposals he has offered elsewhere for tense and knowledge<br />

attributions. He argues that many phenomena, several of which we’ve discussed<br />

in this introduction, raise trouble for contextualism and promote relativism. These<br />

phenomena include third-party assessments, retraction and disagreement. He argues<br />

that only the relativist can explain the troublemaking phenomena.<br />

2.4 Expressivism<br />

So far we’ve looked at two of our three major approaches to epistemic modals. The<br />

contextualist says that which proposition is asserted by an epistemic modal depends



crucially on the context of utterance. The relativist says that the contextualist is ignoring<br />

the importance of the context of assessment. The content relativist says that<br />

they are ignoring the way in which the context of assessment partially determines<br />

what is said. The truth relativist says that they are ignoring the way in which propositions<br />

uttered have different truth values at different contexts of assessment.<br />

The expressivist thinks that there is a common assumption behind all of these<br />

theories, and it is a mistaken assumption. The assumption is that when we’re in the<br />

business of putting forward epistemic modals, we’re in the business of asserting things<br />

that have truth values. The expressivist rejects that assumption. They say that when<br />

we say a might be F, we’re not asserting that we are uncertain about whether a is F,<br />

we’re expressing that uncertainty directly. The contextualists and relativists think<br />

that in making these utterances, we’re expressing a second-order belief, i.e. a belief<br />

about our own knowledge, or lack thereof. The expressivists think we’re expressing<br />

a much simpler mental state: uncertainty.<br />

One way to motivate expressivism is to start with the anti-contextualist arguments,<br />

and then argue that relativism is not an acceptable way out. So we might,<br />

for instance, start with the argument from agreement. The expressivist notes that<br />

there are many ways to agree with a statement. If Smith says ‘Let’s have Chinese for<br />

dinner’, and Jones agrees, there need not be any proposition that Smith asserted that<br />

Jones is agreeing to. We’re happy to call all sorts of meetings of minds agreements.<br />

So the agreement phenomena that the contextualist can’t explain, the expressivist can<br />

explain. When Smith says ‘Brown might be a spy’, and Jones agrees, there isn’t necessarily<br />

any proposition they both accept. Rather, their agreement consists in having<br />

a common mental state, namely uncertainty about whether Brown is a spy.<br />

The expressivist may then run out any number of arguments against relativism.<br />

For instance, they might argue (against content relativism) that it is a requirement of<br />

a speech act being an assertion that it have a determinate content. And they might<br />

argue, perhaps motivated by theoretical considerations about the role of assertions<br />

in conversation, that contents which vary in truth value among hearers couldn’t be<br />

contents of assertions. If true, that would rule out truth relativism. We’re moved,<br />

perhaps by elimination as much as anything, to expressivism.<br />

There are more direct arguments for expressivism as well. Isaac Levi (1996, 55)<br />

motivated a view on which epistemic modals don’t have truth values by thinking<br />

about learning. Imagine someone who previously thought that Brown might be a spy, perhaps on quite good grounds, and then learned that he is not a spy. If that’s all<br />

they learned, then it seems odd to say that there’s something that they previously<br />

knew, that now they don’t know. It seems learning shouldn’t destroy knowledge.<br />

That’s what happens in standard models for belief revision (which were one of Levi’s<br />

primary concerns) and it is independently plausible. But if epistemic modals express<br />

propositions, and those are true or false, then there is a proposition that the person<br />

did know and now doesn’t know, namely that Brown might be a spy.<br />

There are clearly a few possible responses to this argument. For one thing, we<br />

could make the epistemic modal claims explicitly tensed. Both before and after the learning experience, the subject knew that Brown might, at t₁, have been a spy, but didn’t know that Brown might, at t₂, have been a spy. (Indeed, they learned that that



was false.) Or, and this is more in keeping with the spirit of this introduction, we<br />

might spell out the epistemic modal claim. Before and after the learning experience,<br />

the subject knew that it was consistent with everything the subject knew prior to the<br />

learning experience that Brown was a spy. So there’s no information lost.<br />

The problem with this move is that it seems to make epistemic modals overly<br />

complex. Intuitively, it is possible for a child to grasp a modal, and for the most<br />

natural interpretation of that modal to be epistemic, without the child having the<br />

capacity to form second order thoughts. (This point is one that Seth Yalcin uses in<br />

his argument for a kind of expressivism in this volume.) This question seems like it<br />

would be good to test empirically, though we don’t know of any existing evidence<br />

that settles the question. Introspectively, it does seem that one can think that the<br />

cat might be in the garden without thinking about one’s own epistemic or doxastic<br />

states as such. Those kinds of introspections might tell in favour of an approach<br />

which identifies epistemic modality with a distinct kind of relation to content, rather<br />

than a distinct kind of content.<br />

Following important work by Allan Gibbard (1990), there is a natural way to formalise<br />

an expressivist theory of epistemic modality. Identify a ‘context’ with a set of<br />

propositions. Sentences, whether epistemic modals or simple sentences, are satisfied<br />

or unsatisfied relative to world-context pairs, where a world and a context make a pair<br />

iff every proposition in the context is true at that world. Then an epistemic modal,<br />

say Brown might be a spy, is satisfied by such a pair iff Brown is a spy is consistent<br />

with everything in the context. A simple sentence, like White is a spy, is satisfied by<br />

such a pair iff White is a spy is true at the world. The pairing becomes useful when<br />

considering, say, conjunctions. A conjunction is satisfied iff both conjuncts are satisfied.<br />

So White is a spy and Brown might be is satisfied by a world-context pair iff<br />

White is a spy at the world, and Brown’s being a spy is consistent with the context.<br />
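The satisfaction clauses just sketched can be written down directly. A minimal sketch under toy assumptions: atomic sentences are strings, a world is the set of atoms true at it, and a context is represented by the set of worlds compatible with everything in it (so a world and a context ‘make a pair’ iff the world is in that set). The sentence names are illustrative.

```python
# Minimal model of the Gibbard-style satisfaction semantics sketched
# above. A world is a frozenset of true atomic sentences; a context is
# the set of worlds compatible with every proposition in the context.

def satisfies(sentence, world, context):
    kind = sentence[0]
    if kind == "atom":        # e.g. ("atom", "White is a spy")
        return sentence[1] in world
    if kind == "might":       # satisfied iff the embedded claim is
        return any(sentence[1] in w for w in context)  # consistent with the context
    if kind == "and":         # a conjunction is satisfied iff both
        return (satisfies(sentence[1], world, context)  # conjuncts are
                and satisfies(sentence[2], world, context))
    raise ValueError(f"unknown sentence kind: {kind}")

w1 = frozenset({"White is a spy"})
w2 = frozenset({"White is a spy", "Brown is a spy"})
open_context = {w1, w2}   # Brown's being a spy is left open

mixed = ("and", ("atom", "White is a spy"), ("might", "Brown is a spy"))
print(satisfies(mixed, w1, open_context))  # True
print(satisfies(mixed, w1, {w1}))          # False: the context rules it out
```

Note the division of labour described in the text: the first conjunct is checked against the world, the second against the context.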

So far this looks a lot like relativism. A world-context pair is just like a centered<br />

world, with the context being what’s known by the person at the center of the world.<br />

If we apply the formalism to real-life cases, perhaps taking the contexts to be genuine<br />

contexts in the sense of Stalnaker (1978), the two formalisms might look very close<br />

indeed.<br />

But there is, or at least we hope there is, a substantive philosophical difference between<br />

them. The expressivist has a restricted sense of what it is to make an assertion,<br />

and of what it is for an expression to be an expression of a truth. The expressivist<br />

most insistently does not identify satisfaction with truth. The only sentences that are<br />

true or false are sentences that are satisfied by a world-context pair ⟨w, c₁⟩ iff they are<br />

satisfied by every other pair starting with the same world. The expression of such a<br />

sentence, and perhaps only of such a sentence, constitutes an assertion. Otherwise it<br />

constitutes some other speech act.<br />
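The expressivist’s restriction admits of a simple toy statement: represent a world as the set of atomic sentences true at it and a context as a set of worlds; then a sentence is truth-apt only if its satisfaction at a pair never varies across contexts paired with the same world. What follows is our illustrative gloss on that restriction, not any contributor’s official formalism.

```python
# Sketch of the expressivist restriction: only sentences whose
# satisfaction at (world, context) pairs depends on the world alone
# are true or false; "might" sentences typically fail the test.

def satisfies(sentence, world, context):
    kind = sentence[0]
    if kind == "atom":
        return sentence[1] in world
    if kind == "might":
        return any(sentence[1] in w for w in context)
    raise ValueError(kind)

def truth_apt(sentence, worlds, contexts):
    # truth-apt iff, for each world, satisfaction is the same at every
    # context that pairs with that world (i.e. contains it)
    for w in worlds:
        values = {satisfies(sentence, w, c) for c in contexts if w in c}
        if len(values) > 1:
            return False
    return True

w_spy = frozenset({"Brown is a spy"})
w_not = frozenset()
contexts = [{w_spy, w_not}, {w_not}]   # an open and a closed context

print(truth_apt(("atom", "Brown is a spy"), [w_spy, w_not], contexts))   # True
print(truth_apt(("might", "Brown is a spy"), [w_spy, w_not], contexts))  # False
```

The ‘might’ sentence fails because at w_not its satisfaction flips depending on whether the paired context leaves Brown’s spyhood open, which is exactly the expressivist’s ground for denying it a truth value.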

And this is no mere difference in how to use the words ‘truth’, ‘assertion’ and so<br />

on. Nor is it even just a difference about truth and assertion and so on. It hopefully<br />

makes a difference to what predictions we make about the way epistemic modals<br />

embed, especially how they embed in propositional attitude ascriptions. We used<br />

that fact to argue against expressivism in “Epistemic Modals in Context”, since we<br />

thought there were in some cases more examples of successful embedding of epistemic



modals, especially in conditionals, than the expressivist would predict. On the other<br />

hand Seth Yalcin uses facts about embedding to argue, in his paper in this volume, in<br />

favour of expressivism. He argues that on a non-expressivist view, we should be able<br />

to suppose that p is true but might not be true, and that can’t be supposed.<br />

This argument is part of the argument by elimination that Yalcin runs against what he calls ‘descriptivism’ about epistemic modals in his contribution to the volume. He<br />

uses ‘descriptivism’ to pick out a broad category of theories about epistemic modals<br />

that includes both contextualism and relativism. He argues against all descriptivist<br />

views, and in favour of what he calls ‘expressivism’. He says that when someone<br />

utters an epistemic modal, they do not describe their own knowledge (or the knowledge<br />

of someone else), rather they express their own mental state. Some of Yalcin’s<br />

arguments for expressivism are related to arguments against contextualism; in particular<br />

he thinks, as we do, that there isn’t a viable form of contextualism. But he<br />

also thinks that there are problems for relativism, such as the difficulty in supposing<br />

Moore paradoxical propositions. He also notes that it is a puzzle for descriptivists<br />

to make sense of belief ascriptions involving epistemic modals. On a descriptivist<br />

model, a sentence like ‘X believes that it might be that p’ reports the existence of a<br />

second-order belief state. But Yalcin notes there are reasons to doubt that is right. He<br />

develops in detail an expressivist model that avoids what he takes to be shortcomings<br />

of descriptivist approaches.<br />

The two papers we haven’t discussed so far, those by Eric Swanson and Stephen<br />

Yablo, are both related to this expressivist family of theories, though their positive<br />

proposals head off in distinctive directions.<br />

Eric Swanson’s contribution locates epistemic modals within a broader category,<br />

which he calls “the language of subjective uncertainty”. He also emphasizes the diversity<br />

of epistemic modal locutions, and draws attention to the risks involved in focusing<br />

too closely on just a few examples. In the literature so far, ‘might’ and ‘must’ have<br />

tended to get the lion’s share of the attention, while other sorts of epistemic modality<br />

– including the more explicitly quantitative sorts (‘four to one against that’, ‘there’s<br />

a 55% chance that’, etc.) – have gone mostly unnoticed. Swanson argues that attending<br />

to other instances of the language of subjective uncertainty serves to undermine<br />

many of the standard proposals about epistemic ‘might’ and ‘must’, and motivates a<br />

probabilistic semantics.<br />

Somewhat relatedly, Stephen Yablo develops a theory about epistemic modals<br />

where their primary function is not to state facts about the world, but to update<br />

the conversational score. Theories of this kind are quite familiar from the dynamic<br />

semantics tradition, but Yablo notes that the existing dynamic theories of epistemic<br />

modals are quite implausible. One of the challenges a dynamic approach to epistemic<br />

modals faces is to say how we should update a context (or a belief state) with It might<br />

be that p when the context previously was incompatible with p. Yablo adopts some<br />

suggestions from David Lewis’s “A Problem about Permission” (Lewis, 1979e) to try<br />

and solve this puzzle.


Questioning Contextualism<br />

There are currently a dizzying variety of theories on the market holding that whether<br />

an utterance of the form A knows that p is true depends on pragmatic or contextual<br />

factors. Even if we allow that pragmatics matters, there are three questions to be<br />

answered. First, whose interests matter? Here there are three options: the interests<br />

of A matter, the interests of the person making the knowledge ascription matter,<br />

or the interests of the person evaluating the ascription matter. Second, which kind<br />

of pragmatic factors matter? Broadly speaking, the debate here is about whether<br />

practical interests (the stakes involved) or intellectual interests (which propositions<br />

are being considered) are most important. Third, how do pragmatic factors matter?<br />

Here there is not even a consensus about what the options are.<br />

This paper is about the first question. I’m going to present some data from the<br />

behaviour of questions about who knows what that show it is not the interests of the<br />

person making the knowledge ascription that matter. This means the view normally<br />

known as contextualism about knowledge-ascriptions is false. Since that term is a little<br />

contested, and for some suggests merely the view that someone’s context matters, I’ll<br />

introduce three different terms for the three answers to the first question.<br />

Consider a token utterance by B of A knows that p. This utterance is being evaluated<br />

by C. A semantic pragmatist about knowledge ascriptions says that whether C<br />

can correctly evaluate that utterance as true or false depends on some salient party’s<br />

context or interests. 1 We can produce a quick taxonomy of semantic pragmatist positions<br />

by looking at which of the three parties is the salient one.<br />

Type A pragmatism A’s context or interests matter.<br />

Type B pragmatism B’s context or interests matter.<br />

Type C pragmatism C’s context or interests matter.<br />

Type A pragmatism is defended by Hawthorne (2004b) under the name ‘subject-sensitive<br />

invariantism’ and Stanley (2005) under the name ‘interest-relative invariantism’.<br />

The theory commonly known as contextualism, defended by Cohen (1986), DeRose<br />

(1995) and Lewis (1996b), is a form of Type B pragmatism. On their theory, the semantic<br />

value of ‘knows’ depends on the context in which it is uttered, i.e. B’s context.<br />

(And the salient feature of B’s context is, broadly speaking, B’s interests.) Recently<br />

MacFarlane (2009) has outlined a very different variant of Type B pragmatism that<br />

we will discuss below. Type C pragmatism is the radical view that the token utterance<br />

does not have a context-invariant truth value, so one and the same utterance can<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Aspects of<br />

Knowing, edited by Stephen Hetherington, Elsevier, 2006, pp. 133-147. Thanks to David Chalmers, Keith<br />

DeRose, Tamar Szabó Gendler, Stephen Hetherington, John MacFarlane, Ishani Maitra and Matthew<br />

Weiner for helpful discussions about earlier drafts.<br />

1 Note that the focus here is on the truth or falsity of the utterance, and not on the truth or falsity of the<br />

proposition the utterance expresses. I’m indebted to John MacFarlane for stressing to me the importance<br />

of this distinction.



be properly evaluated as true by one evaluator and false by another. This position is<br />

outlined, and defended, by MacFarlane (2005a).<br />

The purpose of this paper is to argue against Type B pragmatism. I’ll show that<br />

there is a striking disanalogy between the behaviour of ‘knows’ in questions and the<br />

behaviour of terms for which a Type B-type theory is true. The best explanation for<br />

this disanalogy is that Type B theories, contextualism included, are false. 2<br />

I won’t be addressing the second or third questions here, but I’ll assume (basically<br />

for ease of exposition) that the right answer to the second has more to do with practical<br />

than intellectual interests. So a sceptical possibility becomes relevant not because<br />

anyone is actively considering it, but because it makes a difference to someone’s actions.<br />

Everything I say here should be restatable given any answer to that question,<br />

so little beyond exposition turns on this.<br />

1 Basic Indexicals in Questions<br />

As well as considering these three pragmatic theories about ‘know’, we can consider<br />

similar theories about other words. We’ll start with words that are universally considered<br />

to be indexicals. Consider the following three theories about ‘here’.<br />

Type A pragmatism A token of ‘here’ denotes the location of the subject of the<br />

sentence in which it appears.<br />

Type B pragmatism A token of ‘here’ denotes the location of the speaker.<br />

Type C pragmatism A token of ‘here’ denotes the location of the person evaluating<br />

the sentence.<br />

So when C evaluates B’s utterance A is here, the Type A pragmatist says that it gets<br />

evaluated as true iff A is where A is, the Type B pragmatist says that it gets evaluated<br />

as true iff A is where B is, and the Type C pragmatist says that it gets evaluated as true<br />

iff A is where C is. Obviously Type B pragmatism is true in general about ‘here’, as<br />

it is for all obvious indexicals. 3 What we’re interested in for now, however, is not its<br />

truth but the kind of evidence that can be adduced for it. One very simple way of<br />

separating out these three theories involves questions, as in the following example.<br />

2 My own view is that if any pragmatic theory is true, it is Type A pragmatism. In Weatherson (2005a)<br />

I defend the view that whether an agent is sufficiently confident in p to count as believing that p depends<br />

on features of her context. In contexts where it is very important for practical deliberation whether p<br />

is true, a degree of confidence that might ordinarily count as belief that p might no longer so count.<br />

This is an odd kind of doxastic externalism, namely the view that whether a state amounts to a belief is<br />

environment-dependent. Now to know that p also requires having a certain level of confidence that p<br />

is true, and it is arguable (though I haven’t argued it) that this level is also dependent on the would-be knower’s environment. It is also arguable that the best theory of epistemic defeaters will contain pragmatic<br />

features.<br />

Though I describe Type C pragmatism as radical, I’ve been convinced by John MacFarlane’s work that<br />

it is a perfectly coherent doctrine. In Egan et al. (2005) we defend a version of Type C pragmatism for<br />

sentences of the form A might be F, where might is understood epistemically.<br />

3 Whether the ‘in general’ is strictly necessary turns on tricky considerations about non-standard usages,<br />

such as answering machines. I’m generally favourable to the view that if any qualification is needed it is a<br />

very small one. See <strong>Weatherson</strong> (2002) for more discussion of this point.


Questioning Contextualism 399<br />

Watson is at a party in Liverpool, and he is worried that Moriarty is there as well.<br />

He calls Holmes, who is tracking Moriarty’s movements by satellite, and asks (1).<br />

(1) Is Moriarty here?<br />

This question has three properties that are distinctive of questions involving indexicals.<br />

SPEAKER How Holmes should answer the question depends on the speaker’s (i.e.<br />

Watson’s) environment, not his own and not Moriarty’s. It would be wrong to<br />

say “Yes” because Moriarty is where he is, or “No” because Moriarty is not in<br />

the lab with Holmes following the satellite movement.<br />

CLARIFICATION It is permissible in certain contexts to ask the speaker for more<br />

information about their context before answering. If Holmes is unsure of<br />

Watson’s location, he can reply with “Where are you?” rather than answering<br />

directly. 4<br />

DIFFERENT ANSWERS If two different people, who are in different locations,<br />

ask Holmes (1), he can answer differently. Assume that Moriarty is at the party,<br />

so Holmes says “Yes” in reply to Watson. Lestrade then calls from Scotland<br />

Yard, because he is worried that Moriarty has broken in and asks Holmes (1).<br />

Holmes says “No”, and this is consistent with his previous answer.<br />

Questions involving indexicals have a fourth property that we’ll return to from time<br />

to time in what follows.<br />

NEGATIVE AGREEMENT It is coherent to answer the question by saying “No”,<br />

then basically repeating the question, inverting the verb-subject order so you<br />

have an indicative sentence. So Holmes could consistently (if falsely in this<br />

story) say, “No, Moriarty is here.”<br />

The truth of Type B Pragmatism about ‘here’ explains, indeed predicts, that (1) has<br />

these properties. If a correct answer is a true answer, then Type B Pragmatism directly<br />

entails that (1) has SPEAKER. That assumption (that correctness is truth) plus<br />

the fact that you can speak to someone without knowing their location implies that<br />

(1) has CLARIFICATION, and adding the fact that different people can be in different<br />

locations implies that (1) has DIFFERENT ANSWERS. Whether Holmes can<br />

say Moriarty is here depends on whether he can truthfully answer his own question Is Moriarty<br />

here? so NEGATIVE AGREEMENT is related to DIFFERENT ANSWERS.<br />

For these reasons it isn’t surprising that other terms that are agreed on all sides to be<br />

indexicals generate questions that have all four properties. For instance, because ‘me’<br />

is an indexical, (2) has all four properties.<br />

4 It is important here that we restrict attention to clarificatory questions about the context, and in<br />

particular to features of the context that (allegedly) affect the truth value of utterances involving the indexical.<br />

Speakers can always ask for clarification of all sorts of features of a question, but a question<br />

only has CLARIFICATION if it is appropriate to ask for clarifying information about the nature of the<br />

questioner’s context.



(2) Does Moriarty want to kill me?<br />

This suggests a hypothesis, that all terms for which Type B pragmatism is true generate<br />

questions with those four properties. This hypothesis seems to be false; some<br />

terms for which Type B pragmatism is true do not generate questions that have NEG-<br />

ATIVE AGREEMENT. But a weaker claim, that all terms for which Type B pragmatism<br />

is true generate questions with the first three properties, does seem to be true. 5<br />

Or so I will argue in the next section.<br />

2 Other Type B terms in Questions<br />

Various philosophers have argued that a version of Type B pragmatism is true for each<br />

of the following four terms: ‘tall’, ‘empty’, ‘ready’ and ‘everyone’. Various defenders<br />

of Type B pragmatism about ‘knows’ have argued that ‘knows’ is analogous to terms<br />

on this list. In this section I’ll argue that questions involving those four terms have<br />

the first three properties of questions listed above, though they don’t seem to have<br />

NEGATIVE AGREEMENT. In the next section I’ll argue that questions involving<br />

‘know’ do not have those three properties. These facts combine to form an argument<br />

against the hypothesis that Type B pragmatism is true of ‘knows’. (The argument<br />

here does not turn on supposing that Type B pragmatism is true of these four terms.<br />

All I want to argue for is the claim that any term for which Type B pragmatism<br />

is true generates questions with the first three properties. It is possible that other<br />

terms also generate these kinds of questions, but this possibility doesn’t threaten the<br />

argument.) The examples involving these four terms will be a little complicated, but<br />

the intuitions about them are clear.<br />

Moriarty has hired a new lackey, a twelve year old jockey. She is tall for a twelve<br />

year old girl, and tall for a jockey, but not tall for a person. Moriarty is most interested<br />

in her abilities as a jockey, so he worries that she’s tall. Holmes is also most<br />

interested in her qua jockey. Watson has noted that Moriarty never hires people who<br />

are tall for adults (he thinks this is because Moriarty likes lording it over his lackeys)<br />

and is wondering whether the new hire fits this pattern. He asks Holmes (3).<br />

(3) Is Moriarty’s new lackey tall?<br />

Holmes should say “No”. What matters is whether she is tall by the standards Watson<br />

cares about, not whether she is tall by the standards Holmes cares about (i.e. jockeys)<br />

or the standards Moriarty cares about (i.e. also jockeys). So (3) has SPEAKER.<br />

Lestrade wants information from Holmes about the lackey so his men can pick her<br />

up. They have this conversation.<br />

5 The failure of NEGATIVE AGREEMENT for questions involving quantifiers, comparative adjectives<br />

and similar terms is interesting, and is something that a full semantic theory should account<br />

for. My feeling is that the best explanation will draw on the resources of a theory like that defended by<br />

Stanley (2000, 2002) but it would take us far away from our primary purpose to explore this here. It is certainly<br />

an argument in favour of the ‘semantic minimalism’ defended by Cappelen and Lepore (2005) that<br />

all and only the terms in their ‘basic set’ (apart from tense markers), i.e. the obvious indexicals, generate<br />

questions with NEGATIVE AGREEMENT. I think the fact that many terms generate questions with the<br />

SPEAKER, CLARIFICATION and DIFFERENT ANSWERS property tells more strongly against their<br />

view.



Lestrade: Is Moriarty’s new lackey tall? My men are looking for her.<br />

Holmes: Where are they looking?<br />

Lestrade: At her school.<br />

Holmes: Yes, she looks like she could be fourteen or fifteen.<br />

Holmes quite properly asks for a clarification of the question so as to work out what<br />

standards for tallness are in play, so (3) has CLARIFICATION. And he properly<br />

gives different answers to Watson and Lestrade, so it has DIFFERENT ANSWERS.<br />

But note that ‘tall’ doesn’t have NEGATIVE AGREEMENT. Holmes can’t answer<br />

(3) with “No, she’s tall.” I won’t repeat this observation for the other terms discussed<br />

in this section, but the point generalises.<br />

Next we’ll consider ‘empty’. Holmes and Watson are stalking Moriarty at a party.<br />

Watson is mixing poisons, and Holmes is trying to slip the poison into Moriarty’s<br />

glass. Unfortunately Moriarty has just about finished his drink, and might be about<br />

to abandon it. Watson is absent-mindedly trying to concoct the next poison, but he<br />

seems to have run out of mixing dishes.<br />

Watson: Is Moriarty’s glass empty?<br />

Holmes: What do you want it for?<br />

Watson: I need something dry to mix this poison in.<br />

Holmes: No, it’s got a small bit of ice left in it.<br />

(Lestrade arrives, and sees Holmes holding a vial.)<br />

Lestrade: Why haven’t you moved in? Is Moriarty’s glass empty?<br />

Holmes: Yes. He should get another soon.<br />

Holmes behaves entirely appropriately here, and his three responses show that the<br />

question Is Moriarty’s glass empty? has the CLARIFICATION, SPEAKER and DIF-<br />

FERENT ANSWERS properties respectively. Note in particular that the fact that the glass is<br />

empty by the standards that matter to Moriarty and Holmes (i.e. it has not much<br />

more than ice left in it) doesn’t matter to how Holmes should answer until someone<br />

with the same interests, Lestrade, asks about the glass.<br />

Third, we’ll look at ‘ready’. Moriarty is planning four things: to rob the Bank<br />

of England, to invade Baker St and kill Holmes, to invade Scotland Yard to free his<br />

friends, and to leave for a meeting of criminals where they will plan for the three<br />

missions. Moriarty cares most about the first plan, Holmes about the second, and<br />

Lestrade about the third, but right now Watson cares most about the fourth because<br />

it’s his job to track Moriarty to the meeting. Holmes is tracking Moriarty through<br />

an installed spycam.<br />

Watson: Is Moriarty ready?<br />

Holmes: Yes. You should go now.<br />

(Watson departs and Lestrade arrives)



Lestrade: Is Moriarty ready?<br />

Holmes: Who’s that? Oh, hello Inspector. No, he still has to plan the<br />

attack out.<br />

Again, Holmes’s answers show that the question Is Moriarty ready? has the CLARI-<br />

FICATION, SPEAKER and DIFFERENT ANSWERS properties. Note that in this<br />

case the issue of whether Moriarty is ready for the thing he cares most about, and the<br />

issue of whether he is ready for the thing Holmes cares most about, are not relevant<br />

to Holmes’s answers. It is the interests of the different speakers that matter, which<br />

suggests that if one of the three types of pragmatism is true about ‘ready’, it is Type<br />

B pragmatism.<br />

Finally we’ll look at ‘everyone’. Lewis (1996b) suggests that ‘know’ is directly<br />

analogous to ‘every’, so the two words should behave the same way in questions.<br />

Moriarty’s gang just robbed a department store while the royal family, along with<br />

many police, were there. Holmes is most interested in how this affected the royals,<br />

Watson in how the public reacted, Lestrade in how his police reacted, and Moriarty<br />

merely in his men and his gold. The public, and the royals, were terrified by the raid<br />

on the store, but the police reacted bravely.<br />

Watson: Did Moriarty’s men terrify everyone?<br />

Holmes: Her Majesty and her party were quite shocked. Oh, you mean<br />

everyone. Yes, the masses there were completely stunned.<br />

(Lestrade enters.)<br />

Lestrade: I just heard about the raid. How did they get through security?<br />

Did Moriarty’s men terrify everyone?<br />

Holmes: No, your men did their job, but they were outnumbered.<br />

Again, Holmes’s answers show that the question Did Moriarty’s men terrify everyone?<br />

has the CLARIFICATION, SPEAKER and DIFFERENT ANSWERS properties.<br />

And again, what the quantifier domain would be if Holmes were to use the word ‘everyone’,<br />

namely all the royal family, is irrelevant to how he should answer a question<br />

involving ‘everyone’. That’s the distinctive feature of expressions for which Type B<br />

pragmatism is true, and it suggests that in this respect at least ‘everyone’ behaves as if<br />

Type B pragmatism is true of it.<br />

3 Questions about Knowledge<br />

We have two reasons for thinking that if Type B pragmatism is true about ‘knows’,<br />

then questions about knowledge should have all the SPEAKER, CLARIFICATION<br />

and DIFFERENT ANSWERS properties. First, the assumption that correct answers<br />

are true answers plus trivial facts about the environment (namely that environments<br />

are not always fully known and differ between speakers) implies that the questions<br />

have these properties. Second, many words that are either uncontroversial or controversial<br />

cases of Type B pragmatism generate questions with these



properties. So if ‘knows’ is meant to be analogous to these controversial examples,<br />

questions about knowledge should have these properties. I’ll argue in this section<br />

that knowledge questions do not have these properties. Again we’ll work through a<br />

long example to show this.<br />

Last week Watson discovered where Moriarty was storing a large amount of gold,<br />

and retrieved it. Moriarty is now coming to Baker St to try to get the gold back, and<br />

Holmes is planning a trap for him. Moriarty has made educated guesses that it was<br />

Watson (rather than Holmes) who retrieved the gold, and that Holmes is planning a<br />

trap at Baker St Station. But he doesn’t have a lot of evidence for either proposition.<br />

Holmes has been spying on Moriarty, so he knows that Moriarty is in just this position.<br />

Neither Moriarty nor Holmes care much about who it was who retrieved the<br />

gold from Moriarty’s vault, but this is very important to Watson, who plans to write<br />

a book about it. On the other hand, that Holmes is planning a trap at Baker St Station<br />

is very important to both Holmes and Moriarty, but surprisingly unimportant<br />

to Watson. He would prefer that he was the hero of the week for recovering the gold,<br />

not Holmes for capturing Moriarty. They have this conversation.<br />

Watson: Does Moriarty know that you’ve got a trap set up at Baker St<br />

Station?<br />

Holmes: No, he’s just guessing. If I set up a diversion I’m sure I can get<br />

him to change his mind.<br />

Watson: Does he know it was me who recovered the gold?<br />

Holmes: Yes, dear Watson, he figured that out.<br />

These answers sound to me like the answers Holmes should give. Because the<br />

trap is practically important to both him and Moriarty, it seems he should say no<br />

to the first question unless Moriarty has very strong evidence. 6 But because it is<br />

unimportant to Holmes and Moriarty just what Watson did, the fact that Moriarty<br />

has a true belief that’s based on the right evidence that Watson recovered the gold is<br />

sufficient for Holmes to answer the second question “Yes”. This shows that questions<br />

involving ‘knows’ do not have the SPEAKER property.<br />

Some might dispute the intuitions involved here, but note that on a simple Type B<br />

pragmatist theory, one that sets the context just by the speaker’s interests (as opposed<br />

to the speaker and her interlocutors) Holmes should assent to (4) and (5).<br />

(4) Moriarty does not know that I’ve got a trap set up for him at Baker St Station,<br />

he’s just guessing.<br />

(5) Moriarty does know that Watson recovered the gold.<br />

6 The Type A pragmatist says that it is the importance of the trap to Moriarty that makes this a high-stakes<br />

question, and the Type C pragmatist (at least as I understand that view) says that it is the importance<br />

of the trap to Holmes that matters. Since Moriarty and Holmes agree about what is important here, the<br />

Type A and Type C pragmatists can agree with each other and disagree with the Type B pragmatist.



And it is very intuitive that if he should assent to (4) and (5), then he should answer<br />

“No” to Watson’s first question, and “Yes” to the second.<br />

Some adherents of a more sophisticated Type B pragmatist theory will not say<br />

that Holmes should assent to these two, because they think he should take Watson’s<br />

interests on board in his use of ‘knows’. This is the ‘single scoreboard view’ of Keith<br />

DeRose (2004). 7 But we can, with a small addition of complexity, rearrange the case<br />

so as to avoid this complication. Imagine that Watson is not talking to Holmes, but<br />

to Lestrade, and that Lestrade (surprisingly) shares Watson’s interests. Unbeknownst<br />

to the two of them, Holmes is listening in to their conversation via a bug. When<br />

listening in to conversations, Holmes has the habit of answering any questions that<br />

are asked, even if they obviously aren’t addressed to him. (I do the same thing when<br />

watching sports broadcasts, though the questions are often mind-numbingly bland.<br />

Listening to the intuitive answers one gives is a good guide to the content of the<br />

question.) So the conversation now goes as follows.<br />

Watson: Does Moriarty know that Holmes has got a trap set up at Baker<br />

St Station?<br />

Holmes (eavesdropping): No, he’s just guessing. If I set up a diversion<br />

I’m sure I can get him to change his mind.<br />

Lestrade: I’m not sure. My Moriarty spies aren’t doing that well.<br />

Watson: Does he know it was me who recovered the gold?<br />

Holmes (eavesdropping): Yes, dear Watson, he figured that out.<br />

It is important here that Holmes is talking to himself, even though he is using Watson’s<br />

questions to guide what he says. So it would take a very extended sense of ‘context’<br />

for Watson’s interests to fix Holmes’s context. Some speakers may be moved<br />

by empathy to align their interests with the people they are speaking about, but<br />

Holmes is not such a speaker. So Type B pragmatic theories should say that Holmes<br />

should endorse (4) and (5) in this context, and intuitively if this is true he should<br />

speak in the above interaction just as I have recorded. But this is just to say that<br />

questions about knowledge do not have SPEAKER. 8<br />

One might worry that we’re changing the rules, since we did not use eavesdropping<br />

situations above in arguing that if Type B pragmatism is true of ‘knows’, then<br />

questions about knowledge should have SPEAKER. But it is a simple exercise to<br />

check that even if Holmes is eavesdropping, his answers to questions including ‘tall’,<br />

‘empty’, ‘ready’ and ‘everyone’ should still have the three properties. So this change<br />

of scenery does not stack the deck against the Type B pragmatist.<br />

Cases like this one suggest that questions involving ‘knows’ also lack the<br />

CLARIFICATION and DIFFERENT ANSWERS property. It would be odd of<br />

Holmes to reply to one of Watson’s questions with “How much does it matter to<br />

7 The failure of NEGATIVE AGREEMENT for knowledge questions suggests that if any contextualist<br />

theory is true, it had better be a single scoreboard view.<br />

8 As noted above, the Type A and Type C pragmatists agree with this intuition, though for very differ-<br />

ent reasons.



you?”, as it would be in any case where a questioner who knows that p asks whether<br />

S also knows that p. Intuitions about cases where the questioner does not know<br />

that p are a little trickier, because then the respondent can only answer ‘yes’ if she<br />

has sufficient evidence to assure the speaker that p, and it is quite plausible that the<br />

amount of evidence needed to assure someone that p varies with the interests of the<br />

person being assured. 9 But if Type B pragmatism were true, we’d expect to find cases<br />

of CLARIFICATION where it is common ground that the speaker knows p, and yet<br />

a request for standards is in order, and no such cases seem to exist. Similarly, it’s hard<br />

to imagine circumstances where Holmes would offer a different answer if Lestrade<br />

rather than Watson asked the questions. And more generally, if two questioners who<br />

each know that p ask whether S knows that p, it is hard to see how it could be apt<br />

to answer them differently. 10 But if Type B pragmatism about ‘knows’ were true,<br />

knowledge questions would have these three properties, so it follows that Type B<br />

pragmatism about ‘knows’ is false.<br />

4 Objections and Replies<br />

Objection: The argument that Type B pragmatism implies that questions involving<br />

‘knows’ should have the three properties assumes that correct answers are true answers.<br />

But there are good Gricean reasons to think that there are other standards for<br />

correctness.<br />

Reply: It is true that one of the arguments for thinking that Type B pragmatism has<br />

this implication uses this assumption. But the other argument, the argument from<br />

analogy with indexicals and other terms for which Type B pragmatism is true, does<br />

not. Whatever one thinks of the theory, there is data here showing that ‘knows’<br />

does not behave in questions like other context-sensitive terms for which Type B<br />

pragmatism is the most plausible view.<br />

Moreover, Type B pragmatists have to be very careful wielding this objection. If<br />

there is a substantial gap between correct answers to knowledge questions and true<br />

answers, then it is likely that there is a substantial gap between correct knowledge<br />

ascriptions and true knowledge ascriptions. As a general (though not universal) rule,<br />

truth and correctness are more tightly connected for questions than for simple statements.<br />

For example, some utterances, when given as answers to questions, do not generate all<br />

the scalar implicatures that they typically generate when asserted unprompted. 11 But<br />

9 With the right supplementary assumptions, I believe this claim can be argued to be a consequence of<br />

any of our three types of pragmatism.<br />

10 The above remarks about cases where the speakers don’t know that p also apply. I think that without<br />

a very careful story about how the norms governing assertion relate to the interests of the speaker and the<br />

audience, such cases will not tell us much about the semantics of ‘knows’. Best then to stick with cases<br />

where it is common ground that everyone, except the subject, knows that p.<br />

11 Here are a couple of illustrations of this point. Yao Ming is seven feet six inches tall. It would be odd<br />

(at best) to use (6) to describe him, but it is clearly improper to answer (7) with ‘No’.<br />

(6) Yao Ming is over six feet tall.<br />

(7) Is Yao Ming over six feet tall?<br />

For a second case, consider an example of Hart’s that Grice (1989) uses to motivate his distinction between<br />

semantics and pragmatics. A motorist drives slowly down the street, pausing at every driveway to see if



the primary ‘ordinary language’ argument for Type B pragmatism about knowledge<br />

ascriptions assumes that correct knowledge ascriptions are, by and large, true.<br />

We can put this in more theoretical terms. If we are to use these considerations<br />

to generate a rebutting defeater for Type B pragmatism, we would need an argument<br />

that Holmes’s answers are correct iff they are true. And while that step of the argument<br />

is plausible, it is not beyond contention. But that isn’t the only use of the<br />

examples. We can also use them to undercut the argument from ordinary language<br />

to Type B pragmatism. We have a wide range of cases where ordinary usage is as if<br />

Type B pragmatism is false. And these cases are not peripheral or obscure features of<br />

ordinary usage. Answering questions is one of the most common things we do with<br />

language. So ordinary usage doesn’t provide an all things considered reason to believe<br />

that Type B pragmatism is true. If ordinary usage is (or at least was) the best reason<br />

to believe Type B pragmatism, it follows that there is no good reason to believe Type<br />

B pragmatism.<br />

Objection: Sometimes when we ask Does S know that p? all we want to know is<br />

whether S has the information that p. In this mood, questions of justification are<br />

not relevant. But Type B pragmatism is a theory about the interaction between the<br />

subject’s justification and knowledge ascriptions. So these questions are irrelevant to<br />

evaluating indexicalism. 12<br />

Reply: It is plausible that there is this use of knowledge questions. It seems to me that<br />

this is a usage that needs to be explained, and isn’t easily explained on current theories<br />

of knowledge. 13 But I’ll leave discussion of that use of knowledge questions for<br />

another day. For now I’ll just note that even if this can be an explanation of why we<br />

sometimes assent to knowledge questions, it can’t be an explanation of why Holmes<br />

denies that Moriarty knows about the planned trap at Baker St Station. Holmes agrees<br />

anyone or anything is rushing out. It seems extremely odd to use (8) to describe him.<br />

(8) He drove carefully down the street.<br />

Is this oddity due to (8) being false, or it being otherwise infelicitous? Part of the argument that (8) is true,<br />

but infelicitous, is that intuitively it is correct to say ‘Yes’ in response to (9), and incorrect to say ‘No’.<br />

(9) Did he drive carefully down the street?<br />

The implicatures associated with (8) are largely absent from affirmative answers to (9), though such an<br />

answer presumably has the same truth-conditional content as (8).<br />

12 A similar view is defended by Alvin Goldman (2002).<br />

13 One explanation, due to Stephen Hetherington (2001), is that all that is ever required for a true ascription<br />

of knowledge that p to S is that p is true and S believe it. This leaves open the question, as discussed<br />

below, of why we sometimes deny that true believers are knowers, but we can bring out familiar Gricean<br />

explanations about why we might usefully deny something that is strictly speaking true, or we could<br />

follow Hetherington and offer an explanation in terms of the gradability of knowledge claims. Another<br />

possible explanation is that ‘know’ univocally means something like what most epistemologists say that<br />

it means, but there is a good pragmatic story about why speakers sometimes attribute knowledge to those<br />

with merely true belief. (I don’t know how such an explanation would go, especially if Type B pragmatism<br />

is not true.) Finally, we might follow Goldman and say the English word ‘know’ is ambiguous between<br />

a weak and strong reading, and the strong reading has been what has concerned epistemologists, and the<br />

weak reading just requires true belief. Such an ambiguity would be a small concession to Type B pragmatism<br />

(since it is the interests of the speaker that resolve ambiguities) but we could still sensibly ask whether<br />

Type B pragmatism is true of the strong reading.



that Moriarty believes there is a trap planned, but insists that, because Moriarty is ‘just<br />

guessing’, this belief does not amount to knowledge. What really needs explaining<br />

is the difference between Holmes’s two answers, and this other use of knowledge<br />

questions doesn’t seem sufficient to generate that explanation.<br />

Objection: There is no explanation offered here for the data, and we shouldn’t give<br />

up an explanatory theory without an explanation of why it fails.<br />

Reply: The best way to explain the data is to look at some differences in our attitudes<br />

towards questions containing ‘knows’ and questions containing other terms for<br />

which Type B pragmatism is plausibly true. I’ll just go over the difference between<br />

‘knows’ and ‘ready’, the point I’m making easily generalises to the other cases.<br />

Different people may be concerned with different bits of preparation, so they<br />

may (speaker) mean different things by X is ready. But neither will regard the other<br />

as making a mistake when they focus on a particular bit of preparing to talk about<br />

by saying X is ready. And this is true even if they think that the person should be<br />

thinking about something else that X is preparing.<br />

Knowledge cases are not like that. Different people may have different standards<br />

for knowledge, so perhaps they may (speaker) mean different things by A knows that<br />

p, because they will communicate that A has met their preferred standards for knowledge.<br />

But in these cases, each will regard the other as making a mistake. Standards<br />

for knowledge aren’t the kind of thing we say people can differ on without making<br />

a mistake, in the way we do (within reason) say that different people can have different<br />

immediate goals (e.g. about what to have for dinner) without making a mistake.<br />

That explains why we don’t just adopt our questioners’ standards for knowledge when<br />

answering their knowledge questions.<br />

Objection: In section 2 it was argued that questions involving ‘knows’ should have<br />

the three properties because questions involving other terms for which Type B pragmatism<br />

is true have the properties. But this argument by analogy may be flawed. All<br />

those terms are ‘indexical’ in John MacFarlane’s sense (MacFarlane, 2009). That is,<br />

the content of any utterance of them varies with the context. But not all Type B pragmatist<br />

theories are indexical, and this argument does not tell against a non-indexical<br />

Type B pragmatism.<br />

Reply: The view under consideration says that all utterances of ‘S knows that p’ express<br />

the same proposition, namely that S knows that p. But the view is Type B<br />

because it says that proposition can be true relative to some contexts and false relative<br />

to other contexts, just as temporalists about propositions say that a proposition<br />

can be true at some times and false at other times, and the utterance is true iff the<br />

proposition is true in the context of the utterance. This is a Type B view, and I agree<br />

that the argument by analogy in section 2 is powerless against it.<br />

But that argument was not the only argument that Type B views are committed<br />

to questions involving ‘knows’ having the three properties. There was also an argument,<br />

at the end of section 1, from purely theoretical considerations. It turns out this



argument is also complicated to apply here, but it does tell against this type of Type<br />

B pragmatism as well.<br />

Here are two hypotheses about how one should answer a question Does NP VP?<br />

where NP is a noun phrase and VP a verb phrase. Consider the proposition expressed<br />

by the sentence NP VP in the speaker’s context. First hypothesis: the correct answer<br />

is ‘Yes’ iff that proposition is true in the speaker’s context. Second hypothesis: the<br />

correct answer is ‘Yes’ iff that proposition is true in the respondent’s context. If<br />

propositions do not change their truth value between contexts, these two hypotheses<br />

are equivalent, but on this version of Type B pragmatism that is not so, so we have<br />

to decide between them. The best way to do this is by thinking about cases where<br />

it is common ground that propositions change their truth value between contexts,<br />

namely cases where the contexts are possible worlds. Let’s consider the scenario I<br />

briefly discussed above of the television watcher answering whatever question gets<br />

asked. Above I put the questioners in the same possible world as me, but we can<br />

imagine they are fictional characters in a different possible world. Imagine a show<br />

where it has just been revealed to the audience, but not to all the characters, that the<br />

Prime Minister is a space alien. The following happens.<br />

Character (on screen): Is the Prime Minister a space alien?<br />

Me (in real world): Yes!<br />

Since the proposition that the Prime Minister is a space alien is true in their world,<br />

but not in our world, the propriety of this response tells in favour of the first hypothesis<br />

above. Now this is not a conclusive argument, because it might be that<br />

there is some distinctive feature of fiction that is causing this answer, even though the<br />

second hypothesis is in general correct. But it seems we have reason to think that<br />

if this kind of Type B pragmatism were true, respondents would use the speaker’s<br />

context to work out what kind of answer to give. And as we saw above, this is not<br />

what respondents actually do: they either use their own context or (more likely) the<br />

subject’s.<br />
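The two hypotheses can be put in a toy computational form. This is a minimal sketch, not anything from the paper: the world names, the `pm_is_alien` proposition, and the two answer functions are all illustrative assumptions, with a context-sensitive proposition modeled simply as a function from contexts (here, possible worlds) to truth values.

```python
# Toy model of the two hypotheses about answering "Does NP VP?".
# All names and worlds here are illustrative assumptions.

worlds = {"tv_show": {"pm_is_alien": True},   # the fictional world on screen
          "actual":  {"pm_is_alien": False}}  # the respondent's world

def pm_is_alien(world):
    # A proposition whose truth value varies between contexts (worlds).
    return worlds[world]["pm_is_alien"]

def answer_h1(prop, speaker_ctx, respondent_ctx):
    # First hypothesis: 'Yes' iff the proposition is true in the SPEAKER's context.
    return "Yes" if prop(speaker_ctx) else "No"

def answer_h2(prop, speaker_ctx, respondent_ctx):
    # Second hypothesis: 'Yes' iff the proposition is true in the RESPONDENT's context.
    return "Yes" if prop(respondent_ctx) else "No"

# The character asks from the fictional world; I answer from the actual world.
print(answer_h1(pm_is_alien, "tv_show", "actual"))  # Yes (matches my "Yes!")
print(answer_h2(pm_is_alien, "tv_show", "actual"))  # No
```

The propriety of the real-world "Yes!" answer corresponds to the first function, which is the point the example in the text is making.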

Objection: At most this shows that Type B pragmatism about ‘knows’ is not actually<br />

true of English. But there might still be good epistemological reasons to adopt it as a<br />

philosophically motivated revision, even if the data from questions shows it isn’t true.<br />

So even if hermeneutic Type B pragmatism is false, revolutionary Type B pragmatism<br />

might be well motivated.<br />

Reply: It’s true that the philosophically most interesting concepts may not map exactly<br />

on to the meanings of words in natural language. (Though I think we should<br />

be careful before abandoning the concepts that have proven useful enough to get simple<br />

representation in the language.) And it’s true that there are reasons for having<br />

epistemological concepts that are sensitive to pragmatic factors. But what is hard to<br />

see is what interest we could have in having epistemological terms whose application<br />

is sensitive to the interests of the person using them. Hawthorne (2004b) provides<br />

many reasons for thinking that terms whose applications are sensitive to the interests



of the person to whom they are being applied are philosophically and epistemologically<br />

valuable. Such terms provide ways of expressing unified judgements about the<br />

person’s intellectual and practical reasoning. On the other hand, there is little to<br />

be gained by adopting Type B pragmatism. If there needs to be a revision around<br />

here, and my guess is that there does not, it should be towards Type A, not Type B,<br />

pragmatism.


Part IV<br />

Vagueness


Many Many Problems<br />

Abstract<br />

Recently four different papers have suggested that the supervaluational<br />

solution to the Problem of the Many is flawed. Stephen Schiffer (1998,<br />

2000a,b) has argued that the theory cannot account for reports of speech<br />

involving vague singular terms. Vann McGee and Brian McLaughlin<br />

(2000) say that the theory cannot yet account for vague singular beliefs.<br />

Neil McKinnon (2002) has argued that we cannot provide a plausible theory<br />

of when precisifications are acceptable, which the supervaluational<br />

theory needs. And Roy Sorensen (2000) argues that supervaluationism<br />

is inconsistent with a directly referential theory of names. McGee and<br />

McLaughlin see the problem they raise as a cause for further research,<br />

but the other authors all take the problems they raise to provide sufficient<br />

reasons to jettison supervaluationism. I will argue that none of<br />

these problems provide such a reason, though the arguments are valuable<br />

critiques. In many cases, we must make some adjustments to the<br />

supervaluational theory to meet the posed challenges. The goal of this<br />

paper is to make those adjustments, and meet the challenges.<br />

1 Schiffer’s Problem<br />

Stephen Schiffer suggests the following argument refutes supervaluationism. The<br />

central point is that, allegedly, the supervaluational theory of vague singular terms<br />

says false things about singular terms in speech reports.<br />

Pointing in a certain direction, Alice says to Bob, ‘There is where Harold<br />

and I first danced the rumba.’ Later that day, while pointing in the same<br />

direction, Bob says to Carla, ‘There is where Alice said she and Harold<br />

first danced the rumba.’ Now consider the following argument:<br />

(1) Bob’s utterance was true.<br />

(2) If the supervaluational semantics were correct, Bob’s utterance<br />

wouldn’t be true.<br />

(3) ∴ The supervaluational semantics isn’t correct. (Schiffer, 2000a,<br />

321)<br />

Assuming Bob did point in pretty much the same direction as Alice, it seems implausible<br />

to deny (1). The argument is valid. So the issue is whether (2) is correct.<br />

Schiffer has a quick argument for (2), which I will paraphrase here. On supervaluational<br />

semantics, a sentence is true iff it is true on each of its acceptable precisifications.<br />

In this case, this means that if Bob’s utterance is true then it must be true however<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical Quarterly 53 (2003): 481-501.


Many Many Problems 412<br />

we precisify ‘there’. Each precisification of ‘there’ will be a (precise) place, and since<br />

‘there’ is rather vague, many of these precisifications will be acceptable. For Bob’s<br />

utterance to be true, then, Alice must have said of every one of those places that it<br />

was the place where Harold and her first danced the rumba. But Alice couldn’t have<br />

said all those things, so (2) is true.<br />

Schiffer suggests that one way out of this problem would be to accept the existence<br />

of a vague object: the place where Harold and Alice first danced the rumba. I<br />

will note in section four several reasons for thinking the cost of this move is excessive.<br />

Fortunately, there is a cheaper way home.<br />

Schiffer underestimates the scope of supervaluationism. On Schiffer’s vision of<br />

the theory, a precisification assigns a precise content to a word, and hence to a sentence,<br />

and then the world determines whether that content is satisfied, and hence whether<br />

the sentence is true on that precisification. This is hardly an unorthodox view of<br />

how supervaluationism works; it seems, for instance, to be exactly the view defended<br />

in Keefe (2000), but it is neither the only way, nor the best way, forward. We could<br />

say, rather, that a precisification assigns content to every linguistic token in the world,<br />

and the truth conditions of every one of these tokens is then determined relative to<br />

that global assignment of content. So if a precisification P assigns a place x to Bob’s<br />

word ‘there’, Bob’s utterance is true according to that precisification iff P also assigns<br />

x to Alice’s utterance of ‘there’. That is, Bob’s utterance is true according to P iff the<br />

precisification of his words by P just is what Alice said according to P. 1<br />
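The global-precisification proposal can be sketched in toy computational terms. Everything below is an illustrative assumption rather than anything in the paper: the candidate places, the token names, and the idea of representing a global precisification as a single assignment covering both tokens of ‘there’ at once.

```python
# Toy model: a "global" precisification assigns a precise content to EVERY
# token at once, so penumbral connections across utterances come for free.
# All names here are illustrative assumptions.

# Candidate precise places for the vaguely demonstrated region.
places = ["p1", "p2", "p3"]

# Each global precisification P maps both tokens of 'there' to one place:
# the penumbral connection is built into the assignment itself.
precisifications = [{"alice_there": p, "bob_there": p} for p in places]

def bob_true_on(p):
    # Bob's report is true on P iff P precisifies his 'there' as the very
    # place that, according to P, Alice was talking about.
    return p["bob_there"] == p["alice_there"]

def supertrue(sentence_true_on):
    # Supervaluationist truth: true on every acceptable precisification.
    return all(sentence_true_on(p) for p in precisifications)

print(supertrue(bob_true_on))  # True: Bob's report comes out supertrue

# Contrast a "local" theory that precisifies each token independently:
local = [{"alice_there": a, "bob_there": b} for a in places for b in places]
print(all(bob_true_on(p) for p in local))  # False: Schiffer's problem returns
```

The contrast at the end shows why widening the scope of precisifications matters: precisifying the two tokens independently lets them come apart, and Bob’s report fails to be supertrue.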

It is a dramatic widening of the scope of precisifications to claim that they assign<br />

content to every linguistic token in the world, rather than just words in the<br />

sentence under consideration, but it can be justified. 2 Consider how we would react<br />

if later in the day, pointing in the crucial direction, Alice said, ‘Harold and I never<br />

danced the rumba there.’ We would think that Alice had contradicted herself – that<br />

between her two statements she must have said something false. A standard supervaluationist<br />

account, where sentences are precisified one at a time, cannot deliver<br />

this result. On such a view, it might be that each of Alice’s utterances is true on<br />

some precisifications, so they are both neither true nor false. On my theory, each<br />

precisification applies to both of Alice’s utterances (as well as every other utterance<br />

ever made) and since on each precisification one or other of the utterances is false,<br />

it turns out supertrue that Alice said something false, as desired. The current view<br />

allows for penumbral connections between sentences, as well as penumbral connections<br />

within sentences. Just as someone who says, “That is red and orange” says<br />

something false, my theory decrees that someone who says, “That is red. That is<br />

orange,” while pointing at the same thing says something false, even if the object is in<br />

the vague area ‘between’ red and orange.<br />

It is crucial for this response to work that on every precisification, Alice and<br />

Bob’s demonstratives are co-referential. It does not seem like a problematic expansion<br />

of supervaluational theory to posit this as a penumbral connection between the two<br />

1 Following Schiffer, we ignore the vagueness in ‘is where Harold and I first danced the rumba.’ This<br />

phrase is vague, but its vagueness raises no extra issues of philosophical importance.<br />

2 Thanks to John Hawthorne for the following argument.



words. At least, it seems plausible enough to do this if Alice and Bob really are pointing<br />

in a similar direction. If their demonstrations are only roughly co-directional,<br />

then on some precisifications they may well pick out different objects. This will definitely<br />

happen if some admissible precisification of Alice’s ‘there’ is not an admissible<br />

precisification of Bob’s ‘there’. In such a case, the theory here predicts that Bob’s<br />

utterance will be indeterminate in truth value. But if Alice and Bob only vaguely<br />

pointed in the same direction this is the correct prediction.<br />

2 Natural Properties<br />

Schiffer’s problem seems to have been solved with a minimum of fuss, but there is<br />

still a little work to do. Above I posited a penumbral connection between Alice’s<br />

and Bob’s words without explaining how such a connection could arise. This connection<br />

can be explained by some general considerations about content, considerations<br />

closely tied to the view of vagueness as semantic indecision that provides the<br />

best motivation for supervaluationism. As a few writers have pointed out (Quine,<br />

1960; Putnam, 1980; Kripke, 1982), there is not enough in our dispositions to use<br />

words to fix a precise content for all terms in our lexicon. This does not immediately imply<br />

a thorough-going content scepticism because, as a few writers have also pointed<br />

out (Putnam, 1973; Kripke, 1980; Lewis, 1983c, 1984a), meanings ain’t (entirely) in<br />

the head. Sometimes our words refer to a particular property or object rather than<br />

another not because our dispositions make this so, but because of some particular<br />

feature of that property or object. David Lewis calls this extra feature ‘naturalness’:<br />

some properties and objects are more natural than others, and when our verbal dispositions<br />

do not discriminate between different possible contents, naturalness steps<br />

in to finish the job and the more natural property or object gets to be the content.<br />

Well, that’s what happens when things go well. Vagueness happens when things<br />

don’t go well. Sometimes our verbal dispositions are indiscriminate between several<br />

different contents, and no one of these is more natural than all the rest. In these<br />

cases there will be many unnatural contents not eliminated by our dispositions that<br />

naturalness does manage to eliminate, but there will still be many contents left<br />

uneliminated. Consider, for example, all the possible properties we might denote by<br />

‘tall woman’. As far as our usage dispositions go, it might denote any one of the following<br />

properties: woman taller than 1680mm, woman taller than 1681mm, woman<br />

taller than 1680.719mm, etc. And it does not seem that any of these properties are<br />

more natural than any other. Hence there is no precise fact about what the phrase<br />

denotes. Hence it is vague. In sum, our dispositions are never enough to settle the<br />

content of a term. In some cases, such as ‘water’, ‘rabbit’, ‘plus’, ‘brain’ and ‘vat’, nature<br />

is kind enough to, more or less, finish the job. In others it is not, and vagueness<br />

is the result.<br />

(The above reasoning has a surprising consequence. Perhaps our verbal dispositions<br />

are consistent with the predicate Tall X denoting the property of being in the<br />

top quartile of Xs by height. Unlike each of the properties mentioned in the text, this<br />

is a more natural property than many of its competitors. So if this kind of approach<br />

to vagueness is right, there might not be quite as much vagueness as we expected.)



If this is how vagueness is created, then there is a natural way to understand how<br />

precisifications remove vagueness. Vagueness arises because more natural than is a<br />

partial order on putative contents, and hence there might be no most natural content<br />

consistent with our verbal dispositions. If only this relation defined a total ordering,<br />

so that whatever the candidate meanings were, one of them would be most natural,<br />

vagueness might be defeated. Well, that isn’t true in reality, but it is true on each precisification.<br />

Every precisification is a completion of the ‘naturalness’ partial order.<br />

That is, each precisification P defines a total order, more natural-P than, on possible<br />

contents of terms such that o1 is more natural-P than o2 if (but not only if) o1 is more<br />

natural than o2. The particular contents of terms according to P are then defined by<br />

using the more natural-P than relation where the more natural than relation is used<br />

in the real theory of content.<br />
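The idea that each precisification completes the naturalness partial order can also be sketched computationally. This is only a toy: the candidate contents, the empty set of real naturalness facts, and the helper names are all illustrative assumptions, and a completion is modeled as a linear extension (an ordering of the candidates that respects whatever strict naturalness facts there are).

```python
# Sketch: precisifications as completions (linear extensions) of the
# 'more natural than' partial order. All names are illustrative assumptions.
from itertools import permutations

# Candidate contents for 'tall woman'; none is more natural than another.
candidates = {"tall woman": ["taller than 1680mm", "taller than 1681mm"]}

# The real partial order: no strict naturalness facts among these candidates.
# (For 'water', nature would supply pairs here and settle the content.)
really_more_natural = set()

def completions(items):
    # Each precisification totalizes the order: every ordering of the
    # candidates that respects the real partial order counts.
    for perm in permutations(items):
        if all((b, a) not in really_more_natural
               for i, a in enumerate(perm) for b in perm[i + 1:]):
            yield perm  # perm[0] is the most natural-P candidate

def content_on_each_precisification(term):
    # The content of a term on P is its most natural-P candidate.
    return {perm[0] for perm in completions(candidates[term])}

# Vagueness: different completions crown different candidates, so the
# term has different contents on different precisifications.
print(content_on_each_precisification("tall woman"))
# a set containing both candidates -> the term is vague
```

When `really_more_natural` rules out all but one candidate, every completion crowns the same content and the vagueness disappears, which mirrors the ‘water’ and ‘plus’ cases in the text.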

This conjecture meshes nicely with my theory of the role of precisifications.<br />

First, it explains why precisifications apply to the whole of language. Since a precisification<br />

does not just remedy a defect in a particular word, but a defect in the<br />

content generation mechanism, precisifications are most naturally applied not just<br />

to a single word, but to every contentful entity. Secondly, it explains why we have<br />

the particular penumbral connections we actually have. Recall that it was left a little<br />

unexplained above why Alice’s and Bob’s use of ‘there’ denoted the same precise<br />

place. On the current conjecture, Alice’s term refers to a particular place x according<br />

to P because x is more natural-P than all the other places to which Alice might<br />

have referred. If this is so, then x will be more natural-P than all the other places<br />

to which Bob might have referred, so it will also be the referent according to P of<br />

Bob’s ‘there’. Hence according to every precisification, Bob’s utterance will be true,<br />

as Schiffer required.<br />

We can also explain some other unexplained penumbral connections by appeal<br />

to naturalness. Consider the sentence David Chalmers is conscious. Unless this is supertrue,<br />

supervaluationism is in trouble. It is vague just which object is denoted by<br />

David Chalmers. On every precisification, there are other objects that massively overlap<br />

David Chalmers. Indeed, these very objects are denoted by ‘David Chalmers’ on<br />

other precisifications. These objects are not conscious, since if one were there would<br />

be two conscious objects where, intuitively, there is just one. But each of these rogue<br />

objects must be in the extension of ‘conscious’ on the precisifications where it is the<br />

denotation of ‘David Chalmers’. So ‘conscious’ must be vague in slightly unexpected<br />

ways, and there must be a penumbral connection between it and ‘David Chalmers’:<br />

on every precisification, whatever object is denoted by that name is in the extension<br />

of ‘conscious’, while no other potential denotatum of ‘David Chalmers’ is in the extension.<br />

How is this penumbral connection to be explained? Not by appeal to the<br />

meanings of the terms! Even if ‘David Chalmers’ has descriptive content, it is highly<br />

implausible that this includes being conscious. (After all, unless medicine improves<br />

a bit in a thousand years Chalmers will not be conscious.) Rather, this penumbral<br />

connection is explained by the fact that the very same thing, naturalness, is used in<br />

resolving the vagueness in the terms ‘conscious’ and ‘David Chalmers’. If the precisification<br />

makes one particular possible precisification of ‘David Chalmers’, say d1,<br />

more natural than another, d2, then it will make properties satisfied by d1 more



natural than those satisfied by d2, so every precisification will make the denotation<br />

of ‘David Chalmers’ fall into the extension of ‘conscious’.<br />

We can say the same thing about Alice’s original statement: That is where Harold<br />

and I first danced the rumba. Since one can’t first dance the rumba with Harold in two<br />

different places, it seems Alice’s statement can’t be true relative to more than one<br />

precisification of ‘That’. But really the phrase after ‘is’ is also vague, and there is a<br />

penumbral connection (via naturalness) between it and the demonstrative. Hence we<br />

can say Alice’s statement is supertrue without appealing to any mysterious penumbral<br />

connections.<br />

3 McGee and McLaughlin’s Challenge<br />

Vann McGee and Brian McLaughlin (2000) raise a challenge for supervaluational approaches<br />

to the Problem of the Many that uses belief reports in much the way that<br />

Schiffer’s problem uses speech reports. They fear that without further development,<br />

the supervaluational theory cannot distinguish between the de re and de dicto readings<br />

of (4).<br />

(4) Ralph believes that there is a snow-capped mountain within sight of the equator.<br />

They claim, correctly, that (4) should have both a de dicto reading and a de re reading,<br />

where in the latter case it is a belief about Kilimanjaro. The problem with the latter<br />

case is that it is unclear how Ralph’s belief can be about Kilimanjaro itself. To press the point,<br />

they consider an atom at or around the base of Kilimanjaro, called Sparky, and define<br />

“Kilimanjaro(+) to be the body of land constituted . . . by the atoms that make up<br />

Kilimanjaro together with Sparky [and] Kilimanjaro(-) [to] be the body of land constituted<br />

. . . by the atoms that make up Kilimanjaro other than Sparky.” (129) The<br />

problem with taking (4) to be true on a de re reading is that “there isn’t anything,<br />

either in his mental state or in his neural state or in his causal relations with his environment<br />

that would make one of Kilimanjaro(+) and Kilimanjaro(-), rather than<br />

the other, the thing that Ralph’s belief is about.” (146) So if the truth of (4) on a de<br />

re reading requires that Ralph believes a singular, or object-dependent, proposition,<br />

about one of Kilimanjaro(+) and Kilimanjaro(-), then (4) cannot be true. Even worse,<br />

if the truth of (4) requires that Ralph believes both a singular proposition<br />

about Kilimanjaro(+), that it is a snow-capped mountain within sight of the equator,<br />

and the same proposition about Kilimanjaro(-), then given some knowledge about<br />

mountains on Ralph’s part, (4) cannot be true, because that would require Ralph to<br />

mistakenly believe there are two mountains located roughly where Kilimanjaro is<br />

located.<br />

We should not be so easily dissuaded. It is hard to identify exactly which features<br />

of Ralph’s “mental state or neural state or causal relations with his environment”<br />

that make it the case that he believes that two plus two equals four, but does not<br />

believe that two quus two equals four. (I assume Ralph is no philosopher, so lacks<br />

the concept QUUS.) I doubt, for example, that the concept PLUS has some causal



influence over Ralph that the concept QUUS lacks. But Ralph does have the belief<br />

involving PLUS, and not the belief involving QUUS. He has this belief not merely in<br />

virtue of his mental or neural states, or his causal interactions with his environment,<br />

but in virtue of the fact that PLUS is a more natural concept than QUUS, and hence<br />

is more eligible to be a constituent of his belief.<br />

So if Kilimanjaro(+) is more natural than Kilimanjaro(-), it will be a constituent<br />

of Ralph’s belief, despite the fact that there is no other reason to say his belief is about<br />

one rather than the other. Now, in reality Kilimanjaro(+) is no more natural than<br />

Kilimanjaro(-). But according to any precisification, one of them will be more natural<br />

than the other, for precisifications determine content by determining relative naturalness.<br />

Hence if Ralph has a belief with the right structure, in particular a belief with<br />

a place for an object (roughly, Kilimanjaro) and the property being within sight of the<br />

equator, then on every precisification he has a singular belief that a Kilimanjaro-like<br />

mountain is within sight of the equator. And notice that since naturalness determines<br />

both mental content and verbal content, on every precisification the constituent of<br />

that belief will be the referent of ‘Kilimanjaro’. So even on a de re reading, (4) will be<br />

true.<br />

Schiffer’s problem showed that we should not take precisifications to be defined<br />

merely over single sentences. McGee and McLaughlin’s problem shows that we<br />

should take precisifications to set the content not just of sentences, but of mental<br />

states as well. Precisifications do not just assign precise content to every contentful<br />

linguistic token, but to every contentful entity in the world, including beliefs. This<br />

makes the issue of penumbral connections that we discussed in section two rather<br />

pressing. We already noted the need to establish penumbral connections between<br />

separate uses of demonstratives. Now we must establish penumbral connections between<br />

words and beliefs. The idea that precisifications determine content by determining<br />

relative naturalness establishes these connections.<br />

To sum up, McGee and McLaughlin raise three related problems concerning de re<br />

belief. Two of these concern belief reports. First, how can we distinguish between de<br />

re and de dicto reports? If I am right, we can distinguish between these just the way<br />

Russell suggested, by specifying the scope of the quantifiers. McGee and McLaughlin<br />

suspect this will not work because in general we cannot argue from (5) to (6), given<br />

the vagueness of ‘Kilimanjaro’.<br />

(5) Kilimanjaro is such that Ralph believes it to be within sight of the equator.<br />

(6) There is a mountain such that Ralph believes it to be within sight of the equator.<br />

Whether or not we want to accept a semantics in which we must restrict existential<br />

generalisation in this way as a general rule, we can give an independent argument<br />

that (6) is true whenever (4) is true on a de re reading (i.e. whenever (5) is true). The<br />

argument is just that on every precisification, the subject of Ralph’s salient singular<br />

belief is a mountain, so (6) is true on every precisification. This argument assumes<br />

that there is a penumbral connection between the subject of this belief, as we might



say the referent of ‘Kilimanjaro’ in his language of thought 3 , and the word ‘mountain’.<br />

But since we have already established that there is such a connection between<br />

‘Kilimanjaro’ in his language of thought and ‘Kilimanjaro’ in public language, and<br />

there is obviously a connection between ‘Kilimanjaro’ in public language and the<br />

word ‘mountain’, as ‘Kilimanjaro is a mountain’ is supertrue, this assumption is safe.<br />

So the second puzzle McGee and McLaughlin raise, how it can be that the relevant<br />

de re reports can be true, has also been addressed.<br />

There is a third puzzle McGee and McLaughlin raise that the reader might think I<br />

have not addressed. How can it be that Ralph can actually have a de re belief concerning<br />

Kilimanjaro? I have so far concentrated on belief reports, not on beliefs themselves,<br />

and my theory has relied crucially on correlations between the vagueness in these<br />

reports and the vagueness in the underlying belief. It might be thought that I have<br />

excluded the most interesting case, the one where Ralph has a particular belief with<br />

Kilimanjaro itself as a constituent. While I will end up denying Ralph can have such<br />

a belief, I doubt this is a problematic feature of my view. The theory outlined here denies<br />

that Ralph has object–dependent beliefs, but not that he has de re beliefs. I deny<br />

that Ralph has a belief that has Kilimanjaro(+) as a constituent, but it is hard to see<br />

how Ralph could have such a belief, since it is very hard to see how he could have had<br />

a belief that has Kilimanjaro(+) rather than Kilimanjaro(-) as its subject. (This was<br />

McGee and McLaughlin’s fundamental point.) If we think that having a de re belief<br />

implies having a belief whose content is an object–dependent proposition, then we<br />

must deny that there are de re beliefs about Kilimanjaro. Since there is no object that<br />

is determinately a constituent of the proposition Ralph believes, it is a little hard to<br />

maintain that he believes an object–dependent proposition. 4 But this is not the only<br />

way to make sense of de re beliefs.<br />

Robin Jeshion has argued that whether a belief is de re depends essentially on its<br />

role in cognition. “What distinguishes de re thought is its structural or organisational<br />

role in thought” (Jeshion, 2002, 67). 5 I won’t rehearse Jeshion’s arguments here, just<br />

their more interesting conclusions. We can have de re beliefs about an object iff we<br />

have a certain kind of mental file folder for the object. This folder need not be generated<br />

by acquaintance with the object, so acquaintanceless de re belief is possible.<br />

Indeed, the folder could have been created defectively, so there is no object that the<br />

information in the folder is about. 6 In this case, the contents of the folder are subjectless<br />

de re beliefs. Jeshion doesn’t discuss this, but presumably the folder must not<br />

have been created purely to be the repository for information about the bearer of a<br />

certain property, whoever or whatever that is. We have to rule out this option if we<br />

follow Szabó (2000) in thinking the folder metaphor plays a crucial role in explaining<br />

3 I do not mean here to commit myself to anything like the language of thought hypothesis. This is just<br />

being used as a convenient shorthand.<br />

4 This is hard, but not perhaps impossible. One might say that on every precisification, Ralph believes<br />

a proposition that has a mountain as a constituent, and hence as an essential part.<br />

5 I don’t know if Jeshion would accept the corollary that if belief is too unstructured to allow for the<br />

possibility of such organisational roles, then there is no de re belief, but I do.<br />

6 Which is not just to say that there is no object that has all the properties in the folder. This is neither<br />

necessary nor sufficient for the folder to be about the object, as Kripke’s discussion of ‘famous deeds’<br />

descriptivism should make clear.



our talk and thought involving descriptions. Provided the folder was created with the<br />

intent that it record information about some object, rather than merely information<br />

about whatever object has a particular property, its contents are de re beliefs. (To allow<br />

for distinct folders ‘about’ non-existent objects, we must allow that it is possible<br />

that such folders do have their reference fixed by their contents, but as long as this<br />

was not the intent in creation these folders can suffice for de re belief. This point<br />

lets us distinguish between my folder for Vulcan and my folder for The planet causing<br />

the perturbations of Mercury. Both are individuated by the fact that they contain the<br />

proposition This causes the perturbations of Mercury. It is this feature of the folder<br />

that fixes their reference, or in this case their non-reference. Only in the latter case,<br />

however, was this the intent in creating the folder, so its contents are de dicto beliefs,<br />

while the contents of the former are de re beliefs.)<br />

Now we have the resources to show how Ralph can have de re beliefs concerning<br />

Kilimanjaro. When Ralph hears about it, or sees it, he opens a file folder for Kilimanjaro.<br />

This is not intended to merely be a folder for the mountain he just heard about,<br />

or saw. It is intended to be a folder for that. (Imagine here that I am demonstrating<br />

the mountain in question.) The Kripkenstein point about referential indeterminacy<br />

applies to folders as much as to words. This point is closely related to Kripke’s insistence<br />

that his indeterminacy argument does not rely on behaviourism. So if Ralph’s<br />

folder is to have a reference, it must be fixed in part by the naturalness of various<br />

putative referents. But that is consistent with Ralph’s folder containing de re beliefs,<br />

since unless Ralph is a certain odd kind of philosopher, he will not have in his folder<br />

that Kilimanjaro is peculiarly eligible to be a referent. So the referent of the folder is<br />

not fixed by its contents (as the referent for a folder about The mountain over there,<br />

whatever it is, would be, or as the referent for a folder about The natural object over<br />

there, whatever it is, would be), and the contents of this folder are still de re beliefs<br />

Ralph has about Kilimanjaro. This was a bit roundabout, but we have seen that the<br />

Problem of the Many threatens neither the possibility that Ralph is the subject of<br />

true de re belief ascriptions, nor that he actually has de re beliefs.<br />

4 Vague Objects<br />

“I think the principle that to be is to be determinate is a priori, and hence<br />

that it is a priori that there is no de re vagueness”. (Jackson, 2001, 657-658)<br />

So do I. I also think there are a few arguments for this claim, though some of them<br />

may seem question-begging to the determined defender of indeterminate objects.<br />

Most of these arguments I will just mention, since I assume the reader has little desire<br />

to see them detailed again. One argument is just that it is obvious that there is no<br />

de re vagueness. Such ‘arguments’ are not worthless. The best argument that there<br />

are no true contradictions is of just this form, as Priest (1998) shows. And it’s a good<br />

argument! Secondly, Russell’s point that most arguments for de re vagueness involve<br />

confusing what is represented with its representation still seems fair (Russell, 1923).<br />

Thirdly, even though the literature on this is rather large, it still looks like the


Many Many Problems 419<br />

Evans-Salmon argument against vague identities works, at least under the interpretation<br />

David Lewis gives it, and this makes it hard to see how there could be vague objects<br />

(Evans, 1978; Salmon, 1981; Lewis, 1988d). Fourthly, Mark Heller (1996) argues<br />

that we have to allow that referential terms are semantically vague. He says we have<br />

to do so to explain context dependence but there are a few other explanatory projects<br />

that would do just as well. Since semantic conceptions of vagueness can explain all<br />

the data that are commonly taken to support ontological vagueness, it seems theoretically<br />

unparsimonious to postulate ontological vagueness too. That’s probably<br />

enough, but let me add one more argument to the mix. Accepting that Kilimanjaro<br />

is a vague material object distinct from both Kilimanjaro(+) and Kilimanjaro(-) has<br />

either metaphysical or logical costs. To prove this, I derive some rather unpleasant<br />

metaphysical conclusions from the assumption that Kilimanjaro is vague. The proofs<br />

will use some contentious principles of classical logic, but rejecting those, and hence<br />

rejecting classical logic, would be a substantial logical cost. The most contentious<br />

such principle used will be an instance of excluded middle: Sparky is or is not a part<br />

of Kilimanjaro. I also assume that if, for all x other than Sparky, x is a part of y<br />

iff it is a part of z, then if Sparky is part of both y and z, or part of neither y nor z,<br />

then y and z coincide. If someone can contrive a mereological theory that rejects this<br />

principle, it will be immune to these arguments.<br />

It is very plausible that material objects are individuated by the materials from<br />

which they are composed, so any coincident material objects are identical. Properly<br />

understood, that is a good account of what it is to be material. The problem is<br />

getting a proper understanding. Sider (1996) interprets it as saying that no two non-identical<br />

material objects coincide right now. His project ends up running aground<br />

over concerns about sentences involving counting, but the project of finding a strong<br />

interpretation of the principle is intuitively compelling. David Lewis (1986b, Ch.<br />

4) defends a slightly weaker version: no two non-identical material objects coincide<br />

at all times. Call this the strong composition principle (scp). The scp is (classically)<br />

inconsistent with the hypothesis that Kilimanjaro is vague. If Sparky is part of Kilimanjaro,<br />

then Kilimanjaro and Kilimanjaro(+) always coincide. If Sparky is not part<br />

of Kilimanjaro then Kilimanjaro and Kilimanjaro(-) always coincide. Either way, two<br />

non-identical objects always coincide, which the scp does not allow.<br />
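The case argument just given can be set out schematically. Writing s for Sparky, K, K+ and K- for Kilimanjaro, Kilimanjaro(+) and Kilimanjaro(-), and P(x, y) for ‘x is a part of y’ (notation introduced purely for this summary, not part of the official argument):<br />

```latex
\begin{align*}
1.\ & P(s,K) \lor \neg P(s,K) && \text{excluded middle}\\
2.\ & P(s,K) \to \forall t\,\bigl(P(t,K) \leftrightarrow P(t,K^{+})\bigr) && \text{$K$ and $K^{+}$ differ at most over $s$}\\
3.\ & \neg P(s,K) \to \forall t\,\bigl(P(t,K) \leftrightarrow P(t,K^{-})\bigr) && \text{$K$ and $K^{-}$ differ at most over $s$}\\
4.\ & K \neq K^{+} \text{ and } K \neq K^{-} && \text{hypothesis that $K$ is vague}\\
5.\ & \text{two non-identical objects always coincide} && \text{from 1--4, contradicting the scp}
\end{align*}
```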

Some think the scp is refuted by Gibbard’s example of Lumpl and Goliath (Gibbard,<br />

1975). The most natural response to Gibbard’s example is to weaken our individuation<br />

principle again, this time to: no two non-identical material objects coincide<br />

in all worlds at all times. Call this the weak compositional principle (wcp). Since there<br />

are worlds in which Goliath is composed of bronze, but Lumpl is still a lump of clay<br />

in those worlds, Lumpl and Goliath do not refute the wcp. Some may think that<br />

even the wcp is too strong,7 but most would agree that if vague objects violated the<br />

wcp, that would be a reason to believe they don’t exist.<br />

Given a plausible metaphysical principle, which I call Crossover, vague objects<br />

will violate the wcp. As shown above, Kilimanjaro actually (always) coincides with<br />

7 Kit Fine (1994) does exactly this.



Kilimanjaro(+) or Kilimanjaro(-), but is not identical with either. Crossover is the<br />

following principle:<br />

Crossover For any actual material objects x and y there is an object z that coincides<br />

with x in the actual world and y in all other worlds.<br />

Given that arbitrary fusions exist, Crossover is entailed by, but does not entail, the<br />

doctrine of arbitrary modal parts: that for any object o and world w, if o exists in<br />

w then o has a part that only exists in w. But Crossover does not have the most<br />

surprising consequence of the doctrine of arbitrary modal parts: that for any object<br />

o there is an object that has essentially all the properties o actually has.<br />

Let K1 be the object that coincides with Kilimanjaro in this world and Kilimanjaro(+)<br />

in all other worlds. Let K2 be the object that coincides with Kilimanjaro in<br />

this world and Kilimanjaro(-) in all other worlds. If Sparky is part of Kilimanjaro<br />

then K1 and Kilimanjaro(+) coincide in all worlds, but they are not identical, since<br />

it is determinate that Sparky is actually part of Kilimanjaro(+) and not determinate<br />

that it is part of K1. If Sparky is not part of Kilimanjaro then K2 and Kilimanjaro(-)<br />

coincide in all worlds, but they are not identical, since it is determinate that Sparky<br />

is not actually part of Kilimanjaro(-) and not determinate that it is not part of K2. Either<br />

way, we have a violation of the wcp. So the following three claims are (classically)<br />

inconsistent.<br />

(a) Crossover.<br />

(b) The wcp.<br />

(c) Kilimanjaro is a vague object that indeterminately has Sparky as a part.<br />
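The route to the inconsistency can be compressed as follows, writing Det for ‘determinately’ (again, the symbols are mine, introduced only to summarise the reasoning above):<br />

```latex
\begin{align*}
&\text{If } P(s,K):\ K_{1} \text{ coincides with } K^{+} \text{ in every world, yet } \mathrm{Det}\,P(s,K^{+})\\
&\quad \text{and } \neg\mathrm{Det}\,P(s,K_{1}), \text{ so } K_{1} \neq K^{+}.\\
&\text{If } \neg P(s,K):\ K_{2} \text{ coincides with } K^{-} \text{ in every world, yet } \mathrm{Det}\,\neg P(s,K^{-})\\
&\quad \text{and } \neg\mathrm{Det}\,\neg P(s,K_{2}), \text{ so } K_{2} \neq K^{-}.\\
&\text{Either way, two non-identical objects coincide in all worlds, violating the wcp.}
\end{align*}
```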

I think the first two are highly plausible, so accepting (c) is costly. I already noted the<br />

plausibility of the wcp, so the focus should be on Crossover. On Lewis’s account of<br />

modality, it is clearly true, as is the stronger doctrine of arbitrary modal parts. On<br />

a fictionalist theory of modality based on Lewis’s account, it is still true, or at least<br />

true in the fiction that we must adopt to make sense of modal talk. So the principle<br />

is not without merits. And dialectically, opposing Crossover will be problematic for<br />

the believer in vague objects. Either an object’s modal profile is determined by its<br />

categorical properties or it isn’t. If it is, then the wcp will entail the scp, so by the<br />

above reasoning vague objects will be inconsistent with the wcp. If it is not, then it<br />

is hard to see why an object could not have a completely arbitrary modal profile, say<br />

the profile of some other ordinary material object. But that means Crossover is true,<br />

and again we cannot have both the wcp and vague objects. Probably the best way<br />

out for the believer in vague objects will be to short-circuit this reasoning by abandoning<br />

classical logic, presumably by declining to endorse the version of excluded<br />

middle with which I started. But that is undoubtedly a costly move, particularly for<br />

a supervaluationist.



5 McKinnon on Coins and Precisifications<br />

Most of our discussions of the Problem of the Many relate to the vagueness in a single<br />

singular term, and a single ordinary object. As McKinnon reminds us, however, there<br />

is not just one mountain in the world, there are many of them, and supervaluationists<br />

are obliged to say plausible things about statements that are about many mountains.<br />

Or, to focus on McKinnon’s example, we must not only have a plausible theory of<br />

coins, but of coin exhibitions. These do raise distinctive problems. Imagine we have<br />

an exhibition with, as we would ordinarily say, 2547 coins, each numbered in the catalogue.<br />

So to each number n there correspond millions of coin-like entities, coin*s<br />

in Sider’s helpful phrase (Sider, 2001b), and each precisification assigns a coin* to a<br />

number. In general, Sider holds that something is an F* iff it has all the properties<br />

necessary and sufficient for being an F except the property of not massively overlapping<br />

another F. There are some interesting questions about how independent these<br />

assignments can be. If one precisification assigns coin* c1 to n1, and another assigns<br />

coin* c2 to n2 (distinct from n1), then is there guaranteed to be a precisification that<br />

assigns both c1 to n1 and c2 to n2? In other words, may the precisifications of each numeral<br />

(construed as a coin denotation) be independent of each other? The following<br />

example suggests not. Say Cj is the set of coin*s that are possible precisifications of<br />

j. This set may be vague because of higher–order vagueness, but set those difficulties<br />

aside. If every member of C1728 has a duplicate in C1729, then presumably only precisifications<br />

that assigned duplicates to ‘1728’ and ‘1729’ would be admissible. If the<br />

exhibition has two Edward I pennies on display to show the obverse and reverse, and<br />

miraculously these coins are duplicates, such a situation will arise.<br />

This case is fanciful, so we don’t know whether in reality the precisifications of<br />

the numerals are independent. We probably can’t answer this question, but this is no<br />

major concern. McKinnon has found a question which the supervaluationist should<br />

feel a need to answer, but to which neither answer seems appropriate. Say that a<br />

precisification is principled iff there is some not-too-disjunctive property F such that<br />

for each numeral n, the precisification assigns to n the F-est coin* in Cn. If F does<br />

not come in degrees, then the precisification assigns to n the F in Cn. McKinnon’s<br />

question to the supervaluationist is: Are all precisifications principled? He aims to<br />

show that answering either ‘yes’ or ‘no’ gets the supervaluationist into trouble. ‘Yes’ leads<br />

to there being too few precisifications; ‘No’ leads to there being too many. Let us<br />

look at these in order.<br />
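The definition of a principled precisification can be pictured with a toy formal model. Everything concrete below (the candidate sets and the scoring function standing in for F) is an illustrative assumption of mine, not anything drawn from McKinnon:<br />

```python
# Toy model of a "principled precisification" (illustrative only).
# Assumptions: C maps each catalogue numeral to its set of candidate
# coin*s, and the not-too-disjunctive property F is modelled as a
# score, so "the F-est coin* in C[n]" is the candidate maximising F.

def principled_precisification(C, F):
    """Assign to each numeral n the F-est coin* in C[n]."""
    return {n: max(candidates, key=F) for n, candidates in C.items()}

# Hypothetical candidates, each represented as a (mass, diameter) pair.
C = {
    1728: [(3.10, 20.0), (3.20, 20.1)],
    1729: [(3.10, 20.0), (3.05, 19.9)],
}
mass = lambda coin: coin[0]  # F = "most massive candidate"

assignment = principled_precisification(C, mass)
# assigns (3.20, 20.1) to 1728 and (3.10, 20.0) to 1729
```

An unprincipled precisification, by contrast, is any assignment of a member of C[n] to each n that no single such F generates.<br />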

I have little to say for now on the first horn of this dilemma. McKinnon’s survey<br />

of principled precisifications only considers cases where F is intrinsic, and I postpone<br />

for now investigation of extrinsic principles. Nevertheless, he does show that if F<br />

must be intrinsic, then there are not enough principled precisifications to generate all<br />

the indeterminacy our coin exhibit intuitively displays. The other horn is trickier.<br />

A precisification must not only assign a plausible coin* to each numeral, it must<br />

do so in such a way that respects penumbral connections. McKinnon thinks that<br />

unprincipled, or arbitrary, precisifications will violate (NAD) and (NAS).



Non-Arbitrary Differences (NAD) For any coin and non-coin, there is a principled<br />

difference between them which forms the basis for one being a coin and<br />

the other being a non-coin.<br />

Non-Arbitrary Similarities (NAS) For any pair of coins, there is a principled similarity<br />

between them which forms the basis for their both being coins.<br />

McKinnon holds these are true, so they should be true on all precisifications, but<br />

they are not true on unprincipled precisifications, so unprincipled precisifications<br />

are unacceptable. The motivation for (NAD) and (NAS) is clear. When we list the<br />

fundamental properties of the universe, we will not include being a coin. Coinness<br />

doesn’t go that deep. So if some things are coins, they must be so in virtue of their<br />

other properties. From this (NAD) and (NAS) follow.<br />

The last step looks dubious. Consider any coin, for definiteness say the referent<br />

of ‘1728’, and a coin* that massively overlaps it. The coin* is not a coin, so (a) one<br />

of these is a coin and the other is not, and (b) the minute differences between them<br />

cannot form the basis for a distinction between coins and non-coins. Hence (NAD)<br />

and (NAS) fail. At best, it seems, we can justify the following claims. If something<br />

is a coin* and something else is not, then there is a principled difference between<br />

them that makes one of them a coin* and the other not. Something is a coin iff it<br />

is a coin* that does not excessively overlap a coin. If this is the best we can do at<br />

defining ‘coin’, then the prospects for a reductive physicalism about coins might look<br />

a little dim, though this is no threat to a physicalism about coins that stays neutral<br />

on the question of reduction. (I trust no reader is an anti-physicalist about coins, but<br />

it is worth noting how vexing questions of reduction can get even when questions of<br />

physicalism are settled.)<br />

So I think this example refutes (NAD) and (NAS). Do I beg some questions here?<br />

Well, my counterexample turns crucially on the existence of kinds of objects, massively<br />

overlapping coin*s, that some people reject, and indeed that some find the<br />

most objectionable aspect of the supervaluationist solution. But this gets the burden<br />

of proof the wrong way around. I was not trying to refute (NAD) and (NAS). I just<br />

aimed to parry an argument based on those principles. I am allowed to appeal to<br />

aspects of my theory in doing so without begging questions. I do not want to rest<br />

too much weight on this point, however, for issues to do with who bears the burden<br />

of proof are rarely easily resolved, so let us move on.<br />

My main response to McKinnon’s dilemma is another dilemma. If the principled<br />

similarities and differences in (NAD) and (NAS) must be intrinsic properties, then<br />

those principles are false, because there is no principled intrinsic difference between a<br />

coin and a token, or a coin and a medal. If the principled similarities and differences<br />

in (NAD) and (NAS) may be extrinsic properties, then those principles may be true,<br />

but then the argument that there are not enough principled precisifications fails, since<br />

now we must consider precisifications based on extrinsic principles. Let’s look at the<br />

two halves of that dilemma in detail, in order.



A subway token is not a coin. Nor is a medal.8 But in their intrinsic respects, subway<br />

tokens often resemble certain coins more than some coins resemble other coins.<br />

Imagine we had a Boston subway token (which looks a bit like an American penny,<br />

but larger), an American penny, a British 20p piece (which is roughly heptagonal) and<br />

an early Australian holey dollar (which has a hole in it). There is no non-disjunctive<br />

classification of these by intrinsic properties that includes the penny, the 20p piece<br />

and the holey dollar in one group, and the subway token in the other. Any group<br />

that includes the penny and the other coins will include the token as well. So if we<br />

restrict attention to intrinsic similarities and differences, (NAD) and (NAS) are false.<br />

There is a difference between these coins and the subway token. The coins were<br />

produced with the intent of being legal tender, the token was not. Perhaps we can find<br />

a difference between coins and non-coins based on the intent of their creator.9 This<br />

might make (NAD) and (NAS) true. But note that given the theory of precisifications<br />

developed in section 3, on every precisification, one and only one of the precisifications<br />

of ‘1728’ will be the subject of an intention on the part of its manufacturer.<br />

Just which of the objects is the subject of this intent will vary from precisification<br />

to precisification, but there is only one on every precisification. So we can say that<br />

on every precisification, the coin is the one where the intent of its creator was that<br />

it be used in a certain way. Indeed, on any precisification we may have antecedently<br />

thought to have existed, we can show that precisification to be principled by taking<br />

F to be the property being created with intent of being used in a coin-like way.10 So<br />

now we can say that restricting attention to the principled precisifications does not<br />

unduly delimit the class of precisifications.<br />

Let’s sum up. To argue against the possibility of unprincipled precisifications,<br />

McKinnon needed to justify (NAD) and (NAS). But these are only true when we<br />

allow ‘principled differences’ to include differences in creatorial intent. And if we do<br />

that we can see that every prima facie admissible precisification is principled, so we<br />

can give an affirmative answer to McKinnon’s question.<br />

It might be objected that this move relies heavily on the fact that for many artefacts<br />

creative intent is constitutive of being the kind of thing that it is. But a Problem<br />

of the Many does not arise only for artefacts, so my solution does not generalise.<br />

This is little reason for concern since McKinnon’s problem does not generalise either.<br />

(NAD) and (NAS) are clearly false when we substitute ‘mountain’ for ‘coin’. Consider<br />

a fairly typical case where it is indeterminate whether we have one mountain or<br />

8 Some people I have asked think tokens are coins, but no one thinks medals are coins, so if you (mistakenly)<br />

think tokens are coins, imagine all my subsequent arguments are phrased using medals rather than<br />

tokens.<br />

9 Note that I say little here about what the intent of the creator must be. I don’t think that the intent<br />

must always be to create legal tender. A ceremonial coin that is created, for example, to be tossed before<br />

the start of a sporting match is still a coin, although it is not intended to be tender. But intent still matters.<br />

If someone had made a duplicate of that ceremonial coin with the intent of awarding it as a medal to the<br />

victorious captain, it would be a medal and not a coin.<br />

10 Because of the problems raised in the previous footnote, I will not try and say just what this intention<br />

amounts to. There are complications when (a) the creator is a corporate entity rather than an individual<br />

and (b) the coins are mass–produced rather than produced individually. But since the story is essentially<br />

the same, I leave the gruesome details out here.



two.11 In this case it might not be clear whether, for example, we have one mountain<br />

with a southern and a northern peak, or two mountains, one of them a little north<br />

of the other. Whether there is one mountain here or two, clearly the two peaks exist,<br />

and their fusion exists too. The real question is which of these three things is a<br />

mountain. However this question is resolved, a substitution instance of (NAD) with<br />

the two objects being the southern peak and the fusion of the two peaks will be false.<br />

So in this case a relatively unprincipled precisification will be acceptable. The point<br />

here is that mountain*s that are not mountains exist (either the peaks or their fusion<br />

will do as examples), and that suffices to refute McKinnon’s alleged penumbral<br />

connections and allow, in this case, a negative answer to his question.<br />

6 Sorensen on Direct Reference<br />

According to orthodoxy, we can use descriptions to determine the reference of names<br />

without those descriptions becoming part of the meaning of the name. This, apparently,<br />

is what happened when Leverrier introduced ‘Neptune’ to name, not merely<br />

describe, the planet causing certain perturbations, and when someone introduced<br />

‘Jack the Ripper’ to name, not merely describe, the person performing certain murders.<br />

So let us introduce ‘Acme’ as the name for the first tributary of the river<br />

Enigma. As Sorensen suggests, this can create certain problems.<br />

When [explorers] first travel up the river Enigma they finally reach the<br />

first pair of river branches. They name one branch ‘Sumo’ and the other<br />

‘Wilt’. Sumo is shorter but more voluminous than Wilt. This makes<br />

Sumo and Wilt borderline cases of ‘tributary’ . . . ‘Acme’ definitely refers<br />

to something, even though it is vague whether it refers to Sumo and<br />

vague whether it refers to Wilt. (Sorensen, 2000, 180)<br />

If ‘Acme’, ‘Sumo’ and ‘Wilt’ are all vague names related in this way, Sorensen thinks<br />

the supervaluationist has a problem. The sentences ‘Acme is Sumo’ and ‘Acme is<br />

Wilt’ both express propositions of the form 〈x = y〉. For exactly one of them, x is y.<br />

Since the proposition contains just the objects x and y (and the identity relation) but<br />

not their route into the proposition, there is no vagueness in the proposition. Hence<br />

there is no way to precisify either proposition. So a supervaluationist cannot explain<br />

how these propositions are vague.<br />

This is no problem for supervaluationism, since supervaluationism says that sentences,<br />

not propositions, are vague. Indeed, most supervaluationists would say that no<br />

proposition is ever vague. Thinking they are vague is just another instance of the<br />

fallacy Russell identified: attributing properties of the representation to the entity, in<br />

this case a proposition, represented.<br />

But maybe there is a problem in the area. One natural way of spelling out the idea<br />

that names directly refer to objects is to say that the meaning of a name is its referent.<br />

And one quite plausible principle about precisifications is that precisifications must<br />

11 This case is rather important in the history of the problem, because its discussion in Quine (1960) is<br />

one of the earliest presentations in print of anything like the problem of the many.



not change the meaning of a term; they may merely provide a meaning where none<br />

exists. Now the supervaluationist has a problem. For it is true that one of Sorensen’s<br />

identity sentences is true in virtue of its meaning, since its meaning determines that<br />

it expresses a proposition of the form 〈x = x〉. But each sentence is false on some<br />

precisifications, so some precisifications change the meaning of the terms involved.<br />

The best way to respond to this objection is simply to bite the bullet. We can<br />

accept that some precisifications alter meanings provided we can provide some other<br />

criteria for acceptability of precisifications. I offered one such proposal in section 2.<br />

An acceptable precisification takes the partial order more natural than, turns it into<br />

a complete order without changing any of the relations that already exist, and uses<br />

this new relation to generate meanings. If we proceed in this way it is possible, for<br />

all we have hitherto said, that on every precisification the proposition expressed by<br />

‘Acme is Sumo’ will be of the form 〈x = y〉, so just the named object, rather than the<br />

method of naming, gets into the proposition. The central point is that since precisifications<br />

apply to the processes that turn semantic intentions into meanings, rather<br />

than to sentences with meanings, there is no guarantee they will preserve meanings.<br />

But if we like directly referential theories of names we should think this perfectly natural.<br />

If names are directly referential then Sorensen’s argument that there are vague<br />

sentences that are true in virtue of their meaning works. But this is consistent with<br />

supervaluationism.<br />
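The proposal mentioned above, that an acceptable precisification takes the partial order more natural than and turns it into a complete order without changing any relations that already exist, is in effect the idea of a linear extension of a partial order, computable by topological sort. A minimal sketch, with an invented naturalness relation (nothing concrete here comes from the paper):<br />

```python
# Sketch: extending a partial order ("more natural than") to a
# complete order while preserving every relation that already holds.
# This is a linear extension, obtainable by topological sort.
from graphlib import TopologicalSorter

# x -> ys: x is more natural than each y in ys (hypothetical data)
more_natural_than = {
    "green": ["grue"],
    "mountain": ["mountain-minus-Sparky"],
}

ts = TopologicalSorter()
for x, ys in more_natural_than.items():
    for y in ys:
        ts.add(y, x)  # record x as a predecessor of y in the order

order = list(ts.static_order())
# 'green' precedes 'grue'; 'mountain' precedes 'mountain-minus-Sparky'
```

Any total ordering produced this way agrees with the original partial order wherever that order already ruled, which is just what the proposal demands.<br />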

One challenge remains. If precisifications change meanings, why should we care<br />

about them, or about what is true on all of them? This is not a new challenge; it is a<br />

central plank in Jerry Fodor and Ernest Lepore’s (1996) attack on supervaluationism.<br />

A simple response is just to say that we should care about precisifications because this<br />

method delivers the right results in all core cases, and an intuitively plausible set of<br />

results in contentious cases. This kind of instrumentalism about the foundations of a<br />

theory is not always satisfying. 12 But if that’s the biggest problem supervaluationists<br />

have, they should be able to sleep a lot easier than the rest of us.<br />

7 Conclusions and Confessions<br />

I have spent a fair bit of time arguing that supervaluationism is not vulnerable to a<br />

few challenges based on the Problem of the Many. Despite doing all this, I don’t<br />

believe supervaluationism to be quite true. So why spend this time? Because the true<br />

theory of vagueness will be a classical semantic theory, and everything I say about<br />

supervaluationism above applies mutatis mutandis to all classical semantic theories.<br />

I focussed on supervaluationism because it is more familiar and more popular, but I<br />

need not have.<br />

What is a classical semantic theory? That’s easy - it’s a theory that is both classical<br />

and semantic. What is a classical theory? It is one that incorporates vagueness while<br />

preserving classical logic. How much of classical logic must we preserve? That’s a<br />

12 The largest debate in the history of philosophy of economics concerned whether we could, or should,<br />

be instrumentalists about the ideally rational agents at the core of mainstream microeconomic theory. See<br />

Friedman (1953) for the classic statement of the instrumentalist position, and Hausman (1992) for the most<br />

amusing and enlightening of the countably many responses.



hard question, though it is relevant to determining whether supervaluationism is (as<br />

it is often advertised) a classical theory. Williamson (1994) notes that supervaluationism<br />

does not preserve classical inference rules, and Hyde (1997) notes that it does<br />

not preserve some classically valid multiple–conclusion sequents. Keefe (2000) argues<br />

that neither of these constitutes an important deviation from classical logic. I’m inclined<br />

to disagree with Keefe on both points. Following Read (2000), I take it that the<br />

best response to the anti-classical arguments in Dummett (1991) takes the essential features<br />

of classical logic to be its inferential rules as formulated in a multiple–conclusion<br />

logic. But we need not adjudicate this dispute here. Why should we want a classical<br />

theory? The usual arguments for it are based on epistemic conservatism, and I think<br />

these arguments are fairly compelling. I also think that no non–classical theory will<br />

be able to provide a plausible account of quantification.13<br />

What is a semantic theory? It is one that makes vagueness a semantic phenomenon.<br />

It is not necessarily one that makes vagueness a linguistic phenomenon. That<br />

would be absurd in any case, since clearly some non–linguistic entities, maps, beliefs<br />

and pictures for example, are vague. But the more general idea that vagueness is a<br />

property only of representations is quite attractive. It links up well with the theory<br />

of content Lewis outlines in “Languages and Language” - all Languages (in his technical<br />

sense) are precise; vagueness in natural language is a result of indecision about<br />

which Language we are speaking.<br />

Trenton Merricks (2001) argues against this picture, claiming that all semantic<br />

vagueness (he says ‘linguistic’, but ignore that) must arise because of metaphysical<br />

or epistemic vagueness. He claims that if (17) is vague, then so is (18), and (18)’s<br />

vagueness must be either metaphysical or semantic.<br />

(17) Harry is bald.<br />

(18) ‘Bald’ describes Harry.<br />

One might question the inference from (17)’s vagueness to (18)’s - on some supervaluational<br />

theories if (17) is vague then (18) is false. But I will let that pass, for there<br />

is a simpler problem in the argument. Merricks claims that if (18) is vague, then<br />

it is vague whether ‘Bald’ has the property describing Harry, and this is a kind of<br />

metaphysical vagueness. It is hard to see how this follows. If there is metaphysical<br />

vagueness, there is presumably some object o and some property F such that it is<br />

vague whether the object has the property. Presumably the object here is the word<br />

‘bald’ and the property is describing Harry. But words alone do not have properties<br />

like describing Harry. At best, words in languages do so. So maybe the object can be<br />

the ordered pair 〈‘Bald’, l〉, where l is a language. But which one? Not one of Lewis’s<br />

Languages, for then it is determinate whether 〈‘Bald’, l〉 has the property describing<br />

Harry. So maybe a natural language, perhaps English! But it is doubly unclear that<br />

English is an object. First, it is unclear whether we should reify natural languages<br />

to such a degree that we accept that ‘English’ refers to anything at all. Secondly, if<br />

we say ‘English’ does refer, why not say that it refers to one of Lewis’s Languages,<br />

13 See the last section of Weatherson (2005c) for a detailed defence of this claim.<br />



though it is vague which one? That way we can say that the sentence “‘Bald’ in English<br />

describes Harry” is vague without there being any object that vaguely instantiates<br />

a property. Now on a supervaluational theory this approach may have the unwanted<br />

consequence that “English is a precise language” is true, since it is true on all precisifications.<br />

It does not seem that this problem for the supervaluationist generalises to<br />

be a problem for all semantic theories of vagueness, so Merricks has raised no general<br />

problem for semantic theories of vagueness. (The problem for the supervaluationist<br />

here is not new. For some discussion see Lewis’s response, in “Many, but Almost<br />

One” to the objection, there attributed to Kripke, that the supervaluationist account<br />

makes it true that all words are precise.)<br />

If we have a classical semantic theory that provides a concept of determinateness,<br />

then we can define acceptable precisifications as maximal consistent extensions of<br />

the set of determinate truths. Given that, it follows pretty quickly that determinate<br />

truth implies truth on all precisifications. And this is sufficient for the major objections<br />

canvassed above to get a foothold, and hence be worthy of response, though as<br />

we have seen none of them will ultimately succeed. Still, our theory may differ from<br />

supervaluationism in many ways. For one thing, it might explain determinateness<br />

in ways quite different from those in supervaluationism. For example, the theory<br />

in Field (2000) is a classical semantic theory,14 but it clearly goes beyond supervaluational<br />

theory because it has an interesting, if ultimately flawed, explanation of determinateness<br />

in terms of Shafer functions. Other classical semantic theories may differ<br />

from supervaluationism by providing distinctive theories of higher order vagueness.<br />

The most promising research programs in vagueness are within the classical semantic<br />

framework. Like all research programs, these programs need a defensive component<br />

to fend off potential refutations and crises, and as we have seen here we can learn<br />

a bit from seeing how to defend against certain attacks. There will undoubtedly be<br />

more challenges in the time ahead, but for now the moves in this paper bring the<br />

defensive side of the program up to date.<br />

14 At least, it strikes me as a classical semantic theory. Ryan Wasserman has tried to convince me that<br />

properly understood, it is really an epistemic theory. Space prevents a thorough account of why I think<br />

Field’s theory is flawed. Briefly, I think the point in Leeds (2000) that Field’s concept of a numerical<br />

degree of belief needs substantially more explanation than Field gives it can be developed into a conclusive<br />

refutation.


Vagueness as Indeterminacy<br />

Abstract<br />

Traditionally, we thought vague predicates were predicates with borderline<br />

cases. In recent years traditional wisdom has come under attack from<br />

several leading theorists. They are motivated by a common idea, that<br />

terms with borderline cases, but sharp boundaries around the borderline<br />

cases, are not vague. I argue for a return to tradition. Part of the argument<br />

is that the alternatives that have been proposed are themselves<br />

subject to intuitive counterexample. And part of the argument is that we<br />

need a theory of what vagueness is that applies to non-predicates. The<br />

traditional picture can be smoothly generalised to non-predicates if we<br />

identify vagueness generally with indeterminacy. Modern rivals to tradition<br />

do not admit of such smooth generalisation.<br />

Recently there has been a flurry of proposals on how to ‘define’ vagueness. These<br />

proposals are not meant to amount to theories of vagueness as, for instance, epistemic<br />

or supervaluational theories of vagueness are. That is, they are not meant to provide<br />

solutions to the raft of puzzles and paradoxes traditionally associated with vagueness.<br />

Rather, they are meant to give us a sense of which terms in the language are vague,<br />

and, to use Matti Eklund’s phrase, in what their vagueness consists. Doing this might<br />

be a prelude to a successful theory of vagueness, or it might just be an interesting<br />

classificatory question in its own right.<br />

When this activity started, most notably with Patrick Greenough’s “A Minimal<br />

Theory of Vagueness”, I suspected that it would be a hopeless project. Imagine, I<br />

thought, trying to give a definition of what causation is that didn’t amount to a theory<br />

of causation. That project seems hopeless, and I didn’t think the prospects for<br />

a definition of vagueness were much better. I now think I was wrong, and we can<br />

learn a lot from thinking about which terms are vague, independent of our theory of<br />

vagueness. (As we’ll get to below, Greenough’s theory isn’t marketed as a definition<br />

of vagueness, but rather a ‘minimal theory’ to which all parties can agree. But it has<br />

been taken, e.g. by Eklund and Nicholas Smith, to be providing a rival to genuine<br />

definitions of vagueness, and I’ll follow Eklund and Smith in this respect.)<br />

The point of this exercise is not to give an analysis of how the man on the<br />

Clapham omnibus uses ‘vague’ and its cognates. As is widely recognised, ‘vague’<br />

is often used in ordinary language as a predicate that applies to claims like The Grand<br />

Canyon is between 2 and 2 trillion years old, i.e. claims that are consistent with a wide<br />

range of possible worlds. That’s not the sense of ‘vague’ which philosophers use, nor<br />

the sense we are trying to define. But nor should we think we are just trying to analyse<br />

philosophical use of ‘vague’. The philosophers’ usage may be our starting point,<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Richard<br />

Dietz and Sebastiano Moruzzi (eds.), Cuts and Clouds, OUP, pp. 77-90. Thanks to David Chalmers,<br />

Matti Eklund, Delia Graff Fara, Patrick Greenough, Mark Johnston, Daniel Nolan, Nicholas J. J. Smith,<br />

Andrew Strauss and audiences at ANU, Princeton and Arché.


Vagueness as Indeterminacy 429<br />

but if we find philosophers have traditionally been ignoring theoretically important<br />

commonalities, or blurring theoretically important distinctions, our best definition<br />

may well amount to a revision of philosophical usage.<br />

The game, I think, is one of setting goals for what a theory of vagueness should<br />

do. It is a legitimate objection to a theory of vagueness that it isn’t comprehensive,<br />

that it doesn’t cover the field. If supervaluationism was only a theory of how vague<br />

words that started with consonants behaved, for example, that would be a problem<br />

for supervaluationism. But to press objections of this form we must have an antecedent<br />

answer to the question of which words are in the field, and hence should be<br />

covered. That’s the good question which these definitions of vagueness address. Because<br />

I take this to be the important issue, I’m going to start this paper with a bunch<br />

of examples of apparently vague, and apparently non-vague, terms. We’ll then look<br />

at which theories do the best job at systematising intuitions about these cases. I’ll<br />

then argue that the best way to systematise our intuitions about these cases while respecting<br />

theoretically important commonalities and distinctions is to take vagueness<br />

to be indeterminacy, while staying silent for now on whether the indeterminacy is<br />

semantic or epistemic. In doing so I’m returning to a traditional view of vagueness,<br />

one that is discussed in such classic works as Kit Fine’s statement of supervaluationism<br />

(Fine, 1975b). So I make no claim to originality in my conclusions here, though<br />

I hope at least some of the arguments are original.<br />

1 Examples<br />

I’m going to introduce five classes of examples, which will serve as our data in what<br />

follows. I’ll give a fairly tendentious description of each class to orient us before<br />

starting. Our five classes are (a) words that are indeterminate but not vague, (b)<br />

vague words that are not predicates, especially predicate modifiers, (c) vague predicates<br />

whose conditions of application are contentious, (d) vague predicates whose<br />

application depends on discrete states of the world, and (e) vague predicates that do<br />

not determine boundaries.<br />

1.1 Indeterminacy without Vagueness<br />

Many philosophers, if asked, would say that vague words are those that have borderline<br />

cases. As noted above, Fine (1975b) takes exactly this view. My preferred<br />

view, that vagueness is indeterminacy, is a simple generalisation of this view to nonpredicates.<br />

But it is a commonplace of the literature on definitions of vagueness that<br />

this won’t do because of examples of indeterminacy without vagueness. Two examples<br />

are commonly used. One of these is Sainsbury’s example child* (Sainsbury,<br />

1991). By definition, the extension of child* is the set of persons under sixteen years<br />

old, and its anti-extension is the set of persons eighteen years old or older. Sixteen and<br />

seventeen year olds are borderline cases. The intuition is that even though child* has<br />

borderline cases it is not vague, because there are sharp boundaries to its borderline.<br />

A similar case arises with mass as it is used by a Newtonian physicist. (I’m<br />

grateful to Delia Graff Fara for pointing out the connection here.) As Field (1973)<br />

showed, mass is indeterminate between two meanings, relativistic mass and proper mass.



But it is intuitively not vague, because it is determinate that it means either relativistic<br />

mass or proper mass. These cases are well discussed in the existing literature, and I won’t<br />

say much more about them here, save to note that one of the examples that is usually<br />

taken to be very problematic for the vagueness as indeterminacy view, child*, is not<br />

synonymous with any term in any natural language. This is not a reason that it could<br />

not serve as a counterexample, because a definition should cover all terms actual and<br />

possible.<br />

1.2 Non-Predicate Vagueness<br />

Not only predicates are vague. There is an extensive literature on vague singular<br />

terms. Arguably many determiners are vague. And, as I’ll stress here, many predicate<br />

modifiers are vague.<br />

We can make an intuitive distinction between vague and precise predicate modifiers.<br />

Compare the following two (obviously artificial) predicate modifiers. (I owe<br />

these examples to David Chalmers.) Where F is a predicate such that Fa is true iff<br />

v(a) > x for some variable v, and v has a natural zero value (e.g. height, unlike<br />

utility), we can define doubly F and bigly F. It is true that a is doubly F iff v(a)<br />

> 2x and a is bigly F iff v(a)/x is big. Now there’s a good sense in which doubly is<br />

a precise modifier, for the modification it makes to its attached predicate can be precisely<br />

defined, while bigly is a vague modifier. That’s the sense in which I mean some<br />

modifiers are vague and others are precise. Note that even though doubly is precise<br />

it can be a constituent of a vague predicate, such as doubly tall. That makes sense;<br />

just as a vague sentence need only contain one vague word, so a vague complex<br />

predicate need only contain one vague word.<br />
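The contrast between the two modifiers can be sketched with higher-order functions. In this sketch the measure, the thresholds and the particular sharpening of ‘big’ are all invented for illustration; the point is just that doubly is fully determined by F’s threshold, while bigly inherits the vagueness of ‘big’.<br />

```python
# Sketch: predicate modifiers as higher-order functions. A gradable predicate F
# is modeled by a measure v and threshold x, with Fa true iff v(a) > x.
# All particular numbers here are invented for illustration.

def make_predicate(v, x):
    return lambda a: v(a) > x

def doubly(v, x):
    # Precise modifier: 'doubly F' is true of a iff v(a) > 2x.
    return lambda a: v(a) > 2 * x

def bigly(v, x, is_big):
    # Vague modifier: 'bigly F' is true of a iff v(a)/x is big, where 'is_big'
    # is itself vague; we must supply one arbitrary sharpening of it.
    return lambda a: is_big(v(a) / x)

height = lambda person: person["height_cm"]
tall = make_predicate(height, 180)                   # stipulated threshold
doubly_tall = doubly(height, 180)                    # precisely defined
bigly_tall = bigly(height, 180, lambda r: r > 1.5)   # one sharpening of 'big'

person = {"height_cm": 200}
print(tall(person), doubly_tall(person), bigly_tall(person))
```

On any way of sharpening things, doubly tall is settled once tall’s threshold is; bigly tall is not, which is the sense in which bigly, unlike doubly, is vague.<br />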

Now we might well ask whether natural language modifiers like very are vague<br />

or precise. I’m sad to say that I really don’t have an answer to that question, but I<br />

think it’s an excellent question. To get a sense of how hard it is, note one awkward<br />

feature of very – it is most comfortable attaching to words that are themselves vague.<br />

For instance (1a) is a sentence of English while (1b) is not.<br />

(1) (a) Jack is very old.<br />

(b) *Jack is very forty-seven years old.<br />

I don’t know whether this is a universal feature of very. My best guess is that it is,<br />

though in conversation some people have proposed interesting putative counterexamples.<br />

(I’m grateful here to Daniel Nolan.) But to avoid that complication, I’ll introduce<br />

a new word very*. This modifier is defined such that if F is vague then very*<br />

F means the same thing as very F, and if F is not vague then very* F is meaningless,<br />

like very forty-seven years old. It’s an excellent question whether very* is vague, and I<br />

think it’s a requirement on a definition of vagueness that it allow this question to be<br />

asked. As we’ll see, this is sadly not true of most proposed definitions of vagueness<br />

on the market.



1.3 Philosophically Interesting Vague Terms<br />

It’s morally obligatory that someone with my standard of living donate 1% of their<br />

income to charity. It’s not morally obligatory that someone with my standard of<br />

living donate 100% of their income to charity. What is the largest x such that it’s<br />

morally obligatory that someone with my standard of living donate x% of their income<br />

to charity? (As a moralistic cheapskate I’d rather like to know.) Arguably this<br />

is vague. But perhaps only arguably. On some divine command theories it is precise,<br />

because there’s a fact about what God wants me to do, however hard this is to figure<br />

out. (It’s even a knowable fact, since God knows it.) But on more standard secular<br />

moral theories this may indeed be vague.<br />

There are two lessons to draw from this case. First, if two philosophers can debate<br />

what the correct theory of morality is while one thinks it is vague and the other<br />

thinks it is precise, as I think could happen in a dispute between a divine command<br />

theorist and a virtue ethicist, then knowing that a vague term is vague is not required<br />

for understanding the term. (I assume here the divine command theorist is not so<br />

confused that she’s not really talking about goodness.)<br />

Second, it is important to remember that for some vague terms competent users<br />

of the term need not know in virtue of what they apply. Much of the literature<br />

on vagueness focuses on words like tall, thin and bald where all competent users<br />

know which kinds of underlying facts are relevant to their application. But not all<br />

vague terms are like that, as good illustrates. And this phenomenon extends beyond the<br />

normative, at least narrowly conceived. If you believe Tom Wolfe (2000) then among<br />

the youth of America going out with is vague and many do not know exactly in virtue<br />

of what it applies. It’s a familiar point in philosophy of mind that competent users<br />

can disagree about what kinds of features a thing must have to satisfy is thinking. And<br />

we can multiply instances of this by considering any area of philosophy we like.<br />

1.4 Discrete Vague Terms<br />

An academic with one child has few children for an academic. An academic with five<br />

children does not have few children for an academic. (I’ll omit the comparison class<br />

‘for an academic’ from now on.) Where is the borderline between those with few<br />

children and those not with few children? (I don’t ask out of personal interest this<br />

time.) This question, like the question of how much giving is morally obligatory,<br />

feels vague. But note that we cannot generate a compelling Sorites paradox using has<br />

few children. Let’s see how badly this Sorites argument fails.<br />

(2) (a) An academic with one child has few children.<br />

(b) If an academic with one child has few children, then an academic with<br />

two children has few children.<br />

(c) If an academic with two children has few children, then an academic with<br />

three children has few children.<br />

(d) If an academic with three children has few children, then an academic<br />

with four children has few children.<br />

(e) If an academic with four children has few children, then an academic with<br />

five children has few children.



(f) So an academic with five children has few children.<br />

Arguably premise e is plausible because as a material conditional it can be seen to be<br />

true via the falsity of the antecedent. And at a pinch I can see d as compellingly true<br />

for the same reason. But neither b nor c strike me as at all compelling. If someone<br />

presents this argument as a Sorites paradox, I simply deny that the paradox-monger<br />

can know these premises to be true, or that I have a reason to believe they are true.<br />

To be sure, I don’t know which premise is false. (If you think you know b to be false<br />

replace academics in the example with a more fertile professional group.) But just<br />

because I don’t know where the argument fails doesn’t mean it presents any kind of<br />

paradox. When I have no reason to accept two, maybe three, of the premises, the<br />

argument falls well short of being paradoxical.<br />

A small note on terminology. Contemporary scientific theories imply that many<br />

familiar vague predicates apply in virtue of facts about the world that are, at some<br />

level, discrete. What I’m interested in under this heading are predicates where the<br />

differences between salient adjacent cases are easily observable, such as the difference<br />

between having two and three children.<br />

1.5 Vagueness without Boundaries<br />

The letter of Patrick Greenough’s proposal (to be discussed in section three below)<br />

suggests that every vague term has only vague boundaries. This is not true. The<br />

predicate in one’s early thirties has a sharp boundary at the lower end and a vague<br />

boundary at the upper end. But it isn’t too hard to amend his theory to allow for<br />

such cases, by saying (in effect) that a vague term is a term with at least one vague<br />

boundary. Nicholas Smith makes basically that move in his paper. But such a move<br />

won’t work, because some vague predicates don’t have boundaries. Indeed, some<br />

predicates can be vague even though they are satisfied by every object in the domain.<br />

The examples here are a little more complicated than in the rest of the paper, but I<br />

think they are important enough to warrant the complexity.<br />

For the next several paragraphs the domain will be adult Australian women, and<br />

when I use tall I’ll mean tall for an adult Australian woman. I don’t know enough<br />

facts to know where the boundaries are for tall in this context, but I’ll stipulate that a<br />

woman shorter than 170cm is determinately not tall, and a woman taller than 180cm<br />

is determinately tall. I claim here neither that I know where these boundaries are nor<br />

that I could know where they are. But I assume there are boundaries. I’m making<br />

these stipulations because it is easier to follow the examples if I use 170 and 180 rather<br />

than variables like y and z. It will become obvious that the particular numbers won’t<br />

matter, as long as there’s separation between them. It also doesn’t matter whether we<br />

use a semantic or epistemic account of determinacy here. It will matter that we use<br />

classical logic at various points (e.g. in assuming there are boundaries), but I think<br />

that’s perfectly reasonable in this context. (Here I follow the arguments in section 2<br />

of Greenough’s paper.)<br />

Consider the class of predicates defined by the following schema.<br />

tall_x =df tall or shorter than x cm



For x < 170, tall_x has all the same borderline cases as tall, and is presumably vague in<br />

anyone’s book. For x > 180, tall_x determinately applies to everyone in the domain,<br />

and for now we’ll say that makes it not vague. (Though note it need not determinately<br />

determinately apply to everyone in the domain, and we’ll see below that might<br />

be a reason to group it with the vague predicates.)<br />

When x is between 170 and 180, tall_x has some very odd properties. The borderline<br />

cases are those women whose height is between x and 180cm. When x is close<br />

to 180, this might be a very small border. While we’re assuming classical logic, we<br />

can assume that there is a value y such that women taller than y cm are tall and those<br />

shorter than y cm are not tall. We need not here assume the value of y is either epistemically<br />

or semantically determinate. Consider a value of x, say 179, such that x ><br />

y. (Again it’s not a necessary assumption that 179 > y, but it makes the example<br />

easier to understand if I use a particular number.) Now tall_179 has some interesting<br />

properties. It has borderline cases, those women between 179 and 180cm tall. But it<br />

is satisfied by every woman, since every woman is either tall or shorter than 179cm.<br />

I think the existence of the borderline cases is sufficient to make tall_179 vague. Note<br />

that these cases are quite different to child*, because at the upper boundary there is<br />

no sharp jump from borderline cases to clear cases – the two blur together in just the<br />

way borderline cases and clear cases of tall blur together, so whatever reasons we had<br />

to worry about child* being vague are not applicable here. Still the ‘borderline cases’<br />

are mislabelled here, for there is no border they fall on. Every woman satisfies the<br />

predicate. So no definition of vagueness in terms of having a vague boundary, indeed<br />

of having a boundary at all, can work.<br />
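The behaviour of tall_179 can be illustrated with a toy supervaluational model; the candidate cutoffs below are invented, subject only to the stipulation above that the boundary for tall lies between 170 and 180.<br />

```python
# Toy model of tall_179 ('tall, or shorter than 179cm'), where 'tall' is
# precisified by a cutoff somewhere between 170 and 180, as stipulated in the
# text. The particular candidate cutoffs are invented for illustration.

def tall_179(height, cutoff):
    return height >= cutoff or height < 179

cutoffs = [170 + i * 0.5 for i in range(21)]  # candidate precisifications of 'tall'

def status(height):
    verdicts = {tall_179(height, c) for c in cutoffs}
    if verdicts == {True}:
        return "determinately satisfies"
    if verdicts == {False}:
        return "determinately fails"
    return "borderline"

for h in [165.0, 178.0, 179.5, 181.0]:
    print(h, status(h))

# There are borderline cases (heights between 179 and 180), yet no height
# determinately fails -- so the 'borderline cases' fall on no border.
assert status(179.5) == "borderline"
assert not any(status(h) == "determinately fails" for h in range(150, 200))
```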

One might object here that a definition of vagueness is only meant to apply to<br />

words not phrases. But just as we can worry about a possible word child*, we can<br />

worry about a possible atomic word gish that means the same thing as tall_179, so that<br />

move won’t help here.<br />

We now have enough data on the table. In the next section I argue that treating<br />

vagueness as being indeterminacy provides a satisfactory treatment of the data. In the<br />

third section I argue that none of the live alternatives is so satisfactory. So I conclude,<br />

somewhat tentatively, that we should define vagueness as indeterminacy.<br />

2 Vagueness as Indeterminacy<br />

Back when I was a supervaluationist, I thought that what it was for a term to be vague<br />

was for it to refer to different things on different precisifications. That won’t do as a<br />

theory-neutral definition, for it presupposes supervaluationism, which is not only a<br />

theory but a false theory. But we can capture the essential idea in slightly less loaded<br />

language.<br />

I will have to make three possibly controversial assumptions. First, I assume a<br />

broadly Montagovian perspective, on which we can talk about the referent of an arbitrary<br />

term. (See Montague (1970, 1973) for more details.) That referent might be<br />

an object, or a truth value, or a function from objects to truth values, or a more<br />

complicated function built out of these. Second, I assume we can sensibly use an expanded<br />

Lagadonian language where objects can be names for themselves, truth-values



can be names for themselves, functions from objects to truth-values can be names for<br />

themselves, and so on. (See Lewis 1986 for more on Lagadonian languages.) Third, I<br />

assume there is no metaphysical vagueness, so each of these Lagadonian names is not<br />

vague.<br />

Those assumptions let us make a first pass at a definition of vagueness, as follows.<br />

A term t is vague iff there is some object, truth-value or function l which can serve<br />

as its own name such that the following sentence is neither determinately true nor<br />

determinately false.<br />

(3) t denotes l.<br />

That delivers the intuitively correct account in four of the five cases we discuss above,<br />

all except the cases like child*. I’ll say much more about that case below. But it<br />

is in one respect slightly too liberal, and we need to make a small adjustment or<br />

two to fix this. Consider a predicate F that is defined over a vague domain, but<br />

which is determinately satisfied by every object in the domain. Intuitively it is a<br />

partial function, which maps every member of its domain to true. And assume for<br />

sake of argument that it is determinate that it maps every member of the domain to<br />

true. (Say, for example, it means is self-identical when applied to a member of the<br />

domain.) Such a predicate is not, I think, vague. But since it is indeterminate which<br />

partial function it denotes, the above theory suggests it is vague. We need to make<br />

a small adjustment. To state the corrected theory, we will stipulate that every term<br />

denotes a function. What were previously thought of as terms denoting constants<br />

will be treated as terms denoting constant functions. So instead of a name like Scott<br />

Soames denoting Scott Soames, we’ll take it to denote the function that takes anything<br />

whatsoever as input, and returns Scott Soames as output. Given that, our second<br />

take at a definition of vagueness is as follows.<br />

t is vague iff ∃x, y_1, y_2 such that y_1 ≠ y_2 and it is indeterminate whether<br />

∃l such that t denotes l and l(x) = y_1, and it is indeterminate whether ∃l<br />

such that t denotes l and l(x) = y_2.<br />

To get a sense of the definition, it helps to translate it back into supervaluational talk,<br />

and look at the special case where t is a predicate. Then the definition comes to the<br />

claim that there is some object that is in the extension of t on one precisification, and<br />

in the anti-extension of t on another, which seems like what was intended.<br />
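The effect of the correction can be sketched computationally. Here candidate denotations stand in for precisifications, modeled as finite maps from inputs to outputs; the people and cutoffs below are invented for illustration.<br />

```python
# Sketch of the corrected definition: t is vague iff some input x is mapped to
# two distinct values by two candidate denotations of t. Candidate denotations
# are modeled as dicts; all the particular cases below are invented.

def is_vague(candidate_denotations):
    inputs = set()
    for d in candidate_denotations:
        inputs |= d.keys()
    for x in inputs:
        values = {d[x] for d in candidate_denotations if x in d}
        if len(values) > 1:
            return True
    return False

# 'tall' with two admissible cutoffs, 170cm and 180cm: vague, since the
# 175cm person gets different values on different candidates.
tall_at_170 = {"160cm": False, "175cm": True, "185cm": True}
tall_at_180 = {"160cm": False, "175cm": False, "185cm": True}
print(is_vague([tall_at_170, tall_at_180]))  # True

# A predicate over a vague domain that determinately maps every member of its
# domain to True: the candidates differ only in domain, never in value, so the
# definition correctly counts it as not vague.
self_id_small = {"160cm": True, "175cm": True}
self_id_large = {"160cm": True, "175cm": True, "185cm": True}
print(is_vague([self_id_small, self_id_large]))  # False
```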

Arguably even that is not enough of a correction. (I’m indebted in the following<br />

three paragraphs to Mark Johnston.) Frequently there are debates in semantics over<br />

the appropriate type of various terms. 1 For instance, a straightforward account would<br />

say that in She ran yesterday, yesterday modifies the intransitive verb run, so it denotes<br />

a function of type 〈〈e,t〉, 〈e,t〉〉. But on a Davidsonian semantics, yesterday denotes a<br />

1 In what follows I’ll refer to functions of type 〈X, Y〉. These are functions from things of type X to<br />

things of type Y, where the basic types are entities, represented by e, and truth values, represented by t.<br />

So a function of type 〈e, t〉 is a function from objects to truth values, or, equivalently, the characteristic<br />

function of a set. A function of type 〈〈e, t〉, 〈e, t〉〉 is a function from characteristic functions of sets to<br />

characteristic functions of sets. This is plausibly the semantic value of a predicate modifier like very.
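The type notation can be rendered as ordinary function types. Here is a minimal sketch in Python’s typing vocabulary, with an invented extension for run and one arbitrary sharpening of very:<br />

```python
from typing import Callable

Entity = str                      # type e: entities (modeled here as strings)
ET = Callable[[Entity], bool]     # type <e,t>: characteristic function of a set
ETET = Callable[[ET], ET]         # type <<e,t>,<e,t>>: a predicate modifier

run: ET = lambda x: x in {"jack", "jill"}   # invented extension for 'run'

# One arbitrary sharpening of 'very', just to exhibit the type: it maps the
# set denoted by its argument predicate to a subset of that set.
very: ETET = lambda f: (lambda x: f(x) and x == "jack")

print(run("jill"), very(run)("jill"))  # True False
```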



property of the running event being discussed, so its type is simply 〈e,t〉. Now it is<br />

at least a philosophical possibility that there should be no fact of the matter which of<br />

these theories is correct.<br />

There are two things we might say about such a possibility. On the one hand, it<br />

doesn’t at all seem right that a word should count as vague because it is indeterminate<br />

what its type should be. That suggests the above definition needs modification. On<br />

the other hand, the above definition doesn’t imply t is vague whenever there are<br />

two distinct functions that could be the denotation of t; it must also be the case<br />

that these functions have overlapping domains. The most natural cases of syntactic<br />

indeterminacy are cases where the two possible denotations are functions of quite<br />

different types. That suggests the above definition needs no modification.<br />

I think the case for modification is a little stronger. That’s partially because the<br />

possibility of type-shifting suggests there’s a possibility, perhaps a distant one but<br />

a possibility, that the second suggestion could fail. And partially because even if<br />

there are no uncontroversial cases of syntactic indeterminacy that will mistakenly be<br />

treated as cases of vagueness by this theory, the mere possibility of classifying a case of<br />

syntactic indeterminacy as a case of vagueness should be enough to warrant concern.<br />

And there is a way to modify the definition that does not look like it will lead to<br />

mistakenly ruling out any cases of vagueness that should be ruled in, as follows.<br />

t is vague iff ∃x, y_1, y_2 such that y_1 ≠ y_2 and y_1 is of the same type as y_2,<br />

and it is indeterminate whether ∃l such that t denotes l and l(x) = y_1, and<br />

it is indeterminate whether ∃l such that t denotes l and l(x) = y_2.<br />

That implies that if yesterday is indeterminate merely because it is indeterminate what<br />

type of function it denotes, it won’t count as vague, and that’s all to the good. So this<br />

is our final definition of vagueness.<br />

Still there’s a problem with child*. Many people have thought that it should not<br />

be considered vague for one reason or another. Sometimes this is just asserted as a raw<br />

intuition, as in Smith and Eklund. There’s no arguing with an intuition, so I won’t<br />

try arguing with it. Rather I’ll just repeat a point I made at the start. We aren’t here<br />

in the business simply of summarising ordinary or philosophical intuitions. Rather<br />

we are looking for a definition that captures all the cases that fall into the most theoretically<br />

important categories. And intuitions about theoretical importance are less<br />

impressive than demonstrations of theoretical importance.<br />

Patrick Greenough (2003) suggests that the problem with terms like child* is that<br />

they aren’t vague, but rather that they are simply undefined for the alleged borderline<br />

cases. If that’s true, and perhaps for some of the examples people had in mind in this<br />

area it is, then our definition agrees that they are not vague. For a term that carves a<br />

precise division out of part of the domain, and then stays silent, is precise not vague<br />

on my account.<br />

Greenough also suggests that the problem with child* is that it is not higher-order<br />

vague. 2 But as he says this can hardly be the entirety of the problem. For it does not<br />

2 When I say a term is higher-order vague, I mean that it is subject to higher-order vagueness, not that it<br />

is vague whether the term is vague.



seem to be definitional that the vague terms are also higher-order vague. True, there<br />

is a theoretically important category of terms that are vague and higher-order vague.<br />

But it is not a category that we cannot represent. A term t is in this category just in<br />

case t is vague, and definitely t is vague, and definitely definitely t is vague, and so on.<br />

So we can capture that category, even if we don’t call only members of that category<br />

the vague terms. And this doesn’t seem to diminish the theoretical importance of the<br />

category of terms I called vague.<br />

It might be thought that what is wrong with child* is that it cannot be used to<br />

generate a Sorites argument. If you think that’s what is centrally important to vague<br />

terms, then there’s a theoretical reason to separate child* from the genuinely vague.<br />

But we should have seen enough by now to show that that can’t be right. It’s hard<br />

to know what it is for a predicate modifier to be Sorites-susceptible, and our last<br />

two example predicates, has few children and tall_179, cannot be used to set up Sorites<br />

arguments. So that child* does not generate a Sorites paradox is no reason to classify<br />

it outside the vague.<br />

So I take it there is no compelling reason to classify child* and similar terms as<br />

precise rather than vague. Admittedly there is an intuition that they are not vague,<br />

and perhaps that should be respected. But if the cost of respecting that intuition is<br />

that we misclassify several other terms, we should reject the intuition. That’s what<br />

I’ll argue in the next section.<br />

3 Rival Definitions<br />

I just mentioned the idea that a vague predicate could be defined as one that is susceptible<br />

to a Sorites argument. This account is sometimes attributed to Delia Graff Fara<br />

(2000), but it seems quite a widespread view. For instance, Terence Horgan (1995)<br />

says that it is distinctive of vague predicates that they can be used to generate inconsistency<br />

because the Sorites premises attaching to them are true. As I mentioned, such<br />

views are vulnerable to a wide variety of counterexamples. Many of these counterexamples<br />

also apply to rival definitions of vagueness.<br />

Matti Eklund (2005) develops a similar kind of definition. He starts with Crispin<br />

Wright’s (1975) famous definition of what it is for a predicate F to be tolerant.<br />

Whereas large enough differences in F’s parameter of application sometimes<br />

matter to the justice with which it is applied, some small enough<br />

difference never thus matters.<br />

Eklund’s position then is that F is vague iff it is part of semantic competence with<br />

respect to F to be disposed to accept that F is tolerant. Eklund agrees that it is inconsistent<br />

to assert that F is indeed tolerant. But as he has argued extensively elsewhere,<br />

the falsity of the tolerance principle is compatible with it being part of competence<br />

that one is disposed to accept it. (A view in the same family is put forward in Sorensen<br />

(2001).) I have no wish to dispute this part of Eklund’s theory. Indeed, the idea<br />

that meaning principles can be false, even inconsistent, seems to have been fairly<br />

fruitful in a variety of areas of Eklund’s philosophy. But I don’t think it helps with vague<br />

terms.


Vagueness as Indeterminacy 437<br />

Three of the problems with this have already been given. It is not clear what a<br />

parameter of application for a non-predicate like very even is, so it isn’t clear what it<br />

means to say that very is tolerant. It surely is not required of competent users of few<br />

children that they are disposed to accept the premises in our earlier Sorites argument.<br />

And for some vague predicates, like tall₁₇₉, the tolerance principle is not plausible to<br />

a competent speaker because it is not plausible that a “large enough” difference in the<br />

parameter of application (presumably height) matters. These problems all seem to<br />

carry over from the problems associated with Sorites based definitions.<br />

I suspect, though I’m less certain here, that the philosophically interesting cases<br />

also pose a problem for Eklund’s view. When we look at philosophically interesting<br />

cases, like being good, there are two distinct ways to read Eklund’s claim that competent<br />

speakers are disposed to accept the tolerance principle. These are the wide<br />

scope and the narrow scope reading. To see the ambiguity, let’s write out Eklund’s<br />

principle in full.<br />

Competent speakers are disposed to accept that whereas large enough<br />

differences in F’s parameter of application sometimes matter to the justice<br />

with which it is applied, some small enough difference never thus<br />

matters.<br />

Here’s the wide scope reading of this.<br />

F’s parameter of application is such that whereas competent speakers are<br />

disposed to accept that large enough differences in it sometimes matter to<br />

the justice with which F is applied, some small enough difference never<br />

thus matters.<br />

And here is the narrow scope reading, with a phrase added for emphasis.<br />

Competent speakers are disposed to accept that whereas large enough differences<br />

in F’s parameter of application, whatever it is, sometimes matter<br />

to the justice with which it is applied, some small enough difference<br />

never thus matters.<br />

To see the difference between the two cases, assume for the sake of argument that a<br />

competent speaker thinks that to be good is to do actions whose consequences have<br />

a high enough utility, whereas in reality to be good is to obey enough of God’s commands.<br />

In each case being good is vague, because we are using satisficing versions<br />

of consequentialism and divine command theory. So the parameter of application<br />

for being good is the number of God’s commands you obey. The competent speaker<br />

will not accept the wide scope version of tolerance with respect to being good, because<br />

they don’t think that large differences with respect to how many of God’s commands<br />

you obey matter to the justice with which being good is applied. Such cases can be<br />

multiplied endlessly to show that the wide scope version of Eklund’s principle cannot<br />

generally be true, because it makes it the case that competent speakers have correct



views on contentious philosophical matters the resolution of which goes beyond semantic<br />

competence. For these reasons Eklund has said (personal communication)<br />

that he intends the narrow scope version.<br />

But the narrow scope version also faces some difficulties. The most direct problem<br />

is that one can be a competent user of a term like food or dangerous or beautiful<br />

without having any thoughts about parameters of application. I suspect I was a competent<br />

user of these terms before I even had the concept of a parameter of application.<br />

Even bracketing this concern, there is a worry that competence requires knowing of a<br />

term whether it is vague or not. But this seems to be a mistake. It is not a requirement<br />

of competence with moral terms like good that one know whether they are maximising<br />

or satisficing terms. Tom Wolfe and the students he observed while writing I<br />

Am Charlotte Simmons seemed to disagree about whether going out with is vague,<br />

but they were both competent users, they simply disagreed on something like a normative<br />

question. (See Wolfe (2000) for more on his take on matters.) And it seems<br />

that two users of language could disagree over whether is thinking is vague without<br />

disagreeing over whether either is a competent semanticist. They may well disagree<br />

over whether either is a competent philosopher of mind, but such disagreements are<br />

neither here nor there with respect to our present purposes. So I don’t think that<br />

either disambiguation of Eklund’s principle can properly account for vagueness in<br />

philosophically interesting terms.<br />

Nicholas Smith argues for a definition of vagueness that uses some heavier-duty<br />

assumptions about the foundations of semantics. In particular, he sets out the following<br />

definition,<br />

Closeness If a and b are very similar in F-relevant respects, then ‘Fa’ and ‘Fb’ are<br />

very similar in respect of truth.<br />

and goes on to say that vague predicates are those that non-vacuously satisfy<br />

Closeness over some part of their domain. For this to work there must be, as Smith<br />

acknowledges, both degrees of truth and something like a distance metric defined on<br />

them. (These are separate assumptions; in the theory of <strong>Weatherson</strong> 2005 the first is<br />

true but not the second.) I won’t question those assumptions, but rather focus on the<br />

problems the definition has even granting the assumptions.<br />

As with the two definitions considered so far, it is hard to see how this could<br />

possibly be generalised to cover vagueness in non-predicates. It’s true (given our assumptions)<br />

that if a and b are similar in very tall-relevant respects, then ‘a is very tall’<br />

and ‘b is very tall’ will be similar in respect of truth. But that doesn’t show very is<br />

vague, for the same condition is satisfied when we replace very with the precise modifier<br />

doubly. This isn’t an argument that Smith’s definition couldn’t be extended to<br />

cover modifiers, but a claim that it is hard to see how this will work.<br />

The definition also has trouble with tall₁₇₉, for this satisfies Closeness vacuously.<br />

Though to be fair, given the logical assumptions Smith makes, it is possible that no<br />

predicate with the properties I’ve associated with tall₁₇₉ can be defined.<br />

More seriously, there is a problem with predicates like has few children. It just isn’t<br />

true that “An academic with two children has few children” is close in truth value to



“An academic with one child has few children”. In general Smith’s theory has trouble<br />

with, i.e. rules out by definition, vague terms where the underlying ‘relevant respects’<br />

are highly discrete. Note that the problem here extends to some predicates where the<br />

underlying facts are continuous. Consider the predicate is very late for the meeting. At<br />

least where I come from, a person who is roughly ten minutes late is a borderline case<br />

of this predicate. But which side of ten minutes late they are matters. (In what follows<br />

I make some wild guesses about how numerical degrees of truth, which aren’t part of<br />

my preferred theory, should operate. But I think the guesses are defensible given the<br />

empirical data.) If Alice is nine and three-quarters minutes late, and Bob is ten and<br />

a quarter minutes late, then the degree of truth of “Alice is very late” will be much<br />

smaller than the degree of truth of “Bob is very late”. The later you are, the truer “you<br />

are very late” gets, but crossing conventionally salient barriers like the ten-minute<br />

barrier matters much more to the degree of truth than crossing other barriers like<br />

the nine minutes thirty-three seconds barrier. Smith (in conversation) has suggested<br />

that he’s prepared to accept that is very late for the meeting is only partially vague<br />

if the truth-values ‘jump’ at the ten minute mark as I’m suggesting. But this seems<br />

improper, for this is as clear a case of a vague predicate as we have. Still, it’s worth<br />

remembering as always that every definition has its costs, and this may be a cost one<br />

chooses to live with. Personally I think it is excessive.<br />
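The jumpy degrees just described can be made concrete with a toy model. Everything below is my illustration of the paper’s admittedly speculative numbers: the linear base rate and the size of the jump are assumptions, not anything the text commits to.

```python
# A toy degree-of-truth function for "x is very late for the meeting".
# The linear base rate and the 0.4 jump at the conventionally salient
# ten-minute mark are illustrative assumptions only.

def very_late(minutes: float) -> float:
    if minutes <= 0:
        return 0.0
    if minutes >= 20:
        return 1.0
    base = minutes / 40            # slow continuous rise in degree
    jump = 0.4 if minutes > 10 else 0.0
    return min(1.0, base + jump)

alice = very_late(9.75)   # nine and three-quarters minutes late
bob = very_late(10.25)    # ten and a quarter minutes late
# Closeness fails non-trivially: Alice and Bob are very alike in
# lateness, yet the two degrees of truth are far apart.
assert bob - alice > 0.3
```

On a model like this the predicate satisfies Closeness almost everywhere but violates it across the ten-minute mark, which is why Smith would have to count it as only partially vague.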

Patrick Greenough did not put forward his theory as a definition of vagueness,<br />

but rather as a minimal theory to which all partisans could agree. Like Eklund, Greenough<br />

plays off Crispin Wright’s idea of tolerance. Roughly, a vague predicate is one<br />

that is epistemically tolerant – it’s one where you can’t know that a small difference<br />

makes a difference. Here’s a less rough statement of it, though note this is heavily<br />

paraphrased.<br />

Let τ be a variable that ranges over truth states (e.g. true, determinately true, not<br />

determinately determinately not determinately true, etc.) v a function from objects<br />

to real numbers such that whether x is F depends only on the value of v(x) (i.e. v is<br />

F’s parameter of application) and c a suitably small number. Then F is vague iff the<br />

following claim non-vacuously holds.<br />

∀τ∀α∀β∀a∀b, if |v(α) - v(β)| < c and a names α and b names β and it<br />

is knowable that Fa is τ then it is not knowable that Fb is not τ.<br />

Less formally, we can’t know where any boundary at any order of definiteness for F<br />

lies. (It isn’t clear in Greenough’s presentation exactly what the non-vacuous condition<br />

comes to. He only explicitly says that for the special case where τ is ‘is true’ there<br />

must be an a and a b such that it is knowable that Fa is τ and Fb is not τ, but maybe that<br />

should be extended to all τ.) Because of cases like in one’s early thirties this cannot do<br />

as a general definition, but it is easy enough to repair it by restricting the quantifier<br />

attaching to a and b to a range over which F has only vague boundaries. Doing this<br />

amounts to weakening Greenough’s claim from the view that vague terms have only<br />

vague boundaries to the view that they have some vague boundaries, which seems<br />

plausible. But still there are problems.



Most obviously, tall₁₇₉ does not non-vacuously satisfy the tolerance requirement.<br />

And like all the tolerance-based theories it is far from clear how it should be extended<br />

to vagueness in non-predicates. On the other hand, Greenough’s theory might well<br />

handle the discrete cases like has few children. I say might rather than does because it<br />

is rather hard to work out how the higher orders of vagueness work for such terms.<br />

I’ll simply note that there are some plausible enough epistemic models on which has<br />

few children satisfies his requirement.<br />

There is a problem which is distinctive to Greenough’s view of his theory as<br />

a minimal theory. As Smith notes, Greenough makes it a requirement that vague<br />

boundaries are unknown. But this is controverted in some mainstream theories,<br />

for example the version of supervaluationism in Dorr (2003). Since Dorr’s theory<br />

should not be ruled out by a minimal theory or a definition, this is a weakness in<br />

Greenough’s theory.<br />

The more philosophically interesting problems concern, appropriately enough,<br />

the philosophically interesting terms. Greenough has a proof that his definition is<br />

equivalent to a definition in terms of borderline cases. The proof has several assumptions,<br />

one of which is that we know what the parameter of application of a vague<br />

term is. More precisely, he assumes that we know everyone older than an old person<br />

is old, which is unproblematic, but he also assumes that the proof generalises to all<br />

vague cases, and this amounts to the assumption that we know parameters of application.<br />

As we’ve seen, this isn’t true of philosophically interesting vague terms. This<br />

leaves open the possibility that Greenough’s theory, unlike Smith’s and Eklund’s<br />

theories, overgenerates. The following is probably not a live possibility in any interesting<br />

sense, but it isn’t, I think, the kind of thing a definition (or minimal theory)<br />

should rule out by definition.<br />

It is possible that a kind of mysterianism about ethics is true, and we cannot know<br />

whether good is vague or precise. For a concrete example, let’s assume it is knowable<br />

that some kind of divine command theory is true, but it is unknowable whether to<br />

be good one must obey all of God’s commands or merely enough of them, where<br />

it is vague what counts as enough of them. In fact morality requires obeying all<br />

God’s commands, but this is not knowable – for all we know the satisficing version<br />

is the true moral theory. If this is the case then good will be epistemically tolerant,<br />

for we cannot know that a small difference in how many of God’s commands you<br />

obey makes a difference to whether you are good, or determinately good etc. But in<br />

fact good is precise, for it precisely means obeying all of God’s commands. Earlier I<br />

objected to Eklund’s theory because semantic competence does not require knowing<br />

parameters of application, especially as such. This is the converse objection – I claim<br />

that a term’s being precise does not imply that we know, or even could know, that it<br />

applies in virtue of a precise condition. All that matters is that it does apply in virtue<br />

of a precise condition.<br />

It’s a constant danger in philosophy that one infers from the falsity of all extant<br />

rivals that one’s preferred theory is correct. I certainly don’t want to argue that<br />

because Eklund’s, Smith’s and Greenough’s definitions are incorrect, the traditionalist<br />

definition I have offered must be right. But we can make that conclusion<br />

more plausible by noting how widely the arguments levelled here generalise. The



philosophically interesting cases seem to tell against any definition of vagueness in<br />

terms of semantic competence, for they show that competent users can have exactly<br />

the same attitude towards vague terms as they have towards precise terms. And our<br />

moral example suggests that any definition in terms of epistemic properties will be<br />

in trouble for it might not be knowable whether a particular term is vague or precise.<br />

Finally, the cases of vague predicate modifiers raise difficulties for any attempt<br />

to define the vagueness of a term in terms of properties of sentences in which it is<br />

used rather than mentioned. For it seems that as long as very* attaches only to vague<br />

predicates, then whether very* is vague or precise will make no salient differences to<br />

the sentences in which it appears. So we have to look at sentences in which the allegedly<br />

vague term is mentioned. And while I don’t have a definitive argument here,<br />

I think looking at the range of cases we want to cover, and in particular at the range<br />

of cases where tolerance-type principles fail to be non-vacuously satisfied, our best<br />

option for completing these sentences is to look at whether the term has a determinate<br />

or indeterminate denotation. We can then pass the questions of what determinacy<br />

consists in, and in particular the question of whether it is an epistemic or semantic<br />

feature, to the theorist of vagueness.


True, Truer, Truest<br />

What the world needs now is another theory of vagueness. Not because the old theories<br />

are useless. Quite the contrary, the old theories provide many of the materials<br />

we need to construct the truest theory of vagueness ever seen. The theory shall be<br />

similar in motivation to supervaluationism, but more akin to many-valued theories<br />

in conceptualisation. What I take from the many-valued theories is the idea that some<br />

sentences can be truer than others. But I say very different things about the ordering over<br />

sentences this relation generates. I say it is not a linear ordering, so it cannot be represented<br />

by the real numbers. I also argue that since there is higher-order vagueness, any<br />

mapping between sentences and mathematical objects is bound to be inappropriate.<br />

This is no cause for regret; we can say all we want to say by using the comparative<br />

truer than without mapping it onto some mathematical objects. From supervaluationism<br />

I take the idea that we can keep classical logic without keeping the familiar<br />

bivalent semantics for classical logic. But my preservation of classical logic is more<br />

comprehensive than is normally permitted by supervaluationism, for I preserve classical<br />

inference rules as well as classical sequents. And I do this without relying on the<br />

concept of acceptable precisifications as an unexplained explainer.<br />

The world does not need another guide to varieties of theories of vagueness, especially<br />

since Timothy Williamson (1994) and Rosanna Keefe (2000) have already<br />

provided quite good guides. I assume throughout familiarity with popular theories<br />

of vagueness.<br />

1 Truer<br />

The core of my theory is that some sentences involving vague terms are truer than<br />

others. I won’t give an analysis of truer; instead I will argue that we already tacitly<br />

understand this relation. The main argument for this will turn on a consideration of<br />

two ‘many-valued’ theories of vagueness, one of which will play a central role (as the<br />

primary villain) in what follows.<br />

The most familiar many-valued theory, call it M, says there are continuum many<br />

truth values, and they can be felicitously represented by the interval [0, 1]. The four<br />

main logical connectives (and, or, if and not) are truth-functional. The functions are:<br />

V(A ∧ B) = min(V(A), V(B))<br />

V(A ∨ B) = max(V(A), V(B))<br />

V(A → B) = min(1, 1 − V(A) + V(B))<br />

V(¬A) = 1 − V(A)<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical<br />

Studies 123 (2005): 47-70. Thanks to audiences at the 2002 APA Central, Edinburgh and especially<br />

the 2003 BSPC for helpful comments. From the latter I’m especially grateful to Jonathan Bennett, Alex<br />

Byrne, Cian Dorr, Andy Egan, Elizabeth Harman, Robin Jeshion, Mike Nelson, Jonathan Schaffer, Ted<br />

Sider and Gabriel Uzquiano. I’m also grateful to two very helpful referee reports.



where V is the valuation function on sentences, min(x, y) is the smaller of x and y<br />

and max(x, y) is the larger of x and y.<br />
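Those truth functions are simple enough to state directly in code. This is a minimal sketch of M’s connectives, with degrees of truth as floats in [0, 1]; the function names are mine.

```python
# M's truth-functional connectives over degrees of truth in [0, 1].

def v_and(a: float, b: float) -> float:
    return min(a, b)               # V(A ∧ B) = min(V(A), V(B))

def v_or(a: float, b: float) -> float:
    return max(a, b)               # V(A ∨ B) = max(V(A), V(B))

def v_if(a: float, b: float) -> float:
    return min(1.0, 1.0 - a + b)   # V(A → B) = min(1, 1 − V(A) + V(B))

def v_not(a: float) -> float:
    return 1.0 - a                 # V(¬A) = 1 − V(A)

# Williamson's later complaint in miniature: when V(p) = 0.5, the
# contradiction p ∧ ¬p and the tautology ¬(p ∧ ¬p) are both half-true.
p = 0.5
assert v_and(p, v_not(p)) == 0.5
assert v_not(v_and(p, v_not(p))) == 0.5
```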

Adopting these rules for the connectives commits us to adopting the logic ŁC.<br />

M is the theory that this semantic model, under its most natural interpretation, is<br />

appropriate for vague natural languages. (We’ll discuss less natural interpretations<br />

presently.)<br />

M tells a particularly nice story about the Sorites. A premise like If she’s rich,<br />

someone with just a little less money is also rich will have a very high truth value.<br />

If we make the difference in money between the two subjects small enough, this<br />

conditional will have a truth value arbitrarily close to 1.<br />
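To see the numbers at work, here is a toy model of that Sorites. The degree function for rich is my own invented example (fully false at $0, fully true at $1,000,000, linear in between), not anything from the paper.

```python
# Toy Sorites in M: each premise is nearly true, yet they chain from
# a clearly rich subject to a clearly non-rich one.

def rich(money: float) -> float:
    # Invented degree function: linear between $0 and $1,000,000.
    return max(0.0, min(1.0, money / 1_000_000))

def v_if(a: float, b: float) -> float:
    return min(1.0, 1.0 - a + b)   # Łukasiewicz conditional

# "If someone with $n is rich, someone with $1000 less is also rich."
premises = [v_if(rich(n), rich(n - 1000))
            for n in range(1_000_000, 0, -1000)]
assert min(premises) > 0.998       # every premise is almost true
assert rich(1_000_000) == 1.0 and rich(0) == 0.0
```

Shrinking the $1000 step pushes every premise’s truth value as close to 1 as we like, which is M’s diagnosis of the argument’s appeal.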

M also tells a nice story about borderline cases and determinateness. An object<br />

a is a borderline case of being an F just in case the sentence a is F has a truth value<br />

between 0 and 1 exclusive. Similarly, a is a determinate F just in case the truth value of<br />

a is F is 1. (It is worthwhile comparing how simple this analysis of determinateness is<br />

to the difficulties supervaluationists have in providing an analysis of determinateness.<br />

On this topic, see Williamson (1995), McGee and McLaughlin (1998) and Williamson<br />

(2004).)<br />

But M tells a particularly implausible story about contradictions. Here is how<br />

Timothy Williamson (1994) makes this problem vivid.<br />

More disturbing is that the law of non-contradiction fails . . . . ¬(p ∧ ¬p)<br />

always has the same degree of truth as p ∨ ¬p, and thus is perfectly true<br />

only when p is either perfectly true or perfectly false. When p is half-true,<br />

so are both p ∧ ¬p and ¬(p ∧ ¬p). (Williamson, 1994, 118)<br />

At some point [in waking up] ‘He is awake’ is supposed to be half-true,<br />

so ‘He is not awake’ will be half-true too. Then ‘He is awake and he is<br />

not awake’ will count as half-true. How can an explicit contradiction be<br />

true to any degree other than 0? (Williamson, 1994, 136)<br />

There is a way to keep the semantic engine behind M while avoiding this consequence.<br />

(The following few paragraphs are indebted pretty heavily to the criticisms<br />

of Strawson’s theory of descriptions in Dummett (1959).)<br />

Consider an interpretation of the above semantics on which there are only two<br />

truth values: True and False. Any sentence that gets truth value 1 is true, all the<br />

others are false. The numbers in [0, 1) represent different ways of being false. (As<br />

Tolstoy might have put it, all true sentences are alike, but every false sentence is false<br />

in its own unique way.) Which way a sentence is false can affect the truth value of<br />

compounds containing that sentence. In particular, if A and B are false, then the<br />

truth values of Not A and If A then B will depend on the ways A and B take their<br />

truth values. If V(A) = 0 and V(B) = 0.3, then Not A and If A then B will be true,<br />

but if V(A) becomes 0.6, and remember this is just another way of being false, both<br />

Not A and If A then B will be false.<br />
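The reinterpretation just described can be sketched directly, using the worked example from this paragraph; the function names are mine.

```python
# Keep M's numerical machinery, but read only value 1 as truth;
# every number in [0, 1) is just a different way of being false.

def v_not(a: float) -> float:
    return 1.0 - a

def v_if(a: float, b: float) -> float:
    return min(1.0, 1.0 - a + b)

def is_true(v: float) -> bool:
    return v == 1.0

a, b = 0.0, 0.3                   # A and B are both false
assert not is_true(a) and not is_true(b)
assert is_true(v_not(a))          # Not A is true
assert is_true(v_if(a, b))        # If A then B is true

a = 0.6                           # just another way of being false
assert not is_true(v_not(a))      # now Not A is false
assert not is_true(v_if(a, b))    # and so is If A then B
```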

The new theory we get, one I’ll call M_D, is similar to M in some respects. For<br />

example, it agrees about what the axioms should be for a logic for natural language.



But it has several philosophical differences. In particular, it has none of the three<br />

characteristics of M we noted above.<br />

It cannot tell as plausible a story as M does about the Sorites. If any sentence with<br />

truth value below 1 is false, then many of the premises in a Sorites argument are false.<br />

This is terrible – it was bad enough to be told that one of the premises was false, but<br />

now we find many thousands of them are false. I doubt that being told they are false<br />

in a distinctive way will improve our estimation of the theory. Similarly, it is hard<br />

to see just how the new theory has anything interesting to say about the concept of a<br />

borderline case.<br />

On the other hand, according to M_D, contradictions are always false. To be sure,<br />

a contradiction might be false in some obscure new way, but it is still false. Recall<br />

Williamson’s objection that an explicit contradiction should be true to degree 0 and<br />

nothing more. This objection only works if being true to degree 0.5 is meant to be<br />

semantically significant. If being ‘true to degree 0.5’ is just another way of being<br />

false, then there is presumably nothing wrong with contradictions being true to degree<br />

0.5. This is not to say Williamson’s objection is no good, since he intended it as an<br />

objection to M, but just to say that re-interpreting the semantic significance of the<br />

numbers in M makes a philosophical difference.<br />

Despite M_D’s preferable treatment of contradictions, I think M is overall a better<br />

theory because it has a much better account of borderline cases. But for now I want<br />

to stress a simpler point: M and M_D are different theories of vagueness, and we<br />

grasp the difference between these theories. One crucial difference between the two<br />

theories is that in M, but not M_D, S₁ is truer than S₂ if V(S₁) is greater than V(S₂).<br />

In M_D, if S₁ is truer than S₂, V(S₁) must be one and V(S₂) less than one. And that<br />

is the only difference between the two theories. So if we understand this difference,<br />

we must grasp this concept truer than. Indeed, it is in virtue of grasping this concept<br />

that we understand why saying each of the Sorites conditionals is almost true is a<br />

prima facie plausible response to the Sorites, and why having a theory that implies<br />

contradictions are truer than many other sentences is a rather embarrassing thing.<br />

I have implicitly defined truer by noting its theoretical role. As David Lewis<br />

(1972) showed, terms can be implicitly defined by their theoretical role. There is<br />

one unfortunate twist here in that truer is defined by its role in a false theory, but that<br />

does not block the implicit definition story. We know what phlogiston and ether mean<br />

because of their role in some false theories. The meaning of truer can be extracted in<br />

the same way from the false theory M.<br />

2 Further Reflections on Truer<br />

As noted, I won’t give a reductive analysis of truer. The hopes for doing that are no<br />

better than the hopes of giving a reductive analysis of true. But I will show that we<br />

pre-theoretically understand the concept.<br />

My primary argument for this has already been given. Intuitively we do understand<br />

the difference between M and its rival M_D, and this is only explicable by our<br />

understanding truer. Hence we understand truer.



Second, it’s noteworthy that truer is morphologically complex. If we understand<br />

true, and understand the modifier -er, then we know enough in principle to know<br />

how they combine. Not every predicate can be turned into a comparative. But most<br />

can, and our default assumption should be that true is like the majority.<br />

I have heard two arguments against that assumption. First, it could be argued<br />

that most comparatives in English generate linear orderings, but truer generates a<br />

non-linear ordering. I reject the premise of this argument. Cuter, Smarter, Smellier,<br />

and Tougher all generate non-linear orderings over their respective domains, and they<br />

seem fairly indicative of large classes. Second, it could be argued that it’s crucial to<br />

understanding comparatives that we understand the interaction of the underlying<br />

adjectives with comparison classes. Robin Jeshion and Mike Nelson made this objection<br />

in their comments on my paper at BSPC 2003. Again, the premise is not<br />

obviously true. We can talk about some objects being straighter or rounder despite<br />

the fact that it’s hard to understand round for an office building or straight for a line<br />

drive. (Jonathan Bennett made this point in discussion at BSPC.) Straight and round<br />

either don’t have or don’t need comparison classes, but they form comparatives. So<br />

true, which also does not take comparison classes, could also form a comparative.<br />

Finally, if understanding the inferential role of a logical operator helps us know its<br />

meaning, then it is notable that truer has a very clear inferential role. It is the same as<br />

a strict material implication □(q ⊃ p) defined using a necessity operator whose logic<br />

is KT. Since many operators have just this logic, this doesn’t individuate truer, but it<br />

helps with inferential role semantics aficionados.<br />

I claim that the concept truer, and the associated concept as true as, are the only<br />

theoretical tools we need to provide a complete theory of vagueness. It is simplest<br />

to state the important features of my theory by contrasting it with M. I keep the<br />

following good features of M.<br />

G1 There are intermediate sentences, i.e. sentences that are truer than some sentences<br />

and less true than others. For definiteness, I will say S is intermediate iff<br />

S is truer than 0=1 and less true than 0=0.<br />

G2 a is a borderline F iff a is F is intermediate, and a is determinately F iff a is F<br />

and a is not a borderline F.<br />

I won’t repeat the arguments here, but I take G1 to be a large advantage of theories<br />

like M over epistemicist theories. (See Burgess (2001), Sider (2001b) and Weatherson<br />

(2003a) for more detailed arguments to this effect.) And as noted G2 is a much simpler<br />

analysis of determinacy and borderline than supervaluationists have been able to<br />

offer.<br />

I drop the following bad features of M.<br />

B1 Some contradictions are intermediate sentences.<br />

On my theory all contradictions are determinately false, and determinately<br />

determinately false, and so on. The argument for this has been given above.



B2 Some classical tautologies are intermediate sentences.<br />

On my theory all classical tautologies are determinately true, and determinately<br />

determinately true, and so on. We will note three arguments for this<br />

being an improvement in the next section.<br />

B3 Some classical inference rules are inadmissible.<br />

On my theory all classical inference rules are admissible. As Williamson (1994)<br />

showed, the most prominent version of supervaluationism is like M in ruling<br />

some classical rules to be inadmissible, and this is clearly a cost of those theories.<br />

B4 Sentences of the form S is intermediate are never intermediate.<br />

I will argue below this is a consequence of M, and it means it is impossible to<br />

provide a plausible theory of higher-order vagueness within M. In my theory<br />

we can say that there is higher-order vagueness by treating truer as an iterable<br />

operator, so we can say that S is intermediate is intermediate. If S is a is F, that’s<br />

equivalent to saying that a is a borderline case of a borderline case of an F.<br />

Essentially we get our theory of higher-order vagueness by simply iterating our<br />

theory of first-order vagueness, which is what Williamson does in his justly<br />

celebrated treatment of higher-order vagueness. Note it’s not just M that has<br />

troubles with higher-order vagueness. See Williamson (1994) and Weatherson<br />

(2003b) for the difficulties supervaluationists have with higher-order vagueness.<br />

The treatment of higher-order vagueness here is a substantial advantage of my<br />

theory over supervaluationism.<br />

B5 Truer is a linear relation.<br />

On my theory it need not be the case that S₁ is truer than S₂, or S₂ is truer than<br />

S₁, or they are as true as each other. In the last section I will argue that this is a<br />

substantial advantage of my theory. I claim that truer generates a Boolean lattice<br />

on possible sentences of the language. (For a familiar example of a Boolean<br />

lattice, think of the subsets of � ordered by the subset relation.)<br />

I also provide a very different, and much more general, treatment of the Sorites than<br />

is available within M. The biggest technical difference between my theory and M<br />

concerns the relationship between the semantics and the logic. In M the logic falls out<br />

from the truth-tables. Since I do not have the concept of an intermediate truth value<br />

in my theory, I could not provide anything like a truth-table. Instead I posit several<br />

constraints on the interaction of truer with familiar connectives, posit an analysis<br />

of validity in terms of truer, and note that those two posits imply that all and only<br />

classically admissible inference rules are admissible.<br />

3 Constraints on Truer and Classical Logic<br />

The following ten constraints on truer seem intuitively compelling. I’ve listed here<br />

both the philosophically important informal claim and the formal interpretation of<br />

that claim. (I use A ≥ T B as shorthand for A is at least as true as B. Note that all the<br />

quantifiers over sentences here are possibilist quantifiers: we quantify over all possible<br />

sentences in the language.)



(A1) ≥ T is a weak ordering (i.e. reflexive and transitive)<br />

If A ≥ T B and B ≥ T C then A ≥ T C<br />

A ≥ T A<br />

(A2) ∧ is a greatest lower bound with respect to ≥ T<br />

A ∧ B ≥ T C iff A ≥ T C and B ≥ T C<br />

C ≥ T A ∧ B iff for all S such that A ≥ T S and B ≥ T S it is also the case that<br />

C ≥ T S<br />

(A3) ∨ is a least upper bound with respect to ≥ T<br />

A ∨ B ≥ T C iff for all S such that S ≥ T A and S ≥ T B, it is also the case that<br />

S ≥ T C<br />

C ≥ T A ∨ B iff C ≥ T A and C ≥ T B<br />

(A4) ¬ is order inverting with respect to ≥ T<br />

A ≥ T B iff ¬B ≥ T ¬A<br />

(A5) Double negation is redundant<br />

¬¬A = T A<br />

(A6) There is an absolutely false sentence S F and an absolutely true sentence S T<br />

There are sentences S F and S T such that S F = T ¬S T and ¬S F = T S T and for all<br />

S: S T ≥ T S ≥ T S F<br />

(A7) Contradictions are absolutely false<br />

A ∧ ¬A = T S F<br />

(A8) ∀ is a greatest lower bound with respect to ≥ T<br />

A ≥ T ∀x(φx) iff for all S such that for all o, if n is a name of o then φn ≥ T S,<br />

it is the case that A ≥ T S<br />

∀x(φx) ≥ T A iff for all o, if n is a name of o then φn ≥ T A<br />

(A9) ∃ is a least upper bound with respect to ≥ T<br />

A ≥ T ∃x(φx) iff for all o, if n is a name of o then A ≥ T φn<br />

∃x(φx) ≥ T A iff for all S such that for all o, if n is a name of o then S ≥ T φn,<br />

it is also the case that S ≥ T A<br />

(A10) A material implication with respect to ≥ T can be defined.<br />

There is an operator → such that<br />

(1) B → A ≥ T S T iff A ≥ T B<br />

(2) (A ∧ B) → C = T A → (B → C)<br />

Apart from (A10) these are fairly straightforward. We can’t argue for (A10) by saying<br />

English if…then is a material implication, because that leads directly to the paradoxes<br />

of material implication. Assuming that ¬A∨ B is a material implication is equivalent<br />

to assuming (inter alia) that A∨¬A is perfectly true. I believe this, but since it is denied<br />

by many I want that to be a conclusion, not a premise. So the argument for (A10)<br />

must be a little indirect. In particular, we will appeal to the behaviour of quantifiers.<br />

We can formally represent All Fs are Gs in two ways: using restricted or unrestricted<br />

quantifiers. In the first case the formal representation will look like:<br />

∀x(Fx ? Gx)<br />

with some connective in place of ‘?’. But it seems clear that whatever connective goes<br />

in there must be a material implication. In the second case, the formal representation<br />

will look like:



[∀x: Fx] Gx<br />

In that case, we can define a connective ∇ that satisfies the definition of a material<br />

implication:<br />

A∇B = df [∀x: A∧ x=x] (B ∧ x=x)<br />

This is equivalent to the odd (but intelligible) sentence Everything such that A is such<br />

that B. Again, considerations about what should be logical truths involving quantifiers<br />

suggest that ∇ must be a material implication. So either way there should be a<br />

material implication present in the language, as (A10) says.<br />

Given (A1) to (A10) it follows that this material implication is equivalent to ¬A∨<br />

B, and hence A∨¬A is a logical truth. This is a surprising conclusion, since intuitively<br />

vagueness poses problems for excluded middle, but I think it is more plausible that<br />

vague instances of excluded middle are problematic for pragmatic reasons than that<br />

any of (A1) to (A10) are false.<br />

What is interesting about these ten constraints is that they suffice for classical<br />

logic, with just one more supposition. I assume that an argument is valid iff it is<br />

impossible for the premises taken collectively to be truer than the conclusion, i.e. iff<br />

it is impossible for the conjunction of the premises to be truer than the conclusion.<br />

Given that, we get:<br />

∀A1, . . . , An, B: A1, . . . , An ⊢ T B iff, according to classical logic, A1,<br />

. . . , An ⊢ B<br />

(I use Γ ⊢ T A to mean that in all models for ≥ T that satisfy the constraints, here (A1)<br />

to (A10), the conclusion is at least as true as the greatest lower bound of the premises.)<br />

I won’t prove this result, but the idea is that (A1) to (A10) imply that ≥ T defines a<br />

Boolean lattice over equivalence classes of sentences with respect to = T . And all<br />

Boolean lattices are models for classical logic, from which our result follows. Indeed,<br />

Boolean lattices are models for classical logic in the strong sense that classical inference<br />

rules, such as conditional proof and reductio, are admissible in logics defined on<br />

them, so we also get the admissibility of classical inference rules in this theory. (Note<br />

that this result only holds in the right-to-left direction for languages that contain the<br />

≥ T operator. Once this operator is added, some arguments that are not classically<br />

valid, such as B ≥ T A, A ⊢ T B, will be valid. But the addition of this operator is conservative:<br />

if we look at the ≥ T -free fragment of such languages, the above result still<br />

holds in both directions.)<br />
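The Boolean-lattice claim can be checked on a small example. The sketch below uses the four subsets of a two-point set, reading A ≥ T B as “the truth set of B is contained in that of A”; this is an invented toy model, not the paper’s official model theory, but it verifies (A1) to (A7) and the perfect truth of excluded middle:<br />

```python
from itertools import product

# Toy model of (A1)-(A7): sentences are subsets of a two-point set,
# 'at least as true as' is containment, ∧ is ∩, ∨ is ∪, ¬ is complement.
# This is an illustrative sketch, not Weatherson's official model theory.
points = frozenset({0, 1})
sentences = [frozenset(s) for s in [(), (0,), (1,), (0, 1)]]

def geq(a, b):      # a is at least as true as b
    return b <= a   # frozenset subset comparison

neg = lambda a: points - a
top, bottom = points, frozenset()

# (A1) reflexive and transitive
assert all(geq(a, a) for a in sentences)
assert all(geq(a, c) for a, b, c in product(sentences, repeat=3)
           if geq(a, b) and geq(b, c))

# (A2)/(A3): ∧ and ∨ are greatest lower / least upper bounds
for a, b, c in product(sentences, repeat=3):
    assert geq(a & b, c) == (geq(a, c) and geq(b, c))
    assert geq(c, a | b) == (geq(c, a) and geq(c, b))

# (A4)/(A5): negation inverts the order; double negation is redundant
for a, b in product(sentences, repeat=2):
    assert geq(a, b) == geq(neg(b), neg(a))
    assert neg(neg(a)) == a

# (A6)/(A7): absolute truth and falsity exist; contradictions are absolutely false
assert all(geq(top, a) and geq(a, bottom) for a in sentences)
assert all(a & neg(a) == bottom for a in sentences)

# Excluded middle comes out perfectly true, as the paper claims
assert all(a | neg(a) == top for a in sentences)
print("all lattice checks pass")
```

The same pattern scales to the powerset of any set, which is why the subsets-of-ℕ example earlier in the paper serves as the familiar illustration.<br />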

There are three reasons for wanting to keep classical logic in a theory of vagueness.<br />

First, as Williamson has stressed, classical logic is at the heart of many successful<br />

research programs. Second, non-classical theories of vagueness tend to abandon<br />

too much of classical logic. For instance, M abandons the very plausible schema<br />

(A ∧ (A → B)) → B. The third reason is the one given here: these ten independently<br />

plausible constraints on truer entail that the logic for a language containing truer<br />

should be classical. These three arguments add up to a powerful case that non-classical<br />

theories like M are mistaken, and we should prefer a theory that preserves classical<br />

logic.



4 Semantics and Proof Theory<br />

In this section I will describe a semantics and proof theory for a language containing<br />

truer as an iterable operator. This is important for the theory of higher-order vagueness.<br />

I say that a is a borderline borderline F just in case the sentence a is a borderline<br />

F is intermediate, where ‘borderline’ is analysed as in section 2. It might not be obvious<br />

that it is consistent with (A1) to (A10) that any sentence a is a borderline F could<br />

be intermediate. One virtue of the model theory outlined here is that it shows this is<br />

consistent.<br />

For comparison, note that M as it stands has no way of dealing with higher-order<br />

vagueness, i.e. with borderline cases of borderline cases of F -ness. If every sentence<br />

a is a borderline F either does or does not receive an integer truth value, then this<br />

intuitive possibility is ruled out. We cannot solve the problem simply by iterating M.<br />

(This is a point stressed by (Williamson, 1994, Ch. 4).) We cannot say that it is true<br />

to degree 0.5 that (2) is true to degree 1, and true to degree 0.5 that it is true to degree<br />

0.8. For then it is only true to degree 0.5 that (2) has some truth value or other. And<br />

the use of truth-tables to generate a logic presupposes that every sentence has some<br />

truth value or other. If this is not determinately true, M is not a complete theory.<br />

So the model theory will show that our theory is substantially better than M in this<br />

respect.<br />

Consider the following (minor) variant on KT. Vary the syntax so □A is only<br />

well-formed if A is of the form B → C. Call the resulting logic KT R , with the R<br />

indicating the syntactic restriction. The restriction makes very little difference. Since<br />

A is equivalent to (A → A) → A, even if □A is not well-formed in KT R , the KT-equivalent<br />

sentence □((A → A) → A) will be well-formed. The Kripke models for<br />

KT R are quite natural. □(B → C) is true at a point iff all accessible points at which<br />

B is true are points at which C is true. (There is no restriction on the accessibility<br />

relation other than reflexivity.)<br />

Since KT R is so similar to KT, we can derive most of its formal properties by<br />

looking at the derivations of similar properties for KT. (The next few paragraphs<br />

owe a lot to (Goldblatt, 1992, Chs.1-3).) Let’s start with an axiomatic proof theory.<br />

The axioms for KT R are:<br />

• All classical tautologies<br />

• All well-formed instances of K: □(A → B) → (□A → □B)<br />

• All well-formed instances of T: □A → A<br />

The rules for KT R are<br />

Modus Ponens If A → B is a theorem and A is a theorem, then B is a theorem<br />

Restricted Necessitation If A is a theorem and □A is well-formed, then □A is a<br />

theorem.<br />

Given these, we can now define a maximal consistent set for KT R . It is a set S of<br />

sentences with the following three properties:



• All theorems of KT R are in S.<br />

• For all A, either A is in S or ¬A is in S.<br />

• S is closed under modus ponens.<br />

The existence of Kripke models for KT R shows that some maximal consistent sets<br />

exist: the set of truths at any point will be a maximal consistent set. The canonical<br />

model for KT R is 〈W , R,V 〉 where<br />

• W is the set of maximal consistent sets for KT R<br />

• R is a subset of W × W such that w1 R w2 iff for all A such that □A ∈ w1, A ∈ w2<br />

• V is the valuation such that V (A) = {w: A ∈ w}<br />

Since all instances of T are theorems, it can be easily shown that R is reflexive, and<br />

hence that this is a frame for KT R and hence that KT R is canonically complete.<br />

We can translate all sentences of KT R into a language that contains ≥ T but not □.<br />

Just replace □(B → A) with A ≥ T B wherever □ occurs, including inside sentences.<br />

(We appeal here and here alone to the restriction in KT R .) Translating the axioms for<br />

KT R , we get the following axioms for the logic of ≥ T .<br />

• All classical tautologies<br />

• All instances of: (B ≥ T A) → ((A ≥ T (A → A)) → (B ≥ T (B → B)))<br />

• All instances of: (B ≥ T A) → (A → B)<br />

The rules are<br />

Modus ponens If A → B is a theorem and A is a theorem, then B is a theorem.<br />

Determination If A → B is a theorem, then B ≥ T A is a theorem.<br />

We can simplify somewhat by replacing the second axiom schema with<br />

• All instances of: A ≥ T B → (B ≥ T C → A ≥ T C)<br />

A Kripke model for this logic is just a Kripke model for KT, except we say B ≥ T A<br />

is true at a point iff B is true at all accessible points at which A is true. This leads<br />

to a semantic definition of validity. An argument is valid iff it preserves truth at any<br />

point in all such models.<br />
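The semantic clause just given is easy to mechanise. Below is a minimal sketch over a three-point Kripke model with a reflexive accessibility relation; the model and valuation are invented for illustration:<br />

```python
# A small evaluator for the clause just given: B >=T A is true at a point iff
# B is true at every accessible point at which A is true. The three-point
# model and valuation below are invented for illustration.
W = {1, 2, 3}
R = {1: {1, 2, 3}, 2: {2}, 3: {3}}     # accessibility; reflexive, as required
V = {"p": {1, 2}, "q": {1, 2, 3}}      # where each atomic sentence is true

for w in W:                            # sanity check: R is reflexive
    assert w in R[w]

def truer_at(w, b, a):
    """Is 'b is at least as true as a' true at point w?"""
    return all(v in V[b] for v in R[w] if v in V[a])

# At point 1, q is at least as true as p: every accessible p-point is a q-point
assert truer_at(1, "q", "p")
# But not conversely: q holds at point 3, where p fails
assert not truer_at(1, "p", "q")
```

Validity in the semantic sense is then truth-preservation at every point of every such model.<br />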

Maximal consistent sets with respect to ≥ T and a canonical model for ≥ T can be<br />

easily constructed by parallel with the maximal consistent sets and canonical models<br />

for KT R . These constructions show that if A is a theorem of the logic for ≥ T , then<br />

it is true at all points in all models. More generally, they can be used to show that<br />

this logic is canonically complete, though the details of the proof are omitted. The<br />

maximal consistent sets for ≥ T , i.e. the points in the canonical model, just are the<br />

results of applying the translation rule □(B → A) ⇒ A ≥ T B to the (sentences in the)<br />

maximal consistent sets for KT R .<br />

That’s important because the points in the canonical model for ≥ T are useful for<br />

understanding the relationship between truer and true, and for understanding what<br />



languages are. The set of true sentences in English is one of the points in the canonical<br />

model for ≥ T. For semantic purposes, languages just are points in this canonical<br />

model. It is indeterminate just which such point English is, but it is one of them.<br />

For many purposes it is useful to think of the theory based on truer as a variant<br />

on M. But considering the canonical model for ≥ T highlights the similarities with<br />

supervaluationism rather than the similarities with M, for the points in the canonical<br />

model look a lot like precisifications. It is, however, worth noting the many<br />

differences between my theory and supervaluationism. I identify languages with a<br />

single point rather than with a set of points, which leads to the smoother treatment<br />

of higher-order vagueness on my account. Also, I don’t start with a set of acceptable<br />

points/precisifications. The canonical model contains all the points that are formally<br />

consistent, and I identify particular languages, like English, by vaguely saying that<br />

the point that represents English is (roughly) there. (Imagine my vaguely pointing<br />

at some part of the model when saying this.) The most important difference is that<br />

I take the points, with the truer than relation already defined, to be primitive, and<br />

the accessibility/acceptability relation to be defined in terms of them. This reflects<br />

the fact that I take the truer relation to be primitive, and determinacy to be defined<br />

in terms of it, whereas typically supervaluationists do things the other way around.<br />

None of these differences are huge, but they all favour my theory over supervaluationism.<br />

To return to the point about higher-order vagueness, note that all of the following<br />

sentences are consistent in KT, and hence their ‘equivalents’ using > T are also<br />

consistent.<br />

(1) ¬□p ∧ ¬□¬p<br />

0=0 > T p > T 0=1<br />

(2) ¬□(¬□p ∧ ¬□¬p) ∧ ¬□¬(¬□p ∧ ¬□¬p)<br />

0=0 > T (0=0 > T p > T 0=1) > T 0=1<br />

(3) ¬□(¬□(¬□p ∧ ¬□¬p) ∧ ¬□¬(¬□p ∧ ¬□¬p))<br />

∧ ¬□¬(¬□(¬□p ∧ ¬□¬p) ∧ ¬□¬(¬□p ∧ ¬□¬p))<br />

0=0 > T (0=0 > T (0=0 > T p > T 0=1) > T 0=1) > T 0=1<br />

And obviously this pattern can be extended indefinitely. In general, any claim of the<br />

form that a is an n-th order borderline case of an F is consistent in this theory, as can<br />

be seen by comparison with KT.<br />

To close this section, I will note that we can also provide a fairly straightforward<br />

natural deduction system for the logic of � T . There are two philosophical benefits<br />

to doing this. First, it proves my earlier claim that I can keep all inference rules of<br />

classical logic. Second, it helps justify (A1) to (A10). Most rules correspond directly<br />

to one of the constraints. For that reason I’ve set out all the rules, even though you’ve<br />

probably seen most of them before.<br />

(∧ In) Γ ⊢ A, ∆ ⊢ B ⇒ Γ ∪ ∆ ⊢ A∧ B<br />

(∧ Out-left) Γ ⊢ A∧ B ⇒ Γ ⊢ A<br />

(∧ Out-right) Γ ⊢ A∧ B ⇒ Γ ⊢ B



(∨ In-left) Γ ⊢ B ⇒ Γ ⊢ A∨ B<br />

(∨ In-right) Γ ⊢ A ⇒ Γ ⊢ A∨ B<br />

(∨ Out) Γ∪{A} ⊢ C , ∆∪{B} ⊢ C , Λ ⊢ A∨ B ⇒ Γ ∪ ∆ ∪ Λ ⊢ C<br />

(→ In) Γ∪{A} ⊢ B ⇒ Γ ⊢ A → B<br />

(→ Out) Γ ⊢ A → B, ∆ ⊢ A ⇒ Γ ∪ ∆ ⊢ B<br />

(¬ In) Γ∪{A} ⊢ B ∧ ¬B ⇒ Γ ⊢ ¬A<br />

(¬ Out) Γ ⊢ ¬¬A ⇒ Γ ⊢ A<br />

(≥ T In) Γ ⊢ A ⇒ {B ≥ T C : B ∈ Γ} ⊢ A ≥ T C<br />

(≥ T Convert) Γ ⊢ A ≥ T B ⇒ Γ ⊢ (B → A) ≥ T C<br />

(≥ T Out) Γ ⊢ A ≥ T B ⇒ Γ ⊢ B → A<br />

(Thanks to Gabriel Uzquiano for several probing questions that led to this section<br />

being written.)<br />

5 Sexy Sorites<br />

A good theory of vagueness should tell us two things about the Sorites. The easy<br />

part is to say what is wrong with Sorites arguments: not all premises are perfectly<br />

true. The hard part is to say why the premises looked plausible to start with. The M<br />

theorist has the beginnings of a story, though not the end of a story. The beginning<br />

is that all the premises in a typical Sorites argument are nearly true, and they look<br />

plausible because we confuse near truth for truth. Can I say the same thing, since<br />

my theory is like M? No, for two reasons. First, since my theory explicitly gets rid<br />

of numerical representations of intermediate truth values, I don’t have any way to<br />

analyse almost true. Second, since I say that one of the Sorites premises is false, I’d be<br />

committed to the odd view that some false sentence is almost perfectly true. Thanks<br />

to Cian Dorr for pointing out this consequence.<br />

The story the M theorist tells does not generalise. The problem is that not all<br />

Sorites arguments involve conditionals. A typical Sorites situation involves a chain<br />

from a definite F to a definite not-F. Let ′ denote the successor relation in this sequence,<br />

so if F is the predicate is tall and a is 178cm tall, then a ′ will be 177.99cm tall, assuming<br />

the sequence progresses 0.1mm at a time. According to M, every premise like (SI) is<br />

almost true.<br />

(SI) If a is tall, then a ′ is tall.<br />

But we could have built a Sorites argument with premises like (SA).<br />

(SA) It is not the case that a is tall and a ′ is not tall.<br />

And premises of this form are not, in general, almost true. Indeed, some will have<br />

a truth value not much about 0.5. So M has no explanation for why premises like<br />

(SA) look persuasive. This is quite bad, because (SA) is more plausible than (SI) as<br />

I’ll now show. Consider the following thought experiment. You are trying to get<br />

a group of (typically non-responsive) undergraduates to appreciate the force of the<br />

Sorites paradox. If they don’t feel the force of (SI), how do you persuade them? My



first instinct is to appeal to something like (SA). If that doesn’t work, I appeal to theoretical<br />

considerations about how our use of tall couldn’t possibly pick a boundary<br />

between a and a ′ . I think I find (SI) plausible because I find (SA) plausible, and I would<br />

try to get the students to feel likewise. There’s an asymmetry here. I wouldn’t defend<br />

(SA) by appealing to (SI), and I don’t find (SA) plausible because it follows from (SI).<br />

(This is not to endorse universally quantified versions of either (SA) or (SI). They are<br />

like Axiom V - claims that remain intuitively plausible even when we know they are<br />

false.)<br />

Sadly, many theories have little to say about why (SA) seems true. The official<br />

epistemicist story is that speakers only accept sentences that are determinately, i.e.<br />

knowably, true. But some instances of (SA) are actually false, and many many more<br />

are not knowably true. The supervaluationist story about (SA) is no better.<br />

Here’s a surprising fact about the Sorites that puts an unexpected constraint on<br />

explanations of why (SA) is plausible. In the history of debates about it, I don’t think<br />

anyone has put forward a Sorites argument where the major premises are like (SO).<br />

(SO) Either a is not tall, or a ′ is tall.<br />

(This point is also noticed in Braun and Sider (2007).) There’s a good reason for this:<br />

(SO) is not intuitively true, unless perhaps one sees it as a roundabout way of saying<br />

(SA). In this respect it conflicts quite sharply with (SA), which is intuitively true.<br />

But hardly any theory of vagueness (certainly not M or supervaluationism or epistemicism)<br />

provides grounds for distinguishing (SA) from (SO), since most theories of<br />

vagueness endorse DeMorgan’s laws. Further, none of the many and varied recent<br />

solutions to the Sorites that do not rely on varying the underlying logic (e.g. Fara<br />

(2000); Sorensen (2001); Eklund (2002)) seem to do any better at distinguishing (SA)<br />

from (SO). As far as I can tell none of these theories could, given their current conceptual<br />

resources, tell a story about why (SA) is intuitively plausible that does not<br />

falsely predict (SO) is intuitively plausible. That is, none of these theories could solve<br />

the Sorites paradox with their current resources.<br />
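The degree-theoretic side of this contrast can be computed directly. The sketch below assumes M uses the standard degree clauses (negation as 1−x, conjunction as min, disjunction as max, and the Łukasiewicz conditional); the particular degrees are invented for illustration:<br />

```python
# Degrees for the three Sorites premise forms, assuming M uses the standard
# clauses: v(not-A) = 1 - v(A), v(A and B) = min, v(A or B) = max, and the
# Lukasiewicz conditional v(if A, B) = min(1, 1 - v(A) + v(B)).
# The particular degrees below are invented for illustration.
v_a, v_a1 = 0.50, 0.49      # v(a is tall), v(a' is tall): adjacent borderline cases

si = min(1, 1 - v_a + v_a1)     # (SI) if a is tall, then a' is tall
sa = 1 - min(v_a, 1 - v_a1)     # (SA) not (a is tall and a' is not tall)
so = max(1 - v_a, v_a1)         # (SO) either a is not tall, or a' is tall

print(si)   # 0.99 -- almost true: M's near-truth story works for (SI)
print(sa)   # 0.5  -- nowhere near true, so near-truth can't explain (SA)
print(so)   # 0.5  -- identical to (SA), by DeMorgan, so M can't separate them
```

Since 1 − min(x, 1 − y) = max(1 − x, y), these clauses validate DeMorgan’s laws, which is why M assigns (SA) and (SO) the very same degree and so cannot mark the intuitive asymmetry between them.<br />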

There is, however, a simple theory that does predict that (SA) will look plausible<br />

while (SO) will not. Kit Fine (1975b) noted that if we assume that speakers systematically<br />

confuse p for Determinately p, even when p occurs as a constituent of larger<br />

sentences rather than as a standalone sentence, then we can explain why speakers may<br />

accept vague instances of the law of non-contradiction, but not vague instances of the<br />

law of excluded middle. (That speakers do have these differing reactions to the two<br />

laws has been noted in a few places, most prominently Burgess and Humberstone<br />

(1987) and Tappenden (1993).) It’s actually rather remarkable how many true predictions<br />

one can make using Fine’s hypothesis. It correctly predicts that (5) should<br />

sound acceptable.<br />

(5) It is not the case that I am tall, but nor is it the case that I am not tall.



Now (5) is a contradiction, so both the fact that it sounds acceptable if I am a borderline<br />

case of tallness, and the fact that some theory predicts this, are quite remarkable.<br />

This is about as good as it gets in terms of evidence for a philosophical<br />

claim.<br />

(We might wonder just why Fine’s hypothesis is true. One idea is that there really<br />

isn’t any difference in truth value between p and Determinately p. This leads<br />

to the absurd position that some contradictions, like (5), are literally true. I prefer<br />

the following two-part explanation. The first part is that when one utters a simple<br />

subject-predicate sentence, one implicates that the subject determinately satisfies the<br />

predicate. This is a much stronger implicature than conversational implicature, since<br />

it is not cancellable. And it does not seem to be a conventional implicature. Rather, it<br />

falls into the category of nonconventional nonconversational implicatures Grice suggests<br />

exist on pg. 41 of his 1989. The second part is that some implicatures, including<br />

determinacy implicatures, are computed locally and the results of the computations<br />

passed up to whatever system computes the intuitive content of the whole sentence.<br />

This implies that constituents of sentences can have implicatures. This theme has<br />

been studied quite a bit recently; see Levinson (2000) for a survey of the linguistic<br />

data and Sedivy et al. (1999) for some empirical evidence supporting this claim.<br />

Just which, if any, implicatures are computed locally is a major research question, but<br />

there is some evidence that Fine’s hypothesis is the consequence of a relatively deep<br />

fact about linguistic processing. This isn’t essential to the current project - really all<br />

that matters is that Fine’s hypothesis is true - but it does suggest some interesting<br />

further lines of research and connections to ongoing research projects.)<br />

If Fine’s hypothesis is true, then we have a simple explanation for the attractiveness<br />

of (SA). Speakers regularly confuse (SA) for (6), which is true, while they confuse<br />

(SO) for (7), which is false.<br />

(6) It is not the case that a is determinately tall and a ′ is determinately not tall.<br />

(7) Either a is determinately not tall, or a ′ is determinately tall.<br />

This explanation cannot directly explain why speakers find (SI) attractive. My explanation<br />

for this, however, has already been given. The intuitive force behind (SI)<br />

comes from the fact that it follows, or at least appears to follow, from (SA), which<br />

looks practically undeniable.<br />
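This explanation can be given a simple formal gloss. Here is an illustrative sketch in which tall is precisified by a cutoff height and determinately tall means tall on every admissible cutoff; the cutoff range and heights are invented, not drawn from the paper:<br />

```python
# A sketch of Fine's hypothesis using threshold precisifications: 'tall' is
# precisified by a cutoff height, and 'determinately tall' means tall on
# every admissible cutoff. Cutoff range and heights are invented examples.
cutoffs = [c / 100 for c in range(170, 181)]   # admissible cutoffs 1.70m-1.80m
h_a, h_a1 = 1.75, 1.7499                       # adjacent borderline heights

def det_tall(h):
    return all(h >= c for c in cutoffs)

def det_not_tall(h):
    return all(h < c for c in cutoffs)

# (6): not (a is determinately tall and a' is determinately not tall) -- true
assert not (det_tall(h_a) and det_not_tall(h_a1))
# (7): either a is determinately not tall or a' is determinately tall -- false
assert not (det_not_tall(h_a) or det_tall(h_a1))
```

For borderline cases neither determinacy claim holds, so the determinacy reading of (SA) comes out true while the determinacy reading of (SO) comes out false, exactly the asymmetry speakers report.<br />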

So Fine’s hypothesis gives us an explanation of what’s going on in Sorites arguments<br />

that is available in principle to a wide variety of theorists. Fine proposed it<br />

in part to defend a supervaluationist theory, and Keefe (2000) adopts it for a similar<br />

purpose. Patrick Greenough (2003) has recently adopted a similar looking proposal<br />

to provide an epistemicist explanation of similar data. (Nothing in the explanation<br />

of the attractiveness of Sorites premises turns on any analysis of determinacy, so the<br />

story can be told by epistemicists and supervaluationists alike.) And the story can<br />

be added to the theory of truer sketched here. It might be regretted that we don’t<br />

have a distinctive story about the Sorites in terms of truer. But the hypothesis that<br />

some sentences are truer than others is basically a semantic hypothesis, and if the reason<br />

Sorites premises look attractive is anything like the reason (5) looks prima facie



attractive, then that attractiveness should receive a pragmatic explanation. What is<br />

really important is that there be some story about the Sorites we can tell.<br />

6 Linearity Intuitions<br />

The assumption that truer is a non-linear relation is the basis for most of the distinctive<br />

features of my theory, so it should be defended. There are two reasons to believe<br />

it.<br />

One is that we can’t simultaneously accept all of the following five principles.<br />

• Truer is a linear relation.<br />

• (A2), that conjunction is a greatest lower bound.<br />

• (A4), that negation is order inverting.<br />

• (A7), that contradictions are determinately false.<br />

• There are indeterminate sentences.<br />

I think by far the least plausible of these is the first, so it must go.<br />
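The clash among these five principles can be seen concretely in a toy linear model. The sketch below assumes degrees in [0,1] ordered linearly, with conjunction as min (so (A2) holds) and negation as 1−x (so (A4) holds); any intermediate sentence then makes (A7) fail:<br />

```python
# Toy linear degree model: truer is the numerical order on [0,1], conjunction
# is min (satisfying (A2)) and negation is 1-x (satisfying (A4)). If any
# sentence gets an intermediate degree, its contradiction is not absolutely
# false, so (A7) fails: the five principles cannot all hold together.
for v in [0.1, 0.5, 0.9]:            # candidate intermediate degrees for A
    contradiction = min(v, 1 - v)    # degree of A-and-not-A
    assert contradiction > 0         # (A7) would require this to be 0
```

The general argument runs the same way: linearity forces A and ¬A to be comparable, so their conjunction equals the less true of the two, which for intermediate A is not absolutely false.<br />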

Linearity (or at least determinate linearity) also makes it difficult to tell a plausible<br />

story about higher order vagueness. Linearity is the claim that for any two<br />

sentences A and B, the following disjunction holds. Either A > T B, or B > T A, or<br />

A = T B. If truer is determinately linear, that disjunction is determinately true. And<br />

if truer is linear, and if that disjunction is determinately true, then one of its disjuncts<br />

must be determinately true, for linearity rules out the possibility of a determinately<br />

true disjunction with no determinately true disjunct. Now take a special case of that<br />

disjunction, where B is 0=0. In that case we can rule out A > T B. So the only options<br />

are B > T A or A = T B. We have concluded that given linearity, one of these disjuncts<br />

must be determinately true. That is, A is either determinately intermediate or determinately<br />

determinate. But intuitively neither of these need be true, for A might be in<br />

the ‘penumbra’ between the determinately intermediate and the determinately determinate.<br />

This argument is only a problem if we assume determinate linearity, but it’s<br />

hard to see the theoretical motivation for believing in linearity but not determinate<br />

linearity.<br />

Still, it is very easy to believe in linearity. Even for comparatives that are clearly<br />

non-linear, like more intelligent than, there is a strong temptation to treat them as<br />

linear. (Numerical measurements of intelligence are obviously inappropriate given<br />

that more intelligent than is non-linear, but there’s a large industry involved in producing<br />

such measurements.) And this temptation leads to some prima facie plausible<br />

objections to my theory. (All of these objections arose in the discussion of the paper<br />

at BSPC.)



True and Truer (due to Cian Dorr)<br />

Here’s an odd consequence of my theory plus the plausible assumption that If S then<br />

S is true is axiomatic. We can’t infer from A is true and B is false that A is truer than<br />

B. But this looks like a reasonably plausible inference.<br />

If we added this as an inference rule, we would rule out all intermediate sentences.<br />

To prove this assume, for reductio, that A is intermediate. Since we keep classical<br />

logic, we know A ∨ ¬A is true. If A, then A is true, and hence ¬A is false. Then the<br />

new inference rule implies A ≥ T ¬A, hence A ∧ ¬A ≥ T ¬A, since ¬A ≥ T ¬A, and<br />

hence 0=1 ≥ T ¬A, since 0=1 ≥ T A ∧ ¬A, and ≥ T is transitive. So A is determinately<br />

true, not intermediate. A converse proof shows that if ¬A, then A is determinately<br />

false, not intermediate. So by (∨-Out) it follows that A is not intermediate, but since<br />

A was arbitrary, there are no intermediate truths. So this rule is unacceptable, despite<br />

its plausibility.<br />

Comparing Negative and Positive (due to Jonathan Schaffer)<br />

Let a be a regular borderline case of genius, somewhere near the middle of the penumbra.<br />

Let b be someone who is not a determinate case of genius, but is very close. Let<br />

A be a is a genius and B be b is a genius. It seems plausible that A ≥ T ¬B, since a<br />

is right around the middle of the borderline cases of genius, but b is only a smidgen<br />

short of clear genius. But since b is closer to being a genius than a, we definitely have<br />

B ≥ T A. By transitivity, it follows that B ≥ T ¬B, hence B is determinately true (by<br />

the reasoning of the last paragraph). Since ¬B is not determinately false, it follows<br />

that B ∧ ¬B is not determinately false, contradicting (A7).<br />

Since I accept (A7) I must reject the initial assumption that A ≥ T ¬B. But it’s<br />

worth noting that this case is quite general. Similar reasoning could be used to show<br />

that for any indeterminate propositions of the form x is a genius and y is not a genius,<br />

the first is not truer than the second. This seems odd, since intuitively these could<br />

both be indeterminate while the first is very nearly true and the second very nearly<br />

false.<br />

Comparing Different Predicates (due to Elizabeth Harman)<br />

One intuitive way to understand the behaviour of truer is that A is truer than B iff A<br />

is true on every admissible precisification on which B is true and the converse does<br />

not hold. This can’t be an analysis of truer, since it assumes we can independently define<br />

what is an admissible precisification, and this seems impossible. But it’s a useful<br />

heuristic. And reflecting on it brings up a surprising consequence of my theory. If<br />

we assume that precisifications of predicates from different subject areas (e.g. hexagonal<br />

and honest) are independent, it follows that subject-predicate sentences involving<br />

those predicates and indeterminate instances of them are incomparable with respect<br />

to truth. But this seems implausible. If France is a borderline case of being hexagonal<br />

that is close to the lower bound, and George Washington is a borderline case of being<br />

honest who is close to the upper bound, then we should think George Washington is<br />

honest is truer than France is hexagonal.
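The precisification heuristic, and the incomparability it forces on independent predicates, can be sketched in a toy model. Everything below (the cutoff numbers, the degree values, the variable names) is my own illustrative assumption, not the paper’s formalism:<br />

```python
from itertools import product

# Toy model of the heuristic: identify each sentence with the set of
# admissible precisifications on which it is true. 'Truer' then holds iff
# one truth-set strictly contains the other.
def truer(a_set, b_set):
    # A is truer than B iff A is true on every precisification where B is,
    # and the converse does not hold (proper superset).
    return b_set < a_set

# Precisifications of 'hexagonal' and 'honest' vary independently, so a
# precisification of the language is a pair of cutoffs (h, o).
cutoffs = [1, 2, 3, 4]
precisifications = set(product(cutoffs, cutoffs))

FRANCE_HEX_DEGREE = 1       # near the lower bound of the penumbra
WASHINGTON_HON_DEGREE = 3   # near the upper bound

# 'France is hexagonal' is true on (h, o) iff France's degree reaches the
# hexagonality cutoff h; similarly for Washington and the honesty cutoff o.
france_hex = {(h, o) for (h, o) in precisifications if FRANCE_HEX_DEGREE >= h}
gw_honest = {(h, o) for (h, o) in precisifications if WASHINGTON_HON_DEGREE >= o}

# Both sentences are borderline: true on some precisifications, not all.
assert set() < france_hex < precisifications
assert set() < gw_honest < precisifications

# Yet neither truth-set contains the other, so on the heuristic the two
# sentences are incomparable, despite gw_honest being nearly determinate.
print(truer(gw_honest, france_hex), truer(france_hex, gw_honest))  # prints: False False
```

On the heuristic, George Washington is honest would be truer than France is hexagonal only if its truth-set properly contained the other’s; the independence of the two cutoffs blocks that, which is exactly the surprising consequence at issue.<br />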



All three of these objections seem to me to turn on an underlying intuition that truer<br />

should be a linear relation. If we are given this, then the inference principle Dorr<br />

suggests looks unimpeachable, and the comparisons Schaffer and Harman suggested<br />

look right. But once we drop the idea that truer is linear, I think the plausibility of<br />

these claims falls away. So the arguments against linearity are ipso facto arguments<br />

that we should simply drop the intuitions Dorr, Schaffer and Harman are relying<br />

upon.<br />

To conclude, it’s worth noting that a very similar inferential rule to the rule Dorr<br />

suggests is admissible. From the fact that A is determinately true, and B is determinately<br />

false, it follows that A is truer than B. If we assume, as seems reasonable, that<br />

we’re only in a position to say that A is true when it is determinately true, then whenever<br />

we’re in a position to say A is true and B is false, it will be true that A is truer than<br />

B. This line of defence is obviously similar to the explanation I gave in the previous<br />

section of why Sorites premises look plausible, and to the argument Rosanna Keefe<br />

gives that the failure of classical inference rules is no difficulty for supervaluationism<br />

because it admits very similar inference rules (Keefe, 2000).


Vagueness and Pragmatics<br />

If Louis is a penumbral case of baldness, then many competent speakers will not be<br />

disposed to assent to any of (1) through (3), though they will assent to (4).<br />

(1) Louis is bald.<br />

(2) Louis is not bald.<br />

(3) Louis is bald or Louis is not bald.<br />

(4) It is not the case that Louis is bald and that he is not bald.<br />

A good theory of vagueness should explain these differing reactions. Most theorists<br />

have something like explanations of our reactions to (1) and (2). Some are built to<br />

explain our reactions to (3) – theories that advocate reforming classical logic to accommodate<br />

data concerning vagueness are paradigm cases of this. Some are built<br />

to explain our reactions to (4) – theories that stress penumbral connections, like supervaluationism<br />

and epistemicism, are paradigm cases of this. What is trickier is to<br />

provide an explanation of our reactions to both (3) and (4). Here I will outline a<br />

pragmatic explanation of the data – (3) and (4) are both true, but we have reasons to<br />

not assert (3) that do not apply to (4).<br />

The core idea will be a development of one outlined by Kit Fine (1975b, 140)<br />

and Rosanna Keefe (2000, 164). Fine and Keefe are both supervaluationists, but the<br />

theory they present looks like it could work independently of the supervaluational<br />

framework. Louis’s being bald is not a sufficient condition for you to properly assert<br />

he is bald; rather, he must be determinately bald. Or, perhaps more perspicuously, it<br />

must be determinately true that he is bald. For Fine and Keefe, being determinately<br />

true means being supertrue, but may mean something different to you if, perchance,<br />

you disavow supervaluationism. However we understand determinacy, we should<br />

agree that a simple sentence like (1) is assertable only if it is determinately true. Assuming<br />

other factors are equal (the audience is interested in the state of Louis’s hair, you<br />

have adequate epistemic access to that state, and so on), (1)’s being determinately true<br />

will also be a sufficient condition for its proper assertion. We will assume from now<br />

on that those conditions are met. We will write Determinately S as ΔS in what<br />

follows, noting where necessary what assumptions we are making about its<br />

logic.<br />

Now comes the crucial step. If that were all you were told, you would think that<br />

a disjunction A or B could be properly asserted iff it were determinately true, just<br />

like all other sentences. But, Fine and Keefe suggest, perhaps we take the condition<br />

under which it can be properly asserted to be different from this. We think (rightly or<br />

wrongly) that it can only be properly asserted if ΔA ∨ ΔB is true, and not merely if<br />

Δ(A ∨ B) is true. The bulk of this paper consists of a development of this idea, and a<br />

defence of that development.<br />

According to the theory presented thus far, there is a fairly mechanical procedure<br />

for connecting simple sentences with their assertion conditions. The suggestion,<br />

† In progress. Some parts of this have been incorporated into True, Truer, Truest.



then, is that we work out the assertion conditions for compound sentences by applying<br />

that procedure not to whole sentences, but to their parts. This is how we get the<br />

assertion condition for ‘A or B’ being ΔA ∨ ΔB. To which parts should we apply this<br />

procedure? Well, perhaps there are no hard and fast rules about this. Perhaps context<br />

determines whether the procedure should be applied to a whole sentence or to its<br />

sentential parts. You might expect that this will mean context determines whether<br />

sentences like (3) and (4) can be properly asserted. This is right in the case of (3). If<br />

its assertion condition is (3a) then it can be asserted; if it is (3b) then it cannot. ((3a) is<br />

the case where the procedure is applied to the whole sentence, (3b) where it is applied<br />

to the parts.)<br />

(3) (a) Δ(Louis is bald or Louis is not bald)<br />

(b) Δ(Louis is bald) or Δ(Louis is not bald)<br />

However, this speculation would be wrong about (4). No matter how or where we<br />

apply the mechanical procedure, the assertion condition for (4) that is generated is<br />

true, as (4a) to (4c) illustrate.<br />

(4) (a) Δ(It is not the case that Louis is bald and that he is not bald)<br />

(b) It is not the case that Δ(Louis is bald and that he is not bald)<br />

(c) It is not the case that Δ(Louis is bald) and that Δ(he is not bald)<br />

Apply our little procedure to (4) any way you like, and provided you’ve started with<br />

a broadly classical theory like supervaluationism or epistemicism, you will predict<br />

that (4) can be properly asserted. So the theory sketched by Fine and Keefe looks like<br />

it has a chance of capturing some rather interesting data.<br />
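The mechanical procedure and its application to (3) and (4) can be checked in a small supervaluational sketch. The two-precisification model is my own illustrative assumption; the determinacy operator is read as truth on every admissible precisification:<br />

```python
# Minimal supervaluational sketch (an illustrative assumption, not the
# paper's formalism): Louis is a penumbral case, so 'Louis is bald' is true
# on some admissible precisifications and false on others.
precisifications = [True, False]  # truth value of 'Louis is bald' on each

def det(prop):
    """Determinately: true iff prop holds on every admissible precisification."""
    return all(prop(p) for p in precisifications)

bald = lambda p: p
not_bald = lambda p: not p

# (3a): the procedure applied to the whole of (3); (3b): applied to its parts.
three_a = det(lambda p: bald(p) or not_bald(p))
three_b = det(bald) or det(not_bald)

# For (4), every way of distributing the determinacy operator comes out true:
four_a = det(lambda p: not (bald(p) and not_bald(p)))
four_b = not det(lambda p: bald(p) and not_bald(p))
four_c = not (det(bald) and det(not_bald))

print(three_a, three_b)        # prints: True False
print(four_a, four_b, four_c)  # prints: True True True
```

So context can render (3) unassertable by selecting the (3b) reading, but (4) comes out assertable on every reading, which is the asymmetry the data demands.<br />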

The two core aims of this paper are to show that Fine and Keefe’s theory, as<br />

amended and extended, can (a) explain all the data about our reactions to compound<br />

sentences involving vague clauses like Louis is bald, and (b) be grounded<br />

in an independently plausible theory concerning implicatures of compound sentences.<br />

Along the way we will say a lot about conditionals whose antecedents typically carry<br />

Gricean implicatures – these will be an important data source. Reflecting on these<br />

conditionals will help explain some odd data concerning vagueness, but it will also<br />

provide an interesting perspective on some problems concerning conditionals, such<br />

as Vann McGee’s apparent counterexamples to modus ponens. We will also consider<br />

whether this explanation of (3) and (4) can be adopted by non-supervaluationists.<br />

The answer here will be that theorists who adopt non-classical logic almost certainly<br />

cannot adopt this solution, while theorists who retain classical logic but provide nonsemantic<br />

theories of vagueness (such as, notably, epistemicists) probably cannot adopt<br />

the solution, though the evidence here is more equivocal. First, though, we shall survey<br />

the possible responses to the data about (1) through (4).<br />

1 Famous Answers<br />

Faced with the challenges posed by these responses, theorists of vagueness seem<br />

to have five (or maybe six) options open to them.



Option One – Deny the Data<br />

The simplest thing to do philosophically would be to deny the data; deny, that is,<br />

that there really are a substantial number of speakers who are willing to assent to (4),<br />

but not to (1), (2) or (3). Maybe after a substantial empirical investigation, this will<br />

turn out to be the right thing to say. But I doubt it is true. A poor reason for this is<br />

introspection. 1 A better reason is that it seems to be a fairly widespread assumption<br />

among experts in the field that the data is roughly as I have presented it. Some authors<br />

have explicitly asserted that this is the data (e.g. Burgess and Humberstone (1987) and<br />

Tappenden (1993)), and others have implicitly conceded the same thing. Theorists<br />

who reject the law of non-contradiction typically feel they have some explaining to<br />

do 2 , while some of those who accept the law of excluded middle similarly feel an<br />

explanation is needed 3 , and the reason such theorists feel this way, I imagine, is that<br />

they note that we intuitively do assent to (4) but not (3), so they have to explain their<br />

divergence from ordinary practice. I will from now on assume that speakers do have<br />

these intuitions, though of course this is an empirical assumption, and much of the<br />

argument in what follows would lose some force if there was serious evidence against<br />

this assumption.<br />

Option Two – Deny that the Data is Relevant<br />

There is an obvious reason we might think that the data about assent to, and dissent<br />

from, various sentences is relevant to the theorist of vagueness. Such a theorist is in<br />

a position similar in broad respects to Quine’s radical translator (Quine, 1960, 26-<br />

35), though with two salient differences. First, she is trying to ‘translate’ her own<br />

language 4 . Secondly, she is not taking for granted that the logic and semantics of the<br />

language under investigation are classical. Still, the similarity is close enough that<br />

we should take native dispositions concerning assent to various sentences in various<br />

situations to be important data. But, it might be objected, we do not take untutored<br />

dispositions to be particularly important here. What really matters to our project<br />

are the reflective dispositions of speakers, and, it might be argued, speakers will not<br />

keep the dispositions described above at the end of the process of coming to reflective<br />

equilibrium. This is a more serious option than the first, and it requires a more subtle<br />

response. In a nutshell, the response I will give is that the theory I develop in this<br />

1Though as (Jackson, 1998, 37) notes, when philosophers say, ‘Intuitively, p’, where p might be a<br />

proposition to the effect that such-and-such an example is or is not a case of knowledge, or causation, or<br />

justice, or whatever, the only evidence they usually have that p really is intuitive is their own intuitions,<br />

and perhaps those of a few colleagues or students. And my intuitions about whether speakers in general<br />

are disposed to assent to certain sentences are a better guide to the facts than my intuitions about causation,<br />

knowledge or justice, because in the former case, but arguably not the latter, my intuitions are partially<br />

constitutive of the facts, since I am one of the speakers in question.<br />

2See, for instance, (Machina, 1976, 183-5), (Tye, 1994, 194) and (Parsons, 2000, 71) for acknowledgements<br />

of this and attempts at explanation.<br />

3See, for instance, (Keefe, 2000, 164), who proposes a similar, though less wide-ranging, explanation<br />

to the one I will provide below.<br />

4This might not be a dramatic difference if one thinks that children learn their native language by a<br />

process similar to that by which the radical translator learns the foreign language, though such an assumption<br />

seems to be rather implausible these days (Laurence and Margolis, 2001).



paper not only predicts but justifies speakers assenting to (4) but not (1), (2) or (3),<br />

and hence these dispositions can be kept in equilibrium. How good a response this is<br />

cannot be assessed without seeing my theory, so I will say no more about this until<br />

the theory is presented. 5<br />

Option Three – Radical Semantic Change<br />

One could hold that the reason speakers assent to (4), but to none of the other<br />

numbered sentences, is that (4) is the only one of them that is true. A sufficient<br />

motivation for holding such a view would be believing that (a) speakers are competent<br />

judges of the truth value of sentences such as (1) through (4) and (b) they assent to such<br />

sentences iff they are true. Whether that is the motivation or not, this option is taken<br />

completely by Burgess and Humberstone (1987), and is adopted in part by many<br />

other theorists. Defenders of ‘many-valued’ logics 6 accept that (1), (2) and (3) are<br />

not completely true, though neither are they completely false. Supervaluationists 7<br />

accept that (1) and (2) are not true, and (4) is true, though (3) is also true, so there<br />

must be some alternative explanation for why speakers decline to assent to it. As<br />

several writers have pointed out, most notably Field (1986) and (McGee, 1991, Ch.<br />

4), the philosophical justification for such a move is somewhat dubious. In most<br />

fields of study, if there is some clash between our theories, the data, and classical<br />

logic, then what generally goes is our theory, unless we have good reason to impugn<br />

the data. It is very unlikely that the best move will be to dismiss classical logic, unless<br />

there are no other moves available. So the success of this option depends on the nonviability<br />

of other options, and I demonstrate below that a rival option is viable. (Also<br />

there are serious internal difficulties with taking the data at face value, as Burgess and<br />

Humberstone themselves show.)<br />

Option Four – Moderate Semantic Change<br />

Russell (1923) did not discuss (4), but agreed that (1), (2) and (3) might all fail to be<br />

true. This was not because he had a radically non-classical logic. Rather, it was because<br />

he thought that logic only applied to logically perfect languages, and natural<br />

languages are not logically perfect because they contain vague terms. While this position<br />

is not vulnerable to exactly the same methodological objection as option three,<br />

5 As a few authors have stressed, for example Sanford (1976), Tye (1990) and Tappenden (1993), there is<br />

also an argument that we should not assent to (3), based on the intuition that if a disjunction is true then<br />

there must be an answer to the question, “Well, which of its disjuncts is true then?” While this argument<br />

can be directly challenged, and has been by Dummett (1975), it nevertheless provides some reason for<br />

thinking that the intuition that (3) cannot be properly asserted will survive into equilibrium.<br />

6 Such as Machina (1976), Tye (1994) and Parsons (2000). The scare quotes here are because we do not<br />

learn a lot about the nature of a logic from knowing how many distinct values there are in a particular<br />

semantic model of it. Even classical logic has many-valued models, but that does not make classical logic a<br />

many-valued logic in the salient sense. The logics in question here have recently been called ‘fuzzy logics’<br />

by Bonini et al. (1999) and Priest (2001). This seems to be a mistake – fuzzy logic is a specific research<br />

program based on work by Zadeh (especially his 1965) that differs in some important ways from the<br />

theories Machina, Tye and Parsons defend, particularly in its treatment of higher order vagueness.<br />

7 Such as Fine (1975b) and Keefe (2000). The supervaluational position defended by McGee and<br />

McLaughlin (1995) is harder to classify here because they recognise two concepts of truth, and only on<br />

one of them are (1) and (2) both not true.



it does seem unhappy for two reasons. First, it is a seriously incomplete theory unless<br />

it tells us what kinds of reasoning we are allowed to use in natural language. Since<br />

instances of the law of excluded middle are not true, any argument with that law as a conclusion<br />

must be flawed in some way, but Russell does not provide a systematic way to<br />

locate such errors, and no one developing his theory has done so either. Secondly, the<br />

theory must provide a way of assigning truth conditions to sentences such that (3) is<br />

not true, and either (4) is true or we have an explanation of why speakers are disposed<br />

to assent to it even though it is not true. The first option here seems to lead to the<br />

difficulties that Burgess and Humberstone face, and the second is a theory schema in<br />

need of completion. In neither case does it seem this Russellian option is preferable<br />

to the position I will presently describe.<br />

Option Five – Radical Pragmatics<br />

We know for many sentences that whether speakers are disposed to assert them, or<br />

even assent to them, depends on many factors beyond the mere truth conditions for<br />

the sentence. In each of the following cases 8 , speakers may only assent to the sentence<br />

marked (a) if the condition marked (b) is satisfied.<br />

(5) (a) That looks like a knife and fork.<br />

(b) That is not a knife and fork.<br />

(6) (a) He drove carefully down the street.<br />

(b) He used reasonable (as opposed to unreasonable) care in driving down the<br />

street.<br />

(7) (a) Her action was voluntary.<br />

(b) Her action was blameworthy.<br />

(8) (a) Katie had several drinks and drove home.<br />

(b) Katie had several alcoholic drinks and shortly afterwards drove home.<br />

In all cases, the condition mentioned in (b) is no part of the truth condition of the<br />

sentence (a) 9 , though it may be required for ordinary speakers to be willing to assent<br />

to that sentence. The pragmatic interpretation of (1) through (4) is that just as in<br />

(5) through (8), there is some difference between the situations 10 in which speakers<br />

would willingly assent to the sentences, and the situations in which they are true.<br />

In what follows I will defend this option. We can, without assuming much at all<br />

about what the true theory of vagueness looks like, develop a pragmatic theory that<br />

predicts (and, for that matter, justifies) speakers’ assertoric practices concerning sentences<br />

like (1) through (4), and concerning a few more interesting cases that will be<br />

discussed below. While the existence of such a theory does not entail that the various<br />

theories of vagueness based on non-classical logic are mistaken, since the pragmatic<br />

theory I sketch will, when combined with such theories, generate true and interesting<br />

predictions, just as it will when combined with more conservative theories of<br />

8 All borrowed, more or less literally, from Grice (1989).<br />

9 Contra the suggestions of Wittgenstein (1953); Hart (1961); Ryle (1949) in the cases of (5), (6) and (7).<br />

10 In using this term I do not mean to endorse all of the details of the views of Barwise and Perry (1983);<br />

I merely use it as the least loaded term available in the circumstances. If one so desires, one can understand<br />

‘situations’ to be centred possible worlds in everything that follows.



vagueness, it does undercut the support for theories based on non-classical logics at a<br />

crucial point.<br />

There is, perhaps, a sixth option available, which is to mix and match between the<br />

above accounts. Just how reputable this option is depends on just how systematic the<br />

mixing and matching is. One might claim that some of the dispositions under consideration<br />

will not be preserved in equilibrium, others can be explained pragmatically,<br />

and others are good guides to the semantics. If this is done unsystematically, then it is<br />

obviously philosophically dubious. Later in the paper I will suggest that some recent<br />

arguments against various theories of vagueness commit just this sin. But for now we<br />

will focus on the version of option 5 outlined in the introduction.<br />

2 Truth, Assertion and Compound Sentences<br />

Consider again sentence (8), which we will focus on for a while.<br />

(8) Katie had several drinks and drove home.<br />

The truth conditions for this sentence should be clear enough, though perhaps a little<br />

vague at the fringes. 11 The sentence is true iff each conjunct is true. That is, (8) is<br />

true iff it is true that Katie had several drinks (in the time pragmatically specified as<br />

being under consideration) and drove home (again in that specified time). Assuming<br />

that the context specifies that the time under consideration is last night, then (8) is<br />

true iff Katie had several drinks last night and Katie drove home last night.<br />

The sentence cannot be properly asserted, and speakers would not normally be<br />

disposed to either assert it or assent to it, unless Katie drove home shortly after having<br />

the said several drinks, and the drinks in question were alcoholic. The reason<br />

it cannot be properly asserted unless this condition is satisfied is that hearers will normally<br />

conclude from the existence of the utterance that the conditions are satisfied,<br />

and hence the speaker would mislead the hearers if they were not. There is one other<br />

condition that must be satisfied before speakers will happily assert (8). They must<br />

know (or at least take themselves to know) that all the conditions mentioned above<br />

are true. So for (8) to be assertable, a certain fact about the world must be true, namely that Katie<br />

had several alcoholic drinks and shortly afterwards drove home, and a certain<br />

fact about the speaker must be true, namely that she has a justified belief that the fact<br />

about the world is true. In general (though perhaps not always) we will be able to<br />

make such a division into facts about the world that can be reasonably assumed to<br />

be communicated by an utterance and hence must be true before a sentence can be<br />

properly asserted, and facts about the speaker (usually that they have a justified belief<br />

that those facts about the world hold) that must also be true before that speaker can<br />

properly assert the sentence. No doubt there will be practical difficulties in any case<br />

11The vagueness will not be directly relevant here. For most (but not all) of the points I want to make<br />

below, we could instead use (8 ′ )<br />

(8′) Katie drank a bottle of scotch and drove home.<br />

It is convenient to have a sentence that does not say that the drinks were alcoholic, so we will stay with<br />

(8) for now. I am grateful to a conversation with Peter Smith for clearing up some of the possible confusions<br />

here.



in making this division, and in some cases there may even be conceptual difficulties<br />

in carrying out this task. (We will come back to this point below.) Recognising this<br />

difficulty, we will for now carry on as if the division can be made. For a sentence<br />

S, say the semantic content of S is the set of situations in which S is true, the objective<br />

pragmatic content of S is the set of situations such that the conditions about the<br />

world necessary for S to be asserted are satisfied and the subjective pragmatic content<br />

of S for x is the set of situations in which S justifiably believes that objective pragmatic<br />

content of S is satisfied. The idea is that x should be happy, on reflection, to<br />

assent to S in just those situations in the subjective pragmatic content of S. We will<br />

write T (S) for the semantic content of S, O(S) for its objective pragmatic content,<br />

and A(S, x) for its subjective pragmatic content for x. The semantic content of the<br />

sentence on an occasion is what Grice said was said (in his favoured sense) by uttering<br />

the sentence, while the objective pragmatic content is what he said is implicated<br />

(Grice, 1989, 118). 12 The subjective pragmatic content corresponds rather closely to<br />

Quine’s affirmative stimulus meaning (Quine, 1960, 33).<br />

It is commonly 13 assumed that semantic content must be compositional. This<br />

assumption may or may not be true, but there is some evidence that objective pragmatic<br />

content is compositional. (Indeed, this is an important reason for recognising<br />

objective pragmatic content as well as subjective pragmatic content.) Consider the<br />

indicative conditional (9).<br />

(9) If Katie had several drinks and drove home, then she broke the law.<br />

It seems that O((9)) includes all the situations we might find ourselves in these days.<br />

Given that there are laws against driving while intoxicated, and that the antecedent<br />

implies that Katie drove intoxicated, we are happy to assent to that conditional. (Perhaps<br />

things would be different if we somehow knew that Katie was immune to motor<br />

laws, but let us set that aside.) But even though intuitively O((9)) includes all situations<br />

we might hope to find ourselves in, there is a good argument that T ((9)) does<br />

not include some salient situations in everyday life. Consider the world in which last<br />

night Katie drove home sober, then had several drinks, and broke no laws for the<br />

evening. Then (9) is a conditional with a true antecedent and a false consequent. And<br />

that indicative conditionals with true antecedents and false consequents are false is<br />

the closest thing there is to a point of consensus in theories about conditionals 14 . So<br />

T ((9)) does not include some situations that are included in O((9)). Further, we can<br />

work out what O((9)) is without knowing what T ((9)) is; even if doubt about various<br />

semantic theories concerning indicative conditionals means that we are unsure<br />

of the truth conditions for (9), we know that O((9)) includes all the situations we are<br />

12 (Stanley and Szabó, 2000, 230) use ‘communicated’ here, which is probably more perspicuous in virtue<br />

of being less technical. Note that when I say that the semantic content is an unstructured entity, a set of<br />

situations, I do not rule out the possibility that the sentence has that content in virtue of expressing a<br />

structured proposition.<br />

13 Though not universally; see Schiffer (1987) and McGee (1991).<br />

14 Of course not quite everyone joins the consensus, most prominently McGee (1985). Still, the principle<br />

is called the Uncontested Principle by Jackson (1987), so my claim that it is a consensus is not exactly<br />

idiosyncratic.



likely to encounter. This explains why we can assert (9) while knowing next to nothing<br />

about Katie; given reasonable background assumptions, its objective pragmatic<br />

content is more or less trivial.<br />

These considerations decisively refute one possible theory of how we calculate<br />

objective pragmatic content, a theory that Grice seems to take to be true. 15 On this<br />

hypothesis, we calculate the objective pragmatic content of a sentence by first consulting our<br />

linguistic knowledge to determine its semantic content, then employing our mastery<br />

of the Gricean maxims to work out what conversational implicatures it might have,<br />

and the objective pragmatic content is the set of situations in the semantic content<br />

where those implicatures are all true. This can’t be right in detail, for it predicts O((9))<br />

will be a subset of T ((9)), which we have seen is not true. And it can’t even be right<br />

in broad outline, because it predicts we should be unsure of the objective pragmatic<br />

content of a sentence until we know its semantic content. This seems clearly untrue<br />

in the case of (9); we can know its objective pragmatic content while remaining quite<br />

unsure of its semantic content. 16<br />

Those considerations suggest that the objective pragmatic content of compound<br />

sentences is tied less closely to the semantic content of that sentence, and more closely<br />

to the objective pragmatic content of its constituent sentences. In light of this suggestion,<br />

the following hypothesis, called the Compositionality of Objective Pragmatic<br />

Content, or COP, thesis, might be plausible.<br />

COP The objective pragmatic content of a compound sentence is a function of the<br />

objective pragmatic contents of its constituents, with the function given by the<br />

operator or connective used to form the compound.<br />

When stated in full generality like that COP is a bit obscure, but it becomes clear<br />

with a few examples. COP entails that O(If A then B) is If O(A) then O(B). In the<br />

case of (9), this is exactly the right answer. Further, it entails that O(A or B) will<br />

be O(A) or O(B) and O(Not A) will be Not O(A). Again, consideration of sentences<br />

in which (8) is embedded suggests that COP is on the right track.<br />
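The COP calculations just described can be sketched over a toy space of situations. The situation names and the atomic contents assigned to them are hypothetical illustrations of mine, not drawn from the paper:<br />

```python
# A sketch of COP over a toy situation space (situation names and atomic
# contents are my own illustrative assumptions).
situations = {"drunk_then_drove", "drove_then_drank", "stayed_home"}

# Objective pragmatic content of (8), with its implicatures built in
# (alcoholic drinks, driving shortly afterwards), and of the consequent.
O_drinks_and_drove = {"drunk_then_drove"}
O_broke_law = {"drunk_then_drove"}

# COP: apply the connective's truth function to the parts' contents.
def O_if(antecedent, consequent):
    # material reading: situations where the antecedent fails or the
    # consequent holds
    return (situations - antecedent) | consequent

def O_not(content):
    return situations - content

def O_or(a, b):
    return a | b

# O((9)) covers every situation: where Katie drove home sober and drank
# afterwards, the pragmatic antecedent already fails, so the conditional
# is assertable there too.
print(O_if(O_drinks_and_drove, O_broke_law) == situations)  # prints: True
```

The design point is that O((9)) is computed directly from O((8)) and O(she broke the law), without detouring through the disputed semantic content T((9)), which is just what the conditional data seemed to require.<br />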

One might worry that this one example can hardly support a theory as wide<br />

ranging as COP. This of course is true; a large part of the argument for COP is that<br />

it generates a theory that makes such surprising true predictions when applied to<br />

vagueness. (See, in particular, the discussion of complex contradictions in section 4<br />

below.) But it is worth noting that the points about (8) and (9) above do not just<br />

turn on features to do with the ordering implication in conjunctions. Kent Bach<br />

(1994, 134) notes that one can often use (10) to soothe someone who is in a relatively<br />

mild state of disrepair. (Imagine a mother saying this to an injured child, or a doctor<br />

reporting good test results to a patient.)<br />

15 That last claim is contentious. Grice only needs the premise that things are as if the theory I am about<br />

to describe is true, and even if that theory is false, it is possible that the relevant things are as if the theory<br />

is true.<br />

16 (Stanley and Szabó, 2000, 231) say that “Cases where the speaker knows the proposition communicated<br />

without the proposition expressed . . . are highly exceptional.” The above considerations seem to<br />

suggest either that indicative conditionals, or at least indicative conditionals whose antecedents typically<br />

carry Gricean implicatures, are exceptional cases or that Stanley and Szabó’s claim is mistaken.



(10) You’re not going to die.

Normally, when this is uttered, the speaker knows, or should know, that it is literally false. Of course the person addressed is going to die some time, so the semantic content of (10) is false. This reasoning is not conclusive; the sentence might be elliptical, and it might be clearly true once the ellipsis is completed. But the more plausible position seems to be that the sentence is false. The speaker communicates that the hearer is not going to die soon, or from the particular illness they are suffering, and that is the objective pragmatic content of the sentence. And we can note that this content seems to be carried over into conditionals. Jack is working on a major project, and his manager Jill is concerned he is taking too much time off with illness. While he is at home with one minor illness, Jill emails him the following directive.

(11) If you’re not going to die, then you should be in at work.

Heartless, perhaps, but the intended message is clear. If Jack is not going to die from this particular illness, or at any rate in the near future, he should be at work. Jack could hardly say that this conditional directive (threat?) did not apply to him, because as a mortal he is sure to die. COP is not entailed by two examples any more than it is by one, but it is worth noting that we have not had to rely on particular features of (8) or (10) to support COP.

If something like COP is correct, then it is important to distinguish between objective and subjective pragmatic content. 17 Note that if we replace objective with subjective pragmatic content in COP, generating a thesis we may call CSP, we get a clearly false thesis. It is not the case that we are only happy to assert A or B when we are happy to assert A or we are happy to assert B. We may assert Either X will be the next Prime Minister or Y will be the next Prime Minister, for suitable X and Y, when we don’t know who the next Prime Minister will be, but are very confident that it will be X or Y. Indeed, we might only assert it if we don’t know who the next Prime Minister will be; if we did know this we would assert the known disjunct rather than just the disjunction. So CSP is false, but this does not show that COP is false.

3 Application to Vagueness

On most theories of vagueness, if F is a vague predicate, then we can distinguish between a being F, and a being determinately F. 18 And, again on most theories, the conditions under which it is proper to say that a is F, or to assent to the claim that a is F, are those where a is determinately F. On the epistemic theory of vagueness, we can only say that a is F if it is known that a is F, and, according to that theory, that means that a is determinately F. (What makes the epistemic theory of vagueness epistemic is that it interprets ‘determinately’ as an epistemic operator.) On the supervaluational theory, we can only properly say that a is F if a is F on all 19 precisifications, and that is what it is for a to be determinately F on that theory. On degree of truth theories, we can assert that a is F iff a is F to a very high degree, and that is what it is for a to be determinately F on that theory. In short, the pragmatic content of ‘a is F’ is that a is determinately F, or, for short, □Fa. Note that on the epistemic theory, this is the subjective pragmatic content, while on the supervaluational and degree of truth theories it is the objective pragmatic content. This reflects the differences between the ways the theories understand determinate truth. This point acquires some importance soon, so we will return below to what epistemicists might take the objective pragmatic content of ‘a is F’ to be. For now we will focus on those theories where □Fa is the objective pragmatic content of ‘a is F’. On those theories, COP predicts that the objective pragmatic contents of (3) and (4) will be (12) and (13).

(3) Louis is bald or Louis is not bald.

(12) □(Louis is bald) or □(Louis is not bald).

(4) It is not the case that Louis is bald and that he is not bald.

(13) It is not the case that □(Louis is bald) and that □(Louis is not bald).

Note that on almost any theory of vagueness one cares to consider, if Louis is a penumbral case of baldness, then (12) will be false and (13) true. (12) is false because Louis’s penumbral status makes both □(Louis is bald) and □(Louis is not bald) come out false. Hence if COP is true, speakers should decline to assent to (3), but should assent to (4). Since they do, this is good news for COP.

17 I am indebted here to conversations with Tim Maudlin and Brian McLaughlin.

18 Various writers use ‘clearly’ or ‘definitely’ where I use ‘determinately’. Nothing of any importance turns on this.
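The supervaluational prediction just described, that (12) comes out false and (13) true when Louis is penumbral, can be checked with a small computational sketch. This is my own illustration, not from the text; the thresholds and hair count are arbitrary numbers chosen only to make Louis a penumbral case. It models ‘determinately’ as supertruth over precisifications.

```python
# Toy supervaluational model (an illustrative sketch; thresholds and hair
# count are arbitrary). A precisification is a hair-count threshold, and
# 'Louis is bald' is true on a precisification iff his count is below it.

precisifications = [3000, 5000, 7000]
louis_hairs = 5000        # penumbral: bald on some precisifications, not others

def bald(t):
    return louis_hairs < t

def det(sentence):
    """'Determinately': true on every precisification (supertruth)."""
    return all(sentence(t) for t in precisifications)

# Semantic content of (3), an instance of excluded middle: supertrue.
lem_supertrue = det(lambda t: bald(t) or not bald(t))

# (12), COP's objective pragmatic content of (3): false for a penumbral case.
cop_of_lem = det(bald) or det(lambda t: not bald(t))

# Semantic content of (4), an instance of non-contradiction: supertrue.
lnc_supertrue = det(lambda t: not (bald(t) and not bald(t)))

# (13), COP's objective pragmatic content of (4): true.
cop_of_lnc = not (det(bald) and det(lambda t: not bald(t)))

print(lem_supertrue, cop_of_lem, lnc_supertrue, cop_of_lnc)  # True False True True
```

The point of the sketch is just that the semantic contents of (3) and (4) are both supertrue, while the pragmatic contents COP assigns them diverge: (12) is false and (13) true.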

We have not said how COP should apply to quantified sentences. These are a little harder to incorporate into the theory than sentences that are formed by familiar propositional connectives, because the logical form of these sentences is less transparent. I will assume 20 that quantified noun phrases are restricted quantifiers. Hence the logical form of (14) will be (15).

(14) Q Fs are Gs

(15) [Qx: Fx] Gx

Although COP as it stands is silent on quantified sentences, we can naturally generalise it. And the natural thing to say, given COP, is that if the logical form of (14) is (15) then its objective pragmatic content is (16).

(16) [Qx: □Fx] □Gx

This lets us explain one of the consequences of supervaluationism that is, if anything, more surprising than its endorsement of the law of excluded middle. The following presentation of the puzzle is due to Jamie Tappenden. Let P(n) abbreviate ‘A man with exactly n cents is poor’. “Since the supervaluation deems the conditional premise of the sorites paradox false, it deems true the claim that there is an n such that (P(n) ∧ ¬P(n + 1)). Alas there is no such number.” (Tappenden, 1993, 564) Let us abbreviate further, and say that n is the poor borderline iff it is such that (P(n) ∧ ¬P(n + 1)). Then the dubious claim is (17), written symbolically as (18), and Tappenden’s alternative judgement is (19), or in symbols (20).

(17) Some number is the poor borderline.

(18) [∃x: N(x)] (PB(x))

(19) No number is the poor borderline.

(20) ¬([∃x: N(x)] (PB(x)))

From what we said above, it follows that the objective pragmatic contents of (17) and (19) are (18a) and (20a). 21

(18a) [∃x: □N(x)] (□PB(x))

(20a) □¬([∃x: □N(x)] (□PB(x)))

Since (18a) is false, and (20a) true, we have an explanation not only of why one cannot properly assert (17), but of why one can assert its negation, (19).

19 Or perhaps most. Little will turn on this here, but the ability of supervaluational theories to handle various paradoxes (such as the Sorites and the problem of the many) might depend on just how this principle is framed. For salient discussion on this point see Lewis (1993b).

20 Following Barwise and Cooper (1981), Higginbotham and May (1981) and, most directly, Neale (1990).
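Tappenden’s puzzle can be given the same kind of toy supervaluational treatment. The sketch below is again my own illustration (the cutoffs are arbitrary): on every precisification there is a poor borderline, so (18) is supertrue; but no single number is the borderline on every precisification, so (18a) is false and (20a) true.

```python
# Toy model of Tappenden's puzzle (illustrative sketch; cutoffs arbitrary).
# P(n): 'a man with exactly n cents is poor', true on the precisification
# with cutoff c iff n < c. PB(n): n is the poor borderline.

cutoffs = [100, 200, 300]      # hypothetical admissible cutoffs for 'poor'
domain = range(1000)

def poor(n, c):
    return n < c

def pb(n, c):
    return poor(n, c) and not poor(n + 1, c)

# (18): on each precisification the borderline is n = c - 1, so the
# existential is true on every precisification -- supertrue.
claim_18 = all(any(pb(n, c) for n in domain) for c in cutoffs)

# (18a): no single n is the borderline on every precisification, since the
# borderline shifts with the cutoff -- the boxed existential is false.
claim_18a = any(all(pb(n, c) for c in cutoffs) for n in domain)

# (20a): determinately, no number is determinately the borderline. The
# outer box is idle here, since the inner claim does not vary across
# precisifications (cf. footnote 21).
claim_20a = not claim_18a

print(claim_18, claim_18a, claim_20a)  # True False True
```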

So our hypothesis, derived from COP, can explain a few puzzling pieces of data. And if COP is generally correct, then it does so in a fairly systematic way. Tappenden suggests that we endorse (4) but not (3) because “Noncontradiction in these cases is a ‘no overlap’ condition” while “excluded middle functions as a ‘sharp boundaries’ condition” (565). No doubt this is true, but it would be nice to have a systematic explanation of why this is so, and COP promises to give us one.

We noted above that the subjective pragmatic content of a compound sentence is not built up from the subjective pragmatic contents of its components in the way that objective pragmatic content is. This means that if the determinately operator in the pragmatic content is part of the subjective pragmatic content, we cannot give the above explanation for the unattractiveness of (3). This is not a problem for a supervaluational theory, but it is a problem for the epistemic theory. When Louis is a penumbral case of baldness, the reason we are happy neither to assert that he is bald nor to assert that he is not is that we do not know which. There is a close analogy, according to this theory, between our unwillingness to make these assertions and our unwillingness to assert either that Louis was born in France or that he was not when we do not know which. 22 But it seems there is an important disanalogy between these two cases; namely, in the case of ignorance we are happy to say that the relevant instance of excluded middle is true, whereas our assessments of (3), as Tappenden puts it, “range from mixed to strongly negative” (1993, 565).

21 Nothing turns on it here, but whether the box at the front of (20a) should be there depends on just what the logical form of ‘No Fs are Gs’ is. I have assumed, without any good reason, that it is ¬[∃x: Fx](Gx), but other interpretations are possible. Nothing turns on it because however we write (20), (20a) will be true.

22 I assume there is no vagueness concerning where Louis was born, though of course there is a faint possibility that this would be vague. The analogy is both drawn and alleged to be supported by data in Bonini et al. (1999).



There is a way out of this difficulty for the epistemicist, though it does require relaxing the analogy between cases of vagueness and traditional cases of ignorance. We said above that the subjective pragmatic content of a sentence (on an occasion) was that the speaker knew the objective pragmatic content was satisfied. In effect, we took objective pragmatic content as primary, and subjective pragmatic content was derived from it. We could have done things the other way around. Loosely following Quine’s lead, we will take subjective pragmatic content as primary, and say that the objective pragmatic content is what is common to the subjective pragmatic contents on all (or perhaps most) occasions the sentence is used. If part of the subjective pragmatic content on every occasion of utterance of a simple sentence is that the speaker knows the sentence is true, then it will follow that part of the objective pragmatic content is that the sentence is knowable. So we can interpret □ in each of the statements of objective pragmatic content as a knowability operator, and we get all the right results: (12) and (18a) turn out to be false, (13) and (20a) turn out to be true.

If true, this account would explain the data, but its plausibility seems open to question. The theory says that the objective pragmatic content of a simple sentence is what is common to all the different subjective pragmatic contents, and then says that the objective pragmatic contents of complex sentences are composed out of the objective pragmatic contents of their components, and says this fact explains our reactions to (3) and (4). But that fact (if it is a fact) stands in need of explanation just as much as our reactions to (3) and (4) do. If objective pragmatic content is something as artificial as this, it seems mysterious why it should so neatly compose. If, on the other hand, objective pragmatic content is something that speakers understand in virtue of understanding the sentence (and perhaps even something they understand before understanding the semantic content), then we have an explanation for why it is compositional: it has to be if speakers’ understanding of the language is to be productive. So while epistemicists can adopt something like COP as an explanation for our reactions to (3) and (4), their adoption of it must rest on premises that seem surprising, and possibly in need of explanation.

4 Modifying COP

If our only data was that people are hesitant about instances of the law of excluded middle, like (3), but are not resistant to asserting simple instances of the law of non-contradiction, like (4), then COP would do an excellent job in explaining the data that we have. But this does not seem to be all the data we have. In two ways our willingness to assert sentences goes beyond what is suggested above. In this section I will set out that data, and then suggest a natural revision of COP that explains this data, as well as the data presented above.

Above we have stressed what Jamie Tappenden calls the truth-functional intuition. This intuition, directly or indirectly, causes us to resist disjunctions when we know that for each disjunct there is no fact of the matter as to whether it is true. As Tappenden notes, though, our intuitions towards vague sentences are also guided at times by a penumbral intuition. When under the sway of this intuition, we tend to judge disjunctions as true if all the possibilities other than the disjuncts in question have been ruled out. To take a classic example from Fine (1975b), if a shade of colour is around the border between red and orange, then in the right frame of mind we might be prepared to say That is red or orange. To be sure, as Tye (1990) and Tappenden (1993) note, this intuition can waver in the face of sustained argument, such as the forceful suggestion that if a disjunction is true there should be a fact about which disjunct is true. And as Machina (1976) makes clear, some philosophers do not feel this intuition at all.

I think the penumbral intuition is a real, widespread phenomenon, and it is incumbent on a theory of vagueness to explain it. My main reason for regarding it this way, despite the comments of Tye, Tappenden and Machina, is the prevalence of the penumbral intuition in some of the social sciences. To take just one prominent case, in most macroeconomics textbooks there will be a warning that the traditional division of goods into investment goods and consumption goods is fraught with vagueness (cars are often mentioned as being a penumbral case), but this is explicitly taken to be consistent with the assumption that all goods are investment goods or consumption goods. For example, Keynes (1936, 59-63) explicitly mentions that the line between investment goods and consumption goods is vague, but then proceeds to run a technical argument that clearly has as a premise that all goods are investment goods or consumption goods. 23 And the common distinction between goods and services is attended by similar vagueness (takeaway food is, I guess, these days the clearest penumbral case), but economists are frequently willing to divide all sales into purchases of goods and purchases of services, simply because that exhausts the possibilities.

The epistemic merits of building a scientific discipline on vague terms while assuming the logic appropriate to that discipline is classical could be debated, but that is not our topic. The fact remains that various social scientists are prepared to talk in a certain way, accepting disjunctions even when they know they could not, in principle, have reason to accept either disjunct. This is an important piece of data that theorists of vagueness must explain. The data is not of a different kind from what has been previously considered, but it does reinforce the claim that the penumbral intuition must be accommodated.

I said above that we are normally prepared to accept simple instances of the law of non-contradiction, like (4). A corollary of that is that speakers normally reject simple contradictions, even when each conjunct is a penumbral case. (21) is an obvious example.

(21) ?Louis is bald and Louis is not bald.

23 In an earlier draft of The General Theory (Keynes, 1934) this premise is more explicit: “[F]inished goods fall into two classes according as the effective demand for them depends predominantly on expectations of consumers’ demand [i.e. they are consumption goods] or partly on expectations of consumers’ demand and partly on another factor conveniently summed up as the rate of interest [i.e. they are investment goods].” (428) Two pages later Keynes explicitly acknowledges that any division of real-world goods into categories such as these will be more than a little ‘arbitrary’, but claims that anything that is true on any arbitrary way of drawing the line is, after all, true. It is unfortunate, but surely insignificant, that this premise was not left explicit in the final draft. Keynes’s position, of acknowledging the distinction to be vague but reasoning as if a line had been drawn, is repeated in many economics texts.



Williamson (1994, 136) notes that intuitions can be a little misleading here, because of the use of (more or less) idiomatic expressions like ‘He is and he isn’t’ to describe borderline cases. To get a feel for this, imagine the following conversation. “Is Louis bald?” “Well, . . . he is and he isn’t.” One might push intuitions like this to get someone to feel (21) is properly assertable, and hence (4) is not. As Williamson perceptively notes, this tendency can be overcome merely by using different names, or generally different referring devices, to pick out the subject in each conjunct. This is fairly good evidence that we are dealing with an idiomatic usage here. So whatever one thinks about (21), (22) should seem definitely odd.

(22) ??Louis is bald, and the King of France is not, and Louis is the King of France.

To the extent that one can make sense of (22) at all, it is by assuming that bald picks out an intensional property, while the identity clause only implies an extensional equivalence between Louis and the king. This is almost certainly a false assumption, but it seems the most charitable assumption around if one is interpreting (22). One certainly does not hear utterances of (22) as in any way conveying that Louis is a borderline case of baldness, as one might hear the idiomatic, “He is and he isn’t.”

Everything that has been said so far relates to what I have called simple contradictions. These are sentences such as ‘a1 is F and a2 is not F’, for suitable predicates F, and expressions a1 and a2 which are known by all parties to the conversation to co-refer. This last condition can be satisfied on a particular occasion by making the two names the same, or by saying that they are co-referring, as in (22). And, following Williamson, I have argued that the only such sentences that are accepted by speakers are idiomatic. However, when we stop dealing with simple contradictions, we find the data becomes somewhat more problematic. I think (23) is a legitimate, if slightly long-winded, way to communicate that Louis is a penumbral case of baldness.
(23) It is not the case that Louis is bald, but nor is it the case that he is not bald.

Assuming the last ‘not’ in (23) can be viewed as a sentential connective, (23) is a contradiction. 24 Yet it seems like a perfectly accurate thing to say about Louis. Any infelicity associated with it is due to its length, not its apparent falsehood. Unlike (21), or phrases like “He is and he isn’t”, (23) does not behave like an idiom. Replace the pronoun with a term known to refer to Louis, such as “His Majesty”, and the message conveyed by (23) stays the same.

(23a) It is not the case that Louis is bald, but nor is it the case that His Majesty is not bald.

So some contradictions, such as (23), can be properly asserted on account of vagueness, and this is not due to their being idiomatic. In the terminology of the previous section, the objective pragmatic content of (23) is not the null set; it is the set of situations where Louis is a penumbral case of baldness. This striking piece of data needs to be explained. And it should be immediately clear that the explanation cannot be that (23) is true. After all, (23) is a contradiction, and contradictions have a tendency to be false. So the explanation for it must be pragmatic. As it stands, COP cannot provide that explanation. But a small alteration to COP can do so, and when combined with the right kind of theory of vagueness, can explain why we are happy to accept disjunctions without a determinately true disjunct, at least while in the social science classroom. I will first state the theory, called POP (loosely, for Pragmatic determination of Objective Pragmatic content), then explain how it applies. I begin by stating the special case of POP for sentences that have no differences between their semantic and objective pragmatic contents other than those caused by vagueness. This special theory will be called POP_V.

24 Even if it cannot be so viewed, (23) might still count as being a contradiction under a more liberal definition of what constitutes a contradiction.
theory will be called POP V .<br />

POP V Let S be a sentence that has no differences between its semantic and objective<br />

pragmatic contents other than those caused by vagueness. Then there is a<br />

sentence S ′ generated by adding � operators to S so that every term in it apart<br />

from sentential connectives is inside the scope of a � operator, and O(S) =<br />

T (S ′ ). Which such sentence S ′ satisfies this condition on an occasion where S<br />

is used is determined by pragmatic features of utterance and occasion.<br />

So, for example, if S is ‘a is F or b is G’, then POP_V says that the objective pragmatic content of S is either □Fa ∨ □Gb or □(Fa ∨ Gb). More generally, any sentence you can generate from S by adding boxes so that every part of S, except the sentential connectives, is inside the scope of a box, could be the objective pragmatic content of S. Letting l be a name for Louis, and B refer to the property of baldness, POP_V says that the objective pragmatic content of (3) is (24a) or (24b), and that the objective pragmatic content of (4) is one of (25a) through (25c).

(3) Louis is bald or Louis is not bald.

(24) (a) □Bl ∨ □¬Bl
     (b) □(Bl ∨ ¬Bl)

(4) It is not the case that Louis is bald and Louis is not bald.

(25) (a) ¬(□Bl ∧ □¬Bl)
     (b) ¬□(Bl ∧ ¬Bl)
     (c) □¬(Bl ∧ ¬Bl)

In each case, it says there should be some indeterminacy in precisely how the objective pragmatic content is generated. But, as we said in the introduction, a crucial difference arises between the two cases if we assume a broadly supervaluational (or epistemic) interpretation of □. For (3), POP_V predicts that depending on the context, the objective pragmatic content will either be (24a), which is false, or (24b), which is true. So it predicts that whether (3) is assertable will depend on the broader features of the context that select which of these will be the objective pragmatic content. For (4), POP_V predicts that whatever method is pragmatically selected to generate the objective pragmatic content, it will be true. So (4) should be always assertable. In each case, we seem to get a rather pleasing correlation between predictions and data.
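Under a supervaluational reading of □, these predictions can be checked mechanically. The following sketch is my own illustration (the precisifications and hair count are arbitrary values that make Louis penumbral); it evaluates (24a), (24b) and (25a) through (25c).

```python
# POP_V's candidate contents for (3) and (4) in a toy supervaluational
# model (illustrative sketch; numbers arbitrary). B(t): 'Louis is bald'
# on the precisification with threshold t; box = supertruth.

precisifications = [3000, 5000, 7000]
louis_hairs = 5000                        # a penumbral case

def B(t):
    return louis_hairs < t

def box(s):
    return all(s(t) for t in precisifications)

# Candidate objective pragmatic contents of (3):
p24a = box(B) or box(lambda t: not B(t))          # (24a)
p24b = box(lambda t: B(t) or not B(t))            # (24b)

# Candidate objective pragmatic contents of (4):
p25a = not (box(B) and box(lambda t: not B(t)))   # (25a)
p25b = not box(lambda t: B(t) and not B(t))       # (25b)
p25c = box(lambda t: not (B(t) and not B(t)))     # (25c)

# (3) is assertable only in contexts selecting (24b); (4) comes out true
# however the boxes are inserted.
print(p24a, p24b, p25a, p25b, p25c)  # False True True True True
```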



The assumption here that □ would get a supervaluational (or epistemic) interpretation is crucial. Interpret □A as meaning A has a high truth value, in a typical degree of truth theory, and we do not get the conclusion that (3) should sound trivial in some contexts, since (24b) is not guaranteed to be true; nor do we get the conclusion that (4) should always sound trivial, since (25c) is no longer guaranteed to be true. In fact, if Louis is anything like a penumbral case of baldness, (25c) is guaranteed to be false on those theories. So if proponents of that theory want to explain why (4) sounds trivial, they must appeal to something other than POP_V. Supervaluationists, however, can explain the data just via POP_V. So can epistemicists, provided they can discharge the burden outlined earlier of explaining how a subjective feature like knowability can make its way into objective pragmatic content. From now on we will assume, with the supervaluationists and epistemicists, that all classical tautologies are true, and all classical anti-tautologies are false.

Most surprisingly, POP_V is consistent with the hypothesis that complex contradictions, like (23), should be properly assertable, and hence have a non-degenerate objective pragmatic content. POP_V predicts that the objective pragmatic content of (23) is one of (26a) through (26d).

(23) It is not the case that Louis is bald, but nor is it the case that he is not bald.

(26) (a) ¬□Bl ∧ ¬□¬Bl
     (b) □¬Bl ∧ ¬□¬Bl
     (c) ¬□Bl ∧ □¬¬Bl
     (d) □(¬Bl ∧ ¬¬Bl)

(26b) through (26d) are all false, since they are all inconsistent, or entail inconsistencies by obvious steps. But (26a) is not inconsistent; indeed it is true. So POP_V explains why a contradiction like (23) can be used to convey a true message, despite not being idiomatic. This is a rather surprising piece of data to have, and it is even more surprising to have a neat explanation of it. For what it is worth (and depending on your preferred philosophy of science, it might be worth a lot), when I was developing this theory, the explanation of (23) appeared as a prediction, not a retrodiction. I had no idea that there could be non-idiomatic contradictions which, because of vagueness, could be used to convey true messages, until I realised POP_V predicted the existence of such sentences, and then I realised (23) was such a sentence. If one tends to value surprising and true predictions higher than any retrodictions, this little bit of autobiography has epistemic importance. 25
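The same kind of mechanical check works for the complex contradiction (23). In the sketch below (again my own toy supervaluational model with arbitrary illustrative numbers), only (26a) comes out true, which is exactly the reading on which (23) conveys that Louis is a penumbral case.

```python
# Evaluating (26a)-(26d), the candidate contents of the complex
# contradiction (23), in a toy supervaluational model (illustrative
# sketch; numbers arbitrary).

precisifications = [3000, 5000, 7000]
louis_hairs = 5000                        # a penumbral case

def B(t):
    return louis_hairs < t

def box(s):
    return all(s(t) for t in precisifications)

def notB(t):
    return not B(t)

p26a = (not box(B)) and (not box(notB))             # (26a)
p26b = box(notB) and (not box(notB))                # (26b): inconsistent
p26c = (not box(B)) and box(lambda t: not notB(t))  # (26c): entails a contradiction
p26d = box(lambda t: notB(t) and not notB(t))       # (26d): boxed contradiction

print(p26a, p26b, p26c, p26d)  # True False False False
```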

So POP_V can explain some data that COP cannot. Still, we have insufficient reason to switch to POP_V unless it can be embedded in a theory of the same level of generality as COP. To that end, I set out the general version of POP. For any compound sentence, there are going to be several ways to partition the sentence into sub-sentences that one can ‘treat as simple’. To treat a sub-sentence as simple, in the sense intended here, is to not take into account the fact that the sentence has sentences as parts when evaluating its objective pragmatic content. For a sentence that one treats as simple, the objective pragmatic content is generated from the form and semantic content by the application of broadly Gricean maxims, including, if this is not included already, a maxim that one should not assert sentences unless they are determinately true. Sentences that have no sentences as proper parts in their logical form have to be treated as simple. 26 For other sentences, though, there are often choices as to which sentences can be treated as simple for purposes of generating objective pragmatic content. For example, if S is ‘If S1 and S2 then S3’, then one can treat S1, S2 and S3 as simple sentences, or ‘S1 and S2’ and S3, or, possibly, S itself. (The ‘possibly’ here is because it is unclear whether we can treat a conditional as simple, just because we can only treat as simple sentences that we (at least tacitly) know the truth conditions of, and we are so in the dark about how conditionals work that it is not clear this is possible.) Then POP says:

25 It does seem rather surprising to think that autobiographical facts like this could affect the plausibility of a theory; so surprising in fact that one might view it as a problem for the prediction/retrodiction distinction. But that is a matter for a paper far removed from this one.

POP The objective pragmatic content of a compound sentence is a function of the objective pragmatic contents of its sub-sentences that are treated as simple, with the function given by the operator or connective used to form the compound. The choice of which sub-sentences are treated as simple is determined by the syntactic features of the sentence and the context. The objective pragmatic content of sentences treated as simple is determined by a direct application of broadly Gricean rules.

The reference here to syntactic features is because there seem to be some quite general relations between surface-level syntax and which sentences are treated as simple in the default evaluation of a sentence. In general, the more variables that are relevant to the objective pragmatic content of two sentences and not determined by the surface syntax, the less likely those sentences are to be treated as simple. Conversely, the fewer ‘hidden variables’ shared by two sentences, the more likely they are to be treated as simple. We can see this illustrated by an example we have already examined. For ease of exposition, we will use a slight variant on the example discussed above.

(27) If Katie had several drinks and she drove home, then she broke the law.

In a normal utterance of (27), we evaluate each clause as being about a specific time. For example, even if there was a time ten years ago when Katie had several alcoholic drinks and drove home shortly afterwards, we would not normally consider the antecedent of the conditional satisfied unless she repeated this behaviour more recently. (There is an obvious exception to this if the events of ten years ago are for some reason under consideration.) The sense of ‘satisfaction’ here is meant to be rather pragmatic; it is the sense in which the antecedent of (27) is not satisfied if Katie drank water all night after driving home sober. So in the pragmatic content of the antecedent of (27) there is a suppressed reference to a time period. We can make this explicit, as in (28), without noticeably changing the pragmatic content.

(28) If, last night, Katie had several drinks and she drove home, then she broke the law.

26 When we are just investigating the objective pragmatic content of conditionals and disjunctions this condition is relatively easy to interpret. Some complications arise because POP, like COP, is intended to apply to quantified sentences as well, and indeed mimic COP’s explanation of why (17) is not assertable though (19) is. The intent here is that the Gx in [∃x: Fx]Gx should count as a sentence for purposes of POP.

In (28) we added the suppressed time reference to the conjunction that is its antecedent.<br />

This addition did not, it seems, change the pragmatic content. If, however,<br />

we add the suppressed time reference to both conjuncts, we do get a change in<br />

pragmatic content.<br />

(29) ?If Katie had several drinks last night and she drove home last night, then she<br />

broke the law.<br />

To my ear, at least, (29) does not seem as trivial as (28), and certainly not as trivial as<br />

(27) or (9). POP has an explanation for this. Once the two conjuncts in effect stop<br />

sharing an unuttered constituent, there is less temptation to treat the conjunction as a<br />

simple sentence. If we treat the two conjuncts as simple, then the objective pragmatic<br />

content of the conjunction is just the conjunction of the objective pragmatic content<br />

of the conjuncts. And that means that the objective pragmatic content does not imply<br />

that the driving happened after the drinking, since this is not implied by the objective<br />

pragmatic content of the conjuncts, taken separately or together.<br />

This hypothesis about which sentences are treated as simple has implications for<br />

vagueness. It predicts that the objective pragmatic content of (23) will be (23a), and<br />

hence that (23) is not just possibly assertable, but actually assertable. Returning to<br />

Fine’s example of the colour patch that is somewhere between being perfectly red<br />

and perfectly orange, it predicts that (30) should sound better than (31).<br />

(30) The patch is red or orange.<br />

(31) The patch is red or the patch is orange.<br />

I think this prediction is correct, though my intuitions here are perhaps getting unreliable.<br />

The more important point is that there seems to be some evidence for POP outside<br />

of vagueness. Since POP also does an excellent job at explaining several puzzling<br />

features concerning vagueness, this suggests it is a rather well-supported theory.



5 Consequences for Vagueness (and Conditionals)<br />

To conclude, I will mention four points that follow from the discussion so far. First,<br />

I will note that everyone needs a theory something like POP if they are to explain<br />

the data, so the fact that supervaluationists (and their fellow travellers) need POP<br />

to explain some data is no cost to them. Secondly, I will note that the plausibility<br />

of POP reduces the plausibility of some arguments against various theories of<br />

vagueness. Thirdly, I will outline what POP tells us about Sorites arguments. And<br />

fourthly, I will return to the point of whether epistemicists can appeal to POP to<br />

explain the data under consideration here. I will suggest that when we look closely at<br />

Vann McGee’s ‘counterexample to modus ponens’ we find some evidence to suggest<br />

that they can. The evidence is hardly conclusive, and POP still sits more comfortably<br />

with semantic than with epistemic theories of vagueness, but it may provide<br />

some solace to epistemicists.<br />

5.1 The Need to POP<br />

POP explains all the data we have, and makes surprising true predictions. But there<br />

might be other theories more deserving of our assent. In particular, it might be possible<br />

to provide a semantic explanation of the data we have seen so far, and if so, it<br />

might be preferable to accept such an explanation. It is not entirely clear whether<br />

such a semantic explanation of the data would be preferable if it were possible, because<br />

it has become fairly standard practice to prefer pragmatic explanations of data<br />

to semantic explanations. Grice, for instance, never conclusively proved that the semantic<br />

theories of Austin, Strawson and Wittgenstein that he attacked in the first<br />

of the William James lectures were false, just that they were unnecessary given his<br />

pragmatic theories. And most of us took that to be sufficient to make the case, presumably<br />

because there is a preference for pragmatic explanations in the kind of cases<br />

Grice considered. I will not stress this point here, because it seems all the semantic<br />

rivals to POP are provably inferior. I will only consider two rivals, because it will<br />

quickly become clear that all such rivals face a fairly pressing problem.<br />

The first rival theory to consider says that vagueness shows that the correct logic<br />

for natural language is intuitionist. Given that the original task we set ourselves was<br />

to explain why an instance of the law of excluded middle, (3), was not acceptable,<br />

while the matching instance of the law of non-contradiction, (4), was acceptable, intuitionism<br />

may seem to be a natural refuge. After all, intuitionism famously rejects<br />

excluded middle while accepting non-contradiction. It might be objected immediately<br />

that vagueness gives us no reason to drop the principle of double negation elimination.<br />

But it is not clear that this is so. After all, it is possible to read (23) as a denial<br />

of the conditional If it is not the case that Louis is not bald, then he is bald, and as we<br />

argued above, it is possible to read (23) in such a way that it is acceptable.<br />

The main problem facing intuitionists is that there seem to be acceptable vague<br />

sentences that are intuitionistically unacceptable. One simple example is (23). Even<br />

though intuitionists reject double negation elimination, and excluded middle, it is<br />

inconsistent to deny instances of either principle. But, as noted, it seems that (23)<br />

might be a denial of one or other of these principles. More tellingly, if we are to let



surface structure be our guide as to semantic content, then we will say that (23) is a<br />

contradiction. And since all contradictions are false in intuitionist logic, intuitionists<br />

will counsel its rejection. Of course, we need not (and perhaps should not) say that<br />

surface structure shall be our infallible guide. So we might say that (23) is intuitionistically<br />

acceptable because it expresses something other than a contradiction, or that<br />

it is acceptable despite being a contradiction because there are pragmatic rules stating<br />

when false sentences are acceptable. Some such approach is clearly possible: if the<br />

intuitionist just buys POP then she can explain the data just as well as anyone else.<br />

But now intuitionism has ceased to be a rival to POP. Intuitionism plus POP makes<br />

for an intriguing theory of vagueness, but not one we need consider if we want to<br />

know whether POP is the best explanation of the data.<br />

The other semantic theory to consider was explicitly designed to handle the fact<br />

that speakers accept sentences like (4) but not (3). Burgess and Humberstone (1987)<br />

suggest a semantics for vague sentences that draws partly on the supervaluational<br />

semantics suggested by Fine (1975b), and partly on the tradition established by the<br />

degree theorists. A model, in their theory, is a set of points, a reflexive, transitive, antisymmetric<br />

relation ≥ on those points, and a valuation function that, within certain<br />

constraints, assigns to each atomic proposition two exclusive (but not necessarily<br />

exhaustive) sets of points. Intuitively, the two sets are the extension and the antiextension<br />

of the atomic proposition. So an atomic proposition p is true at a point x<br />

iff x is in the extension of p, false at x iff x is in the anti-extension of p, and undefined<br />

otherwise. Say that a point x is complete iff for any atomic proposition q, x is in the<br />

extension or anti-extension of q. There are two constraints on the valuation function.<br />

Persistence For all points x, y and atomic propositions p, if y ≥ x then if x is in the<br />

extension of p then so is y and if x is in the anti-extension of p, so is y.<br />

Two-Sided Resolution For all points x and atomic propositions p, if p is undefined<br />

at x then there are points y, z such that y, z > x and p is true at y and false at z.<br />

So far this is all familiar from Fine’s work. And the truth conditions for negation and<br />

conjunction should also seem familiar.<br />

¬A is true at x iff A is false at x.<br />

¬A is false at x iff A is true at x.<br />

A ∧ B is true at x iff A is true at x and B is true at x.<br />

A ∧ B is false at x iff (∀y ≥ x)(∃z ≥ y)(A is false at z or B is false at z).<br />

The difference appears in how Burgess and Humberstone (hereafter, BH) handle<br />

disjunction. They offer the following rules:<br />

A ∨ B is true at x iff A is true at x or B is true at x.<br />

A ∨ B is false at x iff A is false at x and B is false at x.
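
These clauses can be checked mechanically on a toy model. The following sketch is mine, not BH's: it assumes a base point x at which the atom p ('Louis is bald') is undefined, together with two complete refinements y (where p is true) and z (where p is false). On that model an instance of excluded middle, of the form of (3), comes out neither true nor false at x, while the matching instance of non-contradiction, of the form of (4), comes out true there.<br />

```python
# A toy model for the Burgess-Humberstone semantics sketched above.
# The model is an illustrative assumption: a base point x where the atom p
# is undefined, and two complete refinements y, z with y >= x and z >= x,
# where p is resolved true and false respectively.

POINTS = ['x', 'y', 'z']
GEQ = {('x', 'x'), ('y', 'y'), ('z', 'z'), ('y', 'x'), ('z', 'x')}  # (a, b): a >= b
EXT = {'p': {'y'}}    # extension of p
ANTI = {'p': {'z'}}   # anti-extension of p

def above(w):
    """All points v with v >= w."""
    return [v for v in POINTS if (v, w) in GEQ]

def true_at(f, w):
    op = f[0]
    if op == 'atom': return w in EXT[f[1]]
    if op == 'not':  return false_at(f[1], w)
    if op == 'and':  return true_at(f[1], w) and true_at(f[2], w)
    if op == 'or':   return true_at(f[1], w) or true_at(f[2], w)

def false_at(f, w):
    op = f[0]
    if op == 'atom': return w in ANTI[f[1]]
    if op == 'not':  return true_at(f[1], w)
    if op == 'or':   return false_at(f[1], w) and false_at(f[2], w)
    if op == 'and':  # the quantified falsity clause for conjunction
        return all(any(false_at(f[1], v) or false_at(f[2], v) for v in above(u))
                   for u in above(w))

p = ('atom', 'p')
lem = ('or', p, ('not', p))            # the form of (3)
lnc = ('not', ('and', p, ('not', p)))  # the form of (4)

print(true_at(lem, 'x'), false_at(lem, 'x'))  # False False: (3) neither true nor false at x
print(true_at(lnc, 'x'))                      # True: (4) true even at the borderline point x
```

Note that it is the quantified falsity clause for conjunction that makes (4) true at x: every point above x can be extended to one at which a conjunct of p ∧ ¬p is false.<br />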



As BH note, the truth conditions for disjunctions are more like those offered by<br />

degree theorists than those offered by supervaluationists like Fine. Assuming that<br />

English corresponds to a point that is in neither the extension nor the anti-extension<br />

of Louis is bald, then we get the nice result that (4) is true, while (3) is neither true<br />

nor false. Thus BH propose their theory as a way of solving the puzzle posed by (3)<br />

and (4).<br />

BH’s theory has many interesting aspects, and the rough sketch I have given here<br />

glosses over some important subtleties in their presentation. There are, however,<br />

three important objections to using their theory as a solution to the puzzle posed by<br />

(3) and (4). (And since these objections are not removed by a more careful formulation<br />

of their theory, the rough sketch I have made here should suffice.) The first<br />

objection, which BH note, is that the logic contains some failures of congruentiality.<br />

The second is that it seems hard to explain on their theory why sentences like (3)<br />

are sometimes assertable. The third is that they cannot explain the apparent<br />

acceptability of (23).<br />

Let A and B be any formulae such that A ⊢ B and B ⊢ A are provable in a given<br />

logic. Let D be a formula that has B as a subformula, and let C be a formula which can<br />

be generated from D by replacing some occurrences of B with A. The logic is congruential<br />

iff, whenever C and D are chosen in this way, C ⊢ D is provable. In BH’s logic, a<br />

sequent is valid iff it is truth-preserving at all points in all models, and BH provide<br />

natural deduction rules that are sound and complete with respect to this definition of<br />

validity, so we can safely call valid sequents provable. Most many-valued logics will<br />

have failures of congruentiality. The provability of A ⊢ B and B ⊢ A just shows that<br />

these two formulae are true in the same circumstances. This need not imply that the<br />

formulae are false in the same circumstances, so, for example, ¬A ⊢ ¬B might not<br />

be valid. So, as BH note, the fact that there are failures of congruentiality in their<br />

system should come as no surprise. One such failure is that we can prove p ∧ ¬p ⊣⊢<br />

¬(p ∨ ¬p), since neither formula is ever true. However, we cannot prove ¬(p ∧ ¬p) ⊢<br />

¬¬(p ∨ ¬p), since the LHS is a logical truth, but the RHS is equivalent to LEM.<br />

That there are some failures of congruentiality like this in a system that allows for<br />

truth-value gaps is not surprising. What is surprising is just how widespread these<br />

failures are. Note that A ∧ A is false at x unless there is some point y ≥ x where<br />

A is true. This means that it is possible for A ∧ A to be false when A is undefined.<br />

When A is the formula ¬(p ∨ ¬p), and p is undefined, then A will be undefined, but<br />

A ∧ A will be false. Conversely, ¬(A ∧ A) will be true, even though ¬A is undefined, so<br />

¬(A ∧ A) ⊢ ¬A will not be valid. Since we have, in general, A ⊣⊢ A ∧ A, this is another<br />

failure of congruentiality. As BH note, this particular failure of congruentiality does<br />

look like a cost of the system.<br />
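
This particular failure can be reproduced on the same sort of toy model (again an illustrative sketch, not BH's own presentation): with p undefined at the base point x, the formula A = ¬(p ∨ ¬p) is undefined at x while A ∧ A is false there, so ¬(A ∧ A) is true at x while ¬A is undefined.<br />

```python
# Illustrative check of the congruentiality failure described above.
# Toy model (an assumption for the sketch): base point x where p is
# undefined, complete refinements y (p true) and z (p false).

UP = {'x': ['x', 'y', 'z'], 'y': ['y'], 'z': ['z']}   # points >= each point
EXT, ANTI = {'p': {'y'}}, {'p': {'z'}}

def val(f, w):
    """Return 'T', 'F' or 'U' (undefined) for formula f at point w."""
    op = f[0]
    if op == 'atom':
        return 'T' if w in EXT[f[1]] else 'F' if w in ANTI[f[1]] else 'U'
    if op == 'not':
        return {'T': 'F', 'F': 'T', 'U': 'U'}[val(f[1], w)]
    if op == 'or':
        a, b = val(f[1], w), val(f[2], w)
        return 'T' if 'T' in (a, b) else 'F' if (a, b) == ('F', 'F') else 'U'
    if op == 'and':
        if val(f[1], w) == 'T' and val(f[2], w) == 'T':
            return 'T'
        # BH's clause: A & B is false at w iff every u >= w sees some
        # v >= u at which one of the conjuncts is false.
        if all(any('F' in (val(f[1], v), val(f[2], v)) for v in UP[u])
               for u in UP[w]):
            return 'F'
        return 'U'

p = ('atom', 'p')
A = ('not', ('or', p, ('not', p)))        # A = not(p or not p)
AA = ('and', A, A)

print(val(A, 'x'), val(AA, 'x'))                    # U F: A undefined, A & A false
print(val(('not', AA), 'x'), val(('not', A), 'x'))  # T U: not(A & A) true, not A undefined
```

So A ⊣⊢ A ∧ A holds, yet replacing A ∧ A with A inside a negation turns a truth into something undefined, which is exactly the failure of congruentiality at issue.<br />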

We noted above that there are occasions, in particular when we are trying to<br />

apply formal methods in social sciences, when it seems worthwhile to abstract away<br />

from vagueness. In these cases, it seems appropriate to adopt conventions that make<br />

sentences like (3) assertable. There are differences of opinion on this question, but it<br />

seems like the adoption of such conventions has been a benefit to the development<br />

of the social sciences in the last hundred or so years. It is hard to see how this could<br />

be justified if BH have provided the correct semantics for vague languages. These



conventions apparently license assertions that are simply not true. So I take it to be a<br />

cost of BH’s theory that it cannot explain why we accept sentences like (3) in formal<br />

contexts.<br />

Finally, BH’s theory is completely unable to explain why we are prepared to<br />

accept sentences like (23). That sentence has the form ¬A ∧ ¬¬A, and sentences of<br />

that form are always false on BH’s theory. Moreover, just the kind of intuitions that<br />

would lead us to accept (4) but reject (3), that is, just the intuitions that BH’s theory<br />

rests upon, lead to acceptance of (23). So it looks like BH’s theory has not provided<br />

a systematic theory of all the intuitions it attempts to systematise. It is undoubtedly<br />

important to explain our intuitions concerning (3) and (4), and BH’s theory is the<br />

best systematic attempt to do this in the literature. But, given these three difficulties<br />

it faces, the explanation of those intuitions in terms of POP seems more promising.<br />

5.2 POP and other arguments<br />

The attraction of using many-valued logics in a theory of vagueness is that they provide<br />

an easy explanation of the unacceptability of (3). The big problem with such<br />

theories is that they provide no obvious explanation of the acceptability of (4). This,<br />

I think, is the motivation underlying Williamson’s argument in the following passage.<br />

The sentences He is awake and He is asleep are vague. According to the<br />

degree theorist, as the former falls in degree of truth, the latter rises. At<br />

some point they have the same degree of truth, an intermediate one . . .<br />

the conjunction He is awake and he is asleep also has that intermediate<br />

degree of truth. But how can that be? Waking and sleep by definition<br />

exclude each other. He is awake and he is asleep has no chance at all of being<br />

true. . . Since the conjunction in question is clearly incorrect it should<br />

not have an intermediate degree of truth. . . How can an explicit contradiction<br />

be true to any degree other than 0? (1994, 136)<br />

It is, I suppose, noteworthy that Williamson did not make the following argument:<br />

At some point the disjunction He is awake or he is asleep also has that<br />

intermediate degree of truth. But how can that be? Waking and sleep<br />

by definition exhaust the possibilities. He is awake or he is asleep has no<br />

chance at all of being false. . . Since the disjunction in question is clearly<br />

correct it should not have an intermediate degree of truth. . . How can<br />

an explicit instance of the law of excluded middle be true to any degree<br />

other than 1?<br />

This alternative argument has, it seems, no persuasive force whatsoever. So whatever<br />

difficulty is being brought out by Williamson’s argument must turn on the differences<br />

between his argument and the alternative argument. Hence the argument cannot<br />

just be, for example, that only sentences that are (completely) true in some possible<br />

situations and (completely) false in others can have intermediate truth values. If<br />

that principle were right, our alternative argument here would also go through. Nor<br />

can Williamson’s argument just be that we intuit that contradictions have degree of



truth zero, so contradictions have degree of truth zero. As epistemicists must accept,<br />

some sentences that intuitively have degree of truth less than one really do have<br />

degree of truth one. (Instances of the law of excluded middle seem to be good candidates.)<br />

For a similar reason, the argument cannot just be that we intuit that instances<br />

of the law of non-contradiction are true. Once we have accepted that theories can<br />

force us to revise intuitions about which sentences are true, we are in no position to<br />

insist that a particular theory must respect a particular intuition about the truth of<br />

various sentences.<br />

The best form of Williamson’s argument, I think, appeals to assertability. The<br />

argument cannot just be that He is awake and he is asleep is not assertable. Degree<br />

theorists agree that that sentence is not assertable. Remember that their criterion of assertability<br />

is that a sentence is assertable iff it has a high degree of truth. Since He is awake<br />

and he is asleep has at most degree of truth 0.5, and 0.5 is not high, that sentence is<br />

definitely not assertable. If, however, we look at the negation of that sentence we do<br />

get an interesting disagreement. Intuitively, It is not the case that he is awake and he<br />

is asleep can be asserted, even though its degree of truth is merely 0.5. This, I think,<br />

is the strongest interpretation of Williamson’s argument. What is wrong with He is<br />

awake and he is asleep, the reason it seems wrong to give it even a moderate degree of<br />

truth, is that its negation can be confidently asserted. If the degree-theorist cannot<br />

explain this, then her theory is in tatters. It is no good to say here that intuitions<br />

about assertability are not philosophically important; the whole motivation for the<br />

many-valued account is that it captures certain important intuitions about simple<br />

cases, in particular about instances of the law of excluded middle. So this is a serious<br />

challenge. But, as we have seen, it is a challenge that the degree-theorist can meet if<br />

she accepts POP. With POP, we can explain how it is that It is not the case that he is<br />

awake and he is asleep can be assertable even though it does not have a high degree of<br />

truth, because its pragmatic content is true.<br />

Of course, similar arguments against epistemicism and supervaluationism, arguments<br />

that turn on the fact that sentences like (3) cannot be asserted, will also fail if<br />

epistemicists and supervaluationists can explain the unassertability of (3) using POP.<br />

Much of the motivation for degree theories, tracing back to early work by Goguen<br />

(1969) and Zadeh (1975), centred around the fact that we will not assert any of the<br />

sentences: Louis is bald; Louis is not bald; Louis is bald or not bald. 27 It might be argued<br />

that if (3) were true then competent speakers would not be hesitant to assert it,<br />

and certainly would not deny it. Hence theories that suggest it is true, such as epistemicism<br />

and supervaluationism, are mistaken. This reasoning is not without merit<br />

(it seems, for example, to be part of the motivation for BH’s theory). However, at<br />

best it poses a challenge to explain why it might not be assertable, and since POP<br />

provides that explanation, the challenge is met.<br />

5.3 Sorites<br />

A good explanation of the Sorites should not just explain why Sorites arguments are<br />

unsound (as they surely are), they should explain why they seem sound. POP splits<br />

27 David Sanford (1976) stresses the related point that we will not assert (and indeed will deny) sentences of<br />

the form There is an n such that a person is tall iff they are over n nm in height.



this task into three parts, and quickly disposes of two of them. (We will have to<br />

leave the third until the next section.) For simplicity, we will again use Tappenden’s<br />

notation of writing P(n) for ‘A person with exactly n cents is poor’. 28 Now consider<br />

the following three Sorites arguments.<br />

(32) Prem1. P(1)<br />

Prem2. It is not the case that P(1) and not P(2)<br />

. . .<br />

Prem100,000,000. It is not the case that P(99,999,999) and not P(100,000,000)<br />

C. P(100,000,000)<br />

(33) Prem1. P(1)<br />

Prem2. If P(1) then P(2)<br />

. . .<br />

Prem100,000,000. If P(99,999,999) then P(100,000,000)<br />

C. P(100,000,000)<br />

(34) Prem1. P(1)<br />

Prem2. Either it is not the case that P(1) or P(2)<br />

. . .<br />

Prem100,000,000. Either it is not the case that P(99,999,999) or P(100,000,000)<br />

C. P(100,000,000)<br />
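
For concreteness, here is a small sketch of the structural point that these arguments are unsound: on any sharp-cutoff model of ‘poor’ (of the kind the epistemicist posits), exactly one premise in each argument is false. The cutoff below is entirely hypothetical, and the scale is reduced from 100,000,000 for speed.<br />

```python
# Illustrative only: a sharp-cutoff ("epistemicist-style") model of 'poor'.
# The cutoff's placement is a pure assumption; the structural point is that
# wherever it falls, exactly one premise in each argument above is false.

CUTOFF = 500     # hypothetical sharp boundary for 'poor'
N = 1_000        # scaled-down stand-in for 100,000,000

def P(n):
    return n < CUTOFF   # P(n): a person with exactly n cents is poor

# A premise of (32), 'not (P(n) and not P(n+1))', is false exactly when
# P(n) holds and P(n+1) fails; read materially, the premises of (33) and
# (34) are false in exactly the same case.
false_premises = [n for n in range(1, N) if P(n) and not P(n + 1)]

print(len(false_premises))               # 1: a single false premise
print(false_premises[0] == CUTOFF - 1)   # True: it sits right at the cutoff
```
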

The arguments are, I claim, arranged in decreasing order of intuitive plausibility of<br />

the premises. It feels absurd to deny, or even to decline to assent to, any of the<br />

premises in (32). One might resist some of the premises in (33), and in borderline<br />

cases the premises in (34) have very little persuasive force. 29 Two conclusions immediately<br />

follow. We do not need to explain the intuitive force of (34), since it has no<br />

intuitive force. And our explanation of the intuitive force of (32) and (33) should not<br />

work equally well as an ‘explanation’ of the intuitive force of (34). This last condition<br />

is something of a worry for a few theories of vagueness, since without POP it<br />

is hard to see how, say, an epistemicist or a many-valued theorist could account for<br />

the difference between (32) and (34), their premises being trivially equivalent within<br />

their respective theories. 30 POP is rather helpful here. According to POP, the objective<br />

pragmatic content of every premise in (32) is satisfied, but the objective pragmatic<br />

content of the premises in (34) concerning borderline cases is not satisfied. The<br />

problematic case is (33). Since, as I’ve said a few times, we don’t really understand<br />

conditionals all that well, it is not easy to say much in detail about the objective pragmatic<br />

content of the premises in (33). The best we can say, and it isn’t totally compelling,<br />

is that speakers accept the premises in (33) because they accept the premises in<br />

28 And we will assume, somewhat rashly, that poverty supervenes on net wealth. It is not altogether clear<br />

that, say, a person with $1 billion in real assets, and $1 billion in liquid assets, and $2 billion in long-term<br />

debts is in any sense poor. Following philosophical tradition, however, we will assume that they are.<br />

29 I am grateful here to conversations with Ted Sider.<br />

30 It is actually rather hard to see just what the epistemicist explanation of the Sorites is – there is nothing<br />

in Williamson (1994) saying just exactly why the epistemicist thinks we see these premises as being true.<br />

The pragmatic theories of vagueness, defended by Raffman (1994) and Soames (1999) seem to have a<br />

chance of drawing a distinction between (32) and (34), which is a positive feature of those theories. But<br />

those theories seem to have more pressing problems, as Robertson (2000) outlines.



(32) and think (perhaps mistakenly) that they entail the premises in (33). For a quick<br />

introspective test of this, think of how you would react to someone who said they<br />

didn’t see any reason to accept the premises in (33). Would you respond by pointing<br />

out to them that there can hardly be an n such that P(n) but not P(n + 1), so if P(n)<br />

then P(n + 1)? If so, then you too think the main evidence for the premises in (33)<br />

is the premises in (32). This might be the correct explanation, but perhaps there is<br />

something else to say. To say it we need, however, to take one final detour through<br />

conditionals.<br />

6 Modus Ponens, Epistemicism and Some Conclusions<br />

Consider again this example, one of Vann McGee’s counterexamples to modus ponens.<br />

(McGee, 1985, 463; the numbering is not in the original)<br />

I see what looks like a large fish writhing in a fisherman’s net a ways off.<br />

I believe<br />

(35) If that creature is a fish, then if it has lungs, it’s a lungfish.<br />

That, after all, is what one means by “lungfish”. Yet, even though I believe<br />

the antecedent of this conditional, I do not conclude<br />

(36) If that creature has lungs, it’s a lungfish.<br />

Lungfishes are rare, oddly shaped, and, to my knowledge, appear only in<br />

fresh water. It is more likely that, even though it does not look like one,<br />

the animal in the net is a porpoise.<br />

What exactly should one make of this? We should all agree that (35) does in some<br />

sense follow from the meanings of the words it contains. I think that the sense is that<br />

its objective pragmatic content is satisfied, not that it is, say, true. After all, its antecedent<br />

is, we may suppose, true and its consequent apparently false, and I am somewhat<br />

more certain that conditionals with true antecedents and false consequents are<br />

false than that any semantic judgement I (or anyone else) would make about such a<br />

case is correct. But this flat-footed defence of the falsity of (35) does not explain its<br />

pragmatic acceptability. Presumably it is acceptable because the consequent follows,<br />

in some pragmatically salient sense, from the antecedent. But what is that sense?<br />

One might be tempted to appeal to exportation here, but that is just a red herring.<br />

It is one thing to note that speakers typically assent to If A and B, then C just when<br />

they assent to If A, then if B then C. It is another thing altogether to explain why these<br />

patterns of assent sway together. If one knew that, one would know why (35) seems<br />

acceptable. If one does not, then one does not know why (35) is acceptable. So we<br />

must look deeper.<br />

If POP is correct, then the explanation for the acceptability of (35), like that of (9),<br />

is that the objective pragmatic content of the consequent follows from the objective



pragmatic content of the antecedent. So if we get clear on what the objective pragmatic<br />

content of That is a fish is, we might solve the puzzle, and learn something<br />

rather interesting about the pragmatic content of simple sentences.<br />

One possibility can be quickly disposed of, that the objective pragmatic content<br />

of That is a fish is just that that is, indeed, a fish. If that were true we would expect<br />

the objective pragmatic content of (36) to follow from the fishiness of the object. As<br />

we have seen, it seems as if it does not. Even if (36) is true, and that might be the<br />

right thing to say about the example, we cannot assert it, and this does not seem to be<br />

merely because of ignorance.<br />

Another possibility might be that the objective pragmatic content of That is a<br />

fish is that it is a metaphysically rigid fact that it is a fish. By ‘metaphysically rigid’<br />

here I do not mean that it is true in all metaphysically possible worlds, just that it is<br />

true in all metaphysically nearby worlds. This might solve the problem. If there is<br />

some metaphysical stability to the fishiness of the object, then we might think that<br />

if it has lungs then it is indeed a lungfish, because it could not fail to be a fish, in<br />

a suitably strong sense of could. Whatever the merits of that suggestion, it seems<br />

implausible when applied to sentences outside conditionals. I can properly say The<br />

Cubs were unlucky last century without suggesting that their luck is anything other<br />

than a coincidence, so the pragmatic content of that sentence cannot be that there is<br />

anything metaphysically stable about its truth.<br />

If metaphysical stability does not do the trick here, does epistemic stability do any<br />

better? Not if we mean by ‘epistemic stability’ that the speaker knows the sentence<br />

to be true. We noted above that assuming the objective pragmatic content of S is<br />

The speaker knows that S causes some difficulties for handling disjunctions. Perhaps,<br />

though, it might be suggested that the type of pragmatic content that gets embedded<br />

in conditionals is different to the type of content that gets embedded in disjunctions.<br />

This is ad hoc, but not altogether implausible. Still, it does not handle all the data<br />

about conditionals. As Richmond Thomason noted 31 , we can properly say sentences<br />

like If the President is a spy, then we are all being successfully deceived, where the consequent<br />

most assuredly does not follow from our knowing the antecedent.<br />

Maybe a more subtle form of epistemic stability will work. Say that the objective<br />

pragmatic content of (a simple sentence) S is It is humanly possible to know that<br />

S. And assume that the right analysis of indicative conditionals is, broadly speaking,<br />

epistemic. 32 So, roughly, If A then B is true if the epistemically nearest worlds where<br />

A is true are worlds where B is true. Now we have an explanation for why (35) seems<br />

acceptable. If it is knowable that the thing is a fish, and this has just been made salient,<br />

then we might suppose that the epistemically nearest worlds are all ones where it is a<br />

fish. 33 Whether this is a plausible story at the end of the day will depend on how it<br />

31 Cited in van Fraassen (1980).<br />

32 As, for instance, the analyses of indicative conditionals in Stalnaker (1975), Davis (1979) and Weatherson<br />

(2001a) are.<br />

33 If one is wedded to a particular analysis of the indicative conditional, this move might seem a little<br />

quick. But if you think, as I do, that we do not know very much at all about the similarity metric relevant<br />

for assessing indicative conditionals, other than that it is sensitive to epistemic considerations and to<br />

salience, then we can take the little pattern of reasoning in the text to be a discovery about the nature of<br />

that metric.



links up with other facts about the behaviour of indicative conditionals. But for now<br />

we should just note that there is some chance that this explanation will go through.<br />

And, given the failures of other explanations to account for the plausibility of (35),<br />

this provides at least some evidence that part of the objective pragmatic content of S<br />

is that it is knowable that S. And that, in turn, provides some evidence that epistemicists<br />

about vagueness might be able to appeal to POP to explain why (4) is acceptable<br />

but (3) is not.<br />
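
The rough truth condition appealed to above, on which If A then B is true when the epistemically nearest A-worlds are B-worlds, can be given a toy implementation. The worlds, their features, and the closeness ranking below are all invented for illustration; the sketch just shows how making fishhood salient can flip the verdict on (36).<br />

```python
# Toy variably-strict evaluation of an indicative conditional:
# 'If A then B' holds iff every closest A-world is a B-world.
# Worlds and the closeness ranking are invented for illustration.

WORLDS = {
    # name: (is_fish, has_lungs, is_lungfish)
    'w_porpoise': (False, True,  False),
    'w_lungfish': (True,  True,  True),
    'w_trout':    (True,  False, False),
}

# Lower number = epistemically closer; salience can reorder this ranking.
CLOSENESS = {'w_porpoise': 0, 'w_lungfish': 1, 'w_trout': 1}

def closest(prop):
    """The epistemically nearest worlds at which prop holds."""
    sat = [w for w in WORLDS if prop(w)]
    best = min(CLOSENESS[w] for w in sat)
    return [w for w in sat if CLOSENESS[w] == best]

def if_then(ant, cons):
    return all(cons(w) for w in closest(ant))

fish     = lambda w: WORLDS[w][0]
lungs    = lambda w: WORLDS[w][1]
lungfish = lambda w: WORLDS[w][2]

# (36) on its own: the closest lungs-world is the porpoise world, so it fails.
print(if_then(lungs, lungfish))           # False

# With fishhood salient, restrict to fish-worlds: among them, the closest
# lungs-world is a lungfish world, so the embedded conditional succeeds.
fishy = lambda w: fish(w) and lungs(w)
print(if_then(fishy, lungfish))           # True
```

On this toy picture, asserting the antecedent of (35) makes fishhood knowable and salient, which is why the consequent conditional seems to follow even though (36), evaluated on its own, does not.<br />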

Given this, we may have one other explanation of the reason we are inclined to<br />

accept the premises in a Sorites argument where the main premises are conditionals,<br />

as in (33). If the objective pragmatic content of the antecedent is that the antecedent<br />

is knowable, then that certainly entails the truth of the consequent. So might that<br />

explain the acceptability of the conditional? This is possible, but it seems like an inferior<br />

explanation to the one provided at the end of 5.3. The problem is that one would<br />

expect that the sentence would be acceptable if the objective pragmatic content of the<br />

antecedent implied the objective pragmatic content of the consequent, which it does<br />

not. And, of course, which it could not, unless somehow that objective pragmatic<br />

content was somehow immune to Sorites arguments. Which, obviously, it is not.<br />

So while getting clear on how indicative conditionals work might tell us something<br />

important about which aspects of content are compositional, it does not seem likely<br />

to explain why the main premises in arguments like (33) seem acceptable.<br />

Let us take stock. We started with some puzzling data about reactions to sentences<br />

that look like logical truths, namely, that vagueness seems to threaten the law<br />

of excluded middle but not, in the first instance, the law of non-contradiction. And<br />

we proposed a theory, COP, that could explain this data, whether or not one accepted<br />

that all theorems of classical logic really are logical truths. Already, though,<br />

there was some difficulty about whether this explanation was consistent with an epistemic<br />

account of vagueness. We then noted some further data, concerning speakers’<br />

willingness to accept the law of excluded middle in some circumstances, but not accept<br />

some complex instances of the law of non-contradiction in others, that could<br />

not be explained by COP. So we posited a modification of COP, POP, that could<br />

explain even this new data. Unlike COP, POP did require us to assume that all truths<br />

of classical logic really are logical truths, but like COP, it was far from clear that POP<br />

was compatible with an epistemic account of vagueness. At this stage it looked as if<br />

we had something like an argument for a broadly supervaluational account of vagueness,<br />

or at least for some account of vagueness that preserves much of classical logic<br />

while accepting that vagueness is semantic indeterminacy. The considerations of this<br />

last section suggest we should temper our enthusiasm for that argument somewhat,<br />

because they show that epistemicism might be compatible with POP. If both<br />

epistemicism and supervaluationism are compatible with POP, and each of them,<br />

when combined with POP, can explain all of our intuitions about the logic of sentences<br />

containing vague terms, then we must find some other means than intuitive<br />

plausibility to separate those two theories.


Part V<br />

Probability


From Classical to Intuitionistic Probability<br />

Abstract<br />

We generalize the Kolmogorov axioms for probability calculus to obtain<br />

conditions defining, for any given logic, a class of probability functions<br />

relative to that logic, coinciding with the standard probability functions<br />

in the special case of classical logic but allowing consideration of other<br />

classes of “essentially Kolmogorovian” probability functions relative to<br />

other logics. We take a broad view of the Bayesian approach as dictating<br />

inter alia that from the perspective of a given logic, rational degrees of belief<br />

are those representable by probability functions from the class appropriate<br />

to that logic. Classical Bayesianism, which fixes the logic as classical<br />

logic, is only one version of this general approach. Another, which we<br />

call Intuitionistic Bayesianism, selects intuitionistic logic as the preferred<br />

logic and the associated class of probability functions as the right class<br />

of candidate representations of epistemic states (rational allocations of degrees<br />

of belief). Various objections to classical Bayesianism are, we argue,<br />

best met by passing to intuitionistic Bayesianism – in which the probability<br />

functions are taken relative to intuitionistic logic – rather than by<br />

adopting a radically non-Kolmogorovian, e.g. non-additive, conception<br />

of (or substitute for) probability functions, in spite of the popularity of<br />

the latter response amongst those who have raised these objections. The<br />

interest of intuitionistic Bayesianism is further enhanced by the availability<br />

of a Dutch Book argument justifying the selection of intuitionistic<br />

probability functions as guides to rational betting behaviour when due<br />

consideration is paid to the fact that bets are settled only when/if the<br />

outcome betted on becomes known.<br />

1 Introduction<br />

It is a standard claim of modern Bayesian epistemology that reasonable epistemic<br />

states should be representable by probability functions. There have been a number<br />

of authors who have opposed this claim. For example, it has been claimed that epistemic<br />

states should be representable by Zadeh’s fuzzy sets, Dempster and Shafer’s<br />

evidence functions, Shackle’s potential surprise functions, Cohen’s inductive probabilities<br />

or Schmeidler’s non-additive probabilities. 1 A major motivation of these<br />

theorists has been that in cases where we have little or no evidence for or against p,<br />

it should be reasonable to have low degrees of belief in each of p and ¬ p, something<br />

apparently incompatible with the Bayesian approach. There are two broad types of<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Notre<br />

Dame Journal of Formal Logic 44 (2003): 111–123. Thanks to Alan Hájek, Graham Oppy and, especially,<br />

Lloyd Humberstone for comments and suggestions on various drafts of this paper.<br />

1 For more details, see Zadeh (1978), Dempster (1967), Shafer (1976), Shackle (1949), Cohen (1977),<br />

Schmeidler (1989).



response to this situation, the second of which shows the incompatibility just mentioned<br />

is more apparent than real. The first of these – much in evidence in the work<br />

of the writers just cited – is to replace or radically reconstrue the notion of probability<br />

taken by that approach to represent degrees of belief. The second – to be defended<br />

here – seeks to maintain the core of standard probability theory but to generalize the<br />

notion of a probability function to accommodate variation in the background logic<br />

of the account; this allows us to respond to such issues as the low degree of belief<br />

in a proposition and its negation by simply weakening the background logic from<br />

classical to intuitionistic logic. Thus if Bayesianism is construed as in our opening<br />

sentence, one way to respond to the objections of the heterodox writers listed above<br />

is to trade in classical Bayesianism for intuitionistic Bayesianism. Since for many theorists<br />

at least the motivation for their opposition to Bayesianism is grounded in either<br />

verificationism or anti-realism, a move to an intuitionistic theory of probability seems<br />

appropriate. Indeed, as Harman (1983) notes, the standard analysis of degrees of belief<br />

as dispositions to bet leads naturally to an intuitionistic theory of probability. We give<br />

a Dutch Book argument in defence of constructive Bayesianism in Section 4 below.<br />

The appropriate generalization of the notion of a probability function makes explicit<br />

allowance for a sensitivity to the background logic. The latter we identify with<br />

a consequence relation, such as, in particular, the consequence relation ⊢CL associated<br />

with classical logic or the consequence relation ⊢IL associated with intuitionistic<br />

logic. To keep things general, we assume only that the languages under discussion<br />

have two binary connectives: ∨ and ∧. No assumptions are made about how a consequence<br />

relation on such a language treats compounds formed using these connectives,<br />

though of course in the cases in which we are especially interested, ⊢CL and ⊢IL, such<br />

compounds have the expected logical properties. We take the language of these two<br />

consequence relations to be the same, assuming in particular that negation (¬) is<br />

present for both. Finally, if A belongs to the language of a consequence relation ⊢,<br />

then we say that A is a ⊢-thesis if ⊢ A, and that A is a ⊢-antithesis if for all B in that<br />

language A ⊢ B. (Thus the ⊢-theses and ⊢-antitheses represent the logical truths and<br />

logical falsehoods as seen from the perspective of ⊢.) We are now in a position to give<br />

the key definition.<br />

If ⊢ is a consequence relation, then a function Pr mapping the language of ⊢ to<br />

the real interval [0,1] is a ⊢-probability function if and only if the following conditions<br />

are satisfied:<br />

(P0) Pr(A) = 0 if A is a ⊢-antithesis.<br />

(P1) Pr(A) = 1 if A is a ⊢-thesis.<br />

(P2) If A ⊢ B then Pr(A) ≤ Pr(B).<br />

(P3) Pr(A) + Pr(B) = Pr(A ∨ B) + Pr(A ∧ B).<br />

If ⊢ is ⊢CL, then we call a ⊢-probability function a classical probability function; if ⊢<br />

is ⊢IL we call a ⊢-probability function an intuitionistic probability function. The<br />

position described above as constructive Bayesianism would replace classical probability<br />

functions by intuitionistic probability functions as candidate representations<br />

of reasonable epistemic states. Note that classical probability functions in this sense



are exactly those obeying the standard probability calculus axioms. In particular, the<br />

familiar negation axiom dictating that Pr(¬A) = 1 – Pr(A) emerges as a by-product of<br />

the interaction between the general (i.e., logic-independent) condition (P3) and, via<br />

(P0) and (P1), the logic-specific facts that A ∧ ¬A is a ⊢CL-antithesis and A ∨ ¬A is a<br />

⊢CL-thesis for any A.<br />
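
In the classical special case, these conditions can be checked concretely on a finite model. The sketch below is my own illustration, not from the paper: propositions are modelled as sets of worlds, entailment as subset inclusion, and it verifies (P0)–(P3) together with the derived negation rule:<br />

```python
from itertools import chain, combinations

# A toy classical model: three worlds with weights summing to 1.
# A proposition is a frozenset of worlds; A entails B iff A is a subset of B.
weights = {"w1": 0.5, "w2": 0.3, "w3": 0.2}
worlds = frozenset(weights)

def pr(a):
    """Pr of a proposition = total weight of the worlds in it."""
    return sum(weights[w] for w in a)

# All propositions expressible in this model (the powerset of the worlds).
events = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(worlds), r) for r in range(len(worlds) + 1))]

assert pr(frozenset()) == 0              # (P0): the antithesis gets 0
assert abs(pr(worlds) - 1) < 1e-9        # (P1): the thesis gets 1
for a in events:
    for b in events:
        if a <= b:                        # A entails B (subset inclusion)
            assert pr(a) <= pr(b) + 1e-9  # (P2)
        # (P3): Pr(A) + Pr(B) = Pr(A ∨ B) + Pr(A ∧ B)
        assert abs(pr(a) + pr(b) - pr(a | b) - pr(a & b)) < 1e-9
    # By-product of (P0), (P1) and (P3): the negation axiom Pr(¬A) = 1 − Pr(A)
    assert abs(pr(worlds - a) - (1 - pr(a))) < 1e-9
```

In the intuitionistic case the same four conditions are imposed, but with entailment read as ⊢IL; the negation by-product then lapses, because A ∨ ¬A need not be a ⊢IL-thesis.<br />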

Although it is these two kinds – intuitionistic and classical – of probability functions<br />

we shall be dealing with specifically in what follows, we emphasize the generality<br />

of the above definition of a ⊢-probability function, and invite the reader to<br />

consider what effect further varying the choice of ⊢ has on the behaviour of such<br />

functions. Our attention will be on the comparative merits of ⊢CL and ⊢IL in this<br />

regard. (It may have occurred to the reader in connection with (P3) above that we<br />

might naturally have considered a generalized version of (P3) for ‘countable additivity’.<br />

Whether such a condition ought to be adopted will turn on some rather difficult<br />

questions concerning the use of infinities in constructive reasoning; let us leave it as<br />

a question for further research. We have stated (P3) in its finitary form so as not to<br />

require that intuitionistic probability functions satisfy the more contentious general<br />

condition.)<br />

In the following section we shall review some of the motivations for intuitionistic<br />

Bayesianism. The arguments are rather piecemeal; they are designed to show<br />

that given the philosophical commitments various writers in the field have expressed<br />

they would be better off taking this route, i.e., focussing on the class of intuitionistic<br />

probability functions, than – as many of them have suggested – abandoning Bayesianism<br />

in our broad sense. In particular, we shall urge that moves in the latter direction<br />

which involve abandoning (what we shall call) the Principle of Addition are seriously<br />

undermotivated.<br />

One aspect of the Bayesian perspective which we have not considered concerns<br />

the dynamics rather than the statics of epistemic states: in particular the idea that<br />

changes in such states are governed for rational agents by the principle of conditionalizing<br />

on new information. This requires that we have a dyadic functor available<br />

for expressing conditional probabilities. Accordingly, where Pr is, for some consequence<br />

relation ⊢, a ⊢-probability function, we favour the standard account and take<br />

the associated conditional ⊢-probability function Pr( · , · ) to be given by Pr(A,B)<br />

= Pr(A ∧ B)/Pr(B) when Pr(B) ≠ 0, with Pr(A,B) undefined when Pr(B) = 0. The<br />

intention, of course, is that Pr(A,B) represents the conditional probability of A given<br />

B. We defer further consideration of conditional probability until the Appendix.<br />
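
The ratio account just stated is straightforward to transcribe. A minimal sketch (names and numbers are mine, continuing the set-of-worlds picture of the classical case):<br />

```python
# Worlds and weights stipulated for illustration only.
weights = {"w1": 0.5, "w2": 0.3, "w3": 0.2}

def pr(a):
    """Pr of a proposition = total weight of the worlds in it."""
    return sum(weights[w] for w in a)

def pr_given(a, b):
    """Pr(A,B) = Pr(A ∧ B)/Pr(B); undefined (here None) when Pr(B) = 0."""
    if pr(b) == 0:
        return None
    return pr(a & b) / pr(b)

a = frozenset({"w1", "w2"})
b = frozenset({"w2", "w3"})
print(pr_given(a, b))            # Pr(A ∧ B)/Pr(B) = 0.3/0.5
print(pr_given(a, frozenset()))  # None: no conditioning on a zero-probability B
```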

2 Motivating Intuitionistic Bayesianism<br />

There are four main reasons for preferring intuitionistic over classical<br />

probability functions as representing the range of reasonable epistemic states. These<br />

are: (1) a commitment to verificationism, (2) a commitment to anti-realism, (3) preservation<br />

of the Principle of Addition, and (4) avoidance of direct arguments for the orthodox<br />

approach. Now some of these will be viewed by some people as bad reasons<br />

for adopting the given position, a reaction with which it is not hard to sympathise.<br />

In particular, the verificationist and anti-realist elements of the theory might well



be viewed as negatives. These arguments are principally directed at showing that by<br />

their own lights, various opponents of classical Bayesianism would do better to adopt<br />

the intuitionistic Bayesian position than some still more heterodox non-Bayesian account.<br />

2.1 A standard objection to classical Bayesianism is that it has no way of representing<br />

complete uncertainty. Because of the failures of Laplace’s principle of indifference,<br />

it can’t be said that uncertainty about p is best represented by assigning<br />

credence 1/2 to p. Heterodox approaches usually allow the assignment of credence 0<br />

to both p and ¬ p when an agent has no evidence at all as to whether or not p is true.<br />

Because these approaches generally require an agent to assign credence 1 to classical<br />

tautologies, including p ∨ ¬ p, these theories must give up the following Principle of<br />

Addition.<br />

Addition For incompatible A, B: Bel(A ∨ B) = Bel(A) + Bel(B).<br />

“Bel(A)” is here used to mean the degree of belief the agent has in A, and “incompatible”<br />

to apply to A and B when, for some favoured consequence relation ⊢, the<br />

conjunction of A with B is a ⊢-antithesis. Such conditions as Addition are of course<br />

taken not as descriptive theories about all agents, since irrational agents would serve<br />

as counterexamples. Rather, they are proposed coherence constraints on all rational<br />

agents.<br />
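
The clash between the heterodox assignments and Addition can be made concrete with stipulated numbers (mine, in the style of the evidence-function theorists):<br />

```python
# With no evidence either way about p, the heterodox theorist assigns
# credence 0 to p and to ¬p, but credence 1 to the classical tautology p ∨ ¬p.
bel = {"p": 0.0, "not p": 0.0, "p or not p": 1.0}

# p and ¬p are incompatible, so Addition demands
#   Bel(p ∨ ¬p) = Bel(p) + Bel(¬p).
satisfies_addition = bel["p or not p"] == bel["p"] + bel["not p"]
print(satisfies_addition)  # False: Addition is breached

# The intuitionistic Bayesian keeps Addition by dropping the demand that
# Bel(p ∨ ¬p) = 1, since p ∨ ¬p is not an intuitionistic thesis.
bel["p or not p"] = 0.0
print(bel["p or not p"] == bel["p"] + bel["not p"])  # True: Addition restored
```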

The Principle of Addition is stated in terms of degrees of belief, or credences.<br />

Where no ambiguity results we also use the same term to refer to the corresponding<br />

principle applied to ⊢-probability functions, with incompatibility understood in<br />

terms of ⊢ (as just explained). Now in some writings (particularly Shafer’s) the reason<br />

suggested for giving up Addition is openly verificationist. Shafer says that when<br />

an agent has no evidence for p, they should assign degree of belief 0 to p. Degrees of<br />

belief, under this approach, must be proportional to evidence. 2 In recent philosophical<br />

literature, this kind of verificationism is often accompanied by an insistence that<br />

validity of arguments be judged by the lights of ⊢IL rather than ⊢CL.<br />

A similar line of thought is to be found in Harman (1983). He notes that when<br />

we don’t distinguish between the truth conditions for a sentence and its assertibility<br />

conditions, the appropriate logic is intuitionistic. And when we’re considering gambles,<br />

something like this is correct. When betting on p we don’t, in general, care if p<br />

is true as opposed to whether it will be discovered that p is true. A p-bet, where p<br />

asserts the occurrence of some event for instance, becomes a winning bet, not when<br />

that event occurs, but when p becomes assertible. So perhaps not just verificationists<br />

like Shafer, but all those who analyse degrees of belief as propensity to bet should<br />

adopt constructivist approaches to probability.<br />

To see the point Harman is making, consider this example. We are invited to<br />

quote for p-bets and ¬p-bets, where p is ‘O. J. Simpson murdered his wife’. If we are to<br />

take the Californian legal system literally, the probability of that given the evidence<br />

is strictly between one-half and one. To avoid one objection, these bets don’t just<br />

2 This assumption was shared by many of the participants in the symposium on probability in legal<br />

reasoning, reported in the Boston University Law Review 66 (1986).



pay $1 if the bettor guesses correctly. Rather they pay $1 invested at market rates of<br />

interest at the time the bet is placed. The idea is that if we pay x cents for the bet<br />

now, when it is discovered that we have bet correctly we will receive a sum of money<br />

that is worth exactly as much as $1 now. Still, we claim, it might be worthwhile to<br />

quote less than 50 cents for each of the bets. Even if we will receive $1 worth of<br />

reward if we wager correctly, there is every possibility that we’ll never find out. So<br />

it might be that placing a bet would be a losing play either way. To allow for this,<br />

the sum of our quotes for the p-bet and the ¬ p-bet may be less than $1. As Harman<br />

points out, to reply by wielding a Dutch Book argument purporting to show that this<br />

betting practice is incoherent would be blatantly question-begging. That argument<br />

simply assumes that p ∨ ¬ p is a logical truth, which is presumably part of what’s at<br />

issue. (In our terminology, this disjunction has the status of a ⊢CL-thesis which is not<br />

a ⊢IL-thesis.)<br />
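
The arithmetic behind quoting under 50 cents apiece can be sketched with stipulated discovery chances (the numbers are my own illustration, not Harman's):<br />

```python
# Each bet pays $1 (interest-adjusted) only if the relevant fact is ever discovered.
chance_discover_p = 0.40      # stipulated chance it becomes known that p
chance_discover_not_p = 0.35  # stipulated chance it becomes known that ¬p

# With some chance that neither bet is ever settled, the fair quotes are the
# discovery chances times the $1 payoff:
quote_p = chance_discover_p * 1.00
quote_not_p = chance_discover_not_p * 1.00
total = quote_p + quote_not_p
print(total < 1)  # True: the pair is rationally worth less than $1
```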

Harman’s point is not to argue for an intuitionistic approach to probability. Rather,<br />

he is arguing against using probabilistic semantics for propositional logic. Such an approach<br />

he claims would be bound to lead to intuitionistic logic for the reasons given<br />

above. He thinks that, since this would be an error, the move to probabilistic semantics<br />

is simply misguided. Whatever we think of this conclusion, we can press into<br />

service his arguments for intuitionistic Bayesianism.<br />

2.2 The second argument for this approach turns on the anti-realism of some heterodox<br />

theorists. So George Shackle, for example, argues that if we are anti-realists<br />

about the future, we will assign positive probability to no future-directed proposition.<br />

The following summary is from a sympathetic interpreter of Shackle’s writing.<br />

[T]here is every reason to refuse additivity: [it] implies that the certainty<br />

that would be assigned to the set of possibilities should be ‘distributed’<br />

between different events. Now this set of events is undetermined as the<br />

future – that exists only in imagination – is. (Ponsonnet, 1996, 171)<br />

Shackle’s anti-realism is motivated by what most theorists would regard as a philosophical<br />

howler: he regards realism about the future as incompatible with human<br />

freedom, and holds that human beings are free. The second premise here seems harmless<br />

enough, but the first is notoriously difficult to motivate. Nevertheless, there are<br />

some better arguments than this for anti-realism about the future. If we adopt these,<br />

it isn’t clear why we should ‘assign certainty’ to the set of possibilities.<br />

Shackle is here assuming that for any proposition p, even a proposition about the<br />

future, p ∨ ¬ p is now true, although neither disjunct is true. Given his interests it<br />

seems better to follow Dummett here and say that if we are anti-realists about a subject<br />

then for propositions p about that subject, p ∨ ¬ p fails to be true. Hence we have<br />

no need to ‘assign certainty to the set of possibilities’. Or perhaps more accurately,<br />

assigning certainty to the set of possibilities does not mean assigning probability 1<br />

to p ∨ ¬p; in particular, condition (P1) on ⊢-probability functions does not require<br />

this when we choose ⊢ as ⊢IL.<br />

2.3 The third motivation for adopting an intuitionistic approach to probability<br />

is that it allows us to retain the Kolmogorov axioms for probability, in particular



the Principle of Addition. This principle has, to our minds at least, some intuitive<br />

motivation. And the counterexamples levelled against it by heterodox theorists seem<br />

rather weak from the intuitionistic Bayesian perspective. For they all are cases where<br />

we might feel it appropriate to assign a low probability to a proposition and its negation. 3<br />

Hence if we are committed to saying Pr(A ∨ ¬A) = 1 for all A, we must give<br />

up the Principle of Addition. But the intuitionistic Bayesian simply denies that in<br />

these cases Pr(A ∨ ¬A) = 1, so no counterexample to Addition arises. This denial<br />

is compatible with condition (P1) on Pr’s being a ⊢IL-probability function since, as<br />

already noted, A ∨ ¬A is not in general a ⊢IL-thesis.<br />

2.4 The final argument for taking an intuitionistic approach is that it provides<br />

a justification for rejecting the positive arguments for classical Bayesianism. These<br />

provide a justification for requiring coherent degrees of belief to be representable by<br />

the classical probability calculus. There are a dizzying variety of such arguments<br />

which link probabilistic epistemology to decision theory, including: the traditional<br />

Dutch Book arguments found in Ramsey (1926), Teller (1973) and Lewis (1999b); depragmatized<br />

Dutch Book arguments which rely on consistency of valuations, rather<br />

than avoiding actual losses, as in Howson and Urbach (1989), Christensen (1996) and<br />

Hellman (1997); and arguments from the plausibility of decision theoretic constraints<br />

to constraints on partial beliefs, as in Savage (1954), Maher (1993) and Kaplan (1996).<br />

As well as these, there are arguments for classical Bayesianism which do not rely<br />

on decision theory in any way, but which flow either directly from the definitions<br />

of degrees of belief, or from broader epistemological considerations. A summary<br />

of traditional arguments of this kind is in Paris (1994). Joyce (1998) provides an<br />

interesting modern variation on this theme.<br />

All such arguments assume classical – rather than, say, intuitionistic – reasoning<br />

is appropriate. The intuitionist has a simple and principled reason for rejecting those<br />

arguments. The theorist who endorses ⊢CL when considering questions of inference<br />

presumably lacks any such simple reason. And they need one, unless they think it<br />

appropriate to endorse one position knowing there is an unrefuted argument for an<br />

incompatible viewpoint.<br />

We are not insisting that non-Bayesians will be unable to refute these arguments<br />

while holding on to ⊢CL. We are merely suggesting that the task will be Herculean.<br />

A start on this project is made by Shafer (1981), which suggests some reasons for<br />

breaking the link between probabilistic epistemology and decision theory. Even if<br />

these responses are successful, they are ineffective against arguments<br />

which do not exploit such a link. As we think these are the strongest arguments<br />

for classical Bayesianism, non-Bayesians have much work left to do. And it<br />

is possible that this task cannot be completed. That is, it is possible that the only<br />

questionable step in some of these arguments for classical Bayesianism is their use of<br />

non-constructive reasoning. If this is so, only theorists who give up ⊢CL can respond<br />

to such arguments.<br />

3 Again the discussion in (Shafer, 1976, ch. 2) is the most obvious example of this, but similar examples<br />

abound in the literature.



In sum, non-Bayesians need to be able to respond to the wide variety of arguments<br />

for Bayesianism. Non-Bayesians who hold on to ⊢CL must do so without questioning<br />

the implicit logical assumptions of such arguments. Given this restriction, producing<br />

these responses will be a slow, time-consuming task, the responses will in all likelihood<br />

be piecemeal, providing little sense of the underlying flaw of the arguments,<br />

and for some arguments it is possible that no effective response can be made. Intuitionistic<br />

Bayesians have a quick, systematic and, we think, effective response to all<br />

these arguments.<br />

3 More on Intuitionistic Probability Functions<br />

Having explained the motivation for intuitionistic Bayesianism, let us turn our attention<br />

in greater detail to its main source of novelty: the intuitionistic probability<br />

functions. We concentrate on logical matters here, in the following section justifying<br />

the singling out of this class of probability functions by showing that an epistemic<br />

state represented by Bel is invulnerable to a kind of Dutch Book if and only if Bel is<br />

an intuitionistic probability function.<br />

For the case of specifically classical probability functions, the conditions (P0)–(P3)<br />

of Section 1 involve substantial redundancy. For example, we could replace (P2) and<br />

(P3) by – what would in isolation be weaker conditions – (P2 ′ ) and (P3 ′ ).<br />

(P2′) If A ⊣⊢ B then Pr(A) = Pr(B)<br />

(P3′) If ⊢ ¬(A ∧ B) then Pr(A ∨ B) = Pr(A) + Pr(B)<br />

However, in the general case of arbitrary ⊢-probability functions (or rather: those<br />

for which ¬ is amongst the connectives of the language of ⊢), such a replacement<br />

would result in a genuine weakening, as we may see from a consideration of the class<br />

of ⊢IL-probability functions. While both (P2′) and (P3′) are satisfied for ⊢ as ⊢IL, the<br />

class of functions Pr satisfying (P0), (P1), (P2′) and (P3′) is broader (for this choice of<br />

⊢) than the class of intuitionistic probability functions. To see this, first note that<br />

the function P, defined immediately below, satisfies (P0), (P1), (P2) and (P3 ′ ), but not<br />

(P3).<br />

P(A) = 1 if p ∨ q ⊢IL A, and P(A) = 0 otherwise.<br />

(Here p and q are a pair of atomic sentences.) To see that (P3 ′ ) is satisfied, assume<br />

P(A ∨ B) = 1 and ⊢IL ¬(A ∧ B). Then p ∨ q ⊢IL A ∨ B, and B ⊢IL ¬A. Hence<br />

p ∨ q ⊢IL A ∨ ¬A, but this only holds if either (1) p ∨ q ⊢IL A or (2) p ∨ q ⊢IL ¬A.<br />

(For if p ∨ q ⊢IL A ∨ ¬A, then p ⊢IL A ∨ ¬A and q ⊢IL A ∨ ¬A, whence by a generalization,<br />

due to Harrop, of the Disjunction Property for intuitionistic logic, either<br />

p ⊢IL A or p ⊢IL ¬A, and similarly either q ⊢IL A or q ⊢IL ¬A. Thus one of the following<br />

four combinations obtains: (a) p ⊢IL A and q ⊢IL A, (b) p ⊢IL A and q ⊢IL ¬A, (c) p ⊢IL ¬A<br />

and q ⊢IL A, (d) p ⊢IL ¬A and q ⊢IL ¬A. But cases (b) and (c) can be ruled out since they<br />

would make p and q ⊢IL-incompatible, contradicting their status as atomic sentences,<br />

and from (a) and (d), (1) and (2) follow respectively.) If (1) holds then P(A) = 1,<br />



as required. If (2) holds then p ∨ q ⊢IL (A ∨ B) ∧ ¬A and (A ∨ B) ∧ ¬A ⊢IL B, so P(B)<br />

= 1. The other cases are trivial to verify and are left to the reader.<br />

To see (P2) is needed (for the current choice of ⊢), as opposed to just (P2′), consider<br />

the following Kripke tree: node 1 is the root, with successors 2 and 3; node 2 forces ¬p; and node 3 has a single successor, node 4, which forces p.<br />

We introduce a “weighting” function w by setting w(1) = 0.2, w(2) = 0.3, w(3) = −0.1<br />

and w(4) = 0.6. For any A, let P(A) = Σw(i), where the summation is across all points<br />

i that force A. So P(p) = 0.6 and P(¬¬p) = 0.5, contradicting (P2), since p ⊢IL ¬¬p. But (P0), (P1),<br />

(P2′) and (P3) are all satisfied, showing that (P2) is in the general case not derivable<br />

from these conditions.<br />
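
The counterexample can be checked mechanically. A sketch (my own encoding of the four-node tree: node 1 is the root, nodes 2 and 3 are its successors, and node 4, which forces p, extends node 3):<br />

```python
# up[i] = the points at or above node i in the Kripke tree (reflexive).
up = {1: {1, 2, 3, 4}, 2: {2}, 3: {3, 4}, 4: {4}}
w = {1: 0.2, 2: 0.3, 3: -0.1, 4: 0.6}

atom_p = {4}  # p is forced at node 4 only (upward-closed, since 4 is maximal)

def forces_neg(a_nodes):
    """A node forces ¬A iff no point at or above it forces A."""
    return {i for i in up if not (up[i] & a_nodes)}

def P(nodes):
    """P(A) = sum of w(i) over the points i forcing A."""
    return sum(w[i] for i in nodes)

neg_p = forces_neg(atom_p)      # only node 2 forces ¬p
neg_neg_p = forces_neg(neg_p)   # nodes 3 and 4 force ¬¬p
print(P(atom_p), P(neg_neg_p))  # P(p) exceeds P(¬¬p), although p entails ¬¬p
```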

4 Bets and Intuitionistic Probability Functions<br />

Say that an A-bet is a bet that pays $1 if A and nothing otherwise. These will sometimes<br />

be called bets on A. In this theory, as in real life, it is possible that neither A-bets<br />

nor ¬A-bets will ever be collected, so holding an A-bet and a ¬A-bet is not necessarily<br />

as good as holding $1. An A-bet becomes a winning bet, i.e. worth $1, just when<br />

it becomes known that A. We will assume that bookmakers and punters are both<br />

logically proficient and honest, so that when a B-bet becomes a winning bet and<br />

B ⊢IL A, then an A-bet is a winning bet. The picture underlying this story is the Kripke<br />

tree semantics for intuitionistic logic. Bettors are thought of as being at some node<br />

of a Kripke tree; an A-bet wins at that stage iff A is forced by that node. Bettors do<br />

not know that any future nodes will be reached, so they cannot be confident that all<br />

bets on classical tautologies (⊢CL-theses) will be winning. And more importantly, we<br />

take it that an (A ∨ B)-bet wins if and only if an A-bet wins or a B-bet wins. Again<br />

this mirrors the fact that A ∨ B is forced at a node iff A is forced or B is forced.<br />

Finally, to get the Dutch Book style argument going, assume that for any sequence<br />

of bets on A_1, A_2, ..., A_k, the bettor values the sequence at $(Bel(A_1) + Bel(A_2) + ...<br />

+ Bel(A_k)). This is obviously unrealistic and economically suspect, 4 but is perhaps a<br />

4 It is economically suspect because, in simplified terms, Bel(A) gives at best the use-value of an A-bet,<br />

but this is distinct from the exchange-value the agent places on the bet. And it is the exchange-value that<br />

determines her patterns of buying and selling.<br />




useful analogy. Then Bel leads to coherent valuations in all circumstances iff Bel is an<br />

intuitionistic probability function. That is, if Bel is not an intuitionistic probability<br />

function (henceforth: IPF) then there will be two finite sequences of bets S1 and S2 such that S1 is guaranteed to pay at least as much as S2 in all circumstances, but S2 is<br />

given higher value by the agent. For simplicity Bel will be called incoherent if this<br />

happens, and coherent otherwise. If Bel is an IPF there are no two such sequences, so<br />

it is coherent.<br />

If Bel is not an IPF then we just need to look at which axiom is breached in order<br />

to construct the sequences. For example, if (P3) is breached then let the sequences be<br />

〈A, B〉 and 〈A ∨ B, A ∧ B〉. The same number of propositions from each sequence<br />

are forced at every node of every Kripke tree, so the coherence requirement is that<br />

the two sequences receive the same value. But ex hypothesi they do not, so Bel is incoherent.<br />

Similar proofs suffice for the remaining axioms (the remaining conditions on<br />

⊢-probability functions, that is, as they apply in the special case of ⊢ = ⊢IL).<br />

To show that if Bel is an IPF it is coherent, we need some more notation. Let<br />

〈A_1, ..., A_k〉 be a sequence of propositions. Then say c_{n,k} is the proposition that is true iff at<br />

least n of these are true. So c_{2,3} is the proposition (A_1 ∧ A_2) ∨ (A_1 ∧ A_3) ∨ (A_2 ∧ A_3).<br />

Assuming Bel is an IPF, we prove the following lemma holds for all k:<br />

Lemma: Σ_{i=1}^{k} Bel(A_i) = Σ_{i=1}^{k} Bel(c_{i,k})<br />

The proof is by induction on k. For k=1 and k=2, the proof is given by the axioms.<br />

So it remains only to complete the inductive step. For ease of reading in the proof we<br />

write A for Bel(A) where no ambiguity would result.<br />

By the inductive hypothesis we have:<br />

Σ_{i=1}^{k} A_i + kA_{k+1} = Σ_{i=1}^{k} c_{i,k} + kA_{k+1}<br />
  = Σ_{i=1}^{k} ((c_{i,k} ∨ A_{k+1}) + (c_{i,k} ∧ A_{k+1}))   by k applications of (P3)<br />
Since Σ_{i=1}^{k} A_i + A_{k+1} = Σ_{i=1}^{k+1} A_i, this equation simplifies to<br />
Σ_{i=1}^{k+1} A_i + (k − 1)A_{k+1} = Σ_{i=1}^{k} ((c_{i,k} ∨ A_{k+1}) + (c_{i,k} ∧ A_{k+1}))<br />
Since c_{i,k} ∨ A_{k+1} and c_{i,k+1} ∨ A_{k+1} are intuitionistically equivalent, as are c_{i,k} ∧ A_{k+1}<br />
and c_{i+1,k+1} ∧ A_{k+1}, we have:<br />
Σ_{i=1}^{k+1} A_i + (k − 1)A_{k+1} = Σ_{i=1}^{k} (c_{i,k+1} ∨ A_{k+1}) + Σ_{i=1}^{k} (c_{i+1,k+1} ∧ A_{k+1})<br />



Now, c_{1,k+1} ∨ A_{k+1} is equivalent to c_{1,k+1}, and c_{k+1,k+1} ∧ A_{k+1} is equivalent to c_{k+1,k+1},<br />
from the definitions of c. So substituting in these equivalences and slightly renumbering, we get:<br />
Σ_{i=1}^{k+1} A_i + (k − 1)A_{k+1} = c_{1,k+1} + c_{k+1,k+1} + Σ_{i=1}^{k−1} (c_{i+1,k+1} ∨ A_{k+1}) + Σ_{i=1}^{k−1} (c_{i+1,k+1} ∧ A_{k+1})<br />
Regrouping the last two summations and applying (P3),<br />
Σ_{i=1}^{k+1} A_i + (k − 1)A_{k+1} = c_{1,k+1} + c_{k+1,k+1} + Σ_{i=1}^{k−1} (c_{i+1,k+1} + A_{k+1})<br />
  = Σ_{i=1}^{k+1} c_{i,k+1} + (k − 1)A_{k+1}<br />

And cancelling out the second term on each side gives us the result we want. From<br />
this it follows immediately that Bel is coherent. Let S1 and S2 be any two sequences<br />
such that S1 is guaranteed to pay as much as S2. That is, that S2 pays $n entails that<br />
S1 pays at least $n, for all n. Now the lemma shows that the value of each sequence<br />
of bets equals the sum, over all n up to the length of the sequence, of the probability<br />
that it will pay at least $n. So by as many appeals to (P2) as there are bets in S1, we<br />
have that the value of S2 is less than or equal to the value of S1, as required.<br />
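Classical probability functions satisfy (P1)–(P3), so the lemma must hold for them as a special case. A numerical spot-check in Python, with an invented four-world measure and three invented events (illustrative only):

```python
# Spot-check of the lemma in the classical special case.  With three
# events A_1, A_2, A_3 over four equiprobable worlds, sum_i P(A_i)
# should equal sum_n P(c_{n,3}), where c_{n,k} holds at a world iff
# at least n of the k events hold there.
worlds = [0, 1, 2, 3]
p = {w: 0.25 for w in worlds}
A = [{0, 1}, {1, 2}, {0, 1, 3}]   # invented events A_1, A_2, A_3

def prob(event):
    return sum(p[w] for w in event)

def c(n, events):
    # the set of worlds at which at least n of the events hold
    return {w for w in worlds if sum(w in e for e in events) >= n}

lhs = sum(prob(a) for a in A)
rhs = sum(prob(c(n, A)) for n in range(1, len(A) + 1))
assert abs(lhs - rhs) < 1e-12   # both come to 1.75 here
```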

Given the well-known problems with Dutch Book arguments 5 , it might be wondered<br />

if we can give a different justification for the axioms. Indeed it may be considered<br />

helpful to have a semantics for the logic which does not refer to betting practices.<br />

One possibility is to say that IPFs are normalised measures on Kripke trees. The idea<br />

is that the probability of a proposition is the measure of the set of points at which the<br />

proposition is forced. It is straightforward to give a non-constructive proof that the<br />

axioms are sound with respect to these semantics, but making this proof constructive<br />

and providing any proof that the axioms are complete is a harder task. So for now<br />

this Dutch Book justification for the axioms is the best available.<br />
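A toy instance of this semantics is easy to write down; the tree, weights and valuation below are invented for illustration. The distinctively intuitionistic behaviour shows up in negation: Bel(p) + Bel(¬p) can fall short of 1.

```python
# An intuitionistic probability as a normalised measure on a Kripke
# tree: Bel(A) is the measure of the set of nodes forcing A.
# Tree: root 0 with leaves 1 and 2; accessibility is descendant-or-self.
children = {0: [1, 2], 1: [], 2: []}
weight = {0: 0.5, 1: 0.25, 2: 0.25}   # a normalised measure

def descendants(n):
    out = {n}
    for m in children[n]:
        out |= descendants(m)
    return out

def forces_not(s):
    # a node forces ~A iff no node accessible from it forces A
    return {n for n in weight if not (descendants(n) & s)}

forces_p = {1}          # atom p forced exactly at node 1 (upward closed)
bel_p = sum(weight[n] for n in forces_p)
bel_not_p = sum(weight[n] for n in forces_not(forces_p))
assert bel_p + bel_not_p < 1   # 0.25 + 0.25: the root forces neither
```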

Appendix: The Morgan–Leblanc–Mares Calculus<br />

In a series of papers (Morgan and LeBlanc (1983a,b), Morgan and Mares (1995)) an<br />

approach to probability grounded in intuitionistic logic has been developed. The<br />

motivation is as follows. A machine contains an unknown set of propositions S,<br />

which need not be consistent. Pr(A, B) is the maximal price we’d pay for a bet that<br />

S and B intuitionistically entail A (S, B ⊢_IL A, that is). By standard Dutch Book arguments,<br />

we obtain axioms for a probability calculus which has some claim to being<br />

constructivist. The point of this section is to register the shortcomings of this approach<br />

as a theory of uncertain reasoning from evidence – to point out, that is, the<br />

implausibility of interpreting the axioms they derive as normative constraints on degrees<br />

of belief. (It should be noted from the start that this was not the advertised<br />

purpose of their theory, and at least one of the authors (Mares) has said (p.c.) that<br />

the primary purpose of constructing these theories was to generalise the triviality<br />

5 See Maher (1993) for criticisms of the most recent attempts at successful Dutch Book arguments and<br />

references to criticisms of earlier attempts.<br />




results proved in Lewis (1976b). So the purpose of this appendix may be to argue for<br />

something that isn’t in dispute: that these theories can’t be pushed into double duty<br />

as theories of reasoning under uncertainty.)<br />

The axiomatisations given in the Morgan and Leblanc papers differ a little from<br />
the one given in the Morgan and Mares paper, but the criticisms levelled here apply to<br />

their common elements. In particular, the following four axioms are in both sets.<br />

(C1) 0 ≤ Pr(A, B) ≤ 1<br />

(C2) Pr(A, A ∧ B) = 1<br />

(C3) Pr(A, B ∧ C) · Pr(B, C) = Pr(B, A ∧ C) · Pr(A, C)<br />

(C4) Pr(A ⊃ B, C) = Pr(B, A ∧ C)<br />

These four are enough to get both of the unwanted consequences. In particular, from<br />
these we get the ‘no negative evidence’ rule: Pr(A, B ∧ C) ≥ Pr(A, B). The proof is in<br />
Morgan and Mares (1995). Now given the semantic interpretation they have adopted,<br />

this is perhaps not so bad. After all, if we can prove A from B and S, we can certainly<br />

prove it from B ∧ C and S, but the converse does not hold. However from our<br />

perspective this feature seems a little implausible. In particular, if C is ¬A, it seems<br />

we should have Pr(A, B ∧ ¬A) = 0 unless B ⊢_IL A, in which case Pr(A, B ∧ ¬A) is<br />

undefined.<br />

It shouldn’t be that surprising that we get odd results given (C4). Lewis (1976b)<br />

shows that adopting it for a (primitive or defined) connective ‘→’ within the classical<br />

probability calculus leads to triviality. And neither the arguments he uses there<br />

nor the arguments for some stronger conclusions in Lewis (1999b) rely heavily on<br />

classical principles. The papers by Morgan and Leblanc don’t discuss this threat, but<br />

it is discussed in detail in Morgan and Mares (1995). Morgan and Mares note<br />

that it’s possible to build a theory based on (C1) to (C4) that isn’t trivial in the sense<br />

Lewis described. But these theories still have enough surprising features that they<br />

aren’t suitable for use as a theory of reasoning under uncertainty.<br />

In intuitionistic logic we often take the falsum ⊥ as a primitive connective, functioning<br />
as a ⊢_IL-antithesis. Hence a set S is intuitionistically consistent iff we do not<br />
have S ⊢_IL ⊥. Now the following seems a plausible condition:<br />

(C⊥) For consistent B, Pr(⊥, B) = 0.<br />

Given consistent evidence, we have no evidence at all that the falsum is true. Hence<br />

we should set the probability of the falsum to 0 (as required by our condition (P0)<br />

from Section 1). Given Morgan and Leblanc’s original semantic interpretation there<br />

is less motivation for adopting (C⊥), since S might be inconsistent. The restriction<br />

to consistent B in (C⊥) is imposed because we take Pr(A, B) to be undefined for<br />

inconsistent B, as explained at the end of Section 1. (In more detail: if B is a ⊢_IL-antithesis<br />
then Pr(B) = 0 for any intuitionistic probability function Pr, whence the<br />

undefinedness of Pr(A, B) by the remarks at the end of that section.) Morgan, Leblanc<br />

and Mares take it to be set at 1. The choice here is a little arbitrary, the only decisive<br />

factor being apparently the easier statement of certain results. Now if we take the<br />

falsum as a primitive the next move is usually to introduce ¬ as a defined connective,<br />

as follows.



¬A =df A ⊃ ⊥<br />

Assuming A ∧ B is consistent, it follows from (C4) and (C⊥) that Pr(¬A, B) = 0.<br />
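Spelling the derivation out, using the definition of ¬ and then (C4) and (C⊥) in turn:

```latex
\Pr(\neg A, B) = \Pr(A \supset \bot, B)  % definition of \neg
              = \Pr(\bot, A \wedge B)    % by (C4)
              = 0                        % by (C\bot), as A \wedge B is consistent
```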

Again, from our perspective this is an implausible result. The main purpose of this<br />

appendix has been to show that the Morgan–Leblanc–Mares probability calculus cannot<br />

do the work Bayesians want a probability calculus to do. That is, it is implausible<br />

to regard their Pr(A, B) as the reasonable degree of belief in A given B. Hence the account<br />

of conditional probability these authors offer diverges from the intuitionistic<br />

Bayesianism that we have been urging heterodox theorists to endorse.


Should We Respond to Evil With Indifference?<br />

Abstract<br />

In a recent article, Adam Elga outlines a strategy for “Defeating Dr Evil<br />

with Self-Locating Belief”. The strategy relies on an indifference principle<br />

that is not up to the task. In general, there are two things to dislike<br />

about indifference principles: adopting one normally means confusing<br />

risk for uncertainty, and they tend to lead to incoherent views in some<br />

‘paradoxical’ situations. I argue that both kinds of objection can be levelled<br />

against Elga’s indifference principle. There are also some difficulties<br />
with the concept of evidence that Elga uses, and these create further problems<br />
for the principle.<br />

In a recent article, Adam Elga outlines a strategy for “Defeating Dr Evil with Self-<br />

Locating Belief”. The strategy relies on an indifference principle that is not up to<br />

the task. In general, there are two things to dislike about indifference principles:<br />

adopting one normally means confusing risk for uncertainty, and they tend to lead<br />

to incoherent views in some ‘paradoxical’ situations. Each kind of objection can be<br />

levelled against Elga’s theory, but because Elga is more careful than anyone has ever<br />

been in choosing the circumstances under which his indifference principle applies<br />

we have to be similarly careful in focussing the objections. Even with this care the<br />

objections I put forward here will be less compelling than, say, the objections Keynes<br />
(1921, Ch. 4) put forward in his criticisms of earlier indifference principles. But<br />

there still may be enough to make us reject Elga’s principle. The structure of this<br />

note is as follows. In §§1 and 2 I set out Elga’s theory, in §§3 and 4 I discuss some<br />

initial objections that I don’t think are particularly telling, in §5 I discuss some paradoxes<br />

to which Elga’s theory seems to lead (this is reprised in §9 where I discuss a<br />

somewhat different paradoxical case) and in §§7 and 8 I argue that even Elga’s careful<br />

indifference principle involves a risk/uncertainty confusion.<br />

1 From Basel to Princeton<br />

In (1979a) David Lewis argued that the contents of contentful mental states were not<br />

propositions, but properties. When I think that I’m a rock star, I don’t attribute<br />

truth to the proposition Brian is a rock star, but rather attribute the property of<br />

rock stardom to myself. Lewis was led to this position by considering cases where<br />

a believer is mistaken about his own identity. For example, if I believe that I’m a<br />

rock star without believing that I’m Brian, and in fact while thinking that Brian is an<br />
infamous philosopher, it is odd to attribute to me belief in the proposition Brian is a<br />

rock star. But it is perfectly natural to say I self-attribute rock stardom, and that’s just<br />

what Lewis says.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophy<br />

and Phenomenological Research 70 (2005): 613-35. Thanks to Jamie Dreier, Adam Elga and an anonymous<br />

referee for helpful discussions about this paper and suggestions for improvements.



If we accept Lewis’s position, there are two paths we can take. First, we can try<br />

simply replacing all talk of propositional attitudes with talk of proprietal attitudes,<br />

and trusting and hoping that this won’t make a difference to our subsequent theorising.<br />

Alternatively, we can see if changing the type of entity that is the content<br />

of a contentful state has distinctive consequences, and in particular see if it gives us<br />

the conceptual resources to make progress on some old problems. That’s the approach<br />

Adam Elga has taken in a couple of papers, and whatever one thinks of his<br />

conclusions, the early returns certainly suggest that this Lewisian outlook will prove<br />

remarkably fruitful.<br />

On the Lewisian approach, credences are defined over properties, and properties<br />

are sets of possibilia, i.e. centred worlds. Some properties are maximally precise: they<br />

are satisfied by exactly one possible object. Elga sometimes calls these maximally<br />

specific properties predicaments because they specify exactly what is happening to<br />

the agent that instantiates one. Say predicaments F1 and F2 are similar iff the F1 and<br />
the F2 are worldmates and their experiences are indistinguishable. Elga’s principle<br />
INDIFFERENCE says that if predicaments F1 and F2 are similar then any rational<br />
agent should assign equal credence to F1 and F2. This becomes most interesting when<br />
there are similar F1 and F2. So, for instance, consider poor O’Leary.<br />

O’LEARY O’Leary is locked in the trunk of his car overnight. He knows that he’ll<br />

wake up briefly twice during the night (at 1:00 and again at 2:00) and that the<br />

awakenings will be subjectively indistinguishable (because by 2:00 he’ll have<br />

forgotten the 1:00 awakening). At 1:00 he wakes up.<br />

Elga says that when O’Leary wakes up, he should assign equal credence to it being<br />

1:00 as to it being 2:00. So, provided O’Leary knows that one of these two hypotheses<br />

is true, INDIFFERENCE says that he should assign credence 1/2 to it being 1:00 at<br />

the wake up.<br />

Elga has an argument for INDIFFERENCE, which we shall get to by §8, but for<br />

a while I will look at some immediate consequences of the position. I’ll start with<br />

two reasons to think that INDIFFERENCE needs to be strengthened to play the role<br />

he wants it to play.<br />

2 Add it Up<br />

One difficulty with INDIFFERENCE as stated so far is that it applies only to very<br />

narrow properties, predicaments, and it is not clear how to generalise to properties<br />

in which we are more interested.<br />

BERNOULLIUM Despite months of research, Leslie still doesn’t know what the<br />
half-life of Bernoullium, her newly discovered element, is. It’s between one and<br />

two nanoseconds, but she can’t manufacture enough of the stuff to get a better<br />

measurement than that. She does, however, know that she’s locked in the trunk<br />

of her car, and that like O’Leary she will have two indistinguishable nocturnal<br />

awakenings. She’s having one now in fact, but naturally she can’t tell whether<br />

it is the first or the second.



INDIFFERENCE says that Leslie should assign credence 1/2 to it being the first<br />

wake-up, right? Not yet. All that INDIFFERENCE says is that any two predicaments<br />

should receive equal credence. A predicament is maximally specific, so it specifies,<br />

inter alia, the half-life of Bernoullium. But for any x, Leslie assigns credence 0 to<br />

x being the half-life of Bernoullium, because there are uncountably many candidates<br />

for being the half-life, and none of them look better than any of the others. So she<br />

assigns credence 0 to every predicament, and so she satisfies INDIFFERENCE no<br />

matter what she thinks about what the time is. Even if, for no reason at all, she is<br />

certain it is her second awakening, she still satisfies INDIFFERENCE as it is written,<br />

because she assigns credence 0 to every predicament, and hence equal credence<br />

to similar predicaments.<br />

Fortunately, we can strengthen INDIFFERENCE to cover this case. To start,<br />

note that the motivations for INDIFFERENCE suggest that if two predicaments are<br />

similar then they should receive equal credence not just in the agent’s actual state, but<br />

even when the agent gets more evidence. Leslie should keep assigning equal credence<br />

to it being her first or second wake up if she somehow learns what the half-life of<br />

Bernoullium is, for example. This suggests the following principle:<br />

C-INDIFFERENCE If F1 and F2 are similar, and an agent does not know that she<br />
is in neither, then her conditional credence on being F1, conditional on being<br />
either F1 or F2, should be 1/2. 1<br />

But even this doesn’t quite resolve our problem. Simplifying Leslie’s situation somewhat,<br />

the live predicaments are all of the following form: this is the first/second<br />

awakening, and the half-life of Bernoullium is x. C-INDIFFERENCE requires that<br />

for any c, conditional on the half-life of Bernoullium being c, Leslie assign credence<br />

1/2 to it being her first awakening. From this and the fact that Leslie’s credence<br />

function is a probability function it doesn’t follow that her credence in this being<br />
her first awakening is 1/2: the conditioning propositions form an uncountable partition,<br />
each of credence 0, so not even countable additivity bridges the gap. So to get<br />
INDIFFERENCE to do the work it is meant<br />

to do in Leslie’s case (and presumably O’Leary’s case, since in practice there will be<br />

some other propositions about which O’Leary is deeply uncertain) I think we need<br />

to strengthen it to the following.<br />

P-INDIFFERENCE If G1 and G2 are properties such that:<br />

1 INDIFFERENCE entails C-INDIFFERENCE given the following extra assumptions. First, if<br />
INDIFFERENCE is true it is indefeasible, so it must remain true whatever one’s evidence is. Secondly,<br />
rational agents should update by conditionalisation. Thirdly, it is always possible for an agent to get evidence<br />
that tells her she is in F1 or F2 and no more. The third premise is at best an idealisation, but it is<br />
hard to see how or why that should tell against C-INDIFFERENCE.<br />



(a) For all worlds w, there is at most one G1 in w and at most one G2 in w;<br />
(b) For all worlds w, there is a G1 in w iff there is a G2 in w; and<br />
(c) For all worlds w where there is a G1 in w, the G1 and the G2 have indistinguishable<br />
experiences; then<br />
G1 and G2 deserve equal credence.<br />

Elga does not endorse either C-INDIFFERENCE or P-INDIFFERENCE, but I suspect<br />

he should given his starting assumptions. It is hard to believe that if O’Leary is<br />

certain about everything save what time it is, then rationality imposes very strong<br />

constraints on his beliefs about time, while rationality imposes no such constraints<br />

should he (or Leslie) be uncertain about the half-life of Bernoullium. Put another<br />

way, it is hard to believe that in her current state Leslie could rationally assign credence<br />

0.9 to this being her first awakening, but if she decided the half-life of Bernoullium<br />

is 1.415 nanoseconds, then she would be required to change that credence to<br />

0.5. If we have INDIFFERENCE without P-INDIFFERENCE, that is possible. So<br />

I will assume in what follows that if C-INDIFFERENCE and P-INDIFFERENCE<br />

are false then INDIFFERENCE is heavily undermined. 2<br />

3 Out of sight, out of mind<br />

Elga’s discussion presupposes two kinds of internalism. First, he assumes that some<br />

internalist theory of experience is true. Second, he assumes that some internalist theory<br />

of justification is true. If the first assumption is false it threatens the applicability<br />

of the theory. If the second assumption is false it threatens the truth of the theory.<br />

An externalist theory of experience says that what kind of experience S is having<br />

is determined, inter alia, by what S is experiencing. While setting out such a view,<br />

John Campbell (2002, 124-6) says that two people sitting in duplicate prison cells<br />

looking at duplicate coffee cups will have different experiences, because one will have<br />

an experience of the coffee cup in her hand, and the other will not have an experience<br />

of that cup. This does not threaten INDIFFERENCE, but it does seem to render it<br />

trivial. On Campbell’s view, if two agents are able to make demonstrative reference to<br />

different objects, and there is no reason to think Elga’s agents in allegedly similar but<br />

not numerically identical predicaments cannot, they are having different experiences.<br />

Hence the situations are not really similar after all. Strictly speaking, this is good<br />

news for INDIFFERENCE, since it is hard given this view of experience to find<br />

counterexamples to it. But I doubt that Elga will be happy with this defence.<br />

The second kind of internalist assumption is more threatening. Many externalists<br />

about justification think whether a particular experience justifies a belief for an<br />

agent depends not just on intrinsic features of that experience, but on the relationship<br />

between experiences of that kind and the world around the agent. In some versions<br />

of this, especially the version defended by Timothy Williamson (1998), whether an<br />

2 Note also that if P-INDIFFERENCE is false, then Dr Evil has an easy way out of the ‘brain race’ that<br />

comes up at the end of Elga’s paper. He need only be told about some new element without being told its<br />

half-life, and magically he is free to assign credence 1 to his being on the spaceship rather than on Earth.<br />

This would reduce the interest of the puzzle somewhat I fear.



experience either constitutes or produces evidence depends on whether it constitutes<br />

or produces knowledge. Since it is not clear that any two similar agents know the<br />
same things (indeed, it is clear that they do not have the same true beliefs), on Williamson’s<br />
theory it seems that the agents will not have the same evidence. In particular, it<br />

is possible that part of one agent’s evidence is inconsistent with her being the other<br />

agent. If part of her evidence is that she has hands, then she is not a brain-in-a-vat having<br />

experiences like hers, and she should not assign high credence to the claim that<br />

she is one, no matter what INDIFFERENCE says. So Elga needs to reject this kind<br />

of externalism about evidence. This is not a devastating objection. I am sure that<br />

Elga does reject Campbell’s and Williamson’s theories, so just raising them against<br />

him without argument would be question-begging. But this does mean that the target<br />

audience for INDIFFERENCE is smaller than for some philosophical claims,<br />

since adherents of Campbell’s or Williamson’s views will be antecedently disposed to<br />

think INDIFFERENCE is useless or false.<br />

4 It’s Evidently Intransitive<br />

Dakota is sitting in a bright green room. She is trying to reconstruct how she got<br />

there when Dr Evil informs her just what happened. An epistemology student, not<br />

coincidentally called Dakota, was snatched out of her study and duplicated 999 times<br />

over. The duplicates were then numbered (though we’ve lost which number was<br />

given to the original) each put in a coloured cell. The thousand coloured cells rotated<br />

slowly through the colour sphere, starting with cell 0 (the new home of Dakota number<br />

0) being green, going blueish until cell 250 (for Dakota number 250) is just blue,<br />

then reddish until cell 500 is just red, swinging through the yellows with pure yellow<br />

reached at 750, and then back to the greens, with cell 999 being practically identical to<br />
cell 0. For any n, cells number n and n+1 are indistinguishable. That means that<br />

Dakota number n is similar, in Elga’s sense, to Dakota number n+1, for their (apparent)<br />

experiences before being in the rooms are identical, and their experiences in<br />

the rooms are indistinguishable. Hence our Dakota, sitting in the bright green room,<br />

should assign equal credence to being Dakota number n and Dakota number n+1 for<br />

any n. But this is absurd. Since she can see that her walls are green, she should assign<br />

high credence to being Dakota number 0, and credence 0 to being Dakota number<br />

500.<br />

The problem here is that Elga wants to define an equivalence relation on predicaments,<br />

the relation deserving the same credence as, out of an intransitive relation, being<br />

indistinguishable from. There are two possible responses, each of them perfectly defensible.<br />

First, Elga could deny the premise that the adjacent cells are indistinguishable. Although<br />

there is some prima facie plausibility to the claim that some different colours<br />

are indistinguishable, Delia Graff Fara (2001) has argued that this is false. It would<br />

mean committing to yet another controversial philosophical position, but if Elga endorsed<br />

Graff’s claims, he could easily deal with Dakota.<br />

Secondly, he could tinker with the definition of similarity. Instead of saying that<br />

possibilia represent similar predicaments iff they are indistinguishable worldmates,



he could say that they represent similar predicaments iff they are worldmates that are<br />

indistinguishable from the same predicaments. (This kind of strategy for generating<br />

an equivalence relation from an intransitive relation is borrowed from Goodman<br />

(1951).) Even if adjacent cells are indistinguishable from each other, they will not<br />

be indistinguishable from the same cells. This delivers the plausible result that the<br />

duplicate Dakotas stuck in the cells do not instantiate similar predicaments. Some<br />

might object that this move is ad hoc, but once we realise the need to make similar<br />

an equivalence relation, it seems clear enough that this is the most natural way to do<br />

that.<br />
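The second repair can be made concrete with a small computation (the cell count and the tolerance relation are stipulated for illustration). Although being indistinguishable from is intransitive, being indistinguishable from the same cells is an equivalence relation, and with 1000 cells it relates each cell only to itself:

```python
# 1000 coloured cells; adjacent cells (cyclically) cannot be told apart.
N = 1000

def indist(n, m):
    # the stipulated intransitive tolerance relation
    return min((n - m) % N, (m - n) % N) <= 1

def matchset(n):
    # the set of cells that cell n is indistinguishable from
    return frozenset(m for m in range(N) if indist(n, m))

# indist is intransitive ...
assert indist(0, 1) and indist(1, 2) and not indist(0, 2)
# ... but "same matchset" is an equivalence relation, and here it
# collapses to identity: distinct cells always have distinct matchsets.
assert all(matchset(n) != matchset(m)
           for n in range(5) for m in range(5) if n != m)
```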

5 Morgan and Morgan and Morgan and Morgan<br />

I think I outdid myself this time, said Dr Evil. I was just going along duplicating you,<br />

or at least someone like you, and the duplication process was taking less and less time.<br />

So I thought, I wonder what is the lower bound here? How quick can we make the<br />

duplication process? So I tried a few things to cut down the time it took, and I got<br />

a little better with practice, and, well, it turns out that the time taken can be made<br />

arbitrarily small. Before I knew it, there were infinitely many of you. Oops.<br />

Morgan was a little shocked. She could cope with having a duplicate or two<br />

around, but having infinitely many duplicates was a little hard to take. On the other<br />

hand, and this was hard to think about, perhaps she should be grateful. Maybe she<br />

was one of the later ones created, and she wouldn’t have existed if not for Evil’s<br />

irrational exuberance. She started to ponder how likely that was, but she was worried<br />

that it required knowing more about Evil than any mortal could possibly know.<br />

Well, continued Dr Evil, I did one thing right. As each duplicate was created I<br />

gave it a serial number, 0 for the original Morgan, 1 for the first duplicate and so on,<br />

so the bookkeeping will be easier. Don’t go looking for it, it’s written on your left<br />

leg in ectoplasmic ink, and you won’t be able to see it.<br />

Now that makes things easier, thought Morgan. By INDIFFERENCE the probability<br />

that my serial number is x is 1/n, where n is the number of duplicates created.<br />

So dividing 1 by infinity, that’s zero. So the probability that my serial number is<br />

less than x is the probability that it’s zero plus the probability that it’s one plus . . .<br />

plus the probability that it’s x, that’s still zero. So if he had stopped after x for any<br />

x, I would not exist with probability one. I’m liking Evil more and more, though<br />

something bothers me about that calculation.<br />

Morgan was right to worry. She’s just talked herself, with Elga’s help, into a<br />

violation of the principle of countable additivity. The additivity axiom in standard<br />

probability theory says that for any two disjoint propositions, the probability of<br />

their disjunction is the sum of their probabilities. The countable additivity axiom<br />

says that for any countable set of disjoint propositions, the probability that at least<br />

one of them is true is the sum of each of their probabilities. (It follows from the<br />

axioms of probability theory that this sum is always defined.) Here we have to alter<br />

these axioms slightly so they apply to properties rather than propositions, but still<br />

the principle of countable additivity seems plausible. But Morgan has to violate it.<br />

The probability she assigns to having some serial number or other is not zero, in fact



it is one as long as she takes Evil at his word. But for each x, the probability that her<br />

serial number is x is zero. In symbols, we have<br />

• Pr(∃x (Serial number = x)) = 1<br />

• Σ_x Pr(Serial number = x) = 0<br />

But countable additivity says that these values should be equal.<br />

Orthodoxy endorses countable additivity, but there are notable dissenters that are<br />

particularly relevant here. Bruno de Finetti (1974) argued that countable additivity<br />

should be rejected because it rules out the possibility of an even distribution across<br />

the natural numbers. De Finetti thought, as Morgan does, that we could rationally be<br />

in a position where we know of a particular random variable only that its value is a<br />

non-negative integer, and for every x, we assign equal probability to each hypothesis<br />

that its value is x. Since that is inconsistent with countable additivity, all the worse for<br />

countable additivity. This is a decent argument, though as de Finetti himself noted,<br />

it has some counterintuitive consequences.<br />
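Morgan's calculation can be run at a finite stage and then pushed to the limit (the numbers below are hypothetical). With n duplicates, each serial number has probability 1/n, so for any fixed x both Pr(serial = x) and Pr(serial < x) vanish as n grows, while the total mass at every finite stage is exactly 1; the limiting ‘uniform distribution over the naturals’ is therefore finitely but not countably additive:

```python
from fractions import Fraction

# Finite-stage version of Morgan's reasoning: n duplicates, each
# serial number equally likely.
def p_serial_equals(x, n):
    return Fraction(1, n)

def p_serial_below(x, n):
    # Pr(serial < x) as a finite sum of point probabilities
    return sum(p_serial_equals(i, n) for i in range(x))

for n in (10, 10**3, 10**6):
    assert p_serial_below(57, n) == Fraction(57, n)  # -> 0 as n grows
# yet at every stage the point probabilities sum to exactly 1
assert sum(p_serial_equals(i, 10**3) for i in range(10**3)) == 1
```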

I decided, Dr Evil continued, to do something fairly spectacular with all these<br />

people. By some small tinkering with your physiology I found a way to make you<br />

immortal. Unfortunately, a quick scan of your psychology revealed that you weren’t<br />

capable of handling eternity. So every fifty years I will wipe all your memories and<br />

return you to the state you were in when duplicated. I will write, or perhaps I did<br />

write, on your right leg the number of times that your memories have been thus<br />

wiped. Don’t look, it’s also in ectoplasmic ink. Just to make things fun, I made<br />

enough duplicates of myself so that every fifty years I can tell you what happened.<br />

Each fifty-year segment of each physical duplicate will be an epistemic duplicate of<br />

every other such segment. How cool is that? 3<br />

Morgan was not particularly convinced that it was cool, but an odd thought<br />

crossed her mind once or twice. She had one number L written on her left leg,<br />

and another number R written on her right leg. She had no idea what those numbers<br />

were, but she thought she might be in a position to figure out the odds that L ≥ R.<br />

So she started reasoning as follows, making repeated appeals to C-INDIFFERENCE.<br />

(She must also appeal to P-INDIFFERENCE at every stage if there are other propositions<br />

about which she is uncertain. Assume that appeal made.)<br />

Let’s say the number on my left leg is 57. Then L ≥ R iff R < 58. But since there<br />

are 58 ways for R < 58 to be true, and infinitely many ways for R < 58 to be false, and<br />

by C-INDIFFERENCE each of these ways deserves the same credence conditional on<br />

L = 57, we get Pr(L ≥ R | L = 57) = 0. But 57 was arbitrary in this little argument,<br />

so I can conclude ∀l: Pr(L ≥ R | L = l) = 0. This seems to imply that Pr(L ≥ R) = 0,<br />

especially since I know L takes some value or other, but let’s not be too hasty.<br />

Let’s say the number on my right leg is 68. Then L ≥ R iff L ≥ 68. And since<br />

there are 68 ways for L ≥ 68 to be false, and infinitely many ways for it to be true, and<br />

by C-INDIFFERENCE each of these ways deserves the same credence conditional on<br />

3 Evil’s plan resembles in many respects a situation described by Jamie Dreier in his “Boundless Good”.<br />

The back story is a little different, but the situation is closely (and intentionally) modelled on his sphere<br />

of pain/sphere of pleasure example.


Should We Respond to Evil With Indifference? 505<br />

R = 68, we get Pr(L ≥ R | R = 68) = 1. But 68 was arbitrary in this little argument,<br />

so I can conclude ∀r: Pr(L ≥ R | R = r) = 1. This seems to imply that Pr(L ≥ R) =<br />

1, especially since I know R takes some value or other, but now I’m just confused.<br />
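Morgan's two computations can be mimicked on finite truncations (a sketch of mine, not the paper's: I assume L and R are uniform on {0, ..., N} and let N grow):

```python
from fractions import Fraction

def pr_given_L(l, N):
    """Pr(L >= R | L = l) when R is uniform on {0, ..., N}: R must lie in {0, ..., l}."""
    return Fraction(l + 1, N + 1)

def pr_given_R(r, N):
    """Pr(L >= R | R = r) when L is uniform on {0, ..., N}: L must lie in {r, ..., N}."""
    return Fraction(N - r + 1, N + 1)

# On any finite truncation the two computations are consistent, and
# conglomerability holds for the (finite) partition. But as N grows without
# bound, Pr(L >= R | L = 57) -> 0 while Pr(L >= R | R = 68) -> 1, which is
# how Morgan ends up pulled in both directions at once.
low = pr_given_L(57, 10**6)    # 58 / 1000001, near 0
high = pr_given_R(68, 10**6)   # 999933 / 1000001, near 1
```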

Morgan is right to be confused. She has not quite been led into inconsistency,<br />

because, as she notes, the last step, from ∀l: Pr(L ≥ R | L = l) = 0 to Pr(L ≥ R) =<br />

0, is not forced. In fact, the claim that this is always a valid inferential step is equivalent<br />

to the principle of countable additivity, which we have already seen a proponent<br />

of INDIFFERENCE in all its variations must reject. But it would be a mistake to<br />

conclude from this that we just have a standoff. What Morgan’s case reveals is that<br />

accepting the indifference principles that Elga offers requires giving up on an intuitively<br />

plausible principle of inference. That principle says that if the probability of<br />

p conditional on any member of a partition is x, then the probability of p is x. If we<br />

think that principle of inference is prima facie more plausible than Elga’s principle of<br />

indifference, as I think we should, that is pretty good prima facie evidence that Elga’s<br />

principle is wrong.<br />

The next three sections will be devoted to determining whether we can convert<br />

this persuasive argument into a knockdown argument (we cannot) and whether Elga’s<br />

arguments in favour of INDIFFERENCE do enough to overcome this prima facie<br />

argument that INDIFFERENCE is flawed (they do not). A concluding section notes<br />

how to redo this argument so it appeals only to potential rather than actual infinities.<br />

Intermission<br />

CHARYBDIS: I know how to make that argument stronger. Just get Evil to offer<br />

Morgan a bet on whether L ≥ R. Ask how much she’ll pay for a bet that pays<br />

€1 if L ≥ R and nothing otherwise. If she pays anything for it, tell her the value<br />

of L, whatever it is, and ask her if she’d like to sell that bet back for half what<br />

she paid for it. Since she now assigns probability zero to L ≥ R she’ll happily do<br />

that, and then she’ll have lost money. If she won’t pay anything for the bet to<br />

start with, offer her the reverse bet. She should pay €1 for that, and now apply<br />

the same tactics except tell her the value of R rather than L. Either way the stupid<br />

person will lose money.<br />

SCYLLA: Very practical, Charybdis, but we’re not sure it gets to the heart of the<br />

matter. Not sure. Well, let us say why rather than leaving it like that. For one<br />

thing, Morgan might not like playing dice with Evil, even if Evil is the source of<br />

her life. So she might have a maximum price of 0 for either bet.<br />

CHARYBDIS: But then surely she’ll be turning down a sure win. I mean between<br />

the bets she has a sure gain of at least €1.<br />

SCYLLA: And if she is offered both bets at once we’re sure she would take that gain,<br />

but as we heard your story she wasn’t. 4<br />

CHARYBDIS: So does this mean her degree of belief in both R ≥ L and L ≥ R is 0?<br />

SCYLLA: It might mean that, and of course some smart people have argued that that<br />

is coherent, much to the chagrin of your Bayesian friends we’re sure. 5 But more<br />

likely it means that she just isn’t following the patterns of practical reasoning<br />

4 Compare the objection to Dutch Book arguments in Schick (1986).<br />

5 For example, Shafer (1976).



that you endorse. 6 Also, we’re not so sure about the overall structure of the<br />

argument. We think your reasoning is as follows. Morgan ends up doing something<br />

silly, giving up money. (Well, we’re not sure that’s always silly, but let’s say<br />

it is here.) So something went wrong. So she has silly beliefs. That last step goes<br />

by fairly fast we think. From her making some mistake or other, we can only<br />

conclude that, well, she made some mistake or other, not that she made some<br />

particular mistake in the composition of her credences. 7<br />

CHARYBDIS: What other mistake might she have made?<br />

SCYLLA: There are many hidden premises in your chains of reasoning to conclusions<br />

about how Morgan should behave. For instance, she only values a €1 bet on L<br />

≥ R at Pr(L ≥ R) if she knows she can’t buy that bet more cheaply elsewhere,<br />

or sell it for a larger price elsewhere. Even if those assumptions are true, Morgan<br />

may unreasonably believe they are false, and that might be her mistake. 8 But<br />

even that isn’t our main concern. Our main concern is that you understate how<br />

bad Morgan’s position is.<br />

CHARYBDIS: What’s worse for a mortal than assured loss of money?<br />

SCYLLA: Morgan is not a mortal any more, you know. And immortals, we’re afraid,<br />

are almost bound to lose money to clever enough tricksters. Indeed, a so-called<br />

Dutch Book can be made against any agent that (a) has an unbounded utility<br />

function and (b) is not overly opinionated, so that there are still infinitely many ways<br />

the world could be that are consistent with their knowledge. 9 That includes us, and you<br />

dear Charybdis. And yet we are not as irrational as that Morgan. I don’t think<br />

analogising her position to ours really strengthens the case that she is irrational.<br />

CHARYBDIS: Next you might say that making money off her, this undeserving immortal,<br />

is immoral.<br />

SCYLLA: Perish the thoughts.<br />

6 Risky Business?<br />

There are two kinds of reasons to dislike indifference principles, both of them developed<br />

most extensively in Keynes (1921). The first, which we have been exploring a<br />

bit so far, is that such principles tend to lead to incoherence. The second is that such<br />

principles promote confusion between risk and uncertainty.<br />

Often we do not know exactly what the world is like. But not all kinds of ignorance<br />

are alike. Sometimes, our ignorance is like that of a roulette player facing a<br />

fair wheel about to be spun. She knows not what will happen, but she can provide<br />

good reasons for assigning equal credence to each of the 37 possible outcomes of the<br />

spin. Loosely following Frank Knight (1921), we will say that a proposition like The<br />

6 Compare the state-dependent approach to decision-making discussed in Chambers and Quiggin<br />

(2000).<br />

7 This point closely resembles an objection to Dutch Book reasoning made in Hájek (2005), though<br />

Scylla is much more sceptical about how much we can learn from these pragmatic arguments than Hájek<br />

is.<br />

8 Scylla’s reasoning here is based on Milne (1991), though of course Milne’s argument is much less<br />

condensed than that.<br />

9 This is proven in McGee (1999).



ball lands in slot number 18 is risky. The distinguishing feature of such propositions is<br />

that we do not know whether they are true or false, but we have good reason to assign<br />

a particular probability to their truth. Other propositions, like say the proposition<br />

that there will be a nuclear attack on an American city this century, are quite unlike<br />

this. We do not know whether they are true, and we aren’t really in a position to<br />

assign anything like a precise numerical probability to their truth. Again following<br />

Knight, we will say such propositions are uncertain. In (1937b) Keynes described a<br />

number of other examples that nicely capture the distinction being drawn here.<br />

By ‘uncertain’ knowledge, let me explain, I do not mean merely to distinguish<br />

what is known for certain from what is only probable. The game<br />

of roulette is not subject, in this sense, to uncertainty; nor is the prospect<br />

of a Victory bond being drawn. Or, again, the expectation of life is only<br />

slightly uncertain. Even the weather is only moderately uncertain. The<br />

sense in which I am using the term is that in which the prospect of a<br />

European war is uncertain, or the price of copper and the rate of interest<br />

twenty years hence, or the obsolescence of a new invention, or the<br />

position of private wealth owners in the social system in 1970. About<br />

these matters there is no scientific basis on which to form any calculable<br />

probability whatever. We simply do not know. Nevertheless, the necessity<br />

for action and decision compels us as practical men to do our best to<br />

overlook this awkward fact and to behave exactly as we should if we had<br />

behind us a good Benthamite calculation of a series of prospective advantages<br />

and disadvantages, each multiplied by its appropriate probability,<br />

waiting to be summed. (Keynes, 1937b, 114-115)<br />

Note that the distinction between risky and uncertain propositions is not the distinction<br />

between propositions whose objective chance we know and those that we don’t.<br />

This identification would fail twice over. First, as Keynes notes, whether a proposition<br />

is risky or uncertain is a matter of degree, but whether we know something<br />

is, I presume, not a matter of degree. 10 Second, there are risky propositions with an<br />

unknown chance. Assume that our roulette player turns away from the table at a<br />

crucial moment, and misses the ball landing in a particular slot. Now the chance that<br />

it lands in slot 18 is 1 (if it did so land) or 0 (otherwise), and she does not know which.<br />

Yet typically, the proposition The ball lands in slot 18 is still risky for her, for she has<br />

no reason to change her attitude towards the proposition that it did land in slot 18.<br />

My primary theoretical objection to INDIFFERENCE is that the propositions<br />

it purports to provide guidance on are really uncertain, but it treats them as risky.<br />

Once we acknowledge the risk/uncertainty distinction, it is natural to think that our<br />

default state is uncertainty. Getting to a position where we can legitimately treat a<br />

proposition as risky is a cognitive achievement. Traditional indifference principles<br />

fail because they trivialise this achievement. An extreme version of such a principle<br />

says we can justify assigning a particular numerical probability, 0.5, to propositions<br />

merely on the basis of ignorance of any evidence telling for or against it. This might<br />

10 Though see Hetherington (2001) for an argument to the contrary.



not be an issue to those who think that “probability is a measure of your ignorance.”<br />

(Poole et al., 1998, 348) But to those of us who think probability is the very guide<br />

to life, such a position is unacceptable. It seems to violate the platitude ‘garbage in,<br />

garbage out’ since it takes ignorance as input, and produces a guide to life as output.<br />

INDIFFERENCE is more subtle than these traditional indifference principles, but<br />

this theoretical objection remains. The evidence that O’Leary or Morgan or Leslie<br />

has does not warrant treating propositions about their location or identity as risky<br />

rather than uncertain. When they must make decisions that turn on their identity or<br />

location, this ignorance provides little or no guidance, not a well-sharpened guide to<br />

action.<br />

In this section I argue that treating these propositions as uncertain lets us avoid<br />

the traps that Morgan falls into. In the next section I argue that the case Elga takes to<br />

support INDIFFERENCE says nothing to the theorist who thinks that the INDIF-<br />

FERENCE principle conflates risk and uncertainty. In fact, some features of that<br />

case seem to support the claim that the propositions covered by INDIFFERENCE<br />

are uncertain, not risky.<br />

In (1921), Keynes put forward a theory of probability that was designed to respect<br />

the distinction between risky propositions and uncertain propositions. He allowed<br />

that some propositions, the risky ones and the ones known to be true or false, had a<br />

numerical probability (relative to a body of evidence) while other propositions have<br />

non-numerical probabilities. Sometimes numerical and non-numerical probabilities<br />

can be compared, sometimes they cannot. Arithmetic operations are all assumed to<br />

be defined over both numerical and non-numerical probabilities. As Ramsey (1926)<br />

pointed out, in Keynes’s system it is hard to know what α + β is supposed to mean<br />

when α and β are non-numerical probabilities, and it is not even clear that ‘+’ still<br />

means addition in the sense we are used to.<br />

One popular modern view of probability can help Keynes out here. Following<br />

Ramsey, many people came to the view that the credal states of a rational agent could<br />

be represented by a probability function, that function being intuitively the function<br />

from propositions into the agent’s degree of belief in that proposition. In the<br />

last thirty years, there has been a lot of research on the theory that says we should<br />

represent rational credal states not by a single probability function, but by a set of<br />

such probability functions. Within philosophy, the most important works on this<br />

theory are by Henry Kyburg (1974), Isaac Levi (1974, 1980), Richard Jeffrey (1983a)<br />

and Bas van Fraassen (1990). What is important here about this theory is that many<br />

distinctive features of Keynes’s theory are reflected in it.<br />

Let S be the set of probability functions representing the credal states of a rational<br />

agent. Then for each proposition p we can define a set S(p) = {Pr(p): Pr ∈ S}. That<br />

is, S(p) is the set of values that Pr(p) takes as Pr ranges over the probability functions in S. We<br />

will assume here that S(p) is an interval. (See the earlier works cited for the arguments<br />

in favour of this assumption.) When p is risky, S(p) will be a singleton, the singleton<br />

of the number we have compelling reason to say is the probability of p. When p is a<br />

little uncertain, S(p) will be a fairly narrow interval. When it is very uncertain, S(p)<br />

will be a wide interval, perhaps as wide as [0, 1]. We say that p is more probable<br />

than q iff for all Pr in S, Pr(p) > Pr(q), and as probable as q iff for all Pr in S, Pr(p) =



Pr(q). This leaves open the possibility that Keynes explicitly left open, that for some<br />

uncertain proposition p and some risky proposition q, it might be the case that they<br />

are not equally probable, but neither is one more probable than the other. Finally,<br />

we assume that when an agent whose credal states are represented by S updates by<br />

learning evidence e, her new credal states are updated by conditionalising each of the<br />

probability functions in S on e. So we can sensibly talk about S(p | e), the set {Pr(p<br />

| e): Pr ∈ S}, and this represents her credal states on learning e.<br />
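The machinery can be sketched concretely (a toy model of mine with finite worlds and a credal set given as a list of distributions; the notation is not from the cited authors):

```python
def event_prob(pr, event):
    """Pr(event) for a distribution pr, given as a dict from worlds to probabilities."""
    return sum(p for w, p in pr.items() if w in event)

def S_of(S, event):
    """S(p): the set of values Pr(p) takes as Pr ranges over the credal set S."""
    return {event_prob(pr, event) for pr in S}

def update(S, evidence):
    """S(. | e): conditionalise every member of S on the evidence."""
    new_S = []
    for pr in S:
        pe = event_prob(pr, evidence)
        if pe > 0:
            new_S.append({w: (p / pe if w in evidence else 0.0)
                          for w, p in pr.items()})
    return new_S

# A two-member credal set over three worlds. The proposition true at w1
# alone is uncertain (S(p) is not a singleton); the one true at w3 is risky.
S = [{"w1": 0.2, "w2": 0.3, "w3": 0.5},
     {"w1": 0.4, "w2": 0.1, "w3": 0.5}]
uncertain = S_of(S, {"w1"})     # {0.2, 0.4}
risky = S_of(S, {"w3"})         # {0.5}
```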

(It is an interesting historical question just how much the theory sketched here<br />

agrees with the philosophical motivations of Keynes’s theory. One may think that<br />

the agreement is very close. If we take Keynes’s entire book to be a contextual definition<br />

of his non-numerical probabilities, a reading encouraged by Lewis (1970b), then<br />

we should conclude he was talking about sets like this, with numerical probabilities<br />

being singleton sets.)<br />

This gives us the resources to provide good advice to Morgan. Pick a monotone<br />

increasing function f from integers to [0, 1] such that as n → ∞, f (n) → 1. It won’t<br />

really matter which function you pick, though different choices of f might make the<br />

following story more plausible. Say that S(L ≥ R | L = l) = [0, f (l)]. The rough idea<br />

is that if L is small, then it is quite improbable that L ≥ R, although this is a little<br />

uncertain. As l gets larger, L ≥ R gets more and more uncertain. The overall effect<br />

is that we simply do not know what S(L ≥ R) will look like after conditionalising on<br />

the value of L, so we cannot apply the kind of reasoning Morgan uses to now come<br />

to some conclusions about the probability of L ≥ R.<br />
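As a concrete sketch, one admissible choice (mine, purely illustrative) is f(n) = n/(n+1):

```python
def f(n):
    """One monotone increasing choice with f(n) -> 1 as n -> infinity."""
    return n / (n + 1)

def S_given_L(l):
    """S(L >= R | L = l) = [0, f(l)]: an interval, not a single number."""
    return (0.0, f(l))

# Conditional on a small L, L >= R is quite improbable though a little
# uncertain; conditional on a large L, it is almost maximally uncertain.
narrow = S_given_L(3)      # (0.0, 0.75)
wide = S_given_L(9999)     # upper endpoint close to 1
```

Since conditionalising on L = l leaves an interval rather than a point, Morgan cannot run the conglomerability-style step that generated her contradiction.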

If we view the situations described by INDIFFERENCE as involving uncertainty<br />

rather than risk, this is exactly what we should expect. And note that in so doing,<br />

we need not undermine the symmetry intuition that lies behind INDIFFERENCE.<br />

Assume that F and G are similar predicaments, and I know that I am either F or G.<br />

INDIFFERENCE says I should assign equal probability to each, so S(I am F) = S(I<br />

am G) = {0.5}. But once we’ve seen how attractive non-numerical probabilities can<br />

be, we should conclude that all symmetry gives us is that S(I am F) = S(I am G),<br />

which can be satisfied if each is [0.4, 0.6], or [0.2, 0.8] or even [0, 1]. (I think that<br />

for O’Leary, for example, S(It is 1 o’clock) should be a set somehow like this.) Since I<br />

would not be assigning equal credence to I am F and I am G if I satisfied symmetry using<br />

non-numerical probabilities, so I will violate INDIFFERENCE without treating<br />

the propositions asymmetrically. Such a symmetric violation of INDIFFERENCE<br />

has much to recommend it. It avoids the incoherence that INDIFFERENCE leads<br />

to in Morgan’s case. And it avoids saying that ignorance about our identity can be a<br />

sharp guide to life. 11<br />

11 Bradley Monton (2002) discusses using sets of probability functions to solve another problem proposed<br />

by Elga, the Sleeping Beauty problem (Elga, 2000a). Monton notes that if Beauty’s credence in<br />

The coin landed heads is [0, 0.5] when she wakes up on Monday, then she doesn’t violate van Fraassen’s<br />

General Reflection Principle (van Fraassen, 1995). (I assume here familiarity with the Sleeping Beauty<br />

problem.) Monton has some criticisms of this move, in particular the consequences it has for updating,<br />

that don’t seem to carry across to the proposal sketched here. But his discussion is noteworthy as a use of<br />

this approach to uncertainty as a way to solve problems to do with similar predicaments.



A referee noted that the intuitive characterisation here doesn’t quite capture the<br />

idea that we should treat similar predicaments alike. The requirement that if F and G<br />

are similar then S(I am F) = S(I am G) does not imply that there will be a symmetric<br />

treatment of F and G within S if there are more than two similar predicaments. What<br />

we need is the following condition. Let T be any set of similar predicaments, g any<br />

isomorphism from T onto itself, and Pr any probability function in S. Then there<br />

exists a Pr ′ in S such that for all A in T, Pr(A) = Pr ′ (g(A)). When there are only two<br />

similar predicaments A and B this is equivalent to the requirement that S(A) = S(B),<br />

but in the general case it is a much stricter requirement. Still, it is a much weaker<br />

constraint than INDIFFERENCE, and not vulnerable to the criticisms of INDIF-<br />

FERENCE set out here.<br />
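The stricter condition can be checked mechanically on toy credal sets (a sketch; for a finite set of predicaments an isomorphism from T onto itself is just a permutation):

```python
from itertools import permutations

def satisfies_strict_symmetry(S, T):
    """For every permutation g of T and every Pr in S, some Pr' in S must
    satisfy Pr(A) = Pr'(g(A)) for all A in T. Each Pr is a dict on T."""
    for g in permutations(T):
        mapping = dict(zip(T, g))
        for pr in S:
            if not any(all(pr[A] == pr2[mapping[A]] for A in T) for pr2 in S):
                return False
    return True

T = ("F", "G", "H")
# Only the three cyclic rotations of (0.5, 0.3, 0.2): every predicament gets
# the same value set, so S(I am F) = S(I am G) = S(I am H), yet the set is
# not closed under all six permutations, so the stricter condition fails.
cyclic = [{"F": 0.5, "G": 0.3, "H": 0.2},
          {"F": 0.3, "G": 0.2, "H": 0.5},
          {"F": 0.2, "G": 0.5, "H": 0.3}]
# All six orderings: this set treats the three predicaments fully symmetrically.
full = [dict(zip(T, vals)) for vals in permutations((0.5, 0.3, 0.2))]
```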

7 Boyfriend in a Coma<br />

Elga argues for INDIFFERENCE by arguing it holds in a special case, and then arguing<br />

that the special case is effectively arbitrary, so if it holds there it holds everywhere.<br />

The second step is correct, so we must look seriously at the first step. Elga’s<br />

conclusions about the special case, DUPLICATION, eventually rest on treating an<br />

uncertain proposition as risky.<br />

DUPLICATION After Al goes to sleep researchers create a duplicate of him in a<br />

duplicate environment. The next morning, Al and the duplicate awaken in<br />

subjectively indistinguishable states.<br />

Assume (in all these cases) that before Al goes to sleep he knows the relevant facts of<br />

the case. In that case INDIFFERENCE 12 dictates that when Al wakes up his credence<br />

in I am Al should be 0.5. Elga argues this dictate is appropriate by considering a pair<br />

of related cases.<br />

TOSS&DUPLICATION After Al goes to sleep, researchers toss a coin that has<br />

a 10% chance of landing heads. Then (regardless of the toss outcome) they<br />

duplicate Al. The next morning, Al and the duplicate awaken in subjectively<br />

indistinguishable states.<br />

Elga notes, correctly, that the same epistemic norms apply to Al on waking in DU-<br />

PLICATION as in TOSS&DUPLICATION. So if we can show that when Al wakes<br />

in TOSS&DUPLICATION his credence in I am Al should be 0.5, that too will suffice<br />

to prove INDIFFERENCE correct in this case. The argument for that claim has<br />

three premises. (I’ve slightly relabelled the premises for ease of expression.)<br />

(1) Pr(H) = 0.1<br />

(2) Pr(H | (H ∧ A) ∨ (T ∧ A)) = 0.1<br />

(3) Pr(H | (H ∧ A) ∨ (T ∧ D)) = 0.1<br />

12 As with earlier cases, strictly speaking we need C-INDIFFERENCE and P-INDIFFERENCE to draw<br />

the conclusions suggested unless Al is somehow certain about all other propositions. I will ignore that<br />

complication here, and in §9.



Here Pr is the function from de se propositions to Al’s degree of belief in them, H =<br />

The coin lands heads, T = The coin lands tails, A = I am Al and D = I am Al’s duplicate.<br />

From (1), (2) and (3) and the assumption that Pr is a probability function it follows<br />

that Pr(A) = 0.5, as required. This inference goes through even in the Keynesian<br />

theory that distinguishes risk from uncertainty. Premise (1) is uncontroversial, but<br />

both (2) and (3) look dubious. Since the argument for (3) would, if successful, support<br />

(2), I’ll focus, as Elga does, on (3). The argument for it turns on another case.<br />
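Before turning to that case, the arithmetic of the inference can be checked. Write a, b, c, d for the probabilities of the four exclusive possibilities H ∧ A, H ∧ D, T ∧ A and T ∧ D (a decomposition I impose for the check; Elga does not state it in this form):

```python
from fractions import Fraction

# (2) says a / (a + c) = 1/10, so c = 9a; (3) says a / (a + d) = 1/10, so
# d = 9a. (1) says a + b = 1/10, and the four cells sum to 1, so
# c + d = 18a = 9/10, giving a = 1/20.
a = Fraction(1, 20)
b = Fraction(1, 10) - a
c = 9 * a
d = 9 * a

pr_H = a + b    # premise (1): 1/10
pr_A = a + c    # Pr(I am Al) = 1/20 + 9/20 = 1/2, as the argument requires
```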

COMA As in TOSS&DUPLICATION, the experimenters toss a coin and duplicate<br />

Al. But the following morning, the experimenters ensure that only one person<br />

wakes up: If the coin lands heads, they allow Al to wake up (and put the duplicate<br />

into a coma); if the coin lands tails, they allow the duplicate to wake up<br />

(and put Al into a coma).<br />

(It’s important that no one comes out of this coma, so assume that the victim gets<br />

strangled.)<br />

Elga then argues for the following two claims. If in COMA Al gets lucky and<br />

pulls through, his credence in H should be 0.1, as it was before he entered the dream<br />

world. Al’s credence in H in COMA should be the same as his conditional credence in H given (H ∧ A) ∨ (T ∧ D) in<br />

TOSS&DUPLICATION. The second premise looks right, so the interest is on what<br />

happens in COMA. Elga argues as follows (notation slightly changed):<br />

Before Al was put to sleep, he was sure that the chance of the coin landing<br />

heads was 10%, and his credence in H should have accorded with this<br />

chance: it too should have been 10%. When he wakes up, his epistemic<br />

situation with respect to the coin is just the same as it was before he went<br />

to sleep. He has neither gained nor lost information relevant to the toss<br />

outcome. So his degree of belief in H should continue to accord with the<br />

chance of H at the time of the toss. In other words, his degree of belief<br />

in H should continue to be 10%.<br />

And this, I think, is entirely mistaken. Al has no evidence that his evidence is relevant<br />

to H, but absence of evidence is not evidence of absence. Four considerations support<br />

this conclusion.<br />

First, Al gets some evidence of some kind or other on waking. Certain colours<br />

are seen, certain pains and sensations are sensed, certain fleeting thoughts fleet across<br />

his mind. Before he sleeps Al doesn’t know what these shall be. Maybe he thinks of<br />

the money supply, maybe of his girlfriend, maybe of his heroine, maybe of kidneys.<br />

He doesn’t know that the occurrence of these thoughts is probabilistically independent<br />

of his being Al rather than Dup, so he does not know they are probabilistically<br />

independent of H. So perhaps he need not retain the credence in H he has before he<br />

was drugged. Even if this evidence looks like junk, we can’t rule out that it has some<br />

force.<br />

Secondly, the kind of internalism about evidence needed to support Elga’s position<br />

is remarkably strong. (This is where the concerns raised in §3 become most



pressing.) Elga notes that he sets himself against both an extreme externalist position<br />

that says that Al’s memories and/or perceptions entail that he is Al and against an<br />

“intermediate view, according to which Al’s beliefs about the setup only partially undermine<br />

his memories of being Al. According to such a view, when Al wakes up his<br />

credence in H ought to be slightly higher than 10%.” But matters are worse than that.<br />

Elga must also reject an even weaker view that says that Al might not know whether<br />

externalism about evidence is true, so he does not know whether his credence in H<br />

should change. My view is more sympathetic to that position. When Al wakes, he<br />

does not know which direction is credences should move, or indeed whether there is<br />

such a direction, so his credence in H should be a spread of values including 0.1.<br />

Thirdly, Al’s position looks like cases where new evidence makes risky propositions<br />

uncertain. Mack’s betting strategy for the Gold Cup, a horse race with six<br />

entrants, is fairly simple. He rolls a fair die, and bets on whatever number comes up.<br />

Jane knows this is Mack’s strategy, but does not know how the die landed this time. Nor<br />

does she know anything about horses, so the propositions Horse n wins the Gold Cup<br />

are uncertain for Jane for each n. Call these propositions h n , and the proposition<br />

that Mack’s die landed n, d n . Right now, d 2 is risky, but h 2 is uncertain. Jane hears a<br />

party starting next door. Mack’s won. Jane has learned, inter alia, d 2 ↔ h 2 . Now it<br />

seems that d 2 , Mack’s die landed 2, inherits the uncertainty of h 2 , Horse number 2 won<br />

the Gold Cup. The formal theory of uncertainty I sketched allows for this possibility.<br />

It is possible that there be p, e such that S(p) is a singleton, while S(p | e) is a wide<br />

interval, in theory as wide as [0, 1]. This is what happens in Jane’s case, and it looks<br />

like it happens in Al’s case too. H used to be risky, but when he wakes he comes to<br />

learn H ↔ A, just as Jane learned d 2 ↔ h 2 . In each case, the left-hand clause of the<br />

biconditional inherits the uncertainty of the right-hand clause.<br />

Finally, H being uncertain for Al when he wakes in COMA is consistent with<br />

the intuition that Al has no reason to change his credences in H in one direction or<br />

another when he says goodbye to his duplicate. (Or, for all he knows, to his source.)<br />

Perhaps externalist theories of evidence provide some reason to raise these credences,<br />

as suggested above, but I do not rely on such theories. What I deny is that the absence<br />

of a reason to move one way or the other is a reason to stay put. Al’s credence in<br />

H might change in a way that reflects the fact H is now uncertain, just like A is in<br />

COMA, just like A is in TOSS&DUPLICATION, and, importantly, just like A is<br />

in DUPLICATION. I think the rest of Elga’s argument is right. DUPLICATION is<br />

a perfectly general case. In any such case, Al should be uncertain, in Keynes’s sense,<br />

whether he is the original or the duplicate.<br />

8 Shooting Dice can be Dangerous<br />

The good news, said Dr Evil, is that you are still mortal. Odysseus was not as upset<br />

as Dr Evil had expected. The bad news is that I’m thinking of torturing you. I’m<br />

going to roll this fair die, and if it lands 6 you will be tortured. If it does not, you will<br />

be (tentatively) released, and I’ll create two duplicates of you as you were when you<br />

entered this room, and repeat this story to both of them. Depending on another roll of this



fair die, I will either torture them both, or create two duplicates of each of them, and<br />

repeat the process until I get to torture someone. 13<br />

Odysseus thought through this for a bit. So I might be a duplicate you’ve just<br />

created, he said. I might not be Odysseus.<br />

You might not be, said Dr Evil, although so as to avoid confusion if you’re not<br />

him I’ll use his name for you.<br />

What happens if the die never lands 6, asked Odysseus. I’ve seen some odd runs<br />

of chance in my time.<br />

I wouldn’t be so sure of that, said Dr Evil. Anyway, that’s why I said I would<br />

tentatively release you. I’ll make the die rolls and subsequent duplication quicker and<br />

quicker so we’ll get through the infinite number of rolls in a finite amount of time. If<br />

we get that far I’ll just bring everyone back and torture you all. Aren’t I fair?<br />

Fairness wasn’t on Odysseus’s mind though. He was trying to figure out how<br />

likely it was that he would be tortured. He was also a little concerned about how<br />

likely it was that he was the original Odysseus, and if he was not whether Penelope<br />

too had been duplicated. As it turns out, his torturous computations would assist<br />

with the second question, though not the third. Two thoughts crossed his mind.<br />

I will be tortured if that die lands 6, which has a chance of 1 in 6, or if it never<br />

lands 6 again, which has a chance of 0. So the chance of my being tortured is 1 in 6. I<br />

have no inadmissible evidence, so the probability I should assign to torture is 1 in 6.<br />

Let’s think about how many Odysseuses there are in the history of the world.<br />

Either there is 1, in which case I’m him, and I shall be tortured. Or there are 3, in<br />

which case two of them shall be tortured, so the probability that I shall be tortured is<br />

2 in 3. Or there are 7, in which case four of them shall be tortured, so the probability<br />

that I shall be tortured is 4 in 7. And so on, it seems like the probability that I shall<br />

be tortured approaches 1 in 2 from above as the number of Odysseuses approaches<br />

infinity. Except, of course, in the case where it reaches infinity, when it is again<br />

certain that I shall be tortured. So it looks like the probability that I will be tortured<br />

is above 1 in 2. But I just concluded it is 1 in 6. Where did I go wrong?<br />

In his second thought, Odysseus appeals frequently to INDIFFERENCE. He<br />

then appeals to something like the conglomerability principle that tripped up Morgan.<br />

The principle Odysseus uses is a little stronger than the principle Morgan used.<br />

It says that if there is a partition such that, conditional on each member of the partition,<br />

the probability of p is greater than x, then the probability of p is greater than x. As<br />

we noted, this principle cannot be accepted in its full generality by one who rejects<br />

countable additivity. And one who accepts INDIFFERENCE must reject countable<br />

additivity. So where Odysseus goes wrong is in appealing to this inference principle<br />

after previously adopting an indifference principle inconsistent with it.<br />

This does not mean the case has no interest. Morgan’s case showed that when we<br />

have an actual infinity of duplicates, INDIFFERENCE can lead to counterintuitive<br />

results, and that the best way out might be to say that Morgan faced a situation of<br />

13 Dr Evil’s plans create a situation similar to the well known ‘shooting room’ problem. For the best<br />

analysis of that problem see Bartha and Hitchcock (1999). Dr Evil has changed the numbers involved in<br />

the puzzle a little bit to make the subsequent calculations a little more straightforward. He’s not very good<br />

at arithmetic, you see.


Should We Respond to Evil With Indifference? 514<br />

uncertainty, not one of risk. But it might have been thought that something special<br />

about Morgan’s case, that she has infinitely many duplicates, might be responsible<br />

for the problems here. So it may be hoped that INDIFFERENCE can at least be<br />

accepted in more everyday cases. Odysseus shows that hope is in vain. All we need<br />

is the merest possibility of there being infinitely many duplicates, here a possibility<br />

with zero probability, to create a failure of conglomerability. This suggests that the<br />

problems with INDIFFERENCE run relatively deep.<br />

The details of how Odysseus’s case plays out given INDIFFERENCE are also<br />

interesting, especially to those readers not convinced by my refutation of<br />
INDIFFERENCE. For their benefit, I will close with a few observations about how the<br />

case plays out.<br />

As in Morgan’s case, we can produce two different partitions of the possibility<br />

space that seem to support different conclusions about Odysseus’s prospects. Assume<br />

for convenience that Dr Evil assigns a serial number to each Odysseus he makes, the<br />

Homeric hero being number 1, the first two duplicates being 2 and 3, and so on. Let<br />

N stand for the number of our hero, M for the number of Odysseuses that are made,<br />

and T for the property of being tortured. Then given INDIFFERENCE it behoves<br />

Odysseus to have his credences governed by the following Pr function.<br />

(4) (a) ∀k Pr(T | M = 2^k − 1) = 2^(k−1)/(2^k − 1)<br />

(b) Pr(T | M = ∞) = 1<br />

(5) ∀n Pr(T | N = n) = 1/6<br />
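The counting behind (4a) can be checked mechanically. The following sketch (my own illustration, not part of the original argument) tracks the cohort of Odysseuses at risk on each roll:

```python
from fractions import Fraction

# On roll k, the cohort created most recently is tortured if the die lands 6.
# Cohort sizes double (two duplicates of each member), so if roll k is reached,
# made = 1 + 2 + 4 + ... + 2^(k-1) = 2^k - 1 Odysseuses exist in total.
cohort, made = 1, 1
for k in range(1, 8):
    chance = Fraction(cohort, made)  # Pr(T | M = 2^k - 1)
    assert made == 2**k - 1
    assert chance == Fraction(2**(k - 1), 2**k - 1)
    assert chance > Fraction(1, 2)   # always above 1/2, approaching it from above
    cohort, made = 2 * cohort, made + 2 * cohort
```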

Between 4a and 4b we cover all possible values for M, and in every case Pr(T) is<br />

greater than 1/2. More interesting are Odysseus’s calculations about whether he is<br />

the Homeric hero, i.e. about whether N = 1. Consider first a special case of this,<br />

what the value of Pr(N = 1 | N < 8) is. At first glance, it might seem that this should<br />

be 1/7, because there are seven possible values for N less than 8. But this is too quick.<br />

There are really eleven possibilities to be considered.<br />

F_1: N = 1 and M = 1    F_2: N = 1 and M = 3    F_5: N = 1 and M > 3<br />
                        F_3: N = 2 and M = 3    F_6: N = 2 and M > 3<br />
                        F_4: N = 3 and M = 3    F_7: N = 3 and M > 3<br />
                                                F_8: N = 4 and M > 3<br />
                                                F_9: N = 5 and M > 3<br />
                                                F_10: N = 6 and M > 3<br />
                                                F_11: N = 7 and M > 3<br />

By INDIFFERENCE, each of the properties in each column should be given equal<br />

probability. So we have<br />

x = Pr(F_1 | N < 8)<br />
y = Pr(F_2 | N < 8) = Pr(F_3 | N < 8) = Pr(F_4 | N < 8)<br />
z = Pr(F_5 | N < 8) = ··· = Pr(F_11 | N < 8)<br />

We just have to solve for x, y and z. By the Principal Principle we get



(6) Pr(M = 1 | N = 1) = 1/6<br />

∴ x = (x + y + z) / 6<br />

(7) Pr(M = 3 | N = 1 and M ≥ 3) = 1/6<br />

∴ y = (y + z) / 6<br />

And since these 11 possibilities are all the possibilities for N < 8, we have<br />

(8) x + 3y + 7z = 1<br />
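The system (6)-(8) can be solved by substitution. Here is a quick check of mine in exact arithmetic, which also evaluates the closed-form expression the text goes on to state:

```python
from fractions import Fraction

# From (7): 6y = y + z, so z = 5y.
# From (6): 6x = x + y + z = x + 6y, so x = (6/5)y.
# Substituting into (8): (6/5)y + 3y + 35y = (196/5)y = 1.
y = Fraction(5, 196)
x = Fraction(6, 5) * y
z = 5 * y
assert x == (x + y + z) / 6          # (6)
assert y == (y + z) / 6              # (7)
assert x + 3*y + 7*z == 1            # (8)
assert (x, z) == (Fraction(3, 98), Fraction(25, 196))
assert x + y + z == Fraction(9, 49)  # Pr(N = 1 | N < 8)

# The general formula from the text, checked at k = 2; it goes to 0 as k grows
def pr_n1(k):
    return Fraction(6**k, sum(6**i * 10**(k - i) for i in range(k + 1)))

assert pr_n1(2) == Fraction(9, 49)
assert pr_n1(50) < Fraction(1, 10**9)
```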

Solving for all these, we get x = 3/98, y = 5/196 and z = 25/196, so Pr(N = 1 |<br />

N < 8) = x + y + z = 9/49. More generally, we have the following (the proof of this<br />

is omitted):<br />

Pr(N = 1 | N < 2^(k+1)) = 6^k / (∑_{i=0}^{k} 6^i · 10^(k−i))<br />

Since the RHS → 0 as k → ∞, Pr(N = 1) = 0. Our Odysseus is probably not the<br />

real hero. Similar reasoning shows that Pr(N = n) = 0 for all n. So we have another<br />

violation of countable additivity. But we do not have, as in Morgan’s case, a constant<br />

distribution across the natural numbers. In a sense, this distribution is still weighted<br />

towards the bottom, since for any n > 1, Pr(N = 1 | N = 1 ∨ N = n) > 1/2. Of<br />

course, I don’t think INDIFFERENCE is true, so these facts about what Odysseus’s<br />

credence function will look like under INDIFFERENCE are of purely mathematical<br />

interest to me. But it may be that someone more enamoured of INDIFFERENCE<br />
can use this ‘unbalanced’ distribution to explain some of the distinctive<br />

features of the odd position that Odysseus is in.


The Bayesian and the Dogmatist<br />

There is a lot of philosophically interesting work being done in the borderlands between<br />

traditional and formal epistemology. It is easy to think that this would all be<br />

one-way traffic. When we try to formalise a traditional theory, we see that its hidden<br />

assumptions are inconsistent or otherwise untenable. Or we see that the proponents<br />

of the theory had been conflating two concepts that careful formal work lets us distinguish.<br />

Either way, the formalist teaches the traditionalist a lesson about what the live<br />

epistemological options are. I want to argue, more or less by example, that the traffic<br />

here should be two-way. By thinking carefully about considerations that move traditional<br />

epistemologists, we can find grounds for questioning some presuppositions<br />

that many formal epistemologists make.<br />

To make this more concrete, I’m going to be looking at a Bayesian objection to<br />

a certain kind of dogmatism about justification. Several writers have urged that the<br />

incompatibility of dogmatism with a kind of Bayesianism is a reason to reject dogmatism.<br />

I rather think that it is reason to question the Bayesianism. To put the<br />

point slightly more carefully, there is a simple proof that dogmatism (of the kind I<br />

envisage) can’t be modelled using standard Bayesian modelling tools. Rather than<br />

conclude that dogmatism is therefore flawed, I conclude that we need better modelling<br />

tools. I’ll spend a fair bit of this paper on outlining a kind of model that (a)<br />

allows us to model dogmatic reasoning, (b) is motivated by the epistemological considerations<br />

that motivate dogmatism, and (c) helps with a familiar problem besetting<br />

the Bayesian.<br />

I’m going to work up to that problem somewhat indirectly. I’ll start with looking<br />

at the kind of sceptical argument that motivates dogmatism. I’ll then briefly rehearse<br />

the argument that shows dogmatism and Bayesianism are incompatible. Then in<br />

the bulk of the paper I’ll suggest a way of making Bayesian models more flexible so<br />

they are no longer incompatible with dogmatism. I’ll call these new models dynamic<br />

Keynesian models of uncertainty. I’ll end with a brief guide to the virtues of my new<br />

kind of model.<br />

1 Sceptical Arguments<br />

Let H be some relatively speculative piece of knowledge that we have, say that G.<br />

E. Moore had hands, or that it will snow in Alaska sometime next year. And let<br />

E be all of our evidence about the external world. I’m not going to make many<br />

assumptions about what E contains, but for now E will stay fairly schematic. Now<br />

a fairly standard sceptical argument goes something like this. Consider a situation S<br />

in which our evidence is unchanged, but in which H is false, such as a brain-in-vat<br />

scenario, or a zombie scenario, or a scenario where the future does not resemble the<br />

past. Now a fairly standard sceptical argument goes something like this.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Proceedings<br />

of the Aristotelian Society 107(2007): 169-185.


The Bayesian and the Dogmatist 517<br />

1. To know H we have to be in a position to know we aren’t in S<br />

2. We aren’t in a position to know that we aren’t in S<br />

3. So, we don’t know H<br />

There are a few immediate responses one could make, but which I’m going to dismiss<br />

without argument here. These include claiming the setup is incoherent (as in,<br />

e.g., Williamson (2000a)), rejecting the closure principle behind premise 1 (as in, e.g.,<br />

Dretske (1971)), accepting the conclusion (the sceptical response), or saying that in<br />

different sceptical arguments, one or other of these positions is correct. Instead I<br />

want to look at responses that question premise 2. In particular, I want to look at<br />

responses that offer us reasons to accept premise 2, since it seems here that the sceptic<br />

is at her strongest. (If the sceptic merely insists that premise 2 is reasonable, we can<br />

reply either that it isn’t, as I’m inclined to think, or that here is a case where intuition<br />

should be revised.)<br />

Many epistemologists will write papers responding to ‘the sceptic’. I think this<br />

is a mistake, since there are so many different possible sceptics, each with different<br />

arguments for premise 2. (And, of course, some sceptics do not argue from sceptical<br />

scenarios like this one.) Here are, for instance, three arguments that sceptics might<br />

give for premise 2.<br />

Discrimination Argument<br />

1. Someone in S can’t discriminate her situation from yours.<br />

2. Indiscriminability is symmetric.<br />

3. If you can’t discriminate our situation from S, you can’t know you’re not in S.<br />

4. So you can’t know you’re not in S.<br />

Evidential (or Underdetermination) Argument<br />

1. Someone in S has the same evidence as you do.<br />

2. What you can know supervenes on what your evidence is.<br />

3. So, you can’t know you are not in S.<br />

Dialectical Argument<br />

1. There is no non-circular argument to the conclusion that you aren’t in S.<br />

2. If you were able to know you’re not in S, you would be able to produce a noncircular<br />

argument that concluded that you aren’t in S.<br />

3. So you can’t know that you aren’t in S.



I won’t say much about these arguments, save that I think in each case the second<br />

premise is very implausible. I suspect that most non-philosophers who are moved<br />

by sceptical arguments are tacitly relying on one or other of these arguments, but<br />

confirming that would require a more careful psychological study than I could do.<br />

But set those aside, because there’s a fourth argument that is more troubling. This<br />

argument takes its inspiration from what we might call Hume’s exhaustive argument<br />

for inductive scepticism. Hume said that we can’t justify induction inductively, and<br />

we can’t justify it deductively, and that exhausts the justifications, so we can’t justify<br />

induction. A similar kind of argument helps out the general sceptic.<br />

Exhaustive Argument<br />

1. If you know you aren’t in S, you know this a priori, or a posteriori<br />

2. You can’t know you aren’t in S a posteriori<br />

3. You can’t know you aren’t in S a priori<br />

4. So, you can’t know you aren’t in S<br />

This seems to me a really interesting argument. To make things simpler, I’ll<br />

stipulate that by a posteriori knowledge, I just mean knowledge that isn’t a priori.<br />

That makes the first premise pretty secure, as long as we’re assuming classical logic. 1<br />

Lots of philosophers take its third premise for granted. They assume that since it is<br />

metaphysically possible that you could be in S, this can’t be something you can rule<br />

out a priori. That strikes me as a rather odd capitulation to infallibilism. But I won’t<br />

push that here. Instead I’ll look at denials of the second premise.<br />

2 Dogmatism and a Bayesian Objection<br />

Someone who denies the second premise says that your empirical evidence can provide<br />

the basis for knowing that you aren’t in S, even though you didn’t know this a<br />

priori. I’m going to call such a person a dogmatist, for reasons that will become clear<br />

shortly. The dogmatist is not a sceptic, so the dogmatist believes that you can know<br />

H. The dogmatist also believes a closure principle, so the dogmatist also believes you<br />

can know E ⊃ H. If the dogmatist thought you could know E ⊃ H a priori, they’d<br />

think that you could know a priori that you weren’t in S. (This follows by another<br />

application of closure.) But they think that isn’t possible, so knowing E ⊃ H a priori<br />

isn’t possible. Hence you know E ⊃ H a posteriori.<br />

If we reflect on the fact that E is your total evidence, then we can draw two conclusions.<br />

The first is that the dogmatist thinks that you can come to know H on the<br />

basis of E even though you didn’t know in advance that if E is true, then H is true.<br />

You don’t, that is, need antecedent knowledge of the conditional in order to be able<br />

to learn H from E. That’s why I’m calling them a dogmatist. The second point is that<br />

the dogmatist is now running head on into a piece of Bayesian orthodoxy.<br />

1 Perhaps not a wise assumption around here, but one that I’ll make throughout in what follows.



To see the problem, note that we can easily prove (A), for arbitrary E, H and K. 2<br />

(A) Pr(E ⊃ H | E ∧ K) ≤ Pr(E ⊃ H | K), with equality iff Pr(E ⊃ H | E ∧ K) = 1<br />

Proof:<br />

1. Pr(E ⊃ H | K) = Pr(E ⊃ H | E ∧ K) Pr(E | K) + Pr(E ⊃ H | ¬E ∧ K) Pr(¬E | K)   Prob theorem<br />
2. Pr(E ⊃ H | ¬E ∧ K) = 1   Logic<br />
3. Pr(E ⊃ H | E ∧ K) ≤ 1   Prob theorem<br />
4. Pr(E ⊃ H | K) ≥ Pr(E ⊃ H | E ∧ K) Pr(E | K) + Pr(E ⊃ H | E ∧ K) Pr(¬E | K)   1, 2, 3<br />
5. Pr(E | K) + Pr(¬E | K) = 1   Prob theorem<br />
6. Pr(E ⊃ H | K) ≥ Pr(E ⊃ H | E ∧ K)   4, 5<br />
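The inequality can also be checked numerically. Here is a small sketch of mine that samples exact credence functions over the eight E/H/K possibilities and confirms (A) in each:

```python
import random
from fractions import Fraction

random.seed(2007)
WORLDS = [(e, h, k) for e in (0, 1) for h in (0, 1) for k in (0, 1)]

def random_credence():
    # a random exact probability function over the eight possibilities
    w = {wld: Fraction(random.randint(1, 30)) for wld in WORLDS}
    total = sum(w.values())
    return {wld: v / total for wld, v in w.items()}

def pr(cr, event, given=None):
    den = sum(v for wld, v in cr.items() if given is None or given(wld))
    num = sum(v for wld, v in cr.items() if event(wld) and (given is None or given(wld)))
    return num / den

hook = lambda w: (not w[0]) or w[1]          # E ⊃ H, i.e. ¬E ∨ H
K = lambda w: w[2] == 1
EK = lambda w: w[0] == 1 and w[2] == 1

# (A): conditionalising on E (holding K fixed) never raises Pr(E ⊃ H)
for _ in range(500):
    cr = random_credence()
    assert pr(cr, hook, EK) <= pr(cr, hook, K)
```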

It is clear enough from the proof that line 6 is an equality iff line 3 is an equality, so<br />

we have proven (A). Now some authors have inferred something like (B) from (A). 3<br />

(B) It is impossible to go from not being in a position to know E ⊃ H to being in a<br />

position to know it just by receiving evidence E.<br />

The transition here should raise an eyebrow or two. (A) is a principle of probability<br />

statics. (B) is a principle of epistemological kinematics. To get from (A) to (B) we<br />

need a principle linking probability and epistemology, and a principle linking statics<br />

and kinematics. Fortunately, orthodox Bayesian confirmation theory offers us<br />

suggestions for both principles. We’ll write Cr(A) for the agent’s credence in A, and<br />

Cr_E(A) for the agent’s credence in A when updated by receiving evidence E.<br />

LEARNING: If Cr_E(A) ≤ Cr(A), then it is impossible to go from not being in a position<br />

to know A to being in a position to know it just by receiving evidence<br />

E.<br />

BAYES: Cr_E(A) = Cr(A | E). That is, learning goes by conditionalisation.<br />

A quick browse at any of the literature on Bayesian confirmation theory will show<br />

that these principles are both widely accepted by Bayesians. Philosophers, even<br />

Bayesians, make false assumptions, so neither of these principles is obviously true.<br />

Nevertheless, I’m going to accept LEARNING at least for the sake of argument. I’m<br />

going to argue instead that the inference from (A) to (B) fails because BAYES fails.<br />

That is, I’m going to accept that if we could prove a principle I’ll call LOWER is true,<br />

then dogmatism, in the sense I’m defending it, fails.<br />

2 Again, the proof uses distinctively classical principles, in particular the equivalence of A with (A ∧ B)<br />

∨ (A ∧ ¬B). But I will take classical logic for granted throughout. David Jehle pointed out to me that the<br />

proof fails without this classical assumption.<br />

3 Roger White (2006) and Stewart Cohen (2005) endorse probabilistic arguments against people who<br />

are, in my sense, dogmatists. John Hawthorne (2002) also makes a similar argument when arguing that<br />

certain conditionals, much like E ⊃ H, are a priori.



LOWER. Cr_E(E ⊃ H) is less than or equal to Cr(E ⊃ H).<br />

Now there is a bad argument around here that the dogmatist might make. It might<br />

be argued that since the Bayesian approach (including BAYES) involves so much idealisation<br />

it could not be applicable to real agents. That’s a bad argument because the<br />

Bayesian approach might provide us with a good model for real agents, and models<br />

can be useful without being scale models. As long as the Bayesian model is the most<br />

appropriate model in the circumstances, then we can draw conclusions for the real<br />

world from facts about the model. The problem arises if there are alternative models<br />

which seem to fit just as well, but in which principles like LOWER are not true. If<br />

there are alternative models that seem better suited (or at least just as well suited) to<br />

modelling the situation of initial evidence acquisition, and those models do not make<br />

LOWER true, then we might think the derivation of LOWER in the Bayesian model<br />

is a mere consequence of the arbitrary choice of model. In the next section I will<br />

develop just such a model. I won’t argue that it is the best model, let alone the only<br />

alternative to the Bayesian model. But I will argue that it is as good for these purposes<br />

as the Bayesian model, and it does not imply LOWER.<br />

3 Bayes and Keynes<br />

The traditional Bayesian model of a rational agent starts with the following two principles.<br />

• At any moment, the agent’s credal states are represented by a probability function.<br />

• From moment to moment, the agent’s credal states are updated by conditionalisation<br />

on the evidence received.<br />

Over recent decades many philosophers have been interested in models that relax<br />

those assumptions. One particular model that has got a lot of attention (from e.g.<br />

Isaac Levi (1974, 1980), Richard Jeffrey (1983a), Bas van Fraassen (1990), Alan Hájek<br />

(2000, 2003) and many others) is what I’ll call the static Keynesian model. This model<br />

has the following features.<br />

• At any moment, the agent’s credal states are represented by a set of probability<br />

functions, called their representor.<br />

• The agent holds that p is more probable than q iff the probability of p is greater<br />

than the probability of q according to all probability functions in their representor.<br />

The agent holds that p and q are equally probable iff the probability of<br />

p is equal to the probability of q according to all probability functions in their<br />

representor.<br />

• From moment to moment, the agent’s credal states are updated by conditionalising<br />

each of the functions in the representor on the evidence received.
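As a toy illustration of mine (not drawn from the literature cited), a representor can be modelled as a list of exact probability functions; comparative judgments hold only when every member agrees, and updating conditionalises each member:

```python
from fractions import Fraction

# worlds are (p, q) truth-value pairs
WORLDS = [(1, 1), (1, 0), (0, 1), (0, 0)]
P = lambda w: w[0] == 1
Q = lambda w: w[1] == 1

def pr(cr, event):
    return sum(v for w, v in cr.items() if event(w))

def conditionalise(cr, event):
    e = pr(cr, event)
    return {w: (v / e if event(w) else Fraction(0)) for w, v in cr.items()}

# two priors that disagree about whether p or q is more probable
cr1 = dict(zip(WORLDS, [Fraction(3, 10), Fraction(3, 10), Fraction(2, 10), Fraction(2, 10)]))
cr2 = dict(zip(WORLDS, [Fraction(2, 10), Fraction(2, 10), Fraction(3, 10), Fraction(3, 10)]))
representor = [cr1, cr2]

def holds_more_probable(rep, a, b):
    return all(pr(cr, a) > pr(cr, b) for cr in rep)

# the agent makes no pairwise judgment between p and q
assert not holds_more_probable(representor, P, Q)
assert not holds_more_probable(representor, Q, P)

# updating on evidence p conditionalises every member of the representor
updated = [conditionalise(cr, P) for cr in representor]
assert all(pr(cr, P) == 1 for cr in updated)
```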



The second point is the big attraction. It allows that the agent need not hold that p is<br />

more probable than q, or q more probable than p, or that p and q are equally probable,<br />

for arbitrary p and q. And that’s good because it isn’t a rationality requirement that<br />

agents make pairwise probability judgments about all pairs of propositions. Largely<br />

because of this feature, I argued in an earlier paper that this model could be used to<br />

formalise the key philosophical ideas in Keynes’s Treatise on Probability. That’s the<br />

reason I call this a ‘Keynesian’ model.<br />

The modifier ‘static’ might seem a little strange, because the agent’s representor<br />

does change when she receives new evidence. But the change is always of a certain<br />

kind. Her ‘hypothetical priors’ do not change. If at t_1 her evidence is E_1 and her<br />
representor R_1, and at t_2 her evidence is E_2 and her representor R_2, then there is a<br />
‘prior’ representor R_0 such that the following two claims are true for all probability<br />
functions Pr.<br />
• Pr ∈ R_1 ↔ [∃Pr_0 ∈ R_0 : ∀p (Pr(p) = Pr_0(p | E_1))]<br />
• Pr ∈ R_2 ↔ [∃Pr_0 ∈ R_0 : ∀p (Pr(p) = Pr_0(p | E_2))]<br />

That is, there is a set of probability functions such that the agent’s representor at any<br />

time is the result of conditionalising each of those functions on her evidence. I’ll<br />

call any model with this property a static model, so the model described above is the<br />

static Keynesian model.<br />

Now there is a lot to like about the static Keynesian model, and I have made extensive<br />

use of it in previous work. It is a particularly useful model when we need<br />

to distinguish between risk and uncertainty in the sense that these terms are used<br />

in Keynes’s 1937 article “The General Theory of Employment”. 4 The traditional<br />

Bayesian model assumes that all propositions are risky, but in real life some propositions<br />

are uncertain as well, and in positions of radical doubt, where we have little or<br />

no empirical evidence, presumably most propositions are extremely uncertain. And<br />

using the static Keynesian model does not mean we have to abandon the great work<br />

done in Bayesian epistemology and philosophy of science. Since a Bayesian model is a<br />

(degenerate) static Keynesian model, we can say that in many circumstances (namely<br />

circumstances where uncertainty can be properly ignored) the Bayesian model will be<br />

4 The clearest statement of the distinction that I know is from that paper.<br />

By ‘uncertain’ knowledge, let me explain, I do not mean merely to distinguish what is<br />

known for certain from what is only probable. The game of roulette is not subject, in<br />

this sense, to uncertainty; nor is the prospect of a Victory bond being drawn. Or, again,<br />

the expectation of life is only slightly uncertain. Even the weather is only moderately<br />

uncertain. The sense in which I am using the term is that in which the prospect of a<br />

European war is uncertain, or the price of copper and the rate of interest twenty years<br />

hence, or the obsolescence of a new invention, or the position of private wealth owners in<br />

the social system in 1970. About these matters there is no scientific basis on which to form<br />

any calculable probability whatever. We simply do not know. Nevertheless, the necessity<br />

for action and decision compels us as practical men to do our best to overlook this awkward<br />

fact and to behave exactly as we should if we had behind us a good Benthamite calculation<br />

of a series of prospective advantages and disadvantages, each multiplied by its appropriate<br />

probability, waiting to be summed. (Keynes, 1937b, 114-5)



appropriate. Indeed, these days it is something like a consensus among probabilists<br />

or Bayesians that the static Keynesian model is a useful generalisation of the Bayesian<br />

model. For example in Christensen (2005) it is noted, almost as an afterthought, that<br />

the static Keynesian model will be more realistic, and hence potentially more useful,<br />

than the traditional Bayesian model. Christensen doesn’t appear to take this as any<br />

kind of objection to Bayesianism, and I think this is just the right attitude.<br />

But just as the static Keynesian is more general than the Bayesian model, there<br />

are bound to be interesting models that are more general than the static Keynesian<br />

model. One such model is what I call the dynamic Keynesian model. This model has<br />

been used by Seth Yalcin to explicate some interesting semantic theories, but to the<br />

best of my knowledge it has not been used for epistemological purposes before. That<br />

should change. The model is like the static Keynesian model in its use of representors,<br />

but it changes the way updating is modelled. When an agent with representor R<br />

receives evidence E, she should update her representor by a two step process.<br />

• Replace R with U(R, E)<br />

• Conditionalise U(R, E), i.e. replace it with {Pr(• | E): Pr is in U(R, E)}<br />

In this story, U is a function that takes two inputs: a representor and a piece of<br />

evidence, and returns a representor that is a subset of the original representor. Intuitively,<br />

this models the effect of learning, via getting evidence E, what evidential<br />

relationships obtain. In the static Keynesian model, it is assumed that before the<br />

agent receives evidence E, she could already say which propositions would receive<br />

probabilistic support from E. All of the relations of evidential support were encoded<br />

in her conditional probabilities. There is no place in the model for learning about<br />

fundamental evidential relationships. In the dynamic Keynesian model, this is possible.<br />

When the agent receives evidence E, she might learn that certain functions<br />

that were previously in her representor misrepresented the relationship between evidence<br />

and hypotheses, particularly between evidence E and other hypotheses. In<br />

those cases, U(R, E) will be her old representor R, minus the functions that E teaches<br />

her misrepresent these evidential relationships.<br />
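The two-step update can be sketched in the same toy setting. The pruning rule below is a placeholder of mine purely for illustration (it drops functions under which the evidence received was very unlikely); nothing in the model says U has this, or any, formally specifiable shape:

```python
from fractions import Fraction

WORLDS = ["w1", "w2", "w3"]

def pr(cr, event):
    return sum(v for w, v in cr.items() if w in event)

def conditionalise(cr, event):
    e = pr(cr, event)
    return {w: (v / e if w in event else Fraction(0)) for w, v in cr.items()}

def U(rep, event):
    # placeholder pruning rule: U(R, E) must be a subset of R
    kept = [cr for cr in rep if pr(cr, event) >= Fraction(1, 4)]
    return kept if kept else rep

def dynamic_update(rep, event):
    pruned = U(rep, event)                               # step 1: replace R with U(R, E)
    return [conditionalise(cr, event) for cr in pruned]  # step 2: conditionalise U(R, E)

E = {"w1", "w2"}
rep = [
    {"w1": Fraction(1, 2), "w2": Fraction(1, 4), "w3": Fraction(1, 4)},
    {"w1": Fraction(1, 10), "w2": Fraction(1, 10), "w3": Fraction(4, 5)},
]
updated = dynamic_update(rep, E)
assert len(updated) == 1                   # the second function was pruned by U
assert updated[0]["w1"] == Fraction(2, 3)  # (1/2) / (3/4)
```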

The dynamic Keynesian model seems well suited to the dogmatist, indeed to<br />

any epistemological theory that allows for fundamental evidential relationships to<br />

be only knowable a posteriori. As we’ll see below, this is a reason to stop here in<br />

the presentation of the model and not try to say something systematic about the<br />

behaviour of U. Instead of developing the model by saying more about U, we should<br />

assess it, which is what I’ll do next.<br />

4 In Defence of Dynamism<br />

In this section I want to go over three benefits of the dynamic Keynesian model, and<br />

then say a little about how it relates to the discussion of scepticism with which we<br />

opened. I’m not going to say much about possible objections to the use of the model.<br />

That’s partially for space reasons, partially because what I have to say about the objections<br />

I know of is fairly predictable, and partially because the model is new enough



that I don’t really know what the strongest objections might be. So here we’ll stick<br />

to arguments for the view.<br />

4.1 The Dogmatist and the Keynesian<br />

The first advantage of the dynamic Keynesian model is that because it does not verify<br />

LOWER, it is consistent with dogmatism. Now if you think that dogmatism is<br />

obviously false, you won’t think this is much of an advantage. But I tend to think<br />

that dogmatism is one of the small number of not absurd solutions to a very hard<br />

epistemological problem with no obvious solution, so we should not rule it out preemptively.<br />

Hence I think our formal models should be consistent with it. What is<br />

tricky is proving that the dynamic Keynesian model is indeed consistent with it.<br />

To see whether this is true on the dynamic Keynesian model, we need to say what<br />

it is to lower the credence of some proposition. Since representors map propositions<br />

onto intervals rather than numbers, we can’t simply talk about one ‘probability’<br />

being a smaller number than another. 5 On the static Keynesian model, the most<br />

natural move is to say that conditionalisation on E lowers the credence of p iff for<br />

all Pr in the representor, Pr(p) > Pr(p | E). This implies that if every function in the<br />

representor says that E is negatively relevant to p, then conditionalising on E makes<br />

p less probable. Importantly, it allows this even if the values that Pr(p) takes across<br />

the representor before and after conditionalisation overlap. So what should we say<br />

on the dynamic Keynesian model? The weakest approach that seems viable, and not<br />

coincidentally the most plausible approach, is to say that updating on E lowers the<br />

credence of p iff the following conditions are met:<br />

• For all Pr in U(R, E), Pr(p | E) < Pr(p)<br />

• For all Pr in R but not in U(R, E), there is a Pr′ in U(R, E) such that Pr′(p | E) < Pr(p)<br />

It isn’t too hard to show that for some models, updating on E does not lower the<br />

credence of E ⊃ H, if lowering is understood this way. The following is an extreme<br />

example, but it suffices to make the logical point. Let R be the minimal representor,<br />

the set of all probability functions that assign probability 1 to a priori certainties.<br />

And let U(R, E) be the singleton of the following probability function, defined only<br />

over Boolean combinations of E and H: Pr(E ∧ H) = Pr(E ∧ ¬H) = Pr(¬E ∧ H) =<br />

Pr(¬E ∧ ¬H) = 1/4. Then the probability of E ⊃ H after updating is 3/4. (More<br />
precisely, according to all Pr in U(R, E), Pr(E ⊃ H) = 3/4.) Since before updating<br />
there were Pr in R such that Pr(E ⊃ H) < 3/4 (in fact there were Pr in R such that<br />
Pr(E ⊃ H) = 0), updating on E did not lower the credence of E ⊃ H. So the dynamic Keynesian model<br />

does not, in general, have as a consequence that updating on E lowers the credence of<br />

E ⊃ H. This suggests that LOWER in general is not true.<br />
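The extreme example can be checked directly. In this sketch of mine the minimal representor is stood in for by just two functions (a finite toy, since the real minimal representor is infinite): the uniform function and one that is certain E ⊃ H is false:

```python
from fractions import Fraction

WORLDS = [(1, 1), (1, 0), (0, 1), (0, 0)]   # truth values of (E, H)
E = lambda w: w[0] == 1
hook = lambda w: w[0] == 0 or w[1] == 1     # E ⊃ H

def pr(cr, event):
    return sum(v for w, v in cr.items() if event(w))

def pr_given(cr, event, given):
    return sum(v for w, v in cr.items() if event(w) and given(w)) / pr(cr, given)

uniform = dict.fromkeys(WORLDS, Fraction(1, 4))
anti = {(1, 1): Fraction(0), (1, 0): Fraction(1), (0, 1): Fraction(0), (0, 0): Fraction(0)}
R = [uniform, anti]   # finite stand-in for the minimal representor
U_R_E = [uniform]     # the stipulated U(R, E)

def lowers(rep, pruned, ev, p):
    # the two conditions from the text for updating on ev to lower credence in p
    c1 = all(pr_given(cr, p, ev) < pr(cr, p) for cr in pruned)
    c2 = all(any(pr_given(cr2, p, ev) < pr(cr, p) for cr2 in pruned)
             for cr in rep if cr not in pruned)
    return c1 and c2

assert pr(uniform, hook) == Fraction(3, 4)
assert pr_given(uniform, hook, E) == Fraction(1, 2)
# the first condition holds, but the second fails for the anti function
# (nothing in U(R, E) assigns E ⊃ H conditional probability below 0),
# so updating on E does not lower the credence of E ⊃ H
assert not lowers(R, U_R_E, E, hook)
```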

5 Strictly speaking, the story I’ve told so far does not guarantee that for any proposition p, the values<br />

that Pr(p) takes (for Pr in the representor) form an interval. But it is usual in more detailed presentations<br />

of the model to put constraints on the representor to guarantee that happens, and I’ll assume we’ve done<br />

that.



It might be objected that if evidence E supports our knowledge that E ⊃ H, then<br />

updating on E should raise the credence of E ⊃ H. And if we define credence raising<br />

the same way we just defined credence lowering, updating on E never raises the credence<br />

of E ⊃ H. From a Keynesian perspective, we should simply deny that evidence<br />

has to raise the credence of the propositions known on the basis of that evidence. It<br />

might be sufficient that getting this evidence removes the uncertainty associated with<br />

those propositions. Even on the static Keynesian model, it is possible for evidence<br />

to remove uncertainty related to propositions without raising the probability of that<br />

proposition. A little informally, we might note that whether an agent with representor<br />

R is sufficiently confident in p to know that p depends on the lowest value that<br />

Pr(p) takes for Pr ∈ R, and updating can raise the value of this ‘lower bound’ without<br />

raising the value of Pr(p) according to all functions in R, and hence without strictly<br />

speaking raising the credence of p.<br />

The above illustration is obviously unrealistic, in part because U could not behave<br />

that way. It’s tempting at this stage to ask just how U does behave so we can work out<br />

if there are more realistic examples. Indeed, it's tempting to attempt to provide<br />

a formal description of U. This temptation should be resisted. The whole point of the<br />

model is that we can only learn which hypotheses are supported by certain evidence<br />

by actually getting that evidence. If we could say just what U is, we would be able<br />

to know what was supported by any kind of evidence without getting that evidence.<br />

The best we can do with respect to U is to discover some of its contours with respect<br />

to evidence much like our own. And the way to make those discoveries will be to<br />

do scientific and epistemological research. It isn’t obvious that, say, looking for nice<br />

formal properties of U will help at all.<br />

4.2 The Problem of the Priors<br />

One really nice consequence of the dynamic Keynesian approach is that it lets us say<br />

what the representor of an agent with no empirical information should be. Say a<br />

proposition is a priori certain iff it is a priori that all rational agents assign credence<br />

1 to that proposition. Then the representor of the agent with no empirical evidence<br />

is {Pr: ∀p: If p is a priori certain, then Pr(p) = 1}. This is the minimal representor I<br />

mentioned above. Apart from assigning probability 1 to the a priori certainties, the<br />

representor is silent. Hence it treats all propositions that are not a priori certain in exactly<br />

the same way. This kind of symmetric treatment of propositions is not possible<br />

on the traditional Bayesian conception for logical reasons. (The reasons are set out<br />

in the various discussions of the paradoxes of indifference, going back to Bertrand<br />

(1888).) Such a prior representor is consistent with the static Keynesian approach,<br />

but it yields implausible results, since conditionalising on E has no effect on the distribution<br />

of values of Pr(p) among functions in the representor for any p not made<br />

a priori certain by E. (We’ll say p is made a priori certain by E iff E ⊃ p is a priori<br />

certain.) So if this is our starting representor, we can’t even get probabilistic evidence<br />

for things that are not made certain by our evidence. 6 So on the static Keynesian<br />

6 The argument in the text goes by a little quickly, because I've defined representors in terms of unconditional<br />

probabilities and this leads to complications to do with conditionalising on propositions of<br />

zero probability. A better thing to do, as suggested by Hájek (2003), is to take conditional probability as



model, this attractively symmetric prior representor is not available.<br />

I think one of the motivations of anti-dogmatist thinking is the thought that we<br />

should be able to tell a priori what is evidence for what. If its looking like there is a cow<br />

in front of us is a reason to think there is a cow in front of us, that should be knowable<br />

a priori. I think the motivation for this kind of position shrinks a little when we<br />

realise that an a priori prior that represented all the connections between evidence<br />

and hypotheses would have to give us a lot of guidance as to what to do (epistemically<br />

speaking) in worlds quite unlike our own. Moreover, there is no reason we should<br />

have lots of that information. So consider, for a minute, a soul in a world with no spatial<br />

dimensions and three temporal dimensions, where the primary source of evidence for<br />

souls is empathic connection with other souls from which they get a (fallible) guide to<br />

those souls’ mental states. When such a soul conditionalises on the evidence “A soul<br />

seems to love me” (that’s the kind of evidence they get) what should their posterior<br />

probability be that there is indeed a soul that loves them? What if the souls have a<br />

very alien mental life, so they instantiate mental concepts very unlike our own, and<br />

souls get fallible evidence of these alien concepts being instantiated through empathy?<br />

I think it’s pretty clear we don’t know the answers to these questions. (Note that to<br />

answer this question we’d have to know which of these concepts were grue-like, and<br />

which were projectable, and there is no reason to believe we are in a position to know<br />

that.) Now those souls are presumably just as ignorant about the epistemologically<br />

appropriate reaction to the kinds of evidence we get, like seeing a cow or hearing a<br />

doorbell, as we are about their evidence. The dynamic Keynesian model can allow for<br />

this, especially if we use the very weak prior representor described above. When we<br />

get the kind of evidence we actually get, the effect of U is to shrink our representors<br />

to sets of probability functions which are broadly speaking epistemically appropriate<br />

for the kind of world we are in. Before we got that evidence, we didn’t know how<br />

we should respond to it, just as the spaceless empathic souls don't know how to<br />

respond to it, and just as we don't know how to respond to their evidence.<br />

It is a commonplace observation that (a) prior probabilities are really crucial in<br />

Bayesian epistemology, but (b) we have next to no idea what they look like. I call<br />

this the problem of the priors, and note with some satisfaction that the dynamic<br />

Keynesian model avoids it. Now a cynic might note that all I’ve done is replace a<br />

hand-wavy story about priors with a hand-wavy story about updating. That’s true,<br />

but nevertheless I think this is progress. The things I’m being deliberately unclear<br />

about, such as what U should look like for E such as “Some other non-spatial tritemporal<br />

soul seems to love me” are things that (a) my theory says are not a priori<br />

knowable, and (b) I don’t have any evidence concerning. So it isn’t surprising that<br />

I don’t have much to say about them. It isn’t clear that the traditional Bayesian can<br />

offer any story, even by their own lights, as to why they are less clear about the<br />

structure of the prior probability conditional on such an E.<br />

primitive. If we do this we’ll define representors as sets of conditional probability functions, and the a<br />

priori representor will be {Pr: ∀p, q: If p ⊃ q is a priori certain, then Pr(q | p) = 1}. Then the claim in the text<br />

will follow.



4.3 The Problem of Old Evidence<br />

When we get evidence E, the dynamic Keynesian model says that we should do two<br />

things. First, we should throw out some probability functions in our representor.<br />

Second, we should conditionalise those that remain. But this is a normative condition,<br />

not a description of what actually happens. Sometimes, when we get evidence<br />

E, we may not realise that it is evidence that supports some theory T. That is, we<br />

won’t sufficiently cull the representor of those probability functions where the probability<br />

of T given E is not high. Housecleaning like this is hard, and sometimes we<br />

only do it when it becomes essential. In this case, that means we only do it when<br />

we start paying serious attention to T. In that case we may find that evidence E, evidence<br />

we’ve already incorporated, in the sense of having used in conditionalisation,<br />

gives us reason to be more confident than we were in T. In such a case we’ll simply<br />

cull those functions where the probability of T given E is not high, and we will be more<br />

confident in T. That’s how old evidence can be relevant on the dynamic Keynesian<br />

model. Since we have a story about how old evidence can be relevant, there is no<br />

problem of old evidence for the dynamic Keynesian.<br />

Famously, there is a problem of old evidence for traditional Bayesians. Now I’m<br />

not going to rehearse all the arguments concerning this problem to convince you that<br />

this problem hasn’t been solved. That’s in part because it would take too long and in<br />

part because I’m not sure myself that it hasn’t been solved. But I will note that if you<br />

think the problem of old evidence is a live problem for traditional Bayesians, then<br />

you have a strong reason for taking the dynamic Keynesian model seriously.<br />

4.4 Why Should We Care?<br />

The sceptic’s opening move was to appeal to our intuition that propositions like<br />

E ⊃ H are unknowable. We then asked what reasons we could be given for accepting<br />

this claim, because the sceptic seems to want to derive quite a lot from a raw<br />

intuition. The sceptic can respond with a wide range of arguments, four of which<br />

are mentioned above. Here we focussed on the sceptic’s argument from exhaustion.<br />

E ⊃ H isn’t knowable a priori, because it could be false, and it isn’t knowable a posteriori,<br />

because, on standard models of learning, our evidence lowers its credibility.<br />

My response is to say that this is an artefact of the model the sceptic (along with everyone<br />

else) is using. There's nothing wrong with using simplified models; in fact<br />

it is usually the only way to make progress. But we must always be wary about whether our<br />

conclusions transfer from the model to the real world. One way to argue that a conclusion<br />

is a mere artefact of the model is to come up with a model that is sensitive<br />

to more features of reality in which the conclusion does not hold. That’s what I’ve<br />

done here. The dynamic Keynesian model is sensitive to the facts that (a) there is a<br />

distinction between risk and uncertainty and (b) we can learn about fundamental evidential<br />

connections. In the dynamic Keynesian model, it isn’t true that our evidence<br />

lowers the probability of E ⊃ H. So the anti-sceptic who says that E ⊃ H is knowable<br />

a posteriori, the person I’ve called the dogmatist, has a defence against this Bayesian<br />

argument. If the response is successful, then there may well be other applications of



the dynamic Keynesian model, but for now I’m content to show how the model can<br />

be used to defend the dogmatic response to scepticism.


Keynes, Uncertainty and Interest Rates<br />

Abstract<br />

Uncertainty plays an important role in The General Theory, particularly<br />

in the theory of interest rates. Keynes did not provide a theory of uncertainty,<br />

but he did make some enlightening remarks about the direction<br />

he thought such a theory should take. I argue that some modern innovations<br />

in the theory of probability allow us to build a theory which<br />

captures these Keynesian insights. If this is the right theory, however,<br />

uncertainty cannot carry its weight in Keynes’s arguments. This does<br />

not mean that the conclusions of these arguments are necessarily mistaken;<br />

in their best formulation they may succeed with merely an appeal<br />

to risk.<br />

“Employment was a problem because investment was; and investment<br />

was problematic because of the uncertainty of its return.” (Shapiro, 1997,<br />

83)<br />

Keynes clearly saw an important role for uncertainty in his General Theory. However,<br />

few contemporaries agreed with him, and subsequent ‘Keynesians’ generally obliterated<br />

the distinction between risk and uncertainty. In part this was caused by Keynes’s<br />

informal presentation of his views on uncertainty in The General Theory. This paper<br />

has two aims. The first is to sketch a formal theory of uncertainty which captures<br />

Keynes’s insights about the risk/uncertainty distinction. I argue that the theory of<br />

imprecise probabilities developed in recent years best captures Keynes’s intuitions<br />

about uncertainty. In particular this theory provides a formal distinction between<br />

risk and uncertainty, and allows for an analysis of Keynes’s ‘weight’ of arguments.<br />

However, the second aim is to show that if this is right then Keynes was wrong to<br />

draw the economic consequences of uncertainty that he did. In broad terms, I argue<br />

that uncertainty is economically impotent. It only has effects in conjunction<br />

with some other feature of models or the world, such as missing markets or agent<br />

irrationality. But these features plus the existence of risk are sufficient to get the conclusions<br />

Keynes wants. These conclusions of Keynes might be right, but if so they<br />

can be justified without reference to Keynesian uncertainty. At the end of the day,<br />

uncertainty is not as economically interesting as it appears.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Cambridge<br />

Journal of Economics 26 (2002): 47-62.



1 Imprecise Probabilities<br />

In the classical, or Bayesian 1 , model of rationality all rational agents have precise<br />

degrees of belief, or credences, in each proposition. There is a probability function<br />

Bel such that for any proposition A, there is a number Bel(A). So if an agent believes<br />

p to degree x she believes ¬p to degree 1 − x. This is appropriate for some propositions.<br />

For example, if p is a proposition about the decay of an atom with known half-life,<br />

or about any event with a known objective chance and hence subject to risk and not<br />

uncertainty, the agent’s credences should reflect the chances. Since chances are precise<br />

and form a probability function, the credences will also have these properties. The<br />

Bayesian theory assumes that all situations can be treated by analogy with these.<br />

As Keynes pointed out in the famous QJE article (Keynes, 1937b), this analogy<br />

is clearly mistaken. When p is about the price of copper in thirty years, we do not<br />

know the chance that p will be true. And we do not have enough information to<br />

form a precise credence. As Keynes had argued in his Treatise on Probability sixteen<br />

years earlier, attempts to avoid this problem by appeal to a Principle of Indifference<br />

lead to contradiction. In The General Theory he noted that he still approved of this<br />

little argument (Keynes, 1936, 152). Hence Bayesians have no way of representing<br />

our ignorance in uncertain situations. They say that all rational agents have a precise<br />

epistemic attitude towards each proposition, believing it to some precise degree,<br />

whereas ignorance consists in not having such a precise attitude.<br />

The theory of imprecise probabilities avoids all of these difficulties. The theory is<br />

quite old, dating back to work by Gerhard Tintner (1941) and A. G. Hart (1942), but<br />

has only received extensive consideration recently. The best modern summaries are<br />

by the philosopher Isaac Levi (1980) and the statistician Peter Walley (1991). There<br />

are minor differences, but the theory I shall give captures all the common ingredients.<br />

According to Bayesians, states of rational agents are represented by a single probability<br />

function Pr; in the imprecise theory they are represented by a set of probability<br />

functions S. The agent's credence in p is vague over the set of values that Pr(p) takes<br />

for Pr ∈ S. In the extreme case, for every x ∈ [0,1] there will be a Pr ∈ S such that<br />

Pr(p) = x. This represents almost total ignorance about p. The set S is called the<br />

‘representor’ of the agent whose credences it represents.<br />

It is important to stress what S represents, because there has been some confusion<br />

over this 2 . The Pr do not represent the agent’s hypotheses about the correct distribution<br />

of objective chances. I use the phrase ‘objective chance’, or just ‘chance’, to refer<br />

to a property that plays a certain role in fundamental physics, the property which<br />

makes it the case that the whirrings of atoms in the void are indeterminate. Modern<br />

physics, or at least the most popular versions of it, teaches that chance infects all fundamental<br />

physical events. These chances fulfill all the properties that anyone has ever<br />

1 For this paper I follow Walley (1991) in describing those theorists who require that all agents have<br />

precise degrees of belief and these degrees form a probability function as Bayesians. There is some dispute<br />

as to the accuracy of this labelling, particularly as some paradigm case Bayesians, such as Jeffrey (1983a)<br />

and van Fraassen (1990), accept that degrees of belief can be vague. However, there is probably no other<br />

name as convenient or as recognisable.<br />

2 See, for example, Gärdenfors and Sahlin (1982), Levi (1982).



wanted in probabilities. They reflect long-run frequencies of repeated events, they<br />

put restrictions on reasonable degrees of belief, they can be properly applied to single<br />

cases, and so on. If all fundamental physical events are chance events, then arbitrary<br />

Boolean combinations of fundamental physical events should also, presumably, be<br />

chance events. But any event whatsoever is some combination of fundamental physical<br />

events, though for many it may not be clear which combination. So baseball<br />

games, romantic affairs and stock market movements are all chance events in this<br />

sense, even though they are not, for instance, repeatable events. Of course, trying<br />

to predict these using the laws of physics will be even less productive than trying to<br />

predict them using the methods we currently have available. Saying where all the<br />

atoms, or quarks, currently are is humanly impossible, and perhaps theoretically impossible<br />

as well. Even allowing for this, computing where they will move before they<br />

get there is beyond the capacity of any possible machine.<br />

I distinguish between a situation where the agent does not know the objective<br />

chance of some proposition, and a situation where the agent has no precise credence<br />

in that proposition. An agent can have a precise credence in p without knowing its<br />

objective chance. If the agent believes that a certain number of chance distributions<br />

are possible, and gives each of them a precise credence, this entails she has a precise<br />

credence in each event. (Imagine we see a fair coin be tossed, and land, but do not<br />

see how it falls. The objective chance that it shows heads is either one, if it does, or<br />

zero, otherwise. But the appropriate credence in the proposition, the coin has landed<br />

heads, is one half.) Rather, the Pr represent the precise credence distributions that<br />

are consistent with the real imprecise distribution. For example, for some rational agent,<br />

and some proposition p, the agent’s epistemic state will determine that she believes<br />

p to a greater degree than 0.2, and a lesser degree than 0.4, but there will be no more<br />

facts about the matter. (In this case S will include a function Pr such that Pr(p) = x<br />

for each x ∈ [0.2, 0.4].) If we ask her whether she thinks p is more likely than some<br />

proposition, call it q, which she believes to degree 0.3, she will not be able to say<br />

one way or the other. And this is not just because she lacks rationality or powers<br />

of introspective observation. It is no requirement of rationality that she believe p is<br />

more likely, less likely or equally likely than q. As Levi and Walley have pointed out,<br />

the Bayesian arguments purporting to show this is a constraint on rationality have<br />

been hopelessly circular.<br />

The reasons for wanting to be able to represent uncertainty were stressed by Keynes,<br />

and are generally well known. Before showing why this theory captures Keynes’s<br />

intuitions about uncertainty, I will briefly mention two nice formal features of<br />

the theory of imprecise probabilities. On many theories of uncertainty, particularly<br />

those that represent uncertain agents as having interval valued degrees of belief, it<br />

is hard to explain comparative statements, like “p seems more likely to me than q”.<br />

These comparatives are crucial to our everyday practices of probabilistic reasoning.<br />

We say p is more probable than q according to S iff for all Pr ∈ S, Pr(p) > Pr(q).<br />

This lets us say, as seems right, that A is more probable than A ∧ B for almost all<br />

propositions A, B.<br />
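The comparative definition can be sketched directly; the toy representor and all names below are my own illustration, not from the text:<br />

```python
from itertools import product

# Truth-value assignments to A and B, as (A, B) pairs.
ATOMS = list(product([True, False], repeat=2))

def prob(pr, prop):
    return sum(pr[a] for a in ATOMS if prop(*a))

def more_probable(S, p, q):
    """p is more probable than q according to representor S iff
    Pr(p) > Pr(q) for every Pr in S."""
    return all(prob(pr, p) > prob(pr, q) for pr in S)

# A toy representor in which every function gives A ∧ ¬B positive weight,
# so A comes out strictly more probable than A ∧ B.
S = [
    {(True, True): 0.2, (True, False): 0.3, (False, True): 0.3, (False, False): 0.2},
    {(True, True): 0.5, (True, False): 0.1, (False, True): 0.2, (False, False): 0.2},
]

A = lambda a, b: a
A_and_B = lambda a, b: a and b

print(more_probable(S, A, A_and_B))  # True
```

(The “almost all” qualification matters: the strict comparison fails whenever some function in the representor gives A ∧ ¬B probability zero.)<br />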

The second formal advantage is that we now have a simple way to update epistemic<br />

states on receiving new evidence. Let S be the agent’s current representor, and



the new evidence be e. Then the updated representor, Se, is given as follows:<br />

Se = {Pr(•|e) : Pr ∈ S}<br />

That is, we just conditionalise every probability function in S. Again, updating has<br />

proven problematic for some approaches to uncertainty. The theory of evidence<br />

functions, developed by Dempster (1967) and Shafer (1976), allows that an agent can<br />

know that if either e or ¬e comes in as evidence, their credence in p will rise. This<br />

seems absurd: it means we can know before an experiment that, whatever happens, we’ll be<br />

more confident in p than we are now.<br />
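A minimal sketch of this update rule, and of the contrast with the Dempster-Shafer behaviour just described; the toy space and all numbers are my own:<br />

```python
from itertools import product

# Truth-value assignments to p and e, as (p, e) pairs.
ATOMS = list(product([True, False], repeat=2))

def prob(pr, prop):
    return sum(v for a, v in pr.items() if prop(*a))

def update(pr, ev):
    """Conditionalise a single probability function on evidence ev."""
    z = prob(pr, ev)
    return {a: (v / z if ev(*a) else 0.0) for a, v in pr.items()}

def update_representor(S, ev):
    """Se = {Pr(.|e) : Pr in S}: conditionalise every function in S."""
    return [update(pr, ev) for pr in S]

p = lambda p_, e_: p_
e = lambda p_, e_: e_
not_e = lambda p_, e_: not e_

S = [
    {(True, True): 0.4, (True, False): 0.1, (False, True): 0.2, (False, False): 0.3},
    {(True, True): 0.3, (True, False): 0.2, (False, True): 0.2, (False, False): 0.3},
]

print([round(prob(pr, p), 3) for pr in update_representor(S, e)])  # [0.667, 0.6]

# For each function: if conditioning on e raises Pr(p), conditioning on ¬e
# must lower it, so we cannot know in advance that p will rise either way.
for pr in S:
    if prob(update(pr, e), p) > prob(pr, p):
        assert prob(update(pr, not_e), p) < prob(pr, p)
```
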

To take a famous example, three prisoners X , Y and Z are about to be exiled to<br />

Elba. The governor decides on a whim that he will pardon one, and casts a fair die<br />

to choose which. He tells the guards who is pardoned, but instructs them not to tell<br />

the prisoners yet. X pleads futilely with his guard, and finally asks, “Can you tell<br />

me the name of one of the others who won’t be pardoned.” The guard, realising this<br />

will not reveal X ’s fate, agrees to answer. X thinks that if Y is pardoned, the guard<br />

will say Z, so there is at least a one-third probability of that. And if Z is pardoned,<br />

the guard will say Y , so there is also at least a one-third probability of that. But if<br />

he is pardoned, the guard will have to decide what to say, and we can’t make<br />

probability judgements about free human decisions. On the Dempster-Shafer theory,<br />

the probability of X being freed is one-third, but the probability of X being freed and<br />

the guard saying Y goes to Elba is zero, and the probability of X being freed and the<br />

guard saying Z goes to Elba is zero. This is just a standard failure of additivity, and<br />

not at all objectionable. The problem is that when the guard says that Y will go to<br />

Elba, or that Z will go to Elba, the probability of X being freed rises to one-half. (I<br />

will not go through the mathematics here, because it can be found in any book on the<br />

Dempster-Shafer theory. See, for example, Walley (1991) or Yager et al. (1994).) Since<br />

X did not learn about his chances of freedom, this seems like a rather odd result. The<br />

theory of imprecise probabilities avoids this problem. It can be easily shown that<br />

on this theory for any evidence e if the probability of p given e is greater than the<br />

probability of p, then the probability of p given ¬e is less than the probability of p.<br />

(Again Walley (1991) contains the proof.)<br />
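On the imprecise theory the example comes out as one would hope. A sketch; the parameterisation of the guard's free choice by an unknown chance t is my own way of setting the problem up:<br />

```python
def posterior_x_pardoned(t, answer):
    """Pr(X pardoned | guard's answer), where t is the unknown chance that
    the guard names Y when X is the one pardoned."""
    prior = 1.0 / 3.0  # the fair die gives each prisoner a 1/3 chance
    if answer == "Y":
        likelihood_x = t        # guard may say Y when X is pardoned
        likelihood_other = 1.0  # guard must say Y when Z is pardoned
    else:  # answer == "Z"
        likelihood_x = 1.0 - t
        likelihood_other = 1.0  # guard must say Z when Y is pardoned
    return (prior * likelihood_x) / (prior * likelihood_x + prior * likelihood_other)

# The representor leaves t vague over [0, 1], so after hearing "Y goes to
# Elba" X's credence in his own pardon is vague over [0, 1/2]: it does not
# determinately rise to one-half, unlike on the Dempster-Shafer theory.
values = [posterior_x_pardoned(t / 10.0, "Y") for t in range(11)]
print(min(values), max(values))  # 0.0 0.5
```
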

2 Keynes and Imprecise Probabilities<br />

Obviously enough, this is not the theory that Keynes formally endorses, either in<br />

his Treatise on Probability (Keynes, 1921) or his economic writings. Nevertheless, I<br />

think it is an important theory for understanding Keynes’s use of uncertainty. This is<br />

because it, and it alone, captures all of the underlying motivations of Keynes’s theory<br />

of uncertainty. Hence any economic consequences of uncertainty Keynes wants to<br />

draw will have to be derivable from this theory.<br />

I have so far spoken blithely of ‘Keynes’s theory of uncertainty’, implicitly assuming<br />

there is such a unique theory. In recent years a number of authors (e.g. Runde<br />

(1994a); Davis (1994); Coates (1996); Bateman (1996)) have questioned this assumption,<br />

saying that Keynes changed his theory between the writing of the Treatise on<br />

Probability and The General Theory. I will not deal directly with such criticisms here



for a number of reasons. First, the main dispute is over whether probabilities are<br />

given by logic or are ‘merely subjective’, and that debate is independent of the debate<br />

about the effects of allowing imprecise probabilities. Secondly, there are obvious<br />

space constraints. Many of these alternative interpretations were put forward in<br />

book length arguments, and a fair response to them would not be short. Thirdly, and<br />

perhaps most importantly, I take it that the methodological game here is inference to<br />

the best explanation. Whatever criticisms I make of others’ interpretations would be<br />

rather weak unless I showed that some other overall story was more persuasive. And<br />

if I come up with a more persuasive story here criticisms of their accounts will be<br />

slightly redundant. So I hope the reader at least permits the indulgence of setting out<br />

a theory of Keynes’s ideas predicated on this rather controversial assumption.<br />

In the Treatise on Probability (Keynes (1921), hereafter TP) Keynes says that probability<br />

is essentially a property of ordered pairs of propositions, or what he calls<br />

arguments. He writes p/q = α, for the probability of hypothesis p on evidence q<br />

is α. Now this value α is rather unusual. It sometimes is a number, but sometimes<br />

not; it sometimes can be compared to all numbers, but sometimes not; it sometimes<br />

can be compared to other probability values such as β, but sometimes not; and it can<br />

enter into arithmetic operations. As a consequence probabilities are subject to all the<br />

usual rules of the classical probability calculus. For example, whenever p and r are<br />

inconsistent, then (p ∨ r)/q = p/q + r/q always holds, even when none of these<br />

values is numerical.<br />

These five properties are rather perplexing. Indeed, Keynes’s failure to explain or<br />

justify them fully is one of the main criticisms that Ramsey (Ramsey, 1926, 161-6)<br />

launches at Keynes’s theory. But on this theory they all fall out as consequences of<br />

our definitions. If p/q = α then α will be numerical iff there is some x such that<br />

for all Pr ∈ S, Pr(p|q) = x. Similarly α > y, for real valued y, iff Pr(p|q) > y for<br />

all Pr ∈ S. A similar definition holds for α < y and α = y, from which it can be<br />

seen that it is possible that α is neither greater than, less than, nor equal to y. If none<br />

of these hold we say that α and y are incomparable. If p/q = α and r/s = β then<br />

α > β iff for all Pr ∈ S, Pr(p|q) > Pr(r|s). Again similar definitions of less than<br />

and equal to apply, and the consequence of all these is that sometimes α and β will<br />

be comparable, sometimes not.<br />

Ramsey is right to question the intelligibility of Keynes’s use of addition and<br />

multiplication. We know what it means to add and multiply numbers, but we have<br />

no idea what it is to add or multiply non-numerical entities. However, on this theory<br />

addition and multiplication are perfectly natural. Since we represent α and β by sets,<br />

generally intervals, then α + β and α · β will be sets. They are defined as follows. Again<br />

let p/q = α and r/s = β.<br />

α + β = {x : ∃Pr ∈ S (Pr(p|q) + Pr(r|s) = x)}<br />

α · β = {x : ∃Pr ∈ S (Pr(p|q) · Pr(r|s) = x)}<br />

These definitions are natural in the sense that we are entitled to say that the ‘+’<br />

in α + β means the same as the ‘+’ in 2 + 3. And the definitions show why Keynes’s



α’s and β’s will obey the axioms of the probability calculus. Even if p/q and ¬p/q<br />

are non-numerical, p/q + ¬p/q will equal {1}, or effectively 1. So we have something<br />

like the additivity axiom, without its normal counterintuitive baggage. The<br />

main problem with additivity is that sometimes we may have very little confidence<br />

in either p or ¬p, but we are certain that p ∨ ¬p. If we measure confidence by the<br />

lower bound on these probability intervals, this is all possible on our theory. Our<br />

technical apparatus removes much of the mystery behind Keynes’s theory, and fends<br />

off an important objection of Ramsey’s.<br />
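These set-valued operations are easy to compute with. In the sketch below each member of S is represented just by the pair of conditional probabilities it assigns; the numbers are my own illustration:<br />

```python
# Each Pr in S is represented by the pair (Pr(p|q), Pr(r|s)).
S = [(0.2, 0.5), (0.3, 0.6), (0.4, 0.7)]

def plus(pairs):
    """α + β = {Pr(p|q) + Pr(r|s) : Pr in S} (rounded to tame float noise)."""
    return {round(a + b, 10) for a, b in pairs}

def times(pairs):
    """α · β = {Pr(p|q) · Pr(r|s) : Pr in S}."""
    return {round(a * b, 10) for a, b in pairs}

print(sorted(plus(S)))   # [0.7, 0.9, 1.1]
print(sorted(times(S)))  # [0.1, 0.18, 0.28]

# Additivity: pairing each Pr(p|q) with its own Pr(¬p|q) = 1 − Pr(p|q)
# collapses the sum to {1}, even though p/q itself is non-numerical.
S2 = [(x, 1 - x) for x in (0.2, 0.3, 0.4)]
print(plus(S2))  # {1.0}
```

The crucial point is that the same Pr supplies both coordinates, which is why p/q + ¬p/q collapses to {1} even when neither summand is numerical.<br />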

The most famous of Keynes’s conceptual innovations in the TP is his introduction<br />

of ‘weight’. He does this in the following, relatively opaque, paragraph.<br />

As the relevant evidence at our disposal increases, the magnitude of the<br />

probability of the argument may either decrease or increase, according<br />

as the new knowledge strengthens the unfavourable or the favourable<br />

evidence; but something seems to have increased in either case, – we have<br />

a more substantial basis upon which to rest our conclusion. I express<br />

this by saying that an accession of new evidence increases the weight of<br />

an argument. New evidence will sometimes decrease the probability of<br />

an argument, but it will always increase its ‘weight’ (Keynes, 1921, 77,<br />

italics in original).<br />

The idea is that p/q measures how the evidence in q is balanced between supporting<br />

p and supporting ¬p. The concept of weight is needed if we want to also know<br />

how much evidence there is. Note that weight only increases when relevant evidence<br />

comes in, not when any evidence comes in. The weight of the argument from my<br />

evidence to “Oswald killed JFK” is not increased when I discover the Red Sox won<br />

last night.<br />

The simplest definition of relevance is that new evidence e is irrelevant to p given<br />

old evidence q iff p/(q ∧ e) = p/q, and relevant otherwise. Now there is a problem.<br />

Two pieces of evidence e1 and e2 can be irrelevant taken together, but relevant taken<br />

separately. For a general example, let e1 be p ∨ r and e2 be ¬p ∨ r, for almost any<br />

proposition r. If I receive e1 and e2 sequentially, the weight of the argument from<br />

my evidence to p will have increased twice as I receive these new pieces of evidence.<br />

So it must be higher than it was when I started. But if I just received the two pieces<br />

of evidence at once, as one piece of evidence, I would have properly regarded it as<br />

irrelevant. Hence the weight in question would be unchanged. So it looks as if weight<br />

depends implausibly not on what the evidence is, but on the order in which it was<br />

obtained.<br />
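The order-dependence puzzle can be checked with a small numerical model. The following sketch uses a uniform four-world model over p and r (the uniform prior is my illustrative assumption, not from the text): each of e₁ = p ∨ r and e₂ = ¬p ∨ r shifts the probability of p on its own, yet their conjunction, being logically equivalent to r, leaves it untouched.<br />

```python
from fractions import Fraction

# A toy model to check the relevance puzzle: four equiprobable worlds over
# p and r, so p and r are independent with Pr(p) = Pr(r) = 1/2.
# (The uniform model is an illustrative assumption, not from the text.)
worlds = [(p_val, r_val) for p_val in (True, False) for r_val in (True, False)]
pr = {w: Fraction(1, 4) for w in worlds}

def prob(event):
    return sum(pr[w] for w in worlds if event(w))

def cond(event, given):
    return prob(lambda w: event(w) and given(w)) / prob(given)

p = lambda w: w[0]
e1 = lambda w: w[0] or w[1]            # e1 = p ∨ r
e2 = lambda w: (not w[0]) or w[1]      # e2 = ¬p ∨ r
both = lambda w: e1(w) and e2(w)       # e1 ∧ e2, logically equivalent to r

print(prob(p))       # 1/2 : the prior
print(cond(p, e1))   # 2/3 : e1 alone shifts the probability, so it is relevant
print(cond(p, e2))   # 1/3 : e2 alone shifts it the other way, so it is relevant
print(cond(p, both)) # 1/2 : together they are equivalent to r, hence irrelevant
```

Any independent p and r would behave the same way, which is why the problem generalises to "virtually any" evidence proposition.<br />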

Keynes avoids this implausibility by tightening up the definition of irrelevance.<br />

He says that e is irrelevant to p/q iff there are no propositions e₁ and e₂ such that e is<br />

logically equivalent to e₁ ∧ e₂ and either e₁ or e₂ is relevant to p/q. Unfortunately, as<br />

I noted in the previous paragraph, for virtually any such evidence proposition there<br />

will be such propositions e₁ and e₂. This was first noticed by Carnap (1950). Keynes,<br />

had he noticed this, would have had three options. He could have conceded that everything<br />

is relevant to everything, including last night’s baseball results to the identity


Keynes, Uncertainty and Interest Rates 534<br />

of Kennedy’s assassin; he could have conceded that the order in which evidence appears<br />

does matter, or he could have given up the claim that new relevant evidence<br />

always increases the weight of arguments.<br />

The last option is plausible. Runde (1990) defends it, but for quite different reasons.<br />

He thinks weight measures the ratio of evidence we have to total evidence we<br />

believe is available. Since new evidence might lead us to believe there is much more<br />

evidence available than we had previously suspected, the weight might go down. I<br />

believe it holds for a quite different reason, one borne out by Keynes’s use of uncertainty<br />

in his economics. In The General Theory (Keynes (1936), hereafter GT ) Keynes<br />

stresses the connection between uncertainty and ‘low weight’ (GT : 148n). If we regard<br />

p as merely risky the weight of the argument from our evidence to p is high, if<br />

we regard p as uncertain the weight is low. In the Quarterly Journal of Economics article<br />

he argues that gambling devices are, or can be thought to be, free of uncertainty,<br />

whereas human actions are subject to uncertainty. So the intervention of humans can<br />

take a situation from being risky to being uncertain, and hence decrease the weight<br />

in question.<br />

For example, imagine we are playing a rather simple form of poker, where each<br />

player is dealt five cards and then bets on who has the best hand. Before the bets start,<br />

I can work out the chance that some other player, say Monica, has a straight. So my<br />

credence in the proposition Monica has a straight will be precise. But as soon as the<br />

betting starts, my credence in this will vary, and will probably become imprecise.<br />

Do those facial tics mean that she is happy with the cards or disappointed? Is she<br />

betting high because she has a strong hand or because she is bluffing? Before the<br />

betting starts we have risk, but no uncertainty, because the relevant probabilities are<br />

all known. After betting starts, uncertainty is rife.<br />

The poker example supports my analysis of weight. If weight of argument rises<br />

with reduction of uncertainty, then in some rare circumstances weight of arguments<br />

decreases with new evidence. Let [x₁, x₂] be the set {x : Pr(p|q) = x for some Pr ∈ S},<br />

where S is the agent’s representor. Then the weight of the argument<br />

from q to p, for this agent, is 1 − (x₂ − x₁). That is, the weight is one when the agent<br />

has a precise degree of belief in p, zero when she is totally uncertain, and increases<br />

as the interval [x₁, x₂] narrows. Now in most cases new relevant evidence will<br />

increase the weight, but in some cases, like when we are watching Monica, this will<br />

not happen. I follow Lawson (1985) in saying that p is uncertain for an agent with<br />

evidence q iff p/q is non-numerical, i.e. iff the weight of the argument from q to p is<br />

less than one. Hence we get the connection between uncertainty and weight Keynes<br />

wanted. I also claim that the bigger x₂ − x₁ is, the more p/q is unlike a real number,<br />

the more uncertain p is. Keynes clearly intended uncertainty to admit of degrees<br />

(Keynes, 1937b), so this move is faithful to his intent.<br />
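The proposed measure is simple enough to state as a one-line function. This is a sketch of the definition just given, with illustrative representors (the sample numbers are my assumptions, not from the text):<br />

```python
from fractions import Fraction as F

# A sketch of the weight measure proposed in the text: if Pr(p|q) takes the
# values spanning [x1, x2] across the agent's representor S, then
# weight = 1 - (x2 - x1). The sample representors are illustrative assumptions.

def weight(credences):
    """credences: the set {Pr(p|q) : Pr in S}, given as a list of numbers."""
    x1, x2 = min(credences), max(credences)
    return 1 - (x2 - x1)

print(weight([F(1, 2)]))                      # 1   : precise credence, maximal weight
print(weight([F(3, 10), F(1, 2), F(7, 10)]))  # 3/5 : vague over [0.3, 0.7]
print(weight([F(0), F(1)]))                   # 0   : total uncertainty
```

On this measure, watching Monica bet widens the interval and so lowers the weight, which is exactly the case where new relevant evidence decreases rather than increases it.<br />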

Keynes’s theory of probability is based around some non-numerical values whose<br />

nature and behaviour is left largely unexplained, and a concept of weight which is<br />

subject to a telling and simple objection. Nevertheless, his core ideas, that probabilities<br />

can but need not be precise, and that we need a concept like weight as well as<br />

just probability, both seem right for more general reasons. Hence the theory here,



which captures the Keynesian intuitions while explaining away his mysterious non-numerical<br />

values and making the concept of weight more rigorous, looks to be as<br />

good as it gets for a Keynesian theory of uncertainty.<br />

One particularly attractive feature of the account is how conservative it is at the<br />

technical level. We do not need to change our logic, change which things we think are<br />

logical truths, or which things follow from which other things, in order to support<br />

our account of uncertainty. This is in marked contrast to accounts based on fuzzy<br />

logic or on logics of vagueness. Not only are such changes in the logic unmotivated,<br />

they appear to lead to mistakes. No matter how uncertain we are about how the<br />

stock will move over the day, we know it will either close higher or not close higher;<br />

and we know it will not both close higher and not close higher. The classical laws<br />

of excluded middle and non-contradiction seem to hold even in cases of massive uncertainty.<br />

This seems to pose a serious problem for theories of uncertainty based on<br />

alternative logics. The best approach is one, like the theory here, which is innovative<br />

in how it accounts for uncertainty, and conservative in the logic it presupposes.<br />

So as a theory of uncertainty I think this account has a lot to be said for it. However,<br />

it cannot support the economic arguments Keynes rests on it.<br />

3 The Economic Consequences of Uncertainty<br />

Uncertainty can impact on the demand for an investment in two related ways. First,<br />

it can affect the value of that particular investment; secondly, it can affect the value<br />

of other things which compete with that investment for capital. The same story is<br />

true for investment as a whole. First, uncertainty may reduce demand for investment<br />

directly by making a person who would otherwise be tempted to invest more cautious<br />

and hence reluctant to invest. Secondly, if this direct impact is widespread enough, it<br />

will increase the demand for money, and hence its price. But the price of money is<br />

just the market rate of interest. And the return that an investment must be expected<br />

to make before anyone, even an investor not encumbered by uncertainty, will make<br />

it is the rate of interest.<br />

When uncertainty reduces investment by increasing interest rates, I will say it has<br />

an indirect impact on investment. Keynes has an argument for the existence of this<br />

indirect impact. First, he takes the amount of consumption as a given (GT : 245).<br />

Or more precisely, for any period he takes the amount of available resources that<br />

will not be allocated to consumption as a given. There are three possible uses for<br />

these resources: they can be invested, they can be saved as bonds or loans, or they<br />

can be hoarded as money. There are many different types of investment, but Keynes<br />

assumes that any agent will already have made their judgement as to which is the best<br />

of these, so we need only consider that one. There will also be many different length<br />

bonds which the agent can hold. So as to simplify the discussion, Keynes proposes<br />

just treating these two at a time, with the shorter length bond called ‘money’ and<br />

the longer length loan called ‘debts’ (GT : 167n). Hence the rate of interest is the<br />

difference between the expected return of the shorter bond over the life of the longer<br />

bond and the return of the longer bond. So the rate of interest that we are interested<br />

in need not be positive, and when the two bond lengths are short will usually be



zero. However, it is generally presumed in discussions that the rate is positive. Now,<br />

Keynes assumes that an agent will only allocate resources to investment if investment<br />

looks to be at least as worthwhile as holding money, and at least as worthwhile as<br />

holding debts. In other words, he makes the standard reduction of n-way choice to a<br />

set of 2-way choices.³ Usually if someone is of a mind to invest they will not favour<br />

holding money over holding debts. The only motivation for holding money, given<br />

positive interest rates, could be a desire to have accessible command over purchasing<br />

power, and investment foregoes that command. So in practice we only need look<br />

at two of the three possible pairwise choices here. Hence I will ignore the choice<br />

between investing and holding money, and only look at the money-debt choice and<br />

the debt-investment trade-off.<br />

Holding a debt provides a relatively secure return in terms of money. Relatively<br />

secure because there is the possibility of default. In practice, this means that there is<br />

not a sharp distinction between debts and investments, rather a continuum with say<br />

government bonds at one extreme and long-term derivatives at the other. Some activities<br />

that have the formal structure of ‘debts’, like say provision of venture capital,<br />

will be closer to the investment end of the continuum. Unlike debts then, investments<br />

as a rule do not have a secure return in terms of money. In most cases they<br />

do not even have a precise expected return (GT : 149; Keynes (1937b, 113)). Keynes<br />

does not presume that this means that people never invest unless the expected return<br />

on the investment is greater than the expected (indeed, known) return on debts. He<br />

says explicitly that were this true then ‘there might not be much investment’. Instead,<br />

he says that investment under uncertainty depends on ‘confidence’ (GT : 150).<br />

Therefore, the following looks compatible with his position.<br />

Bayesians say that each gamble has a precise expected value. The expected return<br />

on a bet that pays $1 if some fair coin lands heads is 50 cents. On the theory<br />

defended here, by contrast, expected values can be imprecise, because probabilities are imprecise. Formally, say<br />

E_Pr(G) = α means that the expected return on G according to probability function<br />

Pr is α. Roughly, the expected value for an agent of a gamble G will be {x : ∃Pr ∈<br />

S : E_Pr(G) = x}, the set of expected values of the bet according to each probability<br />

function in the agent’s representor. Note that these are different from the possible<br />

outcomes of the bet. As we saw in the case of the coin, expected value can differ<br />

from any possible value of the bet. So let the expected value of investing a certain<br />

sum be [α,β], and the expected value of buying a debt with that money be χ. Then<br />

the agent will invest iff (1 − ρ)α + ρβ ≥ χ, where ρ ∈ [0,1] measures the ‘state of<br />

confidence’.⁴ Now when a crisis erupts, ρ will go to 0, and investment will dry up.<br />

In such cases the decision theory is similar to the one advanced by Levi (1980), Strat<br />

(1990) and Jaffray (1994). Since we are interested in a theory of unemployment, we<br />

³ Standard, but I bring it up because the modern theorist whose decision theory is closest to the one<br />

Keynes seems to adopt, Levi, explicitly rejects it.<br />

⁴ In case the reader fears I am being absurdly formal with an essentially informal idea, Keynes had such<br />

a variable, there described as measuring the ‘state of the news’, in early drafts, but it did not survive to the<br />

final stage. So my proposal is not a million miles from what Keynes intended merely by virtue of being<br />

algebraic.



are primarily interested in the cases where ρ is quite low, in which cases we can say<br />

uncertainty is reducing investment.<br />

That last statement might seem dubious at face value. In part, what I mean by<br />

it is this. When ρ is low the value of a set of bets will in general be more than the<br />

sum of the value of the bets taken separately. Because individual investors are fearful<br />

of exposure to uncertainty, which is presumably what ρ being low means, sets<br />

of investments which if undertaken collectively would be profitable (and everyone<br />

agrees that they would) will not be undertaken individually. This suggests a reason<br />

that theorists have thought government intervention might be appropriate in times<br />

of crisis. Alternatively, if ρ is low then the value of an investment, how much we<br />

will be prepared to pay for it, will probably be lower than our best estimate of its<br />

expected return, assuming the latter to be near (α + β)/2.<br />
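The confidence-weighted rule can be put in a few lines of code. This is a sketch only; the rule is the one stated above, but the particular returns and confidence levels are illustrative assumptions, not from the text:<br />

```python
# A sketch of the investment rule in the text, with illustrative numbers:
# invest iff (1 - rho)*alpha + rho*beta >= chi, where [alpha, beta] is the
# vague expected return on the investment, chi is the (known) return on
# debts, and rho in [0, 1] is the 'state of confidence'.

def invests(alpha, beta, chi, rho):
    certainty_equivalent = (1 - rho) * alpha + rho * beta
    return certainty_equivalent >= chi

# The same investment and the same debt, judged under different confidence:
print(invests(alpha=0.02, beta=0.10, chi=0.05, rho=0.9))  # True : confident times
print(invests(alpha=0.02, beta=0.10, chi=0.05, rho=0.1))  # False: confidence collapses
```

Nothing about the investment or the debt changes between the two calls; only ρ does, which is the sense in which a collapse of confidence alone can dry up investment.<br />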

I shall focus more closely on the indirect effects of uncertainty in section 5. The<br />

central idea is that the rate of interest, being the price of money, is completely determined<br />

in the market for money. However, this market has some rather strange<br />

properties. After all, money is barren, and it can generally be traded for something<br />

that is not barren. So, as Keynes puts it, why would anyone ‘outside a lunatic asylum’<br />

want it? Why would the demand for money not drop to zero as soon as the<br />

rate of interest is positive?<br />

Because, partly on reasonable and partly on instinctive grounds, our desire<br />

to hold money as a store of wealth is a barometer of the degree of<br />

our distrust of our own calculations and conventions concerning the future<br />

... The possession of actual money lulls our disquietude; and the<br />

premium which we require to make us part with money is the measure<br />

of the degree of our disquietude (Keynes, 1937b, 116).<br />

Therefore, more uncertainty means more demand for money means higher interest<br />

rates. The rest of the story is standard. Even the confident agent will be disinclined<br />

to invest once the rate of interest rises. Using the little decision theory outlined<br />

above, more uncertainty means the gap between α and β grows, which if ρ is low<br />

will tend to reduce (1−ρ)α +ρβ, the ‘certainty equivalent’ of the expectation of the<br />

investment’s worth. On the other hand, uncertainty on the part of the community<br />

will tend, for similar reasons, to increase χ . Either way, investment suffers, and hence<br />

so does employment.<br />

4 Uncertainty and Money<br />

There is something very odd about all that we have done so far. Agents react to<br />

uncertainty by making their returns measured in dollars more stable. However, in<br />

doing so they make their returns measured in any other good less stable. If you have<br />

no idea what the price of widgets will be in twelve months’ time, then holding only<br />

widgets increases the uncertainty about how many dollars you will be worth then.<br />

However, it makes you more certain about how many widgets you will be worth.<br />

Why this preference for money? We deserve an explanation as to why one kind of<br />

uncertainty is given such a central place and other kinds are completely ignored.



Keynes has one explanation. He argues, or perhaps assumes, essentialism about<br />

money. Indeed the title of chapter 17 of The General Theory is ‘The Essential Properties<br />

of Interest and Money’. These essential properties are entirely functional. As<br />

Hicks puts it, “Money is defined by its functions ... money is what money does”<br />

(Hicks, 1967, 1). The explanation is that agents try to minimise uncertainty relative<br />

to whatever plays the functional role of money. Therefore, the explanation does not<br />

rely on any mystical powers of dollar bills. Rather, the work is done by the functional<br />

analysis of money.<br />

As a first approximation, the functional role money plays is that it is a medium<br />

of exchange. Keynes does not think this is quite the essential property; rather he says<br />

that money is essentially ‘liquid’, and perceived to be liquid. This means that if we<br />

hold money we are in a position to discharge obligations and make new purchases as<br />

they seem appropriate with greatest convenience and least cost. Even this is not what<br />

is given as the official essential property of money. To make the proof that demand<br />

for money is not demand for labour easier Keynes takes the essential properties of<br />

money to be its negligible elasticities of production and substitution. However, he<br />

makes clear that these are important because of their close connection to liquidity<br />

(GT : 241). Indeed, when he comes to define a non-monetary economy, he simply<br />

defines it as one where there is no good such that the benefits it confers via its liquidity,<br />

its ‘liquidity premium’, exceed the carrying costs of the good. So the properties<br />

of having a negligible elasticity of production and substitution seem necessary but<br />

insufficient for something to be money.<br />

The reason that money uncertainty is more problematic than widget uncertainty<br />

is just that money is liquid. At the end of the day, the point of holding investments,<br />

bonds or money is not to maximise the return in terms of such units; it is to be used<br />

somehow for consumption. Hence, we prefer, ceteris paribus, to store wealth in ways<br />

that can be easily exchanged for consumption goods as and when required. Further,<br />

we may be about to come across more information about productive uses for our<br />

wealth, and if we do, we would prefer to have the least inconvenience about changing<br />

how we use wealth. Money is going to be the best store of wealth for each of these<br />

purposes. The strength of these preferences determines the liquidity premium that<br />

attaches to money.<br />

So Keynes’s story here is essentially a ‘missing markets’ story. If there were markets<br />

for every kind of transaction there would be no liquidity premium attaching to<br />

money, and hence no reason to be averse to uncertainty in terms of money returns<br />

as opposed to uncertainty in terms of returns on X’s shares. There is a methodological<br />

difference here between decision theorists and economists. In decision theory it is<br />

common to specify what choices an agent does have. These will usually be finite, or<br />

at least simply specified. In economics it is more common to specify what choices an<br />

agent does not have, which markets are ‘missing’. In a sense the difference is purely<br />

cosmetic, but it can change the way problems are looked at. Since Keynes requires<br />

here some markets to be missing, it might be worth investigating what happens here<br />

from the more restrictive framework ordinarily applied in decision theory.<br />

In some decision-theoretic contexts, we can prefer liquidity even when we are<br />

completely certain about what our choices are and what their outcomes will be. Say



we are in a game where the object is to maximise our money over 2 days. We start<br />

with $100. On day 1, we have a choice of buying for $100 a ticket that will pay $200 at<br />

the end of day 2, and is non-transferable, or doing nothing. On day 2, if we still have<br />

our $100, we can buy with it a voucher which pays $300 at the end of day 2, or do<br />

nothing. Obviously, the best strategy is to do nothing on day 1, and buy the voucher<br />

on day 2. The point is just that money here has enough of a liquidity premium on<br />

day 1 that we are prepared to hold it and earn no interest for that day rather than<br />

buy the ticket (or two day bond) which will earn interest. So uncertainty is not a<br />

necessary condition for liquidity premia to exist. On the other hand, perhaps it is<br />

necessary for liquidity premia to exist in a world something like ours, where agents<br />

neither have all the choices they would have in a perfect market, nor as few as in<br />

this simple game. If we added a market for tickets and vouchers to our simple game<br />

the prices would be fixed so that money would lose its liquidity premium. Keynes<br />

suggests something like this is true for the worlds he is considering: “uncertainty as to<br />

the future course of the rate of interest is the sole intelligible explanation of the type<br />

of liquidity preference [under consideration]” (GT : 201). However, here he merely<br />

means lack of certainty; there is no proof that if every agent had precise credences<br />

liquidity preference ought to disappear. So it looks like uncertainty in the sense<br />

discussed here, vague reasonable beliefs, does no theoretical work. Perhaps this is a<br />

bit quick, as the little game I considered is so far from a real-life situation. So I will<br />

look more closely at the effects uncertainty is supposed to have. Since it has received<br />

the bulk of the theoretical attention, I start with the indirect effects of uncertainty.<br />
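The two-day game described above can be checked by enumerating all four strategies. This sketch follows the payoffs given in the text; the function and variable names are mine:<br />

```python
# A sketch of the two-day game described above, with the payoffs from the
# text: start with $100; the day-1 ticket costs $100 and pays $200 at the end
# of day 2; the day-2 voucher costs $100 and pays $300 at the end of day 2.

def final_wealth(buy_ticket_day1, buy_voucher_day2):
    cash = 100
    if buy_ticket_day1:
        cash -= 100  # the ticket is non-transferable, so this money is locked up
    # The voucher can only be bought if the $100 is still liquid on day 2.
    bought_voucher = buy_voucher_day2 and cash >= 100
    if bought_voucher:
        cash -= 100
    payoffs = (200 if buy_ticket_day1 else 0) + (300 if bought_voucher else 0)
    return cash + payoffs

strategies = {(t, v): final_wealth(t, v)
              for t in (False, True) for v in (False, True)}
print(strategies)
# Staying liquid on day 1 and buying the voucher dominates: it yields $300,
# against $200 for any strategy that buys the ticket.
print(max(strategies, key=strategies.get))  # (False, True)
```

The point survives the enumeration: money carries a liquidity premium on day 1 even though every payoff is known with certainty, so uncertainty is not necessary for liquidity premia.<br />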

5 Uncertainty and Liquidity Preference<br />

Keynes thinks the question of why money is demanded at all, why we do not all move<br />

from holding money into holding debts as soon as the rate of interest goes positive,<br />

needs answering. And he thinks the answer here will be particularly relevant to<br />

theories about the rate of interest. If the market in general is at equilibrium then the<br />

market in trades between any two goods must also be in equilibrium; in particular it<br />

cannot be that there are people holding money who would be prepared to buy debts<br />

at the current interest rate. So if the equilibrium interest rate is positive, there must<br />

be some people who would prefer to hold money than hold debts. This fact Keynes<br />

takes to be central to the correct theory of the rate of interest. Hence, to determine<br />

what the rate of interest will be, and what will cause it to change, I need to determine<br />

what causes a demand for money.<br />

Keynes distinguishes four motives for holding money (GT : Ch. 13; (Keynes,<br />

1937a, 215-223)). Two of these, the transactions motive and the finance motive, need<br />

not detain us. They just relate to the need to make payments in money and on<br />

time. The third, the speculative motive, is often linked to uncertainty, and indeed<br />

Keynes does so (GT : 201). But ‘uncertainty’ here is just used to mean absence of<br />

certainty, that is the existence of risk, which as noted above is not how I am using<br />

‘uncertainty’. As Runde (1994b) points out, an agent who is certain as to future<br />

movements in interest rates may still hold money for speculative reasons, as long as<br />

other agents who are not so certain have made mistaken judgements. The fourth



motive will hold most of my attention. Keynes argues that we may hold money for<br />

purely precautionary reasons.<br />

To provide for contingencies requiring sudden expenditure and for unforeseen<br />

opportunities of advantageous purchases, and also to hold an<br />

asset of which the value is fixed in terms of money to meet a subsequent<br />

liability fixed in terms of money, are further motives for holding cash<br />

(GT : 196).<br />

Davidson (1988, 1991) justifies this as follows. Uncertainty arises whenever agents<br />

do not have sufficient knowledge to calculate the numerical probability of an event.<br />

This is given a rather frequentist gloss in Davidson, but that is not necessary. His<br />

idea is that we know what the probability of p is when we know the frequency<br />

of p-type events in the past and we know the future will resemble the past in this<br />

respect. The latter is cashed out as saying p is governed by an ‘ergodic process’. We<br />

can replace all this by saying that p is subject to uncertainty whenever we do not<br />

know its objective chance, whether or not objective chance ought to be analysed by<br />

frequentist approaches. Davidson then argues that since for most p we do not have<br />

this knowledge, we have to adopt ‘sensible’ approaches like holding money.<br />

Runde (1994b) objects that Davidson’s story is incoherent. On Davidson’s theoretical<br />

story there are only two epistemic states relative to p that are possible. An<br />

agent can know the chance of p, in which case their credence is set equal to it, or<br />

they are completely uncertain about it. In the latter case there can be no reason for<br />

taking some action rather than another. Now the reason that it is ‘sensible’ to hold<br />

money is that we expect money to be liquid. However, we do not know the chance<br />

of money remaining liquid; whether or not money remains liquid is not determined<br />

by an ergodic process. Hence, we have no reason for letting that partial belief be a<br />

guide to action.<br />

This is a fair criticism, but it can be met by amending the theory rather than<br />

by giving it up. On my theory, if an agent knows the chance of p they will have a<br />

precise degree of belief in p. When they do not their degree of belief will, in general,<br />

be vague but not totally vague. As with Keynes, I have uncertainty come in degrees.<br />

This amendment is enough to rescue Davidson’s theory. An agent might not know<br />

the chance that money will become illiquid in the next short period of time, but they<br />

might know enough for it to be reasonable to have a credence in that proposition<br />

which is vague over some small interval close to zero. It may still be sensible to hold<br />

some money even when the expected return on other investments really is vague.<br />

But is it sensible to prefer fixed to uncertain returns? In other words, is there a direct<br />

effect of uncertainty that makes people prefer bonds to investments?<br />

6 Uncertainty and Indecision<br />

As Keynes repeatedly stressed, investment is not like a game of chance where the<br />

expected results are known in advance. And this is part of the explanation for the<br />

extreme instability in investment levels compared to other economic variables.



The state of long-term expectation ... does not solely depend on the<br />

most probable forecast we can make. It also depends on the confidence<br />

with which we make this forecast (GT : 148).<br />

Human decisions affecting the future, whether personal or political or<br />

economic, cannot depend on strict mathematical expectation, since the<br />

basis for making such calculations does not exist ... it is our innate urge to<br />

activity which makes the wheels go round, our rational selves choosing<br />

between the alternatives as best we are able, calculating where we can,<br />

but often falling back for our motive on whim or sentiment or chance<br />

(GT : 162-3).<br />

The most charitable reading of Keynes here is to say he agreed, in principle, with<br />

what is sometimes referred to as a Hurwicz-style decision rule. If the expected return<br />

of an investment is vague over [α,β] then its ‘value’ is given by (1 − ρ)α + ρβ,<br />

where ρ ∈ [0,1] is a measure of confidence. By the 1937 article, he has become more<br />

interested in the special case where confidence has collapsed and ρ is approaching<br />

0. This interpretation would explain all his references to decision-making under uncertainty<br />

in The General Theory and subsequent discussion, provided we make the<br />

safe assumption that ‘cold calculation’ would only have us spend x on an investment<br />

with expected return [α,β] when α ≥ x. In particular, any interpretation of the underlying<br />

decision theory here will have to give some role to ‘whim or sentiment or<br />

chance’, and I give it a variable, ‘ρ’. With this theory, I have the extensions needed to<br />

avoid Runde’s objection to Davidson. I have a continuum of degrees of uncertainty,<br />

rather than a raw dichotomy, and I have an explanation of why it is ‘sensible’ to prefer<br />

gambles with known expected returns, at least when ρ is relatively low.<br />

This theory is meant to serve two related purposes. It is meant to show why we<br />

might prefer money to debts, even though our best estimate of the expected return<br />

of the debts is positive, and again it is meant to show why we might prefer debts to<br />

investments even when our best estimate of the expected return of the investment is<br />

higher. And I think if the decision rule stipulated were plausible, it would show that<br />

uncertainty did have an economic effect. In particular, I think it would show both<br />

that in times of crises when ρ heads down, the level of investment will decrease even<br />

with other things being equal, and that collective action can be justified even when<br />

individual action is not. That is, the government can make sets of investments that<br />

are expected to be profitable although none of the individual investments is expected<br />

to be profitable.<br />

The decision theory does not, however, seem plausible. First, there are some<br />

technical problems for this theory. The problem is that if ρ < 1/2, then in cases where<br />

uncertainty is guaranteed to increase in the near future, agents following this rule will<br />

make decisions they are sure to regret. For example, assume an agent with ρ = 1/3<br />

now has credence 1/2 in p, but knows that some evidence will come in such that her<br />

credence in p will become vague over [0.3,0.7] whatever the result of the experiment.<br />

As we saw in the case of poker players, this is plausible in some situations. The agent<br />

will now pay 50 cents for a bet which pays $1 if p and nothing otherwise, but after



the evidence comes in she’ll sell that bet for about 44 cents, incurring a sure loss. I<br />

leave it to the reader to judge the importance of these technical problems, given the<br />

rarity of cases where uncertainty is guaranteed to rise.<br />
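The arithmetic of the sure-loss case is worth working through. This sketch uses the numbers from the text (ρ = 1/3, initial credence 1/2, post-evidence credence vague over [0.3, 0.7]); the helper function is my own framing of the rule:<br />

```python
from fractions import Fraction as F

# A sketch of the sure-loss problem, using the numbers from the text:
# rho = 1/3, an initial precise credence of 1/2 in p, and a post-evidence
# credence vague over [0.3, 0.7].

def value(alpha, beta, rho):
    # The rule's 'certainty equivalent' of a $1 bet on p.
    return (1 - rho) * alpha + rho * beta

rho = F(1, 3)
before = value(F(1, 2), F(1, 2), rho)    # precise credence: the interval collapses
after = value(F(3, 10), F(7, 10), rho)   # vague credence after the evidence

print(before)                 # 1/2  : she pays 50 cents for the bet
print(after)                  # 13/30: she then values the bet at roughly 43-44 cents
print(float(before - after))  # a guaranteed loss of about 7 cents
```

The loss is guaranteed because the credence becomes vague over [0.3, 0.7] whatever the evidence turns out to be, so the drop in the rule's valuation is foreseeable in advance.<br />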

There is also a philosophical problem. What precisely is ρ supposed to represent?<br />

If it is some kind of belief, its effects should have been incorporated into the credences.<br />

If it is some kind of desire its effects should have been incorporated into the<br />

evaluation of each of the states. This objection could be avoided, perhaps, if Keynes<br />

was trying to argue against the theory that investors just maximise dollar expected<br />

returns. It is not entirely clear whom Keynes thinks he is arguing against at some<br />

points. If this is his enemy, he is fighting a straw man, one who is vulnerable to much<br />

simpler objections. Whoever thought that all investment is profit driven, that no one<br />

ever went into business because they thought it would be fun to run a newspaper?<br />

Keynes’s only viable opponents here are saying that investors calculate the expected<br />

return, in utils, of each possible investment and choose the one whose returns are<br />

highest. Now perhaps for many, dollar returns are the most important factor in determining<br />

util returns, but this is certainly not the only cause.<br />

If ρ represents something which is neither a belief nor a desire, then it is hard to<br />

see what effect it could have on action. Perhaps there are some exceptions to the rule<br />

that actions are caused only by beliefs and desires combining in the right way, such as<br />

actions caused by values, but these appear irrelevant to Keynes’s considerations, and<br />

he does not appeal to such exemptions. After all, he describes investment decisions<br />

made where the ‘cold calculations’ do not determine what should be done as being<br />

made by ‘whim or sentiment or chance’. Now whims and sentiments are surely<br />

desires, although chance is in a different boat. If he had just said ‘chance’ here he may<br />

have committed himself to a different decision theory, one where the agent can under<br />

uncertainty make any decision which is not known to be sub-optimal. But this does<br />

not justify the conclusion that uncertainty decreases investment; under that theory<br />

it is random whether uncertainty increases or decreases investment. Hence Keynes<br />

appears to be implausibly committed to a mental state which is neither a belief nor a<br />

desire but affects action.<br />

It might be objected here that I am relying on an overly individualistic theory<br />

of motivation; that what Keynes is committed to is nothing more than what anyone<br />

who has learned the difference between Robinson Crusoe economics and real-world<br />

economics would believe. There is an important truth behind this objection: the<br />

social causes of action cannot be overlooked. But this is not what I have done. The<br />

core assumption I made is that the only mental states relevant to action are beliefs and<br />

desires. Now the beliefs and desires that are relevant may not be (directly) concerned<br />

with the action at hand; they may be beliefs and desires about how society will view<br />

this action, or about similar actions which may or may not be performed by other<br />

members in society. And the beliefs and desires may not have as their immediate<br />

cause careful inference by the agent in question; they may be caused by the wave of<br />

panic or optimism in which the agent is caught up. In the real world, agents do not<br />

always change their beliefs and desires by reflection on new evidence; often emotion<br />

plays a larger role. So society has both evidential and non-evidential effects on action.<br />

But every time, the causal chain goes via the beliefs and desires of the agent. Society



causes actions by causing changes in the beliefs and desires of individuals. It is wrong<br />

to think that action is never caused by beliefs and desires about society, it is wrong<br />

to think that society never directly causes beliefs and desires which lead to action,<br />

but none of this implies that there can be mental states other than belief and desire<br />

relevant to action.<br />

7 Disquietude<br />

There are some comments from Keynes that suggest this reading is a little unfair.<br />

Rather than having a distinctive decision theory, he perhaps has a distinctive theory<br />

about what ought to enter into the decision-theoretic calculations. The standard theory<br />

for why there is a demand for insurance is the falling marginal utility of money.<br />

Agents purchase insurance, and accept a lower expected dollar return because with<br />

insurance their expected util return, at the end of the duration of the insurance, is<br />

higher than if they had not purchased. This is the story given in, for example, Friedman and Savage (1952), where the existence of demand for insurance is taken as evidence

for the declining marginal utility of money. But there is another reason agents<br />

might buy insurance. They might simply feel happier, over the duration of the insured<br />

period, knowing that they have insurance and are hence exposed to fewer risks<br />

or uncertainties than otherwise. If this is true then their expected ‘wealth’ in both<br />

dollars and utils at the end of a period might be lower if they insure than if otherwise,<br />

but it will be worthwhile because of the benefits during the period. Keynes suggests<br />

that this same desire for quietude can cause a demand for money. I presume, though<br />

it is not entirely clear, that this desire should be included within the precautionary<br />

motives for holding money.<br />

There are not two separate factors affecting the rate of investment,<br />

namely, the schedule of the marginal efficiency of capital [the expected<br />

return of investments] and the state of confidence. The state of confidence<br />

is relevant because it is one of the major factors determining the<br />

former (GT : 149).<br />

For the fact that each individual investor flatters himself that his commitment<br />

is “liquid” ... calms his nerves and makes him much more willing<br />

to run a risk (GT : 160).<br />

The possession of actual money lulls our disquietude; and the premium<br />

which we require to make us part with money is the measure of the<br />

degree of our disquietude (Keynes, 1937b, 116).<br />

A liquidity premium ... is not even expected to be rewarded. It is a<br />

payment, not for the expectation of increased tangible income at the end<br />

of the period, but for an increased sense of comfort and confidence during<br />

the period (Keynes, 1938a, 293-294).<br />

This explanation of the demand for certain returns is in some ways conservative and<br />

some ways radical. It is conservative because it does not immediately change the<br />

technical properties of preference. Many heterodox theories of preference drop such



theoretical restrictions as transitivity of preferences. By contrast the theory Keynes<br />

appears to be advocating is at least in principle conservative on this front. Agents are<br />

still going round maximising expected utility, just now it is expected utility over a<br />

period, not at the end of the period.<br />

But it is not all conservative. If we explain economic decisions in terms of the disquietude<br />

of the investor we discard the distinction between investment and consumption.<br />

It was always known that there were some goods that were not comfortably<br />

categorised, particularly cars, but this move makes every good in part a consumption<br />

good. If all this meant was that some helpful classifications have to be questioned, it<br />

would not be important. Rather, its importance flows from its implications for the<br />

norms for investment. It is always irrational to make an investment which will incur<br />

a sure loss. This principle is used to derive wide-ranging implications for decision theory.<br />

But it is not irrational to make a consumption decision which will result in<br />

sure loss at the end of a period in exchange for goods during that period. It is not<br />

always irrational to pay ten dollars for a movie ticket, even though this will incur a<br />

sure loss in the sense that the buyer will surely have less wealth at the end of the movie<br />

than if they had not bought the ticket.<br />

Given this, the technical complaint I raised against the Horvitz-style decision<br />

rule misses the target. And the philosophical concern about what ρ represents is<br />

irrelevant. If the expected returns only measure how much various gambles will be<br />

worth at the end of the period, then some desires have not yet been included in our<br />

calculations. That is, ρ represents some desires, but the theory is not guilty of double-counting.<br />

So far this all seems to work, and explain the role of uncertainty. Indeed, I<br />

think this is the best extension of Keynes’s views in this area.<br />

While there seem to be few theoretical objections which can be raised at this<br />

point, there is a rather telling empirical objection. The only role given to disquietude<br />

in this theory is in deciding between alternatives where the returns on at least one<br />

are uncertain. But it seems implausible that disquietude could have this effect, but<br />

have no effect when choices are being made between alternatives where at least one<br />

is risky. I doubt the feelings of disquiet would be any different were I to have a large<br />

fortune riding on a roulette wheel or a baseball game. Disquietude arises because we<br />

do not know what will happen; maybe for some people it is greater when we do not<br />

know the expected returns, but I doubt it. Again, perhaps there is an explanation for<br />

demand for money in the real world to be found here, but uncertainty plays no role<br />

in the story, or at best a small cameo.<br />

8 Summary<br />

Keynes argued that uncertainty has a major economic impact. By driving people to<br />

store their wealth in ways with more stable returns, it increases the demand for cash<br />

and decreases the demand for investments. Not only does it drive down investments<br />

in this direct way, the increased demand for cash leads to higher interest rates and<br />

hence people are driven out of investment into bonds. However, there are a few<br />

problems with the story. First, the motivation for demanding returns fixed with<br />

respect to a certain good can only be that the markets between that good and other



goods are more complete. But if that is the case there is a reason to demand that good<br />

even when the world is completely certain. Secondly, the only decision-theoretic<br />

justification for this demand for fixed returns could be the disquiet generated by not<br />

knowing the return. This follows from the formalisation of uncertainty advocated<br />

in sections 1 and 2. But this disquiet could just as easily be generated by risk as by<br />

uncertainty. So Keynes has not shown that uncertainty has any particular economic<br />

impact. That’s the bad news. The good news is that many of the arguments seem to<br />

work without the reliance on uncertainty.


Stalnaker on Sleeping Beauty<br />

The Sleeping Beauty puzzle provides a nice illustration of the approach to self-locating<br />

belief defended by Robert Stalnaker in Our Knowledge of the Internal World<br />

(Stalnaker, 2008a), as well as a test of the utility of that method. The setup of the<br />

Sleeping Beauty puzzle is by now fairly familiar. On Sunday Sleeping Beauty is told<br />

the rules of the game, and a (known to be) fair coin is flipped. On Monday, Sleeping<br />

Beauty is woken, and then put back to sleep. If, and only if, the coin landed tails,<br />

she is woken again on Tuesday after having her memory of the Monday awakening<br />

erased. 1 On Wednesday she is woken again and the game ends. There are a few questions<br />

we can ask about Beauty’s attitudes as the game progresses. We’d like to know<br />

what her credence that the coin landed heads should be<br />

(a) Before she goes to sleep Sunday;<br />

(b) When she wakes on Monday;<br />

(c) When she wakes on Tuesday; and<br />

(d) When she wakes on Wednesday?<br />

Standard treatments of the Sleeping Beauty puzzle ignore (d), run together (b) and (c)<br />

into one (somewhat ill-formed) question, and then divide theorists into ‘halfers’ or<br />

‘thirders’ depending on how they answer it. Following Stalnaker, I’m going to focus<br />

on (b) here, though I’ll have a little to say about (c) and (d) as well. I’ll be following<br />

orthodoxy in taking 1<br />

to be the clear answer to (a), and in taking the correct answers<br />

2<br />

to (b) and (c) to be independent of how the coin lands, though I’ll briefly question<br />

that assumption at the end.<br />

An answer to these four questions should respect two different kinds of constraints.<br />

The answer for day n should make sense ‘statically’. It should be a sensible<br />

answer to the question of what Beauty should do given what information she then<br />

has. And the answer should make sense ‘dynamically’. It should be a sensible answer<br />

to the question of how Beauty should have updated her credences from some earlier<br />

day, given rational credences on the earlier day.<br />

As has been fairly clear since the discussion of the problem in Elga (2000b), Sleeping<br />

Beauty is puzzling because static and dynamic considerations appear to push in<br />

different directions. The static considerations apparently favour a 1/3 answer to (b).

When Beauty wakes, there are three options available to her: It is Monday and the<br />

coin landed heads; It is Monday and the coin landed tails; It is Tuesday and the coin<br />

landed tails. If we can argue that each of those are equally probable given her evidence, we get the answer 1/3. The dynamic considerations apparently favour a 1/2 answer to (b). The right answer to (a) is 1/2. Nothing happens on Monday or Tuesday

† Penultimate draft only. Under review. Thanks to Adam Elga, Elizabeth Harman, Ishani Maitra, Ted Sider, Robert Stalnaker and Seth Yalcin for comments on an earlier, and mistake-riddled, draft.

1 Note that I’m not assuming that Beauty’s memories are erased in other cases. This makes the particular<br />

version of the case I’m discussing a little different to the version popularised in Elga (2000b). This<br />

shouldn’t make any difference to most analyses of the puzzle, but it helps to clarify some issues.



that surprises Beauty. And credences should only change if we are surprised. So the<br />

right answer to (b) is 1/2.

Since we must have harmony between dynamic and static considerations, one of<br />

these arguments must be misguided. (In fact, I think both are, to some degree.) These<br />

days there is a cottage industry of ‘thirders’ developing accounts of credal dynamics<br />

that accord with the 1/3 answer to (b). 2 But all of these accounts are considerably more

complex than the traditional, conditionalisation-based, dynamic theory that we all<br />

grew up with.<br />

Three of the many attractions of Robert Stalnaker’s new account of self-locating<br />

knowledge are (i) that it offers a way to answer all four of our questions about Sleeping<br />

Beauty, (ii) that it does so while remaining both statically and dynamically plausible,<br />

and (iii) that the dynamic theory involved is, in large part, traditional conditionalisation.<br />

I spend most of this note setting out Stalnaker’s account, and setting<br />

out his derivation of a 1/3 answer to (b). I conclude with some reasons for preferring a

slightly different solution of the Sleeping Beauty puzzle within the broad framework<br />

Stalnaker suggests.<br />

1 Stalnaker on Self-Location<br />

The picture of self-locating belief that we get from Lewis’s “Attitudes De Dicto and<br />

De Se” (Lewis, 1979a) has been widely adopted in recent years. 3 On Lewis’s picture,<br />

the content of an attitude is a set of centered worlds. For current purposes we’ll take<br />

centered worlds to be 〈world, agent, time〉 triples. To believe that S, where S is a<br />

set of centered worlds, is to believe that the triple 〈your world, you, now〉 ∈ S.<br />

The motivation for this picture comes from reflection on how to represent locational<br />

uncertainty. If you’re sure where in New York City you are, you can pick out<br />

a point on a map and say “I’m there”. If you’re not sure exactly where you are, but<br />

you have some information, you can pick out a region on the map and say “I’m somewhere<br />

in that region”. If you’re not sure who you are, but you know where everyone<br />

is, you can do the same kind of thing. And it’s plausible that this is a (somewhat) realistic<br />

situation. As one modern-day Lewisian, Andy Egan, puts it ‘I can believe that<br />

my pants are on fire without believing that Egan’s pants are on fire, and I can hope<br />

that someone turns a fire extinguisher on me right now without hoping that someone<br />

turns a fire extinguisher on Egan at 5:41pm' (Egan, 2004, 64). There is an important<br />

puzzle here that needs to be addressed, and can’t obviously be addressed in the framework<br />

Lewis accepted before 1979, where the content of a propositional attitude is a<br />

set of Lewisian concreta. If possible worlds are Lewisian concreta, then Lewisians<br />

like Egan are correct to respond to puzzles about location by saying, “sometimes (as<br />

when we want to know who or where we are) the world is not enough”. (Egan,<br />

2004, 64)<br />

But this response is too self-centered. Not all locational thoughts are self-locational<br />

thoughts. I can be just as uncertain about where that is as about where this<br />

2 See, for instance, Titelbaum (2008) and the references therein.

3 Including by me. See Egan et al. (2005).



is, or as uncertain about who you are as about who I am. Imagine I’m watching<br />

Egan’s unfortunate adventures with his infernal trousers on a delayed video tape. I<br />

can believe his pants are on fire without believing Egan’s pants are on fire, and hope<br />

that someone turns a fire extinguisher on him then without hoping someone turns a fire<br />

extinguisher on Egan at 5:41pm. Or, at least, that way of putting things sounds just<br />

as good as Egan’s original description of the case.<br />

For a different example, imagine I wake at night and come to believe it is midnight.<br />

As Lewis would represent it, I believe 〈w,me,now〉 ∈ {〈w, s, t〉 : t = midnight}.<br />

When I wake, I think back to that belief, and judge that I may have been mistaken.<br />

How should we represent this? Not that I now believe 〈w,me,now〉 ∉ {〈w, s, t〉 : t = midnight}. That's obviously true: I know the sun is up. We want to represent

something more contentious.<br />

The best, I think, the Lewisian can do is to pick out some description D of my<br />

earlier belief and say what I believe is 〈w,me,now〉 ∉ {〈w, s, t〉 : (ιx : D x)(x happens at midnight)}. That is, I believe the belief that satisfies D doesn't happen at midnight.

Is that good enough? Well, we might imagine the debate continuing with the anti-<br />

Lewisian proposing cases where D will not be unique (because of forgotten similar<br />

beliefs) or will not be satisfied (because of a misrecollection of the circumstances<br />

of the belief), and so this approach will fail. And we might imagine the Lewisian<br />

responding by complicating D, or by denying that in these cases we really do have<br />

beliefs about our earlier beliefs. In other words, we can imagine the familiar debates<br />

about descriptivism about names being replayed as debates about descriptivism about<br />

prior beliefs. As enjoyable as that may be, it’s interesting to consider a different<br />

approach.<br />

There’s a more philosophical reason to worry about Lewis’s model. If we model<br />

uncertainty as a class of relationships to possible worlds, it looks like there’s a lot of<br />

actual uncertainty we won’t be able to model. Indeed, there are three kinds of uncertainty<br />

that we can’t model in this framework. First, we can’t model uncertainty about<br />

logic and mathematics. Second, if we accept the necessity of identity, we can’t model<br />

uncertainty about identity claims. Whatever it is to be uncertain about whether a is<br />

b, it won’t be a distinctive relation to the set of worlds in which a is b, since that’s all<br />

the worlds. Third, we can’t model uncertainty about claims about self-identity, like<br />

I’m that guy. Lewis’s framework is an improvement on the sets of possible worlds approach<br />

because it helps with this third class of cases. But it doesn’t help with the first<br />

or, more importantly, with the second. We might think that a solution to puzzles<br />

about self-identity should generalise to solve puzzles about identity more broadly.<br />

Lewis’s model doesn’t. One of Stalnaker’s key insights is that we should, and can,<br />

have a model that addresses both kinds of puzzles about identity.<br />

On Stalnaker’s model, a belief is just a distinction between worlds. The content<br />

of a belief is a set of worlds, not a set of centered worlds. But worlds have more<br />

structure than we thought they had. The formal model is a bit more subtle than<br />

what I’ll sketch here, but I think I’ll include enough detail to cover the Sleeping<br />

Beauty case. In each world, each center, in Lewis’s sense, has a haecceity. A world<br />

is the Cartesian product of a Lewisian world, i.e. a world without haecceities, and<br />

a function from each contextually salient haecceity to a location. If we see a kiss,



and wonder who she is, who he is, and when they are kissing, then we can think<br />

of the worlds as quadruples consisting of a haecceity-free world (perhaps a Lewisian<br />

concretum), a woman, a man and a time. So we can represent three kinds of locational<br />

doubts, not just self-locational doubt. 4<br />

When an agent at center c believes something self-locating, e.g. that it is Monday,<br />

the content of their belief is that c’s haecceity is on a Monday. If they don’t know<br />

what day it is, there’s a sense in which they don’t know what they believe, since they<br />

don’t know whether what they are believing is that c’s center is on Monday, or that<br />

some other center’s haecceity is on Monday. 5 But their belief, the belief they would<br />

express on Monday by saying “It is Monday”, has two nice features. First, it is neither<br />

trivial, like the belief that Monday is Monday, nor changing in value over time, since<br />

c’s center is always on Monday. Second, it is the kind of belief that people on days<br />

other than Monday can share, or dispute. And this belief can be shared by others<br />

who have the capacity to think de re about c, even if they can’t uniquely describe it.<br />

It’s this last fact that lets Stalnaker handle the cases that proved problematic for Lewis<br />

and the neo-Lewisians. For instance, it lets Stalnaker model shared uncertainty about<br />

identity claims.<br />

With all that in place, it’s time to return to Sleeping Beauty. Let’s consider two<br />

propositions. The first, H, is that the coin landed heads. The second, M , is what<br />

Beauty can express when she wakes on Monday by saying “It is Monday”. That<br />

is, it is a singular proposition about a wakening experience that Beauty can now<br />

have singular thoughts about (since she is now undergoing it), but which she didn’t<br />

previously have the capacity to determinately pick out. We’ll call this wakening a.<br />

(Beauty might undergo multiple wakenings, but we’re going to focus on one for now,<br />

and call it a.) Given these two propositions, we can describe four possibilities. Or,<br />

as we’ll somewhat inaccurately describe them, four worlds. 6<br />

w 1 : H ∧ M<br />

w 2 : H ∧ ¬M<br />

w 3 : ¬H ∧ M<br />

w 4 : ¬H ∧ ¬M<br />

On Sunday, Beauty’s credences are distributed over the algebra generated by the partition<br />

{H,¬H}, i.e., {{w 1 , w 2 },{w 3 , w 4 }}. The algebra is that coarse-grained because<br />

she doesn’t have the capacity to think M thoughts. And that’s because she’s not acquainted<br />

with the relevant haecceities. So she can’t distinguish between worlds that<br />

differ only on whether M is true. On Sunday then, Beauty’s credences are given by<br />

Pr(H) = Pr(¬H) = 1/2.

4 Stalnaker thinks we have independent reason to treat these structured entities as simply worlds. The main point of the last few sentences was that we can adopt Stalnaker's model while staying neutral on this metaphysical question.

5 Perhaps it would be better to say that individuals and times have haecceities, rather than saying centers do. I have little idea what could tell between these options, or even if there is a substantive issue here.

6 Of course worlds are considerably more detailed than this, but the extra detail is an unnecessary confusion for the current storyline.



When she wakes on Monday, two things happen. First, she becomes acquainted<br />

with a. So she can now think about whether a is on Monday. That is, she can now<br />

think about whether M is true. So she can now carve the possibility space more<br />

finely. Indeed, now her credences can be distributed over all propositions built out<br />

of the four possibilities noted above. The second thing that happens is that Beauty<br />

rules out one of these possibilities. In particular, she now knows that H ∧ ¬M , a<br />

proposition she couldn’t so much as think before, is actually false. That’s because if<br />

the coin landed heads, this very wakening could not have taken place on Tuesday.<br />

Stalnaker’s position on Beauty’s credences uses these two facts. First Beauty ‘recalibrates’<br />

her credences to the new algebra, then she updates by conditionalising on<br />

¬H ∨ M. If, after recalibration, her credences are equally distributed over the four cells of the partition, then conditionalising on ¬H ∨ M will move Pr(H) to 1/3. That is, the thirders win!
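The recalibrate-then-conditionalise step can be checked directly; this minimal sketch uses the world labels from the list above and the equal 1/4 recalibration just described:

```python
# Beauty's Monday algebra: the four cells, each with recalibrated credence 1/4.
worlds = {
    'w1': {'H': True,  'M': True},
    'w2': {'H': True,  'M': False},
    'w3': {'H': False, 'M': True},
    'w4': {'H': False, 'M': False},
}
cr = {w: 0.25 for w in worlds}

# Conditionalise on the evidence ¬H ∨ M, which rules out only w2.
evidence = {w for w, v in worlds.items() if (not v['H']) or v['M']}
total = sum(cr[w] for w in evidence)
post = {w: (cr[w] / total if w in evidence else 0.0) for w in cr}

pr_heads = sum(p for w, p in post.items() if worlds[w]['H'])  # 1/3
```

With w2 eliminated, the remaining three cells each get credence 1/3, and only one of them is a heads world.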

But we might wonder why we use just this calibration, the one where all four cells<br />

get equal credence. We’re going to come back to this question below. But first, I want<br />

to use Stalnaker’s framework to respond to an interesting objection to the thirder<br />

position.<br />

2 Monty Hall<br />

Both C. S. Jenkins (2005) and Joseph Halpern (2004) have argued that the ‘thirder’<br />

solution is undermined by its similarity to fallacious reasoning in the Monty Hall<br />

case. The idea is easy enough to understand if we simply recall the Monty Hall<br />

problem. The agent is in one of three states s1 , s2 or s3 , and has reason to believe each<br />

is equally likely. She guesses which one she is in. An experimenter then selects a state<br />

that is neither the state she is in, nor the state she guessed, and tells her that she is<br />

not in that state. If she simply conditionalises on the content of the experimenter’s<br />

report, then her credence that she guessed correctly will go from 1 1<br />

to . This is a<br />

3 2<br />

bizarre failure of Reflection, so something must have gone wrong. 7 Both Jenkins and<br />

Halpern suggest that the violation of Reflection that ‘thirders’ endorse in Sleeping<br />

Beauty is just as bizarre.<br />
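The contrast between the naive update and the standard repair (conditionalising on the fact that the experimenter made just that announcement) can be made concrete. The state labels, the choice of guess, and the assumption that the experimenter picks uniformly when more than one state is available to name are mine:

```python
from fractions import Fraction

states = ['s1', 's2', 's3']
guess = 's1'                        # hypothetical guess; the case is symmetric
prior = {s: Fraction(1, 3) for s in states}

# The experimenter names a state that is neither the true state nor the
# guess; suppose she announces "you are not in s3".
announced = 's3'

# Naive update: conditionalise on the bare content "not in s3".
naive = {s: p for s, p in prior.items() if s != announced}
norm = sum(naive.values())
naive = {s: p / norm for s, p in naive.items()}    # the guess jumps to 1/2

# Proper update: conditionalise on the experimenter making that announcement.
def pr_announce(true_state):
    options = [s for s in states if s not in (true_state, guess)]
    return Fraction(1, len(options)) if announced in options else Fraction(0)

joint = {s: prior[s] * pr_announce(s) for s in states}
norm = sum(joint.values())
proper = {s: p / norm for s, p in joint.items()}   # the guess stays at 1/3
```

On the proper update there is no Reflection failure in Monty Hall, which is why the analogy matters only if Sleeping Beauty admits the same repair.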

But the Sleeping Beauty puzzle is not analogous to the Monty Hall problem.<br />

That’s because in Sleeping Beauty we seem forced to have a violation of Reflection<br />

somewhere. Let’s think a bit again about Beauty’s credences on Wednesday, and<br />

let’s assume that we’re trying to avoid Reflection violations. Then on Monday (and<br />

Tuesday) her credence in H is 1/2. Now when Beauty awakes on those days, there are

three possibilities open to her. (Hopefully it won’t lead to ambiguity if I re-use the<br />

name a for the awakening Beauty is undergoing when thinking about H.)<br />

• a is Monday and H<br />

• a is Monday and ¬H<br />

• a is Tuesday and ¬H<br />

7 The standard response is to say that the agent shouldn’t just conditionalise on the content of the<br />

experimenter’s utterance, but on the fact that the experimenter is making just that utterance. We’ll return<br />

to this idea below.



When she wakes on Wednesday, she’s in a position to reflect on these possibilities.<br />

And she can rule out the second of them. That’s what she learns when she wakes and<br />

learns it is Wednesday; that if ¬H, then that last awakening was on Tuesday. Now<br />

since that last awakening, nothing odd has happened to Beauty. She hasn’t had her<br />

memories erased. She might have had her memories erased between Monday and<br />

Tuesday, but that’s not relevant to the time period she’s considering. Moreover, she<br />

knows that she hasn’t had her memories erased. So I think she’s in a position to<br />

simply conditionalise on her new evidence. And that new evidence is simply that<br />

whatever else was going on when she was thinking about those three possibilities,<br />

she wasn’t in the second possibility.<br />

But now we face a challenge. Beauty knows that Wednesday will come. So if<br />

her credence in H on Wednesday isn't 1/2, then we'll have a violation of Reflection. The violation is that on Sunday her credence in H is 1/2, but she knows it will go up on Wednesday. And that violation is just as bad as the violation of Reflection that 'thirders' endorse. But if she conditionalises when she wakes up on Wednesday, then the only way her updated credence in H can be 1/2 is if her prior credence in the first

and third options above were equal. And the only way that can happen is for her<br />

credence, when a is happening, in the proposition that a is Monday and ¬H to be 0.<br />

But that’s bizarre. Whether or not the thirders are right to think that she should give<br />

equal credence to that possibility as to the two others, she can’t give it credence 0. So<br />

Reflection will fail somewhere.<br />
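Writing p1, p2, p3 for Beauty's credences in the three possibilities while a is happening (notation mine), the Wednesday update and its consequences can be verified directly:

```python
from fractions import Fraction

def wednesday_pr_H(p1, p3):
    """Beauty's Wednesday credence in H after her evidence rules out the
    second possibility (a is Monday and not-H): Pr(H) = p1 / (p1 + p3)."""
    return p1 / (p1 + p3)

# Halfer: while a is happening Pr(H) = 1/2, and H entails the first
# possibility, so p1 = 1/2. Keeping the Wednesday credence at 1/2 then
# forces p3 = 1/2, and hence the bizarre p2 = 1 - p1 - p3 = 0.
assert wednesday_pr_H(Fraction(1, 2), Fraction(1, 2)) == Fraction(1, 2)

# Thirder: p1 = p2 = p3 = 1/3. The Wednesday credence is then 1/2, not 1/3,
# so Reflection instead fails between Monday and Wednesday.
assert wednesday_pr_H(Fraction(1, 3), Fraction(1, 3)) == Fraction(1, 2)
```

Either way some pair of days disagrees in a foreseeable direction, which is the point that Reflection must fail somewhere.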

To see why Reflection is failing in these cases, it helps to look back at the requirements<br />

we need in order to get from conditionalisation to Reflection. In Rachael<br />

Briggs’s careful analysis of when Reflection holds (Briggs, 2009), Reflection is only<br />

guaranteed to hold when agents know what their evidence is. In other cases, even<br />

perfect conditionalisers may violate Reflection.<br />

This assumption, namely that agents know what their evidence is, is a kind of<br />

luminosity assumption. And not surprisingly, it has been challenged by Timothy<br />

Williamson (Williamson, 2000a, 230-3). What is a little more surprising is that we<br />

only need a relatively weak failure of luminosity in order to get problems for reflection.<br />

The assumption that agents know what their evidence is can be broken into<br />

two parts.<br />

• If p is part of S’s evidence, then S knows that p is part of her evidence.<br />

• If p is not part of S’s evidence, then S knows that p is not part of her evidence.<br />

The first part is, I think, implausible for reasons familiar from Williamson’s work.<br />

But the second is implausible even if one doesn’t like Williamson’s style of reasoning.<br />

If we think p must be true to be part of S’s evidence (as I think we should), and<br />

we think that rational agents can have false beliefs about anything, as also seems<br />

plausible by simple observation of how easy it is to be misled, then even a rational<br />

agent can fail to realise that p is not part of her evidence. The easiest way that can<br />

happen is if she falsely, but reasonably, believes p, and hence does not realise that due<br />

to its falsity, it is not part of her evidence.



Williamson provides an interesting model, based on a discussion in Shin (1989),<br />

of a case where an agent does not know that something is not part of her evidence.<br />

There are currently three possible states the agent could be in: s 1 , s 2 or s 3 . An experiment<br />

will be run, and after the experiment the agent will get some evidence depending<br />

on which state she’s in.<br />

• If she’s in s 1 , her evidence will rule out s 3 .<br />

• If she’s in s 2 , her evidence will rule out s 1 and s 3 .<br />

• If she’s in s 3 , her evidence will rule out s 1 .<br />

Assume the agent knows these conditionals before the experiment is run, and now<br />

let’s assume the experiment has been run. Let xRy mean that y is possible given the<br />

evidence S gets in x. Then we can see that R is transitive. That means that if p is<br />

part of S’s evidence, then her evidence settles that p is part of her evidence. But R<br />

is not Euclidean. So it is possible that p is not part of her evidence, even though her<br />

evidence does not settle that p is not part of her evidence. In particular, if she is in<br />

s 1 , that she isn’t in s 1 is not part of her evidence. But for all she can tell, she’s in s 2 .<br />

And if she’s in s 2 , her evidence does rule out her being in s 1 . So her evidence doesn’t<br />

settle that this is not part of her evidence.<br />

The model is obviously an abstraction from any kind of real-world case. But as<br />

we argued above, it is plausible that there are cases where an agent doesn’t know what<br />

evidence she lacks. And this kind of case makes for Reflection failure. Assume that<br />

the agent’s prior credences are (and should be) that each state is equally likely. And<br />

assume the agent conditionalises on the evidence she gets. Then her credence that<br />

she’s in s 2 will go up no matter what state she’s in. And she knows in advance this<br />

will happen. But there’s no obvious irrationality here; it’s not at all clear what kind<br />

of reflection-friendly credal dynamics would be preferable to updating by conditionalisation. 8<br />
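The Reflection failure in the Shin-Williamson model can be checked directly. This is a minimal sketch, assuming, as the text does, uniform priors over the three states and update by conditionalisation; the state names are our own labels:

```python
from fractions import Fraction

# Shin-Williamson model: evidence[s] is the set of states left open by
# the evidence the agent gets in state s (per the three bullet points above).
prior = {"s1": Fraction(1, 3), "s2": Fraction(1, 3), "s3": Fraction(1, 3)}
evidence = {"s1": {"s1", "s2"}, "s2": {"s2"}, "s3": {"s2", "s3"}}

def posterior(state):
    """Credences after conditionalising on the evidence received in `state`."""
    live = evidence[state]
    total = sum(prior[s] for s in live)
    return {s: (prior[s] / total if s in live else Fraction(0)) for s in prior}

# Whatever the true state, posterior credence in s2 is at least 1/2,
# up from the prior 1/3: the agent can see in advance that her credence
# in s2 will rise, which is the Reflection failure described in the text.
post_s2 = {s: posterior(s)["s2"] for s in prior}
```

In s 1 and s 3 the posterior in s 2 is 1/2, and in s 2 it is 1, so the agent's expected future credence in s 2 exceeds her current 1/3.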

So when an agent doesn’t know what evidence she lacks, Reflection can fail. One<br />

way to think about the Sleeping Beauty case is that something like this is going on,<br />

although it isn’t quite analogous to the Shin-Williamson example discussed above. In<br />

that example, the agent doesn’t know what evidence she lacks at the later time. In<br />

the Sleeping Beauty case, we can reasonably model Beauty as knowing exactly what<br />

her evidence is when she wakes up. Her evidence does nothing more or less than rule<br />

out w 2 . That’s something she didn’t know before waking up. But in a good sense she<br />

didn’t know that she didn’t know that. That’s because she was not in a position to<br />

even think about w 2 as such. Since she wasn’t in a position to think about a, she couldn’t<br />

distinguish, even in thought, between w 1 and w 2 . So any proposition she could think<br />

about, and investigate whether she knew or not, had to include either both w 1 and w 2 , or include neither of them. So the only way she could know that she didn’t know {w 1 , w 3 , w 4 } is if she tacitly knew she didn’t know that, in virtue of knowing that she didn’t know {w 1 , w 2 , w 3 , w 4 }. But she didn’t know that she didn’t know that, for the simple reason that she did know that {w 1 , w 2 , w 3 , w 4 }, i.e. the universal proposition, is true. So we have a case where Beauty doesn’t know what it is she doesn’t know at the earlier time. And like cases where the agent doesn’t know what she doesn’t know at the later time, this is a case where reflection fails.<br />
8 The idea that we should update by conditionalisation on our evidence, even when we don’t know what the evidence is, has an amusing consequence in the Monty Hall problem. The agent guesses that she’s in s i , and comes to know she’s not in s j , where i ≠ j . If she only comes to know that she’s not in s j , and not something stronger, such as knowing that she knows she’s not in s j , then she really should conditionalise on this, and her credence that her guess was correct will go up. This is the ‘mistaken’ response to the puzzle that is frequently deprecated in the literature. But since the orthodox solutions to the puzzle rely on the agent reflecting on how she came to know ¬s j , it seems that it is the right solution if she doesn’t know that she knows ¬s j .<br />
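The Monty Hall contrast drawn in footnote 8 can also be put numerically. A sketch, with door labels and the standard host protocol as our assumptions (the agent guesses door 1, and the host opens door 3):

```python
from fractions import Fraction

prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# 'Mistaken' update: conditionalise only on "the prize is not behind door 3".
naive = {d: prior[d] / (1 - prior[3]) for d in (1, 2)}

# Orthodox update: also condition on how that was learned. The host opens
# an empty, unchosen door, so P(host opens 3 | prize behind d) is:
host_opens_3 = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}
total = sum(prior[d] * host_opens_3[d] for d in prior)
orthodox = {d: prior[d] * host_opens_3[d] / total for d in prior}
```

Bare conditionalisation on ¬s j pushes the credence in the original guess from 1/3 to 1/2; the protocol-sensitive update keeps it at 1/3, which is the footnote's point about knowing how one came to know.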

So there are two reasons to be sceptical of reflection-based arguments against the<br />

‘thirder’ solution to the Sleeping Beauty puzzle.<br />

• There is no plausible way for Beauty’s credence in H to be 1/2 on both Monday and Wednesday, but reflection requires this.<br />

• Reflection is only plausible when agents know both what evidence they have,<br />

and what evidence they lack, throughout the story. And it is implausible that<br />

Beauty satisfies this constraint, since she gains conceptual capacities during the<br />

story.<br />

But this isn’t a positive argument for the 1/3 solution. I’ll conclude with a discussion of two arguments for the 1/3 solution. Both arguments are suggested by Stalnaker’s framework, but only one of them is ultimately defensible.<br />

3 Stalnaker on Sleeping Ugly<br />

When we left Stalnaker’s discussion of the Sleeping Beauty case, we had just noticed<br />

that there was a question about why Beauty should respond to being able to more<br />

finely discriminate between states by ‘recalibrating’ to a credal state where each of<br />

w 1 through w 4 receives equal credence. This question about calibration is crucial to the Sleeping Beauty puzzle because there are other post-calibration distributions of credence that are prima facie viable. Perhaps, given what Beauty knows about the<br />

setup, she should never have assigned any credence to H ∧ ¬M . Rather, she should<br />

have made it so P r (¬H ∧ M) = P r (¬H ∧ ¬M) = 1/4, and P r (H ∧ M) = 1/2. If she does that, then conditionalising on ¬(H ∧ ¬M) won’t change a thing, and P r (H) will still be 1/2. That is, the halfers win!<br />
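The rival calibration just described can be checked with a few lines of exact arithmetic; the cell labels are our own:

```python
from fractions import Fraction

# The 'halfer' calibration above: no credence at all in H and not-M.
cr = {("H", "M"): Fraction(1, 2), ("H", "~M"): Fraction(0),
      ("~H", "M"): Fraction(1, 4), ("~H", "~M"): Fraction(1, 4)}

# Conditionalise on not-(H and not-M): drop that cell and renormalise.
live = {w: v for w, v in cr.items() if w != ("H", "~M")}
total = sum(live.values())
post = {w: v / total for w, v in live.items()}
pr_H = sum(v for (h, _), v in post.items() if h == "H")
```

Since the ruled-out cell already had credence 0, conditionalising is idle and P r (H) stays at 1/2, exactly as the text says.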

One argument against this, and in favour of the equally weighted calibration, is<br />

suggested by Stalnaker’s ‘Sleeping Ugly’ example. Sleeping Ugly is woken up on<br />

Monday and again (with erased memories) on Tuesday however the coin lands. So<br />

when Ugly awakes, he has the capacity to think new singular thoughts, but he doesn’t<br />

get much evidence about them. In particular, he can’t share the knowledge Beauty<br />

would express by saying, “If the coin landed Heads, this is Monday.” 9 Now we might<br />

think it is intuitive that Ugly’s credences when he wakes up and reflects on his situation<br />

should be equal over the four possibilities. Moreover, all Ugly does is recalibrate;<br />

since he doesn’t learn anything about which day it is, his post-awakening credence<br />

9 Stalnaker notes that this is a reason for thinking Beauty does learn something when she wakes up, and<br />

so there’s a reason her credence in H changes.



just is his recalibration. If all this is correct, and if Beauty should recalibrate in the<br />

same way as Ugly, then Beauty should recalibrate to the ‘equally weighted calibration’.<br />

And now we’re back to victory for the thirders!<br />

But there’s little reason to believe the crucial premise about how Ugly should<br />

recalibrate his credences. What we know is that Ugly doesn’t have any reason to give<br />

any more credence to any one of the four possibilities than to the others. It doesn’t at<br />

all follow that he has reason to give equal credence to each, any more than in general<br />

an absence of reasons to treat one of the Xs differently to the others is a reason to<br />

treat them all the same. 10<br />

The argument I’m considering here is similar to reasoning Adam Elga has employed in Elga (2004b), and which I have criticised in Weatherson (2005a). A central focus<br />

of my criticism was that this kind of reasoning has a tendency to lead to countable<br />

additivity violations. In an important recent paper, Jacob Ross (forthcoming) has<br />

shown that many thirder arguments similarly lead to countable additivity violations.<br />

He shows this by deriving what he calls the ‘Generalised Thirder Principle’ (hereafter,<br />

GTP) from the premises of these arguments. The GTP is a principle concerning<br />

a generalised version of the Sleeping Beauty problem. Here is Ross’s description<br />

of this class of problems.<br />

Let us define a Sleeping Beauty problem as a problem in which a fully rational<br />

agent, Beauty, will undergo one or more mutually indistinguishable<br />

awakenings, and in which the number of awakenings she will undergo is<br />

determined by the outcome of a random process. Let S be a partition of<br />

alternative hypotheses concerning the outcome of this random process.<br />

Beauty knows the objective chances of each hypothesis in S, and she also<br />

knows how many times she will awaken conditional on each of these hypotheses,<br />

but she has no other relevant information. The problem is to<br />

determine how her credence should be divided among the hypotheses in<br />

S when she first awakens. (Ross ms, 2-3)<br />

The GTP is a principle about this general class of problem. Here’s how Ross states<br />

it.<br />

Generalized Thirder Principle In any standard Sleeping Beauty problem, upon<br />

first awakening, Beauty’s credence in any given hypothesis in S must be proportional<br />

to the product of the hypothesis’ objective chance and the number of<br />

times Beauty will awaken conditional on this hypothesis. ... [We can] express<br />

this principle formally. For any hypothesis i ∈ S, let C h(i) be the objective<br />

chance that hypothesis i is true, and let N(i) be the number of times Beauty<br />

awakens if i is true. Let P be Beauty’s credence function upon first awakening.<br />

The GTP states ...<br />

10 Compare this argument for giving nothing to charity. There are thousands of worthwhile charities,<br />

and I have no reason to give more to one than any of the others. But I can’t afford to give large equal<br />

amounts to each, and if I gave small equal amounts to each, the administrative costs would mean my<br />

donation has no effect. So I should treat each equally, and the only sensible practical way to do this is to<br />

give none to each. Note that you really don’t have to think one charity is more worthy than the others to<br />

think this is a bad argument; sometimes we just have to make arbitrary choices.



For all i, j ∈ S, P(i)/P( j ) = N(i)Ch(i)/N( j )Ch( j ), whenever Ch( j ) > 0. (Ross ms, 6-7)<br />
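The GTP can be written as a small function; applied to the original problem (a fair coin, one waking on Heads, two on Tails, all as in the text), it delivers the thirder answer. A sketch:

```python
from fractions import Fraction

def gtp_credences(chance, wakings):
    """Credences per the Generalized Thirder Principle:
    P(i) proportional to N(i) * Ch(i)."""
    weight = {i: chance[i] * wakings[i] for i in chance}
    total = sum(weight.values())
    return {i: w / total for i, w in weight.items()}

# The original Sleeping Beauty problem as a special case.
p = gtp_credences({"H": Fraction(1, 2), "T": Fraction(1, 2)},
                  {"H": 1, "T": 2})
```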

The argument I’m considering seems to be committed to the GTP. In a generalised<br />

Sleeping Beauty problem, we can imagine a version of Sleeping Ugly who will awake<br />

every day that Beauty might awake. The reasoning that leads one to think that Ugly<br />

should give equal credence to each of the two days in the original Sleeping Beauty<br />

case seems to generalise to imply that Ugly should give equal credence to each day in<br />

this more general case. But if in the general example Beauty calibrates to match these<br />

credences of Ugly, and then conditionalises on the information she receives, she’ll<br />

end up endorsing the GTP. That’s an unhappy outcome. It would be better to have<br />

an argument for the 1/3 solution that doesn’t imply the GTP.<br />

I’m going to argue that when Beauty wakes up her credences should satisfy the<br />

following two premises. (As always, I use a to name the awakening that Beauty is<br />

now undergoing, and I’m using C r for her credence function on waking.)<br />

P1: C r (a is Monday and H) = C r (a is Tuesday and ¬H)<br />

P2: C r (a is Monday and H) = C r (a is Monday and ¬H)<br />

These constraints imply, given what Beauty knows about the setup, that C r (H) = 1/3.<br />
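As a quick check of that claim: P1 and P2 together equate the three cells the setup leaves open, and since the cells must sum to 1, each gets 1/3. The cell labels below are our own:

```python
from fractions import Fraction

# The three open possibilities (Tuesday-and-H is ruled out by the setup).
# P1 equates the first two cells, P2 the first and third; with the cells
# summing to 1, the common value is forced to be 1/3.
cells = [("Mon", "H"), ("Tue", "~H"), ("Mon", "~H")]
cr = {w: Fraction(1, 3) for w in cells}

p1_holds = cr[("Mon", "H")] == cr[("Tue", "~H")]
p2_holds = cr[("Mon", "H")] == cr[("Mon", "~H")]
cr_H = sum(v for (_, h), v in cr.items() if h == "H")
```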

The arguments for each premise are quite different.<br />

The argument for P1 is one I mentioned above, so I’ll just sketch it quickly here.<br />

On Wednesday, Beauty’s credence in H should be back to 1/2. But what she learns on Wednesday is ¬(a is on Monday and ¬H). So on Monday, her credence in H conditional on ¬(a is on Monday and ¬H) should be 1/2. But given what Beauty knows about the setup of the problem, this immediately implies P1.<br />

The argument for P2 requires a slightly more fanciful version of the example.<br />

Imagine that on Sunday night, Beauty is visited by a time traveller from Monday<br />

who comes back with a videotape of her waking on Monday, and tells her that it<br />

was taken on Monday. So Beauty now has the capacity to think about this very<br />

awakening, i.e., a. This doesn’t seem to affect her credence in H; it should still be 1/2.<br />

Now imagine that her memory of this visit is erased overnight, so when she wakes<br />

up on Monday her situation is just like in the original Sleeping Beauty problem.<br />

Call C r1 her credence function on Sunday after meeting the time traveller. And<br />

call C r2 her credence function on Monday after she wakes up and reflects on her situation.<br />

It seems the only relevant difference between the situation on Sunday and the<br />

situation on Monday is that Beauty has lost the information that a is on Monday. The<br />

following principle about situations where an agent loses information seems plausible.<br />

If C r old is the pre-loss credence function, and C r new is the post-loss credence function, and E is the information lost, then<br />
• C r old ( p) = C r new ( p|E)<br />
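The principle can be illustrated with the thirder numbers. A sketch, using the thirder values for the post-loss function and our own cell labels, with E the lost information that a is Monday:

```python
from fractions import Fraction

def conditional(cr, prop, given):
    """Cr(prop | given) for a credence function over a finite set of cells."""
    den = sum(v for w, v in cr.items() if w in given)
    num = sum(v for w, v in cr.items() if w in given and w in prop)
    return num / den

# Post-loss (Monday-morning) credences, on the thirder solution.
cr2 = {("Mon", "H"): Fraction(1, 3), ("Mon", "~H"): Fraction(1, 3),
       ("Tue", "~H"): Fraction(1, 3)}
E = {("Mon", "H"), ("Mon", "~H")}   # the lost information: a is Monday
H = {("Mon", "H")}

# Reverse conditionalisation: Cr_old(H) should equal Cr_new(H | E); on
# the thirder numbers that recovers the Sunday value of 1/2.
cr1_H = conditional(cr2, H, E)
```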



The idea here is that information loss is a sort of reverse conditionalisation. Applying<br />

this, we get that C r1 (H) = C r2 (H|a is Monday), so C r2 (H|a is Monday) = 1/2, so C r2 (a is Monday and H) = C r2 (a is Monday and ¬H). And since the situation on Monday in the revised problem, i.e., the situation when Beauty’s credence function is C r2 , is just like the situation in the original Sleeping Beauty problem on Monday, it follows that P2 is true in the original problem. And from P1 and P2, it follows that<br />

the thirder solution is right.<br />

But note a limitation of this solution. When Beauty wakes on Tuesday her credence<br />

function is defined over a different algebra of propositions to what it was defined<br />

over after meeting the time traveller. So there’s no time travel based argument<br />

that her credences on Tuesday should satisfy P2, or indeed that on Tuesday her credence in H should be 1/3. (For similar reasons, this kind of reasoning does not support the GTP.)<br />

One might try and argue that Beauty’s situation on Tuesday is indistinguishable<br />

from her situation on Monday, and so she should have the same credences on Tuesday.<br />

Both the premise and the inference here seem dubious. On Tuesday, Beauty<br />

knows different singular propositions, so the situation isn’t clearly indistinguishable.<br />

But more importantly, it is implausible that indistinguishability implies same credences.<br />

The relation ‘should have the same credences in’ is a transitive and symmetric relation between states. The relation ‘is indistinguishable from’ is neither transitive nor<br />

symmetric. So I suspect that the kind of arguments developed here leave it an open<br />

question what Beauty’s credences should be on Tuesday, and indeed whether there is<br />

a unique value for what her credences then should be.


Dogmatism and Intuitionistic Probability<br />

David Jehle, Brian Weatherson<br />

Many epistemologists hold that an agent can come to justifiably believe that p is<br />

true by seeing that it appears that p is true, without having any antecedent reason to<br />

believe that visual impressions are generally reliable. Certain reliabilists think this, at<br />

least if the agent’s vision is generally reliable. And it is a central tenet of dogmatism<br />

(as described by Pryor (2000a) and Pryor (2004a)) that this is possible. Against these<br />

positions it has been argued (e.g. by Cohen (2005) and White (2006)) that this violates<br />

some principles from probabilistic learning theory. To see the problem, let’s note<br />

what the dogmatist thinks we can learn by paying attention to how things appear.<br />

(The reliabilist says the same things, but we’ll focus on the dogmatist.)<br />

Suppose an agent receives an appearance that p, and comes to believe that p. Letting<br />

Ap be the proposition that it appears to the agent that p, and → be the material<br />

implication, we can say that the agent learns that p, and hence is in a position to infer<br />

Ap → p, once they receive the evidence Ap. 1 This is surprising, because we can prove<br />

the following.<br />

Theorem 1<br />

If P r is a classical probability function, then<br />

P r (Ap → p|Ap) ≤ P r (Ap → p).<br />

(All the theorems are proved in the appendix.) An obvious corollary of Theorem 1 is<br />

Theorem 2<br />

If P r is a classical probability function, then<br />

• P r (¬(Ap ∧ ¬ p)|Ap) ≤ P r (¬(Ap ∧ ¬ p)); and<br />

• P r (¬Ap ∨ p|Ap) ≤ P r (¬Ap ∨ p).<br />

And that’s a problem for the dogmatist if we make the standard Bayesian assumption<br />

that some evidence E is only evidence for hypothesis H if P r (H|E) > P r (H). For<br />

here we have cases where the evidence the agent receives does not raise the probability<br />

of Ap → p, ¬(Ap ∧ ¬ p) or ¬Ap ∨ p, so the agent has not received any evidence for<br />

them, but getting this evidence takes them from not having a reason to believe these<br />

propositions to having a reason to believe them.<br />

In this paper, we offer a novel response for the dogmatist. The proof of Theorem<br />

1 makes crucial use of the logical equivalence between Ap → p and ((Ap →<br />

p) ∧ Ap) ∨ ((Ap → p) ∧ ¬Ap). These propositions are equivalent in classical logic,<br />

but they are not equivalent in intuitionistic logic. From this fact, we derive two<br />

claims. In section 1 we show that Theorems 1 and 2 fail in intuitionistic probability<br />

† In progress.<br />

1 We’re assuming here that the agent’s evidence really is Ap, not p. That’s a controversial assumption,<br />

but it isn’t at issue in this debate.



theory. In section 2 we consider how an agent who is unsure whether classical or<br />

intuitionistic logic is correct should apportion their credences. We conclude that for<br />

such an agent, theorems analogous to Theorems 1 and 2 fail even if the agent thinks<br />

it extremely unlikely that intuitionistic logic is the correct logic. The upshot is that<br />

if it is rationally permissible to be even a little unsure whether classical or intuitionistic<br />

logic is correct, it is possible that getting evidence that Ap raises the rational<br />

credibility of Ap → p, ¬(Ap ∧ ¬ p) and ¬Ap ∨ p.<br />

1 Intuitionistic Probability<br />

In Weatherson (2004a), the notion of a ⊢-probability function, where ⊢ is an entailment<br />

relation, is introduced. For any ⊢, a ⊢-probability function is a function P r<br />

from sentences in the language of ⊢ to [0,1] satisfying the following four constraints. 2<br />

(P0) P r ( p) = 0 if p is a ⊢-antithesis, i.e. iff for any X , p ⊢ X .<br />

(P1) P r ( p) = 1 if p is a ⊢-thesis, i.e. iff for any X , X ⊢ p.<br />

(P2) If p ⊢ q then P r ( p) ≤ P r (q).<br />

(P3) P r ( p) + P r (q) = P r ( p ∨ q) + P r ( p ∧ q).<br />

We’ll use ⊢ C L to denote the classical entailment relation, and ⊢ I L to denote the intuitionist<br />

entailment relation. Then what we usually take to be probability functions<br />

are ⊢ C L -probability functions. And intuitionist probability functions are ⊢ I L -<br />

probability functions.<br />

In what follows we’ll make frequent appeal to three obvious consequences of<br />

these axioms, consequences which are useful enough to deserve their own names.<br />

Hopefully these are obvious enough to pass without proof. 3<br />

(P1 ∗ ) 0 ≤ P r ( p) ≤ 1.<br />

(P2 ∗ ) If p ⊣⊢ q then P r ( p) = P r (q).<br />

(P3 ∗ ) If p ∧ q is a ⊢-antithesis, then P r ( p) + P r (q) = P r ( p ∨ q).<br />

⊢-probability functions obviously concern unconditional probability, but we can easily<br />

extend them into conditional ⊢-probability functions by adding the following axioms. 4<br />

2 We’ll usually assume that the language of ⊢ is a familiar kind of propositional calculus, with a countable<br />

infinity of sentence letters, and satisfying the usual recursive constraints. That is, if A and B are<br />

sentences of the language, then so are ¬A, A → B, A∧ B and A∨ B. It isn’t entirely trivial to extend some<br />

of our results to a language that contains quantifiers. This is because once we add quantifiers, intuitionistic<br />

and classical logic no longer have the same anti-theorems. But that complication is outside the scope of<br />

this paper. Note that for Theorem 6, we assume a restricted language with just two sentence letters. This<br />

merely simplifies the proof. A version of the construction we use there with those two letters being simply<br />

the first two sentence letters would be similar, but somewhat more complicated.<br />

3 Weatherson (2004a) discusses what happens if we make P2 ∗ or P3 ∗ an axiom in place of P2 or<br />

P3. It is argued there that this gives us too many functions to be useful in epistemology. The arguments in<br />

Williams (2010) provide much stronger reasons for believing this conclusion is correct.<br />

4 For the reasons given in Hájek (2003), it is probably better in general to take conditional probability as<br />

primitive. But for our purposes taking unconditional probability to be basic won’t lead to any problems,<br />

so we’ll stay neutral on whether conditional or unconditional probability is really primitive.



(P4) If r is not a ⊢-antithesis, then P r (·|r ) is a ⊢-probability function; i.e., it satisfies<br />

P0-P3.<br />

(P5) If r ⊢ p then P r ( p|r ) = 1.<br />

(P6) If r is not a ⊢-antithesis, then P r ( p ∧ q|r ) = P r ( p|q ∧ r )P r (q|r ).<br />

There is a simple way to generate ⊢ C L probability functions. Let 〈W ,V 〉 be a model<br />

where W is a finite set of worlds, and V a valuation function defined on them with<br />

respect to a (finite) set K of atomic sentences, i.e., a function from K to subsets of<br />

W . Let L be the smallest set including all members of K such that whenever A and B<br />

are in L, so are A∧ B, A∨ B, A → B and ¬A. Extend V to V ∗ , a function from L to<br />

subsets of W using the usual recursive definitions of the sentential connectives. (So<br />

w ∈ V ∗ (A ∧ B) iff w ∈ V ∗ (A) and w ∈ V ∗ (B), and so on for the other connectives.)<br />

Let m be a measure function defined over subsets of W. Then for any sentence S in<br />

L, P r (S) is m({w : w ∈ V ∗ (S)}). It isn’t too hard to show that Pr is a ⊢ C L probability<br />

function.<br />
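The classical construction just described is easy to implement. A sketch, assuming (as the construction needs for Pr to be a probability function) that m is normalised so that m(W ) = 1; the particular valuation, measure, and tuple encoding of formulas are our own illustrations:

```python
from fractions import Fraction

W = {1, 2, 3}
V = {"p": {2}, "q": {2, 3}}                     # illustrative atomic valuation
m = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 2)}

def truth_set(s):
    """Classical truth set of a formula given as nested tuples."""
    if isinstance(s, str):
        return V[s]
    op = s[0]
    if op == "not":
        return W - truth_set(s[1])
    a, b = truth_set(s[1]), truth_set(s[2])
    if op == "and":
        return a & b
    if op == "or":
        return a | b
    return (W - a) | b                          # "imp": material implication

def pr(s):
    """Pr(S) = m of the truth set of S."""
    return sum(m[w] for w in truth_set(s))
```

One can check directly that this Pr satisfies, e.g., P1 (classical theses get probability 1) and the additivity axiom P3.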

There is a similar way to generate ⊢ I L probability functions. This method uses<br />

a simplified version of the semantics for intuitionistic logic in Kripke (1965). Let<br />

〈W , R,V 〉 be a model where W is a finite set of worlds, R is a reflexive, transitive<br />

relation defined on W , and V is a valuation function defined on them with respect to<br />

a (finite) set K of atomic sentences. We require that V be closed with respect to R, i.e.<br />

that if x ∈ V ( p) and xRy, then y ∈ V ( p). We define L the same way as above, and<br />

extend V to V ∗ (a function from L to subsets of W ) using the following definitions.<br />

w ∈ V ∗ (A∧ B) iff w ∈ V ∗ (A) and w ∈ V ∗ (B).<br />

w ∈ V ∗ (A∨ B) iff w ∈ V ∗ (A) or w ∈ V ∗ (B).<br />

w ∈ V ∗ (A → B) iff for all w ′ such that wRw ′ and w ′ ∈ V ∗ (A), w ′ ∈<br />

V ∗ (B).<br />

w ∈ V ∗ (¬A) iff for all w ′ such that wRw ′ , it is not the case that w ′ ∈<br />

V ∗ (A).<br />

Finally, we let m be a measure function defined over subsets of W . And for any<br />

sentence S in L, P r (S) is m({w : w ∈ V ∗ (S)}). Weatherson (2004a) shows that any<br />

such P r is a ⊢ I L probability function.<br />

To show that Theorem 1 may fail when P r is a ⊢ I L probability function, we<br />

need a model we’ll call M . The valuation function in M is defined with respect to a<br />

language where the only atomic propositions are p and Ap.<br />

W = {1,2,3}<br />

R = {〈1,1〉,〈2,2〉,〈3,3〉,〈1,2〉,〈1,3〉}<br />

V ( p) = {2}<br />

V (Ap) = {2,3}<br />

Graphically, M looks like this.



[Model diagram: world 1 at the root, with arrows up to world 2 (where Ap and p hold) and world 3 (where only Ap holds).]<br />

We’ll now consider a family of measures over W . For any x ∈ (0,1), let m x be the<br />

measure function such that m x ({1}) = 1 − x, m x ({2}) = x, and m x ({3}) = 0. Corresponding<br />

to each function m x is a ⊢ I L probability function we’ll call P r x . Inspection<br />

of the model shows that Theorem 3 is true.<br />

Theorem 3.<br />

In M, for any x ∈ (0,1),<br />

(a) P r x (Ap → p) = P r x ((Ap → p) ∧ Ap) = x<br />

(b) P r x (¬Ap ∨ p) = P r x ((¬Ap ∨ p) ∧ Ap) = x<br />

(c) P r x (¬(Ap ∧ ¬ p)) = P r x (¬(Ap ∧ ¬ p) ∧ Ap) = x<br />

An obvious corollary of Theorem 3 is<br />

Theorem 4.<br />

For any x ∈ (0,1),<br />

(a) 1 = P r x (Ap → p|Ap) > P r x (Ap → p) = x<br />

(b) 1 = P r x (¬Ap ∨ p|Ap) > P r x (¬Ap ∨ p) = x<br />

(c) 1 = P r x (¬(Ap ∧ ¬ p)|Ap) > P r x (¬(Ap ∧ ¬ p)) = x<br />

So for any x, conditionalising on Ap actually raises the probability of Ap → p,¬(Ap∧<br />

¬ p) and ¬Ap ∨ p with respect to P r x . Indeed, since x could be arbitrarily low, it can<br />

raise the probability of each of these three propositions from any arbitrarily low value<br />

to 1. So it seems that if we think learning goes by conditionalisation, then receiving<br />

evidence Ap could be sufficient grounds to justify belief in these three propositions.<br />

Of course, this relies on our being prepared to use the intuitionist probability calculus.<br />

For many, this will be considered too steep a price to pay to preserve dogmatism.<br />

But in section 2 we’ll show that the dogmatist does not need to insist that intuitionistic<br />

logic is the correct logic for modelling uncertainty. All they need to show is that<br />

it might be correct, and then they’ll have a response to this argument.
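The computations behind Theorems 3 and 4 can be checked mechanically by implementing the Kripke clauses over the model M. A sketch; the tuple encoding of formulas is our own:

```python
from fractions import Fraction

# The model M from the text.
W = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (1, 3)}
V = {"p": {2}, "Ap": {2, 3}}

def up(w):
    """Worlds accessible from w."""
    return {v for v in W if (w, v) in R}

def truth_set(s):
    """Kripke truth set of a formula, using the intuitionistic clauses."""
    if isinstance(s, str):
        return V[s]
    op = s[0]
    if op == "and":
        return truth_set(s[1]) & truth_set(s[2])
    if op == "or":
        return truth_set(s[1]) | truth_set(s[2])
    if op == "imp":
        a, b = truth_set(s[1]), truth_set(s[2])
        return {w for w in W if (up(w) & a) <= b}
    a = truth_set(s[1])                         # "not"
    return {w for w in W if not (up(w) & a)}

def pr(s, x):
    """Pr_x(S), with m_x({1}) = 1 - x, m_x({2}) = x, m_x({3}) = 0."""
    m = {1: 1 - x, 2: x, 3: Fraction(0)}
    return sum(m[w] for w in truth_set(s))

x = Fraction(1, 10)
ap_imp_p = ("imp", "Ap", "p")
cond = pr(("and", ap_imp_p, "Ap"), x) / pr("Ap", x)   # Pr_x(Ap -> p | Ap)
```

Conditionalising on Ap takes Pr x (Ap → p) from x up to 1, as Theorem 4 states; the same model also gives Pr x ( p ∨ ¬ p) = x, a point that becomes relevant in section 2.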



2 Logical Uncertainty<br />

We’re going to build up to a picture of how to model agents who are rationally uncertain<br />

about whether the correct logic is classical or intuitionistic. But let’s start by<br />

thinking how an agent who is unsure which of two empirical theories T 1 or T 2 is<br />

correct. We’ll assume that the agent is using the classical probability calculus, and the<br />

agent knows which propositions are entailed by each of the two theories. And we’ll<br />

also assume that the agent is sure that the theories are not both false, and that they are inconsistent, so they can’t both be true.<br />

The natural thing then is for the agent to have some credence x in T 1 , and credence<br />

1− x in T 2 . She will naturally have a picture of what the world is like assuming<br />

T 1 is correct, and on that picture every proposition entailed by T 1 will get probability<br />

1. And she’ll have a picture of what the world is like assuming T 2 is correct. Her<br />

overall credal state will be a mixture of those two pictures, weighted according to the<br />

credibility of T 1 and T 2 .<br />

If we’re working with unconditional credences as primitive, then it is easy to mix<br />

two probability functions to produce a credal function which is also a probability<br />

function. Let P r 1 be the probability function that reflects the agent’s views about<br />

how things probably are conditional on T 1 being true, and P r 2 the probability function<br />

that reflects her views about how things probably are conditional on T 2 being<br />

true. Then for any p, let C r ( p) = xP r 1 ( p) + (1 − x)P r 2 ( p), where C r is the agent’s<br />

credence function.<br />

It is easy to see that C r will be a probability function. Indeed, inspecting the<br />

axioms P0-P3 makes it obvious that for any ⊢, mixing two ⊢-probability functions<br />

as we’ve just done will always produce a ⊢-probability function. The axioms just<br />

require that probabilities stand in certain equalities and inequalities that are obviously<br />

preserved under mixing.<br />
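That mixing preserves the axioms is easy to see in a toy case. A sketch over the four state-descriptions of two sentence letters; the particular numbers are illustrative, not from the text:

```python
from fractions import Fraction

def mix(pr1, pr2, x):
    """Cr(p) = x*Pr1(p) + (1-x)*Pr2(p), for functions given as dicts."""
    return {s: x * pr1[s] + (1 - x) * pr2[s] for s in pr1}

# Two classical priors over the state-descriptions of p, q, and their mixture.
pr1 = {"pq": Fraction(1, 2), "p~q": Fraction(1, 2),
       "~pq": Fraction(0), "~p~q": Fraction(0)}
pr2 = {"pq": Fraction(0), "p~q": Fraction(0),
       "~pq": Fraction(1, 2), "~p~q": Fraction(1, 2)}
cr = mix(pr1, pr2, Fraction(3, 4))

# P3 for p and q, read off the state-descriptions: it survives mixing
# because each axiom is a linear (in)equality in the probabilities.
lhs = (cr["pq"] + cr["p~q"]) + (cr["pq"] + cr["~pq"])   # Cr(p) + Cr(q)
rhs = (1 - cr["~p~q"]) + cr["pq"]                        # Cr(p or q) + Cr(p and q)
```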

It is a little trickier to mix conditional probability functions in an intuitive way,<br />

for the reasons set out in Jehle and Fitelson (2009). But in a special case, these difficulties<br />

are not overly pressing. Say that a ⊢-probability function is regular iff for any p,<br />

q in its domain, P r ( p|q) = 0 iff p ∧ q is a ⊢-antitheorem. Then, for any two regular<br />

conditional probability functions P r 1 and P r 2 we can create a weighted mixture of<br />

the two of them by taking the new unconditional probabilities, i.e. the probabilities<br />

of p given T , where T is a theorem, to be weighted sums of the unconditional<br />

probabilities in P r 1 and P r 2 . That is, our new function P r 3 is given by:<br />

P r 3 ( p|T ) = xP r 1 ( p|T ) + (1 − x)P r 2 ( p|T )<br />

In the general case, this does not determine exactly which function P r 3 is, since it<br />

doesn’t determine the value of P r 3 ( p|q) when P r 1 (q|T ) = P r 2 (q|T ) = 0. But since<br />

we’re paying attention just to regular functions this doesn’t matter. If the function is<br />

regular, then we can just let the familiar ratio account of conditional probability be a<br />

genuine definition. So in general we have,<br />

P r 3 ( p|q) = P r 3 ( p ∧ q|T ) / P r 3 (q|T )<br />



And since the denominator is 0 iff q is an anti-theorem, whenever P r ( p|q) is supposed<br />

to be defined, i.e. when q is not an anti-theorem, the right hand side will be well<br />

defined. As we noted, things get a lot messier when the functions are not regular, but<br />

those complications are unnecessary for the story we want to tell.<br />

Now in the cases we’ve been considering so far, we’ve been assuming that T 1 and<br />

T 2 are empirical theories, and that we could assume classical logic in the background.<br />

Given all that, most of what we’ve said in this section has been a fairly orthodox<br />

treatment of how to account for a kind of uncertainty. But there’s no reason, we<br />

say, why we should restrict T 1 and T 2 in this way. We could apply just the same<br />

techniques when T 1 and T 2 are theories of entailment.<br />

When T1 is the theory that classical logic is the right logic of entailment, and T2 the theory that intuitionistic logic is the right logic of entailment, then Pr1 and Pr2 should be different kinds of probability functions. In particular, Pr1 should be a ⊢CL-probability function, and Pr2 should be a ⊢IL-probability function. That's because Pr1 represents how things probably are given T1, and given T1, how things probably are is constrained by classical logic. And Pr2 represents how things probably are given T2, and given T2, how things probably are is constrained by intuitionistic logic.

If we do all that, we're pushed towards the thought that if someone is uncertain whether the right logic is intuitionistic or classical logic, then the right theory of probability for them is intuitionistic probability theory. That's because of Theorem 5.

Theorem 5 Let Pr1 be a regular conditional ⊢CL-probability function, and Pr2 be a regular conditional ⊢IL-probability function that is not a ⊢CL-probability function. And let Pr3 be defined as in the text. (That is, Pr3(A) = xPr1(A) + (1 − x)Pr2(A), and Pr3(A|B) = Pr3(A ∧ B)/Pr3(B).) Then Pr3 is a regular conditional ⊢IL-probability function.

That’s to say, if the agent is at all unsure whether classical logic or intuitionistic logic<br />

is the correct logic, then their credence function should be an intuitionistic probability<br />

function.<br />

Of course, if the agent is very confident that classical logic is the correct logic,<br />

then they couldn’t rationally have their credences distributed by any old intuitionistic<br />

probability function. After all, there are intuitionistic probability functions such<br />

that Pr(p ∨ ¬p) = 0, but an agent whose credence that classical logic is correct is,

say, 0.95, could not reasonably have credence 0 in p ∨ ¬ p. For our purposes, this<br />

matters because we want to show that an agent who is confident, but not certain,<br />

that classical logic is correct can nevertheless be a dogmatist. To fill in the argument<br />

we need,<br />

Theorem 6 Let x be any real in (0,1). Then there is a probability function Cr that (a) is a coherent credence function for someone whose credence that classical logic is correct is x, and (b) satisfies each of the following inequalities:

Pr(Ap → p|Ap) > Pr(Ap → p)
Pr(¬Ap ∨ p|Ap) > Pr(¬Ap ∨ p)
Pr(¬(Ap ∧ ¬p)|Ap) > Pr(¬(Ap ∧ ¬p))

The main idea driving the proof of Theorem 6, which is set out in the appendix, is

that if intuitionistic logic is correct, it’s possible that conditionalising on Ap raises the<br />

probability of each of these three propositions from arbitrarily low values to 1. So as<br />

long as the prior probability of each of the three propositions, conditional on intuitionistic<br />

logic being correct, is low enough, it can still be raised by conditionalising<br />

on Ap.<br />

More centrally, we think Theorem 6 shows that the probabilistic argument against<br />

dogmatism is not compelling. The original argument noted that the dogmatist says<br />

that we can learn the three propositions in Theorem 6, most importantly Ap → p,<br />

by getting evidence Ap. And it says this is implausible because conditionalising on<br />

Ap lowers the probability of Ap → p. But it turns out this is something of an artifact<br />

of the very strong classical assumptions that are being made. The argument not only<br />

requires the correctness of classical logic, it requires that the appropriate credence the<br />

agent should have in classical logic’s being correct is one. And that assumption is,<br />

we think, wildly implausible. Even if the agent should be very confident that classical<br />

logic is the correct logic, it shouldn’t be a requirement of rationality that she be<br />

absolutely certain that it is correct.<br />

So we conclude that this argument fails. A dogmatist about perception who is<br />

at least minimally open-minded about logic can marry perceptual dogmatism to a<br />

probabilistically coherent theory of confirmation.<br />

This paper is one more attempt on our part to defend dogmatism from a probabilistic challenge. Weatherson (2007) defends dogmatism from the so-called "Bayesian objection" to dogmatism. And Jehle (2009) not only shows that dogmatism can be

situated nicely into a probabilistically coherent theory of confirmation, but also that<br />

within such a theory, many of the traditional objections to dogmatism are easily rebutted.<br />

We look forward to future research on the connections between dogmatism<br />

and probability, but we remain skeptical that dogmatism will be undermined solely<br />

by probabilistic considerations.<br />

Appendix: Proofs<br />

Theorem 1
If Pr is a classical probability function, then
Pr(Ap → p|Ap) ≤ Pr(Ap → p).
Proof: Assume Pr is a classical probability function, and ⊢ the classical consequence relation.



Ap → p ⊣⊢ ((Ap → p) ∧ Ap) ∨ ((Ap → p) ∧ ¬Ap)   (31.1)
Pr(Ap → p) = Pr(((Ap → p) ∧ Ap) ∨ ((Ap → p) ∧ ¬Ap))   [1, P2*]   (31.2)
Pr(((Ap → p) ∧ Ap) ∨ ((Ap → p) ∧ ¬Ap)) = Pr((Ap → p) ∧ Ap) + Pr((Ap → p) ∧ ¬Ap)   [P3*]   (31.3)
Pr((Ap → p) ∧ Ap) = Pr(Ap)Pr(Ap → p|Ap)   [P6]   (31.4)
Pr((Ap → p) ∧ ¬Ap) = Pr(¬Ap)Pr(Ap → p|¬Ap)   [P6]   (31.5)
Pr(Ap → p) = Pr(Ap)Pr(Ap → p|Ap) + Pr(¬Ap)Pr(Ap → p|¬Ap)   [2, 4, 5]   (31.6)
¬Ap ⊢ Ap → p   (31.7)
Pr(Ap → p|¬Ap) = 1   [7, P5]   (31.8)
Pr(Ap → p|Ap) ≤ 1   [P1*, P4]   (31.9)
Pr(Ap → p) ≥ Pr(Ap)Pr(Ap → p|Ap) + Pr(¬Ap)Pr(Ap → p|Ap)   [6, 8, 9]   (31.10)
⊢ Ap ∨ ¬Ap   (31.11)
Pr(Ap ∨ ¬Ap) = 1   [11, P1]   (31.12)
Pr(Ap) + Pr(¬Ap) = 1   [12, P3*]   (31.13)
Pr(Ap → p) ≥ Pr(Ap → p|Ap)   [10, 13]   (31.14)

Note that (14) is an equality iff (9) is an equality or Pr(¬Ap) = 0.
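Theorem 1 can also be spot-checked numerically. The following sketch is our illustration, not part of the proof: it samples regular classical probability functions over the four truth-assignments to (Ap, p) and confirms that conditionalising on Ap never raises the probability of the material conditional.

```python
import random

random.seed(0)  # deterministic sampling

def random_pr():
    """A random regular distribution over the 4 truth-assignments to (Ap, p)."""
    ws = [random.random() + 0.01 for _ in range(4)]
    total = sum(ws)
    return [w / total for w in ws]

# World order: (Ap, p) in [(T,T), (T,F), (F,T), (F,F)].
for _ in range(1000):
    w = random_pr()
    pr_ap = w[0] + w[1]                 # Pr(Ap)
    pr_imp = w[0] + w[2] + w[3]         # Pr(Ap -> p), i.e. Pr(~Ap v p)
    pr_imp_given_ap = w[0] / pr_ap      # (Ap -> p) & Ap is just Ap & p
    assert pr_imp_given_ap <= pr_imp + 1e-12
```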

Theorem 2
If Pr is a classical probability function, then
• Pr(¬(Ap ∧ ¬p)|Ap) ≤ Pr(¬(Ap ∧ ¬p)); and
• Pr(¬Ap ∨ p|Ap) ≤ Pr(¬Ap ∨ p).
Proof: Assume Pr is a classical probability function, and ⊢ the classical consequence relation.

Ap → p ⊣⊢ ¬(Ap ∧ ¬p)   (31.1)
Pr(Ap → p) = Pr(¬(Ap ∧ ¬p))   [1, P2*]   (31.2)
Pr(Ap → p|Ap) = Pr(¬(Ap ∧ ¬p)|Ap)   [1, P4, P5]   (31.3)
Pr(Ap → p) ≥ Pr(Ap → p|Ap)   [Theorem 1]   (31.4)
Pr(¬(Ap ∧ ¬p)|Ap) ≤ Pr(¬(Ap ∧ ¬p))   [2, 3, 4]   (31.5)
Ap → p ⊣⊢ ¬Ap ∨ p   (31.6)
Pr(Ap → p) = Pr(¬Ap ∨ p)   [6, P2*]   (31.7)
Pr(Ap → p|Ap) = Pr(¬Ap ∨ p|Ap)   [6, P4, P5]   (31.8)
Pr(¬Ap ∨ p|Ap) ≤ Pr(¬Ap ∨ p)   [4, 7, 8]   (31.9)



The only minor complication is with step 3. There are two cases to consider: either Ap is a ⊢-antitheorem or it isn't. If it is a ⊢-antitheorem, then both the LHS and RHS of (3) equal 1, so they are equal. If it is not a ⊢-antitheorem, then by P4, Pr(·|Ap) is a probability function. So by P2*, and the fact that Ap → p ⊣⊢ ¬(Ap ∧ ¬p), we have that the LHS and RHS are equal.

Theorem 3.
In M, for any x ∈ (0,1),
(a) Prx(Ap → p) = Prx((Ap → p) ∧ Ap) = x
(b) Prx(¬Ap ∨ p) = Prx((¬Ap ∨ p) ∧ Ap) = x
(c) Prx(¬(Ap ∧ ¬p)) = Prx(¬(Ap ∧ ¬p) ∧ Ap) = x

Recall what M looks like.

2: Ap, p    3: Ap
      \      /
        1

The only point where Ap → p is true is at 2. Indeed, ¬(Ap → p) is true at 3, and neither Ap → p nor ¬(Ap → p) is true at 1. So Prx(Ap → p) = mx({2}) = x. Since Ap is also true at 2, that's the only point where (Ap → p) ∧ Ap is true. So it follows that Prx((Ap → p) ∧ Ap) = mx({2}) = x.

Similar inspection of the model shows that 2 is the only point where ¬(Ap ∧ ¬p) is true, and the only point where ¬Ap ∨ p is true. And so (b) and (c) follow in just the same way.
In slight contrast, Ap is true at two points in the model, 2 and 3. But since mx({3}) = 0, it follows that mx({2,3}) = mx({2}) = x. So Prx(Ap) = x.
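The inspection of M can be mechanised. Here is a small sketch (ours; the formula encoding is an assumption) of intuitionistic forcing on M, confirming that Ap → p is forced only at 2, that ¬(Ap → p) is forced only at 3, and that neither is forced at the root 1:

```python
# Upward closure of each node in M (root 1, terminal nodes 2 and 3),
# and the nodes at which each atom is forced.
UP = {1: {1, 2, 3}, 2: {2}, 3: {3}}
VAL = {"Ap": {2, 3}, "p": {2}}

def forces(n, formula):
    op = formula[0]
    if op == "atom":
        return n in VAL[formula[1]]
    if op == "and":
        return forces(n, formula[1]) and forces(n, formula[2])
    if op == "or":
        return forces(n, formula[1]) or forces(n, formula[2])
    if op == "imp":  # n forces A -> B iff every m >= n forcing A forces B
        return all(forces(m, formula[2]) for m in UP[n] if forces(m, formula[1]))
    if op == "not":  # n forces ~A iff no m >= n forces A
        return all(not forces(m, formula[1]) for m in UP[n])

Ap, p = ("atom", "Ap"), ("atom", "p")
imp = ("imp", Ap, p)
assert [n for n in UP if forces(n, imp)] == [2]           # Ap -> p only at 2
assert [n for n in UP if forces(n, ("not", imp))] == [3]  # ~(Ap -> p) only at 3
```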

Theorem 4.
For any x ∈ (0,1),
(a) 1 = Prx(Ap → p|Ap) > Prx(Ap → p) = x
(b) 1 = Prx(¬Ap ∨ p|Ap) > Prx(¬Ap ∨ p) = x
(c) 1 = Prx(¬(Ap ∧ ¬p)|Ap) > Prx(¬(Ap ∧ ¬p)) = x
(c) 1 = P r x (¬(Ap ∧ ¬ p)|Ap) > P r x (¬(Ap ∧ ¬ p)) = x



We’ll just go through the argument for (a); the other cases are similar. By P6, we<br />

know that P r x (¬(Ap ∧ ¬ p)|Ap)P r x (Ap) = P r x ((Ap → p) ∧ Ap). By Theorem 3,<br />

we know that P r x (Ap) = P r x ((Ap → p) ∧ Ap), and that both sides are greater than<br />

0. (Note that the theorem is only said to hold for x > 0.) The only way both these<br />

equations can hold is if P r x (¬(Ap ∧ ¬ p)|Ap) = 1. Note also that by hypothesis,<br />

x < 1, and from this claim (a) follows. The other two cases are completely similar.<br />

Theorem 5 Let Pr1 be a regular conditional ⊢CL-probability function, and Pr2 be a regular conditional ⊢IL-probability function that is not a ⊢CL-probability function. And let Pr3 be defined as in the text. (That is, Pr3(A) = xPr1(A) + (1 − x)Pr2(A), and Pr3(A|B) = Pr3(A ∧ B)/Pr3(B).) Then Pr3 is a regular conditional ⊢IL-probability function.

We first prove that Pr3 satisfies the requirements of an unconditional ⊢IL-probability function, and then show that it satisfies the requirements of a conditional ⊢IL-probability function.

If p is an ⊢IL-antithesis, then it is also a ⊢CL-antithesis. So Pr1(p) = Pr2(p) = 0. So Pr3(p) = 0x + 0(1 − x) = 0, as required for (P0).
If p is an ⊢IL-thesis, then it is also a ⊢CL-thesis. So Pr1(p) = Pr2(p) = 1. So Pr3(p) = x + (1 − x) = 1, as required for (P1).

If p ⊢IL q then p ⊢CL q. So we have both Pr1(p) ≤ Pr1(q) and Pr2(p) ≤ Pr2(q). Since x ≥ 0 and (1 − x) ≥ 0, these inequalities imply that xPr1(p) ≤ xPr1(q) and (1 − x)Pr2(p) ≤ (1 − x)Pr2(q). Summing these, we get xPr1(p) + (1 − x)Pr2(p) ≤ xPr1(q) + (1 − x)Pr2(q). And by the definition of Pr3, that means that Pr3(p) ≤ Pr3(q), as required for (P2).

Finally, we just need to show that Pr3(p) + Pr3(q) = Pr3(p ∨ q) + Pr3(p ∧ q), as follows:

Pr3(p) + Pr3(q) = xPr1(p) + (1 − x)Pr2(p) + xPr1(q) + (1 − x)Pr2(q)
 = x(Pr1(p) + Pr1(q)) + (1 − x)(Pr2(p) + Pr2(q))
 = x(Pr1(p ∨ q) + Pr1(p ∧ q)) + (1 − x)(Pr2(p ∨ q) + Pr2(p ∧ q))
 = xPr1(p ∨ q) + (1 − x)Pr2(p ∨ q) + xPr1(p ∧ q) + (1 − x)Pr2(p ∧ q)
 = Pr3(p ∨ q) + Pr3(p ∧ q), as required

Now that we have shown Pr3 is an unconditional ⊢IL-probability function, we need to show it is a conditional ⊢IL-probability function, where Pr3(p|r) =df Pr3(p ∧ r)/Pr3(r). Remember we are assuming that both Pr1 and Pr2 are regular, from which it clearly follows that Pr3 is regular, so this definition is always in order. (That is, we're never dividing by zero.) The longest part of showing Pr3 is a conditional ⊢IL-probability function is showing that it satisfies (P4), which has four parts. We need to show that Pr3(·|r) satisfies (P0)-(P3). Fortunately these are fairly straightforward.

If p is an ⊢IL-antithesis, then so is p ∧ r. So Pr3(p ∧ r) = 0, so Pr3(p|r) = 0, as required for (P0).



If p is an ⊢IL-thesis, then p ∧ r ⊣⊢ r, so Pr3(p ∧ r) = Pr3(r), so Pr3(p|r) = 1, as required for (P1).
If p ⊢IL q then p ∧ r ⊢IL q ∧ r. So Pr3(p ∧ r) ≤ Pr3(q ∧ r). So Pr3(p ∧ r)/Pr3(r) ≤ Pr3(q ∧ r)/Pr3(r). That is, Pr3(p|r) ≤ Pr3(q|r), as required for (P2).

Finally, we need to show that Pr3(p|r) + Pr3(q|r) = Pr3(p ∨ q|r) + Pr3(p ∧ q|r), as follows, making repeated use of the fact that Pr3 is an unconditional ⊢IL-probability function, so we can assume it satisfies (P3), and that we can substitute intuitionistic equivalences inside Pr3.

Pr3(p|r) + Pr3(q|r) = Pr3(p ∧ r)/Pr3(r) + Pr3(q ∧ r)/Pr3(r)
 = (Pr3(p ∧ r) + Pr3(q ∧ r))/Pr3(r)
 = (Pr3((p ∧ r) ∨ (q ∧ r)) + Pr3((p ∧ r) ∧ (q ∧ r)))/Pr3(r)
 = (Pr3((p ∨ q) ∧ r) + Pr3((p ∧ q) ∧ r))/Pr3(r)
 = Pr3((p ∨ q) ∧ r)/Pr3(r) + Pr3((p ∧ q) ∧ r)/Pr3(r)
 = Pr3(p ∨ q|r) + Pr3(p ∧ q|r), as required

Now if r ⊢IL p, then r ∧ p ⊣⊢IL r, so Pr3(r ∧ p) = Pr3(r), so Pr3(p|r) = 1, as required for (P5).

Finally, we show that Pr3 satisfies (P6).

Pr3(p ∧ q|r) = Pr3(p ∧ q ∧ r)/Pr3(r)
 = [Pr3(p ∧ q ∧ r)/Pr3(q ∧ r)] · [Pr3(q ∧ r)/Pr3(r)]
 = Pr3(p|q ∧ r)Pr3(q|r), as required

Theorem 6 Let x be any real in (0,1). Then there is a probability function Cr that (a) is a coherent credence function for someone whose credence that classical logic is correct is x, and (b) satisfies each of the following inequalities:

Pr(Ap → p|Ap) > Pr(Ap → p)
Pr(¬Ap ∨ p|Ap) > Pr(¬Ap ∨ p)
Pr(¬(Ap ∧ ¬p)|Ap) > Pr(¬(Ap ∧ ¬p))



We’ll prove this by constructing the function P r . For the sake of this proof, we’ll<br />

assume a very restricted formal language with just two atomic sentences: Ap and p.<br />

This restriction makes it easier to ensure that the functions are all regular, which as<br />

we noted in the main text lets us avoid various complications. The proofs will rely<br />

on three probability functions defined using this Kripke tree M .<br />

1: Ap, p    2: Ap    3: p    4: (no atoms)
       \       \      /      /
                  0

We’ve shown on the graph where the atomic sentences true: Ap is true at 1 and<br />

2, and p is true at 1 and 3. So the four terminal nodes represent the four classical<br />

possibilities that are definable using just these two atomic sentences. We define two<br />

measure functions m 1 and m 2 over the points in this model as follows:<br />

      m({0})    m({1})     m({2})     m({3})    m({4})
m1    0         x/2        (1−x)/2    1/4       1/4
m2    x/2       (1−x)/4    (1−x)/4    1/4       1/4

We’ve just specified the measure of each singleton, but since we’re just dealing with<br />

a finite model, that uniquely specifies the measure of any set. We then turn each of<br />

these into probability functions in the way described in section 1. That is, for any<br />

proposition X , and i ∈ {1,2}, P ri (X ) = mi (MX ), where MX is the set of points in M<br />

where X is true.<br />

Note that the terminal nodes in M , like the terminal nodes in any Kripke tree,<br />

are just classical possibilities. That is, for any sentence, either it or its negation is<br />

true at a terminal node. Moreover, any measure over classical possibilities generates<br />

a classical probability function. (And vice versa, any classical probability function<br />

is generated by a measure over classical possibilities.) That is, for any measure over<br />

classical possibilities, the function from propositions to the measure of the set of<br />

possibilities at which they are true is a classical probability function. Now m1 isn't quite a measure over classical possibilities, since strictly speaking m1({0}) is defined. But since m1({0}) = 0 it is equivalent to a measure only defined over the terminal nodes. So the probability function it generates, i.e., Pr1, is a classical probability function. Of course, with only two atomic sentences, we can also verify by brute force that Pr1 is classical, but it's a little more helpful to see why this is so. In contrast, Pr2 is not a classical probability function, since Pr2(p ∨ ¬p) = 1 − x/2, but it is an intuitionistic probability function.

So there could be an agent who satisfies the following four conditions:<br />




• Her credence that classical logic is correct is x;
• Her credence that intuitionistic logic is correct is 1 − x;
• Conditional on classical logic being correct, she thinks that Pr1 is the right representation of how things probably are; and
• Conditional on intuitionistic logic being correct, she thinks that Pr2 is the right representation of how things probably are.

Such an agent’s credences will be given by a ⊢ I L -probability function P r generated by<br />

‘mixing’ P r 1 and P r 2 . For any sentence Y in the domain, her credence in Y will be<br />

xP r 1 (Y )+(1− x)P r 2 (Y ). Rather than working through each proposition, it’s easiest<br />

to represent this function by mixing the measures m 1 and m 2 to get a new measure<br />

m on the above Kripke tree. Here’s the measure that m assigns to each node.<br />

     m({0})      m({1})            m({2})       m({3})    m({4})
m    x(1−x)/2    (3x² − 2x + 1)/4  (1 − x²)/4   1/4       1/4
As usual, this measure m generates a probability function Pr. We've already argued that Pr is a reasonable function for someone whose credence that classical logic is correct is x. We'll now argue that Pr(Ap → p|Ap) > Pr(Ap → p).

It’s easy to see what P r (Ap → p) is. Ap → p is true at 1, 3 and 4, so<br />

1−x 2<br />

4<br />

P r (Ap → p) = m(1) + m(3) + m(4)<br />

= 3x2 − 2x + 1<br />

4<br />

= 3x2 − 2x + 3<br />

4<br />

1<br />

4<br />

+ 1 1<br />

+<br />

4 4<br />

Since Pr is regular, we can use the ratio definition of conditional probability to work out Pr(Ap → p|Ap).

Pr(Ap → p|Ap) = Pr((Ap → p) ∧ Ap)/Pr(Ap)
 = m({1})/(m({1}) + m({2}))
 = [(3x² − 2x + 1)/4] / [(3x² − 2x + 1)/4 + (1 − x²)/4]
 = (3x² − 2x + 1)/((3x² − 2x + 1) + (1 − x²))
 = (3x² − 2x + 1)/(2(x² − x + 1))



Putting all that together, we have

Pr(Ap → p|Ap) > Pr(Ap → p)
⇔ (3x² − 2x + 3)/4 > (3x² − 2x + 1)/(2(x² − x + 1))
⇔ 3x² − 2x + 3 > (6x² − 4x + 2)/(x² − x + 1)
⇔ (3x² − 2x + 3)(x² − x + 1) > 6x² − 4x + 2
⇔ 3x⁴ − 5x³ + 8x² − 5x + 3 > 6x² − 4x + 2
⇔ 3x⁴ − 5x³ + 2x² − x + 1 > 0
⇔ (3x² + x + 1)(x² − 2x + 1) > 0
⇔ (3x² + x + 1)(x − 1)² > 0

But it is clear that for any x ∈ (0,1), both of the terms on the LHS of the final line are positive, so their product is positive. And that means Pr(Ap → p|Ap) > Pr(Ap → p). So no matter how close x gets to 1, that is, no matter how certain the agent gets that classical logic is correct, as long as x does not reach 1, conditionalising on Ap will raise the probability of Ap → p. As we've been arguing, as long as there is any doubt about classical logic, even a vanishingly small doubt, there is no probabilistic objection to dogmatism.
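The algebraic heart of the chain is the factorisation of the quartic; that step can be verified mechanically by multiplying coefficient lists (a verification aid of ours, not part of the proof):

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# 3x^4 - 5x^3 + 2x^2 - x + 1, lowest degree first:
quartic = [1, -1, 2, -5, 3]
# (3x^2 + x + 1)(x^2 - 2x + 1) expands to the quartic:
assert poly_mul([1, 1, 3], [1, -2, 1]) == quartic
# and x^2 - 2x + 1 is (x - 1)^2:
assert poly_mul([-1, 1], [-1, 1]) == [1, -2, 1]
```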

To finish up, we show that Pr(¬Ap ∨ p|Ap) > Pr(¬Ap ∨ p) and Pr(¬(Ap ∧ ¬p)|Ap) > Pr(¬(Ap ∧ ¬p)). To do this, we just need to note that Ap → p, ¬Ap ∨ p and ¬(Ap ∧ ¬p) are true at the same points in the model, so their probabilities, both unconditionally and conditional on Ap, will be identical. So from Pr(Ap → p|Ap) > Pr(Ap → p) the other two inequalities follow immediately.


Part VI<br />

Metaphysics


Intrinsic Properties and Combinatorial Principles<br />

Three objections have recently been levelled at the analysis of intrinsicness<br />

offered by Rae Langton and David Lewis. While these objections do<br />

seem telling against the particular theory Langton and Lewis offer, they<br />

do not threaten the broader strategy Langton and Lewis adopt: defining<br />

intrinsicness in terms of combinatorial features of properties. I show<br />

how to amend their theory to overcome the objections without abandoning<br />

the strategy.<br />

Three objections have recently been levelled at the analysis of intrinsicness in Rae<br />

Langton and David Lewis’s “Defining ‘Intrinsic”’. Yablo (1993) has objected that<br />

the theory rests on “controversial and (apparently) irrelevant” judgements about the<br />

relative naturalness of various properties. Dan Marshall and Josh Parsons (2001) have argued that quantificational properties, such as being accompanied by a cube, are counterexamples to Langton and Lewis's theory. And Theodore Sider (2001b) has argued that maximal properties, like being a rock, provide

counterexamples to the theory. In this paper I suggest a number of amendments to<br />

Langton and Lewis’s theory to overcome these counterexamples. The suggestions are<br />

meant to be friendly in that the basic theory with which we are left shares a structure<br />

with the theory proposed by Langton and Lewis. However, the suggestions are not<br />

meant to be ad hoc stipulations designed solely to avoid theoretical punctures, but<br />

developments of principles that follow naturally from the considerations adduced by<br />

Langton and Lewis.<br />

1 Langton and Lewis’s Theory<br />

Langton and Lewis base their theory on a combinatorial principle about intrinsicness.<br />

If a property F is intrinsic, then whether a particular object is F is independent of whether there are other things in the world. This is just a specific instance of the general

principle that if F is intrinsic then whether some particular is F is independent of<br />

the way the rest of the world is. So if F is intrinsic, then the following four conditions<br />

are met:<br />

(a) Some lonely object is F;<br />

(b) Some lonely object is not-F;<br />

(c) Some accompanied object is F; and<br />

(d) Some accompanied object is not-F.<br />

The quantifiers in the conditions range across objects in all possible worlds, and indeed<br />

this will be the quantifier domain in everything that follows (except where indicated).<br />

An object is ‘lonely’ if there are no wholly distinct contingent things in<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophy<br />

and Phenomenological Research 63 (2001): 365-380. Thanks to David Lewis, Europa Malynicz, Dan<br />

Marshall, Daniel Nolan, Josh Parsons and Ted Sider for helpful discussions.



its world. The effect of including ‘distinct’ in this definition is that an object can be<br />

lonely even if it has proper parts; an object is not identical with its parts, but nor is it<br />

distinct from them. Following Langton and Lewis, I will say that any property that<br />

meets the four conditions is ‘independent of accompaniment’.<br />

All intrinsic properties are independent of accompaniment, but so are some extrinsic<br />

properties. For example, the property being the only round thing is extrinsic,<br />

but independent of accompaniment. So Langton and Lewis do not say that independence<br />

of accompaniment is sufficient for intrinsicness. However, within a certain<br />

class of properties, what we might call the basic properties, they do say that any property<br />

independent of accompaniment is intrinsic. A property is basic if it is neither<br />

disjunctive nor the negation of a disjunctive property. Langton and Lewis define the<br />

disjunctive properties as follows:<br />

[L]et us define the disjunctive properties as those properties that can be<br />

expressed by a disjunction of (conjunctions of) natural properties; but<br />

that are not themselves natural properties. (Or, if naturalness admits of<br />

degrees, they are much less natural than the disjuncts in terms of which<br />

they can be expressed.) Langton and Lewis (2001)<br />

Langton and Lewis assume that there is some theory of naturalness that can be plugged in here, but they are explicitly ecumenical about what the theory may be.

They mention three possibilities: naturalness might be primitive; it might be defined<br />

in terms of which universals and tropes exist, if you admit such into your ontology;<br />

or it might be defined in terms of which properties play a special role in our theory.<br />

Call the first the primitivist conception, the second the ontological conception, and<br />

the third the pragmatic conception. (One can generate different versions of the pragmatic<br />

theory by altering what one takes to be ‘our theory’. In Taylor (1993), which<br />

Langton and Lewis credit as the canonical statement of the pragmatic conception,<br />

naturalness is relativised to a theory, and the theories he focuses on are ‘regimented<br />

common sense’ and ‘unified science’.) Langton and Lewis’s intention is to be neutral<br />

as to the correct interpretation of naturalness whenever they appeal to it, and I will<br />

follow their policy.<br />

With these concepts, we can now define intrinsicness. A property is basic intrinsic

iff it is basic and independent of accompaniment. Two objects are duplicates iff they<br />

have the same basic intrinsic properties. And a property is intrinsic iff there are no<br />

two duplicates that differ with respect to it.<br />
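For illustration only, the four combinatorial conditions can be modelled in a toy setting. Everything below (the sample worlds, the encoding of objects as property sets) is our assumption, not Langton and Lewis's machinery:

```python
# Worlds are lists of objects; an object is represented by its set of properties.
WORLDS = {
    "w1": [{"F"}],         # a lonely F object
    "w2": [set()],         # a lonely non-F object
    "w3": [{"F"}, set()],  # an accompanied F and an accompanied non-F
}

def objects():
    for w, objs in WORLDS.items():
        for props in objs:
            yield w, props

def lonely(w):
    """An object is lonely iff its world contains no other (distinct) object."""
    return len(WORLDS[w]) == 1

def independent_of_accompaniment(prop):
    """Conditions (a)-(d): lonely F, lonely non-F, accompanied F, accompanied non-F."""
    cases = {(lonely(w), prop in props) for w, props in objects()}
    return cases == {(True, True), (True, False), (False, True), (False, False)}

assert independent_of_accompaniment("F")
```

As the text notes, passing this test does not make a property intrinsic; it is necessary but not sufficient.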

Langton and Lewis make one qualification to this definition: it is only meant<br />

to apply to pure, or qualitative, properties, as opposed to impure, or haecceitistic,

properties. One reason for this restriction is that if there are any impure intrinsic<br />

properties, such as being John Malkovich, they will not have the combinatorial features<br />

distinctive of pure intrinsic properties. If F is a pure intrinsic property then<br />

there can be two wholly distinct things in a world that are F. This fact will be crucial<br />

to the revised definition of intrinsicness offered below. However, it is impossible to<br />

have wholly distinct things in the same world such that each is John Malkovich. So



for now I will follow Langton and Lewis and just say what it takes for a pure property<br />

to be intrinsic. As Langton and Lewis note, it would be nice to complete the<br />

definition by giving conditions under which impure properties are intrinsic, but the<br />

little task of working out the conditions under which pure properties are intrinsic<br />

will be hard enough for now.<br />

2 Three Objections<br />

Stephen Yablo (Yablo, 1993) criticises the judgements of naturalness on which this<br />

theory rests. Consider again the property being the only round thing, which is extrinsic<br />

despite being independent of accompaniment. If Langton and Lewis are right, this<br />

must not be a basic property. Indeed, Langton and Lewis explicitly say that it is the<br />

negation of a disjunctive property, since its negation can be expressed as: being round<br />

and accompanied by a round thing or being not-round. Yablo’s criticism is that it is<br />

far from obvious that the existence of this expansion shows that being the only round<br />

thing is disjunctive. For simplicity, let us name all the salient properties:<br />

R = df being the only round thing<br />

S = df being not the only round thing<br />

T = df being round and accompanied by a round thing<br />

U = df being not-round<br />

(Something is accompanied by an F iff one of its distinct worldmates is F.) Langton<br />

and Lewis claim that since S = T ∨ U, and S is much less natural than T and than U, S<br />

is disjunctive, so R is not basic. Yablo notes that we can also express S as being round<br />

if accompanied by a round thing, so it differs from T only in that it has an if where T<br />

has an and. Given this expansion, we should be dubious of the claim that S is much<br />

less natural than T. But without that claim, R already provides a counterexample to<br />

Langton and Lewis’s theory, unless there is some other expression of R or S that<br />

shows they are disjunctive. 1<br />

Dan Marshall and Josh Parsons (2001) argue that the same kind of difficulty arises when we consider certain kinds of quantificational properties.

For example, let E be the property being such that a cube exists. This is independent<br />

of accompaniment, since a lonely cube is E, a lonely sphere is not E, each of us<br />

is accompanied and E, and each of Max Black’s two spheres is accompanied and not<br />

E. So it is a counterexample to Langton and Lewis if it is basic. Marshall and Parsons<br />

1 It would be no good to say that Langton and Lewis should be more liberal with their definition of<br />

disjunctiveness, and say instead that a property is disjunctive iff it can be expressed as a disjunction. Any<br />

property F can be expressed as the disjunction F and G or F and not G, or for that matter, F or F, so this<br />

would make every property disjunctive.<br />

I do not want to dismiss out of hand the possibility that there is another expression of S that shows it is<br />

disjunctive. Josh Parsons suggested that if we define T ′ to be being accompanied by a round thing, then S is<br />

T ′ ∨ U, and there is some chance that T ′ is more natural than S on some conceptions of naturalness. So we<br />

cannot derive a decisive counterexample from Yablo’s discussion. Still, Langton and Lewis need it to be<br />

the case that on any account of naturalness, there is an expression that shows S or R to be disjunctive, and<br />

unless T ′ is much more natural than S on all conceptions of naturalness, this task is still far from complete.



note that, like all properties, it does have disjunctive expressions. For example x is E<br />

iff x is a cube or x is accompanied by a cube. And E is a less natural property than<br />

being a cube. But it is not at all intuitive that E is much less natural than the property<br />

being accompanied by a cube. This does not just show that Langton and Lewis have<br />

to cease being ecumenical about naturalness, because on some conceptions of naturalness<br />

it is not clear that E is much less natural than being accompanied by a cube.<br />

Rather, this example shows that there is no conception of naturalness that could play<br />

the role that Langton and Lewis want. The properties E and being accompanied by<br />

a cube seem just as natural as each other on the ontological conception of naturalness,<br />

on the pragmatic conception of naturalness, and, as far as anyone can tell, on<br />

the primitivist conception. This is not because E is particularly natural on any of<br />

these conceptions. It certainly does not, for example, correspond to a universal, and<br />

it does not play a special role in our thinking or in ideal science. But since there is no<br />

universal for being accompanied by a cube, and that property does not play a special<br />

role in our thinking or in ideal science, it seems likely that each property is as natural<br />

as the other.<br />

Theodore Sider (2001b) notes that similar problems arise for maximal properties,<br />

like being a rock. A property F is maximal iff large parts of Fs are typically not<br />

Fs. For example, being a house is maximal; a very large part of a house, say a house<br />

minus one window ledge, is not a house; it is just a large part of a house. Purported<br />

proof: call the house minus one window ledge house-. If Katie buys the house she<br />

undoubtedly buys house-, but she does not thereby buy two houses, so house- is not<br />

a house. As Sider notes, this is not an entirely conclusive proof, but it surely has<br />

some persuasive force. Maximal properties could easily raise a problem for Langton<br />

and Lewis’s definition. All maximal properties are extrinsic; whether a is a house<br />

depends not just on how a is, but on what surrounds a. Compare: House- would be<br />

a house if the extra window ledge did not exist; in that case it would be the house that<br />

Katie buys. But some maximal properties are independent of accompaniment. Being<br />

a rock is presumably maximal: large parts of rocks are not rocks. If they were, then<br />

presumably tossing one rock up into the air and catching it would constitute juggling<br />

seventeen rocks, making an apparently tricky feat somewhat trivial. But there can be<br />

lonely rocks. A rock from our planet would still be a rock if it were lonely. Indeed,<br />

some large rock parts that are not rocks would be rocks if they were lonely. And<br />

it is clear there can be lonely non-rocks (like our universe), accompanied rocks (like<br />

Uluru) and accompanied non-rocks (like me).<br />

Since being a rock is independent of accompaniment and extrinsic, it is a counterexample<br />

if it is basic. Still, one might think it is not basic. Perhaps being a rock is<br />

not natural on the primitivist conception. (Who is to say it is?) And perhaps it does<br />

not correspond to a genuine universal, or to a collection of tropes, so it is a disjunctive<br />

property on the ontological conception of naturalness. Sider notes, however, that on<br />

at least one pragmatic conception, where natural properties are those that play a special<br />

role in regimented common sense, it does seem particularly natural. Certainly it<br />

is hard to express being a rock as a disjunction of properties that are more central<br />

to our thinking than being a rock is. So this really does<br />

seem to be a counterexample to Langton and Lewis’s theory.



3 The Set of Intrinsic Properties<br />

It is a platitude that a property F is intrinsic iff whether an object is F does not depend<br />

on the way the rest of the world is. Ideally this platitude could be morphed into<br />

a definition. One obstacle is that it is hard to define the way the rest of the world is<br />

without appeal to intrinsic properties. For example, even if F is intrinsic, whether<br />

a is F is not independent of whether other objects have the property not being accompanied<br />

by an F, which I will call G. To the extent that having G is a feature of<br />

the way the rest of the world is, properties like G constitute counterexamples to the<br />

platitude. Since platitudes are meant to be interpreted so as to be immune from counterexamples,<br />

it is wrong to interpret the platitude so that G is a feature of the way the rest<br />

of the world is. The correct interpretation is that F is intrinsic iff whether an object<br />

is F does not depend on which intrinsic properties are instantiated elsewhere in the<br />

world.<br />

If what I call the independence platitude is to be platitudinous, we must not treat<br />

independence in exactly the same way as Langton and Lewis do. On one definition,<br />

whether a is F is independent of whether the rest of the world is H iff it is possible<br />

that a is F and the rest of the world H, possible that a is not-F and the rest of the<br />

world H, possible that a is F and the rest of the world not-H, and possible that a is<br />

not-F and the rest of the world not-H. On another, whether a is F is independent of<br />

whether the rest of the world is H iff whether a is F is entirely determined by the<br />

way a itself, and nothing else, is, and whether the rest of the world is H is determined<br />

by how it, and not a, is. This latter definition is very informal; hence the need for<br />

the formal theory that follows. But it does clearly differ from the earlier definition<br />

in a couple of cases. The two definitions may come apart if F and H are excessively<br />

disjunctive. More importantly, for present purposes, they come apart if F is the necessary<br />

property (that everything has), or the impossible property (that nothing has).<br />

In these cases, whether a is F is entirely settled by the way a, and nothing else, is, so<br />

in the latter sense it is independent of whether the rest of the world is H. But it is<br />

not the case that all four possibilities in the former definition are possible, so it is not<br />

independent of whether the rest of the world is H in that sense. Since there is some<br />

possibility of confusion here, it is worthwhile being clear about terminology. When I<br />

talk about independence here, I will always mean the latter, informal, definition, and<br />

I will refer to principles about which combinations of intrinsic properties are possible,<br />

principles such as Langton and Lewis’s principle that basic intrinsic properties<br />

are independent of accompaniment, as combinatorial principles. So, in the terminology<br />

I am using, the combinatorial principles are attempts to formally capture the<br />

true, but elusive, independence platitude with which I opened this section.<br />
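The first, modal definition of independence can be stated compactly (this symbolization is mine, not Langton and Lewis's; read H as abbreviating "the rest of the world is H"):<br />

```latex
Fa \text{ is independent of } H \iff
\Diamond(Fa \wedge H) \wedge \Diamond(\neg Fa \wedge H) \wedge
\Diamond(Fa \wedge \neg H) \wedge \Diamond(\neg Fa \wedge \neg H)
```

When F is the necessary property, the two conjuncts containing ¬Fa fail, which is why such properties count as dependent on this definition but not on the informal one.<br />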

Since the platitude is a biconditional with intrinsic on either side, it will be a little<br />

tricky to morph it into a definition. But we can make progress by noting that the<br />

platitude tells us about relations that hold between some intrinsic properties, and<br />

hence about what the set of intrinsic properties, which I will call SI, must look like.<br />

For example, from the platitude it follows that SI is closed under Boolean operations.<br />

Say that F and G are intrinsic. This means that whether some individual a is F<br />

is independent of how the world outside a happens to be. And it means that whether



a is G is independent of the way the world outside a happens to be. This implies that<br />

whether a is F and G is independent of the way the world outside a happens to be,<br />

because whether a is F and G is a function of whether a is F and whether a is G. And<br />

that means that F and G is intrinsic. Similar reasoning shows that F or G, and not F<br />

are also intrinsic. Call this condition Boolean closure.<br />

Another implication of the independence platitude is that SI must be closed under<br />

various mereological operations. If F is intrinsic then whether a is F is independent<br />

of the outside world. If some part of a is F, that means, however the world outside<br />

that part happens to be, that part will be F. So that means that however the world<br />

outside a is, a will have a part that is F. Conversely, if a does not have a part that is<br />

F, that means all of a’s parts are not F. As we saw above, if F is intrinsic, so is not F.<br />

Hence it is independent of the world outside a that all of its parts are not F. That is,<br />

it is independent of the world outside a that a does not have a part that is F. In sum,<br />

whether a has a part that is F is independent of how the world outside a turns out<br />

to be. And that means having a part that is F is intrinsic. By similar reasoning, the<br />

property Having n parts that are F will be intrinsic, for any value of n, if F is. Finally,<br />

the same reasoning shows that the property, being entirely composed of n things that<br />

are each F is intrinsic if F is intrinsic. The only assumption used here is that it is<br />

independent of everything outside b that b is entirely composed of the particular<br />

things that it is composed of, but again this seems to be a reasonable assumption. So,<br />

formally, if F ∈ SI, then Having n parts that are F ∈ SI, and Being entirely composed of<br />

n things that are F ∈ SI. Call this condition mereological closure.<br />

Finally, and most importantly, various combinatorial principles follow from the<br />

independence platitude. One of these, that all intrinsic properties are independent<br />

of accompaniment, forms the centrepiece of Langton and Lewis’s theory. The counterexamples<br />

provided by Marshall and Parsons, and by Sider, suggest that we need<br />

to draw two more combinatorial principles from the platitude. The first is that if F<br />

and G are intrinsic properties, then whether some particular object a is F should be<br />

independent of how many other things in the world are G. More carefully, if F and<br />

G are intrinsic properties that are somewhere instantiated then, for any n such that<br />

there is a world with n+1 things, there is a world constituted by exactly n+1 pairwise<br />

distinct things, one of which is F, and the other n of which are all G. When I say the<br />

world is constituted by exactly n+1 things, I do not mean that there are only n+1<br />

things in the world; some of the n+1 things that constitute the world might have<br />

proper parts. What I mean more precisely is that every contingent thing in the world<br />

is a fusion of parts of some of these n+1 things. Informally, every intrinsic property<br />

is not only independent of accompaniment, it is independent of accompaniment by<br />

every intrinsic property. As we will see, this combinatorial principle, combined with<br />

the Boolean closure principle, suffices to show that Marshall and Parsons’s example,<br />

being such that a cube exists, is extrinsic.<br />

Sometimes the fact that a property F is extrinsic is revealed by the fact that nothing<br />

that is F can be a worldmate of things of a certain type. So the property being<br />

lonely is extrinsic because nothing that is lonely can be a worldmate of anything<br />

at all. But some extrinsic properties are perfectly liberal about which other properties<br />

can be instantiated in their world; they are extrinsic because their satisfaction



excludes (or entails) the satisfaction of other properties in their immediate neighbourhood.<br />

Sider’s maximal properties are like this. That a is a rock tells us nothing at all<br />

about what other properties are instantiated in a’s world. However, that a is a rock<br />

does tell us something about what happens around a. In particular, it tells us that<br />

there is no rock enveloping a. If there were a rock enveloping a, then a would not<br />

be a rock, but rather a part of a rock. If being a rock were intrinsic, then we would<br />

expect there could be two rocks such that the first envelops the second. 2 The reason<br />

that being a rock is extrinsic is that it violates this combinatorial principle. (As a<br />

corollary to this, a theory which ruled out being a rock from the class of the intrinsic<br />

just because it is somehow unnatural would be getting the right result for the wrong<br />

reason. Being a rock is not a property like being a lonely electron or an accompanied<br />

non-electron that satisfies the independence platitude in the wrong way; rather, it fails<br />

to satisfy the independence platitude, and our theory should reflect this.)<br />

So we need a second combinatorial principle that rules out properties like being<br />

a rock. The following principle does the job, although at some cost in complexity.<br />

Assume there is some world w 1 , which has some kind of spacetimelike structure. 3 Let<br />

d 1 and d 2 be shapes of two disjoint spacetimelike regions in w 1 that stand in relation<br />

A. Further, suppose F and G are intrinsic properties such that in some world there<br />

is an F that wholly occupies a region with shape d 1 , and in some world, perhaps<br />

not the same one, there is a G that wholly occupies a region with shape d 2 . By<br />

‘wholly occupies’ I mean that the F takes up all the ‘space’ in d 1 , and does not take<br />

up any other ‘space’. (There is an assumption here that we can identify shapes of<br />

spacetimelike regions across possible worlds, and while this assumption seems a little<br />

contentious, I hope it is acceptable in this context.) If F, G, d 1 , d 2 and A are set up in<br />

this way, then there is a world where d 1 and d 2 stand in A, and an F wholly occupies<br />

a region of shape d 1 in that world, and a G wholly occupies a region of shape d 2 in<br />

that world. In short, if you could have an F in d 1 , and you could have a G in d 2 ,<br />

and d 1 and d 2 could stand in A, then all three of those things could happen in one<br />

world. This kind of combinatorial principle has been endorsed by many writers on<br />

modality (for example Lewis 1986 and Armstrong 1989), and it seems something we<br />

should endorse in a theory of intrinsic properties.<br />

In sum, the set of intrinsic properties, SI, has the following four properties:<br />

(B) If F ∈ SI and G ∈ SI then F and G ∈ SI and F or G ∈ SI and not F ∈ SI<br />

(M) If F ∈ SI then Having n parts that are F ∈ SI and Being entirely composed of exactly<br />

n things that are F ∈ SI<br />

(T) If F ∈ SI and G ∈ SI and there is a possible world with n+1 pairwise distinct<br />

things, and something in some world is F and something in some world is G,<br />

then there is a world with exactly n+1 pairwise distinct things such that one is<br />

F and the other n are G.<br />

(S) If F ∈ SI and G ∈ SI and it is possible that regions with shapes d 1 and d 2 stand<br />

in relation A, and it is possible that an F wholly occupy a region with shape<br />

d 1 and a G wholly occupy a region with shape d 2 , then there is a world where<br />

regions with shapes d 1 and d 2 stand in A, and an F wholly occupies the region<br />

with shape d 1 and a G wholly occupies the region with shape d 2 .<br />

2 I assume here that there are rocks with rock-shaped holes in their interior. This seems like a reasonable<br />

assumption, though without much knowledge of geology I do not want to be too bold here.<br />

3 Perhaps all worlds have some kind of spacetimelike structure, in which case this qualification is unnecessary,<br />

but at this stage it is best not to take a stand on such a contentious issue.<br />

Many sets other than SI satisfy (B), (M), (T) and (S). That is, there are many sets I k<br />

such that each condition would still be true if we were to substitute I k for SI wherever<br />

it appears. Say that any such set is an I-set. Then F is intrinsic only if F is an<br />

element of some I-set. Is every element of every I-set intrinsic? As we will see, sadly<br />

the answer is no. However, most of the counterexamples proposed to Langton and<br />

Lewis’s theory are not elements of any I-set, so we already have the resources to show<br />

they are extrinsic.<br />

4 Responding to Counterexamples<br />

Marshall and Parsons noted that E, the property being such that a cube exists, is independent<br />

of accompaniment. However, it is not part of any I-set. To see this, assume<br />

it is in I k , which is an I-set. By (B), not E is also in I k . Both E and not E are instantiated<br />

in some world or other, so by (T) there is a world containing exactly two things, one of<br />

which is E and the other of which is not E. But clearly this cannot be the case: if something in a world is E, so is<br />

everything else in the world. Hence I k cannot be an I-set, contrary to our assumption.<br />

Intuitively, E is extrinsic because whether it is satisfied by an individual is not<br />

independent of whether other individuals satisfy it.<br />
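The argument can be checked mechanically in a toy model. The following sketch is mine, not the paper's, and its two-shape ontology is purely an illustrative assumption: worlds are tuples of objects, and E holds of a thing just in case its world contains a cube.<br />

```python
from itertools import product

def E(x, world):
    # "being such that a cube exists": true of x iff x's world contains a cube,
    # equivalently iff x is a cube or x is accompanied by a cube
    return 'cube' in world

shapes = ['cube', 'sphere']

# (T), applied with F = E, G = not-E and n = 1, would demand a world of
# exactly two things, one E and the other not-E.  No such world exists,
# because E is settled world-wide:
counterexamples = [w for w in product(shapes, repeat=2)
                   if E(w[0], w) != E(w[1], w)]
assert counterexamples == []
```

Both E and not-E are instantiated somewhere (a world of cubes, a world of spheres), so the failure lies squarely with the demand (T) makes.<br />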

Some other quantificational properties, such as being one of at most seventeen<br />

cubes, require a different argument to show that they are not in any I-set. Call that<br />

property E17. (Note, by the way, that E17 is independent of accompaniment, and<br />

not obviously disjunctive.) If E17 is in an I-set, then by (T) there is a world containing<br />

exactly 18 things, each of which is E17. But this is clearly impossible, since<br />

everything that is E17 is a cube, and everything that is E17 is in a world containing<br />

at most seventeen cubes. So E17 is not in any I-set, and hence is extrinsic. Similarly,<br />

being the only round thing cannot be in an I-set, because if it were, then by (T) there would<br />

be a world in which two things are the only round thing, which is impossible. So a<br />

definition of intrinsicness in terms of I-sets need not make the odd postulations about<br />

naturalness that Yablo found objectionable.<br />
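The E17 argument admits the same mechanical treatment. In this sketch (again mine, a toy rendering rather than anything in the paper), E17 holds of a thing just in case it is a cube in a world containing at most seventeen cubes:<br />

```python
def e17(x, world):
    # "being one of at most seventeen cubes"
    return x == 'cube' and world.count('cube') <= 17

# E17 is instantiated, e.g. by a lone cube:
assert e17('cube', ('cube',))

# But (T), with F = G = E17 and n = 17, would demand a world of exactly
# eighteen things, each of which is E17.  Any such world contains eighteen
# cubes, so nothing in it is E17:
w = ('cube',) * 18
assert not any(e17(x, w) for x in w)
```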

Assume, for reductio, that being a rock is in an I-set. There is a rock that is roughly<br />

spherical, and there is a rock that has a roughly spherical hollow in its interior. (Actually,<br />

there are many rocks of each type, but we only need one of each.) Let d 1 be the<br />

region the first rock takes up, and assume that the shape of the hollow in the second<br />

is also d 1 . If it is not, we could always find another rock with a hollow this shape,<br />

so the assumption is harmless. Let d 2 be the region the second rock, the one with<br />

this nicely shaped hollow, takes up. If being a rock is in an I-set, then by (S) there is a<br />

world where d 2 exactly surrounds d 1 , there is a rock wholly occupying d 1 and a rock<br />

wholly occupying d 2 . But this is impossible; if there were rock-like things in both d 1<br />

and d 2 , they would both be parts of a single large rock that extends outside both d 1



and d 2 ; and if there were not a rock-like thing in one or the other region, then there<br />

would not be a rock in that region. So no set satisfying (S) contains being a rock, so<br />

that property is not in any I-set, and hence is extrinsic.<br />

The first extrinsic property independent of accompaniment that Langton and<br />

Lewis consider is CS: being spherical and lonely or cubical and accompanied. This too<br />

is not in any I-set. Again, assume for reductio that it is. In the actual world, there are<br />

(accompanied) cubes that are entirely composed of eight smaller cubes. Both the large<br />

cube and the eight smaller cubes are accompanied, so they are both CS. Hence there<br />

is a CS that is entirely composed of eight things that are CS. By (M), being entirely<br />

composed of exactly eight things that are CS is in the I-set. By (B), being CS and entirely<br />

composed of exactly eight things that are CS is in the I-set. So by (T), there is a world in<br />

which something has that property, and there is nothing else. (To see that (T) entails<br />

this, let G be any element of the I-set, and let n be zero.) That is, there is a lonely CS<br />

that is composed of eight things that are CS. But this is impossible. A lonely CS is a<br />

sphere, but its eight parts are not lonely, and are CS, so they must be cubes. And no<br />

sphere is entirely composed of exactly eight cubes. So CS cannot be in an I-set, and<br />

hence is extrinsic.<br />

5 Problem Cases and Disjunctive Properties<br />

Those five successes might make us think that only intrinsic properties are ever in<br />

I-sets. However there are still some extrinsic properties that can slip into I-sets. For<br />

an example, consider the property LCS, defined as follows:<br />

x is LCS ↔ (x is cubical and not both lonely and simple) or (x is lonely,<br />

simple and spherical)<br />

The smallest set containing LCS and satisfying (B) and (M) is an I-set. There is an important<br />

reason for this. Define a simple world as a world containing just one mereological<br />

simple, and a compound world as a world that is not a simple world. Whether<br />

a property satisfies (T) and (S) (or, more precisely, whether a set containing that property<br />

can satisfy (T) and (S)) depends on just how the property interacts with other<br />

properties in compound worlds and whether it is ever instantiated in simple worlds.<br />

Since the same things are LCS as are cubical in compound worlds, these two properties,<br />

LCS and being cubical, interact with other properties in compound worlds in<br />

the same way. And each property is instantiated in simple worlds, although they are<br />

instantiated in different simple worlds. In sum, the properties are similar enough to<br />

be indistinguishable by (T) and (S), and that means we will not be able to show that<br />

LCS is extrinsic using just those considerations.<br />
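The point can be made concrete with a crude model (my construction, not the paper's: objects are shape/simplicity pairs, and a one-object world counts as simple just in case its occupant is simple). In it, LCS and being cubical agree throughout the compound worlds and are each instantiated in simple worlds, though by different occupants:<br />

```python
from itertools import product

def lcs(x, world):
    # (x is cubical and not both lonely and simple) or (lonely, simple, spherical)
    shape, simple = x
    lonely = len(world) == 1
    return (shape == 'cube' and not (lonely and simple)) or \
           (lonely and simple and shape == 'sphere')

objects = list(product(['cube', 'sphere'], [True, False]))

# In two-object worlds (all compound), LCS coincides with being cubical:
for w in product(objects, repeat=2):
    for x in w:
        assert lcs(x, w) == (x[0] == 'cube')

# A lonely composite also inhabits a compound world, and agreement persists:
for x in objects:
    if not x[1]:
        assert lcs(x, (x,)) == (x[0] == 'cube')

# In simple worlds the properties come apart: LCS is instantiated by the
# lonely simple sphere, being cubical by the lonely simple cube.
assert lcs(('sphere', True), (('sphere', True),))
assert not lcs(('cube', True), (('cube', True),))
```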

Any property that agrees with an intrinsic property, like being cubical, in the<br />

compound worlds, and is somehow extended so it is instantiated in simple worlds,<br />

will be in an I-set. This is not just because we have not put enough restrictions on<br />

what makes an I-set. There are just no combinatorial principles we could deduce<br />

from the independence platitude that LCS violates. This is because any such principle<br />

would, like (T) and (S), be satisfied or not depending just on how the property



interacts with other properties in worlds where there are things to interact with, i.e.<br />

the compound worlds, and whether it is instantiated in the simple worlds. It is to the<br />

good that our deductions from the independence platitude did not show that LCS is<br />

extrinsic, because in an important sense LCS, like all properties that agree with some<br />

intrinsic property in all compound worlds, satisfies the platitude.<br />

So at this point an appeal to the distinction between disjunctive and non-disjunctive properties is needed. Intuitively,<br />

intrinsic properties are not only capable of being instantiated in all possible<br />

combinations with other intrinsic properties, they are capable of being so instantiated<br />

in the same way in all these possible combinations. We need to distinguish between<br />

the disjunctive and the non-disjunctive properties in order to say which properties<br />

are instantiated the same way in all these different combinations.<br />

It might be thought at this stage that we could just adopt Langton and Lewis’s definition<br />

of the disjunctive properties. If that definition worked, we could say the basic<br />

intrinsic properties are the non-disjunctive properties that are in I-sets, then define<br />

duplication and intrinsicness as they do in terms of basic intrinsics. The definition<br />

does not, it seems, work as it stands because it does not show that LCS is disjunctive.<br />

This will be easier to follow if we name all the components of LCS, as follows:<br />

C = being cubical<br />

L = being lonely<br />

M = being simple<br />

H = being spherical<br />

LCS = (C & ¬(L & M)) ∨ (L & M & H)<br />

Let us agree that LCS is not a natural property, if naturalness is an on/off state,<br />

or is very unnatural, if naturalness comes in degrees. On Langton and Lewis’s first<br />

definition, it is disjunctive if it is a disjunction of conjunctions of natural properties.<br />

This seems unlikely: ¬(L & M) is not a natural property. This is the property of<br />

being in a compound world, hardly a natural property. Similarly, C & ¬(L & M),<br />

being a cube in a compound world, is hardly natural either. We could insist that these<br />

properties are natural, but at this point Yablo’s complaint, that clear facts like the<br />

extrinsicness of LCS are being made to rest on rather obscure facts, like the putative<br />

naturalness of being in a compound world, returns to haunt us. (I assume, for the sake<br />

of the argument, that L & M & H is a natural property, though this assumption could<br />

be easily questioned.) On the second definition, LCS is disjunctive if it is much less<br />

natural than ¬(L & M), or than C & ¬(L & M). Again, it seems unlikely that this is<br />

the case. These properties seem rather unnatural. I have defined enough terms that<br />

we can state in the lexicon of this paper just what ¬(L & M) amounts to, i.e. being<br />

in a compound world, but the apparent simplicity of this definition should not make<br />

us think that the properties are natural. It is true in natural languages that predicates<br />

that are easy to express are often natural, but this fact does not extend across to the<br />

technical language that is employed here.<br />

The way out is to change the definition of disjunctive properties. A property<br />

is disjunctive, intuitively, if it can be instantiated in two quite different ways. Most



properties of the form: (N 1 & U 1 ) ∨ (N 2 & U 2 ), where N 1 and N 2 pick out distinct<br />

(relatively) natural properties, and U 1 and U 2 pick out distinct (relatively) unnatural<br />

properties that are independent of N 1 and N 2 , will be like this. If we name this<br />

predicate F, there will be two quite different types of Fs: those that are N 1 and those<br />

that are N 2 . Note that this will be true no matter how unnatural U 1 and U 2 are;<br />

provided some Fs are N 1 , and some are N 2 , there will be these two ways to be F. So I<br />

suggest we amend Langton and Lewis’s definition of disjunctiveness as follows:<br />

A property F is disjunctive iff it can be expressed as a disjunction of<br />

conjunctions, i.e.: (A 11 & . . . & A 1n ) ∨ . . . ∨ (A k1 & . . . & A km ) and in<br />

each disjunct, at least one of the conjuncts is much more natural than F.<br />

On this definition it is clear that LCS is disjunctive, since it is much less natural than<br />

being cubical and than being spherical, and in its expression above, being cubical is one<br />

of the conjuncts in the first disjunct, and being spherical is one of the conjuncts in the<br />

second disjunct. These kinds of comparisons of naturalness do not seem contentious,<br />

or any less obvious than the conclusions about extrinsicness we use them to generate.<br />

Further, the new definition of disjunctiveness is not meant to be an ad hoc fix.<br />

Rather this requirement that only one conjunct in each disjunct need be much more<br />

natural than F seems to follow directly from the reason we introduced the concept<br />

of disjunctiveness to begin with. For each F that satisfies the combinatorial principle<br />

(either independence of accompaniment in Langton and Lewis’s theory, or being in<br />

an I-set in my theory), we wanted to know whether it only does this because there<br />

are two or more ways to be an F. If F satisfies the definition of disjunctiveness I offer<br />

here, it seems there are two or more ways to be an F, so the fact that it can be in an<br />

I-set should not lead us to believe it is intrinsic.<br />

Using this definition of disjunctiveness, we can say that the basic intrinsic properties<br />

are those that are neither disjunctive nor the negation of a disjunctive property,<br />

and are in at least one I-set, then say duplicates are things that share all basic intrinsic<br />

properties, and finally that intrinsic properties are properties shared by all duplicates.<br />

There are two reasons for thinking that this definition might well work. First, as we<br />

have seen it handles a wide range of hard cases. More importantly, the way that the<br />

hard cases were falling gave us reason to suspect that the only extrinsic properties<br />

that will be in I-sets are properties like LCS: properties that agree with some intrinsic<br />

property in all compound worlds. It is reasonably clear that these properties will be<br />

disjunctive according to the above definition. To see this, let F be the extrinsic property<br />

in an I-set, and let G be the intrinsic property it agrees with in all compound<br />

worlds. Then for some J, F can be expressed as (G & ¬(L & M)) ∨ (L & M & J), and it<br />

will presumably be much less natural than G, probably much less natural than J, and<br />

almost certainly much less natural than being simple, our M. So if these are the only<br />

kind of extrinsic properties in I-sets, our definition is correct.<br />

Indeed, if these are the only kinds of extrinsic properties in I-sets, we may not<br />

even need to worry about which properties should count as disjunctive. Say that a<br />

property F blocks another property G iff both F and G are in I-sets, but there is no<br />

I-set containing both F and G. If F and G were both intrinsic, then there would be an



I-set they are both in, such as say SI, so the fact that there is no such I-set shows that<br />

one of them is extrinsic. Note that LCS blocks being cubical. To prove this, assume<br />

LCS and being cubical are in an I-set, say I k . By two applications of (B), LCS and not<br />

cubical is in I k . This property is instantiated in some possible worlds: it is instantiated<br />

by all lonely, simple spheres. So by (T) there should be a world containing two things that<br />

satisfy LCS and not cubical. But only lonely, simple spheres satisfy this property, so<br />

there is no world where two things satisfy it, contradicting our assumption that LCS<br />

and being cubical can be in the same I-set. The proof here seems perfectly general:<br />

if G is intrinsic and F differs from G only in which things in simple worlds satisfy<br />

it, and G is in an I-set, then F will block G. Blocking, as defined, is symmetric, so<br />

the fact that F blocks G is no evidence that F is extrinsic, as opposed to G. Still, if<br />

G is much more natural than F, then in all probability the reason F blocks G is that<br />

they agree about all cases in compound worlds, and disagree just about the simple<br />

worlds. In that case, it seems that F is extrinsic, and G is intrinsic. So I think the<br />

following conjecture has merit: F is intrinsic iff it is in an I-set and does not block<br />

any property much more natural than itself. If the conjecture works, the only kind<br />

of naturalness comparisons we need to make will be between properties like LCS and<br />

properties like being cubical. Again, I think these kinds of comparisons should be<br />

fairly uncontentious.<br />
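The blocking argument can itself be verified in a small model (my construction, not the paper's: objects are shape/simplicity pairs, worlds are tuples). The conjunction LCS and not cubical is satisfied only by lonely, simple spheres, so although it is instantiated, no world contains two satisfiers, which is exactly the clash that (T) exposes:<br />

```python
from itertools import product

def lcs(x, world):
    shape, simple = x
    lonely = len(world) == 1
    return (shape == 'cube' and not (lonely and simple)) or \
           (lonely and simple and shape == 'sphere')

def cubical(x, world):
    return x[0] == 'cube'

objects = list(product(['cube', 'sphere'], [True, False]))

# Everything satisfying "LCS and not cubical" is a lonely, simple sphere,
# so no world -- in particular no two-object world, as (T) would demand --
# contains two satisfiers:
satisfiers = [(x, w)
              for n in (1, 2)
              for w in product(objects, repeat=n)
              for x in w
              if lcs(x, w) and not cubical(x, w)]
assert satisfiers == [(('sphere', True), (('sphere', True),))]
```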

6 Back to Basics?<br />

Most of the work in my theory is done by the concept of I-sets. It might be wondered<br />

whether we can do without them. In particular, it might be thought that the new<br />

definition of disjunctiveness I offer in §5 will be enough to rescue Langton and Lewis’s<br />

theory from the objections I have been fretting about. Indeed, the new definition of<br />

disjunctiveness does suffice for responding to Yablo’s objection. However, it will not<br />

do on its own, and I think it will end up being essential to define intrinsicness in<br />

terms of I-sets.<br />

Yablo notes that a property like being the only red thing is independent of accompaniment,<br />

and that the way Langton and Lewis suggest showing it is disjunctive is by<br />

expressing its negation as being red and accompanied by a red thing, or not being red.<br />

Yablo criticises the claim that the first of these disjuncts really is a natural property.<br />

Above I agreed that this was a good objection. However, on the new definition of<br />

disjunctiveness, it is beside the point.<br />

To show that not being the only red thing is disjunctive, we need only express it as<br />

a disjunction of conjunctions such that at least one conjunct in each disjunct is much<br />

more natural than it is. We have the disjunctive expansion of not being the only red<br />

thing, and the first disjunct is being red and accompanied by a red thing. Now this<br />

disjunct as a whole may not be particularly natural, but the first conjunct, being red,<br />

is much more natural than not being the only red thing. So all we need to show is<br />

that one of the conjuncts in the second disjunct is much more natural than the whole<br />

disjunction. Since the second disjunct has only one conjunct, this means we have to<br />

show not being red is much more natural than not being the only red thing. However,<br />

there seems to be no simple way to show this. It is just entirely unclear how natural


Intrinsic Properties and Combinatorial Principles 584<br />

properties like not being red should seem to be. My guess (for what it is worth) is<br />

that like most properties that can be expressed by negations of English predicates, it<br />

is very unnatural. Certainly it is very unnatural if we suppose, as seems fair in this<br />

context, that F is only a natural property if all the things that are F resemble each<br />

other in some important way. The class of things that are not red is as heterogeneous<br />

a class as you can hope to find; blue berries, green leaves, silver Beetles, colourless<br />

gases and immaterial souls all find their way in. It is true that in New Work for<br />

a Theory of Universals, David Lewis provides two importantly distinct criteria for<br />

naturalness. One is the resemblance criterion just mentioned. The other is that F is<br />

only perfectly natural if it is fundamental. It might be thought that when we look<br />

at this criterion, it does turn out that not being red is much more fundamental than<br />

being the only red thing. Even if this is the case, it is not clear that it does help,<br />

or more importantly, that it should help. The problem Langton and Lewis were<br />

trying to handle is that not being the only red thing satisfies a particular combinatorial<br />

principle (independence of accompaniment), but only, they say, because there are two<br />

different ways of instantiating that property: not being red and being accompanied by<br />

a red thing. The problem is that not being red is not a way to instantiate a property,<br />

because it is not a way that something could be. It seems very intuitive that ‘ways<br />

things could be’, in this sense, are resemblance properties: they are properties that<br />

make for resemblance amongst their instantiators. And even if we can defend the<br />

claim that not being red is a fundamental property, the fact that it is not a resemblance<br />

property seems to undercut Langton and Lewis’s case here.<br />

The new definition of disjunctiveness does now provide a defender of Langton<br />

and Lewis’s theory with a response to Yablo’s criticism. On the new definition of<br />

disjunctiveness, we do not have to show that being red and accompanied by a red thing<br />

is more natural than not being the only red thing in order to show that the latter is<br />

disjunctive. However, in order to show that not being the only red thing is disjunctive,<br />

we still need to show that not being red is a moderately natural property, and this<br />

does not seem to be true.<br />

7 Conclusion<br />

There are four major differences between the analysis of intrinsic properties provided<br />

here and the one provided by Langton and Lewis. Three of these are reflected in the<br />

difference between the combinatorial principle they use, independence of accompaniment,<br />

and the combinatorial principle I use, membership in an I-set. All properties<br />

that are in I-sets are independent of accompaniment, but they also have a few<br />

other nice features. First, membership in an I-set guarantees not just independence of<br />

whether there are other things, but independence of what other types of things there<br />

are. This is the independence principle encoded in condition (T) on I-sets. Secondly,<br />

membership in an I-set guarantees independence of where the other things are. This<br />

is the principle encoded in condition (S). Third, the mereological principle (M) has<br />

no parallel in Langton and Lewis’s theory.<br />

The effect of these extra three restrictions is that I have to make many fewer<br />

appeals to naturalness than do Langton and Lewis. The fourth difference between



their theory and mine is in the role naturalness considerations play in determining<br />

which properties are intrinsic. In section 5 I offer two ways of finishing the analysis<br />

using naturalness. The first is in the new definition of disjunctiveness; with this<br />

definition in hand we can finish the story just as Langton and Lewis suggest. The<br />

second is in terms of blocking: F is intrinsic iff it is in an I-set and does not block<br />

any property that it is much less natural than. Both ways are designed to deal with a<br />

quite specific problem: properties that differ only in which things instantiate them in<br />

simple worlds have the same combinatorial features, so a definition of intrinsicness in<br />

terms of combinatorial features (as is Langton and Lewis’s, and as is mine) will not be<br />

able to distinguish them. Still, both solutions seem likely to provide the same answer<br />

in all the hard cases: the right answer.


The Asymmetric Magnets Problem<br />

There are many controversial theses about intrinsicness and duplication. The first<br />

aim of this paper is to introduce a puzzle that shows that two of the uncontroversial-<br />

sounding ones can’t both be true. The second aim is to suggest that the best way out<br />

of the puzzle requires sharpening some distinctions that are too frequently blurred,<br />

and adopting a fairly radical reconception of the ways things are.<br />

1 Two Theses about Duplication<br />

In all of David Lewis’s discussions of intrinsicness and duplication, he held that the<br />

two concepts are connected by a tight circle of interdefinition. Duplicates share all<br />

of their intrinsic features, and objects that share all of their intrinsic features are duplicates.<br />

(Lewis, 1983c,a; Langton and Lewis, 1998). Both of these claims are a little<br />

controversial. One might hold that some impure properties that aren’t shared by all<br />

duplicates, like having George Clooney as a part, are nevertheless intrinsic since gaining<br />

or losing them seems to amount to a non-Cambridge change (Weatherson, 2006a).<br />

And one might hold that some properties which don’t differ between duplicates by<br />

definition, such as being a duplicate of the Louvre as it actually is, are nevertheless extrinsic<br />

(Dunn, 1990). So maybe Lewis’s tight circle of interdefinition is not beyond<br />

question. But the following principle seems utterly uncontroversial to me.<br />

Intrinsicness Principle:<br />

• If a and b differ in their pure intrinsic features, they are not duplicates;<br />

• If a and b have the same pure intrinsic features, then they are duplicates.<br />

That conjunction is the first of our (hitherto) uncontroversial theses. The second<br />

needs a bit more work to state formally.<br />

It is fairly intuitive that whether two objects are duplicates is not an emergent<br />

feature of reality. In some sense, whether two complex objects are duplicates just depends<br />

on the properties of their parts and the relations between their parts. But this claim<br />

does turn out to be controversial; David Lewis (1983c) has controverted it. To a first<br />

approximation, his theory says that whether two objects are duplicates depends on<br />

whether they share the same perfectly natural properties. If there are any perfectly<br />

natural properties that are emergent, i.e. which are properties that complex objects<br />

have but not in virtue of the properties of or relations between their parts, then<br />

whether two objects are duplicates will also be emergent. Now Lewis doesn’t think<br />

there are any emergent perfectly natural properties, since the existence of such properties<br />

would be incompatible with the thesis of Humean Supervenience. But Lewis<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical<br />

Perspectives 20 (2006): 479-92. Thanks to the Philosophy Program at the RSSS, ANU, where this was<br />

first drafted, to audiences at University of Manitoba and Stanford University, and to the attendees at my<br />

seminar on David Lewis at Cornell University. I am especially grateful to Ben Caplan, John Hawthorne,<br />

Ishani Maitra, Raul Saucedo and Wolfgang Schwarz.



doesn’t think that Humean Supervenience is a necessary truth, let alone a conceptual<br />

truth, but at best a contingent truth. So the principle that duplication is not emergent<br />

is not something that is true in virtue of the concept of duplication.<br />

Still, nothing in Lewis’s views suggests that the following principle is false. If all<br />

the fundamental properties are not emergent, i.e. they are properties that complex<br />

things have in virtue of the fundamental properties of and relations between their<br />

parts, then duplication is not emergent. We might try and formalise this as follows.<br />

If all the fundamental properties are not emergent, then if the parts of x and y are<br />

duplicates, then x and y are duplicates. This principle is, however, too strong. It<br />

doesn’t account for the possibility that the parts of x and y are arranged differently.<br />

For instance, in the following example, the fusion of a and b is not a duplicate of the<br />

fusion of c and d, even though a is a duplicate of c and b is a duplicate of d.<br />

[Figure: the fusion of a and b, arranged differently from the fusion of c and d]<br />

The problem is that the arrangement of the two objects is different. So what we need<br />

is a principle that says that if all the perfectly natural properties are not emergent<br />

properties, and if the parts of x and y are duplicates, and those parts are arranged the<br />

same way, then x and y are duplicates. Saying this formally is not exactly trivial. The<br />

following version uses the idea of an isometry.¹<br />

Parts Principle This principle holds in all worlds in which no fundamental properties<br />

or quantities are emergent. If X and Y are sets of material objects, a is<br />

the fusion of the members of X and b is the fusion of the members of Y, f is a<br />

function X → Y, and i is an isometry defined on the space that X and Y are in,<br />

and the following conditions hold:<br />

• For all x in X, f(x) is a duplicate of x; and<br />

• For all x in X, if r is the region exactly occupied by x, then i(r) is the region exactly<br />

occupied by f(x),<br />

then a and b are duplicates.<br />

The Parts Principle is not as easy to state as the Intrinsicness Principle, but I think<br />

the idea it is expressing is fairly intuitive. Nevertheless, I think the two principles<br />

cannot both be true.<br />

¹ An isometry is “a transformation that does not change the distance between points” (Yaglom, 1962,<br />

11). That is, it is a function from points to points that doesn’t change distances. Although the isometry is<br />

initially defined as a function on points, it can obviously be extended to a function from regions to regions.<br />

If r is a region, i.e. a set of points, then i(r) is {i(x): x ∈ r}.<br />
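As an illustrative aside (not part of the original paper), the footnote’s definition can be sketched in code. The point representation and the `translation` helper are my own assumptions:<br />

```python
import math

def translation(l, d):
    # Hypothetical helper: the isometry that moves each point (an (x, y)
    # tuple) by distance l in the unit direction d.
    dx, dy = d
    return lambda p: (p[0] + l * dx, p[1] + l * dy)

def dist(p, q):
    # Euclidean distance between two points.
    return math.hypot(p[0] - q[0], p[1] - q[1])

def image_of_region(i, r):
    # Extending the point isometry i to a region r (a set of points):
    # i(r) = {i(x) : x in r}.
    return {i(x) for x in r}

# An isometry leaves all distances unchanged:
i = translation(5.0, (1.0, 0.0))
p, q = (0.0, 0.0), (3.0, 4.0)
assert math.isclose(dist(p, q), dist(i(p), i(q)))
```

The assertion records the defining feature of an isometry: distances between points are preserved, and the extension to regions is just pointwise application.<br />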




2 Three Distinctions<br />

The problem I’ll be focussing on looks rather simple, but it brings out several points<br />

that seem to have metaphysical interest. In particular, it highlights the importance<br />

of three distinctions that are easy to blur when doing metaphysics. It will make the<br />

exposition of the puzzle easier to place these distinctions up front.<br />

The first distinction is between features and properties. Most metaphysicians accept<br />

that to fully characterise the world, we need to do more than say what exists. As<br />

well as saying what there is, we need to say how the things that exist are. It is easy to<br />

assume that to do that, we need to say what properties things have. But this need not<br />

be correct, or at least it need not be correct if we are looking to characterise the world<br />

in the most fundamental way. It might be that the fundamental features of reality are<br />

quantities, i.e. features that objects have to different degrees or in different amounts.<br />

Properties, like being green, are features, but quantities, like mass or velocity, are also<br />

features, just features that can be instantiated to different degrees or magnitudes. So<br />

feature is a more general category than property, and so as to not beg any questions, I’ll<br />

talk about intrinsic features and fundamental features rather than intrinsic properties<br />

and fundamental properties throughout. My solution to the problem will involve assuming<br />

that at least some of the fundamental features of reality are indeed quantities<br />

not properties.<br />

The second distinction is between fundamental features and perfectly natural features.<br />

Fundamental features are features that do not obtain in virtue of other features<br />

obtaining. The fundamental features are part of a minimal basis we need for characterising<br />

reality. Generally fundamental features are related to other fundamental<br />

features by exceptionless laws, though this is not part of their definition. What is<br />

definitional is that they are basic and that they provide a basis for characterising the<br />

world without redundancy. (As a Humean, I’d also say that there are no necessary<br />

connections between distinct fundamental features, but that is a controversial metaphysical<br />

thesis, not a defining characteristic.) Perfectly natural features are features<br />

that make for primitive objective resemblance between things that instantiate them.<br />

By a primitive objective resemblance, I mean an objective resemblance that does not<br />

obtain in virtue of sharing more basic (in the limit, fundamental) properties. David<br />

Lewis (1983c) assumes, without much by way of argument as far as I can see, that the<br />

fundamental features are the perfectly natural features. My solution to the problem<br />

will involve rejecting that identity.<br />

The final distinction is between the thesis that all the fundamental non-spatiotemporal<br />

features of reality are intrinsic properties of points, and the thesis that these features<br />

are local features. Jeremy Butterfield (2006) has stressed the importance of this<br />

distinction for metaphysics. A feature is local to a point iff it is intrinsic to arbitrarily<br />

small regions around that point. For example, the slope of a curve at a point is local<br />

to that point, even though it isn’t intrinsic to the point. So locality and intrinsicness<br />

can come apart. (This raises interesting questions about, for example, whether it is<br />

best to state the thesis of Humean Supervenience in terms of local properties or in<br />

terms of intrinsic properties of points.) I’ll say more about the importance of this<br />

distinction in section 5.



I’ve already made use of these distinctions in setting out the principles about intrinsicness<br />

in section one. (In particular, it is crucial that the Parts Principle is stated<br />

in terms of fundamental rather than perfectly natural features.) Using them we can<br />

get to our central puzzle, the Asymmetric Magnets Problem.<br />

3 The Asymmetric Magnets Problem<br />

Our puzzle is similar to the spinning sphere, often thought to raise a problem for<br />

Humean Supervenience (Armstrong, 1980). The similarity is not in respect of its<br />

target; the puzzle is meant to be a puzzle for everyone who accepts those two principles,<br />

not just the Humean. Rather, the similarity is in that the puzzle is set in a world<br />

where there are homogeneous physical objects. Such a world is in many ways quite<br />

distant from actuality. But I think such worlds are useful fictions for elucidating the<br />

conceptual connections between central concepts in metaphysics. The puzzle is also<br />

set in a world with Euclidean spatial geometry. Again this is a fiction, but a useful<br />

one for working out conceptual connections.<br />

In this world, some of the fundamental features are what we’ll call vector features.<br />

(This is a much smaller deviation from actuality.) Vector features are either quantities<br />

like velocity the value of which is a vector, or properties like having velocity v,<br />

where v is some vector. In particular, the strength and direction of the magnetic field<br />

throughout the world is a fundamental feature of the world. I’ll assume that both<br />

space-time regions and physical objects can have these vector features, although I’m<br />

only going to focus on the field strength and direction at a point in a physical object.<br />

Finally, I’ll assume that all of the fundamental physical quantities in the world<br />

are local. So there are no fundamental emergent quantities in the world. It might<br />

be worried that the last two assumptions are inconsistent, and that vector quantities<br />

could not be local. I’ll come back to this worry in section 5.<br />

Some of the things in this world are magnets. These are homogeneous objects<br />

with a uniform non-zero magnetic field throughout. I’m going to represent the magnetic<br />

field strength and direction of such a magnet with an arrow pointing towards<br />

the north pole of the magnet. The length of the arrow is proportional to the strength<br />

of the field.<br />

The simplest kind of magnet is a bar magnet, just like the kind I used to play with<br />

in primary school. (Apart from being homogeneous of course!) These are cuboids<br />

with equal heights and depths, and a long length in the direction of their magnetic<br />

field. Suzy is playing with some magnets and, tiring of using her magnets to grab<br />

the other children’s, decides to sharpen one end of each of her magnets for use as a<br />

weapon. The teacher notices this, confiscates the weaponised magnets, and lays them<br />

out on her desk. Here is what they look like from the teacher’s point of view.



[Figure: the three sharpened magnets laid out on the desk, labeled A, B, and C]<br />

I’ve added the labels.<br />

Each magnet has one sharp end and one flat end. Each also has one north pole<br />

and one south pole. And, of course, each has one end to the right (from the teacher’s<br />

point of view) and one end to the left. The distribution of these properties of ends is<br />

different in the three cases.<br />

• A’s north pole is sharp and to the right.<br />

• B’s north pole is sharp and to the left.<br />

• C’s north pole is flat and to the left.<br />

Question: Which of the magnets are duplicates?<br />

Answer: A and B, but not C.<br />

I hope you agree with the answer! If not, let me provide a small argument.<br />

A and B are intrinsic duplicates because we could ‘line up’ A and B by picking<br />

A up, spinning it around, and moving it across a bit. And that’s only possible if<br />

the two objects are duplicates. This idea, that objects that can be transformed into<br />

one another by simple geometric transformations such as rotation and translation are<br />

duplicates, is a very deep part of our conceptual scheme. Consider, for example, Euclid’s proof of<br />

proposition 4.<br />

Let ABC, DEF be two triangles having the two sides AB, AC equal to the<br />

two sides DE, DF respectively, namely AB to DE and AC to DF, and the<br />

angle BAC equal to the angle EDF. I say that the base BC is also equal to<br />

the base EF, the triangle ABC will be equal to the triangle DEF . . . For,<br />

if the triangle ABC be applied to the triangle DEF, and if the point A be<br />

placed on the point D and the straight line AB on DE, then the point B will<br />

also coincide with E, because AB is equal to DE. Again, AB coinciding<br />

with DE, the straight line AC will also coincide with DF, because the<br />

angle BAC is equal to the angle EDF; hence the point C will also coincide<br />

with the point F, because AC is again equal to DF. But B also coincided<br />

with E; hence the base BC will coincide with the base EF . . . and will be<br />

equal to it. Thus the whole triangle ABC will coincide with the whole<br />

triangle DEF, and will be equal to it. (Euclid, 1956, 247-8)<br />

As many mathematicians have pointed out over the centuries, this is not Euclid’s<br />

finest moment as a geometer. The idea he’s pushing is clear enough. If ABC and DEF<br />

satisfy the assumptions, then you can pick up ABC and place it on DEF, so that the<br />

sides and vertices all coincide. Does this prove that the sides and angles in the original<br />

triangle are equal? Not really, or at least not without the assumption that picking up



ABC and moving it around doesn’t change its side lengths or angle magnitudes. And<br />

Euclid hadn’t said anything at that stage of the Elements to justify this assumption.<br />

So qua axiomatic geometer Euclid has blundered here. But there is more to life<br />

than axiomatic geometry. There is, for instance, metaphysics. And the assumption<br />

Euclid is using here is, I think, a sound metaphysical intuition. (If it weren’t, the<br />

complaints about this fundamental proof in Euclid would have been earlier, and more<br />

frequent, than they actually were.) That intuition is, I think, that intrinsic properties<br />

are not, ceteris paribus, changed merely by moving objects around. Of course other<br />

things are not always equal; the intrinsic properties of a car are not preserved if you<br />

drive it into a wall. But the kind of abstract motion that Euclid is contemplating<br />

when he moves ABC onto DEF, or that I’m contemplating when I think about moving<br />

B around so it lines up with A, does not destroy intrinsic properties. So that’s an<br />

argument that A and B are intrinsically alike.<br />

On the other hand, A and B each have a property that C lacks. Their magnetic<br />

field points towards their sharp end. This is in some sense a relational property: it is<br />

defined in part in terms of two things pointing in the same direction, but it doesn’t<br />

seem like a relation between the magnet and anything else. In general, properties<br />

that things have in virtue of relations between their parts are intrinsic properties. (It<br />

is intrinsic to the earth, for example, that more of its surface is wet than dry, even<br />

though this property is defined in terms of a relation.) So this is an intrinsic property<br />

of A. And, given the plausibility of the Intrinsicness Principle, that’s a reason to think<br />

that A and C are not duplicates.<br />

4 The Principles and the Problem<br />

Here then is our problem. Try to answer the following question given the two principles:<br />

Is the direction of a vector feature an intrinsic feature of its bearer or not? If<br />

yes, then A and B are not duplicates. If no, then A and C are duplicates. (In fact all<br />

three are duplicates, though I won’t prove this.) Neither way does it turns out that<br />

A and B are duplicates, but C is not, as we need. The aim of this section is to spell<br />

out that little argument in more detail, so we can see how the principles relate to the<br />

problem.<br />

First, we’ll assume that the direction of a vector feature is an intrinsic feature of its<br />

bearer. We need a way to rigidly denote directions, so we’ll call the direction that the<br />

vector in A points d₁, and the direction that the vector in B points d₂. Since d₁ ≠ d₂,<br />

A and B differ in their intrinsic properties. By the first clause of the Intrinsicness<br />

Principle, it follows that A and B are not duplicates.<br />

Second, we’ll assume that the direction of a vector feature is not an intrinsic feature<br />

of its bearer. Now we want to show that A and C are duplicates. To do this<br />

we’ll use the Parts Principle. All of the fundamental quantities are local, so the Parts<br />

Principle applies. Now let the members of X and Y be the point-sized parts of A and<br />

C. Let l be the distance from the tip of the pointed end of A to the tip of the pointed<br />

end of C. The isometry i is a translation with length l and direction d₁, i.e. a function<br />

that maps any point to the point that is distance l away from it in direction d₁. This<br />

isometry maps A onto C. By the second clause of the Intrinsicness Principle, and the<br />



assumption that direction is not intrinsic, every point in A is a duplicate of any point<br />

in C. So by the Parts Principle, A and C are duplicates.<br />

The conclusion is that if we want to say that A and B are duplicates, but A and C<br />

are not duplicates, then we can’t hold on to both the Intrinsicness Principle and the<br />

Parts Principle.<br />

I think we should give up the Parts Principle. In particular, we should say that<br />

the Parts Principle holds only if all the perfectly natural features of reality are local,<br />

and this might fail to hold even if all the fundamental features of reality are local. The<br />

need for the distinction between these possibilities is, I think, the main lesson of the<br />

problem. But before we get to that conclusion, I want to address an objection to the<br />

argument so far.<br />

5 Two Worries About Locality<br />

I can imagine an objection to this argument along the following lines. In the setup<br />

of the problem, I said that some of the fundamental features of reality are vector-valued<br />

quantities. I also said that all of the fundamental features of reality are local.<br />

But these assumptions are inconsistent. Vector properties are not intrinsic properties<br />

of points. (Since we’re trying to hold on to the Intrinsicness Principle, we have to<br />

accept this.) Hence they are not, in the salient sense, local features of reality.<br />

I think this objection is sound all the way to the last step. As noted above, we<br />

need to distinguish between local properties and intrinsic properties of points. The<br />

distinction is common in mathematics, but has not received sufficient attention in<br />

metaphysics. Jeremy Butterfield’s (2006) is an important exception, one that was very<br />

influential on this paper.<br />

So it is fair to say that all fundamental features of reality are local if all the facts<br />

about the world supervene on facts about the distribution of fundamental features<br />

in arbitrarily small regions, plus facts about the spatiotemporal arrangement of those<br />

regions. And there is no reason to think that positing vector features as fundamental<br />

is inconsistent with the fundamental features being local in this sense. Butterfield argues,<br />

persuasively, that velocity properties in Newtonian mechanics are not intrinsic<br />

properties of points, but he stresses that this doesn’t mean they are not local in this<br />

sense. He is focussing on velocity, and what he says doesn’t immediately translate<br />

to all vector properties. (It matters to his argument, for example, that velocities are<br />

conceptually connected to the positions of objects at different times, in a way that, for<br />

example, magnetic fields are not.) But I think his conclusions are independently plausible.<br />

Indeed, the argument from isometry above is an argument for the very same<br />

conclusion. So the short version of my reply is to concede that once we’ve allowed<br />

vector properties as fundamental, we can’t say that all the fundamental features of<br />

reality are intrinsic properties of points and spatiotemporal relations between them,<br />

but this is consistent with saying that all the fundamental features of reality are local.<br />

Once we’ve said that, however, a different kind of objection becomes salient. It<br />

might be thought that if the fundamental features are intrinsic properties of regions<br />

not of points, the natural version of the Parts Principle is slightly weaker than as<br />

stated. In particular, we should restrict our attention to cases where the sets X and<br />



Y consist of objects with positive size. Because this weakening flows naturally from<br />

the definition of locality, it doesn’t look like an ad hoc weakening. However this<br />

weakening does not at the end of the day help to save the Parts Principle. That’s<br />

because we can find a different way to divide up A and C into parts of positive size<br />

so that the Parts Principle still applies. A sketch of how we’ll (start to) divide up A is<br />

here:<br />

[Figure: magnet A divided into one large square part and recursively many smaller diamond parts filling the pointed end]<br />

The idea is that we make one large square part, and then divide the rest of A up into<br />

infinitely many diamonds. We do this recursively. Note that we start with a triangle<br />

whose base is to the left and vertex to the right. We create from this a diamond whose<br />

four vertices are the vertex of the triangle, and the midpoints of each of the three sides<br />

of the triangle. If we imagine cutting this diamond out of the magnet, we’d be left<br />

with two small triangles, each with a base on the left and a vertex on the right. We<br />

can do the same trick to create diamonds and (in imagination) cut them out, leaving<br />

us with four triangles. Repeat this until we have an infinity of diamonds. The fusion<br />

of all these diamonds with the large square will be our original magnet. Moreover,<br />

since every part is symmetric around the axis perpendicular to d₁, each part will be<br />

a duplicate of the corresponding part in C. So the Parts Principle still tells us, falsely,<br />

that A and C are duplicates. We have to look somewhere else to avoid the problem.<br />
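This recursive construction can be checked numerically with a sketch of my own (not from the paper), assuming only that each diamond takes up half its triangle’s area and leaves two quarter-area triangles:<br />

```python
def diamond_areas(triangle_area, depth):
    # The diamond cut from a triangle (its vertex plus the three side
    # midpoints) has half the triangle's area; removing it leaves two
    # triangles, each with a quarter of the original area. Recurse on both.
    if depth == 0:
        return []
    areas = [triangle_area / 2]
    for _ in range(2):
        areas += diamond_areas(triangle_area / 4, depth - 1)
    return areas

# After n rounds the diamonds cover 1 - 2**(-n) of the triangle, so in the
# limit the square plus the infinitely many diamonds exhaust the magnet.
total = sum(diamond_areas(1.0, 18))
assert abs(total - 1.0) < 1e-5
```

So the fusion of the square with all the diamonds really is the whole magnet, which is all the argument in the text needs.<br />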

6 The Solution and Its Problems<br />

The Asymmetric Magnets Problem looks easy. It is easy to say intuitively why A<br />

and B are duplicates, but C is not. The reason was given at the end of section three.<br />

In both A and B, the magnetic field ‘points’ in the same direction that the physical<br />

object does, while in C this is not the case. The difficulty arises when we try and<br />

shoehorn this intuition into a formal theory. We need to say that it is intrinsic to the<br />

magnet that its magnetic field points the same way it does. And we need to say this<br />

without saying that the direction of the magnetic field is itself intrinsic. I know of<br />

one way to do this, but it involves some overheads. I’m not going to argue for this<br />

at any length here, but I think the difficulty of providing a general solution to the<br />

Asymmetric Magnets Problem is one of many reasons to think that we should learn<br />

to live with these overheads.<br />

My solution starts with Lewis’s definition of duplication. I gave a rough statement<br />

of this above; we now need a more precise statement. For Lewis, two objects


The Asymmetric Magnets Problem 594<br />

are duplicates iff there is a mapping m from parts of one to parts of the other that (a)<br />

is an isomorphism and (b) for all n-place perfectly natural properties P, and all parts<br />

x1, . . . , xn of the first object, Px1 . . . xn iff Pm(x1) . . . m(xn). So the objects are duplicates<br />

if their parts have the same natural properties, and stand in the same perfectly natural<br />

relations. That’s how Lewis’s theory goes; now we have to start adding variations.<br />

The first variation is quite radical, but one we have independent reason to make.<br />

As John Hawthorne (2006) and David Denby (2001) have argued, Lewis’s theory<br />

of properties has difficulties accounting for quantities. Hawthorne notes that if we<br />

just take individual mass properties, e.g. having mass 17kg, having mass 42ng, etc., as<br />

perfectly natural, there is no way to state physical laws involving mass, such as the<br />

law of gravitation, as simple statements where all predicates denote perfectly natural<br />

properties. But the theory of laws in Lewis (1983c) says that all physical laws<br />

are simple statements where all predicates denote perfectly natural properties. This<br />

is something of a problem. For different reasons, Denby suggests that we take determinables<br />

as being perfectly natural. The individual mass properties are perfectly<br />

natural, he suggests, but not fundamental. What is fundamental is the determinable,<br />

mass, of which they are determinates.<br />

I think we should make a more radical move in the interests of simplicity. What<br />

reason do we have for thinking that the fundamental ways things are are properties<br />

rather than quantities or magnitudes? Very little reason, I’d say. Modern physics<br />

seems much more concerned with quantities than properties. What properties it is<br />

concerned with, such as being positively charged, seem to be derived from more fundamental<br />

quantities, such as charge. It would perhaps be convenient for formal semantics<br />

if the world had an object-property structure to match the subject-predicate<br />

structure of simple sentences. But we have no reason to believe the world will be so<br />

accommodating. It might turn out that there are a few fundamental quantities in the<br />

world. A quantity is a feature that objects have to different degrees. We can identify<br />

each value a quantity takes with a property. (Examples are properties like having<br />

mass 17kg.) But that shouldn’t make us think that the properties are metaphysically<br />

primary. They might be derived from the quantities. Hawthorne’s and Denby’s arguments<br />

push us towards that conclusion, and I’ll show here that assuming quantities<br />

are primary helps us state a solution to the Asymmetric Magnets Problem.<br />

Some quantities take simple values. The values of the mass quantity, for instance,<br />

are sufficiently simple that they can be represented by real numbers. But not all<br />

quantities are like that. In some cases the values are structured entities, which are<br />

composed of a magnitude and some other part or parts. Vector quantities<br />

are like this. We can naturally think of vectors as structured entities composed of a<br />

direction and a magnitude. I’m going to assume, at least for the sake of solving this<br />

problem, that any perfectly natural quantity takes values that either are magnitudes<br />

(as mass does) or takes values that are structured entities composed, among other<br />

things, of a magnitude. (This is an empirical assumption, and it may well not be<br />

true. If it is not true, the analysis of duplication below will need to be made more<br />

complicated.) For ease of exposition, I’ll say that a function f is perfectly natural iff it<br />

maps objects onto values, such that there is some perfectly natural quantity such that<br />

for any x, f (x) is the value that quantity takes with respect to x. So if the quantity is



mass, f (x) is x’s mass. And I’ll say that | f (x)| is the magnitude of this value, in the<br />

sense described above. (The notation here is slightly non-standard, since I allow that<br />

magnitudes may be negative numbers. For example, if f represents charge and x is<br />

negatively charged, then | f (x)| may be a negative number.)<br />

Now for the definition of duplication. Two objects are duplicates iff there is<br />

a mapping m from parts of one to parts of the other that (a) is an isomorphism<br />

and (b) for all n-place natural functions f, and all parts x1, ..., xn of the first object,<br />

|f(x1, ..., xn)| = |f(m(x1), ..., m(xn))|. So the objects are duplicates if the magnitudes<br />

of each of the natural quantities of each of their parts are the same. This allows<br />

that the quantities can vary without loss of intrinsic character, provided there is no<br />

variation in magnitude.<br />

The Asymmetric Magnets Problem suggests a view on which the directions of<br />

vector features are indirectly relevant to the intrinsic nature of objects. ‘Indirectly’<br />

because changing the direction doesn’t change the intrinsic properties of objects. But<br />

‘relevant’ because the direction can matter, as we see when comparing A and C. The<br />

definition of duplication in terms of quantities that take structured values allows us<br />

to capture this indirect relevance. We’ll do so by defining a feature whose magnitude<br />

varies depending on how the object’s shape and the direction of its vector features are<br />

coordinated.<br />

Let f be a function representing some perfectly natural quantity such that f (x) is<br />

a vector. That is, f represents some perfectly natural vector quantity. This quantity<br />

may or may not be fundamental, though it will be fundamental in the cases under<br />

consideration here. Let c be a function that takes an object as input and returns its<br />

geometric centre as output. (By the geometric centre of x I mean the centre of mass<br />

of an object with the same shape as x and uniform mass density throughout.) Now<br />

suppose that the following function is perfectly natural.<br />

g(x, y, z) =df the cosine of the angle between f(x) and the ray from c(y) to c(z)<br />

The motivation for taking this to be perfectly natural (but obviously not fundamental)<br />

is that it delivers the right results about the Asymmetric Magnets Problem, and<br />

it seems to deliver those results for the right reasons. To see it delivers the right results,<br />

just apply the above definition of duplication. Two objects are duplicates iff<br />

there is an isomorphism m from the parts of one to the parts of the other such<br />

that for all n-place natural functions f, and all parts x1, ..., xn of the first object,<br />

|f(x1, ..., xn)| = |f(m(x1), ..., m(xn))|. To make the discussion easier, we’ll redraw<br />

the magnets with some salient parts labelled.<br />

[Figure: magnets A, B and C redrawn, with parts A1 and A2, B1 and B2, C1 and C2 labelled]



Any isomorphism from A to B that satisfies this constraint has to map A1 to B1, and<br />

A2 to B2. And any isomorphism from A to C that satisfies this constraint has to map<br />

A1 to C1, and A2 to C2. Now let f be the function whose value is represented by<br />

the arrow, and let g be the function defined as above. So if A and B are duplicates,<br />

it must be the case that g(A, A1, A2) = g(B, B1, B2). It is easy to verify that since<br />

f(A) points in the same direction as the ray from the centre of A1 to the centre of<br />

A2, g(A, A1, A2) = 1; and similarly, since the ray from the centre of B1 to the centre of<br />

B2 points in the same direction as f(B), g(B, B1, B2) = 1. So there is no reason here to<br />

doubt that A and B are duplicates. On the other hand, since the ray from the centre<br />

of C1 to the centre of C2 is in the opposite direction to f(C), g(C, C1, C2) = -1. So<br />

there is no isomorphism from parts of A to parts of C that preserves the value of<br />

perfectly natural properties, so A and C are not duplicates, as required.<br />
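The arithmetic behind these verdicts can be checked with a minimal Python sketch. The 2-D vectors and centre points below are illustrative stand-ins for the magnets, not anything given in the text:<br />

```python
import math

def cos_angle(v, w):
    """Cosine of the angle between 2-D vectors v and w."""
    dot = v[0] * w[0] + v[1] * w[1]
    return dot / (math.hypot(*v) * math.hypot(*w))

def g(field, centre1, centre2):
    """g(x, y, z): cosine of the angle between the field vector f(x)
    and the ray from the centre of part y to the centre of part z."""
    ray = (centre2[0] - centre1[0], centre2[1] - centre1[1])
    return cos_angle(field, ray)

# Magnet A: field points rightward, and A1 sits to the left of A2,
# so the ray from A1's centre to A2's centre also points rightward.
print(g((1.0, 0.0), (0.0, 0.0), (1.0, 0.0)))    # 1.0
# Magnet C: field points leftward, but C1 still sits to the left of C2.
print(g((-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)))   # -1.0
```

An aligned field and ray give 1, an anti-aligned pair gives -1, which is the difference in magnitude that the definition of duplication needs.<br />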

I don’t doubt that there are other ways to solve this problem, so I certainly won’t<br />

try arguing that this is the only solution. But I think it works, and the reason it<br />

works is because the values of natural quantities are structured entities, in this case<br />

vectors. Because they have structure, we can use one part of the structure (i.e. the<br />

magnitude) in determining what is directly relevant to intrinsicness, and another part<br />

of the structure (in this case the direction) in determining what is indirectly relevant.<br />

So it’s an important advantage of using quantities rather than properties as the centrepiece<br />

of our metaphysics that the values of natural quantities can be structured<br />

entities, and having something like structured quantities seems crucial to solving this<br />

problem.<br />

Although g represents a perfectly natural quantity, it does not represent a fundamental<br />

quantity. Instead, it represents a quantity whose value supervenes on the<br />

distribution of other perfectly natural quantities. So we have to allow that there is a<br />

distinction between the fundamental quantities and the perfectly natural quantities.<br />

I don’t think this is a cost of the theory; there is no way to capture the idea that directions<br />

are indirectly relevant without distinguishing between the perfectly natural<br />

and the fundamental, so the Asymmetric Magnets Problem is a reason to make such<br />

a distinction. (I’m indebted here to Ben Caplan.)<br />

We can reduce the apparent cost of this distinction by noting that one reason<br />

we might have for blocking redundant natural quantities does not apply here. (By<br />

a redundant quantity, I just mean one that supervenes on the fundamental quantities.)<br />

We don’t want to say that disjunctive properties like being grue, that supervene<br />

on other natural properties, are perfectly natural. But that’s not primarily because<br />

of the supervenience, but because of the fact that grueness doesn’t make for resemblance<br />

amongst the things that instantiate it. So at least that reason for caring about<br />

redundancy doesn’t apply here. (I’m indebted here to Raul Saucedo.)<br />

Finally, it is crucial to my solution that A, B and C have these parts. If A, B<br />

and C are extended simples, I can’t run the argument I make here. Indeed, if they<br />

are extended simples, it looks like they are duplicates by my definition. That seems<br />

bad. I think this is a problem that we don’t need to worry about, because this isn’t<br />

a real possibility. I’ll concede for the sake of argument that there are such things<br />

as extended simples. What I don’t see any need to concede the possibility of are<br />

asymmetric extended simples. In general, the way that we deduce that an object has



parts is by noting it has different properties at different places. (This point is made<br />

in Sider (2003).) I think this is just the right strategy to use, as a matter of necessity.<br />

If an object has different properties in different locations, it has different parts in<br />

those different locations. So there could not be extended simples that are asymmetric<br />

magnets. The case where my theory produces the wrong result is an impossible case.<br />

7 Wrapping Up<br />

This paper has had several ambitions, some loftier than others. The most basic aim<br />

has been to introduce the Asymmetric Magnets Problem, and argue that it is going<br />

to be hard work for a systematic theory of intrinsicness to account for the facts about<br />

the problem. The more profound aims involve tearing apart concepts that metaphysicians<br />

often take for granted are interchangeable. My solution to the problem<br />

involves distinguishing local features from intrinsic features of points, fundamental<br />

features from perfectly natural features, and, most importantly, features from properties.<br />

The last of these is, I think, the biggest point. If we come to believe that quantities,<br />

not qualities, are the fundamental ways things are, then quite a bit of metaphysical<br />

orthodoxy needs rewriting. Some of that rewriting may be simple; just a matter of<br />

crossing out ‘li’ and writing in ‘nt’ in the middle of some words. But changes in fundamental<br />

metaphysics tend not to be isolated, and the rewriting project may lead to<br />

more wide-ranging changes. (Egan (2004) makes this point well, with an important<br />

illustration.) Now I certainly haven’t given anything like a conclusive argument in<br />

this paper that we should set about that project immediately. I have, however, provided<br />

one reason to think the project will eventually be necessary, and I suspect that<br />

more reasons will be provided in the future.


Chopping Up Gunk<br />

John Hawthorne, Brian Weatherson<br />

Atomism, the view that indivisible atoms are the basic building blocks of physical<br />

reality, has a distinguished history. But it might not be true. The history of physical<br />

science certainly gives many of us pause. Every time some class of objects appeared<br />

to be the entities that Newton had described as “solid, massy, hard, impenetrable,<br />

movable Particles” out of which “God in the Beginning formed Matter” (Newton,<br />

1952, 400), further research revealed that these objects were divisible after all. One<br />

might be tempted to see that history as confirming Leibniz’ dismissal of atomism as<br />

a “youthful prejudice”.1 Perhaps material objects and their parts are always divisible.<br />

There are no extended atoms; nor are there point particles which compose material<br />

beings. 2<br />

When first presented with this hypothesis, our imaginations are quickly drawn to<br />

picturing the process whereby a quantity of such matter (call it gunk) is chopped<br />

up into smaller and smaller pieces. Prima facie, there is nothing problematic here:<br />

insofar as such a process continues without end, the initial quantity gets resolved into<br />

smaller and smaller chunks with no limit to the diminution. But suppose this process<br />

is packed into an hour, as imagined by José Benardete in his 1964 monograph Infinity:<br />

Take a stick of wood. In 1/2 minute we are to divide the stick into two<br />

equal parts. In the next 1/4 minute we are to divide each of the two<br />

pieces again into two equal parts. In the next 1/8 minute we are to divide<br />

each of the four pieces (for there are now four equal pieces) again into<br />

two equal parts, &c. ad infinitum (Benardete, 1964, 184).<br />

If matter is divisible without end there seems to be no conceptual obstacle to each of<br />

the divisions. Yet how are we to imagine the situation at the end of the hour, when<br />

the super-task (call it ‘super-cutting’) has been performed on a quantity of gunk? 3<br />

If there were extended atoms that were never annihilated, it is clear enough what<br />

would happen if some super-being undertook to perform super-cutting: the process<br />

would grind to a halt when insurmountably hard particles resisted the chopper.<br />

If, meanwhile, there were point-sized particles that composed planes that were<br />

as thin as a line, it would be natural to picture the limit of the process as a sea of<br />

separated slivers, each devoid of finite extent along one dimension. As Benardete,<br />

† Penultimate draft only. Please cite published version if possible. Final version published in The<br />

Monist 87 (2004): 339-50.<br />

1 See ‘Nature Itself’ in (Leibniz, 1998, 220).<br />

2 Cf. Leibniz: ‘I hold that matter is essentially an aggregate, and consequently that it always has actual<br />

parts,’ in ‘Third Explanation of The New System,’ (Leibniz, 1998, 193).<br />

3 What is important, of course, is that the sequence of separations occurs: it does not matter whether<br />

some kind of super-sharp knife is responsible for them. In what follows, descriptions of cutting sequences<br />

can be replaced without loss of content by descriptions of separation sequences, leaving it open whether<br />

repulsive forces or chance events or knives or . . . are responsible for the separation sequence.



notes, one might then redo super-cutting in order to finally resolve the original stick<br />

into a sea of “metaphysical motes” devoid of finite extent in any direction:<br />

At the end of the minute how many pieces of wood will we have laid out<br />

before us? Clearly an infinite number. If the original stick was twenty<br />

inches in length, one inch in width, and one inch in depth, what are<br />

the dimensions of the metaphysical chips into which the stick has been<br />

decomposed? Each chip will be one inch by one inch by one inch by –<br />

what? So prodigiously thin must each chip be that its value is certifiably<br />

less than any rational (or irrational) quantity. Let us now take up one of<br />

the metaphysical chips and decompose it further into an infinite number<br />

of metaphysical splinters. In 1/2 minute we shall divide the chip into two<br />

equal parts. Each piece will be one inch by 1/2 inch. In the next 1/4<br />

minute we shall divide each of the two pieces again into two equal parts,<br />

yielding four pieces each being one inch by 1/4 inch. In the next 1/8<br />

minute we shall divide each of the four pieces again into two equal parts,<br />

&c ad infinitum. At the end of the minute we shall have decomposed the<br />

metaphysical chip into metaphysical splinters. Each splinter will be one<br />

inch in length. Let us now take up one of the metaphysical splinters and<br />

break it down into an infinite number of metaphysical motes (Benardete,<br />

1964, 184-5)<br />

The number of cuts made on the stick, the chip and the splinter respectively is aleph<br />

zero. The number of chips, splinters and motes left at the end of each cutting process,<br />

meanwhile, is aleph one. (Think of numbering each piece in a super-cutting process<br />

by an infinite expansion of ones and zeros as follows: if it lay on the left of the first<br />

cut, the first numeral is a zero, if to the right, the first numeral is a one; if it lay on<br />

the left of one of the pieces that was divided at the second round of cutting, its second<br />

numeral is a zero, if to the right a one; and so on. For each binary expansion of ones<br />

and zeros there is a bit at the end with that expansion.) This result is surprising to<br />

some, but poses no deep conceptual confusion. With an ontology of chips, splinters<br />

and motes available to us, there is a natural description of the limit of<br />

the super-cutting processes described.<br />
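For finitely many rounds, the labelling scheme in the parenthetical can be sketched as follows (a hypothetical illustration; in the limit the labels become infinite sequences of zeros and ones):<br />

```python
from itertools import product

def labels(rounds):
    """Label each piece left after `rounds` rounds of cutting by its
    history: 0 = it lay left of a cut, 1 = it lay right of it."""
    return [''.join(map(str, bits)) for bits in product((0, 1), repeat=rounds)]

print(labels(2))        # ['00', '01', '10', '11']
print(len(labels(10)))  # 1024: after n rounds there are 2**n pieces
```

Each finite stage yields 2^n pieces; the pieces left at the end of the super-cut correspond to the infinite expansions instead.<br />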

But what to say when gunk is subjected to super-cutting? If each quantity of<br />

matter has proper parts, then a sea of metaphysical motes is not, it would seem, an<br />

available outcome. In what follows, we unpack this puzzle, providing along the way<br />

some a priori physics for gunk-lovers. The problem is only well formed when we<br />

make explicit some of the assumptions that drive it. We do so below:<br />

(1) Gunk<br />

Every quantity of matter has proper parts.<br />

(2) Conservation of Matter:<br />

No mereological sum of quantities of matter can be destroyed by any<br />

sequence of cuts (though it may be scattered).4<br />

4 The ‘can’ here is one of nomological possibility.



(3) Occupation<br />

If a quantity of matter occupies a point of space, then there is some volume,<br />

extended in all dimensions, to which that point belongs which that<br />

quantity of matter occupies.<br />

(4) Super-cutting<br />

The laws of the world permit super-cutting.<br />

Note that (1), the thesis that every quantity of matter has parts, does not, by itself,<br />

entail any of the other theses. One might also think that matter sometimes vanishes<br />

as a result of some sequence of cuts, denying (2). One might hold that there are metaphysical<br />

splinters (and perhaps even chips), denying (3). One might hold that any<br />

given quantity of matter does have point-sized pieces but that those pieces themselves<br />

have parts (the parts being of the same size as the whole in this case), denying (3).<br />

One might hold that some pieces of gunk can occupy, say, a spherical region and also<br />

a single isolated point at some considerable distance from the spherical region (while<br />

maintaining that no part of it merely occupies the point), also denying (3). One<br />

might imagine that while always having parts, the parts below a certain thickness are<br />

inseparable, denying (4). One might think there is a minimum amount of time that<br />

any event of separation takes, also denying (4) and so on.<br />

If the gunk hypothesis is maintained, but one or more of (2) to (4) is jettisoned,<br />

there is no problem left to solve. For example: If we are allowed to suppose that<br />

gunk may vanish, then it will be perfectly consistent to say that nothing is left at<br />

the limit of super-cutting. If we are allowed parts that lack finite extent, then it will<br />

be consistent to adopt Benardete’s picture of the outcome. And so on. Our puzzle,<br />

properly formulated is: What would happen if super-cutting occurred in a world<br />

where (1) to (4) are true?<br />

In order to answer that question, we need to supplement Benardete’s brief discussion<br />

of the super-cutting process. It is not immediately clear from what he says that<br />

super-cutting a piece of wood will turn an object into chips, even assuming the wood<br />

to be composed of point particles. That is a natural description of the limit of the<br />

process, but it is hardly one that is forced upon us by the barebones description of<br />

the process that Benardete provides. When we divide the stick into two pieces, and<br />

then into four pieces, where are we to put these pieces? Presumably we must ensure<br />

that they are separated. If not, it will not be clear that we really have splinters left<br />

at the end. If the stick is cut into four, but the four pieces are then stored so closely<br />

together that they are not scattered any more, then we will not have four scattered<br />

objects after two rounds of cutting. By extension, unless we separate the pieces sufficiently<br />

after each round (or at least after sufficiently many of them) then even in a<br />

world where matter is composed of point particles, it is not clear that there will be<br />

infinitely many chips left at the end. Note in this connection that there are limits<br />

as to how far we can separate the objects. In a world where super-cutting produces<br />

chips, we could not, from left to right, put a one inch gap between each chip and any<br />

other, since there are aleph one chips and not aleph one inches of space along any



vector. Nor is it even clear what kind of spacing will do the trick: how are we to keep<br />

aleph one chips separated from each other? What we need is a formal model showing<br />

how super-cutting is to be performed. Only then can we answer with any precision<br />

what would happen were super-cutting to be performed on gunk.<br />

Assume, for simplicity, that we have a stick that is exactly one inch long. At the<br />

first stage, cut the stick into two 1/2 inch long pieces, move the left-hand one 1/4<br />

inch leftwards and the right hand one 1/4 inch rightwards. This can be accomplished<br />

in 1/2 second without moving the objects at a speed of faster than 1 inch per second,<br />

or accelerating or decelerating the objects at a rate higher than 4 inches per second per<br />

second. 5 At the second stage, cut each piece into two, and move each of the left-hand<br />

pieces 1/16 of an inch leftwards, and each of the right-hand pieces 1/16 of an inch<br />

rightwards. So if the original piece occupied the interval [0, 1) on a particular axis,<br />

the four pieces will now occupy the intervals: [-5/16, -1/16), [1/16, 5/16), [11/16,<br />

15/16), [17/16, 21/16). (The reason we are using these half-open intervals is to avoid<br />

questions about whether the objects that are separated by the cut used to overlap.)<br />

This cutting and moving can be accomplished in 1/4 of a second, without any piece<br />

attaining a velocity higher than 1/2 inch per second, or an acceleration higher than 4<br />

inches per second per second.<br />

The third stage of the cutting is to take each of these four pieces, cut them in<br />

two, move the left-hand part of each of the four 1/64 of an inch to the left, and the<br />

right-hand part 1/64 of an inch to the right. So the eight pieces now occupy the intervals:<br />

[-21/64, -13/64), [-11/64, -3/64), [3/64, 11/64), [13/64, 21/64), [43/64, 51/64),<br />

[53/64, 61/64), [67/64, 75/64), [77/64, 85/64). Again, this cutting and moving can<br />

be accomplished within 1/8 of a second, without any piece attaining a velocity higher<br />

than 1/4 inch per second, or an acceleration higher than 4 inches per second per second.6<br />
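The intervals listed for the first three stages can be verified with exact rational arithmetic; the following sketch (with function names of our own choosing) implements the cut-and-shift rule just described:<br />

```python
from fractions import Fraction as F

def supercut_stage(intervals, stage):
    """Cut each half-open interval [a, b) in two at its midpoint and
    shift the halves apart by 1/2**(2*stage) inch: 1/4 at stage 1,
    1/16 at stage 2, 1/64 at stage 3, and so on."""
    shift = F(1, 2 ** (2 * stage))
    out = []
    for a, b in intervals:
        mid = (a + b) / 2
        out.append((a - shift, mid - shift))  # left half moves leftward
        out.append((mid + shift, b + shift))  # right half moves rightward
    return out

pieces = [(F(0), F(1))]  # the one-inch stick occupies [0, 1)
for stage in (1, 2, 3):
    pieces = supercut_stage(pieces, stage)

# The first two of the eight intervals listed for the third stage:
print(pieces[0])  # (Fraction(-21, 64), Fraction(-13, 64))
print(pieces[1])  # (Fraction(-11, 64), Fraction(-3, 64))
```

Running further stages of the same rule gives the later configurations discussed below.<br />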

In general, at stage n, we take the 2^(n-1) pieces, divide each of them in two, move<br />

the left-hand piece 1/2^(2n) inches leftward, and the right-hand piece 1/2^(2n) inches rightward.<br />

This can all be done in 1/2^n seconds without any piece attaining a velocity<br />

higher than 1/2^(n-1) inches per second, or an acceleration higher than 4 inches per second<br />

per second. So the whole super-cut can be performed in 1 second: the first stage<br />

in 1/2 second, the second stage in 1/4 second, the third stage in 1/8 second, and so<br />

on. Note, moreover, that the whole super-cut can be performed in a second without<br />

the pieces ever moving at any huge velocity. If readers doubted the possibility of<br />

super-cutting because they believed it to be a necessary truth that no matter travels<br />

at or beyond the speed of light, their doubts were misplaced: no piece of matter in<br />

the super-cutting process approaches a superluminous velocity.<br />

5 The idea is that in the first quarter second we accelerate the object at 4 inches per second per second.<br />

This will raise its velocity to 1 inch per second, and move the object 1/8 of an inch. In the second quarter<br />

second we decelerate it at 4 inches per second per second, so its velocity ends up at zero, and it ends up<br />

having moved 1/4 of an inch.<br />

6 Note that, interestingly, if we moved the pieces 1/2 inch after the first round, 1/4 inch after the second<br />

round, 1/8 inch after the third round and so on then at the limit, each left and right edge that was once<br />

attached will have moved back together again. The process we have chosen preserves separation in a way<br />

that the aforementioned process does not.



Further, in this kind of procedure, a quantity of matter that is scattered during<br />

the super-cutting process remains scattered during the process. To see this, first consider<br />

a particular example. We noted above that at the second stage there were pieces<br />

occupying the intervals [-5/16, -1/16) and [1/16, 5/16). Before this, the point 0 had<br />

been occupied; at this stage a gap of 1/8 inch around 0 had been opened. This gap<br />

keeps narrowing at each subsequent stage. After the third stage there were pieces occupying the<br />

intervals [-11/64, -3/64), [3/64, 11/64), so the gap is now only 3/32 inch. After the<br />

fourth stage, there will be pieces at [-27/256, -11/256), [11/256, 27/256), so the gap<br />

is now only 11/128 inch. This process will make the gap ever smaller, but will not<br />

lead to its closure. As the process continues, the size of the gap will approach 1/12<br />

of an inch, but never cross that boundary. To see this, note that the size of the gap<br />

in inches after stage n (n ≥ 3) is 1/8 - 1/2^5 - 1/2^7 - . . . - 1/2^(2n-1). The sum of the series<br />

1/2^5 + 1/2^7 + . . . is 1/24. Hence the gap at stage n is greater than 1/8 - 1/24 = 1/12.<br />

So once the pieces around 0 have been separated, they will never be rejoined.<br />
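The series argument can be checked numerically; here is a sketch in our own notation, not part of the original argument:<br />

```python
from fractions import Fraction as F

gap = F(1, 8)                          # width of the gap around 0 after stage 2
for stage in range(3, 50):
    gap -= F(1, 2 ** (2 * stage - 1))  # stage n narrows the gap by 1/2^(2n-1)
    assert gap > F(1, 12)              # but it never shrinks to 1/12 of an inch

print(float(gap))  # just above 1/12, about 0.0833
```

Exact rationals matter here: floating-point rounding could obscure the fact that the gap stays strictly above 1/12.<br />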

This result applies generally to all of the separated pieces in the super-cut. Once a<br />

gap is created, parts of pieces from either side of the gap are moved ever closer to the<br />

centre of the gap at every subsequent stage of the super-cut. But since we decrease the<br />

distance moved by each piece at each stage of the cut, and in particular decrease it by<br />

a factor greater than 2, the pieces, once disjointed, will never be united.<br />

How is the matter arranged at the end of the super-cut? To answer this question<br />

we need to assume that motion is continuous. For each part of the object we can<br />

calculate its position function, the function from the length of time the super-cut has<br />

been in progress to the position of the part. At least, we can calculate this for all times<br />

until the end of the super-cut. With the continuity assumption in place we can infer<br />

that its position at the end of the cut is the limiting value of its position function. So<br />

we make this assumption.<br />

We assumed above that there is a Cartesian axis running along the object; say that<br />

a part a covers a point x just in case a occupies some region [y, z), and y ≤ x and z > x.<br />

When we say a occupies [y, z), we do not mean to imply it occupies only that region,<br />

just that it occupies at least that region. Assume then that a part a occupies a point x<br />

(0 ≤ x < 1), and that the binary representation of x is 0.x1x2 . . . xn . . . , where each<br />

xi equals 0 or 1, and for all i, there exists a j > i such that xj equals zero.7 If x1 = 1,<br />

then x ≥ 1/2, so some part of a, a small part that originally covered x, will be<br />

moved rightward at the first stage. It is possible that a itself may be split by the cut,<br />

but there will be a small part around x that is not split, and it will move rightward. If<br />

x1 = 0, then x < 1/2, so some part of a, a small part that originally covered x, will be<br />

moved leftward at the first stage. Indeed, in general some part of a, a small part that<br />

originally covered x, will be moved rightward at the n’th stage if xn = 1, and some<br />

part of a, a small part that originally covered x, will be moved leftward at the n’th<br />

stage if xn = 0.<br />

7 The final condition is important to rule out numbers having two possible representations. For example,<br />

we have to choose whether the representation of 1/2 should be 0.1000. . . or 0.0111. . . , and we,<br />

somewhat arbitrarily, choose the former.



Using the fact that a part gets moved 2^(-2n) inches at stage n, we can infer that after<br />

n stages, a small part that originally covered x and has not been split by the cuts will<br />

cover the following point after n cuts.<br />

x + (−1)^(x1+1)/4 + (−1)^(x2+1)/16 + ··· + (−1)^(xn+1)/4^n<br />

Assuming continuity of motion, we can assume that a will end up with a part that<br />

eventually covers the following point, which we will call f (x).<br />

f(x) = x + Σ_{i=1}^∞ (−1)^(x_i+1)/2^(2i)<br />

From this, it follows immediately that for all x in [0, 1), f (x) will end up being occupied.<br />

It turns out that these are the only points that are occupied at the end of the<br />

super-cut.<br />
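As a sanity check on this definition, f can be computed exactly for points whose binary expansion terminates (such expansions automatically satisfy the infinitely-many-zeros condition of footnote 7). The sketch below is ours, not the paper’s; the function name final_position and the digit-list representation of x are our own conventions.<br />

```python
from fractions import Fraction

def final_position(digits):
    """f(x) for x = 0.d1 d2 ... dn in binary, with every later digit 0
    (such expansions satisfy the infinitely-many-zeros condition of
    footnote 7).

    f(x) = x + sum over i of (-1)^(x_i + 1) / 2^(2i): the part moves
    right 2^(-2i) at stage i when x_i = 1, left when x_i = 0.  The
    all-zero tail beyond the n listed digits contributes the geometric
    sum -(4^(-(n+1)) + 4^(-(n+2)) + ...) = -1 / (3 * 4^n).
    """
    x = sum(Fraction(d, 2 ** i) for i, d in enumerate(digits, start=1))
    shift = sum(Fraction((-1) ** (d + 1), 4 ** i)
                for i, d in enumerate(digits, start=1))
    return x + shift - Fraction(1, 3 * 4 ** len(digits))

# The leftmost part moves left at every stage: f(0) = -1/3.
print(final_position([]))   # -1/3
# The part at 1/2 moves right once, then left forever: f(1/2) = 2/3.
print(final_position([1]))  # 2/3
```

Exact rationals (Fraction) are used so that the boundary values of the gaps discussed next come out exactly, rather than as floating-point approximations.<br />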

Assume that a point y is occupied at the end of the super-cut. We will construct<br />

a number c such that y = f (c). Recall that we noted above that whenever two pieces<br />

were separated, a gap was created between them that would never be completely<br />

filled. While parts of the stick would move closer and closer to the centre of that<br />

gap during the super-cut, the middle two-thirds of the gap would never be reoccupied.<br />

We will say that such an interval, which is never reoccupied, is liberated. The interval<br />

[1/3, 2/3) is liberated at the first stage, the intervals [−1/24, 1/24) and [23/24, 25/24) are liberated at the second stage, the intervals [−37/192, −35/192), [35/192, 37/192), [155/192, 157/192) and [227/192, 229/192) are liberated at the third stage, and so on.<br />

If y is occupied, then y is not in any liberated interval. Therefore it must be either to the left or to the right of each interval that is liberated.<br />
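The liberated intervals just listed can be checked by brute force over all short terminating expansions. This is our illustration, not part of the paper; final_position restates f from the text for terminating expansions, and the bound of ten digits is an arbitrary choice.<br />

```python
from fractions import Fraction
from itertools import product

def final_position(digits):
    # f(x) for x = 0.d1...dn in binary, all later digits 0, as in the
    # text: f(x) = x + sum over i of (-1)^(x_i + 1) / 2^(2i); the
    # all-zero tail beyond the listed digits contributes -1/(3 * 4^n).
    x = sum(Fraction(d, 2 ** i) for i, d in enumerate(digits, start=1))
    shift = sum(Fraction((-1) ** (d + 1), 4 ** i)
                for i, d in enumerate(digits, start=1))
    return x + shift - Fraction(1, 3 * 4 ** len(digits))

gaps = [(Fraction(1, 3), Fraction(2, 3)),     # liberated at stage one
        (Fraction(-1, 24), Fraction(1, 24)),  # liberated at stage two
        (Fraction(23, 24), Fraction(25, 24))] # liberated at stage two

# No point with a terminating expansion of ten or fewer digits lands
# inside any of these half-open gaps...
for n in range(11):
    for bits in product([0, 1], repeat=n):
        y = final_position(list(bits))
        assert not any(lo <= y < hi for lo, hi in gaps)

# ...but each gap's right endpoint is occupied, as the half-open
# intervals require: f(1/2) = 2/3, f(1/4) = 1/24, f(3/4) = 25/24.
assert final_position([1]) == Fraction(2, 3)
assert final_position([0, 1]) == Fraction(1, 24)
assert final_position([1, 1]) == Fraction(25, 24)
```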

Let c_1 equal 0 if y is to the left of the first liberated interval, [1/3, 2/3), and 1 otherwise. Given the value of c_1, it is already determined which side y is of one of the intervals liberated at the second stage. If y is to the left of [1/3, 2/3), for example, then it is to the left of [23/24, 25/24). But the value of c_1 does not determine which side y is of the other interval. Let c_2 equal 0 if y is to the left of that interval, and 1 otherwise. The values of c_1 and c_2 determine which side y is of three of the four intervals liberated at the third stage, but leave open which side it is of one of these four. Let c_3 equal 0 if y is to the left of that interval, 1 otherwise. If we repeat this<br />

procedure for all stages, we will get values of c_i for all i. Let c be the number whose binary expansion is 0.c_1 c_2 . . . c_n . . . . It follows that y = f(c). The reason is that once it is determined which side y is of each of the liberated intervals, y has been determined to fall in an interval that is exactly one point wide; f(c) is in that interval, so f(c) must equal y. So y is occupied iff for some x, y = f(x). Say S = {y : ∃x (y = f(x))}; the conclusion is that all and only the points in S are occupied.<br />
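The digit-by-digit recovery of c can also be made concrete. In this sketch (again ours; final_position restates f, and recover_digits is a hypothetical helper name), the comparison y ≥ f(k), with k = 0.c_1 . . . c_{j−1} 1, stands in for asking which side y is of the relevant liberated interval: every u ≥ k has f(u) ≥ f(k), while every u < k has f(u) < f(k).<br />

```python
from fractions import Fraction

def final_position(digits):
    # f(x) for x = 0.d1...dn in binary, all later digits 0, as in the text.
    x = sum(Fraction(d, 2 ** i) for i, d in enumerate(digits, start=1))
    shift = sum(Fraction((-1) ** (d + 1), 4 ** i)
                for i, d in enumerate(digits, start=1))
    return x + shift - Fraction(1, 3 * 4 ** len(digits))

def recover_digits(y, n):
    """First n binary digits of the c with y = f(c).

    At step j we set k = 0.c_1 ... c_{j-1} 1.  Since f is monotone
    increasing, every u >= k has f(u) >= f(k) and every u < k has
    f(u) < f(k), so comparing y with f(k) settles the digit c_j.
    """
    digits = []
    for _ in range(n):
        k = digits + [1]
        digits.append(1 if y >= final_position(k) else 0)
    return digits

y = final_position([1, 0, 1])
print(recover_digits(y, 3))  # [1, 0, 1]
```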

Could a piece of gunk occupy the points in S? Not given the assumptions we have<br />

made so far. S has two properties that might not seem consistent at first glance. It is<br />

dense in the sense that for any point y in S, and any distance δ, there is another point<br />

z in S such that |y - z| < δ. But it is disconnected in the sense that for any two points<br />




y and z in S, there is an extended region r between y and z that is wholly unoccupied.<br />

The proofs of density and disconnectedness are given in the appendix.<br />

Given (3), disconnectedness is inconsistent with gunk occupying S. If a material<br />

object occupies S, it must occupy the points in S. Let y be any one of these points.<br />

By (3), the object must occupy some extended region containing y, say [y_1, y_2); since all occupied points lie in S, [y_1, y_2) ⊆ S. Two cases to consider. First case: y_1 < y. Since [y_1, y_2) ⊆ S, y_1 and y are in S, and so are all<br />

the points in between them. Since the object occupies S, it follows that these points<br />

are occupied. Hence there is no extended region between y 1 and y that is wholly<br />

unoccupied, which is inconsistent with disconnectedness. Second case: y_1 = y. Again, [y_1, y_2) ⊆ S, and since this interval is non-empty, y_2 > y_1. Hence (y_1 + y_2)/2 is greater than y_1, and all the points between it and y_1 are occupied. This is also inconsistent<br />

with disconnectedness. So given (3), no material object could occupy S.<br />

In summary, (1) through (4) plus continuity of motion cannot be true together.<br />

From (1), (2), and (4), we inferred that our super-cutting process was possible, and<br />

that it would not destroy any quantity of matter (though of course it would scatter<br />

it). Assuming continuity of motion, we calculated which points would be occupied<br />

after the super-cut. By (3) we concluded that no piece of gunk could occupy those<br />

points, or indeed any subset of them, yielding an inconsistent result. Suppose that<br />

the continuity of motion thesis is dropped. We can then maintain (1) to (4) with<br />

consistency. One should note, however, that a world where (1) to (4) hold would be<br />

a strange world indeed: if super-cutting is performed, the pieces of gunk would have<br />

to jump location at the limit. The gunk cannot occupy S: but in order to occupy a<br />

different set of points, various quantities of matter would have to jump position at<br />

the limit.<br />

If one believes in gunk one has a choice: Abandon one or more of (2) to (4)<br />

or else deny that it is nomologically necessary that motion be continuous. Which<br />

assumption should be dropped? We leave it to the gunk lover to select the most<br />

tolerable package. The choice for the gunk lover is a little unenviable. Those who<br />

are attracted to the view that the actual world is gunky are very much wedded to (1)<br />

and (3). When philosophers take seriously the idea that matter has parts all the way down, 8 they do not imagine conjoining that thesis with point-sized parts, or else immaterial parts, 9 or else quantities of matter that are as thin as a plane, and so on.<br />

With a commitment to (1) and (3) in place, super-cutting will be loaded with physical<br />

significance. Accept that the laws of nature permit super-cutting and one will be committed to denying either the conservation of matter or the continuity of motion.<br />

Appendix<br />

To prove density, note that if y is occupied, there is a point x with binary representation 0.x_1 x_2 . . . such that y = f(x). For any positive δ, there is an integer n such that δ > 2^(−n). Let v be the number represented by 0.x_1 x_2 . . . x_n x′_{n+1} x_{n+2} x_{n+3} . . . , where x′_{n+1} = 1 if x_{n+1} = 0, and x′_{n+1} = 0 otherwise. The difference between f(x) and f(v)<br />

8 See, for example, Zimmerman (1996).<br />

9 Leibniz, with his monads, is an exception of course. No contemporary gunk lover wants a monadology, however.



will be exactly 2^(−(n+1)) + 2^(−(2n+1)): flipping the (n+1)’th digit changes x by 2^(−(n+1)) and changes the total displacement by 2 · 2^(−(2n+2)) = 2^(−(2n+1)), in the same direction. Since f(v) is occupied, and y = f(x), there is an occupied point less than 2^(−n) inches from y; since n may be chosen so that 2^(−n) < δ, there is a point less than δ inches from y, as required.<br />

To prove disconnectedness, let y and z be any two distinct occupied points. So for some distinct v, x, y = f(x) and z = f(v). Say that the binary representation of x is 0.x_1 x_2 . . . , and the binary representation of v is 0.v_1 v_2 . . . . Let j be the lowest number such that x_j ≠ v_j. (Since x and v are distinct, there must be at least one such j.) Without loss of generality, assume that x_j = 0 and v_j = 1. (There is no loss of generality because we are just trying to show that between any two occupied points there is a gap, so it does not matter which of the two points is the rightward one.) Let k be the number with binary representation 0.x_1 x_2 . . . x_{j−1} 1, and let l_2 be f(k). Finally, define l_1 by the following equation:<br />

l_1 = k + Σ_{i=1}^{j} (−1)^(x_i+1)/2^(2i) + Σ_{i=j+1}^∞ 1/2^(2i)<br />

It is easy enough to see that f(x), that is y, must be less than l_1. For l_1 is the value that f(x) would take were every digit in the binary expansion of x after the j’th equal to 1. But by definition there must be some value j′ > j such that x_{j′} = 0. From this it follows that:<br />

Σ_{i=j+1}^∞ 1/2^(2i) > Σ_{i=j+1}^∞ (−1)^(x_i+1)/2^(2i)<br />

And from that it follows that l_1 > f(x). Indeed, by similar reasoning, it follows that for all u < k, f(u) < l_1. Since f is monotone increasing, it also follows that for all u ≥ k, f(u) ≥ l_2. And from those facts, it follows that there does not exist a u such that f(u) ∈ [l_1, l_2). And since y < l_1 < l_2 ≤ z, this implies that there is an extended unoccupied region between y and z, as required.<br />



Intrinsic and Extrinsic Properties<br />

I have some of my properties purely in virtue of the way I am. (My<br />

mass is an example.) I have other properties in virtue of the way I interact<br />

with the world. (My weight is an example.) The former are the<br />

intrinsic properties, the latter are the extrinsic properties. This seems<br />

to be an intuitive enough distinction to grasp, and hence the intuitive<br />

distinction has made its way into many discussions in ethics, philosophy<br />

of mind, metaphysics and even epistemology. Unfortunately, when we<br />

look more closely at the intuitive distinction, we find reason to suspect<br />

that it conflates a few related distinctions, and that each of these distinctions<br />

is somewhat resistant to analysis.<br />

1 Introduction<br />

The standard way to introduce the distinction between intrinsic and extrinsic properties<br />

is by the use of a few platitudes. Stephen Yablo provides perhaps the most<br />

succinct version: “You know what an intrinsic property is: it’s a property that a<br />

thing has (or lacks) regardless of what may be going on outside of itself.” (1999, 479).<br />

David Lewis provides a more comprehensive list of platitudes.<br />

A sentence or statement or proposition that ascribes intrinsic properties<br />

to something is entirely about that thing; whereas an ascription of extrinsic<br />

properties to something is not entirely about that thing, though<br />

it may well be about some larger whole which includes that thing as part.<br />

A thing has its intrinsic properties in virtue of the way that thing itself,<br />

and nothing else, is. Not so for extrinsic properties, though a thing may<br />

well have these in virtue of the way some larger whole is. The intrinsic<br />

properties of something depend only on that thing; whereas the extrinsic<br />

properties of something may depend, wholly or partly, on something<br />

else. If something has an intrinsic property, then so does any perfect duplicate<br />

of that thing; whereas duplicates situated in different surroundings<br />

will differ in their extrinsic properties. (Lewis, 1983a, 111-112 in<br />

reprint)<br />

As we shall see, the last claim Lewis makes (that duplicates never differ with respect<br />

to intrinsic properties) is somewhat controversial. The other way to introduce the<br />

subject matter is by providing examples of paradigmatic intrinsic and extrinsic properties.<br />

One half of this task is easy: everyone agrees that being an uncle is extrinsic, as is being six metres from a rhododendron. The problem with using this method<br />

to introduce the distinction is that there is much less agreement about which properties<br />

are intrinsic. Lewis has in several places (1983a; 1986b; 1988c) insisted that shape<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Stanford<br />

Encyclopedia of Philosophy.


Intrinsic and Extrinsic Properties 607<br />

properties are intrinsic, but one could hold that an object’s shape depends on the<br />

curvature of the space in which it is embedded, and this might not even be intrinsic<br />

to that space (Nerlich, 1979), let alone the object. Lewis also mentions charge and<br />

internal structure as being examples of intrinsic properties.<br />

1.1 Philosophical Importance<br />

The distinction between intrinsic and extrinsic properties plays an essential role in<br />

stating several interesting philosophical problems. Historically, the most prominent<br />

of these has to do with notions of intrinsic value. G. E. Moore (1903, §18) noted<br />

that we can make a distinction between things that are good in themselves, or possess<br />

intrinsic value, and those that are good as a means to other things. To this day there<br />

is still much debate over whether this distinction can be sustained (Feldman, 1998; Kagan, 1998), and, if it can, which kinds of things possess intrinsic value (Krebs, 1999).<br />

In particular, one of the central topics in contemporary environmental ethics is the<br />

question of which kinds of things (intelligent beings, conscious beings, living things,<br />

species, etc) might have intrinsic value. While this is the oldest (and still most common)<br />

use of the intrinsic/extrinsic distinction in philosophy, it has not played much<br />

role in the discussions of the distinction in metaphysics, to which we now turn.<br />

As P. T. Geach (1969) noted, the fact that some object a is not F before an event<br />

occurs but is F after that event occurs does not mean that the event constitutes, in<br />

any deep sense, a change in a. To use a well-worn example, at the time of Socrates’s<br />

death Xanthippe became a widow; that is, she was not a widow before the event of<br />

her husband’s death, but she was a widow when it ended. Still, though that event<br />

constituted (or perhaps was constituted by) a change in Socrates, it did not in itself<br />

constitute a change in Xanthippe. Geach noted that we can distinguish real<br />

changes, such as what occurs in Socrates when he dies, from mere changes in which<br />

predicates one satisfies, such as occurs in Xanthippe when Socrates dies. The latter he<br />

termed ‘mere Cambridge’ change. There is something of a consensus that an object<br />

undergoes real change in an event iff there is some intrinsic property it satisfied<br />

before the event but not afterwards.<br />

David Lewis (1986b, 1988c) built on this point of Geach’s to mount an attack<br />

on endurantism, the theory that objects persist by being wholly located at different<br />

times, and that there can be strict identity between an object existing at one time<br />

and one existing at another time. Lewis argues that this is inconsistent with the idea<br />

that objects undergo real change. If the very same object can be both F (at one time)<br />

and not F (at another), this means that F-ness must be a relation to a time, but this<br />

means that it is not an intrinsic property. So any property that an object can change<br />

must be extrinsic, so nothing undergoes real change. Lewis says that this argument<br />

supports the rival theory of perdurantism, which says that objects persist by having<br />

different temporal parts at different times. While this argument is controversial (see<br />

Haslanger (1989), Johnston (1987) and Lowe (1988) for some responses), it does show<br />

how considerations about intrinsicness can resonate within quite different areas of<br />

metaphysics.<br />

The other major area where the concept of intrinsicness has been put to work is<br />

in stating various supervenience theses. Frank Jackson (1998) defines physicalism in



terms of duplication and physical duplication, which are in turn defined in terms of<br />

intrinsic properties. This definition builds upon a similar definition offered by Lewis<br />

(1983c). Similarly, Jaegwon Kim (1982) defines a mind/body supervenience thesis in<br />

terms of intrinsic properties. As Theodore Sider (1993) notes, the simplest way to<br />

define the individualist theory of mental content that Tyler Burge (1979) attacks is<br />

as the claim that the content of a thinker’s propositional attitudes supervenes on the<br />

intrinsic properties of the thinker. And many internalist theories in epistemology<br />

are based around the intuition that whether a thinker is justified in believing some<br />

proposition supervenes on the intrinsic properties of the thinker.<br />

Most of the philosophical applications are independent of the precise analysis<br />

of intrinsicness, but work on this analysis has helped clarify debates in two ways.<br />

First, the distinction between the two notions of intrinsicness, discussed in section<br />

2 below, helps clarify a number of these debates. More concretely, Theodore Sider’s<br />

(2003) observation that most of the properties in folk theory are ‘maximal’ and hence<br />

not intrinsic provides a strong argument against various theories that appeal to the<br />

intuitive intrinsicness of some everyday property.<br />

Though these are the most prominent uses of the intrinsic/extrinsic distinction<br />

in philosophy, they by no means exhaust its uses. Many applications of the distinction<br />

are cited by I. L. Humberstone (1996), including the following. George<br />

Schlesinger (1990) uses the distinction between intrinsic and extrinsic properties to<br />

state a non-trivial version of Mill’s principle of the uniformity of nature, though<br />

Schlesinger gives his distinction a different name. Wlodzimierz Rabinowicz (1979)<br />

uses the distinction to formulate principles of universalizability for moral principles<br />

and natural laws. And E. J. Khamara (1988) uses a distinction between relational and<br />

non-relational properties to state a non-trivial version of the principle of Identity of<br />

Indiscernibles.<br />

1.2 Global and Local<br />

Whether a property is intrinsic, and whether some individual that has that property<br />

has it intrinsically, are different issues. The property being square or married is no<br />

doubt an extrinsic property; but it is a property that is had intrinsically by all squares<br />

(assuming being square is intrinsic). Once we have these two concepts, a ‘global’<br />

concept of intrinsicness of properties, and a ‘local’ concept of a particular object<br />

being intrinsically such as to possess some property, we might wonder how they<br />

are connected. (The names ‘global’ and ‘local’ are taken from Humberstone (1996)).<br />

In particular, we might wonder which of these should be primary in an analysis of<br />

intrinsicness.<br />

At first glance, the principles (GTL) and (LTG) seem to connect the two concepts.<br />

(GTL) If F is a (globally) intrinsic property, and a is F, then a is intrinsically F<br />

(LTG) If every a that is F is intrinsically F, then F is a (globally) intrinsic property.<br />

(GTL) is undoubtedly true, but (LTG) is more problematic. If the quantifier in it is<br />

actualist (i.e. only ranges over actual objects), then it is clearly false. Let F be the<br />

property being square or being inside a golden mountain. Since no actual object is inside a golden mountain, every actual F is a square, and hence intrinsically F, even though F is clearly not an intrinsic property. Even if the quantifier is



possibilist, it is not clear that (LTG) should be true. For a problematic example, let<br />

F be the property being square or being such that the number 21 does not exist. Every<br />

possible object that is F is square, and hence intrinsically F, but it is not clear that F<br />

is an intrinsic property. This question (like a few others we will discuss below) turns<br />

on the metaphysics of properties. If two properties that are necessarily coextensive<br />

are identical (as Lewis believes), or are guaranteed to be alike in whether they are<br />

intrinsic or extrinsic (as Sider 1993 argues), then F will be intrinsic. If properties can<br />

be individuated more finely than this, and if their intrinsicness or otherwise turns on<br />

this fine-grained individuation, then maybe F is not intrinsic. We will return to this<br />

issue in some of the discussions below.<br />

2 Notions of Intrinsicness<br />

Many different distinctions have been called the intrinsic/extrinsic distinction. As<br />

J. Michael Dunn (1990) notes, some authors have used ‘intrinsic’ and ‘extrinsic’ to<br />

mean ‘essential’ and ‘accidental’. Dunn is surely right in saying that this is a misuse<br />

of the terms. A more interesting distinction is noted by Brian Ellis (1991, discussed in Humberstone (1996)). Ellis suggests we should distinguish between properties that<br />

objects have independently of any outside forces acting on them (what we will call the<br />

Ellis-intrinsic properties), and those that they have in virtue of those outside forces<br />

(the Ellis-extrinsic properties). For many objects (such as, say, a stretched rubber<br />

band) their shape will be dependent on the outside forces acting on them, so their<br />

shape will be Ellis-extrinsic. If one is committed to the idea that shapes are intrinsic,<br />

one should think this means that the distinction between the Ellis-intrinsic and<br />

Ellis-extrinsic properties is not the same as the intrinsic/extrinsic distinction. Such a<br />

judgement may seem a little hasty, but in any case we will turn now to distinctions<br />

that have received more attention in the philosophical literature.<br />

2.1 Relational vs. Non-Relational Properties<br />

Many writers, especially in the literature on intrinsic value, use ‘relational’ for the<br />

opposite of intrinsic. This seems to be a mistake for two reasons. The first reason is<br />

that many properties seem to be both relational and intrinsic. For example, most<br />

people have the property having longer legs than arms, and indeed seem to have this<br />

property intrinsically, even though the property consists in a certain relation being<br />

satisfied. Maybe the property is not intrinsic if whether or not something is an arm<br />

or a leg is extrinsic, so perhaps this isn’t a conclusive example, but it seems troubling.<br />

As Humberstone notes, some might respond by suggesting that a relational property<br />

is one such that if an object has it, then it bears some relation to a distinct thing. But<br />

this won’t do either. Not being within a mile of a rhododendron is clearly relational, but does not consist in bearing a relation to any distinct individual, as we can see by the fact that a non-rhododendron all alone in a world can satisfy it.<br />

A larger problem is that it seems being intrinsic and being relational are properties<br />

of two very different kinds of things. Consider again the property F, being<br />

square or being such that the number 21 does not exist. Assuming (as we do for now)<br />

that we can make sense of the relational/non-relational distinction, F is a relational



property. But F is necessarily co-extensive with the property being square, which<br />

is surely non-relational. So two necessarily co-extensive properties can differ with<br />

respect to whether they are relational. We can put this point a few different ways.<br />

If any two properties that are necessarily co-extensive are identical, then being relational<br />

is not a property of properties, but a property of concepts, or in any case of some<br />

things individuated as finely as Fregean concepts. If we think that intrinsicness is not<br />

a property that makes such fine distinctions, then the relational/non-relational and<br />

intrinsic/extrinsic distinctions are quite different, for they are distinctions between<br />

different kinds of things.<br />

2.2 Qualitative vs. Non-Qualitative Properties<br />

As noted above, one of the platitudes Lewis lists when isolating the concept of intrinsicness<br />

is that duplicates never differ with respect to their intrinsic properties. Lewis<br />

holds a further principle that may not be obvious from the above quote: that any<br />

property with respect to which duplicates never differ is intrinsic. Of course, this is<br />

only true if the quantifiers in it are interpreted as possibilist. Otherwise the property<br />

having a greater mass than any man that has ever existed will be intrinsic, since it never<br />

differs between actual duplicates. We will assume from now on that all quantifiers are<br />

possibilist, unless otherwise noted. And, following Humberstone, we will say that<br />

the properties that do not differ between duplicates are the qualitative properties,<br />

which is not to say they are not also the intrinsic properties.<br />

Despite this two-way connection between intrinsicness and duplication, we do<br />

not yet have an analysis because the relevant concept of duplication can only be (easily)<br />

analysed in terms of intrinsic properties. In the next section we will look at the<br />

two ways Lewis has attempted to analyse that concept and hence break into the circle.<br />

But for now it is worth looking at some results that follow directly from the idea<br />

that intrinsic properties are those that do not differ between duplicates.<br />

First, as Humberstone (1996, 227) notes, if this is our definition of intrinsicness,<br />

then we can easily analyse local intrinsicness in terms of global intrinsicness. An<br />

object x is intrinsically P iff all its duplicates are P, that is, if all objects that have the<br />

same (global) intrinsic properties as it does are P. And having this concept of local<br />

intrinsicness is quite useful, because it lets us explain what is right about an intuitively<br />

attractive (though ultimately mistaken) claim about intrinsic properties. Let P be an<br />

intrinsic property and Q a property such that an object’s having Q is entailed by its<br />

having P. One might think in those circumstances that Q would be intrinsic, since<br />

its possession follows from a fact solely about the object in question, namely that it<br />

is P. This isn’t right in general; to see why let P be the property of being square and<br />

Q be the property being square or being inside a golden mountain. For some objects<br />

that are Q their Q-ness follows from facts solely about the object, but for others it<br />

follows from facts quite extrinsic to the object in question. But, Humberstone notes,<br />

something similar is true. If x possesses P intrinsically, and being P entails being Q,<br />

then x possesses Q intrinsically. This local concept of intrinsicness might also do<br />

philosophical work; presumably the intrinsic value of an object depends on which<br />

properties it intrinsically possesses, not on which intrinsic properties it possesses.



Secondly, it follows from the definition that necessarily co-extensive properties<br />

are alike in being intrinsic or not. In particular, any property that every possible<br />

object has, such as being such that the number 21 exists, will be an intrinsic property.<br />

Robert Francescotti (1999) takes this to be a decisive mark against Lewis’s theory,<br />

but others (e.g. Sider (1993), Weatherson (2001b)) have been willing to treat it as<br />

a philosophical discovery. It is crucial to the proof that Lewis’s theory entails that<br />

this property is intrinsic that the quantifiers in the theory are possibilist. If we let<br />

the quantifiers range over the right kind of impossibilities (such as the situations of<br />

Barwise and Perry (1983), or the impossible worlds of Nolan (1998)) then one can<br />

have duplicates that, for example, differ with respect to whether the number 21 exists.<br />

Since this approach has not been developed in any detail it is impossible to say at this<br />

time whether it would have serious untoward consequences.<br />

Thirdly, the duplication relation is transitive, so any duplicate of a duplicate of<br />

David Lewis is a duplicate of David Lewis. That means that being a duplicate of<br />

David Lewis is an intrinsic property of all those objects. While this might plausibly<br />

be a property that Lewis intrinsically possesses, it is somewhat surprising that it is<br />

intrinsic to all of his duplicates. Dunn (1990) reports that Lewis in conversation<br />

said that this property (being a duplicate of David Lewis) is equivalent to an infinite<br />

conjunction of intrinsic properties (the ones Lewis has) so it should turn out to be<br />

intrinsic.<br />

Conversely, assuming the metaphysics of counterpart theory, none of these duplicates<br />

of David Lewis is David Lewis himself, so the property being (identical with)<br />

David Lewis turns out to be extrinsic on this account. Even if we drop the counterpart<br />

theory, and allow that objects in different worlds might be strictly identical,<br />

still not all duplicates of Lewis will be identical with Lewis, so the property being<br />

(identical with) David Lewis will still not be intrinsic. It might seem rather odd that<br />

a property so internal to Lewis should not be intrinsic. Yablo (1999) notes that in<br />

some cases, such as this one, we can make an argument for identity properties not<br />

being intrinsic. If it is essential to David Lewis that he be descended from a particular<br />

zygote Z, then the fact that something is David Lewis entails that something else is<br />

a zygote, and any property whose possession entails the existence of other objects is<br />

usually held to be extrinsic. Still, Yablo argues, it is very plausible that the identity<br />

properties of some things (especially atoms) should be intrinsic.<br />

Finally, we can define relative notions of duplication, and hence relative notions<br />

of intrinsicness (Humberstone, 1996, 238). To give just one interesting example, say<br />

that a property is nomically intrinsic iff it never differs between duplicates in worlds<br />

with the same laws. Then many dispositional properties might turn out to be nomically<br />

intrinsic, capturing nicely the idea that they are in a sense internal to the objects<br />

that possess them, while their manifestation depends both on external facts, and on<br />

the laws being a certain way.<br />

2.3 Interior vs. Exterior Properties<br />

J. Michael Dunn (1990) suggests that odd consequences of Lewis’s theory are sufficient<br />

to look for a new concept of intrinsicness. He suggests that the notion of intrinsicness<br />

is governed by platitudes like the following. “Metaphysically, an intrinsic



property of an object is a property that the object has by virtue of itself, depending<br />

on no other thing. Epistemologically, an intrinsic property would be a property that<br />

one could determine by inspection of the object itself - in particular, for a physical<br />

object, one would not have to look outside its region of space-time” (1990, 178). As<br />

Dunn notes, the metaphysical definition here is the central one, the epistemological<br />

platitude is at best a heuristic. On this view, being identical to X will be intrinsic<br />

(except in cases where X has essential properties rooted outside itself), while being a<br />

duplicate of X will not be. Also, having X as a part will be intrinsic, though it is not<br />

on Lewis’s account.<br />

Dunn argues that the appropriate logic in which to formulate claims of intrinsicness, and to investigate their consequences, is the relevant logic R, and he provides some of the details of how this should be done. But until we see a specific formulation of the idea (comparable in specificity to the accounts of Peter Vallentyne and Stephen Yablo, discussed below in section 3.3), we cannot comment on its consequences. Still, there is an intuitive distinction here, and it clearly differs from the distinction Lewis discusses.

2.4 Which is the real distinction?

If we grasp the three distinctions discussed above, we might well ask which of them is the intrinsic/extrinsic distinction. It is possible that this question has no determinate answer. Humberstone suggests that we have three interesting distinctions here, each of which can do some philosophical work, and that there is not much interest in the issue of which of them is called the distinction between intrinsic and extrinsic properties. If we do decide to investigate this seriously, we should perhaps be prepared to be disappointed: there is no guarantee that there will be a fact of the matter about which distinction the words ‘intrinsic’ and ‘extrinsic’ latch onto.

Suppose we do give up on identifying the intrinsic/extrinsic distinction. Then, on pain of leaving some indeterminacy in our philosophical theories, we must reformulate the theories that are framed using this distinction, specifying which distinction should take the role of the intrinsic/extrinsic distinction in each case. Sider, in the course of defending the philosophical interest of the qualitative/non-qualitative distinction, makes a start on doing this. He notes that in the debates about supervenience, the distinction that is usually relevant is the qualitative/non-qualitative one. If we let being (identical to) X be an intrinsic property, then most of the supervenience theses discussed will be trivially true, because it will be impossible to have duplicates that are different objects, and hence impossible to have duplicates that differ with respect to the contents of their beliefs, or the justificatory status of their beliefs, or their phenomenal states, or whatever. But these theses are not trivially true; so if we are to formulate the theses this way, we had better not let identity properties be intrinsic in these contexts.

This, of course, does not show that the qualitative/non-qualitative distinction is the only one that can do philosophical work. Indeed, when trying to grasp what real change amounts to, it seems to be the interior/exterior distinction that is relevant. Say that a has b as a part, and consider the event whereby b is replaced in a by c, which happens to be a duplicate of b. This event seems to constitute a real change in a, not merely a Cambridge change, but it does not constitute a change in qualitative properties.

3 Attempts at Analysis

We will first look at two attempts to analyse the qualitative/non-qualitative distinction, and then at two more ambitious projects that aim to capture intrinsicness in all of its facets.

3.1 Combinatorial Theories

As Yablo noted, if an object has a property intrinsically, then it has it independently of the way the rest of the world is. The rest of the world could disappear, and the object might still have that property. Hence a lonely object, an object that has no wholly distinct worldmates, could have the property. Note that in the sense relevant here, two objects are ‘wholly distinct’ only if they have no parts in common, not merely if they are non-identical. The idea is that a lonely object could have proper parts. This is good, since having six proper parts is presumably an intrinsic property. Many extrinsic properties could not be possessed by lonely objects: no lonely object is six metres from a rhododendron, for example.

This suggests an analysis of intrinsicness: F is an intrinsic property iff it is possible for a lonely object to be F. This analysis is usually attributed to Kim (1982) (e.g. in Lewis (1983a) and Sider (1993)), though Humberstone (1996) dissents from this interpretation. Both directions of the biconditional can be challenged.
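As a rough schematic rendering (our notation, not Kim’s own formulation), the analysis can be put in quantified modal logic, writing Lonely(x) for ‘x has no wholly distinct worldmates’:

```latex
% Schematic statement of the simple combinatorial analysis:
% F is intrinsic iff F could be instantiated by a lonely object.
\mathrm{Intrinsic}(F) \;\leftrightarrow\; \Diamond\,\exists x\,\bigl(\mathrm{Lonely}(x) \wedge Fx\bigr)
```

The challenges that follow target each direction of this biconditional in turn.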

Some objects change in mass over time: this is presumably an intrinsic property of those objects. If necessitarian theories of laws are true (as endorsed by Ellis (2001) and Shoemaker (1984)), then there could not be a world containing just such an object, as the conservation of matter would be violated. If any kind of combinatorial analysis of intrinsicness is to work, we have to assume something like Hume’s dictum that there are no necessary connections between distinct existences. Indeed, all combinatorial theories of intrinsicness do assume this, and further that the range of what is possible can be taken as given in crafting a theory of intrinsicness. This might be thought problematic, since the best way to formally spell out Hume’s dictum itself appeals to the concept of intrinsicness (Lewis, 1986b, 87-91).

The analysis is only viable as an analysis of the qualitative/non-qualitative distinction, since it rules that being a duplicate of David Lewis is an intrinsic property. This feature, too, is shared by all combinatorial theories of intrinsicness.

The major problem with this analysis is that the ‘if’ direction of the biconditional is clearly false. As Lewis (1983a) pointed out, the property being lonely is had by some possible lonely objects, but it is not intrinsic.

Rae Langton and David Lewis (1998) designed a theory to meet this objection. Their theory resembles, in crucial respects, the theory sketched in an appendix to Dean Zimmerman’s paper “Immanent Causation” (Zimmerman, 1997); the two theories were developed entirely independently. We will focus on Langton and Lewis’s version here, because it is more substantially developed and more widely discussed in the literature. On their theory, a property F is independent of accompaniment iff the following four conditions are met:

(a) There exists a lonely F
(b) There exists a lonely non-F
(c) There exists an accompanied (i.e. not lonely) F
(d) There exists an accompanied non-F

Langton and Lewis’s idea is that if F is intrinsic, then whether or not an object is F should not depend on whether or not it is lonely. So all four of these cases should be possible. Still, some extrinsic properties satisfy all four conditions. Consider, for instance, the property being lonely and round or accompanied and cubical. A lonely sphere suffices for (a), a lonely cube for (b), an actual cube for (c) and an actual sphere for (d). So they have to rule out this property. They do so by the following five-step process.

First, Langton and Lewis identify a class of privileged natural (or non-disjunctive) properties. Lewis (1983c) had argued that we need to recognise a distinction between natural and non-natural properties to make sense of many debates in metaphysics, philosophy of science, philosophy of language and philosophy of mind, and suggested a few ways we might draw the distinction. We might take the natural properties to be those that correspond to real universals, or those that appear in the canonical formulations of best physics or regimented common sense, or even take the distinction to be primitive. Langton and Lewis say that it should not matter how we draw the distinction for present purposes, as long as we have it, and properties like being lonely and round or accompanied and cubical are not natural.

Secondly, they say properties are disjunctive iff they “can be expressed by a disjunction of (conjunctions of) natural properties; but are not themselves natural properties.” Thirdly, they say a property is basic intrinsic iff it is non-disjunctive and satisfies (a) through (d). Fourthly, they say two (possible) objects are duplicates iff they share the same basic intrinsic properties. Finally, they say F is an intrinsic property iff two duplicates never differ with respect to it.
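The definitional chain can be summarised schematically (our reconstruction in modal notation, not Langton and Lewis’s own formulation):

```latex
% Independence of accompaniment: conditions (a)-(d) are all possible.
\mathrm{Indep}(F) \leftrightarrow\;
    \Diamond\exists x(\mathrm{Lonely}(x) \wedge Fx) \,\wedge\,
    \Diamond\exists x(\mathrm{Lonely}(x) \wedge \neg Fx) \,\wedge\,
    \Diamond\exists x(\neg\mathrm{Lonely}(x) \wedge Fx) \,\wedge\,
    \Diamond\exists x(\neg\mathrm{Lonely}(x) \wedge \neg Fx)

% Basic intrinsic: non-disjunctive and independent of accompaniment.
\mathrm{Basic}(F) \leftrightarrow \neg\mathrm{Disjunctive}(F) \wedge \mathrm{Indep}(F)

% Duplication, then intrinsicness.
\mathrm{Dup}(x,y) \leftrightarrow \forall F\,\bigl(\mathrm{Basic}(F) \to (Fx \leftrightarrow Fy)\bigr)

\mathrm{Intrinsic}(F) \leftrightarrow \forall x\,\forall y\,\bigl(\mathrm{Dup}(x,y) \to (Fx \leftrightarrow Fy)\bigr)
```

On this rendering, being lonely and round or accompanied and cubical is screened out at the Basic step, since it is disjunctive, even though it satisfies Indep.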

Three objections have been pressed against this view. Stephen Yablo (1999) objected to the role of natural properties in the analysis, which he argued introduced irrelevant material, and implied that the theory was at best de facto, but not de jure, correct. Dan Marshall and Josh Parsons (2001) claimed that according to this definition, the property being such that a cube exists is non-disjunctive, but it satisfies (a) through (d), so it would be basic intrinsic, despite being extrinsic. Theodore Sider (2001a) claimed that the theory could not handle maximal properties: properties of objects that are not shared by their large proper parts. Sider claims that being a rock is such a property: large parts of rocks are not rocks. So being a rock is extrinsic, since a duplicate of a large part of a rock might be a rock if it is separated from the rest of the rock. But, argued Sider, on some interpretations of ‘natural property’ it is natural, and hence basic intrinsic.

Brian Weatherson’s (2001b) theory was designed to meet these three objections. In his theory, combinatorial principles of possibility are used to derive characteristics not of individual intrinsic properties, as Kim and Langton and Lewis do, but of the whole set of intrinsic properties. He argues that this set, call it SI, will have the following properties:

• If F ∈ SI and G ∈ SI, then (F and G) ∈ SI, (F or G) ∈ SI, and (not F) ∈ SI.
• If F ∈ SI, then having n parts that are F ∈ SI and being entirely composed of exactly n things that are F ∈ SI.
• If F ∈ SI and G ∈ SI, and there is a possible world with n+1 pairwise distinct things, and something in some world is F and something in some world is G, then there is a world with exactly n+1 pairwise distinct things such that one is F and the other n are G.
• If F ∈ SI and G ∈ SI, and it is possible that regions with shapes d₁ and d₂ stand in relation A, and it is possible that an F wholly occupy a region with shape d₁ and a G wholly occupy a region with shape d₂, then there is a world where regions with shapes d₁ and d₂ stand in A, and an F wholly occupies the region with shape d₁ and a G wholly occupies the region with shape d₂.

The first two principles are closure principles on the set. The third principle says that any two intrinsic properties that can be instantiated can be instantiated together any number of times. And the fourth says that if objects having two intrinsic properties can be in two regions, and those two regions can stand in a particular spatial relation, then the regions can stand in that relation while filled by objects having those properties. The third principle suffices to show that being such that a cube exists could not be in SI, and the fourth to show that being a rock could not be.

Weatherson’s theory does not entirely avoid appeals to a concept of naturalness, though the counterexamples that prompt the appeal are now much more recherché. Without such an appeal, if F and G are intrinsic properties that atoms could have, nothing in his theory rules out the property being simple, lonely and F or being G from being intrinsic. There are a few ways for the appeal to go at this point; see Weatherson (2001b) and Lewis (2001a) for some suggestions. The following moves, taken directly from Langton and Lewis, will probably work if any will. Say that the basic intrinsic properties are those non-disjunctive properties whose membership in SI is consistent with the above four principles. Two objects are duplicates iff they do not differ with respect to basic intrinsic properties. A property is intrinsic iff it never differs between duplicates.

John Hawthorne (2001) has suggested that all these combinatorial theories have a problem with properties of the form being R-related to something, where R is a perfectly natural relation that is neither reflexive nor irreflexive. Such properties are extrinsic, but Hawthorne suggests they will satisfy all the combinatorial principles, and their close connection to natural relations means that they will be natural enough to cause problems for all these combinatorial approaches.

In a recent paper, D. Gene Witmer, William Butchard and Kelly Trogdon (2005), hereafter WBT, propose a different kind of variation on Langton and Lewis’s analysis. They agree that the core idea of an analysis should be independence from accompaniment. But they argue that Lewis’s notion of ‘naturalness’ is too mysterious to deliver a useful analysis. Instead they propose to use the idea that an individual has some properties in virtue of having other properties. Their analysis then is that F is intrinsic iff everything that has F has F intrinsically, and an object a has F intrinsically iff for every G such that a is F in virtue of being G, G is independent of accompaniment.

As WBT mention, their theory needs some relatively fine judgments about which properties are instantiated in virtue of which other properties in order to handle some hard cases, especially Sider’s examples involving maximal properties. And they note that the theory seems to break down altogether over some impure properties, such as being identical to a, for any a that can exist either alone or accompanied. This is a property that a has, but does not have in virtue of any distinct property. And that property is independent of accompaniment, so on the analysis it is intrinsic. But intuitively a could have had a distinct duplicate, so being identical to a shouldn’t count as intrinsic. WBT’s solution is to say that the theory is only meant to apply to pure properties. This might not work as it stands. If a is the F, then the property being identical to the only actual F will also improperly count as intrinsic on their view, for much the same reason. It might be better to say that their analysis is intended as an analysis of interior properties, rather than, like the other combinatorial theories, as an analysis of qualitative properties.

3.2 Natural Kind Theories

In On the Plurality of Worlds, David Lewis presents a quite different analysis of intrinsic properties. As with the combinatorial theory that he and Rae Langton defend, it heavily exploits the idea that some properties are more natural than others. In fact, it rests even more weight on it. Here is Lewis’s statement of the theory:

[I]t can plausibly be said that all perfectly natural properties are intrinsic. Then we can say that two things are duplicates iff (1) they have exactly the same perfectly natural properties, and (2) their parts can be put into correspondence in such a way that corresponding parts have exactly the same perfectly natural properties, and stand in the same perfectly natural relations. . . Then we can go on to say that an intrinsic property is one that can never differ between duplicates. (Lewis, 1986b, 61-62)
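The two clauses of the quoted definition can be put schematically (our rendering, not Lewis’s notation), letting N range over perfectly natural properties and relations and f over correspondences between the parts of x and the parts of y:

```latex
% Lewis (1986b), schematic rendering.
% Clause (1): x and y share all perfectly natural properties.
% Clause (2): some part-correspondence f preserves perfectly natural
%             properties and relations of the parts.
\mathrm{Dup}(x,y) \leftrightarrow\;
    \forall N\,(Nx \leftrightarrow Ny) \;\wedge\;
    \exists f\,\forall N\,\forall p_1 \ldots p_n\,
        \bigl(N p_1 \ldots p_n \leftrightarrow N f(p_1) \ldots f(p_n)\bigr)

\mathrm{Intrinsic}(F) \leftrightarrow \forall x\,\forall y\,\bigl(\mathrm{Dup}(x,y) \to (Fx \leftrightarrow Fy)\bigr)
```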

Like the combinatorial theories, this is an attempt at analysing qualitative intrinsicness. Anyone who thinks this is too modest an aim to be worthwhile will be disappointed. It rests heavily on the ‘plausible’ claim that all perfectly natural properties are intrinsic, and, implicitly, on the claim that the perfectly natural properties are sufficient to characterise the world completely. The last assumption is needed because the theory rules out the possibility that there are two objects that share all their perfectly natural properties, but differ with respect to some intrinsic property or other. One consequence of these assumptions is that a world is fully characterised by the intrinsic properties of its inhabitants and the perfectly natural relations between those inhabitants. Lewis thinks this is true of the actual world; it is just his doctrine of Humean supervenience (Lewis, 1986c, i-xiii). But it might be thought a stretch to think it is true of all worlds.


In their paper of 1998, Langton and Lewis claim that the only advantage of their theory over Lewis’s old theory is that it makes fewer assumptions about the nature of natural properties. They also note that Lewis still believes those assumptions, but they think it is worthwhile to have a theory that gets by without them. It also seems that Lewis’s new theory, perhaps as amended, provides more insight into the nature of intrinsicness.

3.3 Contractions

Peter Vallentyne (1997) develops a theory based around the idea that x’s intrinsic properties are those properties it would have if it were alone in the world. He defines a contraction of a world as “a world ‘obtainable’ from the original one solely by ‘removing’ objects from it” (211). As a special case of this, an x-t contraction, where x is an object and t a time, is “a world ‘obtainable’ from the original one by, to the greatest extent possible, ‘removing’ all objects wholly distinct from x, all spatial locations not occupied by x, and all times (temporal states of the world) except t, from the world” (211). Vallentyne allows that there might not be a unique x-t contraction; sometimes we can remove one of two objects, but not both, from the world while leaving x, so there will be one x-t contraction which has one of these in it, and another that has the other.

He then says that F is intrinsic iff for all x and t, all x-t contractions are such that Fx is true in the contraction iff it is true in the actual world. In short, a property is intrinsic to an object iff removing the rest of the world doesn’t change whether the object has the property.
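In schematic form (our notation, not Vallentyne’s), writing C(w, x, t) for ‘w is an x-t contraction of the actual world @’:

```latex
% Vallentyne (1997), schematic rendering: intrinsicness via contractions.
% F is intrinsic iff no contraction ever changes whether Fx holds.
\mathrm{Intrinsic}(F) \leftrightarrow
    \forall x\,\forall t\,\forall w\,
    \Bigl(C(w,x,t) \to \bigl(\text{at } w:\, Fx \;\leftrightarrow\; \text{at } @:\, Fx\bigr)\Bigr)
```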

Vallentyne notes that this definition will not be very enlightening unless we understand the idea of a contraction. This seems related to the objection Langton and Lewis (1998) urge against Vallentyne. They say that Vallentyne’s account reduces to the claim that a property is intrinsic iff possession of it never differs between an object and its lonely duplicates, a claim they think is true but too trivial to count as an analysis. Their position is that we cannot understand contractions without understanding duplication, but if we understand duplication then intrinsicness can be easily defined, so Vallentyne’s theory is no advance.

Stephen Yablo (1999) argues that this criticism is too quick. Vallentyne is best understood as working within a very different metaphysical framework from Lewis’s. For Lewis, no (ordinary) object exists at more than one world, so Vallentyne’s contractions, being separate worlds, must contain separate objects. Hence x-t contractions can be nothing other than lonely duplicates, and the theory is trivial. Yablo suggests that the theory becomes substantive relative to a metaphysical background in which the very same object can appear in different worlds. (In chapter 4 of Plurality Lewis has a few arguments against this idea, and Yablo has interesting responses to these arguments. A thorough investigation of this debate would take us too far from the topic.) If this is the case then we can get a grip on contractions without thinking about duplication: the x-t contraction of a world is the world that contains x itself, and as few other things as possible.

Josh Parsons (2007) argues against this approach. He says that there is no good way to understand sentences like “In w, Prince Philip is not a husband”, where w is the world consisting of just Prince Philip. More generally, we cannot make sense of which extrinsic properties x has in the x-t contraction.

3.4 Non-relationality

Robert Francescotti (1999) has recently outlined an analysis that takes the concept of intrinsicness as non-relationality to be primary. Francescotti takes a property to be extrinsic iff an object possesses it in virtue of its relations to other objects. So being a duplicate of Jack and being such that the number 17 exists are extrinsic, while being identical to Jack and being a vertebrate (i.e. having a vertebral column) are intrinsic. As noted above, this means that we must either have a hyper-intensional notion of properties, or say that intrinsicness is a property of concepts, not of properties. Francescotti takes the former option.

Francescotti notes that not all relational properties are extrinsic. Having a vertebral column, for instance, seems to be relational in that it consists in a relation to a vertebral column, but it is also an intrinsic property. So he focuses on relations to distinct objects. The definition of intrinsicness goes as follows. First we define a d-relational property. F is d-relational iff:

(a) there is a relation R, and an item y, such that (i) x’s having F consists in x’s bearing R to y, and (ii) y is distinct from x; or
(b) there is a relation R, and a class of items C, such that (i) x’s having F consists in there being some member of C to which x bears R, and (ii) at least one member of C to which x bears R is distinct from x; or
(c) there is a relation R, and a class of items C, such that (i) x’s having F consists in x’s bearing R to every member of C, and (ii) it is possible that there is a member of C that is distinct from x.

We then define intrinsic properties as those that are not d-relational:

F is an intrinsic property of x =df x has F, and F is not a d-relational property of x.

Francescotti’s theory provides intuitively plausible answers to all the cases he considers, provided of course that we identify the intrinsic/non-intrinsic distinction with the relational/non-relational distinction, rather than with one of the other two distinctions considered in section two. Like the other three theories, it has one unexplained (or perhaps underexplained) primitive, in this case the consists-in relation. As Francescotti notes, following Khamara (1988), it won’t do to say that x’s having F consists in x’s bearing R to y just in case it is necessary that Fx iff xRy. That makes it too easy for having a property to consist in some necessary being (say God, or the numbers) being a certain way. Rather, he says, “x’s having F, consists in the event or state, x’s having G, just in case x’s having F is the very same event or state as x’s having G” (599). Whether this can handle all the hard cases seems to depend on how the theory of identity conditions for events and states turns out. (See the entries on Donald Davidson and events for a start on this.)
Donald Davidson and events for some start on this.)


Part VII

Comments and Criticism


In Defense of a Kripkean Dogma

Jonathan Ichikawa, Ishani Maitra, Brian Weatherson

In “Against Arguments from Reference” (Mallon et al., 2009), Ron Mallon, Edouard Machery, Shaun Nichols, and Stephen Stich (hereafter, MMNS) argue that recent experiments concerning reference undermine various philosophical arguments that presuppose the correctness of the causal-historical theory of reference. We will argue three things in reply. First, the experiments in question—concerning Kripke’s Gödel/Schmidt example—don’t really speak to the dispute between descriptivism and the causal-historical theory; though the two theories are empirically testable, we need to look at quite different data than MMNS do to decide between them. Second, the Gödel/Schmidt example plays a different, and much smaller, role in Kripke’s argument for the causal-historical theory than MMNS assume. Finally, and relatedly, even if Kripke is wrong about the Gödel/Schmidt example—indeed, even if the causal-historical theory is not the correct theory of names for some human languages—that does not, contrary to MMNS’s claim, undermine uses of the causal-historical theory in philosophical research projects.

1 Experiments and Reference

MMNS start with some by now famous experiments concerning reference and mistaken identity. The one they focus on, and which we’ll focus on too, is a variant of Kripke’s Gödel/Schmidt example. Here is the question they gave to subjects.

Suppose that John has learned in college that Gödel is the man who proved an important mathematical theorem, called the incompleteness of arithmetic. John is quite good at mathematics and he can give an accurate statement of the incompleteness theorem, which he attributes to Gödel as the discoverer. But this is the only thing that he has heard about Gödel. Now suppose that Gödel was not the author of this theorem. A man called “Schmidt” whose body was found in Vienna under mysterious circumstances many years ago, actually did the work in question. His friend Gödel somehow got hold of the manuscript and claimed credit for the work, which was thereafter attributed to Gödel. Thus he has been known as the man who proved the incompleteness of arithmetic. Most people who have heard the name ‘Gödel’ are like John; the claim that Gödel discovered the incompleteness theorem is the only thing they have ever heard about Gödel. When John uses the name ‘Gödel,’ is he talking about:

(A) the person who really discovered the incompleteness of arithmetic? or

(B) the person who got hold of the manuscript and claimed credit for the work? (MMNS 2009: 341)

† Penultimate draft only. Please cite published version if possible. Final version forthcoming in Philosophy and Phenomenological Research.

The striking result is that while a majority of American subjects answer (B), consistently with Kripke’s causal-historical theory of names, the majority of Chinese subjects answer (A).¹ To the extent that Kripke’s theory is motivated by the universality of intuitions in favour of his theory in cases like this one, Kripke’s theory is undermined.

There are now a number of challenges to this argument in the literature. Before developing our own challenge, we’ll briefly note five extant ones, all of which strike us as at least approximately correct.

(1) Kripke’s theory is a theory of semantic reference. When asked who John is talking about, it is natural that many subjects will take this to be a question about speaker reference. And nothing in Kripke’s theory denies that John might refer to the person who proved the incompleteness of arithmetic, even if his word refers to someone else. (Ludwig, 2007; Deutsch, 2009)

(2) Kripke’s argument relies on the fact that ‘Gödel’ refers to Gödel, not on the universality or otherwise of intuitions about what it refers to. That some experimental subjects don’t appreciate this fact doesn’t make it any less of a fact. (Deutsch, 2009)

(3) If the subjects genuinely were descriptivists, it isn’t clear how they could make sense of the vignette, since the name ‘Gödel’ is frequently used in the vignette itself to refer to the causal origin of that name, not to the prover of the incompleteness of arithmetic.² On a related point, Martí doesn’t mention this, but subjects who aren’t descriptivists should also object to the vignette, since in the story John doesn’t learn that Gödel proved the incompleteness of arithmetic, at least not if ‘learn’ is factive. (Martí, 2009)

(4) The experiment asks subjects for their judgments about a metalinguistic, and hence somewhat theoretical, question about the mechanics of reference. It’s better practice to observe how people actually refer, rather than asking them what they think about reference. (Martí, 2009; Devitt, 2010)

¹ Note that a causal descriptivist about names will also say that the correct answer to this question is (B). So the experiment isn’t really testing descriptivism as such versus Kripke’s causal-historical theory, but some particular versions of descriptivism against Kripke’s theory. These versions of descriptivism say that names refer to the satisfiers of (generally non-linguistic) descriptions that the name’s user associates with the name. One such version is ‘famous deeds’ descriptivism, and the descriptions MMNS use are typically famous deeds; nevertheless, that seems inessential to their experiments. When we use ‘descriptivism’ in this paper, we’ll mean any such version of descriptivism. Thanks here to an anonymous referee.

² This objection relies on an empirical assumption that may be questionable. It assumes that the subject of the experiment associates the same description with ‘Gödel’ as John does. A subject who (a) is a descriptivist and (b) associates with the name ‘Gödel’ the description ‘the man who proved the compatibility of time travel and general relativity’ can also make sense of the vignette, contra Martí. So perhaps the objection could be resisted. But we think this empirical assumption is actually fairly plausible. Unless the experimental subjects were being picked from a very biased sample, the number of subjects who are familiar with Gödel’s work on closed time-like curves is presumably vanishingly small! We’re grateful here to an anonymous referee.
to an anonymous referee.


(5) Intuitions about the Gödel/Schmidt case play at best a limited role in Kripke’s broader arguments, so experimental data undermining their regularity do not cast serious doubt on Kripke’s theory of reference. (Devitt, 2010)

We think challenges (1)-(3) work. Something like (4) should work too, although it<br />

requires some qualification. Consider, for instance, what happens in syntax. It’s<br />

true, of course, that we don’t go around asking ordinary speakers whether they<br />

think Lectures on Government and Binding was an advance over Aspects. Or, if we<br />

did, we wouldn’t think it had much evidential value. But that’s not because ordinary<br />

speaker judgments are irrelevant to syntax. On the contrary, judgments about<br />

whether particular strings constitute well-formed sentences are an important part of<br />

our evidence. 3 But they are not our only evidence, or even our primary evidence;<br />

we also use corpus data about which words and phrases are actually used, and many<br />

syntacticians take such usage evidence to trump evidence from metasemantic intuitions. 4<br />

Even when we do seek such intuitive answers, perhaps because there isn’t<br />

enough corpus data to settle the usage issue, the questions might be about cases that<br />

are quite different to the cases we primarily care about. So we might ask a lot about<br />

speakers’ judgments concerning questions even if we care primarily about the syntax<br />

of declarative sentences.<br />

If what Kripke says in Naming and Necessity (hereafter, NN) is right, then we<br />

should expect something similar in the case of reference. Kripke anticipates that<br />

some people will disagree with him about some of the examples, and offers a few<br />

replies. (Our discussion here largely draws on footnote 36 of NN.) Part of his reply<br />

is a version of point 1 above; those disagreements may well be over speaker reference,<br />

not semantic reference. That reply is correct; it’s hard for us to hear a question about<br />

who someone is talking about as anything but a question about speaker reference.<br />

He goes on to note that his theory makes empirical predictions about how names are<br />

used.<br />

If I mistake Jones for Smith, I may refer (in an appropriate sense) to Jones<br />

when I say that Smith is raking the leaves . . . Similarly, if I erroneously<br />

3 This point suggests that Martí’s criticism of MMNS, as stated, overshoots. She wants to dismiss arguments<br />

from metalinguistic intuitions altogether. But intuitions about well-formedness are metalinguistic intuitions,<br />

and they are a key part of the syntactician’s toolkit. Martí concedes something like this point, but<br />

claims that the cases are not on a par, because syntax concerns a normative issue and reference does not.<br />

We’re quite sceptical that there’s such a striking distinction between the kind of subject-matter studied<br />

by syntacticians and semanticists. Devitt’s version of this point is more modest and does not obviously<br />

commit to this exaggeration.<br />

4 Here’s one example where testing intuitions and examining the corpus may lead to different answers.<br />

Many people think, perhaps because they’ve picked up something from a bad style guide, that the sentence<br />

‘Whenever someone came into Bill’s shop, he greeted them with a smile’, contains one or two syntactic<br />

errors. (It uses a possessive as the antecedent of a pronoun, and it uses ‘them’ as a bound singular variable.)<br />

Even if most subjects in a survey said such a sentence was not a well-formed sentence of English, corpus<br />

data could be used to show that it is. Certainly the existence of a survey showing that users in, say, Scotland<br />

and New Jersey give different answers when asked whether the sentence is grammatical would not<br />

show that there’s a syntactic difference between the dialects spoken in Scotland and New Jersey. You’d<br />

also want to see how the sentences are used.



think that Aristotle wrote such-and-such passage, I may perhaps sometimes<br />

use ‘Aristotle’ to refer to the actual author of the passage . . . In<br />

both cases, I will withdraw my original statement, and my original use<br />

of the name, if apprised of the facts. (NN 86n)<br />

This seems entirely right. There’s some sense in which John, in MMNS’s vignette, is<br />

referring to Gödel and some sense in which he’s referring to Schmidt. Just thinking<br />

about the particular utterance he makes using ‘Gödel’ won’t help much in teasing<br />

apart speaker reference and semantic reference. What we should look to are patterns<br />

of—or if they’re not available, intuitions about—withdrawals of statements containing<br />

disputed names. To use the example Kripke gives here, consider a speaker who (a)<br />

associates with the name ‘Aristotle’ only the description ‘the author of The Republic’,<br />

(b) truly believes that a particular passage in The Republic contains a quantifier scope<br />

fallacy, and (c) is a descriptivist. She might say “Aristotle commits a quantifier scope<br />

fallacy in this passage.” When she’s informed that the passage was written by Plato,<br />

she’ll no longer utter those very words, but she’ll still insist that the sentence she<br />

uttered was literally true. That’s because she’ll claim that in that sentence ‘Aristotle’<br />

just referred to the author of the passage, and that person did commit a quantifier<br />

scope fallacy. A non-descriptivist will take back the claim expressed, though she<br />

might insist that what she intended to say was true.<br />

So to show that subjects in different parts of the world really have descriptivist<br />

intuitions about the Gödel/Schmidt case, we might ask about whether they think<br />

John should withdraw, or clarify, his earlier statements if apprised of the facts. Or<br />

we might ask whether they would withdraw, or clarify, similar statements they had<br />

made if apprised of the facts. Or, even better, we might test whether in practice people<br />

in different parts of the world really do withdraw their prior claims at different<br />

rates when apprised of the facts about a Gödel/Schmidt case. Kripke is right that<br />

given descriptivism, a speaker shouldn’t feel obliged to withdraw the original statement<br />

when apprised of the facts, but given the causal-historical theory, they should.<br />

So there are experiments that we could run which would discriminate between descriptivist<br />

and causal-historical approaches, but we don’t think the actual experiment<br />

MMNS run does so.<br />

In its broad terms, we agree with Devitt’s challenge (5), although we understand<br />

the role of the Gödel/Schmidt case rather differently than he does. We turn now to<br />

this question.<br />

2 Gödel’s Role in Naming and Necessity<br />

In the first section we argued that the experimental data MMNS offer do not show<br />

that the correct account of the Gödel/Schmidt example is different in different dialects.<br />

In this section we want to argue that there’s very little one could show about<br />

the Gödel/Schmidt example that would bear on the broader question of what the<br />

correct theory of reference is. To see this, let’s review where the Gödel/Schmidt<br />

example comes up in Naming and Necessity.



In the first lecture, Kripke argues, via the modal argument, that names can’t be<br />

synonymous with descriptions. The reason is that in modal contexts, substituting<br />

a name for an individuating description alters truth values. So a pure descriptivism<br />

that treats names and descriptions as synonymous is off the table. What’s left, thinks<br />

Kripke, is what Soames calls “weak descriptivism” (Soames, 2003, Volume II, 356).<br />

This is the view that although names are not synonymous with descriptions, and do<br />

not abbreviate descriptions, they do have their reference fixed by descriptions. Here<br />

is the way Kripke introduces the picture that he is attacking.<br />

The picture is this. I want to name an object. I think of some way of<br />

describing it uniquely and then I go through, so to speak, a sort of mental<br />

ceremony: By ‘Cicero’ I shall mean the man who denounced Catiline . . .<br />

[M]y intentions are given by first, giving some condition which uniquely<br />

determines an object, then using a certain word as a name for the object<br />

determined by this condition. (NN 79)<br />

The Gödel/Schmidt example, or at least the version of it that MMNS discuss,<br />

comes up in Kripke’s attack on one of the consequences of this picture of naming.<br />

(A variant on the example, where no one proves the incompleteness of arithmetic, is<br />

used to attack another consequence of the theory.) So the role of the Gödel/Schmidt<br />

example is to undermine this picture of names and naming.<br />

But note that it is far from the only attack on this picture. Indeed, it is not even<br />

the first attack. Kripke’s first argument is that for most names, most users of the<br />

name cannot give an individuating description of the bearer of the name. In fact,<br />

those users cannot even give a description of the bearer that is individuating by their<br />

own lights. The best they can do for ‘Cicero’ is ‘a Roman orator’ and the best they<br />

can do for ‘Feynman’ is ‘a famous physicist’. (NN 81) But it isn’t that these users<br />

think that there was only one Roman orator, or that there is only one famous physicist.<br />

It’s just that they don’t know any more about the bearers of these names they<br />

possess. The important point here is that Kripke starts with some examples where<br />

the best description a speaker can associate with a name is a description that isn’t<br />

individuating even by the speakers’ own lights. And he thinks that descriptivists can’t<br />

explain how names work in these cases.<br />

Now perhaps we’ll get new experimental evidence that even in these cases, some<br />

experimental subjects have descriptivist intuitions. Some people might intuit that if a<br />

speaker does not know of any property that distinguishes Feynman from Gell-Mann,<br />

their name ‘Feynman’ is indeterminate in reference between Feynman and Gell-<br />

Mann. We’re not sure what such an experiment would tell us about the metaphysics<br />

of reference, but maybe someone could try undermining Kripke’s argument this way.<br />

But that’s not what MMNS found; their experiments don’t bear on what Kripke<br />

says about ‘Feynman’, and hence don’t bear on his primary argument against weak<br />

descriptivism.<br />

Some philosophers will hold that although the picture Kripke describes here,<br />

i.e., weak descriptivism, can’t be right in general for Feynman/Gell-Mann reasons, it



could be true in some special cases. We agree. So does Kripke. The very next sentence<br />

after the passage quoted above says, “Now there may be some cases in which<br />

we actually do this.” (NN 79) And he proceeds to describe three real life cases (concerning<br />

‘Hesperus’, ‘Jack the Ripper’ and ‘Neptune’) where the picture is plausibly<br />

correct. But he thinks these cases are rare. In particular, we shouldn’t think that the<br />

existence of an individuating description is sufficient reason to believe that we are in<br />

such a case. That, at last, is the point of the Gödel/Schmidt example. His conclusion<br />

from that example is that weak descriptivism isn’t correct even in those special cases<br />

of names where the speaker possesses a description that she takes to be individuating. 5<br />

Michael Devitt (2010) also argues that MMNS exaggerate the importance of the<br />

Gödel/Schmidt case. He identifies a number of Kripke’s other arguments (including<br />

the Feynman one we mention) that he takes to be more central, and, like us, he<br />

argues that MMNS’s results do not cast doubt on these arguments. We agree, noting<br />

only two points of difference. First, as suggested above, although the Gödel/Schmidt<br />

case is not the only or the most central motivation for Kripke’s theory of reference,<br />

we do think that it plays a distinctive role, compared with that of, for instance, the<br />

Feynman case. It refutes even the weak version of weak descriptivism according to<br />

which, in the special case in which subjects do possess individuating descriptions,<br />

those descriptions determine reference. We think the Gödel/Schmidt case (together<br />

with the Peano/Dedekind case) form the basis of the only argument in Naming and<br />

Necessity against this weak weak descriptivism. (On a closely related point, we, unlike<br />

Devitt, take the Gödel/Schmidt case to be addressing a quantitative question about<br />

how common descriptive names are, not the qualitative question about whether the<br />

causal-historical theory is true at all; we’ll expand on this point below.) Second,<br />

Devitt expresses some scepticism about the Gödel/Schmidt judgment on the grounds<br />

that the relevant case is somewhat ‘fanciful’—actual cases, Devitt suggests, are more<br />

to be trusted. While there is surely some truth in the suggestion that intuitions about<br />

esoteric and complicated cases can be less trustworthy than those about everyday<br />

ones, we see little reason for concern in this instance; the Gödel case does not describe<br />

a scenario we should expect to have trouble thinking about.<br />

Our reconstruction of the structure of Kripke’s argument should make it clear<br />

how unimportant the Gödel/Schmidt example is to the broader theoretical questions.<br />

If Kripke were wrong about the Gödel/Schmidt case, that would at most show that<br />

there are a few more descriptive names than we thought there were. But since the<br />

existence of some descriptive names is consistent with the causal-historical theory<br />

of reference, the existence of a few more is too. All the Gödel/Schmidt example is<br />

used for in Naming and Necessity is to show that the number of descriptive names in<br />

English is not just small, it is very small. But the truth of the causal-historical theory<br />

5 The Gödel/Schmidt example is also distinctive in another way, in that the description in question<br />

actually applies to the referent of the name, and indeed speakers actually know this. But the flow of the<br />

text around the example (especially on page 84) suggests Kripke intends the example to make the same<br />

point as is made by other examples, such as the Peano/Dedekind case (in which the possessed description<br />

doesn’t actually apply to the referent of the name). So this is probably not crucial to the point the example<br />

makes. We’ll return below to the issue of just what this example shows. The key point is that the more<br />

distinctive the example is, the less that would follow if Kripke were wrong about the example; he might<br />

only be wrong about examples with just those distinctive features.



of reference doesn’t turn on whether there are few descriptive names, or very few<br />

descriptive names.<br />

Once we see that the Gödel/Schmidt example concerns a quantitative question<br />

(are descriptive names rare or very rare?) rather than a qualitative question (is the<br />

causal-historical theory correct?), we can see some limitations of the experiment<br />

MMNS rely on. The case that MMNS describe to their subjects has several distinctive<br />

features, and it isn’t clear that we’d be justified in drawing conclusions from<br />

it about cases that lack those features. Here is one such feature. The subject of the<br />

vignette (John) acquires the name ‘Gödel’ at the same time as he acquires an individuating<br />

description of Gödel. Suppose it turned out that, in some dialects at least,<br />

that would be sufficient for the name to be a descriptive name; i.e., for it to be a<br />

name whose reference is fixed by a description somehow attached to that name. If<br />

this conjecture is true, then descriptive names are a little more common than Kripke<br />

thinks they are, but not a lot more common. Now we don’t actually think this conjecture<br />

is true. And for the reasons given in section 1 we don’t think this experiment<br />

is evidence for it. What we do think is that (a) it’s hard to see how studying reactions<br />

to cases like the Gödel/Schmidt example could show more than that some such<br />

claim about the prevalence of descriptive names is true, and (b) such claims are not<br />

inconsistent with the causal-historical theory.<br />

We’ve argued that even if Kripke is wrong about the Gödel/Schmidt example,<br />

that doesn’t undermine the arguments for the main conclusions of Naming and Necessity.<br />

A natural inference from this is that experiments about the Gödel/Schmidt<br />

example can’t undermine those conclusions. We think the natural inference is correct.<br />

A referee has suggested that this is too quick. After all, if we have experimental<br />

evidence that Kripke is wrong about the Gödel/Schmidt case, we might have some<br />

grounds for suspicion about the other cases that Kripke uses in the arguments for<br />

more central conclusions. That is, if MMNS are right about the Gödel/Schmidt case,<br />

that doesn’t give us a deductive argument against the other anti-descriptivist moves,<br />

but it might give us an inductive argument against them. This is an important worry,<br />

but we think it can be adequately responded to.<br />

The first thing to note is that it would be foolish to fall back to a general scepticism<br />

about human judgment just because people disagree in their intuitive reactions to<br />

some tricky cases. This point is well argued by Timothy Williamson in his (2007, Ch.<br />

6). If there’s a worry here, it must be because the evidence about the Gödel/Schmidt<br />

example supports a more modest generalisation about judgments about cases, but<br />

that generalisation is nevertheless strong enough to undermine Kripke’s other arguments.<br />

We doubt such a generalisation exists.<br />

It can’t be that the experiments about the Gödel/Schmidt example show that<br />

intuitive judgments about reference are systematically mistaken. Most of our intuitions<br />

in this field are surely correct. For instance, our intuitions that ‘Kripke’ refers<br />

to Kripke and not Obama, and that ‘Obama’ refers to Obama and not Kripke, are<br />

correct. (And experiments like the ones MMNS ran don’t give us any reason at all<br />

to doubt that.) And we could produce many more examples like that. At most, the<br />

experiments can show us that there are spots of inaccuracy in a larger pool of correct<br />

judgments.



It might be argued that we should be sceptical of intuitions about reference in<br />

counterfactual cases. The correct judgments cited in the previous paragraph are all<br />

about real cases, but the Gödel/Schmidt example is not a real case. Now we don’t<br />

think that the experiments do undermine all intuitions about reference in counterfactual<br />

cases, but even if they did, that wouldn’t affect the Kripkean argument. That’s<br />

because the central argument against descriptivism at the start of Lecture II involves<br />

real cases. The heavy lifting is done by cases where speakers don’t think they have<br />

an individuating description to go along with names they use (e.g., ‘Feynman’ and<br />

‘Gell-Mann’), or they believe they have an individuating description, but that description<br />

involves some kind of circularity (e.g., ‘Einstein’, ‘Cicero’). It seems to us<br />

that these cases are much more like the cases where we know people have accurate<br />

intuitions about reference (e.g., ‘Obama’ refers to Obama), than they are like cases<br />

where there is some dispute about their accuracy (e.g., ‘Gödel’ would refer to Gödel<br />

even if Schmidt had proved the incompleteness of arithmetic). So there’s no reason to<br />

doubt the intuitions that underlie these central Kripkean arguments. And so there’s<br />

no reason from these experiments to doubt the anti-descriptivist conclusions Kripke<br />

draws from them.<br />

3 Reference in Philosophy<br />

If the data about the Gödel/Schmidt example don’t undermine the causal-historical<br />

theory of reference, then presumably they don’t undermine philosophical uses of<br />

that theory. But we think MMNS overstate the role that theories of reference play in<br />

philosophical theorising, and we’ll end by saying something about this.<br />

One simple reaction to MMNS’s argument is to say that at most they show that<br />

the causal-historical theory of reference is not true of some dialects. But, a philosopher<br />

might say, they are not writing in such a dialect, and the causal-historical theory<br />

is true of their dialect. And that’s all they needed for their argument. MMNS anticipate<br />

this objection, and reply to it in section 3.3 of their paper. The reply is, in<br />

essence, that such a picture would make a mess of communication. If we posit dialectical<br />

variation to explain different reactions to the Gödel/Schmidt example, and<br />

to other examples, then we cannot know what dialect someone is speaking without<br />

knowing how they respond to these examples. And plainly we don’t need to quiz<br />

people in detail about philosophical examples in order to communicate with them.<br />

We offer three replies.<br />

First, at least one of us is on record raising in-principle suspicions about this<br />

kind of argument (Maitra, 2007). The take-home message from that paper is that<br />

communication is a lot easier than many theorists have supposed, and requires much<br />

less pre-communicative agreement. It seems to us that the reply MMNS offer here is<br />

susceptible to the arguments in that paper, but for reasons of space we won’t rehearse<br />

those arguments in detail.<br />

Second, it’s one thing to think that variation in reference between dialects leads to<br />

communication breakdown; it’s another thing altogether to think that variation in<br />

meta-semantics leads to such breakdown. A little fable helps make this clear. In some<br />

parts of Melbourne, ‘Gödel’ refers to Gödel because of the causal chains between the



users of the name and the great mathematician. In other parts, ‘Gödel’ refers to Gödel<br />

because the speakers use it as a descriptive name, associated with the description ‘the<br />

man who proved the incompleteness of arithmetic’. Kevin doesn’t know which area<br />

he is in when he sees a plaque over a door saying “Gödel lived here”. It seems to us<br />

that Kevin can understand the sign completely without knowing how ‘Gödel’ got<br />

its reference. Indeed, he even knows what proposition the sign expresses. So metasemantic<br />

variation between dialects need not lead to communicative failure, even<br />

when hearers don’t know which dialect is being used.<br />

Third, if MMNS’s argument succeeds, it seems to us that it shows descriptivist<br />

theories, including the weak weak descriptivism that Kripke is arguing against with<br />

the Gödel/Schmidt example, are doomed. (The arguments in this paragraph are not<br />

original. Similar arguments are used frequently in, e.g., Fodor and Lepore (1992).) It’s<br />

a platitude that different people know different things. Barring a miracle, that means<br />

different people will associate different descriptions with different names. If there is<br />

widespread use of descriptive names, that means there will be widespread differences<br />

in which descriptions are associated with which names. And that will produce at least<br />

as much communicative difficulty as having some people be causal-historical theorists<br />

and some people be descriptivists. In short, if MMNS’s argument against ‘referential<br />

pluralism’ is sound, there is an equally sound argument against descriptivism. And<br />

note that this argument doesn’t rely on any thought experiments about particular<br />

cases. It doesn’t even rely on thought experiments about names like ‘Einstein’, where<br />

there isn’t any evidence that Kripke is wrong about how those names work.<br />

Dialectically, the situation is this. MMNS have offered an argument from the<br />

possibility of communicating under conditions of ignorance about one’s interlocutor’s<br />

knowledge. Similar arguments have been offered against descriptivism. If such<br />

arguments are successful, then descriptivism is false, and there’s no problem with<br />

philosophers making arguments from the falsity of descriptivism. If such arguments<br />

are unsuccessful, then MMNS haven’t shown that it is wrong for philosophers to assume<br />

that the causal-historical theory is the right theory for their dialect, even if some<br />

other people are descriptivists. And, as MMNS concede, as long as the philosophers<br />

themselves speak a causal-historical theory dialect, the uses of the causal-historical<br />

theory in philosophy seem appropriate. The only way this argument could fail is if<br />

MMNS’s argument from the possibility of communicating under conditions of ignorance<br />

about one’s interlocutor’s knowledge is stronger than the analogous arguments<br />

against descriptivism. But we see no reason to believe that is so. If anything, it seems<br />

like a weaker argument, because of the considerations arising from our fable about<br />

Kevin and the ‘Gödel lived here’ sign.<br />

So we don’t think MMNS have a good reply to the philosopher who insists that<br />

they only need the causal-historical theory to be true of their dialect. But in fact we<br />

think that philosophers rarely even assume that much.<br />

Let’s consider one of the examples that they cite: Richard Boyd’s use of the causal-historical<br />

theory of reference in developing and defending his version of “Cornell<br />

Realism” in his (1988). Here’s one way one could try and argue for moral realism<br />

from the causal-historical theory.



1. The causal-historical theory of reference is the correct theory of reference for<br />

all words in all dialects (or at least our dialect).<br />

2. So, it is the correct theory for ‘good’.<br />

But that’s not Boyd’s actual argument. And that’s a good thing, because the first<br />

premise is implausible. Someone defending it has to explain descriptive names like<br />

‘Neptune’, logical terms like ‘and’, empty predicates like ‘witch’, and so on. And<br />

Boyd’s not in that business. His argument is subtler. Boyd uses the causal-historical<br />

theory for two purposes. First, he uses the development of a naturalistically acceptable<br />

theory of reference as part of a long list of developments in post-positivist philosophy<br />

that collectively constitute a “distinctively realist conception of the central<br />

issues in the philosophy of science” (Boyd, 1988, 188). Second, he uses the causal-historical<br />

theory of reference, as it applies to natural kind terms, as part of a story<br />

about how we can know a lot about kinds that are not always easily observable (Boyd,<br />

1988, 195-196). By analogy, he suggests that we should be optimistic that a naturalistically<br />

acceptable moral theory exists, and that it is consistent with us having a lot of<br />

moral knowledge.<br />

Once we look at the details of Boyd’s argument, we see that it is an argument that<br />

duelling intuitions about the Gödel/Schmidt example simply can’t touch. In part<br />

that’s because Boyd cares primarily about natural kind terms, not names. But more<br />

importantly it is because, as we noted in section 2, the only point that’s at issue by the<br />

time Kripke raises the Gödel/Schmidt example is the number of descriptive names.<br />

Just looking at the arguments Kripke raises before that example gives us more than<br />

enough evidence to use in the kind of argument Boyd is making.<br />

It would take us far beyond the length of a short reply to go through every philosophical<br />

use of the causal-historical theory that MMNS purport to refute in this much<br />

detail. But we think that the kind of response we’ve used here will frequently work.<br />

That is, we think few, if any, of the arguments they attack use the parts of the causal-historical<br />

theory that Kripke is defending with the Gödel/Schmidt example, and so<br />

even if that example fails, it wouldn’t undermine those theories.


Defending Causal Decision Theory<br />

In “Some Counterexamples to Causal Decision Theory”, Andy Egan argues that<br />

causal decision theory cannot handle certain cases that I’ll call ‘asymmetric Death<br />

in Damascus’ cases. I’m going to argue that causal decision theory is not undermined<br />

by asymmetric Death in Damascus cases.<br />

Egan’s arguments all turn on intuitive judgments about such cases. Those intuitions,<br />

insofar as they are reliable, seem to support a quite general principle that I’ll<br />

call Egan’s Safety Principle or (ESP).<br />

When we are discussing principles in decision theory, there are two things we<br />

have to check. One is whether the principle gives plausible results when it is the only<br />

principle we need to use to make a decision. And (ESP) does quite well by that standard.<br />

That is, in effect, what Egan shows. The second is whether the principle leads<br />

to plausible results in more complicated cases when conjoined with other, plausible,<br />

principles of practical inference. And I’ll argue that (ESP) does very badly on this<br />

test. Indeed, combined with some fairly innocuous principles, (ESP) ends up giving<br />

us contradictory advice about a case. If we take those other principles to be laws of<br />

the logic of decision, then (ESP) is inconsistent. Even if we don’t draw such a strong<br />

conclusion, we’ll see that the outputs of (ESP) are confusing at best, and in some cases<br />

incoherent. This suggests to me that both (ESP) and the intuitions that support it are<br />

unreliable, and hence shouldn’t ground an overthrow of causal decision theory.<br />

1 Death in Damascus<br />

Egan’s examples are similar in some respects to the Death in Damascus case introduced<br />

to the decision theory literature in Allan Gibbard and William Harper’s classic<br />

paper, “Counterfactuals and Two Kinds of Expected Utility.” (Gibbard and Harper,<br />

1978, 157-158)<br />

Consider the story of the man who met Death in Damascus. Death<br />

looked surprised, but then recovered his ghastly composure and said,<br />

‘I AM COMING FOR YOU TOMORROW’. The terrified man that night<br />

bought a camel and rode to Aleppo. The next day, Death knocked on<br />

the door of the room where he was hiding, and said ‘I HAVE COME FOR<br />

YOU’.<br />

‘But I thought you would be looking for me in Damascus’, said the man.<br />

‘NOT AT ALL’, said Death ‘THAT IS WHY I WAS SURPRISED TO SEE<br />

YOU YESTERDAY. I KNEW THAT TODAY I WAS TO FIND YOU IN<br />

ALEPPO’.<br />

Now suppose the man knows the following. Death works from an appointment<br />

book which states time and place; a person dies if and only if<br />

the book correctly states in what city he will be at the stated time. The<br />

† Unpublished.



book is made up weeks in advance on the basis of highly reliable predictions.<br />

An appointment on the next day has been inscribed for him.<br />

Suppose, on this basis, the man would take his being in Damascus the<br />

next day as strong evidence that his appointment with Death is in Damascus,<br />

and would take his being in Aleppo the next day as strong evidence<br />

that his appointment is in Aleppo...<br />

If... he decides to go to Aleppo, he then has strong grounds for expecting<br />

that Aleppo is where Death already expects him to be, and hence it is<br />

rational for him to prefer staying in Damascus. Similarly, deciding to<br />

stay in Damascus would give him strong grounds for thinking that he<br />

ought to go to Aleppo.<br />

In cases like this, the agent is in a real dilemma. Assuming that he goes to Aleppo,<br />

probably he would have been better off had he gone to Damascus. And if he stays in<br />

Damascus, then probably he would have been better off if he had left. As soon as he<br />

does something, it will be the thing that is irrational to do, given his evidence.<br />

The case as presented has two complicating features. First, given that there is<br />

only one Death, the man can avoid Death’s predictive powers by using some kind<br />

of randomising device to choose where he goes. In game theoretic terminology, the<br />

man could play a mixed strategy. (This is recommended in Weirich (2008).) If Death<br />

could be in multiple places, and would be if he predicted the man would do this, this<br />

option would be closed off. So I will mostly ignore cases where mixed strategies offer<br />

a way out. 1<br />

The second complicating factor is that it isn’t clear how much disutility the man<br />

puts into buying a camel, riding to Aleppo etc. It seems from the case that the utility<br />

or disutility of this is supposed to be minimal, but it would be good to be more<br />

specific, and to think about cases where that disutility is not minimal. For instance,<br />

we could imagine a case where buying the camel would bankrupt the man’s heirs.<br />

Formally, we’ll consider cases that have the following structure, where O 1 and O 2<br />

are choices, S 1 and S 2 are states, x i j is the payoff for making choice O i in state S j , and<br />

for each i choosing O i is evidence that the agent is in state S i .<br />

S 1<br />

S 2<br />

O 1 x 11 x 12<br />

O 2 x 21 x 22<br />

We also assume that x11 < x21 and x22 < x12, so whatever the agent does, they have<br />

evidence that they would have been better choosing otherwise. We’ll also assume,<br />

though the grounds for this assumption will need to be specified, that mixed strategies<br />

are unavailable, or unadvisable, for the agent. Any such case is a Death in Damascus<br />

case.<br />
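As an illustrative sketch of this structure (the encoding and function name are mine, not part of the paper), the defining conditions can be written as a small check, with the original story scored by survival:

```python
def is_death_in_damascus(x):
    """x[i][j] is the payoff for choice O(i+1) in state S(j+1).
    The case requires x11 < x21 and x22 < x12: whatever the agent
    does, the evidence says the other choice would have paid more."""
    return x[0][0] < x[1][0] and x[1][1] < x[0][1]

# The original story, scoring survival 1 and death 0: staying in
# Damascus (O1) is fatal iff the appointment is there (S1), and
# riding to Aleppo (O2) is fatal iff the appointment is in Aleppo (S2).
damascus = [[0, 1],
            [1, 0]]
assert is_death_in_damascus(damascus)
```

This check sets aside the further stipulation that mixed strategies are unavailable, which is not a feature of the payoff matrix itself.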

1 In section 4 I briefly note that even if we allow mixed strategies, we don’t end up with a considerably<br />

more intuitive outcome.



2 Asymmetric Death in Damascus<br />

An asymmetric Death in Damascus case is simply a Death in Damascus case, as specified<br />

above, with x11 ≠ x22.2 We'll notate our cases so that x11 > x22. Egan's examples<br />

are a subset of asymmetric Death in Damascus cases with three distinguishing characteristics.<br />

• x11 is much much greater than x22.<br />

• x12 is much much greater than x22.<br />

• x21 is just a little greater than x11.<br />

If those three conditions are met, we'll call the case an 'Egan case', and call O1 the<br />

'Safe' option and O2 the 'Risky' option. And we'll call S1 the 'PredSafe' state and<br />

S2 the 'PredRisk' state. We'll illustrate these terms with Egan's example Newcomb's<br />

Firebomb. (Egan, 2007b, 109-110)<br />

There are two boxes before you. Box A definitely contains $1,000,000.<br />

Box B definitely contains $1,000. You have two choices: take only box A<br />

(call this one-boxing), or take both boxes (call this two-boxing). You will<br />

signal your choice by pressing one of two buttons. There is, as usual,<br />

an uncannily reliable predictor on the scene. If the predictor has predicted<br />

that you will two-box, he has planted an incendiary bomb in box<br />

A, wired to be detonated (burning up the $1,000,000) if you press the<br />

two-box button. If the predictor has predicted that you will one-box,<br />

no bomb has been planted, nothing untoward will happen, whichever<br />

button you press. The predictor, again, is uncannily accurate.<br />

Egan doesn’t make explicit what happens if the demon predicts you’ll play a mixed<br />

strategy, but let’s assume, as in the original Newcomb case, that the predictor will<br />

treat this like two-boxing, and include the bomb. And let’s further assume, as seems<br />

reasonable, that given this mixed strategies are a very bad idea in the circumstances.<br />

Now let’s look at the payoff table for Newcomb’s Firebomb. I’ll assume, as seems<br />

harmless enough in these cases, that payoffs in dollars translate easily and linearly to<br />

payoffs in utilities.<br />

                          One-Boxing Predicted   Two-Boxing Predicted<br />
                              (PredSafe)             (PredRisk)<br />
Take one box (Safe)            1,000,000              1,000,000<br />
Take two boxes (Risky)         1,001,000                  1,000<br />
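As a sanity check, the table can be tested against the three Egan-case conditions. The numeric threshold standing in for 'much much greater' below is my own illustrative assumption; the paper leaves the notion informal:

```python
def is_egan_case(x11, x12, x21, x22, much=10):
    """The three Egan-case conditions: x11 and x12 are both much
    much greater than x22, and x21 is just a little greater than
    x11. 'much' is an arbitrary multiplicative threshold standing
    in for the paper's informal 'much much greater'."""
    return (x11 > much * x22 and
            x12 > much * x22 and
            x11 < x21 < much * x11)

# Newcomb's Firebomb, with payoffs in dollars taken as utilities:
assert is_egan_case(x11=1_000_000, x12=1_000_000, x21=1_001_000, x22=1_000)
```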

2 Such cases appear to have been first discussed by Richter (1984). Among other things, he noted some of the<br />

ways I listed two paragraphs ago in which the original Death in Damascus case could be an asymmetric<br />

case.



As we can see, x11 (i.e., 1,000,000) is much much greater than x22 (i.e., 1,000), and<br />

only a little less than x21 (i.e., 1,001,000), while x12 (i.e., 1,000,000) is also much much<br />

greater than x22. So this fits the pattern described above. The principle that the<br />

intuitions driving Egan's argument support seems to be that taking the Safe<br />

option is uniquely rational in Egan cases. This principle will play a big role in what<br />

follows, so let’s give it a name.<br />

Egan’s Safety Principle (ESP) In an Egan case, taking the Safe option is the unique<br />

rational choice.<br />

Using this principle we can give a brisk statement of Egan’s objection to causal decision<br />

theory.<br />

1. In an Egan case, taking the Safe option is the unique rational choice.<br />

2. Causal decision theory does not say that in an Egan case taking the Safe option<br />

is the unique rational choice.<br />

3. So, causal decision theory is either mistaken, if it denies (ESP), or incomplete,<br />

if it does not say anything about what the rational thing to do is in Egan cases. 3<br />

So far we’ve just looked at cases where the agent has two options. In the next section<br />

I’ll consider certain three option cases, and argue that if we assume (ESP) we end up<br />

with implausible conclusions. I conclude that, as plausible as (ESP) looks when we<br />

consider cases like Newcomb’s Firebomb, it cannot ultimately be accepted.<br />

3 Egan Cases with Alternatives<br />

In each of the following cases, the agent has a choice between three boxes, of which<br />

they can choose exactly one. In each case there is a demon that predicts what the<br />

agent will choose. The demon is very good at making predictions. In particular,<br />

the demon is very probably correct in her prediction conditional on any choice the<br />

agent makes. How much money the demon puts into each box is dependent on her<br />

predictions of the agent’s choice. I won’t specially notate this in any way, but in each<br />

case, if the demon predicts that the agent is using a mixed strategy, then the demon<br />

will put no money in any box. And I'll assume this is sufficient for the agent not<br />

to play a mixed strategy.4<br />

We’ll be interested in two cases - here is the first of them. This is what I’ll call<br />

the ‘ABC choice’. The rows represent the player’s choices, the columns represent the<br />

3 Egan notes that given a particular implementation of causal decision theory, namely that in Lewis (1981b),<br />

and some particular assumptions about the agent's credences, the agent will choose O2, which he regards<br />

as irrational. But Lewis’s implementation is not the only implementation, and the credences Egan ascribes<br />

are neither obviously correct nor obviously part of causal decision theory. So it isn’t obvious, I think,<br />

that causal decision theory as such recommends choosing the Risky option. Indeed, given the variety of<br />

implementations of causal decision theory, it isn’t obvious that causal decision theory as such makes any<br />

prescription about Egan cases. But Egan is clearly right that causal decision theory of any stripe doesn’t<br />

uniquely recommend the Safe choice, and that’s enough to get an objection to causal decision theory going<br />

if (ESP) is true.<br />

4 Arntzenius (2008) argues that the agent should use a mixed strategy in Egan cases as originally described.<br />

This is less plausible given my stipulations about the demon.



demon’s predictions. The cells represent how much utility the agent gets given the<br />

prediction, as specified in the row, and the prediction, as specified in the column.<br />

                   Demon predicts A   Demon predicts B   Demon predicts C<br />
Agent chooses A                4000               5000               5000<br />
Agent chooses B                5000               1000               5000<br />
Agent chooses C                5000                800               4000<br />

I’m going to argue that if (ESP) is true, choosing A is uniquely rational here.<br />

Note first that choosing B weakly dominates choosing C. If the demon predicts<br />

that the agent will choose A, then B and C are just as good, and otherwise B is better<br />

than C. So B is at least as good as C, and better if the probability that A is predicted<br />

is less than 1.<br />
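The dominance claim can be checked mechanically. A minimal sketch (the encoding is mine):

```python
def weakly_dominates(row1, row2):
    """True if row1 is at least as good as row2 in every state and
    strictly better in at least one."""
    return (all(a >= b for a, b in zip(row1, row2)) and
            any(a > b for a, b in zip(row1, row2)))

# The ABC choice: payoffs indexed by the demon's prediction (A, B, C).
abc = {'A': [4000, 5000, 5000],
       'B': [5000, 1000, 5000],
       'C': [5000,  800, 4000]}

assert weakly_dominates(abc['B'], abc['C'])      # B weakly dominates C
assert not weakly_dominates(abc['A'], abc['B'])  # A does not dominate B
```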

Now we’ll just look at the comparison between A and B. Assume temporarily<br />

that the demon will either predict that A will be chosen or predict that B will be<br />

chosen. Conditional on that assumption, the choice between A and B looks like this.<br />

                   Demon predicts A   Demon predicts B<br />
Agent chooses A                4000               5000<br />
Agent chooses B                5000               1000<br />

But this is an Egan case by our definition, with A being Safe and B being Risky. And<br />

we’re assuming (ESP). So, conditional on the demon predicting A or B, A is a better<br />

choice than B. It is clear that conditional on the demon predicting C, A and B<br />

are equally good choices. But those two options, either the demon predicts A or B,<br />

or the demon predicts C, exhaust the possibilities. And A is a better choice than B on<br />

the first option and just as good a choice as B on the second option. By an application<br />

of the sure thing principle, A is at least as good a choice as B, and better unless the<br />

probability that the demon predicts C is 1. Slightly more formally, the argument is<br />

1. Conditional on the demon predicting A or B, choosing A is better than choosing<br />

B.<br />

2. Conditional on the demon predicting C, choosing A is exactly as good as choosing<br />

B.<br />

3. Those two options (the demon predicting A or B; the demon predicting C) are<br />

exclusive and exhaustive.<br />

4. So choosing A is at least as good as choosing B, and better if the probability<br />

that the demon predicts C is less than 1.<br />

The motivation for the first premise is (ESP). The second and third premises are true<br />

by stipulation in the case. And the validity of the argument is guaranteed by the sure<br />

thing principle. Our next step involves an application of the transitivity of better<br />

than.



1. Choosing A is at least as good as choosing B, and better if the probability the<br />

demon predicts C is less than 1.<br />

2. Choosing B is at least as good as choosing C, and better if the probability the<br />

demon predicts A is less than 1.<br />

3. So choosing A is better than choosing C.<br />

The first premise is what we derived from the previous argument. The second premise<br />

is true by weak dominance. Transitivity alone merely gives us that choosing A is at<br />

least as good as choosing C. But the two could only be exactly as good if choosing A<br />

was exactly as good as choosing B, and choosing B was exactly as good as choosing C.<br />

And the conditions under which those two equalities obtain are incompatible, since<br />

it would require that both the demon predicting that A is chosen and the demon<br />

predicting that C is chosen have probability 1.<br />

Since A is at least as good as B, and better unless the probability of C being predicted<br />

is 1, and B is at least as good as C, and better unless the probability of A being<br />

predicted is 1, it follows that A is better than C by the transitivity of preference. Indeed,<br />

it seems that A must be considerably better, since choosing the Safe option is<br />

meant to be a clearly preferable choice in an Egan case.<br />

Here is the second case we’ll be looking at, what I’ll call the ‘DEF choice’.<br />

                   Demon predicts D   Demon predicts E   Demon predicts F<br />
Agent chooses D                4000                800               5000<br />
Agent chooses E                5000               1000               5000<br />
Agent chooses F                5000               5000               4000<br />

The reasoning here will be similar to the ABC choice, so I won’t go through it in<br />

anything like the same detail. Since E weakly dominates D, E must be better than D.<br />

Conditional on the demon predicting E or F, the choice between E and F is an Egan<br />

case, with F being Safe and E being Risky. So by the assumption of (ESP), F is better<br />

than E conditional on E or F being predicted. If the demon predicts D, then E and F<br />

are equally good. So by the sure thing principle, F is simply better than E, unless the<br />

probability that the demon will predict D is 1, in which case they are equally good.<br />

By transitivity, F is better than D.<br />

But this all seems exceedingly odd. The difference between the A/C comparison<br />

and the D/F comparison is simply that, if the demon predicts that neither of them<br />

will be chosen, then A is better than C, and F is better than D. But since, given (ESP),<br />

there is very little reason to pick the 'middle' option, i.e. B or E, and<br />

the demon knows this, and the agent knows the demon knows this, the probability<br />

of the middle option being predicted is vanishingly small. So it can't explain much<br />

by way of why one option would be better than another.<br />

I conclude from all this that we can't always accept (ESP). Given (ESP), there<br />

is almost no relevant difference between the two cases. But (ESP) implies there is all the difference in the<br />

world between them. So (ESP) is incoherent, and hence false.<br />



Here’s another way of looking at the problem that these choices raise for (ESP).<br />

Assume you’re a rational agent making the ABC or DEF choice, as described above,<br />

and (ESP) is a true constraint on rational decision making. Then both B and E are<br />

ruled out, as shown above. And the demon knows you are rational, so the demon<br />

won’t predict B or E. So the choices in question look like these.<br />

                   Demon predicts A   Demon predicts C<br />
Agent chooses A                4000               5000<br />
Agent chooses C                5000               4000<br />

                   Demon predicts D   Demon predicts F<br />
Agent chooses D                4000               5000<br />
Agent chooses F                5000               4000<br />

It looks like the same choice! Given (ESP), that is, the ABC choice and the DEF<br />

choice are on a par. But also given (ESP), it is irrational to choose C over A, and<br />

rationally mandatory to choose F over D. That is, (ESP) both says that the choice<br />

between A and C is just the same as the choice between D and F, and says that you<br />

should treat these choices differently. That seems incoherent to me. So I conclude<br />

(ESP) is false.<br />
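The claim that, with the middle options ruled out, the two choices are structurally identical can be checked directly (the dictionary encoding is my own):

```python
# Reduced payoff matrices once the middle options are ruled out;
# keys are (agent's choice, demon's prediction).
ac = {('A', 'A'): 4000, ('A', 'C'): 5000,
      ('C', 'A'): 5000, ('C', 'C'): 4000}
df = {('D', 'D'): 4000, ('D', 'F'): 5000,
      ('F', 'D'): 5000, ('F', 'F'): 4000}

# Relabel A -> D, C -> F: the two reduced choices coincide
# entry for entry.
relabel = {'A': 'D', 'C': 'F'}
assert {(relabel[r], relabel[c]): v for (r, c), v in ac.items()} == df
```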

But what (ESP) says about Egan cases is very intuitive. That’s the point of Egan’s<br />

paper; it’s (ESP), not causal decision theory, that tracks intuitions around here. From<br />

this I conclude that intuitions about these problems are not to be trusted. So even<br />

if causal decision theory says somewhat counterintuitive things about Egan cases,<br />

and Egan quite clearly shows that it does, the right conclusion is that intuition is<br />

untrustworthy, and causal decision theory is not undermined.<br />

4 Objections and Replies<br />

I’ll conclude with four possible objections to my argument, and brief replies to each<br />

of them.<br />

Objection: The difference between the A/C choice and the D/F choice is in what<br />

happens if the demon predicts B/E. And F is much better than C conditional on the<br />

demon making this ‘middle column’ choice. That explains why (ESP) recommends<br />

choosing A and F.<br />

Reply: If anything, this reasoning should point us in the opposite direction. Since the<br />

demon can ‘see through’ the reasoning of the objector, it is less likely that the demon<br />

will predict A is chosen than that the demon will predict D is chosen. And given<br />

that the demon predicts A will be chosen, the last thing you want to choose is A. So<br />

there’s no justification here for (ESP)’s flipping between A and F.



Objection: The argument so far has only shown that there’s a small gain to choosing<br />

A over C, and a small gain to choosing F over D, assuming (ESP). And perhaps the<br />

difference in the middle column could explain this.<br />

Reply: We should reject the premise of the objection. If Egan’s objection to causal<br />

decision theory is to work, we have to know (ESP) is correct. Given standard safety<br />

principles for knowledge, that implies that the Safe option should be much better<br />

than the Risky option in an Egan case. That’s hardly an uncharitable inference to<br />

draw from Egan’s paper; it seems clear that in the Egan cases he discusses, the Safe<br />

option is taken to be easily superior. That implies that A should be much better than<br />

B, and hence than C. And F should be much better than E, and hence than D. But<br />

that’s absurd, given that they only differ if the demon makes a prediction that (ESP)<br />

says shouldn’t be made.<br />

Objection: Given evidential decision theory, it’s a wash whether we choose A or<br />

C, and a wash whether we choose D or F. So there’s nothing wrong with picking,<br />

somewhat arbitrarily, A and F.<br />

Reply: For one thing, as the previous reply shows, Egan’s argument against causal<br />

decision theory requires that the choice between A and C not be a wash, but in fact<br />

be clearly in A’s favour. For another, this objection turns on trivial features of the<br />

case. Imagine the following slight alternative to the ABC choice.<br />

                   Demon predicts G   Demon predicts H   Demon predicts I<br />
Agent chooses G                4000               5000               5000<br />
Agent chooses H                5000               1000               5000<br />
Agent chooses I                5000                800               4500<br />

The only difference is in the bottom right corner of the table. Since the argument<br />

for A (now an argument for G) only uses the fact that the middle row dominates the<br />

bottom row, and the middle row does indeed still dominate the bottom row, that<br />

argument still goes through. So (ESP) says that in this choice, you should choose<br />

G. But the evidential decision theorist says that, if the demon is good enough, you<br />

should choose I. Since (ESP) is inconsistent with evidential decision theory, it can’t<br />

use evidential decision theory in its defence. 5<br />
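The evidential calculation behind that last claim can be sketched as follows. The assumption that the demon's rare errors split evenly over the other two options is mine, introduced only to make the computation definite; the paper stipulates only that the demon is very probably correct:

```python
def edt_value(payoffs, choice, p):
    """Evidential expected utility of a choice, assuming that,
    conditional on choosing an option, the demon predicts it with
    probability p and errs evenly over the other two options."""
    others = [payoffs[choice][j] for j in range(3) if j != choice]
    return p * payoffs[choice][choice] + (1 - p) / 2 * sum(others)

# The GHI choice: rows G, H, I indexed 0, 1, 2.
ghi = [[4000, 5000, 5000],
       [5000, 1000, 5000],
       [5000,  800, 4500]]

vals = [edt_value(ghi, c, p=0.99) for c in range(3)]
# With a highly reliable demon, I (index 2) maximizes evidential
# expected utility, as the reply claims.
assert max(range(3), key=lambda c: vals[c]) == 2
```

On this error model, I beats G whenever the demon's reliability exceeds roughly 0.81, so any 'good enough' demon vindicates the reply.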

Objection: Imagine these choices, the ABC choice and the DEF choice, are games,<br />

and the demon’s payout is 1 if the prediction is correct, 0 otherwise. Then the Nash<br />

equilibrium in ABC includes A but not C, and the Nash equilibrium in DEF includes<br />

F but not D. This justifies treating the cases differently.<br />

Reply: As with the previous reply, the primary point will be that (ESP) can’t use<br />

theories that it is inconsistent with to defend its strange consequences. But the idea<br />

5 The GHI choice is problematic for (ESP) for another reason. To the extent I have the intuitions<br />

driving (ESP), I also intuit that G is better than H, that H is better than I, and that I is better than G. So in<br />

this case, I think the intuitions behind (ESP) imply intransitivity of better than. That’s another sign that<br />

the intuitions are unreliable, and hence not a source of evidence.



behind the objection is interesting enough to think through. The ABC and DEF<br />

choices are a little complex, so let’s take the same idea but apply it to Newcomb’s<br />

Firebomb. And we’ll assume, contrary to what was stipulated to date, that the demon<br />

won’t get flustered and put no money in any box if a mixed strategy is detected.<br />

Then we’ll have the following game on our hands, where the first number in each<br />

cell represents the agent's payout, and the second number represents the demon's payout.<br />

                          One-Boxing Predicted   Two-Boxing Predicted<br />
                              (PredSafe)             (PredRisk)<br />
Take one box (Safe)          (1000000, 1)           (1000000, 0)<br />
Take two boxes (Risky)       (1001000, 0)              (1000, 1)<br />

There is a Nash equilibrium to this game, but it isn’t one that helps (ESP). The equilibrium<br />

is that the demon plays PredSafe with probability 0.999, and plays PredRisk<br />

with probability 0.001. And the agent plays Safe with probability 0.5, and Risky<br />

with probability 0.5. That is, the agent simply tosses a fair coin to choose. Since<br />

(ESP) is motivated by the thought that there’s only one rational choice here, the<br />

(ESP) theorist must think that playing the mixed strategy that is part of the unique<br />

Nash equilibrium is deeply misguided. If the (ESP) theorist says that, I won’t object,<br />

but I would object if they then turn around and use equilibrium considerations in<br />

defending (ESP), as this objection purports to do.
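The equilibrium described above can be verified by checking both indifference conditions numerically: at the stated strategies each player's pure options have equal expected payoff, which is what makes mixing a best response. A sketch:

```python
# Agent's payoffs in the Newcomb's Firebomb game;
# rows: Safe, Risky; columns: PredSafe, PredRisk.
agent = [[1_000_000, 1_000_000],
         [1_001_000,     1_000]]

q = 0.999  # demon's equilibrium probability of PredSafe

# Against that mix the agent is indifferent between Safe and Risky,
# so tossing a fair coin is a best response...
eu_safe = q * agent[0][0] + (1 - q) * agent[0][1]
eu_risky = q * agent[1][0] + (1 - q) * agent[1][1]
assert abs(eu_safe - eu_risky) < 1e-3

# ...and against a 50/50 agent the demon (paid 1 for a correct
# prediction, 0 otherwise) is indifferent between her predictions,
# so her mix is also a best response. Hence it is an equilibrium.
p_safe = 0.5
assert p_safe * 1 == (1 - p_safe) * 1
```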


Keynes and Wittgenstein<br />

Abstract<br />

Three recent books have argued that Keynes’s philosophy, like Wittgenstein’s,<br />

underwent a radical foundational shift. It is argued that Keynes,<br />

like Wittgenstein, moved from an atomic Cartesian individualism to a<br />

more conventionalist, intersubjective philosophy. It is sometimes argued<br />

this was caused by Wittgenstein’s concurrent conversion. Further, it is<br />

argued that recognising this shift is important for understanding Keynes’s<br />

later economics. In this paper I argue that the evidence adduced<br />

for these theses is insubstantial, and other available evidence contradicts<br />

their claims.<br />

1 Introduction<br />

Three recent books (Davis, 1994; Bateman, 1996; Coates, 1996) have argued that the<br />

philosophy behind Keynes’s later economics (in particular the General Theory) is<br />

closer to Wittgenstein’s post Tractarian theorising than to his early philosophy as expressed<br />

in his Treatise on Probability. 1 If Keynes did follow Wittgenstein in the ways<br />

suggested it would represent a substantial change from his early neoplatonist epistemology.<br />

In this paper I argue that the evidence for this thesis is insubstantial, and<br />

the best explanation of the evidence is that Keynes’s philosophical views remained<br />

substantially unchanged.<br />

There are three reasons for being interested in this question. The first is that it is<br />

worthwhile getting the views of a thinker as important as Keynes right. The second is<br />

that it would be mildly unfortunate for those of us attracted to Keynes’s epistemology<br />

to find out that it was eventually rejected by its creator.2 Most importantly, all parties<br />

agree that Keynes thought his philosophical theories had substantial consequences<br />

for economic theory. It is a little unusual for philosophical theories to have practical<br />

consequences; if one is claimed to, it is worthwhile identifying and evaluating the<br />

claim.<br />

Section 2 examines Bateman’s claim that Keynes abandoned the foundations of<br />

his early theory of probability. Bateman’s arguments turn, it seems, on an equivocation<br />

between different meanings of ‘Platonism’. On some interpretations the arguments<br />

are sound but don't show what Bateman wants; on all others they are unsound.<br />

Section 3 looks at the conventionalist, intersubjective theory of probability Bateman<br />

and Davis claim Keynes adopted after abandoning his early objective theory. As they<br />

express it, the theory's coherence is dubious; I show how it might be made more plausible.<br />

Nevertheless, there is little to show that Keynes adopted it. The only time he<br />

† Unpublished.<br />

1 Davis’s views are also set out in his (1995), and Coates’s to some extent in his (1997), but I will focus<br />

on the more detailed position in their respective books.<br />

2 In the way that, for example, subjective Bayesianism was arguably invented by and eventually rejected<br />

by Ramsey. See Ramsey (1926) and Ramsey (1929).<br />



talks about conventions is in the context of speculative markets and in these contexts<br />

a conventionalist theory will give the same results as an objectivist theory.<br />

Section 4 looks at Coates’s quite different arguments for an influence from Wittgenstein<br />

to Keynes. Part of the problem with Coates’s argument is that the textual<br />

evidence he presents is capable of several readings; indeed competing interpretations<br />

of the pages he uses exist. A bigger problem is that even when he has shown a change<br />

in Keynes’s views occurred, he immediately infers the change was at the foundations<br />

of Keynes’s beliefs. Section 5 notes one rather important point of Wittgenstein’s of<br />

which Keynes seemed to take no notice, leading to an error in the General Theory.<br />

This should cast doubt on the claim that Keynes’s later philosophy, indeed later economics,<br />

was based on theories of Wittgenstein.<br />

2 Bateman’s Case for Change<br />

A brief biographical sketch of Keynes is in order to frame the following discussions,<br />

though I expect most readers are familiar with the broad outlines.3 Keynes arrived<br />

as an undergraduate at Cambridge in 1902 and was based there for the rest of his life.<br />

For the next six years he largely studied philosophy under the influence of Moore<br />

and Russell. In 1907 he (unsuccessfully) submitted his theory of probability as a fellowship<br />

dissertation; this was successfully resubmitted the following year. His plans<br />

to make a book of this were interrupted by work on Indian finance, the war and its<br />

aftermath. It appeared as Treatise on Probability (hereafter, TP) in 1921, after substantial<br />

work on it in 1920. Modern subjectivist theories of probability, generally<br />

known as Bayesian theories, first appeared in critical reviews of this book (e.g. Borel<br />

(1924), Ramsey (1926)). After leaving philosophy for many years, Wittgenstein returned<br />

to Cambridge in 1929, and subsequently had many discussions with Keynes.<br />

In Keynes’s General Theory (hereafter, GT ) of 1936 and in some of the ensuing debate,<br />

Keynes referred to some distinctive elements of the TP, leading some interpreters to<br />

suspect that there was a theoretical link between his early philosophy and his later<br />

economics.<br />

There are two distinctive elements of Keynes’s early theory of probability for<br />

our purposes. The first is its objectivism. Keynes held the probability of p given h<br />

is the degree of reasonable belief in p on evidence h, or, as Carnap (1950) put it, the<br />

degree of confirmation of p by h. These degrees are determined by logic; Keynes held<br />

that there was a partial entailment relation between p and h, of which the ordinary<br />

entailment relation (then thought to have been given its best exposition by Russell<br />

and Whitehead) was just a limiting case. And these relations are Platonic entities;<br />

we discover what they are by perceiving them through our powers of intuition. The<br />

second element is that the degrees may be non-numerical. So if the probability of p<br />

given h is α, we may be able to say α > 0.3, and α < 0.5, but not be able to give any<br />

finer numerical limits. As a corollary, there are now two dimensions of confirmatory<br />

support. Keynes claimed that as well as determining the probability of p given h, we<br />

could determine the ‘weight’ of this probability, where weight measures how much<br />

3 For more details see Skidelsky (1983, 1992) or Moggridge (1992).



evidence we have. The more evidence is in h, the greater the weight. Keynes thought<br />

the distinction between saying that on evidence h, p has a low probability, and saying<br />

that the weight of that probability is low is important for understanding investment<br />

behaviour (GT: Ch. 12).<br />
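These two dimensions of support (a possibly non-numerical degree, expressible only by bounds, plus a weight measuring how much evidence backs it) can be pictured in a small sketch. The representation below is my own illustration, not a formalism found in the Treatise:

```python
from dataclasses import dataclass

@dataclass
class KeynesianProbability:
    """One crude way to model Keynes's non-numerical degrees: an
    interval of admissible values plus a 'weight' recording how
    much evidence backs the judgment. (My illustration only.)"""
    lower: float
    upper: float
    weight: int  # e.g. amount of evidence in h, however counted

# 'We may be able to say a > 0.3 and a < 0.5, but not be able to
# give any finer numerical limits':
p = KeynesianProbability(lower=0.3, upper=0.5, weight=4)

# Acquiring more evidence can raise the weight without narrowing
# the interval; the two dimensions vary independently.
q = KeynesianProbability(lower=0.3, upper=0.5, weight=9)
assert q.weight > p.weight and (q.lower, q.upper) == (p.lower, p.upper)
```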

Bateman and Davis both claim that Keynes gave up this theory for an intersubjective<br />

theory in the GT. I’ll focus on Bateman’s book, largely because the structure<br />

of his argument is more straightforward.4 Bateman sets himself to offer another solution<br />

to ‘das Maynard Keynes problem’, which he describes as follows.<br />

“[Future theorists] will read Treatise on Probability’s account of the objective<br />

nature of probabilities and the way that rational people employ<br />

them, and they will wonder at how this person could have turned around<br />

15 years later and written a book [the GT ] in which irrational people<br />

who base their decisions on social conventions cause mass unemployment<br />

in the capitalist system” (7)<br />

I doubt this is the right thing to say about the GT, but that’s another story. For<br />

now we might simply note that there’s no obvious conflict here. For one thing, if<br />

the people in the TP are rational, and in the GT are irrational, as Bateman allows,<br />

it’s not too surprising they behave differently. More generally, it’s to be expected<br />

(sadly) that normative and descriptive theories are different, and by Bateman’s lights<br />

that should explain the difference between the outlook of the explicitly normative TP<br />

and the at least partially descriptive GT. If the agents in the GT are irrational, that<br />

book cannot but be a purely normative account of rationality. On the other hand, if<br />

the TP were taken to be descriptive and not just normative, if it claimed that people<br />

really conform to its epistemological exhortations, there could be a conflict. I can’t<br />

imagine, however, what the evidence or motivation for that reading could be.<br />

If there were a conflict between the GT and TP, there ought be a greater one between<br />

the ‘rational people’ of the TP and the blatantly irrational leaders in Economic<br />

Consequences of the Peace (ECP). These books were published about 15 months apart,<br />

not 15 years. And the most memorable parts of ECP are the descriptions of the<br />

mental failings of President Wilson, who Lloyd George could ‘bamboozle’ into believing<br />

it was just to crush Germany completely, but not ‘de-bamboozle’ out of this<br />

view when it became necessary. Or maybe we should say there’s a conflict because<br />

the characters in David Hume’s histories do not meet his ethical or epistemological<br />

norms.<br />

If we give up Bateman’s claim that the actors in the GT are irrational, and substitute<br />

the claim that the norms of rationality in the two books differ, then we have a<br />

real conflict. And the most charitable interpretation of Bateman is that this is the conflict<br />

he intends to discuss. At the bottom of page 12 he goes close to saying exactly<br />

this, but then proceeds to support his position with evidence that Keynes changed<br />

4 All page references in sections 2 and 3 (unless otherwise stated) to Bateman 1996. Space considerations<br />

preclude a detailed examination of Davis’s arguments, which are quite different to Bateman’s. However his<br />

conclusions are subject to the same criticisms I make of Bateman’s in section 3, and of Coates’s in section<br />

5.



his position on how rational people actually are. Once we are claiming the change of<br />

view is with regard to norms, evidence of opinion changes about empirical questions<br />

becomes irrelevant. This does mean much of Bateman’s case goes, though not yet all<br />

of it.<br />

The main problem with Bateman’s argument is that it rests on an equivocation<br />

over the use of the term ‘Platonism’. In TP Keynes held that probability relations<br />

are objective, non-natural and part of logic. I’ll use ‘logical’ for the last property.<br />

When Bateman says Keynes believed probability relations were Platonic entities, he is<br />

alternately referring to each of these properties. He seems to explicitly use ‘Platonic’<br />

to mean ‘objective’ on page 30, ‘non-natural’ on page 131, and ‘logical’ on page 123.<br />

But this isn’t the important equivocation.<br />

Say a theory about some entities is a ‘Strong Platonist’ theory if it concords with<br />

all Keynes’s early beliefs: those entities are objective, non-natural and logical. Bateman<br />

wants to conclude that by the time of the GT, Keynes no longer had an objectivist<br />

theory of probability. But showing he no longer held a Strong Platonist view<br />

won’t get that conclusion, because there are three interesting objectivist positions which<br />

are not Strong Platonist. The following names are my own, but they should be helpful.<br />

Carnapian Probability relations are objective, natural and logical. This is what Carnap<br />

held in his 1950.<br />

Gödelism Probability relations are objective, non-natural and non-logical. Gödel<br />

held this view about numbers, hence the name. I’d normally call this position<br />

Platonism, but that name’s under dispute. Indeed I suspect this is what Keynes<br />

means by Platonism in My Early Beliefs (Keynes, 1938b).<br />

Reductionism Probability relations are objective, natural and non-logical. Such positions<br />

don’t have to reduce probability to something else, but they usually<br />

will. Russell held such a position in his 1948.<br />

These categories could apply to other entities, like numbers or moral properties or<br />

colours, but we will be focussing on probability relations here. Say a theory is ‘Weak<br />

Platonist’ if it is Strong Platonist or one of these three types. The most interesting<br />

equivocation in Bateman is using ‘Platonist’ to refer to either Strong or Weak Platonist<br />

positions. He argues that Keynes gave up his early Platonist position. These<br />

arguments are sound if he means Strong Platonist, unsound if he means Weak Platonist.<br />

But if he means Strong Platonist he can’t draw the extra conclusion that Keynes<br />

gave up objectivism about probability relations, which he does in fact draw. So I’ll<br />

examine his arguments under the assumption that he means to show Keynes gave up<br />

Weak Platonism.<br />
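The space of positions just described can be laid out as a small enumeration. This is an illustrative restatement only: the labels are the ones coined above, and the Boolean encoding is my own device, not anything in Keynes or Bateman.

```python
# Each position assigns truth values to three properties of probability
# relations, in the order (objective, natural, logical). The four named
# positions are exactly the objectivist half of the 2x2x2 space.
NAMED_POSITIONS = {
    (True, False, True):  "Strong Platonist",  # objective, non-natural, logical
    (True, True,  True):  "Carnapian",         # objective, natural, logical
    (True, False, False): "Gödelism",          # objective, non-natural, non-logical
    (True, True,  False): "Reductionism",      # objective, natural, non-logical
}

def weak_platonist(objective, natural, logical):
    """A view is Weak Platonist iff it is one of the four named positions,
    i.e. iff it holds probability relations to be objective, whatever it
    says about their naturalness or logicality."""
    return (objective, natural, logical) in NAMED_POSITIONS
```

Laid out this way, the gap in Bateman’s argument is visible at a glance: showing that Keynes vacated the Strong Platonist cell leaves the three other objectivist cells untouched.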

Whatever Bateman means by Keynes’s Platonism, he isn’t very sympathetic to it.<br />

It gets described as ‘obviously flawed’ (4) and ‘fatally flawed’ (17), and is given as the<br />

reason for his work being ignored by ‘early positivists and members of the Vienna<br />

Circle’ (61). Given that the TP is cited extensively, and often approvingly, by Carnap<br />

in his 1950, this last claim is clearly false. Most stunningly, he claims writers committed<br />

to the existence of Platonic entities cannot ‘be considered to be a part of the



analytic tradition’ (39), though he does concede in a footnote that some ‘early analytical<br />

philosophers’ (he gives Frege as an example) were Platonist. Bateman’s paradigm<br />

of philosophy seems to be the logical positivism of Ayer’s Language, Truth and Logic:<br />

“nowhere would one less expect to find metaphysics than in modern analytical philosophy”<br />

(Ayer, 1936, 39).<br />

There is an implicit argument in this derision. Keynes must, so the argument<br />

goes, have given up (Weak) Platonism because no sensible person could believe it.<br />

If anything like this were sound it should apply to Weak Platonism about other<br />

entities. But the history of ‘modern analytical philosophy’ shows that Weak Platonism<br />

(though not under that name) is quite widespread in metaphysical circles.<br />

Modern philosophy includes believers in possible worlds both concrete and ersatz,<br />

in universals and in numbers. All these positions would fall under Weak Platonism.<br />

Even Quine’s ontologically sparse Word and Object was Weak Platonist about<br />

classes, though he probably wouldn’t like the label. So by analogy Weak Platonism<br />

about probability relations isn’t so absurd that we must assume Keynes saw its<br />

flaws.<br />

Bateman’s more important argument rests on direct quotation from Keynes. This argument<br />

is largely undermined by Bateman’s somewhat selective quotation.<br />

There are two sources where Keynes appears to recant some of his early beliefs.<br />

Which early beliefs, and how early these beliefs were, is up for debate. The two are<br />

his 1938 memoir My Early Beliefs (hereafter, MEB), and his 1931 review of Ramsey’s<br />

posthumous Foundations of Mathematics. MEB wasn’t published until 1949, three<br />

years after Keynes’s death, but according to its introduction it is unchanged from the<br />

version Keynes gave as a talk in 1938. In it he largely discusses the influence of Moore,<br />

and particularly Principia Ethica, on his beliefs before the first world war.<br />

There are several connections between Moore’s work and Keynes. The most pertinent<br />

here is that Keynes’s metaphysics of probability in TP is borrowed almost completely<br />

from Moore’s metaphysics of goodness. Not only are probability relations<br />

objective and non-natural, they are simple and unanalysable. These are all attributes<br />

Moore assigns to goodness. The only addition Keynes makes is that his probability<br />

relations are logical. So Moore’s position on goodness is, in our language, Gödelian.<br />

As he says in MEB, Keynes became convinced of Moore’s metaethics, though he<br />

differed with Moore over the implications this had for ethics proper. In particular he<br />

disagreed with Moore’s claim that individuals are morally bound to conform to social<br />

norms. Bateman seems to assume that at any time Keynes’s metaphysics of goodness<br />

and probability will be roughly the same, and with the exception of questions about<br />

their logical status, this seems a safe enough assumption.<br />

Bateman quotes Keynes saying that his, and his friends’, belief in Moore’s metaethics<br />

was ‘a religion, some sort of relation of neo-platonism’ (Keynes, 1938b, 438).<br />

This is part of the evidence that Keynes meant what I’m calling Gödelism by ‘Platonism’.<br />

Not only does he use it to describe Moore’s position, but comparing Platonism<br />

with religion would be quite apt if he intends it to involve a commitment to objective,<br />

non-natural entities. The important point to note is that he is using ‘religion’ to<br />

include his metaethics, a point Bateman also makes, though it probably also includes<br />

some broad ethical generalisations. Bateman then describes the following paragraph



as removing ‘any doubt that [Keynes] had thrown over his youthful Platonism as<br />

untenable’. (40)<br />

Thus we were brought up – with Plato’s absorption in the good in itself,<br />

with a scholasticism which outdid St. Thomas, in calvinistic withdrawal<br />

from the pleasures and successes of Vanity Fair, and oppressed with all<br />

the sorrows of Werther. It did not pervert us from laughing most of the<br />

time and we enjoyed supreme self-confidence, superiority and contempt<br />

towards all the rest of the unconverted world. But it was hardly a state<br />

of mind which a grown-up person in his senses could sustain literally.<br />

(Keynes, 1938b, 442).<br />

As it stands, perhaps the last sentence signals a change in metaphysical beliefs, as opposed<br />

to, say, a change in the importance of pleasure-seeking. In any case the following<br />

paragraph (which Bateman neglects to quote) shows such an interpretation to be mistaken.<br />

It seems to me looking back, that this religion of ours was a very good<br />

one to grow up under. It remains nearer the truth than any other I know,<br />

with less extraneous matter and nothing to be ashamed of ... It was a<br />

purer, sweeter air than Freud cum Marx. It is still my religion under the<br />

surface. (Keynes, 1938b, 442).<br />

So was Keynes confessing to ‘a state of mind which a grown-up person in his senses<br />

couldn’t sustain literally’? No; his ‘religion’ which he held onto was a very broad,<br />

abstract doctrine. It needed supplementation with a general ethical view, to wit<br />

an affirmative answer to one of Moore’s ‘open questions’. And then it needed some<br />

bridging principles to convert those ethics into moral conduct in the world as we<br />

find it. His early position included all these, and it seems it was in effect his early<br />

‘bridging principles’ he mocks in the above quote. These relied, the memoir makes<br />

clear throughout, on an excessively optimistic view of human nature, so he thought<br />

in effect that he could prevent wrong by simply proving to its perpetrators that they<br />

were wrong. Now giving up one’s bridging principles doesn’t entail abandonment of<br />

a general ethical view, let alone one’s metaethics. Indeed, let alone one’s metaphysics<br />

of probability! And as the last quote makes clear, Keynes was quite content with the<br />

most general, most abstract parts of his early belief. If this were all Bateman had to<br />

go on it wouldn’t even show Keynes had abandoned Strong Platonism. 5<br />

There is more to Bateman’s case. In Keynes’s review 6 of Ramsey (1931), he recanted<br />

on some of his theory of probability. This is quite important to the debate, so<br />

I’ll quote the relevant section at some length.<br />

5 The above points are similar in all substantial respects to those made by O’Donnell (1991) in response<br />

to an earlier version of Bateman’s account.<br />

6 This is often mistakenly referred to as an obituary in the literature, e.g. (Coates, 1996, 139).



Ramsey argues, as against the view which I had put forward, that probability<br />

is concerned not with objective relations between propositions but<br />

(in some sense) with degrees of belief, and he succeeds in showing that<br />

the calculus of probabilities simply amounts to a set of rules for ensuring<br />

that the system of degrees of belief which we hold shall be a consistent<br />

system. Thus the calculus of probability belongs to formal logic. But the<br />

basis of our degrees of belief – or the a priori probabilities, as they used to<br />

be called – is part of our human outfit, perhaps given us merely by natural<br />

selection, analogous to our perceptions and our memories rather than<br />

to formal logic. So far I yield to Ramsey – I think he is right. But in attempting<br />

to distinguish ‘rational’ degrees of belief from belief in general<br />

he was not yet, I think, quite successful. It is not getting to the bottom<br />

of the principle of induction to merely say it is a useful mental habit.<br />

(Keynes, 1931, 338-339).<br />

Tellingly, Bateman neglects to quote the final two sentences. I think there is an ambiguity<br />

here, turning on the scope of the ‘so far’ in the fourth sentence. If it covers the<br />

whole section quoted, it does amount to a wholesale recantation of Keynes’s theory,<br />

and this is Bateman’s interpretation. But if we take the first sentence, or at least the<br />

first clause, as being outside its scope it does not. And there are two reasons for doing<br />

this. First, it seems inconsistent with Keynes’s later reliance on the TP in parts of the<br />

GT, as (O’Donnell, 1989, Ch. 6) has stressed. Secondly, it is inconsistent with Keynes’s<br />

complaint that on Ramsey’s view induction is merely a ‘useful habit’. If Keynes<br />

had become a full-scale subjectivist, he ought have realised that patterns of reasoning<br />

could only possibly be valid (if deductive) or useful (otherwise). Since he still thought<br />

there must be something more, he seems to believe an objectivist theory is correct,<br />

though by now he is probably quite unsure as to its precise form. So in effect what<br />

Keynes does in this paragraph is summarise Ramsey’s view, note the details he agrees<br />

with (that probability relations aren’t logical), and then list the details he disagrees<br />

with (that probability relations aren’t objective).<br />
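The Ramseyan idea Keynes concedes here, that the calculus of probabilities is merely a set of consistency rules on degrees of belief, can be glossed with a small sketch. This is my illustration using two standard constraints; nothing in it is drawn from Keynes’s or Ramsey’s texts.

```python
# The calculus doesn't say which degrees of belief to hold; it only rules
# out combinations that fail to cohere. Two sample constraints: the
# negation rule and the addition (inclusion-exclusion) rule.
def coherent(b, tol=1e-9):
    """b maps the labels 'A', 'not-A', 'B', 'A-and-B', 'A-or-B' to
    degrees of belief; return True iff both constraints are met."""
    negation_ok = abs(b['A'] + b['not-A'] - 1) < tol
    addition_ok = abs(b['A-or-B'] - (b['A'] + b['B'] - b['A-and-B'])) < tol
    return negation_ok and addition_ok
```

On the full subjectivist reading, rules like these exhaust what is objective about probability; Keynes’s complaint that induction is not merely a ‘useful mental habit’ is precisely that they do not.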

There is more evidence that all this quote represents is a recantation of the view<br />

that probability relations are logical. Earlier in that review he notes how little formal<br />

logic is now believed to achieve compared with its promise at the start of the century.<br />

The first impression conveyed by the work of Russell was that the field<br />

of formal logic was enormously extended. The gradual perfection of the<br />

formal treatment at the hands of himself, of Wittgenstein and of Ramsey<br />

had been, however, gradually to empty it of content and to reduce it<br />

more and more to mere dry bones, until finally it seemed to exclude not<br />

only all experience, but most of the principles, usually reckoned logical,<br />

of reasonable thought. (Keynes, 1931, 338).<br />

More speculatively, I suggest Keynes’s change of mind here (for this shows he had<br />

surely given up the view that probability relations are logical) might have been influenced<br />

by Gödel’s incompleteness theorem. In the TP Keynes had followed Russell in saying<br />

mathematics is part of logic (Keynes, 1921, 293n). That view was often held to



be threatened by Gödel’s proof that there are mathematical truths which can’t be<br />

proven, and that the consistency of mathematics can’t be proven within mathematics itself. But no one suggested<br />

this meant mathematics is merely subjective, or that mathematical Platonism<br />

was therefore untenable. If this response to Gödel is right, it shows there are objective<br />

standards of reasoning (i.e. mathematical standards) that are not part of logic. This<br />

makes it less of a leap to say there are objective principles of reasonable thought that<br />

are not ‘logical’ in the narrow sense we’ve been using.<br />

So would Keynes have known of Gödel’s theorem when he wrote this review? I<br />

think it’s possible, though some more research is needed. Keynes’s review was published<br />

in The New Statesman and Nation on October 3, 1931. This was a weekly<br />

political and literary magazine of which Keynes was chairman. So we can safely<br />

conclude the piece was drafted not long before publication. Gödel’s theorem was<br />

first announced at a conference in Vienna in September 1930 (Wang, 1987), and was<br />

published in early 1931. While Keynes would certainly not have read Gödel’s paper,<br />

its content could easily have reached him through Cambridge in that 12 month<br />

‘window’. Since the explicit aim of Gödel’s paper was to show the incompleteness<br />

of Principia Mathematica, it would have immediately had some effect in Cambridge,<br />

both in philosophy and mathematics. Given this evidence, the probability Keynes<br />

knew of Gödel’s theorem when he wrote the review of Ramsey still mightn’t be<br />

greater than one-half, but it mightn’t be less than that either.<br />

In sum, I conclude that Keynes had given up his earlier belief that all rules of<br />

reasonable belief are logical. This is what he yields to Ramsey. This concession<br />

would be supported by the ‘drying up’ of formal logic that Keynes notes, perhaps<br />

most dramatically expressed in Gödel’s theorem. But he hadn’t given up the belief<br />

that there are objective rules which are extra-logical, and given the identification of<br />

probability with degree of reasonable belief, he had no reason to reject Gödelism or<br />

Reductionism about probability. Hence Bateman’s argument that he rejected objectivist<br />

theories of probability fails.<br />

3 Conventionalism<br />

Bateman and Davis each argue that Keynes adopted a conventionalist, intersubjectivist<br />

theory of probability. In Davis this is explicitly attributed to Wittgenstein’s<br />

influence; in Bateman it is less clear what the source of this idea is. It isn’t<br />

obvious what they mean by an intersubjective theory. In particular, it isn’t clear<br />

whether they mean this to be an empirical or a normative theory; whether Keynes<br />

is claiming that we ought set our degrees of belief by convention or that we in general<br />

do. Since the empirical theory would be consistent with his objectivist norms,<br />

and they stress the change in his views, I conclude they are claiming this is a new<br />

normative view. According to this view being reasonable is analysed as conforming<br />

to conventions. This is not a very standard epistemological position, but something<br />

similar is often endorsed in ethics. Bateman marshals the evidence that Keynes moves



from an objectivist to a conventionalist position in ethics as evidence for this epistemological<br />

shift, but this doesn’t seem of overwhelming significance. 7<br />

Here’s the closest Bateman gets to a definition of what he means by an intersubjective<br />

theory of probability.<br />

When probabilities are formed according to group norms, they are referred<br />

to as intersubjective probabilities ... I take it to be the case that in<br />

a world of subjective probabilities some individuals will form their own<br />

estimates and others will form them on the basis of group norms (50n).<br />

This makes it look very much like an empirical theory, as it refers to how people<br />

actually form beliefs, not how they ought. So his intersubjectivism looks perfectly<br />

consistent with Keynes’s objectivism. I am completely baffled by the ‘world of subjective<br />

probabilities’. I wonder what such a world looks like, and how it compares to<br />

our world of tables, chairs and stock markets?<br />

Fortunately there is a theory that does the work Bateman needs. Ayer (1936) rejects<br />

orthodox subjectivism about probability on the grounds that it doesn’t allow<br />

people to have mistaken probabilistic beliefs. But he can’t admit Keynesian probability<br />

relations into his sparse ontology. The solution he adopts is to define probability<br />

as degree of rational belief, but with this caveat.<br />

Here we may repeat that the rationality of a belief is defined, not by<br />

reference to any absolute standard, but by reference to part of our own<br />

actual practice (Ayer, 1936, 101).<br />

The ‘our’ is a bit ambiguous; interpreting it to refer to the community doesn’t do<br />

violence to the text, though it is just as plausible that it refers to a particular agent.<br />

The ‘part of our practice’ referred to is just our general rules for belief formation.<br />

These aren’t justified by an absolute standard; they are justified by the fact they are<br />

our rules, and presumably by their generality. Given Bateman’s views about metaphysics,<br />

it seems quite reasonable to suppose he’d follow Ayer on this point.<br />

The evidence Keynes adopted such a position is usually taken to be some passages<br />

from the GT and the 1937 QJE paper in which he replied to some attacks on that<br />

book. Here are the key points from the two quotes Bateman uses to support his view.<br />

In practice we have agreed to fall back on what is, in truth, a convention.<br />

The essence of this convention – though it does not, of course, work<br />

out quite so simply – lies in assuming that the existing state of affairs<br />

will continue indefinitely, except in so far as we have specific reasons for<br />

expecting a change (GT : 152).<br />

7 If Keynes had adopted a framework which implied a tight connection between epistemological and<br />

ethical norms, such as a form of utilitarianism that stressed maximisation of expected utility, this would<br />

be important, since he couldn’t change ethics and keep his epistemology. But such frameworks aren’t<br />

compulsory, and given the vehemence with which Keynes denounced utilitarianism (Keynes, 1938b, 445)<br />

it seems he didn’t adopt one.



How do we manage in such circumstances to behave in a manner which<br />

saves our faces as rational, economic men? We have devised for the purpose<br />

a variety of techniques, of which much the most important are the<br />

three following: ...<br />

(3) Knowing that our own individual judgement is worthless, we endeavour<br />

to fall back on the judgement of the rest of the world which<br />

is perhaps better informed. That is, we endeavour to conform with the<br />

behaviour of the majority or the average. The psychology of a society<br />

of individuals each of whom is endeavouring to copy the others leads<br />

to what we may strictly term a conventional judgement (Keynes, 1937b,<br />

115).<br />

There are two problems with using this evidence the way Bateman does. The first is<br />

the old one that they seem expressly directed to empirical questions, though perhaps<br />

appearances are deceptive here. The more important one is that Keynes is attempting<br />

to answer a very specific question with these passages; in ignorance of the question<br />

we can easily misinterpret the answer.<br />

How much ought one pay for a share in company X? Well, if one intends to hold<br />

the share come what may, all that matters is the expected prospective yield of X’s<br />

shares, appropriately discounted, as compared to the potential yield of that money<br />

in other uses. But as Keynes repeatedly stresses (GT : 149; Keynes, 1937b, 114) we<br />

have no basis for forming such expectations. Were this the only reason for investing,<br />

then purely commercial investment might never happen.<br />

There is another motivation for investment, one that avoids this problem. We<br />

might buy a share in X today on the hope that we will sell it next week (or next month<br />

or perhaps next year) for more than we paid. To judge whether such a purchase will<br />

be profitable, we need a theory about how the price next week will be determined.<br />

Presumably those buyers and sellers will be making much the same evaluations that<br />

we are. That is, they’ll be thinking about how much other people think X is worth.<br />

We have reached the third degree where we devote our intelligences to anticipating<br />

what average opinion expects the average opinion to be. And<br />

there are some, I believe, who practice the fourth, fifth and higher degrees<br />

(GT : 156).<br />
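The regress Keynes describes can be caricatured as a level-k iteration. This is a toy model of my own; the blending rule and the weight are assumptions, not anything in the GT. Degree 0 prices the share on private long-term expectation, and each higher degree prices it by anticipating the degree below, which drags the valuation toward the conventional figure.

```python
def valuation(private_estimate, convention, degree, weight=0.5):
    """Value a share at the given degree of anticipation. Degree 0 is
    private judgement; each further degree blends the previous valuation
    with the conventional (average-opinion) figure."""
    v = private_estimate
    for _ in range(degree):
        # anticipate what the next-lower degree will pay: a blend of its
        # valuation and the conventional figure
        v = (1 - weight) * v + weight * convention
    return v
```

With a private estimate of 30 and a convention of 20, the valuation moves from 30 at degree 0 to 25 at degree 1, and approaches 20 as the degree rises: at the higher degrees, convention swamps private judgement entirely.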

There is simply no solution to this except to fall back on convention. That is, we<br />

are forced into a conventionalist theory of value, at least of investment goods. But<br />

this doesn’t mean that we have a conventionalist epistemology. On the contrary,<br />

it means that our ordinary (objectivist) empiricism is unimpeded. For the question<br />

that Keynes has us solve by reference to convention is: What is the value of X? This<br />

is equivalent to, what will the value of X be, or again, to what are the conventional<br />

beliefs about X’s value? We need to answer a question about the state of conventions,<br />

and as good empiricists we answer it by observing conventions.<br />

An analogy may help here. Here’s something that Hempel believed: to gain<br />

rational beliefs about the colour of ravens, one has to look at some birds. Did this



mean he had an ornithological epistemology? No; he had an empiricist epistemology<br />

which when applied to a question about ravens issued the directive: Observe ravens!<br />

Similarly Keynes’s belief that to answer questions about value, i.e. about conventions,<br />

one has to look at conventions, does not imply a conventionalist epistemology. It just<br />

means he has an empiricist epistemology which when applied to a question about<br />

conventions issues the directive: Observe conventions!<br />

There might be another motivation for using conventions, again consistent with<br />

Keynes’s objectivist empiricism. Sometimes we may not have made enough observations,<br />

or may not have the mental power to convert these to a theory. So we’ll piggyback<br />

on someone else’s observations or mental powers. (This seems to be what’s<br />

going on in the quote from Keynes (1937b).) Or even better, we’ll piggyback on<br />

everyone’s work, the conventions. To see how this is consistent with an objectivist<br />

epistemology (if it isn’t already obvious) consider another analogy.<br />

What is the best way to work out the derivative of a certain function? Unless<br />

your memory of high-school calculus is clear, the simplest solution will be to consult<br />

an authority. Let’s assume for the sake of argument that the easiest authorities to<br />

consult are maths texts. It seems like the rational thing to do is to act as if the method<br />

advanced by the maths texts is the correct method. Does this mean that you have<br />

adopted some kind of authoritarian metaphysics of mathematics, where what it is<br />

for something to be correct is for it to be asserted by an authority? Not at all. It<br />

is assumed that what the textbook says is correct, but the authoritarian has to make<br />

the extra claim that the answer is correct because it is in the textbook. This is false;<br />

that answer is in the textbook because it is correct. In sum, the authoritarian gets the<br />

direction of fit wrong.<br />

Similarly in the ‘piggyback’ cases the intersubjectivist gets the direction of fit<br />

wrong. We are accepting that if p has emerged as ‘average opinion’, then it is reasonable<br />

to believe p. But we aren’t saying, with the intersubjectivist, that it is reasonable to believe<br />

p because p is average opinion; rather we are assuming p is average opinion because it<br />

is reasonable to believe p.<br />

The evidence so far suggests Keynes’s statements are consistent with his denying<br />

intersubjectivism. We might be able to go further and show they are inconsistent<br />

with his adopting that theory. After the quote on GT page 152 he spends the next<br />

page or so defending the use of conventions here. The defence is, in part, that decisions<br />

made in accord with conventions are reversible in the near future, so they won’t<br />

lead to great loss. If he really were an intersubjectivist, the use of conventions would<br />

either not need defending, or could be defended by general philosophical principles.<br />

Secondly, there is this quote which in context seems inconsistent with adopting a<br />

conventionalist view.<br />

For it is not sensible to pay 25 for an investment of which you believe the<br />

prospective yield to justify a value of 30, if you also believe that the market<br />

will value it at 20 three months hence (GT : 155).<br />

The context is that he is discussing why reasonable professional investors base their<br />

valuations on convention rather than on long-term expectation. Hence the ‘you’



in the quote is assumed to be reasonable. Hence it is reasonable, Keynes thinks, to<br />

believe that an investment’s prospective yield justifies a value of 30, and that conventional<br />

wisdom is that its prospective yield is much lower. But if all reasonable beliefs<br />

were formed by accordance with conventional wisdom, this would be inconsistent.<br />

Hence Keynes cannot have adopted a conventionalist epistemology.<br />

4 Keynes and Vagueness<br />

What a terrible state Keynes interpretation has got into! From the same few pages<br />

(the opening of GT Ch. 4) Coates (1996) reads into Keynes a preference for basing<br />

theory on vague predicates, Bradford and Harcourt (1997) read Keynes as denying<br />

that predicates which are unavoidably vague can be used in theory, and O’Donnell<br />

(1997) sees Keynes as holding a position in between these.<br />

Coates’s theory is that Keynes abandoned the narrowly analytic foundations of<br />

his early philosophy because of the problems of vagueness that were pointed out to<br />

him by Wittgenstein. He has Keynes in 1936 adopting a middle way between analytic<br />

and Continental philosophy, which gives up on analysis because of unavoidable<br />

vagueness, but which doesn’t follow Derrida in saying all that’s left after analysis is<br />

‘poetry’. He also wants to argue for the philosophical importance of this theory. In<br />

this essay I’ll focus on his exegetical theories, though there are concerns to be raised<br />

about his philosophy.<br />

As in Bateman, analytic philosophy gets very narrowly defined in Coates 8 . Here<br />

it includes the claim that truth-value gaps are not allowed (xii). This excludes from<br />

the canon some of the most important papers in analytical philosophy of the last<br />

few decades (e.g. Dummett (1959), van Fraassen (1966), Fine (1975b), Kripke (1975)),<br />

and hence must be a mistake. To use one of Coates’s favourite terms, ‘analytic philosophy’<br />

is a family resemblance concept, not to be so narrowly cast. In particular,<br />

as we’ll see, analytic philosophers don’t have to follow Frege in being nihilist about<br />

vagueness.<br />

Even more bizarrely, Coates defines empiricism so it includes both psychologism<br />

in logic and utilitarianism in ethics (72-3). Since Ayer (1936) opposes each of these<br />

doctrines, does that makes Ayer an anti-empiricist? If Ayer is a paradigm empiricist<br />

(as seems plausible) Keynes’s rejection of psychologism and utilitarianism can hardly<br />

count as proof of opposition to empiricism, as Coates wants it to do. Apart from the<br />

fact that Mill believed all three, there is no interesting connection between empiricism,<br />

psychologism and utilitarianism.<br />

Coates’s story is that in the GT Keynes allowed both his units and his definitions<br />

to be quantitatively vague so as to follow natural language. This constitutes a new<br />

‘philosophy of social science’ (85) that is based on the ordinary language philosophy<br />

of the later Wittgenstein. There are several problems with this story. The first is<br />

that most of Coates’s evidence comes from obiter dicta in early drafts of the GT ; by<br />

the time the book was finished most of these suggestions had been expunged. The second<br />

is that it’s quite possible to accept vagueness within a highly analytic philosophical<br />

8 All page references in this section (unless otherwise stated) are to Coates (1996).



framework. The third is that the way Keynes uses vagueness is only consistent within<br />

such a framework.<br />

The first part of the story focuses on how Keynes derided his predecessors for<br />

using concepts that were vague as if they were precise. Coates adduces evidence to<br />

show Keynes in this context used ‘vague’ as a synonym for ‘quantitatively inexact’.<br />

The most important concept misused by Keynes’s predecessors in this way was the<br />

general price level. Of course this was hardly a new point in the GT ; Keynes (1909)<br />

says similar things. Coates claims that Keynes’s reaction to this misuse was to ‘criticise<br />

formal methods’ (83), and to conclude that ‘economic analysis can do without<br />

the “mock precision” of formal methods’ (85). This is all hard to square with Keynes’s<br />

explicit comments.<br />

The well-known, but unavoidable, element of vagueness which admittedly<br />

attends the concept of the general price-level makes this term very<br />

unsatisfactory for the purposes of a causal analysis, which ought to be<br />

exact (GT : 39).<br />

Further, Keynes then defends his choice of units of quantity (quantity of money-value<br />

and quantities of employment) on the grounds that they are not quantitatively<br />

vague. Coates is surely right when he says that Keynes’s analysis of vagueness here is<br />

‘not very controversial’; although it is perhaps misleading to say it is controversial at<br />

all.<br />

The second, and central, part of the story focuses on how Keynes allowed his<br />

definitions to be vague, but defended this on the grounds of conformity to ordinary<br />

language. This ‘introduces what is distinctive about his later philosophy of the social<br />

sciences’ (85). The bulk of Coates’s evidence comes from Keynes’s commentary on<br />

his own definitions; usually this includes a claim that he has captured the ordinary<br />

usage of the term. Since he uses ‘common usage’ to explicitly mean ‘usage amongst<br />

economists’ (GT : 79) the support these dicta give to Coates’s theory might be minimal,<br />

but we’ll ignore that complication. The real problem is that this commentary<br />

extends to cases where he has changed his mind over the best definition. For example,<br />

Coates quotes Keynes writing in a draft of the GT about the definition of income.<br />

But finally I have come to the conclusion that the use of language, which<br />

is most convenient on a balance of considerations and involves the least<br />

departure from current usage, is to call the actual sale proceeds income<br />

and the present value of the expected sale proceeds effective demand (Keynes,<br />

1934, 425).<br />

Coates comments:<br />

By choosing definitions on the ground that they correspond with actual<br />

usage Keynes was formulating an ordinary language social science,<br />

one that bears a resemblance to those argued for by philosophers of<br />

hermeneutics (90).



He then goes on to note some comments from the GT apparently about this definition,<br />

and how it relates to common usage. The problem is that this isn’t the definition<br />

of income Keynes settles on in the GT. There he defines income of an agent as “the<br />

excess of the value of his finished output sold during the period over the prime cost”<br />

(GT : 54), and net income (which Coates fails to distinguish) as income less supplementary<br />

cost. Given that at every stage Keynes justified his current definitions by<br />

their (alleged) conformity with common usage, even when he changed definitions, it<br />

is hard to believe that these justifications are more than rhetorical flourishes. After<br />

all, who will deny that ceteris paribus technical definitions should follow ordinary<br />

usage?<br />

If Keynes’s early choice of definitions showed an adherence to a ‘philosophy of<br />

hermeneutics’, perhaps his abandonment of those definitions constitutes abandonment<br />

of that philosophy. One change doesn’t necessarily mean a change in foundations,<br />

so it is worth looking at those foundations.<br />

As I mentioned, allowing that vagueness exists doesn’t mean abandoning the Russellian<br />

program of giving a precise analysis of language. There are two reasons for<br />

this. First, contra Wittgenstein it is possible to analyse vague terms. Secondly, there<br />

are semantic programs very much in the spirit of Russell which allow vagueness. I’ll<br />

deal with these in order.<br />

In Philosophical Investigations, Wittgenstein argued that the existence of vagueness<br />

frustrated the program of analysis (ss. 60, 71). The argument presumably is that<br />

analyses are precise, and hence they cannot accurately capture vague terms. (See also<br />

his comments about the impossibility of drawing the boundaries of ‘game’ in s. 68.)<br />

This is a simple philosophical mistake. We can easily give an analysis of a vague term;<br />

we just have to make the analysans vague in exactly the same way as the analysandum.<br />

To see this in action, consider that paradigm of modern philosophy, Lewis’s analysis<br />

of subjunctive conditionals or counterfactuals. Lewis (1973b) says that the conditional<br />

‘If p were the case, it would be that q’ is true iff q is true in the most similar<br />

possible world in which p. He considers the objection that ‘most similar’ is completely<br />

vague and imprecise.<br />

Imprecise it may be; but that is all to the good. Counterfactuals are<br />

imprecise too. Two imprecise concepts may be rigidly fastened to one<br />

another, swaying together rather than separately, and we can hope to be<br />

precise about their connection (Lewis, 1973b).<br />

Whatever the fate of Lewis’s theory, his methodology seems incontestable. Wittgenstein’s<br />

claim that analysis must be abandoned because of vagueness is refuted by<br />

these observations of Lewis. Hence Coates’s claim that allowing vagueness (as Keynes<br />

does) means giving up on analytic philosophy is mistaken.<br />

The second problem with Coates’s comments on vagueness is that he hasn’t allowed<br />

for what I’ll call ‘orthodox’ responses to vagueness. The aim of the early analytics<br />

drifted between giving a precise model for natural language, and replacing natural<br />

language with an artificial precise language. The latter, claims Coates, ought to be



abandoned because of the pragmatic virtues of a vague language. Let’s agree to that;<br />

can the spirit of the early aim of giving a precise analysis of language be preserved?<br />

Two approaches which seem to meet this requirement are the supervaluational<br />

and epistemic theories of vagueness. The supervaluationist says language can’t be<br />

represented by a precise classical model, but it can be represented by a set of such<br />

models. The epistemic theorist says that there is a precise model of language, but<br />

we cannot know what it is. 9 Call a theorist who adopts one of these approaches ‘orthodox’.<br />

The name is chosen because supporters and critics of orthodoxy agree that<br />

these positions represent attempts to minimise deviations from the classical, Russellian<br />

program.<br />

Clearly Keynes did not explicitly adopt an orthodox theory of vagueness. Williamson<br />

(1994) attempts to trace the epistemic theory back to the Stoics, but the general<br />

consensus is that these approaches were all but unknown until recently. What I want<br />

to argue is that Keynes’s intuitions are clearly with orthodoxy. Coates, on the other<br />

hand, wants to place Keynes in a tradition that is critical of classical analysis, and<br />

perhaps finds its best modern expression in the exponents of fuzzy logics. To see that this<br />

is wrong, note that the following beliefs are all in the GT.<br />

(1) All goods are (definitely) investment goods or consumption goods.<br />

(2) For some goods it is vague whether they are an investment or consumption<br />

good. (GT : 61)<br />

(3) The yield of an investment, q, is vague.<br />

(4) The carrying cost of an investment, c, is vague.<br />

(5) The net yield of an investment, q - c, can be precisely determined. (GT : 226)<br />

Since Keynes believed (1) to (5) we can safely conclude he believed they were consistent.<br />

More importantly, since the GT has been analysed more thoroughly than any<br />

other economic text written this century, and no one has criticised the consistency<br />

of (1) to (5), it seems many people agree with him. Hence if conformity with pretheoretic<br />

intuitions of consistency is a central desideratum of a theory of vagueness,<br />

we can discard any theory that does not say they are consistent. However, of those<br />

theories on the market, only orthodox theories meet this requirement. It might also<br />

be noted that (1) and (2) are repeated in just about every introductory macro textbook,<br />

again without, to my knowledge, any question of their consistency.<br />

We can quickly see that these propositions are all consistent on either orthodox<br />

theory. The supervaluationist says there is a set of classical models for a language;<br />

a sentence is true iff it is true on all models, false iff it is false on all models, and<br />

truth-valueless otherwise. Vague terms have different meanings on different models.<br />

So for a particular good, say a car, about which it is vague whether it is an investment<br />

or consumption good, the supervaluationist says it is an investment good on some<br />

models and a consumption good on others. So (2) is satisfied; however on all models<br />

it, like everything else, is either a consumption or investment good, so (1) is satisfied.<br />

Similarly, because it is vague whether some costs should be counted as deductions<br />

9 See Williamson (1994) for the best epistemic account, Fine (1975b) and Keefe (2000) for the best<br />

supervaluationist accounts.



from the yield of an investment or increments to its carrying cost, the values of q and<br />

c will be different on different models. Hence (3) and (4) are true, but q - c is constant<br />

across models, 10 so (5) is true.<br />
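For readers who prefer to see the supervaluationist verdict on (3)–(5) worked out, here is a minimal sketch; the two precisifications and all of the numbers are invented purely for illustration, not drawn from Keynes.

```python
# Supervaluationist sketch: a sentence is (super)true iff it holds on every
# admissible precisification. The precisifications below settle where one
# borderline cost falls: deducted from the yield q, or added to the
# carrying cost c. All figures are hypothetical.
GROSS_YIELD = 10.0
BASE_CARRYING_COST = 2.0
BORDERLINE_COST = 1.0

models = []
for counts_against_yield in (True, False):
    q = GROSS_YIELD - (BORDERLINE_COST if counts_against_yield else 0.0)
    c = BASE_CARRYING_COST + (0.0 if counts_against_yield else BORDERLINE_COST)
    models.append({"q": q, "c": c})

def supertrue(pred):
    """True iff the predicate holds on every precisification."""
    return all(pred(m) for m in models)

# (3) and (4): q and c are vague -- they vary across precisifications ...
q_vague = len({m["q"] for m in models}) > 1
c_vague = len({m["c"] for m in models}) > 1
# (5): ... yet q - c is supertrue to equal 7.0, constant on every model.
net_precise = supertrue(lambda m: m["q"] - m["c"] == 7.0)

print(q_vague, c_vague, net_precise)  # True True True
```

The same pattern makes (1) and (2) consistent: on each model the borderline good is classified one way or the other, so the disjunction is supertrue even though neither disjunct is.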

The epistemic theorist says that vagueness is just ignorance. As we can know<br />

that a car is an investment or consumption good without knowing which, (1) and (2)<br />

can be satisfied. Similarly, since we can know that a cost is incurred without knowing<br />

how to account for it in Keynes’s terms, we can know q - c precisely without knowing<br />

q or c precisely, and hence (3) to (5) can be satisfied.<br />

The heterodox theorist has a harder time. The theorist who, following Russell<br />

(1923), says that vagueness is infectious (if a part is vague, so is the whole), will deny<br />

that (1) and (2) can be true together. Unless it’s definitely true that a car is an investment good,<br />

or definitely true that it’s a consumption good, it can’t be definitely true that it’s one<br />

or the other. This also seems to be the position taken by Wittgenstein (1953).<br />

The nihilist about vagueness, who follows Frege in saying vague terms can’t be<br />

used coherently, similarly can’t endorse both (1) and (2). On that view, if p and q<br />

are both vague, then their disjunction can’t be true. Arguably, on this position the<br />

disjunction of p with anything can't be true, as it is nonsense, but we don’t need<br />

anything that strong. 11<br />

The extra truth-values approach to vagueness (of which fuzzy logic is a variant)<br />

also can’t make (1) and (2) consistent. On any such approach (whether 3-valued, n-valued<br />

or continuum-valued) the degree of truth of a disjunction can’t be higher than<br />

the degree of truth of each of the disjuncts. So if neither ‘This is an investment’ nor<br />

‘This is a consumption good’ is absolutely true (true to degree 1), ‘This is an investment<br />

or consumption good’ can’t be absolutely true. Yet this is just what Keynes<br />

asserted to be possible, and what several generations of readers have found perfectly<br />

consistent. I have only remarked on the problem the consistency of (1) and (2)<br />

poses for heterodox theories. These remarks apply, mutatis mutandis, to (3), (4) and<br />

(5), but as theorists rarely discuss quantitative vagueness (as opposed to truth-value<br />

vagueness) these cases involve a bit more speculation as to what heterodoxy says.<br />
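The point about disjunction can be made concrete. On the standard degree-functional semantics, the degree of a disjunction is the maximum of the degrees of its disjuncts; the degrees assigned below are invented for illustration.

```python
# Fuzzy-logic sketch: on the standard (max) semantics, a disjunction is
# never truer than its truest disjunct, so (1) fails for borderline goods.

def disj(p, q):
    """Standard fuzzy disjunction: degree of 'p or q' is max(p, q)."""
    return max(p, q)

# A borderline good: neither classification is true to degree 1.
investment_good = 0.6   # hypothetical degree of 'this is an investment good'
consumption_good = 0.4  # hypothetical degree of 'this is a consumption good'

either = disj(investment_good, consumption_good)
print(either)  # 0.6 -- short of the degree 1 that (1) demands
```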

Hence Keynes did not belong to a heterodox tradition vis-à-vis vagueness, and<br />

heterodox theories fail to capture a crucial pre-theoretic intuition about vague terms.<br />

So Coates’s claims that Keynes followed Wittgenstein into heterodoxy here, and that<br />

he ought to have, are both mistaken.<br />

Even if all of the above is mistaken, there remains serious doubt that Keynes had<br />

in mind anything like what Coates attributes to him. Coates makes the chapters on<br />

definitions in the GT into the foundations of a new philosophy, constituting<br />

an important revolution in theory. This is crucial to Coates’s story about the influence<br />

of Wittgenstein on Keynes. But this attribution is totally at odds with Keynes’s<br />

comments on these chapters, comments that not only reveal his attitudes towards his<br />

definitions but also seem a fair commentary on them.<br />

10 A particular cost will either remove an amount from q or add an equal amount to c, depending on<br />

how it is categorised.<br />

11 Compare the logic in Bochvar (1939), where p ∨ q is truth-valueless if p is true and q truth-valueless.<br />

Summaries of this and many other many-valued logics are in Haack (1974).



I have felt that these chapters were a great drag on getting on to the real<br />

business, and would perplex the reader quite unnecessarily with a lot of<br />

points which really do not matter to my proper theme (Keynes to Roy<br />

Harrod, 9 August 1935, quoted in (Keynes, 1971-1989, XIII: 537)).<br />

But the main point I would urge is that all this is not fundamental. Being<br />

clear is fundamental, but the choice of definitions of income and investment<br />

is not (Keynes to Dennis Robertson, 29 January 1935, quoted in<br />

(Keynes, 1971-1989, XIII: 495, italics in original)).<br />

5 Keynes on Rules and Private Language<br />

Had Keynes followed Wittgenstein in the ways suggested by either Bateman or Coates<br />

he would have been led into error. Fortunately he was not tempted. There was, however,<br />

one point on which Keynes clearly did not follow Wittgenstein, and sadly so, for<br />

Wittgenstein was right. If Kripke (1982) is correct and this is the crucial point in the<br />

later Wittgenstein’s thinking, Keynes’s failure to observe it provides strong evidence<br />

that Wittgenstein’s influence on him was at best slight.<br />

Keynes, as we saw above, thought we dealt with uncertainty by assuming that the<br />

future would resemble the present. Call this Keynes’s maxim. But this, points out<br />

Wittgenstein, gets us nowhere. We know that the future will resemble the present;<br />

what we don’t know is how it will do so. Wittgenstein illustrates this with examples<br />

from mathematics and semantics, but we can apply it more broadly.<br />

Say that a particle in a one-dimensional Euclidean space is now at position d,<br />

travelling at velocity v under acceleration a. Assuming things stay the same, where<br />

will the particle be in 1 unit of time? This question simply can’t be answered until we<br />

know in what respect things will ‘stay the same’. If it is in respect of position,<br />

the answer is d, in respect of velocity it is d + v, in respect of acceleration d + v + a/2.<br />

Perhaps our Newtonian intuitions make us prefer the second answer, perhaps not.<br />
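The three readings can be computed directly; the particle’s initial position, velocity and acceleration below are hypothetical values chosen only to show the answers coming apart.

```python
# Three ways a particle's motion can 'stay the same' over one unit of time.

def predict(d, v, a, constant):
    """Predicted position after 1 unit of time, given which quantity is
    assumed to stay the same."""
    if constant == "position":
        return d                 # nothing moves
    if constant == "velocity":
        return d + v             # uniform motion
    if constant == "acceleration":
        return d + v + a / 2     # d + v*t + a*t^2/2 with t = 1
    raise ValueError(constant)

d, v, a = 0.0, 3.0, 2.0
print(predict(d, v, a, "position"))      # 0.0
print(predict(d, v, a, "velocity"))      # 3.0
print(predict(d, v, a, "acceleration"))  # 4.0
```

Each assumption is internally coherent; Keynes’s maxim gives no guidance on which to adopt.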

The same story applies in economics. When we assume things will stay the same,<br />

does that mean we are assuming the unemployment rate or the rate of change of<br />

the unemployment rate to be the same; real growth or nominal growth to be constant?<br />

At the level of the firm, we can ask whether Keynes’s maxim would have us<br />

assume real or nominal profits to be constant, or perhaps the growth rate of real or<br />

nominal profits, or perhaps sales figures (real or nominal, absolute or variation), or<br />

perhaps one of the variables which play a role like acceleration (rate of change of<br />

sales growth)? In some computing firms we might even take some of the logarithmic<br />

variables (growth of logarithm of sales) to be the constant. We can’t in consistency<br />

assume more than one or two of these variables to be unchanged, yet Keynes provides<br />

us with nothing to tell between them.<br />

More importantly, it looks like Keynes hasn’t even seen the problem. The mechanical<br />

example above looks very similar to some of the paradoxes of indifference<br />

(TP: Ch. 4). For example, in von Kries’s cube factory example, we know that a factory<br />

makes cubes with side length between 0 and 2cm. If that’s all we know, what<br />

should we say is the probability that the next cube’s side length will be greater than



1cm? According to Laplace’s principle of indifference we should divide the probabilities<br />

equally between the possibilities, which seems to give an answer of 1/2. However<br />

we could have set out the problem by saying that the volume of cubes produced is<br />

between 0 and 8cm³ and we want to know the probability that the volume of the next<br />

cube is greater than 1cm³. Now the answer (to the same problem) looks to be 7/8.<br />

And if we set out the problem in terms of surface area we seem to get the answer 3/4.<br />
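The three answers can be recovered by applying a uniform prior to each parametrisation in turn; this sketch simply reproduces the arithmetic of von Kries’s example.

```python
# Von Kries's cube factory: the principle of indifference gives different
# answers depending on which quantity we treat as uniformly distributed.
# In every case the underlying event is the same: side length > 1cm.

# Uniform over side length in (0, 2]: P(side > 1)
p_side = (2 - 1) / 2        # 1/2

# Uniform over volume in (0, 8]: side > 1 iff volume > 1
p_volume = (8 - 1) / 8      # 7/8

# Uniform over surface area in (0, 24]: side > 1 iff area > 6*1^2 = 6
p_area = (24 - 6) / 24      # 3/4

print(p_side, p_volume, p_area)  # 0.5 0.875 0.75
```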

The conclusion is that the principle of indifference could only be saved if we have a<br />

small designated set of predicates to which we can exclusively apply it. But now it<br />

seems Keynes’s maxim can only work if we have a small designated set of predicates<br />

to which we can exclusively apply it, and if we do that we can avoid the paradoxes<br />

of indifference. Keynes explicitly adopts his maxim to avoid the paradoxes of indifference<br />

(GT : 152). He would hardly have done this if he knew structurally similar<br />

problems beset the maxim just as they beset the principle of indifference. As further evidence that<br />

he just missed this point, note that while he was not averse to wielding philosophical<br />

tools in economic writing (like the paradoxes of indifference), Wittgenstein’s point<br />

is not mentioned; not in the GT, not in any of its drafts and not in any of the correspondence<br />

after it was published.<br />

For Kripke, this point is central to Wittgenstein’s private language argument. All<br />

that we can know about the meaning of a word is how our community has used it in<br />

the past. We must assume they’ll use it the same way in the future. But what is to<br />

count as using it the same way? A priori it looks like any usage of a word could count;<br />

the only thing that could make a usage of a word wrong is that the user has a different way<br />

of using the phrase ‘the same way’ from everyone else. Hence if there is no community<br />

to set such standards there are no bars on how words can be used. And if there are no<br />

such bars, there is nothing that can properly be called a language. Hence there can’t<br />

be a private language.<br />

Given the importance of that conclusion to Wittgenstein’s later philosophy, if<br />

Kripke is even close to right in his reconstruction then it is central to the later Wittgenstein<br />

that Keynes’s maxim is contentless. As Keynes clearly didn’t think this (witness<br />

the central role it plays in summaries of the GT like Keynes 1937) he hasn’t<br />

adopted a central tenet of the later Wittgenstein’s work. This puts a rather heavy<br />

burden on those who would say he became a Wittgensteinian. The arguments presented<br />

so far do nothing to lift that burden.


Blome-Tillmann on Interest-Relative Invariantism<br />

Michael Blome-Tillmann has argued that his version of Lewisian contextualism is<br />

preferable to interest-relative invariantism (Blome-Tillmann, 2009a). 1 There are three<br />

main arguments offered, one from modal embeddings, one from temporal embeddings<br />

and one from deviant conjunctions. I don’t think any of these work, for reasons<br />

that I’ll go through in turn.<br />

1 Modal Embeddings<br />

Jason Stanley had argued that the fact that interest-relative invariantism (hereafter,<br />

IRI) has counterintuitive consequences when it comes to knowledge ascriptions in<br />

modal contexts shouldn’t count too heavily against IRI, because contextualist approaches<br />

are similarly counterintuitive. In particular, he argues that the theory that<br />

‘knows’ is a contextually sensitive quantifier, plus the account of quantifier-domain<br />

restriction that he developed with Zoltán Gendler Szabó (Stanley and Szabó, 2000),<br />

has false implications when it is applied to knowledge ascriptions in counterfactuals.<br />

Blome-Tillmann disagrees, but I don’t think he provides very good reasons for<br />

disagreeing. Let’s start by reviewing how we got to this point.<br />

Often when we say All Fs are Gs, we really mean All C Fs are Gs, where C is a<br />

contextually specified property. So when I say Every student passed, that utterance<br />

might express the proposition that every student in my class passed. Now there’s<br />

a question about what happens when sentences like All Fs are Gs are embedded in<br />

various contexts. The question arises because quantifier embeddings tend to allow<br />

for certain kinds of ambiguity. For instance, when we have a sentence like If p were<br />

true, all Fs would be G, that could express either of the following two propositions.<br />

(We’re ignoring context sensitivity for now, but we’ll return to it in a second.)<br />

• If p were true, then everything that would be F would also be G.<br />

• If p were true, then everything that’s actually F would be G.<br />

We naturally interpret (1) the first way, and (2) the second way.<br />

(1) If I had won the last Presidential election, everyone who voted for me would<br />

regret it by now.<br />

(2) If Hilary Clinton had been the Democratic nominee in the last Presidential<br />

election, everyone who voted for Barack Obama would have voted for her.<br />

† Unpublished.<br />

1 Blome-Tillmann calls interest-relative invariantism ‘subject-sensitive invariantism’. This is an unfortunate<br />

moniker. The only subject-insensitive theory of knowledge holds that for any S and T, S knows that p iff T<br />

knows that p. The view Blome-Tillmann is targeting, as set out in Fantl and McGrath (2002); Hawthorne<br />

(2004b); Stanley (2005) and Weatherson (2005a), certainly isn’t defined in opposition to this generalisation.<br />

The canonical source for Lewisian contextualism is Lewis (1996b), and Blome-Tillmann defends a variant<br />

in Blome-Tillmann (2009b).



Given this, you might expect that we could get a similar ambiguity with C . That<br />

is, when you have a quantifier that’s tacitly restricted by C , you might expect that<br />

you could interpret a sentence like If p were true, all Fs would be G in either of these<br />

two ways. (In each of these interpretations, I’ve left F ambiguous; it might denote<br />

the actual Fs or the things that would be F if p were true. So these are just partial<br />

disambiguations.)<br />

• If p were true, then every F that would be C would also be G.<br />

• If p were true, then every F that is actually C would be G.<br />

Surprisingly, it’s hard to get the second of these readings. Or, at least, it is hard to<br />

avoid the availability of the first reading. Typically, if we restrict our attention to the<br />

C s, then when we embed the quantifier in the consequent of a counterfactual, the<br />

restriction is to the things that would be C , not to the actual C s. 2<br />

Blome-Tillmann notes that Stanley makes these observations, and interprets him<br />

as moulding them into the following argument against Lewisian contextualism.<br />

1. An utterance of If p were true, all Fs would be Gs is interpreted as meaning If p<br />

were true, then every F that would be C would also be G.<br />

2. Lewisian contextualism needs an utterance of If p were true, then S would know<br />

that q to be interpreted as meaning If p were true, then S’s evidence would rule<br />

out all ¬q possibilities, except those that are actually being properly ignored, i.e. it<br />

needs the contextually supplied restrictor to get its extension from the nature<br />

of the actual world.<br />

3. So, Lewisian contextualism is false.<br />

And Blome-Tillmann argues that the first premise of this argument is false. He thinks<br />

that he has examples which undermine premise 1. But I don’t think his examples<br />

show any such thing. Here are the examples he gives. (I’ve altered the numbering for<br />

consistency with this paper.)<br />

(3) If there were no philosophers, then the philosophers doing research in the field<br />

of applied ethics would be missed most painfully by the public.<br />

(4) If there were no beer, everybody drinking beer on a regular basis would be<br />

much healthier.<br />

(5) If I suddenly were the only person alive, I would miss the Frege scholars most.<br />

These are all sentences of (more or less) the form If p were true, Det Fs would be<br />

G, where Det is some determiner or other, and they should all be interpreted à la<br />

our second disambiguation above. That is, they should be interpreted as quantifying<br />

over actual F s, not things that would be F if p were true. But the existence of such<br />

sentences is completely irrelevant to what’s at issue in premise 1. The question isn’t<br />

whether there is an ambiguity in the F position, it is whether there is an ambiguity<br />

in the C position. And nothing Blome-Tillmann raises suggests premise 1 is false. So<br />

this response doesn’t work.<br />

2 See Stanley and Szabó (2000) and Stanley (2005) for arguments to this effect.



Even if a Lewisian contextualist were to undermine premise 1 of this argument,<br />

they wouldn’t be out of the woods. That’s because premise 1 is much stronger than<br />

is needed for the anti-contextualist argument Stanley actually runs. Note first that<br />

the Lewisian contextualist needs a reading of If p were true, all Fs would be G where it<br />

means:<br />

• If p were true, every actual C that would be F would also be G.<br />

The reason the Lewisian contextualist needs this reading is that on their story, S<br />

knows that p means Every ¬ p possibility is ruled out by S’s evidence, where the every<br />

has a contextual domain restriction, and the Lewisian focuses on the actual context.<br />

The effect in practice is that an utterance of S knows that p is true just in case every<br />

¬ p possibility that the speaker isn’t properly ignoring, i.e., isn’t actually properly<br />

ignoring, is ruled out by S’s evidence. Lewisian contextualism is meant to explain<br />

sceptical intuitions, so let’s consider a particular sceptical intuition. Imagine a context<br />

where:<br />

• I’m engaged in sceptical doubts;<br />

• there is beer in the fridge;<br />

• I’ve forgotten what’s in the fridge; and<br />

• I’ve got normal vision, so if I check the fridge I’ll see what’s in it.<br />

In that context it seems (6) is false, since it would only be true if Cartesian doubts<br />

weren’t salient.<br />

(6) If I were to look in the fridge and ignore Cartesian doubts, then I’d know there<br />

is beer in the fridge.<br />

But the only way to get that to come out false, and false for the right reasons, is to fix<br />

on which worlds we’re actually ignoring (i.e., include in the quantifier domain worlds<br />

where I’m the victim of an evil demon), but look at worlds that would be ruled out<br />

with the counterfactually available evidence. We don’t want the sentence to be false<br />

because I’ve actually forgotten what’s in the fridge. And we don’t want it to be true<br />

because I would be ignoring Cartesian possibilities. In the terminology above, we<br />

would need If p were true, all Fs would be Gs to mean If p were true, then every actual<br />

C that would be F would also be G. We haven’t got any reason yet to think that’s even a<br />

possible disambiguation of (6).<br />

But let’s make things easy for the contextualist and assume that it is. Stanley’s<br />

point is that the contextualist needs even more than this. They need it to be by<br />

far the preferred disambiguation, since in the context I describe the natural reading<br />

of (6) (given sceptical intuitions) is that it is false because my looking in the fridge<br />

wouldn’t rule out Cartesian doubts. And they need it to be the preferred reading<br />

even though there are alternative readings that are (a) easier to describe, (b) of a kind<br />

more commonly found, and (c) true. Every principle of contextual disambiguation<br />

we have pushes us away from thinking this is the preferred disambiguation. This is<br />

the deeper challenge Stanley raises for contextualists, and it hasn’t yet been solved.



2 Temporal Embeddings<br />

Blome-Tillmann thinks that IRI will lead to implausible consequences with temporal<br />

embeddings. He illustrates this with a variant of the well-known bank cases. (I<br />

assume familiarity with these cases among my readers.) Let O be that the bank is<br />

open Saturday morning. If Hannah has a large debt, she is in a high-stakes situation<br />

with respect to O. She had in fact incurred a large debt, but on Friday morning the<br />

creditor waived this debt. Hannah had no way of anticipating this on Thursday. She<br />

has some evidence for O, but not enough for knowledge if she’s in a high-stakes situation.<br />

Blome-Tillmann says that this means after Hannah discovers the debt waiver,<br />

she could say (7).<br />

(7) I didn’t know O on Thursday, but on Friday I did.<br />

But I’m not sure why this case should be problematic. As Blome-Tillmann notes,<br />

it isn’t really a situation where Hannah’s stakes change. She was never actually in a<br />

high-stakes situation. At most her perception of her stakes changes; she thought she<br />

was in a high-stakes situation, then realised that she wasn’t. Blome-Tillmann argues<br />

that even this change in perceived stakes can be enough to make (7) true if IRI is<br />

true. I agree that this change in perception could be enough to make (7) true, but<br />

when we work through the reason that’s so, we’ll see that it isn’t because of anything<br />

distinctive, let alone controversial, about IRI.<br />

If Hannah is rational, then given her interests she won’t be ignoring ¬O possibilities<br />

on Thursday. She’ll be taking them into account in her plans. Someone who<br />

is anticipating ¬O possibilities, and making plans for them, doesn’t know O. That’s<br />

not a distinctive claim of IRI. Any theory should say that if a person is worrying<br />

about ¬O possibilities, and planning around them, they don’t know O. If Hannah<br />

is rational, that will describe her on Thursday, but not on Friday. So (7) is true not<br />

because Hannah’s practical situation changes between Thursday and Friday, but because<br />

her psychological state changes.<br />

What if Hannah is, on Thursday, irrationally ignoring ¬O possibilities, and not<br />

planning for them even though her rational self wishes she were planning for them?<br />

Then Hannah has irrational attitudes towards O, and anyone who has irrational attitudes<br />

towards O doesn’t know O, since knowledge requires a greater level of rationality<br />

than Hannah shows here. The principle from the last sentence can lead to some<br />

quirky results for reasons independent of IRI, and this could trip us up if we spend<br />

too much time on particular cases, and too little on epistemological theory.<br />

So consider Bobby. Bobby has the disposition to infer ¬B from A → B and ¬A.<br />

He currently has good inductive evidence for q, and infers q on that basis. But he<br />

also knows p → q and ¬ p. If he notices that he has these pieces of knowledge, he’ll<br />

infer ¬q. This inferential disposition defeats any claim he might have to know q;<br />

the inferential disposition is a kind of doxastic defeater. Then Bobby sits down with<br />

some truth tables and talks himself out of the disposition to infer ¬B from A → B<br />

and ¬A. He now knows q although he didn’t know it earlier, when he had irrational


Blome-Tillmann on Interest-Relative Invariantism 661<br />

attitudes towards a web of propositions including q. And that’s true even though his<br />

evidence for q didn’t change. 3<br />
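Bobby’s truth-table exercise can be reproduced mechanically. The sketch below is only an illustration (the `implies` helper and the enumeration are mine, not anything in the text): enumerating the two-valued valuations shows that his bad disposition, inferring ¬B from A → B and ¬A, has a counterexample, while a valid form such as modus tollens does not.

```python
from itertools import product

def implies(a, b):
    """Material conditional: A -> B is false only when A is true and B is false."""
    return (not a) or b

# Denying the antecedent (Bobby's disposition): from A -> B and not-A, infer not-B.
# The inference is invalid iff some valuation makes both premises true
# and the conclusion (not-B) false, i.e. B true.
counterexamples = [(a, b) for a, b in product([False, True], repeat=2)
                   if implies(a, b) and not a and b]
print(counterexamples)  # [(False, True)] -- so the inference is invalid

# Modus tollens (from A -> B and not-B, infer not-A) has no such valuation:
assert not [(a, b) for a, b in product([False, True], repeat=2)
            if implies(a, b) and not b and a]
```

The single counterexample row (A false, B true) is the row Bobby’s truth tables would have turned up: both premises hold there, but ¬B fails.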

Bobby’s case is parallel to the case where Hannah irrationally ignores the significance<br />

of O to her practical deliberation. In both cases, defective mental states<br />

elsewhere in their cognitive architecture defeat knowledge claims. And in that kind<br />

of case, we should expect sentences like (7) to be true, even if they appear counterintuitive<br />

before we’ve worked through the details.<br />

3 Conjunctions<br />

George and Ringo both have $6000 in their bank accounts. They both are thinking<br />

about buying a new computer, which would cost $2000. Both of them also have rent<br />

due tomorrow, and they won’t get any more money before then. George lives in<br />

New York, so his rent is $5000. Ringo lives in Syracuse, so his rent is $1000. Clearly,<br />

(8) and (9) are true.<br />

(8) Ringo has enough money to buy the computer.<br />

(9) Ringo can afford the computer.<br />

And (10) is true as well, though there’s at least a reading of (11) where it is false.<br />

(10) George has enough money to buy the computer.<br />

(11) George can afford the computer.<br />

Focus for now on (10). It is a bad idea for George to buy the computer; he won’t be<br />

able to pay his rent. But he has enough money to do so; the computer costs $2000,<br />

and he has $6000 in the bank. So (10) is true. Admittedly there are things close to<br />

(10) that aren’t true. He hasn’t got enough money to buy the computer and pay his<br />

rent. You might say that he hasn’t got enough money to buy the computer given<br />

his other financial obligations. But none of this undermines (10). The point of this<br />

little story is to respond to an argument Blome-Tillmann makes towards the end of<br />

his paper. Here is how he puts the argument. (Again I’ve changed the numbering and<br />

some terminology for consistency with this paper.)<br />

Suppose that John and Paul have exactly the same evidence, while John is<br />

in a low-stakes situation towards p and Paul in a high-stakes situation towards<br />

p. Bearing in mind that IRI is the view that whether one knows p<br />

depends on one’s practical situation, IRI entails that one can truly assert:<br />

(12) John and Paul have exactly the same evidence for p, but only John<br />

has enough evidence to know p, Paul doesn’t.<br />

3 I assume, what shouldn’t be controversial, that irrational inferential dispositions which the agent does<br />

not know he has, and which he does not apply, are not part of his evidence.



And this is meant to be a problem, because (12) is intuitively false.<br />

But IRI doesn’t entail any such thing. Paul does have enough evidence to know<br />

that p, just like George has enough money to buy the computer. Paul can’t know<br />

that p, just like George can’t buy the computer, because of their practical situations.<br />

But that doesn’t mean he doesn’t have enough evidence to know it. So, contra Blome-<br />

Tillmann, IRI doesn’t entail this problematic conjunction.<br />

In a footnote attached to this, Blome-Tillmann tries to reformulate the argument.<br />

I take it that having enough evidence to ‘know p’ in C just means having<br />

evidence such that one is in a position to ‘know p’ in C, rather than having<br />

evidence such that one ‘knows p’. Thus, another way to formulate<br />

(12) would be as follows: ‘John and Paul have exactly the same evidence<br />

for p, but only John is in a position to know p, Paul isn’t.’<br />

The ‘reformulation’ is obviously bad, since having enough evidence to know p isn’t<br />

the same as being in a position to know it, any more than having enough money to<br />

buy the computer puts George in a position to buy it. But might there be a different<br />

problem for IRI here? Might it be that IRI entails (13), which is false?<br />

(13) John and Paul have exactly the same evidence for p, but only John is in a<br />

position to know p, Paul isn’t.<br />

There isn’t a problem with (13) because almost any epistemological theory will imply<br />

that conjunctions like that are true. In particular, any epistemological theory<br />

that allows defeaters not to supervene on the possession of evidence<br />

will imply that conjunctions like (13) are true. Any such theory, or indeed any<br />

such non-supervenience claim, will be controversial. But there are so many plausible<br />

violations of this supervenience principle that it would be implausible if none of<br />

them worked. Here are three putative examples of cases where subjects have the same<br />

evidence but different defeaters; I expect most epistemologists will find at least one<br />

of them plausible.<br />

Logic and the Oracle<br />

Graham, Crispin and Ringo have an audience with the Delphic Oracle,<br />

and they are told ¬ p ∨ q and ¬¬ p. Graham is a relevant logician, so if<br />

he inferred p ∧ q from these pronouncements, his belief in the invalidity<br />

of disjunctive syllogism would be a doxastic defeater, and the inference<br />

would not constitute knowledge. Crispin is an intuitionist logician, so<br />

if he inferred p ∧ q from these pronouncements his belief in the invalidity<br />

of double negation elimination would be a doxastic defeater, and the<br />

inference would not constitute knowledge. Ringo has no deep views on<br />

the nature of logic. Moreover, in the world of the story classical logic is<br />

correct. So if Ringo were to infer p ∧ q from these pronouncements, his<br />

belief would constitute knowledge. Now Graham’s and Crispin’s false<br />

beliefs about entailment are not p ∧ q-relevant evidence, so all three of



them have the same p ∧ q-relevant evidence, but only Ringo is in a position<br />

to know p ∧ q.<br />

Missing iPhone (after Harman (1973)).<br />

Lou and Andy both get evidence E, which is strong inductive evidence<br />

for p. If Lou were to infer p from E, his belief would constitute knowledge.<br />

Andy has just missed a phone call from a trusted friend. The friend<br />

left a voicemail saying ¬ p, but Andy hasn’t heard this yet. If Andy were<br />

to infer p from E, his friend’s phone call and voicemail would constitute<br />

defeaters, so he wouldn’t know p. But phone calls and voicemails you<br />

haven’t got aren’t evidence you have. So Lou and Andy have the same<br />

p-relevant evidence, but only Lou is in a position to know p.<br />

Fake Barns and Motorcycles (after Gendler and Hawthorne (2005))<br />

Bob and Levon are travelling through Fake Barn Country. Bob is on a<br />

motorcycle, Levon is on foot. They are in an area where the barns are,<br />

surprisingly, real for a little ways around. On his motorcycle, Bob will<br />

soon come to fake barns, but Levon won’t hit any fakes for a long time.<br />

They are both looking at the same real barn. If Bob inferred it was a real<br />

barn, not a fake, the fakes he is speeding towards would be defeaters. But<br />

Levon couldn’t walk that far, so those barns don’t defeat him. So Bob<br />

and Levon have the same evidence, but only Levon is in a position to<br />

know that the barn is real.<br />
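As a side note, the classical entailment assumed in the Oracle case can be checked by brute force over two-valued valuations. The checker below is my own illustration: the quantification over {True, False} builds in exactly the two-valuedness that underwrites double negation elimination and disjunctive syllogism, the steps Crispin and Graham respectively reject.

```python
from itertools import product

def classically_valid(premises, conclusion):
    """Classically valid iff no two-valued valuation makes every premise
    true while the conclusion is false."""
    return all(conclusion(p, q)
               for p, q in product([False, True], repeat=2)
               if all(prem(p, q) for prem in premises))

# The Oracle's pronouncements: not-p or q, and not-not-p.
oracle = [lambda p, q: (not p) or q,
          lambda p, q: not (not p)]

# In the (classical) world of the story, p and q follows:
print(classically_valid(oracle, lambda p, q: p and q))  # True
```

So Ringo, who trusts the classical verdict, can take the inference at face value; Graham and Crispin each reject a step the check silently performs.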

I’m actually not sure what plausible theory would imply that what different agents<br />

are in a position to know depends on nothing except for what evidence they have.<br />

The only theory that I can imagine with that consequence is the conjunction of evidentialism<br />

about justification and a justified true belief theory of knowledge. So<br />

really there’s no reason to think that implying sentences like (13) is a mark against a<br />

theory.<br />

It’s been suggested to me 4 that there are other more problematic conjunctions in<br />

the neighbourhood. For instance, we might worry that IRI implies that (14) is true.<br />

(14) John and Paul are alike in every respect relevant to knowledge of p, but John<br />

is in a position to know p, and Paul isn’t.<br />

That would be problematic, but there’s no reason to think IRI implies it. Indeed, IRI<br />

entails that the first conjunct is false, since John and Paul are unlike in one respect<br />

that IRI loudly insists is relevant. Perhaps we can do better with (15).<br />

(15) John and Paul are alike in every respect relevant to knowledge of p except their<br />

practical interests, but John is in a position to know p, and Paul isn’t.<br />

4 Footnote deleted for blind review.



That is something IRI implies, but it seems more than a little question-begging to use<br />

its alleged counterintuitiveness against IRI. After all, it’s simply a statement of IRI<br />

itself. If someone had alleged that IRI should be accepted because it was so intuitive,<br />

I guess noting how odd (15) looks would be a response to them. But that’s not the<br />

way IRI is defended in the books and papers I cited in the introduction. It is defended<br />

by noting what a good job it does of handling difficult puzzles, especially puzzles<br />

concerning lotteries and conjunctions, and I don’t see any reason to think there is<br />

any solution to those puzzles that isn’t counterintuitive.<br />

On a more positive note, I think it will be worthwhile going forward to think<br />

about ‘afford’ in these debates, since ‘afford’ is interest-relative, at least on one disambiguation.<br />

A kind of Lewisian contextualism about ‘afford’ would be crazy. We<br />

shouldn’t say that if my rent is $5000, and it is due tomorrow, then (9) is false, because<br />

after all, in my context someone with Ringo’s money couldn’t buy the computer and<br />

meet their financial obligations. So there might be some interesting patterns that ‘afford’<br />

has, and checking whether ‘knows’ behaves the same way could be a good test<br />

of whether ‘knows’ is interest-relative.
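The contrast between ‘has enough money’ and ‘can afford’ that drives this section can be glossed arithmetically. The netting-out rule below is my reconstruction of one interest-relative reading, not an analysis argued for in the text:

```python
PRICE = 2000  # cost of the computer

def has_enough_money(balance):
    # Interest-invariant reading: only the price is relevant.
    return balance >= PRICE

def can_afford(balance, obligations):
    # One interest-relative reading: other commitments are netted out first.
    return balance - obligations >= PRICE

# George: $6000 in the bank, $5000 New York rent due tomorrow.
# Ringo: $6000 in the bank, $1000 Syracuse rent due tomorrow.
print(has_enough_money(6000))   # True for both -- (8) and (10)
print(can_afford(6000, 1000))   # True  -- (9), Ringo
print(can_afford(6000, 5000))   # False -- the false reading of (11), George
```

On this gloss ‘enough money’ comes out true for both, while ‘afford’ diverges with the agents’ other obligations, which is the pattern used above against Blome-Tillmann’s conjunction argument.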


Doing Philosophy With Words<br />

Scott Soames has written two wonderfully useful books that will be valuable introductions<br />

to twentieth century philosophy. The books arose out of his well-received<br />

classes on the history of twentieth century philosophy at Princeton, and will be valuable<br />

to anyone teaching similar courses. I shall be relying on them as I teach such a course<br />

at Cornell.<br />

The books consist of detailed case studies of important twentieth-century works.<br />

They are best read alongside those original texts. Anyone who works through the<br />

canon in this way will have an excellent introduction to what twentieth century<br />

philosophers were trying to do. The selections are judicious, and while some are<br />

obvious classics some are rather clever choices of papers that are representative of the<br />

type of work being done at the time. And Soames doesn’t just point to the most<br />

important works to study, but to the most important sections of those works.<br />

Soames’s discussion of these pieces is always built around an analysis of their<br />

strengths and weaknesses. He praises the praiseworthy, but the focus, at least in the<br />

sections I’m discussing (ordinary language philosophy from Wittgenstein to Grice),<br />

is on where these philosophers go wrong. This is particularly so when the mistakes<br />

are representative of a theme. There are three main mistakes Soames finds in philosophers<br />

of this period. First, they rely on logical positivism long after it had been shown to<br />

be unviable. Second, they disregard the principle that semantics should be systematic.<br />

Third, they ignore the distinction between necessity and a priority. All three constitute<br />

major themes of Soames’s book, and indeed of twentieth century philosophy as<br />

Soames sees it.<br />

These books concentrate, almost to a fault, on discussion of philosophers’ published<br />

works, as opposed to the context in which they were written. Apart from occasionally<br />

noting that some books were released posthumously, we aren’t told whether<br />

the philosophers who wrote them are alive, and only in one case are we told when a<br />

philosopher was born. This kind of external information does not seem important<br />

to Soames. He is the kind of historian who would prefer a fourth reading of Austin’s<br />

published works to a first reading of his wartime diaries. And he’d prefer to spend<br />

the evening working on refutations, or charitable reformulations, of Austin’s arguments<br />

to either. I’m mostly sympathetic to this approach; this is history of philosophy<br />

after all. We can leave discussions of the sociology of 1950s Oxford to those better<br />

qualified. But this choice about what to write about has consequences.<br />

Most of Soames’s chapters focus almost exclusively on a particular book or paper.<br />

The exceptions are like the chapter on Sense and Sensibilia, where Soames contrasts<br />

Austin’s discussion with Ayer’s response. We learn a lot about the most important<br />

works that way, but less about their intellectual environment. So the book doesn’t<br />

have much by way of broad discussion about overall trends or movements. There’s<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical<br />

Studies 135 (2007): 429-37. Thanks to David Chalmers, Michael Fara, John Fischer, Tamar Szabó<br />

Gendler, James Klagge, Michael Kremer, Ishani Maitra, Aidan McGlynn, Alva Noë, Jonathan Weinberg<br />

and Larry Wright.



very little, for example, about who were the influencers and who the influenced.<br />

There’s nothing about how anyone not called ‘Wittgenstein’ changed their positions<br />

in response to criticism. One assumes from the chronology that Ryle’s influence on<br />

Austin was greater than Austin’s influence on Ryle, for example, but Soames is silent<br />

on whether this is true.<br />

Soames says at one point that, “[Ryle] was, along with Wittgenstein, J. L. Austin,<br />

and Paul Grice, one of the prime movers in postwar philosophy in England.” (68).<br />

But we aren’t really told why this is so, apart from the discussion of some prominent<br />

works of these four philosophers. (Perhaps Soames has taken the maxim Show it,<br />

don’t say it rather completely to heart.) Nor are we told why the list includes those<br />

four, and not, say, Strawson or Geach or Anscombe. Actually Anscombe’s absence<br />

reminds us that there is almost no discussion of women in philosophy in the book.<br />

That’s not Soames’s fault; it’s a reflection of a long-running systematic problem in<br />

philosophy: the discipline has a hard time recruiting and retaining women. Could<br />

some of that be traced back to what was going on in the ordinary language period?<br />

That kind of question can’t be addressed by the kind of history book that Soames<br />

has written, where the focus is on the best philosophical writing, and not on the<br />

broader philosophical community.<br />

One of the other consequences of the format is that, by necessity, many important<br />

figures are left out, on pain of writing a fifteen-volume book. In the period under<br />

discussion here there was historically important work by (among many others) Nelson<br />

Goodman, Wilfrid Sellars and Roderick Chisholm, some of which connects up<br />

closely to the themes and interests of the ordinary language philosophers, but none of<br />

which is so much as mentioned. (Goodman is mentioned in the epilogue as someone<br />

Soames regrets not covering.)<br />

Now this can’t be a complaint about the book Soames has written, because it<br />

would have been impossible to cover any more figures than he did in the style and<br />

depth that he did. And it would have been impossible to tell in detail the story of how<br />

Ryle’s impact on the philosophical world differed from Austin’s, or of the painfully<br />

slow integration of women into the top echelons of philosophy, without making<br />

the book even more monumental than it is. All we’re left with is a half-hearted<br />

expression of regret that he didn’t write a different kind of book, one that told us<br />

more about the forest, even as we value what he says about the tallest of the trees.<br />

1 Grice and The End of Ordinary Language<br />

There is one place where Soames stops to survey the field, namely his discussion of<br />

the impact of Grice’s work on the ordinary language tradition. Soames argues that<br />

with Grice’s William James lectures, the ideas of ordinary language philosophy had<br />

“run their course”. The position seems to be that Grice overthrew a paradigm that<br />

had been vibrant for two decades, but was running out of steam by the time of Grice’s<br />

James lectures. How plausible is this?<br />

The first step is to work out just what it was that Grice refuted. When summarising<br />

the ordinary language paradigm that he takes Grice to have overthrown,<br />

Soames is uncharacteristically harsh. In Soames’s summary one of the characteristic



activities of an ordinary language philosopher is “opportunistically assembling reminders<br />

about how philosophically significant words are used in ordinary settings”<br />

(216). That may be a fair enough description of some mid-century work, but it isn’t a<br />

fair summary of the best of the work that Soames has spent the previous two hundred<br />

odd pages discussing. It all suggests that Grice didn’t so much overthrow ordinary<br />

language philosophy as badly done ordinary language philosophy, and this<br />

category might not include Strawson, Ryle, Austin and so on.<br />

More importantly, it isn’t entirely clear just what it was Grice did that caused this<br />

paradigm shift. In Soames’s telling it seems the development of the speaker meaning/semantic<br />

meaning distinction was crucial, but Austin at least already recognised<br />

this distinction, indeed appealed to it twice in Sense and Sensibilia. Soames mentions<br />

the discussion on pages 89 to 91 of Sense and Sensibilia of phrases like “I see two<br />

pieces of paper”, and there is also the intriguing discussion on pages 128-9 of the relation<br />

between accurate and true where Austin goes close to stating Grice’s submaxim<br />

of concision.<br />

The other suggestion is that Grice restored the legitimacy and centrality of systematic<br />

semantic theorising. It’s true Grice did that, but this doesn’t show we have<br />

to give up ordinary language philosophy unless it was impossible to be an ordinary<br />

language philosopher and a systematic semanticist. And it isn’t clear that this really<br />

is impossible. It hardly seems inconsistent with the kind of philosophy Austin did (especially<br />

in his theory of perception) that one endorse a systematic semantic theory.<br />

(Though Austin himself rarely put forward systematic analyses.) Notably, there are<br />

plenty of very systematic formal semanticists who take Strawson’s work on descriptions<br />

seriously, and try and integrate it into formal models. So we might wonder why<br />

Grice’s work shouldn’t have led to a kind of ordinary language philosophy where we<br />

paid more careful attention to system-building.<br />

More broadly, we might wonder whether the ordinary language period really did<br />

end. The analysis of knowledge industry (strangely undiscussed in a work on analysis<br />

in the twentieth century) seemed to putter along much the same before and after<br />

the official demise of ordinary language philosophy. And there are affinities between<br />

the ordinary language philosophers and important contemporary research programs,<br />

e.g. the ‘Canberra Plan’ as described by Frank Jackson (1998). So perhaps before we<br />

asked who killed ordinary language philosophy (It was Professor Grice! In Emerson<br />

Hall!! With the semantics/pragmatics distinction!!!) we should have made sure there<br />

was a corpse. More on this point presently.<br />

2 A Whig History?<br />

One of the major themes of Soames’s discussion is that there are some systematic<br />

problems in twentieth century philosophy that are righted by the heroes at the end<br />

of the story. I already mentioned the heroic role assigned to Grice. But the real star of<br />

the show is Kripke, who comes in as a deus ex machina at the end showing how different<br />

necessity and a priority are, and thereby righting all manner of grievous wrongs.<br />

That Kripke is an important figure in twentieth century philosophy is hardly a matter<br />

of dispute, but Soames does stretch a little to find errors for our hero to correct.



Some of the complaints about philosophers collapsing the necessary/a priori distinction<br />

do hit the target, but don’t leave deep wounds in their victims. For instance,<br />

Soames quotes Ryle arguing (in Dilemmas) that perception cannot be a physiological<br />

process because if it were we couldn’t know whether we saw a tree until we found<br />

out the result of complicated brain scans. Soames points out, perfectly correctly, that<br />

the seeing might be necessarily identical to the brain process even if we don’t know,<br />

and even can’t know without complicated measurements, whether they are identical.<br />

Soames is right that Ryle has made an epistemological argument here when a metaphysical<br />

argument was needed. But rewriting Ryle so he makes that metaphysical<br />

argument isn’t hard. If my seeing the tree is necessarily identical to the brain process,<br />

and the brain process is (as Ryle and Soames seem to agree it is) individuated by the<br />

brain components that implement it, then I couldn’t have seen the tree had one of<br />

the salient neurons in my brain been silently replaced with a functionally equivalent<br />

silicon chip. Since it is possible that I could have seen a tree even if a salient neuron<br />

was replaced with a functionally equivalent silicon chip, the seeing and the brain process<br />

are not necessarily identical. So while Ryle might have slipped here, and Kripke’s<br />

work does help us correct the slip, the consequences of this are basically verbal.<br />

A more important charge of ignoring the necessary/a priori distinction comes<br />

in Soames’s discussion of Wittgenstein’s deflationism about philosophy. Here is the<br />

salient passage.<br />

His deflationary conception of philosophy is also consistent with, and<br />

even derivative from, his new ideas about meaning plus a set of unquestioned<br />

philosophical presuppositions he brings to the enterprise. The<br />

philosophical presuppositions include the then current and widespread<br />

assumptions that (i) that philosophical theses are not empirical, and hence<br />

must be necessary and a priori, and (ii) that the necessary, the a priori and<br />

the analytic are one and the same. Because he takes these assumptions for<br />

granted, he takes it for granted that if there are any philosophical truths,<br />

they must be analytic (29).<br />

This seems to me to be mistaken twice over.<br />

First, it isn’t clear to me that there is any appeal to concepts of necessity in the<br />

passages in Wittgenstein Soames is summarising here, and metaphysical necessity simply<br />

doesn’t seem to have been a major interest of Wittgenstein’s. Wittgenstein does<br />

appear to reason that if a proposition is not empirical it is a priori, but that inference<br />

doesn’t go via claims about necessity, and isn’t shown to be fallacious by any of<br />

Kripke’s examples.<br />

Second, it simply isn’t true that philosophers in Wittgenstein’s time took for<br />

granted that the analytic and the a priori were one and the same. To be sure, many<br />

philosophers in the early twentieth century (including, many argue, the younger Wittgenstein)<br />

argued against Kant’s claim that they are distinct, but this isn’t quite the<br />

same as taking for granted they are identical. And there are a few places where Wittgenstein<br />

appears to accept that some propositions are synthetic a priori. For example<br />

in Remarks on the Foundations of Mathematics he says it is synthetic a priori that there<br />

is no reddish green (Part III, para 39), and goes on to say this about primes.



The distribution of primes would be an ideal example of what could be<br />

called synthetic a priori, for one can say that it is at any rate not discoverable<br />

by an analysis of the concept of a prime number. (Wittgenstein,<br />

1956, Part III, para 42)<br />

Now it is far from obvious what the connection is between remarks such as these<br />

and the remarks about the impossibility of philosophical theses in the Investigations.<br />

Indeed it is not obvious whether Wittgenstein really believed in the synthetic a priori<br />

at any stage of his career. But given his lack of interest in metaphysical necessity, and<br />

openness to the possibility of synthetic a priori claims, it seems unlikely that he was,<br />

tacitly or otherwise, using the argument Soames gives him to get the deflationary<br />

conclusions. 1<br />

3 Getting the Question Right<br />

As I mentioned above, Soames’s is the kind of history that focuses on the works<br />

of prominent philosophers, rather than their historical context. There’s much to<br />

be gained from this approach, in particular about what the greats can tell us about<br />

pressing philosophical questions. But one of the costs is that in focussing on what<br />

they say about our questions, we might overlook their questions. In most cases this<br />

is a trap Soames avoids, but in the cases of Austin and Ryle the trap may have been<br />

sprung.<br />

Soames sees Austin in Sense and Sensibilia as trying to offer us a new argument<br />

against radical scepticism.<br />

Austin’s ultimate goal is to undermine the coherence of skepticism. His<br />

aim is not just to show that skepticism is unjustified, or implausible, or<br />

that it is a position no one has reason to accept. Rather, his goal is to<br />

prevent skepticism from getting off the ground by denying skeptics their<br />

starting point. (173-4)<br />

But we don’t get much of an interpretative argument that this is really Austin’s goal.<br />

Indeed, Soames concedes that Austin “doesn’t always approach these questions directly”<br />

(172). I’d say he does very little to approach them at all. To be sure, many<br />

contemporary defenders of direct realism are interested in its anti-sceptical powers,<br />

but there’s little to show Austin was so moved. Scepticism is not a topic that even<br />

arises in Sense and Sensibilia until the chapter on Warnock, after Austin has finished<br />

with the criticism of Ayer that takes up a large part of the book. And Soames doesn’t<br />

address the question of how to square the somewhat dismissive tone Austin takes towards<br />

scepticism in “Other Minds” with the view here propounded that Austin put<br />

forward a fairly radical theory of perception as a way of providing a new answer to<br />

the sceptic.<br />

1 I’m grateful to many correspondents for discussions about Wittgenstein. They convinced me, inter<br />

alia, that it would be foolish of me to commit to strong views of any kind about the role of the synthetic<br />

a priori in Wittgenstein’s later thought, and that the evidence is particularly messy because Wittgenstein<br />

wasn’t as centrally concerned with these concepts as we are.



If Austin wasn’t trying to refute the sceptic, what was he trying to do? The simplest<br />

explanation is that he thought direct realism was true, sense-data theories were<br />

false, and that “there is noting so plain boring a the constant repetition of assertions<br />

that are not true, and sometimes no even faintly sensible; if we can reduce this a bit,<br />

it will all be to the good.” (Austin, 1962, 5) I’m inclined to think that in this case<br />

the simplest explanation is the best, that Austin wrote a series of lectures on perception<br />

because he was interested in the philosophy of perception. Warnock says that<br />

“Austin was genuinely shocked by what appeared to his eye to be recklessness, hurry,<br />

unrealism, and inadequate attention to truth” (Warnock, 1989, 154) and suggests this<br />

explained not only why Austin wrote the lectures but their harsh edge.<br />

There is one larger point one might have wanted to make out of a discussion of<br />

direct realism, or that one might have learned from a discussion of direct realism,<br />

that seems relevant to what comes later in Soames’s book. If we really see objects,<br />

not sense-data, then objects are constituents of intentional states. That suggests that<br />

public objects might be constituents of other states, such as beliefs, and hence constituents<br />

of assertions. Soames doesn’t give us a discussion of these possible historical<br />

links between direct realism and direct reference, and that’s too bad because there<br />

could be some fertile ground to work over here. (I’m no expert on the history of the<br />

1960s, so I’m simply guessing as to whether there is a historical link between direct<br />

realism and direct reference to go along with the strong philosophical link between<br />

the two. But it would be nice if Soames had provided an indication as to whether<br />

those guesses were likely to be productive or futile.)<br />

Soames gives us no inkling of where theories of direct reference came from, save<br />

from the brilliant mind of Kripke. Apart from the absence of discussion of any<br />

connection between direct realism and direct reference, there’s no discussion of the<br />

possible connections between Wittgenstein’s later theories and direct reference, as<br />

Howard Wettstein (2004) has claimed exist. And there’s no discussion of the (possibly<br />

related) fact that Kripke was developing the work that went into Naming and<br />

Necessity at the same time as he was lecturing and writing on Wittgenstein, producing<br />

the material that eventually became Wittgenstein on Rules and Private Language.<br />

Kripke is presented here as the first of the moderns 2 , and in many ways he is, but<br />

the ways in which he is the last (or the latest) of the ordinary language philosophers<br />

could be a very valuable part of a history of philosophy. 3<br />

Matters are somewhat more difficult when it comes to Ryle’s The Concept of Mind.<br />

Ryle predicted that he would “be stigmatised as ‘behaviourist’ ” (Ryle, 1949, 327) and<br />

Soames obliges, and calls him a verificationist to boot.<br />

If beliefs and desires were private mental states [says Ryle], then we could<br />

never observe the beliefs and desires of others. But if we couldn’t observe<br />

them, then we couldn’t know that they exist, [which we can.] . . . This<br />

2 The first of what David Armstrong (2000) has aptly called “The Age of Conferences”.<br />

3 Just in case this gets misinterpreted, what I’m suggesting here is that Kripke (and his audiences) might<br />

have been influenced in interesting ways by philosophy of the 1950s and 1960s, not that Kripke took his<br />

ideas from those philosophers. The latter claim has been occasionally made, but on that ‘debate’ (Soames,<br />

1998b,a) I’m 100% on Soames’s side.


Doing Philosophy With Words 671<br />

argument is no stronger than verificationism in general, which by 1949<br />

when The Concept of Mind was published, had been abandoned by its<br />

main proponents, the logical positivists, for the simple reason that every<br />

precise formulation of it had been decisively refuted (97-8).<br />

But Ryle’s position here isn’t verificationism at all, it’s abductophobia, or fear of inference<br />

to underlying causes. Ryle doesn’t think the claim of ghosts in the machine<br />

is meaningless, he thinks it is false. The kind of inference to underlying causes he<br />

disparages here is exactly the kind of inference to unobservables that paradigm verificationists,<br />

especially Ayer, go out of their way to allow, and in doing so buy no end<br />

of trouble. 4 And abductophobia is prevalent among many contemporary anti-verificationists,<br />

particularly direct realists such as McDowell (1996), Brewer (1999) and<br />

Smith (2003) who think that if we don’t directly observe beer mugs we can never be<br />

sure that beer mugs exist. I basically agree with Soames that Ryle’s argument here<br />

(and the same style of argument recurs repeatedly in The Concept of Mind) is very<br />

weak, but it’s wrong to call it verificationist.<br />

The issue of behaviourism is trickier. At one level Ryle surely is a behaviourist,<br />

because whatever behaviourism means in philosophy, it includes what Ryle says in<br />

The Concept of Mind. Ryle is the reference-fixer for at least one disambiguation of<br />

behaviourist. However we label Ryle’s views, though, it’s hard to square what he says<br />

his aims are with the aims Soames attributes to him. In particular, consider Soames’s<br />

criticism of Ryle’s attempt to show that we don’t need to posit a ghost in the machine<br />

to account for talk of intelligence. (Soames is discussing a long quote from page 47 of<br />

The Concept of Mind.)<br />

The description Ryle gives here is judicious, and more or less accurate.<br />

But it is filled with words and phrases that seem to refer to causally efficacious<br />

internal mental states—inferring, thinking, interpreting, responding<br />

to objections, being on the lookout for this, making sure not to rely on that,<br />

and so on. Unless all of these can be shown to be nothing more than<br />

behavioral dispositions, Ryle will not have succeeded in establishing that<br />

to argue intelligently is simply to manifest a variety of purely behavioral<br />

dispositions. (106)<br />

And Soames immediately asks<br />

So what are the prospects of reducing all this talk simply to talk about<br />

what behavior would take place in various conditions? (106)<br />

4 It would be particularly poor form of me to use a paradigm case argument without discussing Soames’s<br />

very good dissection of Malcolm’s paradigm case argument in chapter 7 of his book. So let me note my<br />

gratitude as a Cornellian for all the interesting lines of inquiry Soames finds suggested in Malcolm’s paper<br />

– his is a paradigm of charitable interpretation, a masterful discovery of wheat where I’d only ever seen<br />

chaff.



The answer, unsurprisingly, is that the prospects aren’t good. But why this should<br />

bother Ryle is never made clear. For Ryle only says that when we talk of mental<br />

properties we talk about people’s dispositions, not that we talk about their purely<br />

behavioural dispositions. The latter is Soames’s addition. It is rejected more or less<br />

explicitly by Ryle in his discussion of knowing how. “Knowing how, then, is a disposition,<br />

but not a single-track disposition like a reflex or a habit . . . its exercises can be<br />

overt or covert, deeds performed or deeds imagined, words spoken aloud or words<br />

heard in one’s head, pictures painted on canvas or pictures in the mind’s eye.” (1949,<br />

46-47). Nor should Ryle feel compelled to say that these dispositions are behavioural,<br />

given his other theoretical commitments.<br />

Ryle is opposed in general to talk of ‘reduction’ as the discussion of mechanism<br />

on pages 76ff shows. To be sure there he is talking about reduction of laws, but he repeatedly<br />

makes clear that he regards laws and dispositions as tightly connected (1949,<br />

43, 123ff) and suggests that we use mental concepts to signal that psychological rather<br />

than physical laws are applicable to the scenario we’re discussing (167). Moreover, he<br />

repeatedly talks about mental events for which it is unclear there is any kind of correlated<br />

behavioural disposition, e.g. the discussion of Johnson’s stream of consciousness<br />

on page 58 and the extended discussion of imagination in chapter 8. Ryle’s claim that<br />

“Silent soliloquy is a form of pregnant non-sayings” (269) hardly looks like the claim<br />

of someone who wanted to reduce all mental talk to behavioural dispositions, unless<br />

one leans rather hard on ‘pregnant’. But we aren’t told whether Soames leans hard<br />

on this word, for he never quite tells us why he thinks all the dispositions that Ryle<br />

considers must be behavioural dispositions, rather than (for example) dispositions to<br />

produce other dispositions.<br />

To be sure, from a modern perspective it is hard to see where the space is that<br />

Ryle aims to occupy. He wants to eliminate the ghosts, so what is left for mind to<br />

be but physical stuff, and what does physical stuff do but behave? He’s not an eliminativist,<br />

so he’s ontologically committed to minds, and he hasn’t left anything for<br />

them to be but behavioural dispositions. So we might see it (not unfairly), but that’s<br />

not how Ryle sees it. 5 Soames sees Ryle as an ancestor of a reductive materialist like<br />

David Lewis, and a not very successful one at that. But the Ryle of The Concept of<br />

Mind has as much in common with non-reductive materialists, especially when he<br />

says that “not all questions are physical questions” (1949, 77), insists that “men are<br />

not machines, not even ghost-ridden machines” (1949, 81) and describes Cartesians<br />

rather than mechanists as “the better soldiers” (1949, 330) in the war against ignorance.<br />

Perhaps a modern anti-dualist should aim for a reduction of the mental to the<br />

physical, but Ryle thought no such reduction was needed to give up the ghost, and<br />

the historian should record this.<br />

4 Conclusion<br />

As I said at the top, Soames has written two really valuable books. For anyone<br />

who wants to really understand the most important philosophical work written<br />

5 Of course he couldn’t have seen it that way since in 1949 he wouldn’t have had the concept of ontological commitment.<br />

between 1900 and 1970, reading through the classics while constantly referring back<br />

to Soames’s books to have the complexities of the philosophy explained will be immensely<br />

rewarding. Those who do that might feel that the people who skip reading<br />

the classics and just read Soames’s books get an unreasonably large percentage of the<br />

benefits they’ve accrued. As noted once or twice above I have some quibbles with<br />

some points in Soames’s story, but that shouldn’t let us ignore what a great service<br />

Soames has done us by providing these surveys of great philosophical work.


Epistemicism, Parasites and Vague Names<br />

Why is it so implausible that there is a sharp boundary between the rich and the<br />

non-rich? Perhaps we find it implausible merely because we (implicitly) believe that<br />

if there were such a boundary we would be able to discover where it is. If this is so<br />

we should revise our judgements. As Timothy Williamson (1994, 2000a) has shown,<br />

if there were such a boundary we would not know where it is. Still, this is not the<br />

only reason for being sceptical about the existence of such a boundary. In “Vagueness,<br />

Epistemicism and Response-Dependence” John Burgess outlines an impressive<br />

objection to the existence of such boundaries, and in particular to epistemicist theories<br />

that posit their existence. Burgess’s objection is based not on principles about the<br />

epistemology of content, as the bad objection just stated is, but rather on principles<br />

about the metaphysics of content.<br />

If a word t has content c, this must be the case in virtue of some more primitive<br />

fact obtaining. Facts about content, such as this, are not among the fundamental<br />

constituents of reality. Roughly, facts about linguistic content must obtain in virtue<br />

of facts about use. But there are simply not enough facts about use to determine a<br />

precise meaning for paradigmatically vague terms like ‘rich’. Any theory that holds<br />

that ‘rich’ does have a precise meaning must meet this objection. As Burgess argues,<br />

Williamson’s attempts to do this have not been entirely successful. Burgess argues,<br />

persuasively, that epistemicists owe us a theory of how terms like ‘rich’ get to have<br />

the precise meaning they apparently have given that the facts about use do not seem<br />

to generate a precise meaning. He also argues, less persuasively, that Williamson’s<br />

‘parasitic’ strategy for meeting this obligation is unsuccessful. Indeed, the argument<br />

here rests at one point on a premiss that is clearly false. I will suggest a way to patch<br />

the argument and reinstate the objection to epistemicism.<br />

The obligation to provide a theory that generates content in terms of use does not<br />

just fall on the epistemicists. We indeterminists about content must also discharge it.<br />

Assume that we have done so, and we have a theory of content that divides sentences<br />

into (at least) the true, the false and the indeterminate. Williamson (1994, 207-208)<br />

argues that the only reason we believe that any sentences fall into this third category<br />

is that we are respecting a mythical symmetry between truth and falsity. We are<br />

falling into the trap of thinking that if a sentence is not somehow made false, it is not<br />

false. The true story is that if an assertoric sentence has content, and it is not made<br />

true, it is false. This provides the basis for Williamson’s ‘parasitic’ strategy: wait for<br />

the indeterminist to offer a theory of when sentences are true, accept that part of the<br />

indeterminist theory, and say all other sentences that express propositions are false. If<br />

the strategy works, then there is no way the indeterminist can meet the obligation to<br />

provide a theory of content without the epistemicist also being able to do so, so there<br />

is no argument for indeterminism here. (There are complications, to put it mildly,<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Australasian<br />

Journal of Philosophy 81 (2003): 276-9.



with this strategy when the indeterminist allows the border between the true and the<br />

indeterminate to be vague. Burgess lets these potential problems slide, and so shall I.)<br />

The strategy rests on the purported asymmetry between truth and falsity. Burgess<br />

claims that positing such an asymmetry makes epistemicism inconsistent. Consider<br />

a colour patch that is around the border between red and orange. Burgess claims,<br />

correctly, that an indeterminist theory of content may say that (1) and (2) are indeterminate,<br />

and hence Williamson might be committed to the position that (1) and (2)<br />

are false, and hence so is (3).<br />

(1) That patch is red.<br />

(2) That patch is orange.<br />

(3) That patch is either red or orange.<br />

But this is hopeless because “on the epistemicist view, there is a sharp boundary in<br />

the series between red and orange; every patch is either one or the other.” (Burgess,<br />

2001, 519) This last claim is false. According to epistemicism, there is a sharp boundary<br />

between red and not red, so the patch is either red or not red. But the epistemicist<br />

need not hold that if the patch is not red, then it is orange. It is consistent with epistemicism<br />

that there are colours strictly between red and orange, just as it is consistent<br />

with epistemicism that there are colours strictly between red and yellow, and just as<br />

it is consistent with epistemicism that there are colours strictly between red and blue.<br />

Hence it is possible that the colour of this patch is strictly between red and orange,<br />

and thus is neither red nor orange. So this line of reasoning does not work. Perhaps<br />

the argument can be easily fixed. According to the indeterminist, both (1) and (4) are<br />

indeterminate. Hence according to Williamson’s ‘parasitic’ theory of content, both<br />

(1) and (4) are false, so (5) is false.<br />

(1) That patch is red.<br />

(4) That patch is not red.<br />

(5) That patch is either red or not red.<br />

This is more like a problem, because Williamson certainly is committed to the truth<br />

of (5). However, it is easy to see how Williamson should respond. The theory of<br />

content sketched above (or more precisely, the strategy for converting indeterminist<br />

theories to determinist ones) was only meant to apply to simple sentences. A simple<br />

sentence is true iff the indeterminist says it is true. The truth value of compound<br />

sentences, like (4) and (5), is given by a standard Davidsonian theory of truth. Hence<br />

(1) is false and (4) is true, as required.<br />

The best way to resurrect Burgess’s argument is to shift our attention from vague<br />

predicates to vague names. Consider any mountain, say Kilimanjaro. It is vague just<br />

where the mountain starts, so it will be vague just which atoms constitute Kilimanjaro.<br />

Kilimanjaro is some fusion of atoms or other, but it is indeterminate just which<br />

one it is. Some of these fusions have different masses, and some have different shapes,<br />

so no sentence of the form of (6) or of (7) will be true according to the indeterminist.<br />

(6) Kilimanjaro has shape s.



(7) Kilimanjaro has mass m.<br />

Hence according to Williamson’s asymmetric theory of truth, any sentence of<br />

either of these forms is false. Note that this holds even if we restrict the application<br />

of his theory to simple sentences. Now let K be a set of fusions of atoms {f_1, f_2, . . . ,<br />

f_n} such that it is determinate that Kilimanjaro is one of these fusions. (Because of<br />

higher-order vagueness it may be impossible to find such a set that does not contain<br />

any fusion that is determinately not Kilimanjaro. That will not matter; all that we<br />

require is that Kilimanjaro is one of these fusions.) Let s_i be the shape of f_i and m_i its<br />

mass. Then for all i, (6.i) and (7.i) are false, as we just argued.<br />

(6.i) Kilimanjaro has shape s_i.<br />

(7.i) Kilimanjaro has mass m_i.<br />

Hence both (8) and (9) are false.<br />

(8) Kilimanjaro has shape s_1 or Kilimanjaro has shape s_2 or . . . or Kilimanjaro has<br />

shape s_n.<br />

(9) Kilimanjaro has mass m_1 or Kilimanjaro has mass m_2 or . . . or Kilimanjaro has<br />

mass m_n.<br />

And the epistemicist is committed to (8) and (9) being true. We may not be able to<br />

discover which disjunct is true, but that is no reason to think that the disjunction<br />

is not true. Burgess’s argument was that if we adopt Williamson’s advice for constructing<br />

a theory of content, we will misclassify sentences that express penumbral<br />

connections. He was basically right, but we need to use a different example to prove<br />

it.<br />
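The two-step evaluation just described (parasitic collapse for simple sentences, classical clauses for compounds) can be sketched in a few lines of Python. The verdict labels and the number of candidate shapes are my own illustrative choices, not anything in Burgess or Williamson:<br />

```python
def parasitic(indeterminist_verdict):
    """Collapse an indeterminist verdict on a simple sentence ('T', 'F' or
    'I') into a classical truth value: true iff the indeterminist says true."""
    return indeterminist_verdict == 'T'

# Each disjunct 'Kilimanjaro has shape s_i' is indeterminate on the
# indeterminist theory, so the parasitic conversion makes every disjunct false.
disjuncts = [parasitic('I') for _ in range(5)]  # say, five candidate shapes

# A classical (Davidsonian) clause for disjunction then makes the whole
# disjunction false, though the epistemicist is committed to its truth.
sentence_8 = any(disjuncts)
print(sentence_8)  # False
```

The same collapse leaves (5) true, since the negation in (4) is handled by the classical clause rather than by the parasitic rule; the problem arises only when every disjunct is itself a simple sentence.<br />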

I assumed above that Kilimanjaro is a fusion of atoms. Some may object to this<br />

on the grounds that Kilimanjaro has different temporal and modal properties to any<br />

fusion of atoms. I doubt such objections ultimately work, but for present purposes<br />

the important thing to note is that the argument can go through without such an<br />

assumption. Even if Kilimanjaro is not identical to any fusion in K, it is clear that<br />

Kilimanjaro (actually, now) exactly overlaps some member of K. And since Kilimanjaro<br />

has the same (actual, present) shape and mass as any fusion of atoms it exactly<br />

overlaps, it still follows that (8) and (9) are true.<br />

If we do assume that Kilimanjaro is one of the fusions, then we can generate<br />

another case where Williamson’s theory generates false predictions. Since at most<br />

one of the fusions is a mountain, it follows that (10.i) is indeterminate for all i on an<br />

indeterminist theory of content, and hence false according to Williamson.<br />

(10.i) f_i is a mountain.<br />

Hence his theory mistakenly predicts that (11) is false, when it is by hypothesis true.<br />

(11) f_1 is a mountain or f_2 is a mountain or . . . or f_n is a mountain.



This argument does rest on a contentious bit of metaphysics, but it still seems basically<br />

sound.<br />

I did not assume at any point that Kilimanjaro is a vague object. I did assume that<br />

‘Kilimanjaro’ is a vague name, but it is consistent with the argument I have presented<br />

that there are no vague objects, and the vagueness in ‘Kilimanjaro’ consists in it being<br />

indeterminate which precise object it denotes.<br />

As Burgess demonstrates, it is fair to require that the epistemicist provide a theory<br />

of how terms get the precise content they do. Williamson attempted to show<br />

he was in just as good a position to discharge this obligation as the indeterminist<br />

by providing an algorithm for converting any indeterminist theory of content into<br />

one acceptable to the epistemicist. Burgess argued that the algorithm produced unacceptable<br />

results when we applied it to vague sentences such as (1) and (2). This<br />

particular argument is no good; the algorithm does not seem to produce implausible<br />

results in that case. We can make this form of argument work, however, especially<br />

if we focus on vague names. Applying the algorithm to any plausible indeterminist<br />

theory produces the result that every disjunct in (8) and (9) are false, and hence that<br />

these disjunctions are false. Since the epistemicist is (correctly) committed to these<br />

sentences being true, Burgess was correct to conclude that “this particular attempt to<br />

implement the parasite strategy is doomed to failure.”


Begging the Question and Bayesians<br />

Abstract<br />

In a recent article Patrick Maher shows that the ‘depragmatised’ form of<br />

Dutch Book arguments for Bayesianism tend to beg the question against<br />

their most interesting anti-Bayesian opponents. I argue that the same<br />

criticism can be levelled at Maher’s own argument for Bayesianism.<br />

The arguments for Bayesianism in the literature fall into three broad categories. There<br />

are Dutch Book arguments, both of the traditional pragmatic variety and the modern<br />

‘depragmatised’ form. And there are arguments from the so-called ‘representation<br />

theorems’. The arguments have many similarities, for example they have a common<br />

conclusion, and they all derive epistemic constraints from considerations about coherent<br />

preferences, but they have enough differences to produce hostilities between<br />

their proponents. In a recent paper, Maher (1997) has argued that the pragmatised<br />

Dutch Book arguments are unsound and the depragmatised Dutch Book arguments<br />

question begging. He urges we instead use the representation theorem argument as in<br />

Maher (1993). In this paper I argue that Maher’s own argument is question-begging,<br />

though in a more subtle and interesting way than his Dutch Book wielding opponents.<br />

1 Bayesianism<br />

What’s a Bayesian? The term these days covers so many different positions that the<br />

only safe course is to strictly define what one means by the term. The alternative, as<br />

the discussion in Walley (1996) shows, is to have one of the least interesting semantic<br />

debates ever held. I define a Bayesian to be one who is committed to two theses,<br />

which I’ll call (B1) and (B2).<br />

(B1) Belief comes by degrees.<br />

(B2) It is a requirement of consistency that these degrees of belief, or credences, be<br />

consistent with the probability calculus.<br />

I should explain (B2) a little. Historically, Bayesians held that credences were, or at<br />

least ought be, reals in [0, 1], and the function Bel which takes any proposition into<br />

the agent’s credence in that proposition should be a probability function. Modern<br />

Bayesians, following Levi (1980) and Jeffrey (1983a), allow that credences can be imprecise.<br />

In this case the consistency requirement is that there be some precisification<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Studies<br />

in History and Philosophy of Science 30 (1999): 687-697.



of their credences which is a probability function. (B2) is deliberately ambiguous between<br />

the traditional and modern Bayesian positions, largely because nothing in this<br />

debate turns on this question. 1<br />

There are many other properties that could have been used to define Bayesians.<br />

For example, it could be suggested that it is a requirement of being a Bayesian that one<br />

think rules like (B2) are derived by analysing credences as dispositions to bet. This is<br />

suggested by Kaplan (1993, 320) to be the “fundamental Bayesian insight”. Or it could<br />

be argued that being a Bayesian requires an extra rule to the effect that credences<br />

are updated by conditionalisation. I haven’t included that for two reasons. First, I’m<br />

mostly interested in static constraints on credences, and secondly, some paradigm<br />

Bayesians like Levi (1980) and van Fraassen (1989) reject this rule in its fully general<br />

form. Finally it might be suggested that Bayesians aren’t those that believe certain<br />

principles like (B1) and (B2), but only those who think these principles provide the<br />

foundation for all philosophy of science. So perhaps my definition is a bit liberal.<br />

Now it is well known that not everyone’s a Bayesian. One group of non-Bayesians<br />

who have received too little attention from their Bayesian rivals are those who accept<br />

(B1) but not (B2). That is, theorists who agree there are such things as credences,<br />

and even that credences are important for philosophy of science, but not that they<br />

ought be constrained by the probability calculus. The most interesting example is<br />

the theory of evidence developed by Dempster (1967, 1968) and Shafer (1976).<br />

Dempster and Shafer, like many other theorists, think that when we have no<br />

evidence either for or against p, we should have low credences in both p and ¬p. In<br />

the limit case it is acceptable to have our credence in both p and in ¬p set at zero.<br />

Now this is in conflict with (B2), for it is a theorem of the probability calculus that<br />

Pr(p) + Pr(¬p) = 1, and so by (B2) it is a requirement of rationality that Bel(p) +<br />

Bel(¬p) = 1.<br />

This intuition about cases where the evidence is low is formalised in a rather<br />

neat theory. For simplicity I’ll say how the theory works when we are interested<br />

in finitely many propositions; Shafer shows the infinite case can be dealt with but<br />

it doesn’t raise any philosophically interesting differences. We are interested in n<br />

propositions, so the possibility space contains 2^n ‘worlds’. A proposition A can be<br />

identified in the usual ways with the set of worlds at which it is true. The Bayesian has<br />

us place a normalised measure on this possibility space, with our credence in A being<br />

the measure of the set A. In Dempster and Shafer’s theory we place a normalised<br />

measure, which they call a ‘mass function’ on the power set of the worlds, excluding<br />

the null set. Our credence in A is calculated as the measure of the set of sets which are<br />

subsets of A. So in the simplest case, where we are just interested in one proposition<br />

p, the mass function is defined on {{p}, {¬p}, {p, ¬p}}. Complete ignorance is represented<br />

by giving {p, ¬p} mass one, and the other sets mass zero. Hence both Bel(p)<br />

and Bel(¬p) are zero. On the other hand, Bel(p ∨ ¬p) will be one, as is Bel(C) for any<br />

classical tautology C. As a consequence of this we will not have the addition rule.<br />

1 Historically the impetus for allowing imprecise credences was the economic distinction between<br />

insurable and uninsurable risks, as discussed in Knight (1921), Keynes (1937b), Tintner (1941) and Hart<br />

(1942).



Addition: For disjoint A, B, Bel(A ∨ B) = Bel(A) + Bel(B)<br />
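A minimal Python sketch of the mass-function calculation may help; the frozenset encoding of worlds and the variable names are mine, but the Bel rule and the ignorance example are the ones just described:<br />

```python
def bel(mass, a):
    """Bel(A): total mass on the (non-empty) sets of worlds that are
    subsets of A, as in the Dempster-Shafer rule described above."""
    return sum(m for s, m in mass.items() if s <= a)

# One proposition p, so two 'worlds'.
p = frozenset({'w_p'})          # the world where p holds
not_p = frozenset({'w_not_p'})  # the world where p fails
tautology = p | not_p

# Complete ignorance: all mass on the whole space, none on the singletons.
mass = {p: 0.0, not_p: 0.0, tautology: 1.0}

print(bel(mass, p))          # 0.0
print(bel(mass, not_p))      # 0.0
print(bel(mass, tautology))  # 1.0, so Addition fails: 1.0 != 0.0 + 0.0
```

Concentrating the mass on the singletons instead recovers an ordinary probability function, for which Addition does hold.<br />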

Since Bayesians believe in Addition and some opponents do not, arguments for Bayesianism<br />

should be inter alia arguments for Addition. I don’t want to argue that Dempster<br />

and Shafer’s theory is right. It has some internal problems, particularly with<br />

updating, which make it look not too promising as a general theory of evidence. The<br />

recent collection edited by Yager et al. (1994) has papers dealing with many of these<br />

issues for the interested reader. My interest in this theory is merely to show the kind<br />

of theorist the Bayesian must argue against. This is particularly important when the<br />

alleged problem with arguments for Bayesianism is that they are question-begging.<br />

As a last point about the Dempster-Shafer theory it might be noted that not only<br />

does Addition fail for credences, the equivalent rule for valuing bets also fails. Put<br />

formally, let an A-bet be a bet which pays £1 if A and nothing otherwise. It is consistent<br />

with the Dempster-Shafer theory to say the value of an A-bet is always £Bel(A).<br />

So on this theory it is not the case that for disjoint A, B it is true that the value of an<br />

(A ∨ B)-bet always equals the value of an A-bet plus the value of a B-bet.<br />

Since it is sometimes thought there is an argument showing this to be incoherent,<br />

it is worthwhile giving a partial defence of its consistency. An argument like the<br />

following appears, as we’ll see, to be endorsed by Maher. There exists a voucher<br />

which is an (A ∨ B)-bet, a ticket which is an A-bet and a coupon which is a B-bet.<br />

Now anyone holding the ticket and the coupon will receive exactly the same payout<br />

in all circumstances as anyone holding the voucher, hence they must have the same<br />

value. Hence the value of the voucher is the value of the ticket plus the value of<br />

the coupon. The problem with this argument is that it assumes the ticket and the<br />

coupon are not what economists call complementary goods. Some goods, like say<br />

compact discs, have more value to a consumer if they hold certain other goods, like<br />

compact disc players. On the Dempster-Shafer theory, the ticket and the coupon<br />

may well be complementary goods. To anyone holding the ticket, the value of the<br />

coupon is the difference between the value of the voucher and the value of the ticket,<br />

that is, Bel(A ∨ B) - Bel(A). This will in general be greater than its ‘intrinsic’ value<br />

Bel(B). But this goes no way to showing that its value to someone without the ticket<br />

must be greater than Bel(B). This, in rough outline, is the objection Schick (1986)<br />

makes to Dutch Book arguments. So arguments for Addition which assume that<br />

A-bets and B-bets are not complements will beg the question against proponents of<br />

the Dempster-Shafer theory, who have a principled reason to reject that assumption.<br />
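The complementarity point is just arithmetic on the ignorance values from the example above (Bel(A) = Bel(B) = 0, Bel(A ∨ B) = 1); the variable names are mine:<br />

```python
# Illustrative Dempster-Shafer values for exclusive A and B under ignorance.
bel_a, bel_b, bel_a_or_b = 0.0, 0.0, 1.0

coupon_alone = bel_b                          # B-bet valued on its own
coupon_to_ticket_holder = bel_a_or_b - bel_a  # B-bet valued by an A-bet
                                              # holder: it completes the voucher

print(coupon_alone)             # 0.0
print(coupon_to_ticket_holder)  # 1.0: the ticket and coupon are complements
```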

2 Dutch Book Arguments<br />

As I mentioned above, there are three broad categories of arguments for Bayesianism,<br />

two breeds of Dutch Book arguments and ‘representation theorem’ style arguments.<br />

In this section I’ll briefly deal with the Dutch Book arguments before looking at<br />

Maher’s version of the representation theorem argument in section 3.<br />

The classic, pragmatic, Dutch Book argument assumes that appropriate circumstances<br />

exist whereby the amount an agent would be prepared to pay for any A-bet is<br />

£Bel(A). Indeed, they assume this not only is true, but that it would remain true while



the agent starts trading in bets. Given these assumptions, if the agent’s credences are<br />

not a probability function, a clever bookie who knows just their credences can sell<br />

them a ‘Dutch Book’ which is guaranteed to lose in all circumstances.<br />

Everyone’s got their favourite objection to this argument, so I won’t spend much<br />

time on it here. Maher (1993, 98) argues that the declining marginal utility of money<br />

means that this won’t work for pounds, and since there is no currency with a constant<br />

marginal utility this flaw can’t be resolved. This is a rather odd objection since<br />

Savage (1954) argued long ago that we could get around this problem by using bets<br />

denominated in lottery tickets. As mentioned above, Schick (1986) takes the possibility<br />

of complementary bets to be a crushing blow to the argument. My favourite<br />

objection turns on the distinction that Adam Smith famously drew attention to between<br />

the usefulness of a good and its market value. Bel(A) can determine, at most,<br />

the usefulness of an A-bet, so at disequilibrium a coherent agent should probably not<br />

trade A-bets for £Bel(A). And since there are Dutch Books to be sold we must be at<br />

disequilibrium, so the initial assumption about ‘appropriate circumstances’ must be<br />

false. In any case, there are enough different objections to this argument that we can<br />

safely move on.<br />

The depragmatised Dutch Book arguments, as in Howson and Urbach (1989),<br />

Christensen (1996) and Hellman (1997), do away with the prospect of a bookie actually<br />

milking the poor incoherent agent. Rather they use similar reasoning to show<br />

that there is something wrong with an agent whose credences are not probability<br />

functions. I will concentrate on Christensen’s argument, but similar comments apply<br />

to the other two arguments.<br />

Christensen does not believe that an agent whose credence in A is x should be<br />

prepared to buy an A-bet for £x. However he does say that the agent should “evaluate<br />

such [trades] as fair” (Christensen, 1996, 456). So credences may ‘sanction’ (his term)<br />

certain odds even if the agent does not desire to accept these sanctioned bets. This<br />

may come about because of the declining marginal utility of the currency in which<br />

the bets are denominated, or because of a dislike of gambling, or possibly because of<br />

the discrepancy I mentioned between use-value and exchange-value. Now, making the<br />

safe enough assumption that credences which sanction trades leading to sure loss are<br />

defective, he concludes that credences which do not satisfy the probability calculus are<br />

defective.<br />

However, as Maher (1997, 301-3) points out, the argument so far doesn’t get the<br />

conclusion Christensen wants. Indeed for some simple Shafer functions which are<br />

not probability functions no sure-loss trades will be sanctioned. To get Christensen’s<br />

conclusion, we need the extra premise that if two trades are sanctioned their sum<br />

is sanctioned. Equivalently, we need the premise that what bets are sanctioned is<br />

independent of what bets are already held. But given the definition of ‘sanction’ this<br />

just is the premise that credences must satisfy Addition. So the argument is question-begging<br />

against the writer who denies Addition. Maher shows that similar problems<br />

beset the arguments in Howson and Urbach (1989) and Hellman (1997).
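Maher’s point can be checked mechanically. In the sketch below (my own toy formalisation, not Maher’s) the vacuous belief function Bel(A) = Bel(¬A) = 0 is no probability function, yet no single trade it sanctions is a sure loss; only a sum of sanctioned trades loses in every world, which is exactly where the Addition premise does its work.<br />

```python
WORLDS = ["A", "not-A"]

def bet_payoff(prop, world):
    # A £1 bet on `prop` pays out just in case `prop` is true.
    return 1.0 if world == prop else 0.0

def trade_net(prop, price, direction, world):
    # Net gain in `world` from buying (direction=+1) or selling
    # (direction=-1) a £1 bet on `prop` at `price`.
    return direction * (bet_payoff(prop, world) - price)

def sure_loss(trades):
    # True iff the bundle of trades loses money in every world.
    return all(sum(trade_net(p, pr, d, w) for p, pr, d in trades) < 0
               for w in WORLDS)

bel = {"A": 0.0, "not-A": 0.0}  # the vacuous Shafer belief function
singles = [[(p, bel[p], d)] for p in WORLDS for d in (+1, -1)]

print(any(sure_loss(t) for t in singles))               # False: no single trade
print(sure_loss([("A", 0.0, -1), ("not-A", 0.0, -1)]))  # True: their sum
```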



3 Representation Theorems<br />

The alternative Maher supports is based around ‘representation theorems’. A similar<br />

approach is taken by Kaplan (1996), but I’ll focus on Maher. In any case, the issues<br />

that arise are exactly the same. The basic idea is that it is a requirement on the coherence<br />

of an agent’s preferences that there exist a probability function and a set of<br />

utility functions equivalent up to affine transformation such that the agent prefers<br />

gamble f to g iff the expected utility of f given the probability function and any of<br />

the utility functions is greater than that of g. The probability function will give us the<br />

agent’s credences in all propositions. Preferences are decreed to be coherent if they<br />

satisfy a number of axioms that Maher defends. For example, it is required that preferences<br />

be transitive, that an agent not prefer f to g and prefer g to f, and so on. The<br />

claim here is that the defence of these axioms begs the question against a supporter<br />

of the Dempster-Shafer approach.<br />
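The representation claim can be illustrated with a toy example (the states, outcomes and numbers here are stand-ins of mine, not Maher’s axiomatic construction): a preference ordering is represented by a pair (p, u) when it ranks gambles by expected utility.<br />

```python
def expected_utility(gamble, p, u):
    # `gamble` maps states to outcomes, `p` is a probability function
    # over states, and `u` a utility function over outcomes.
    return sum(p[s] * u[gamble[s]] for s in p)

# Illustrative stand-ins: two gambles over a two-state space.
p = {"rain": 0.3, "shine": 0.7}
u = {"umbrella": 1.0, "wet": -2.0, "dry": 0.5}
f = {"rain": "umbrella", "shine": "dry"}
g = {"rain": "wet", "shine": "dry"}

# The agent represented by (p, u) prefers f to g iff EU(f) > EU(g).
print(expected_utility(f, p, u) > expected_utility(g, p, u))  # True
```

The representation theorem runs in the other direction: given coherent preferences, it recovers a (p, u) pair of this shape, and p is then read off as the agent’s credence function.<br />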

Strictly Maher does not quite believe that all coherent credence functions are<br />

probability functions. The argument to that conclusion requires that preferences be<br />

complete, and Maher does not think this is plausible. If we drop that assumption we<br />

get the conclusion that the agent’s credences should be represented by sets of probability<br />

functions as in Levi and Jeffrey, not a single probability function. However<br />

for convenience he assumes first that completeness holds, and I’ll follow this lead.<br />

Nothing pertaining to the success or otherwise of the argument turns on this point.<br />

There are a few immediate problems with this approach. Maher needs to assume<br />

that if an agent has a higher credence in p than in q they will prefer a p-bet to a q-bet.<br />

The problem is that when it is unlikely that we will ever see p or ¬p confirmed we<br />

may well prefer a q-bet to a p-bet even if we have a higher degree of belief in p. I would<br />

prefer a bet on the Yankees winning the next World Series to a bet on Oswald being<br />

Kennedy’s assassin, even though I have a higher degree of belief in Oswald’s guilt<br />

than the Yankees’s success, because betting on the Yankees gives me some chance of<br />

getting a payout.<br />

Maher is aware of this point, but his attempt to dispose of it is disastrous. His<br />

example is comparing a bet on the truth of the theory of evolution, construed as the<br />

claim that all life on earth is descended from a few species, with betting on its negation<br />

(Maher, 1993, 89). Taking scientists as his expert function, he asks some biologists<br />

which of these bets they would prefer, on the assumption that there are extraterrestrials<br />

who have been observing earth from its formation and will adjudicate on the bet.<br />

He is rather happy that they all plump for betting on Darwin. But this is a perfectly<br />

useless result. The objection was that we can have degrees of belief on unverifiable<br />

propositions, but our attitudes to bets on these propositions will be quite different<br />

to our attitude towards bets on verifiable propositions. He has attempted to counter<br />

this by simply making the problematic proposition verifiable. When we drop the<br />

assumption that there are extraterrestrials, so the theory of evolution would become<br />

unverifiable, presumably most people would (strictly) prefer a bet on a fair coin landing<br />

heads to either a bet on the theory of evolution or its negation. Preferences in a



situation when the theory is verifiable are completely irrelevant to the problem unverifiable<br />

theories pose for Bayesian philosophies of science. So we have a problem,<br />

although I’d be prepared to accept for the sake of the argument it can be finessed.<br />

The major problem for Maher is that his argument is just as question-begging<br />

as the Dutch Book arguments he criticises, though in a more subtle and interesting<br />

way. For Maher’s argument to work we have to accept some constraints on preferences,<br />

such as transitivity. His argument is only as strong as the argument for these<br />

constraints. He has nine axioms which must be justified in some way.2 The most<br />

interesting is Independence, which he construes as follows. D is a set of gambles and<br />

X a set of propositions, f ≤ g means the agent either prefers g to f or is indifferent<br />

between them, f ≡ g means that f and g are exactly the same gamble, they have the<br />

same payouts in all possible worlds, and f ≡ g on A means that on all possible worlds<br />

in which A, f and g have the same payouts.<br />

Independence For all f, f ′ , g, g ′ ∈ D and A ∈ X, if f ≡ f ′ on A, g ≡ g ′ on A, f ≡ g<br />

on ¬A, f ′ ≡ g ′ on ¬A and f ≤ g then f ′ ≤ g ′ (Maher, 1993, 190)<br />

The idea is that if f and g have the same payouts given some proposition, say A,<br />

our preference between f and g should be independent of its value. All that matters,<br />

according to this idea, is the comparative fact that if A turns out true f and g have<br />

identical payouts, so whichever of the bets is preferred given ¬A should be preferred<br />

overall. The most famous examples where intuition says this may be violated are the<br />

Allais and Ellsberg ‘paradoxes’. Since uncertainty plays a larger role in it, I’ll briefly<br />

sketch the Ellsberg paradox. An urn contains 90 balls. Thirty of these are yellow,<br />

and the remainder are either black or red in unknown proportion. The payouts for the<br />

four gambles in question are given in this table.<br />

Red Black Yellow<br />

f £1 0 0<br />

g 0 0 £1<br />

f ′ £1 £1 0<br />

g ′ 0 £1 £1<br />

Many subjects prefer g to f, since they know the chance of a yellow ball being<br />

drawn but not that of a red ball, but prefer f ′ to g ′ since they know the chance of a<br />

red or black ball being drawn but not that of a black or yellow ball. This is easily<br />

justifiable under the Dempster-Shafer approach. Let B, R and Y be the propositions<br />

that a black, red and yellow ball respectively is drawn. Given the evidence about the<br />

composition of the urn, it seems plausible to set Bel(B) = Bel(R) = 0, Bel(Y) = Bel(Y∨B)<br />

= Bel(Y∨R) = 1/3 and Bel(B∨R) = 2/3. Bayesians say this is incoherent, but it is perfectly<br />

acceptable under a Dempster-Shafer theory. Given these credences, the value of f is<br />

£0, and the value of g is £1/3. However the value of f ′ is £2/3 while the value of g ′ is<br />

just £1/3. Hence it is not only acceptable, but arguably a requirement of rationality that an<br />

agent prefer g to f but f ′ to g ′ .<br />

2 There is a further axiom designed to guarantee countable additivity, but that raises independent problems and won’t be discussed here.<br />
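The Dempster-Shafer valuation just described can be computed directly. The mass assignment in this sketch encodes the urn: mass 1/3 on {Y} for the thirty known yellow balls, and mass 2/3 on {B, R} for the sixty balls of unknown colour.<br />

```python
# Mass function for the Ellsberg urn: 30 of 90 balls are known yellow,
# the other 60 are black or red in unknown proportion.
MASS = {frozenset({"Y"}): 1/3, frozenset({"B", "R"}): 2/3}

def bel(event):
    # Bel(E) is the total mass of the focal sets contained in E.
    event = frozenset(event)
    return sum(m for focal, m in MASS.items() if focal <= event)

# Each gamble pays £1 on its winning event and £0 otherwise, so its
# (lower) value is just £Bel(winning event).
gambles = {"f": {"R"}, "g": {"Y"}, "f'": {"R", "B"}, "g'": {"B", "Y"}}
values = {name: bel(event) for name, event in gambles.items()}
print(values)  # f: 0, g: 1/3, f': 2/3, g': 1/3
```

On these values g is strictly better than f, and f ′ strictly better than g ′ — precisely the pattern of preference that violates Independence.<br />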

With these preferences, and setting A to ‘A black ball is not drawn’ we can see<br />

this violates Maher’s independence axiom. No objection yet; many people are just<br />

irrational. The real problem arises with Maher’s argument that people who choose<br />

in this way are irrational. The following two choice trees set out two ‘tree form’<br />

versions of the choices facing these subjects.<br />

The left-hand tree represents the choice between f and g. The subject is told<br />

that if a black ball is drawn they will receive nothing, but if it is not drawn they<br />

will have a choice between betting on red and betting on yellow. So far we have a<br />

standard enough dynamic choice problem. Maher proposes to make it synchronic<br />

by requiring that subjects specify in advance what they would do if they reached the<br />

square, that is if a black ball is not drawn. This, he claims, makes the situation exactly<br />

as if the agent was choosing between f and g. Now the right-hand tree is the same as<br />

the left-hand tree in all respects but one. If a black ball is drawn the agent receives<br />

£1, not nothing. But the only choice the agent has to make is exactly the same as in<br />

the left-hand tree, so they ought to make the same choice. We can concede to Maher<br />

here that it would be irrational to specify, in advance, a preference for g over f in the<br />

left-hand tree and for f ′ over g ′ in the right-hand tree. This is, however, insufficient<br />

for his conclusion.<br />

The problem lies in his assumption that “it seems uncontroversial that the consequences<br />

a person values are not changed by representing the options in a tabular or<br />

tree form” (Maher, 1993, 71). As Seidenfeld (1994) makes clear, this is exactly what<br />

is controversial in these circumstances. Indeed this premise, call it Reduction, is expressly<br />

denied by a number of heterodox decision theorists, and by writers who deny<br />

Addition on the occasions they talk about decision theory. There is a good reason for<br />

this. As noted above, on the Dempster-Shafer theory, Bel(B∨R) may be greater than<br />

Bel(B)+Bel(R). When evaluating the worth of choosing f ′ in the original, tabular, form it<br />

seems plausible that it is Bel(B∨R) that matters, not Bel(B)+Bel(R). However in the<br />

tree form problem all that matters to f ′ is Bel(B), for the possibility that we won’t<br />

need to choose, and Bel(R), for the possibility that we do.
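The discrepancy can be put numerically (the decomposition of the tree-form value is my gloss on the argument, not Maher’s own formalism): the tabular value of f ′ goes by Bel(B∨R), while the tree form can only combine Bel(B) and Bel(R), and the two agree only when Addition holds.<br />

```python
# Dempster-Shafer credences from the urn example.
bel_B, bel_R, bel_B_or_R = 0.0, 0.0, 2/3

# Tabular form: f' pays £1 iff a black or red ball is drawn, so its
# (lower) value goes by Bel(B or R).
value_tabular = bel_B_or_R * 1.0

# Tree form: the black branch and the later red-bet choice are valued
# separately, so only Bel(B) and Bel(R) can figure in the calculation.
value_tree = bel_B * 1.0 + bel_R * 1.0

print(value_tabular, value_tree)  # the two forms disagree
```

Equating the two valuations in general just is the assumption that Bel(B∨R) = Bel(B)+Bel(R), which is why Reduction cannot be an innocent premise here.<br />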



The point is that Maher has to either assume agents only consider Bel(B) and<br />

Bel(R) when assessing f ′ , not Bel(B∨R), or that Bel(B∨R) is some function of Bel(B)<br />

and Bel(R), so that we can ignore that complication in his ‘uncontroversial’ assumption.<br />

The first option is implausible: surely when comparing f ′ and g ′ we just compare<br />

Bel(B∨R) with Bel(B∨Y ). More interestingly, I claim that the second is question-begging.<br />

Given that virtually everyone agrees that in some cases, for example lotteries,<br />

degrees of belief should be probability functions, in some cases the function<br />

which gives us Bel(B∨R) from Bel(B) and Bel(R) must be addition. Hence he must<br />

assume that Bel(B∨R) = Bel(B)+Bel(R) for the move from tabular to tree form to be<br />

plausible. But this is just what he was trying to prove, so the argument is question-begging.<br />

4 Conclusion<br />

Maher rightly objects to depragmatised Dutch Book arguments on the ground that<br />

they are question-begging. That is, they use their conclusion as an implicit premise. It<br />

is argued here that the same objection applies to Maher’s argument for Bayesianism.<br />

He relies on the reducibility of tree form decisions to table form decisions, but the<br />

only justification for this could be a reliance on Addition. But Addition was what he<br />

was trying to prove all along, so he isn’t allowed to take Reduction as a premise.<br />

There are three moves that Maher could make here. First, he could say that<br />

Reduction is so obvious that it should be acceptable as a premise without justification.<br />

The resulting argument may be effective at convincing some agnostics about<br />

Bayesianism that their implicit assumptions all along were Bayesian, but it would be<br />

completely ineffective against the sceptics about Bayesianism I have been discussing.<br />

Secondly, he could come up with a new argument for Reduction that I haven’t considered<br />

here and isn’t vulnerable to this objection. Given the conclusions of the last<br />

section I doubt this is possible, but the ingenuity of philosophers shouldn’t be underestimated.<br />

Thirdly, and most interestingly, he could look for justifications of Bayesianism<br />

that do not rely on construing credences as dispositions to bet. Since the<br />

arguments from considerations about preferences to constraints on credences have so<br />

far all failed, the time might be right to look at the problem from a different direction.


Prankster’s Ethics<br />

Andy Egan, Brian Weatherson<br />

1 A Quick Argument for Boorishness<br />

Diversity is a good thing. Some of its value is instrumental. Having people around<br />

with diverse beliefs, or customs, or tastes, can expand our horizons and potentially<br />

raise to salience some true beliefs, useful customs or apt tastes. Even diversity<br />

of error can be useful. Seeing other people fall away from the true and the useful<br />

in distinctive ways can immunise us against similar errors. And there are a variety of<br />

pleasant interactions, not least philosophical exchange, that wouldn’t be possible unless<br />

some kinds of diversity existed. Diversity may also have intrinsic value. It may<br />

be that a society with diverse views, customs and tastes is simply thereby a better<br />

society. But we will mostly focus on diversity’s instrumental value here.<br />

We think that what is true of these common types of diversity is also true of moral<br />

diversity. By moral diversity we mean not only diversity of moral views, though that<br />

is no doubt valuable, but diversity of moral behaviour. In a morally diverse society,<br />

at least some people will not conform as tightly to moral norms as others. In short,<br />

there will be some wrongdoers. To be sure, moral diversity has some costs, and<br />

too much of it is undoubtedly a bad thing. Having rapists and murderers adds to<br />

moral diversity (assuming, as we do, that most people are basically moral) but not in<br />

a way that is particularly valuable. Still, smaller amounts of moral diversity may be<br />

valuable, all things considered. It seems particularly clear that moral diversity within<br />

a subgroup has value, but sometimes society as a whole is better off for being morally<br />

diverse. Let us consider some examples.<br />

Many violations of etiquette are not moral transgressions. Eating asparagus spears<br />

with one’s fork is not sinful, just poor form. But more extreme violations may be<br />

sinful. Hurtful use of racial epithets, for example, is clearly immoral as well as a<br />

breach of etiquette. Even use of language that causes not hurt, but strong discomfort,<br />

may be morally wrong. Someone who uses an offensive term in polite company,<br />

say at a dinner party or in a professional philosophical forum, may be doing the<br />

wrong thing. But having the wrongdoer around may have valuable consequences.<br />

For example, they generate stories that can be told, to great amusement, at subsequent<br />

dinner parties. They also prompt us to reconsider the basis for the standards we<br />

ourselves adopt in such matters. The reconsideration may cause us to abandon useless<br />

practices, and it may reinforce useful practices. These benefits seem to outweigh the<br />

disutility of the discomfort felt by those in attendance when the fateful word drops<br />

from the speaker’s lips. These side benefits do not make the original action morally<br />

permissible. Indeed, it is precisely because the action is not morally permissible that<br />

the benefits accrue.<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical Perspectives 18 (2004): 45-52.



While we think that case is one of valuable moral diversity, some may question<br />

the immorality of the act in question. So let us try a more clearly immoral case: the<br />

mostly harmless prankster. Sam is a pie-thrower. Sam doesn’t just throw pies at the<br />

rich and infamous. No, Sam’s pies land on common folk like you and me, often for<br />

no reason beyond Sam’s amusement. Causing gratuitous harm for one’s own amusement<br />

is immoral. And a pie in the face, while better than a poke in the eye with a<br />

burnt stick, is harmful. But it may, in some circumstances, have side benefits. There<br />

will be the (guilty) pleasure occasioned in the unharmed bystanders, though it would<br />

be wrong to put too much weight on that. Other more significant benefits may accrue<br />

if Sam’s society is otherwise saintly. Sam’s existence will prompt people to take<br />

some simple, and worthwhile, precautions against perpetrators of such attacks. Even<br />

if society currently contains no malfeasants, such precautions will be useful against<br />

future wrongdoers. This benefit will increase if Sam graduates from pie-throwing to<br />

more varied pranks. (As may the entertainment value of Sam’s pranks.) Many computer<br />

hackers perform just this function in the present world. Malicious hackers on<br />

the whole cause more harm than good. But other hackers, who hack without gratuitously<br />

harming, provide a protective benefit by informing us of our weaknesses.<br />

These are the pie-throwers of the virtual world. Sam’s actions have other benefits. If<br />

Sam’s pranks are harmless enough, some will mistakenly think that they are morally<br />

acceptable, and we can have enjoyable, valuable, philosophical discussions with them.<br />

(Note that this benefit also increases if Sam varies the pranks.) The upshot is that<br />

Sam’s pranks can make the world a better place, all things considered, despite being<br />

immoral. Indeed, in some ways they make the world a better place because they are<br />

immoral.<br />

The philosophical point, or points, here may be familiar. One point certainly is<br />

familiar: we have here an example of a Moorean organic unity. The goodness of the<br />

whole is no simple function of the goodness of the parts. It might be thought that<br />

this follows simply from the familiar counterexamples to utilitarianism, and that our<br />

examples have no more philosophical interest than those old counterexamples. Both<br />

of these thoughts would be mistaken.<br />

The familiar counterexamples we have in mind include, for example, the case of<br />

the doctor who kills a healthy patient to harvest her organs, or the judge who executes<br />

an innocent man to prevent a riot. Importantly, those examples do not refute<br />

consequentialism in general, but only a version of consequentialism that adopts a<br />

particular kind of reductive analysis of the good. The details of the analysis won’t<br />

matter here, but it may be an analysis of goodness in terms of happiness, or preference<br />

satisfaction. If we give up the reductive analysis of goodness, we can say that the<br />

doctor and the judge do not make for a better society. A familiar heuristic supports<br />

that claim. (We take no stand here on whether this heuristic can be turned into an<br />

analysis.) Behind the Rawlsian veil of ignorance, we would prefer that there not be<br />

such doctors or judges in society. We think that most of us would agree, even in full<br />

appreciation of the possibility that we will be saved by the doctor, or possibly the<br />

judge. On the other hand, we think we’d prefer a society with the occasional boorish<br />

dinner guest, or a rare pie-thrower, to a society of moral saints. We say this in<br />

full appreciation of the possibility that we may get a pie in the face for our troubles.



Possibly if we knew we would be the pie-throwee we would change our minds, but<br />

fortunately pies cannot penetrate the veil of ignorance.<br />

Although it isn’t much discussed in the literature, we think this form of consequentialism<br />

is interesting for several reasons beyond its capacity to avoid counterexamples.<br />

For one thing, it is not easy to say whether this counts as an agent-neutral<br />

ethical theory. On the one hand, we can say what everyone should do in neutral<br />

terms: for each person it is better if they do things that create a better world from<br />

the perspective of those behind the veil of ignorance. On the other hand this rule<br />

leads to obligations on agents that do not seem at all neutral. From behind the veil of<br />

ignorance we’d prefer that parents love their children and hence privilege their interests,<br />

and that they love them because they are their children not because this creates<br />

a better world, so parents end up with a special obligation to their children. Having<br />

this much (or more importantly this little) neutrality in a moral theory sounds quite<br />

plausible to us, and although we won’t develop the point here there is possibly an<br />

attractive answer to the ‘nearest and dearest’ objection to consequentialism (Jackson,<br />

1991). More generally, because we have preferences from behind the veil of ignorance<br />

about why people act and not just about how they act – we prefer for instance that<br />

people visit sick friends in hospital because they are friends not because of an abstract<br />

sense of duty – this form of consequentialism is not particularly vulnerable to<br />

objections that claim consequentialists pay too little attention to motives.<br />

So we think a consequentialist can avoid the standard objections to utilitarianism<br />

by being less ambitious and not trying to provide a reductive analysis of goodness.<br />

The most natural retreat is to behind the veil of ignorance, but our examples can<br />

reach even there. This is far from the only interesting consequence of the examples.<br />

2 The Good, the Right, and the Saintly<br />

We think that the cases of the curser and the pie-thrower are examples of situations<br />

in which (a) an agent ought not to ϕ, and (b) it’s best that the agent does ϕ. Our<br />

judgements about the cases are not based on any theoretical analysis of the right and<br />

the good. They’re simply intuitions about cases—it just seems to us that the right<br />

thing to say about the pie thrower is that she ought not to do what she does, but<br />

that it’s still best if she does it. To the extent that these intuitions are puzzling or<br />

theoretically problematic (and we think that they are at least a little bit puzzling, and<br />

at least potentially problematic), it’s open to us to reject one or the other intuition<br />

about the cases, and either deny that the curser and the pie thrower ought not to<br />

curse or throw pies, or deny that it’s best that they do curse and throw pies. This is<br />

an option, but we think it’s not a very attractive one. Suppose that instead we take<br />

the intuitions at face value, and accept our judgements about the cases. What follows?<br />

Our analysis of the examples is incompatible with two attractive views about the<br />

connection between goodness (that is, the property of things—in particular worlds—in<br />

virtue of which some of them stand in the better than relation to others) and rightness,<br />

and between goodness and good character:<br />

(1) It’s better if everyone does what’s right.



(2) It’s better if everyone has good character. 1<br />

Now, neither of these will do as a philosophical thesis. But it’s probably not worth<br />

spending the time and effort on patching them up, since even the patched-up versions<br />

will be false.<br />

If the pie-thrower ought not to throw her pies, but it’s nonetheless best that she<br />

does, no patched-up version of (1) that captures the intuition behind it can be right.<br />

Any patched-up version of (1) will still be claiming that there’s a very tight connection<br />

between what it would be right for us to do (what we ought to do) and what it<br />

would be best for us to do. Any plausible elaboration on (1) will include a commitment<br />

to the thesis that, if we ought not to do something, then it’s best if we don’t do<br />

it. But if our analyses of the cases of the curser and the pie-thrower are right, then<br />

these are counterexamples.<br />

What about (2)? Well, it’s not better if the cursing dinner guest has good character.<br />

What happens if we suppose that the curser does have good character? One of<br />

two things: (i) He’ll no longer curse at dinner parties, and we’ll lose the benefits that<br />

come from his cursing. This would be bad. (ii) He’ll still curse at dinner parties, but<br />

he’ll be cursing in a studied way. He’ll be cursing because he’s seen that things will be<br />

better if somebody uses foul language in inappropriate circumstances, and he’s taken it<br />

upon himself to fill the unfilled functional role. This would also be bad. This sort of<br />

studied bad conduct doesn’t have the same value as bad conduct that springs from bad<br />

character. Here is some evidence for this: We value the curser’s breaching of societal<br />

norms, even though he ought not to do it. Were we to find out that every expletive<br />

had been studied, produced either to produce these important social goods, or to create<br />

a familiar bad-boy image, we would stop valuing his breachings of the moral order.<br />

They would, instead, become merely tiresome and annoying. Since we value spontaneous<br />

cursings which are products of less-than-optimal character, but we do not<br />

value studied cursings which are products of exemplary character, it’s very plausible<br />

to conclude (though admittedly not quite mandatory) that the spontaneous curses are<br />

much more valuable than the studied ones. We’re inclined to say, in fact, that while<br />

having a few spontaneous cursers around makes things better, having studied cursers<br />

around makes things worse. Since you have to have less-than-perfect character in order<br />

to be a spontaneous curser, it follows that you can’t get the benefits of having<br />

cursers around without having some people with less-than-perfect character around.<br />

And since it’s better to have the cursers than not, it’s better to have some people with<br />

less-than-perfect character around than not. This will be incompatible with almost<br />

any plausible way of cashing out (2). 2<br />

1 (2) is quite a natural position to hold if one is trying to capture the insights of virtue ethics in a<br />

consequentialist framework, as in Driver (2001) or Hurka (2001). But if we take ‘better’ in a more neutral<br />

way, so (2) does not mean that there are better consequences if everyone has good character, but simply<br />

that the world is a better place if this is so, even if this has few consequences, or even negative consequences,<br />

then it will be a position common to most virtue ethicists.<br />

2 Specifically, it will be incompatible with any maximizing version of (2). There are ‘threshold’ versions<br />

of (2) that don’t fall afoul of this kind of problem because they don’t claim it would be best for everyone<br />

to have perfect character, but only that it would be best for everyone to have pretty good character, or at<br />

least for nobody to have really bad character.



3 A Problem about Quantifier Scope?<br />

But isn’t there a sense in which (for example) the pie-thrower ought to throw his<br />

pies? After all, if nobody was throwing pies, we might think to ourselves, “gosh,<br />

it would be better if there were a few—not many, but a few—pie throwers around”.<br />

Then it would be natural to conclude, “somebody ought to start throwing pies at<br />

strangers”. And then it would be natural to infer that at least the first person to start<br />

throwing pies at strangers would be doing what they ought. It would be natural, but<br />

it would be wrong. The plausible reading of “someone ought to start throwing pies<br />

at strangers” is, “it ought to be that somebody starts throwing pies at strangers”, not,<br />

“there’s somebody out there such that they ought to start throwing pies at strangers”.<br />

So we haven’t gotten anybody a moral license to throw pies yet. And in fact it’s very<br />

plausible that we ought to understand assertions that it ought to be that P as claiming<br />

that it would be better if it were the case that P; that is, as making claims about what<br />

would be good, not about what would be right.<br />

There’s a puzzle about what to make of cases where we’re inclined to say that<br />

it ought to be that somebody ϕs—that is, that somebody ought to ϕ; but also that<br />

there’s nobody such that they ought to ϕ—in fact, that everybody is such that they<br />

ought not to ϕ. 3 Maybe the fact that our intuitions about the examples give rise<br />

to these kinds of puzzling cases is evidence that one or the other of our intuitions<br />

ought to be rejected. The move we suggested above is that the reason this seems so<br />

puzzling is that we’ve been punning on “ought”. The “ought” in “somebody ought to<br />

start throwing pies” doesn’t have anything much to do with what moral obligations<br />

anybody has—doesn’t have anything much to do with what’s right—but has a great<br />

deal to do with what’s good. And if that’s the case, then all we have is more evidence<br />

against the tight connection between the right and the good: it would be better if<br />

somebody started throwing pies, but everybody has a moral obligation not to. So it<br />

would be better if somebody did what they oughtn’t.<br />

4 Value, Desire and Advice<br />

Although the “ought” in “somebody ought to throw pies” has little to do with what’s<br />

right, it might have a lot to do with what we find desirable. And this will cause problems<br />

for some familiar meta-ethical theories. Quite naturally, Jack does not desire to<br />

throw pies at strangers for amusement in the actual world. Jack’s a very civic minded<br />

fellow in that respect. In fact, his concern for others goes deeper than that. He’d<br />

be quite prepared to risk his body for the sake of his fellow citizens. As it turns<br />

out, he’s been a volunteer fire fighter for years now. And Jack likes to think that if<br />

need be, he would be prepared, to use an old-fashioned phrase, to risk his soul for<br />

the community. He hopes he would be morally depraved if what the society needed<br />

was depravity. Jack agrees with the discussion of character in section 2, so he hopes<br />

3 It’s actually the second part that makes it puzzling. Compare the familiar and unproblematic situation<br />

in which we ought to give you a horse, but there’s no horse such that we ought to give you that one, and<br />

the more troubling situation in which we ought to give you a horse, but every horse is such that we ought<br />

not to give you that one.


Prankster’s Ethics 691<br />

that when society needs a pie-thrower, he will step up with the plate, and do so directly<br />

because he wants to throw pies at innocent bystanders. Letting C stand for the<br />

circumstances described above, where it would be good for there to be more wrongdoing,<br />

Jack’s position can be summarised by saying that he desires that in C he desires<br />

that he throws pies at innocents.<br />

Does this all mean Jack values his throwing pies at innocents in C? Not necessarily.<br />

Does it mean that if we were all like Jack, and we are subjectivists about what is<br />

right, it would be right to throw pies at innocents in C? Definitely not. David Lewis<br />

(1989b) equates what we value with what we desire to desire. 4 And he equates what<br />

is valuable with what we value. The text is not transparent, but it seems Lewis wants<br />

‘valuable’ to subsume both what we call the ‘right’ and the ‘good’. And this he cannot<br />

have. Assume that everyone in Jack’s community desires to (de se) desire that (s)he<br />

throw pies at innocents in C. That does not make it right that pies are thrown at<br />

innocents. We take no stand here on whether the flaw is in the equation of personal<br />

value with second-order desire, or in the reduction of both rightness and goodness to<br />

personal value, but there is a problem for Lewis’s dispositional theory of value. 5<br />

This point generalises to cause difficulties for several dispositional theories of<br />

value. For example, Michael Smith (1994) holds that right actions are what our perfectly<br />

rational selves would advise us to do. This assumes that when the good and<br />

the right come apart, our perfectly rational selves would choose the right over the<br />

good. And it’s far from clear that Smith has the resources to argue for this assumption.<br />

Smith’s argument that our perfectly rational selves will advise us to do what is<br />

right relies on his earlier argument that anyone who does not do what she judges to<br />

be right is practically irrational, unlike, presumably, our perfectly rational selves. And<br />

the main argument for that principle is that it is the best explanation of why actually<br />

good people are motivated to do what they judge to be right, even when they change<br />

their judgements about what is right. But now we should be able to see that there’s<br />

an alternative explanation available. Actually good people might be motivated to do<br />

what they judge to be good rather than right. We have seen no reason to believe that<br />

the right and the good actually come radically apart, so this is just as good an explanation<br />

of the behaviour of actual moral agents as Smith’s explanation. So for all Smith<br />

has argued, one might judge ϕing to be right, also judge it not to be good, hence be<br />

not motivated to ϕ, and not be practically irrational. Indeed, our perfectly rational<br />

4 More precisely, with what we desire to desire in circumstances of appropriate imaginative acquaintance.<br />

We can suppose that Jack, and everyone else under discussion in this paragraph, is suitably imaginatively<br />

acquainted with the salient situations. Jack knows full well what it is like to get a pie in the<br />

face.<br />

5 Someone might think it obvious that Lewisian value can’t be used in an analysis of both rightness and<br />

goodness, since it is one concept and we are analysing two concepts. But Lewisian value bifurcates in a<br />

way that one might think makes it suitable for analysing both rightness and goodness. Since there are both<br />

de dicto and de se desires, one can easily draw out both de dicto and de se values. And it is prima facie<br />

plausible that the de dicto values correspond to what is good, and the de se values to what is right. Indeed,<br />

given a weak version of consequentialism where these two can be guaranteed to not directly conflict, this<br />

correspondence may well hold. But we think the pie-thrower threatens even those consequentialists. The<br />

net philosophical conclusion is that the pie-thrower is a problem for Lewis’s meta-ethics, but only because<br />

(a) she is a problem for Lewis’s consequentialism, and, surprisingly, (b) Lewis’s meta-ethics depends on his<br />

consequentialism being at least roughly right.



self might be just like this. 6 Hence we cannot rely on our perfectly rational self to be<br />

a barometer of what is right, as opposed to what is good.<br />

6 We have glossed over a technical point here that is irrelevant to the current discussion. What matters<br />

is not whether our perfectly rational selves are motivated to ϕ, it matters whether they desire that we ϕ,<br />

and hence whether they are motivated to advise us to ϕ. Keeping this point clear matters for all sorts of<br />

purposes, but not, we think, for the present one.


Are You a Sim?<br />

Abstract<br />

Nick Bostrom (2003) argues that if we accept some plausible assumptions<br />

about how the future will unfold, we should believe we are probably<br />

not humans. The argument appeals crucially to an indifference principle<br />

whose precise content is a little unclear. I set out five possible interpretations<br />

of the principle, none of which can be used to support Bostrom’s<br />

argument. On the first two interpretations the principle is false, on the<br />

next two it does not entail the conclusion, and on the fifth it only entails<br />

the conclusion given an auxiliary hypothesis that we have no reason to<br />

believe.<br />

In Will Wright’s delightful game The Sims, the player controls a neighbourhood full<br />

of people, affectionately called sims. The game has no scoring system, or winning<br />

conditions. It just allows players to create, and to some extent participate in, an<br />

interesting mini-world. Right now the sims have fairly primitive psychologies, but<br />

we can imagine this will be improved as the game evolves. The game is very popular<br />

now, and it seems plausible that it, and the inevitable imitators, will become even<br />

more popular as its psychological engine becomes more realistic. Since each human<br />

player creates a neighbourhood with many, many sims in it, in time the number of<br />

sims in the world will vastly outstrip the number of humans.<br />

Let’s assume that as the sims become more and more complex, they will eventually<br />

acquire conscious states much like yours or mine. I do not want to argue for or<br />

against this assumption, but it seems plausible enough for discussion purposes. I’ll reserve<br />

the term Sim, with a capital S, for a sim that is conscious. By similar reasoning<br />

to the above, it seems in time the number of Sims in the world will far outstrip the<br />

number of humans, unless humanity either (a) stops existing, or (b) runs into unexpected<br />

barriers to computing power or (c) loses interest in these kinds of simulators.<br />

I think none of these is likely, so I think that over time the ratio of Sims to humans<br />

will far exceed 1:1.<br />

Nick Bostrom (2003) argues that given all that, we should believe that we are<br />

probably Sims. Roughly, the argument is that we know that most agents with conscious<br />

states somewhat like ours are Sims. And we don’t have any specific evidence<br />

that tells on whether we are a Sim or a human. So the credence we each assign to I’m<br />

a Sim should equal our best guess as to the percentage of human-like agents that are<br />

Sims, which is far above 1/2. As Glenn Reynolds put it, “Is it live, or is it Memorex?<br />

Statistically, it’s probably Memorex. Er, and so are you, actually.” 1 (Is it worrying<br />

that we used the assumption that we are human to generate this statistical argument?<br />

Not necessarily; if we are Sims then the Sims:humans ratio is probably even higher,<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Philosophical<br />

Quarterly 53 (2003): 425-31.<br />

1 Link. Reynolds’s comment wasn’t directly about Bostrom, but it bore the ancestral of the ‘refers to’ relation to Bostrom’s paper.


Are You a Sim? 694<br />

so what we know is a lower bound on the proportion of human-like agents that are<br />

Sims.) Less roughly, the argument appeals crucially to the following principle:<br />

(#) Cr(Sim | f_Sim = x) = x<br />

Here Cr is a rational credence function. I will adopt David Lewis’s theory of de<br />

se belief, and assume that the credence function is defined over properties, rather<br />

than propositions (Lewis, 1979a). Whenever I use a term that normally stands for<br />

a proposition inside the scope of Cr, it stands for the property of being in a world<br />

where that proposition is true. So f_Sim = x stands for the property of being in a world<br />

where 100x% of the human-like agents are Sims.<br />

As Bostrom notes, the main reason for believing (#) is that it is an instance of a<br />

plausible general principle, which I’ll call (##).<br />

(##) ∀Φ: Cr(Φ | f_Φ = x) = x<br />

Bostrom does not formulate this more general principle, but it is clear that he intends<br />

something like it to be behind his argument, for many of the defences of (#) involve<br />

substituting some other property in place of Sim in statements like (#). So I will<br />

focus here on whether anything like (##) is plausibly true, and whether it supports<br />

(#). There are many ways we could interpret (##), depending on whether we take<br />

Cr to be a rational agent’s current credences, or in some sense the prior credences<br />

before they are affected by some particular evidence, and on whether we take the<br />

quantifier to be restricted or unrestricted. Five particular interpretations stand out as<br />

being worth considering. None of these, however, provides much reason to believe<br />

(#), at least on the reading Bostrom wants to give it. On that reading of (#), the credence<br />

function represents the current credences of an agent much like you or me. If (#) isn’t<br />

interpreted that way, it can’t play the dialectical role Bostrom wants it to play. On<br />

two of the interpretations, (##) is false, on two others it may be true but clearly does<br />

not entail (#), and on the fifth it only entails (#) if we make an auxiliary assumption<br />

which is far from obviously true.<br />

For ease of exposition, I will assume that Cr describes in some way the credences<br />

at some time of a particular rational human-like agent, Rat, who is much like you or<br />

me, except that she is perfectly rational.<br />

1 First Interpretation<br />

Cr in (##) measures Rat’s current credences, and the quantifier in (##) is unrestricted.<br />

On this interpretation, (##) is clearly false, as Bostrom notes. Rat may well know<br />

that the proportion of human-like agents that are like spaghetti westerns is rather<br />

low, while rationally being quite confident that she likes spaghetti westerns. For any<br />

property Φ where Rat has some particular information about whether she is one of<br />

the Φs or not, that information, and not general facts about the proportion of human-like<br />

agents that are Φ, can (indeed should) guide Rat’s credences. So those substitution<br />

instances of (##) are false.



2 Second Interpretation<br />

Just like the first interpretation, except that we restrict the quantifier range so that<br />

it only ranges over properties such that Rat does not know whether she possesses<br />

them. This interpretation seems to be hinted at by Bostrom when he says, “the<br />

bland indifference principle expressed by (#) prescribes indifference only between<br />

hypotheses about which observer you are, when you have no information about<br />

which of these observers you are.” Even given this restriction, (##) is still false, as the<br />

following example shows.<br />

Assume that Rat knows that f_Sim > 0.9, which Bostrom clearly takes to be consistent<br />

with rationality. And assume also that Rat, being a normal human-like agent,<br />

knows some fairly specific, and fairly distinctive facts about her conscious life. If<br />

Rat is anything like you or me, she will have experiences that she can be fairly sure<br />

are unique to her. Last night, for instance, while Rat was listening to Go-Betweens<br />

bootlegs, watching baseball, drinking beer, rocking in her rocking chair and thinking<br />

about Bostrom’s simulation argument, she stubbed her toe in a moderately, but not<br />

excessively, painful way. Few people will have done all these things at once, and none<br />

in quite that way. Let C be the property of ever having had an experience almost just<br />

like that. Rat knows she is a C. She is very confident, though not certain, that she is<br />

the only human-like C. Let a suman be the property of being C and human, or not-C<br />

and a Sim. For much of the paper we’re going to be concerned with the following<br />

two properties.<br />

x is a suman =df x is a human C or a Sim who is not a C.<br />

x is a him =df x is a Sim C or a human who is not a C.<br />

We are following Bostrom in assuming that Rat does not know whether she is a Sim<br />

so she does not know whether she is a suman. But given that almost no one is a C,<br />

it follows that f_suman ≈ f_Sim. Hence f_suman > 0.85, for if it is less than f_Sim, it is not<br />

much less. But if Cr(suman) > 0.85, and Cr(Sim) > 0.9, and Rat is coherent, it<br />

follows that Cr(C) < 0.25. But we assumed that Rat knew that she was a C, and<br />

however knowledge and credence are to be connected, it is inconceivable that one<br />

could know something while one’s credence in it is less than 1/4. Hence it must be<br />

false that Cr(C) < 1/4, but we inferred that from given facts about the story and (##),<br />

as interpreted here. Hence (##), as interpreted here, is false.<br />
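The probabilistic step here is an instance of the general coherence inequality Cr(A ∧ B) ≥ Cr(A) + Cr(B) − 1, applied with A = Sim and B = suman (whose conjunction is Sim ∧ ¬C). A minimal numerical sketch of the bound, my own illustration rather than anything in the paper:<br />

```python
# Coherence bound behind the Second Interpretation argument.
# suman = (human and C) or (Sim and not-C), so Sim & suman = Sim & not-C.
# By Cr(A & B) >= Cr(A) + Cr(B) - 1, Cr(Sim & not-C) >= Cr(Sim) + Cr(suman) - 1,
# and since Cr(not-C) >= Cr(Sim & not-C), this yields an upper bound on Cr(C).

def max_cr_c(cr_sim, cr_suman):
    """Upper bound on Cr(C) for any coherent credence function."""
    lower_bound_not_c = cr_sim + cr_suman - 1  # Cr(not-C) is at least this
    return 1 - lower_bound_not_c

# With the bounds in the text, Cr(C) must be below 0.25.
print(round(max_cr_c(0.9, 0.85), 10))  # 0.25
```

So the higher Rat's credences in Sim and suman, the lower her coherent credence in C can be, which is what generates the conflict with her knowing that she is a C.<br />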

3 Third Interpretation<br />

One natural response to the previous objection is that there should be some way of<br />

restricting (##) so that it does not apply to properties like being a suman. Intuitively,<br />

the response is that even though Rat doesn’t know whether she is a suman, she knows<br />

something that is relevant to whether she is a suman, namely that she is a C. The<br />

problem with this response is that any formal restriction on (##) that implements<br />

this intuition ends up giving us a version so weak that it doesn’t entail (#).<br />

The idea is that what went wrong in the previous case is that even though Rat<br />

does not know whether she is a suman, she knows something relevant to this. In



particular, she knows that if she is a suman, she is one of the sumans that is human,<br />

rather than one of the ones that is a Sim. Our third interpretation avoids the difficulties<br />

this raises by restricting the quantifier in (##) even further. Say that a property Φ<br />

is in the domain of the quantifier iff (a) Rat does not know whether she is Φ, and (b)<br />

there is no more specific property Φ′ such that Rat knows that if she is Φ, then she is<br />

Φ′. 2 This will rule out the applicability of (##) to properties like being a suman. Unfortunately,<br />

it will also rule out the applicability of (##) to properties like being a Sim. For<br />

Rat knows that if she is a Sim, then she is a Sim that is also a C. So now (##) doesn’t<br />

entail (#).<br />

This kind of problem will arise for any attempt to put a purely formal restriction<br />

on (##). The problem is that, as Goodman noted in a quite different context (Goodman,<br />

1955), there is no formal distinction between the ‘normal’ properties, being<br />

a human and being a Sim, and the ‘deviant’ properties, being a suman and being a<br />

him. The following four biconditionals are all conceptual truths, and hence must all<br />

receive credence 1.<br />

(1) (a) x is a suman iff x is a human C or a Sim who is not a C.<br />

(b) x is a him iff x is a Sim C or a human who is not a C.<br />

(2) (a) x is a human iff x is a suman C or a him who is not a C.<br />

(b) x is a Sim iff x is a him C or a suman who is not a C.<br />

If the obvious truth of (1a) implies that Rat cannot apply (##) to the property of being<br />

a suman once she knows that she is a C, for (1a) makes that evidence look clearly<br />

relevant to the issue of whether she is suman, then similar reasoning suggests that the<br />

obvious truth of (2a) implies that Rat cannot apply (##) to the property of being<br />

a human once she knows that she is a C, for (2a) makes that evidence look clearly<br />

relevant to the issue of whether she is human. The point is that a restriction on<br />

(##) that is to deliver (#) must find some epistemologically salient distinction between<br />

the property of being human and the property of being suman if it is to rule out<br />

one application of (##) without ruling out the other, and if we only consider formal<br />

constraints, we won’t find such a restriction. Our final attempt to justify (#) from<br />

something like (##) attempts to avoid this problem by appealing directly to the nature<br />

of Rat’s evidence.<br />
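The purely formal symmetry between the ‘normal’ and ‘deviant’ properties can be checked by brute enumeration. The sketch below is my own illustration (with a toy encoding of agents as kind/C pairs, not anything from the paper); it mechanically verifies biconditionals (2a) and (2b):<br />

```python
# Toy encoding of the four biconditionals. An agent is a pair (kind, has_c),
# where kind is "human" or "sim" and has_c says whether she is a C.

def human(a): return a[0] == "human"
def sim(a):   return a[0] == "sim"
def c(a):     return a[1]

def suman(a): return (human(a) and c(a)) or (sim(a) and not c(a))  # (1a)
def him(a):   return (sim(a) and c(a)) or (human(a) and not c(a))  # (1b)

agents = [(k, hc) for k in ("human", "sim") for hc in (True, False)]

# (2a): x is human iff x is a suman C or a him who is not a C.
two_a = all(human(a) == ((suman(a) and c(a)) or (him(a) and not c(a))) for a in agents)
# (2b): x is a Sim iff x is a him C or a suman who is not a C.
two_b = all(sim(a) == ((him(a) and c(a)) or (suman(a) and not c(a))) for a in agents)

print(two_a and two_b)  # True: the 'deviant' pair defines the 'natural' pair
```

Since the definitions run in both directions by exactly the same pattern, no purely formal restriction on (##) can favour one pair over the other.<br />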

4 Fourth Interpretation<br />

The problems with the three interpretations of (##) so far have been that they applied<br />

after Rat found out something distinctive about herself, that she was a C. Perhaps (##)<br />

is really a constraint on prior credence functions. A priori, Rat’s credences should be<br />

governed by an unrestricted version of (##). We then have the following argument<br />

for (#). (As noted above, (#) is a constraint on current credences, so it is not immediately<br />

entailed by a constraint on prior credences such as (##) under its current<br />

interpretation.)<br />

2 I think it is this interpretation of (##) that Adam Elga implicitly appeals to in his solution to the<br />

Sleeping Beauty problem (Elga, 2000a).



P1 A priori, Rat’s conditional credence in her being a Sim given that f_Sim is x is x.<br />

P2 All of Rat’s evidence is probabilistically independent of the property of being a<br />

Sim.<br />

C Rat’s current conditional credence in her being a Sim given that f_Sim is x is x.<br />

This interpretation may be reasonably faithful to what Bostrom had in mind. The<br />

argument just sketched looks similar enough to what he hints at in the following<br />

quote: “More generally, if we knew that a fraction x of all observers with human-type<br />

experiences live in simulations, and we don’t have any information that indicate<br />

that our own particular experiences are any more or less likely than other human-type<br />

experiences to have been implemented in vivo rather than in machina, then our<br />

credence that we are in a simulation should equal x.” So it’s not unreasonable to<br />

conclude that he is committed to P2, and intends it to be used in the argument that<br />

you should give high credence to being a Sim. 3 Further, this version of (##), where<br />

it is restricted to prior credences, does not look unreasonable. So if P2 is true, an<br />

argument for (#) might just succeed. So the issue now is just whether P2 is true.<br />

Why might we reject P2? Any of the following three reasons might do. First,<br />

Rat’s evidence might be constituted by more than her conscious phenomenal states.<br />

This reply has an externalist and an internalist version. On the externalist version,<br />

Rat’s perceptual evidence is constituted in part by the objects she is perceiving. Just<br />

as seeing a dagger and hallucinating a dagger provide different evidence, so do seeing<br />

a dagger and sim-seeing a sim-dagger. For reasons Williamson notes, a Sim may<br />

not know that she has different evidence to someone seeing a dagger when she sim-sees<br />

a sim-dagger, but that does not imply that she does not have different evidence<br />

unless one also assumes, implausibly, that agents know exactly what their evidence is<br />

(Williamson, 2000b). On the internalist version, our evidence is constituted by our<br />

sensory irritations, just as Quine said it is (Quine, 1973). If Rat’s evidence includes<br />

the fact that her eyes are being irritated thus-and-so, her credence, conditional on that,<br />

that she is human should be 1, for if she were a Sim she could not have this evidence<br />

because she would not have eyes. She may, depending on the kind of Sim she is, have<br />

sim-eyes, but sim-eyes are not eyes. So Bostrom needs an argument that evidence supervenes<br />

on conscious experiences, and he doesn’t clearly have one. This is not to say<br />

that no such argument could exist. For example, Laurence BonJour provides some<br />

intriguing grounds for thinking that our fundamental evidence does consist in certain<br />

kinds of conscious states, namely occurrent beliefs (BonJour, 1999), but we’re a long<br />

3 Jamie Dreier pointed out to me that what Bostrom says here is slightly more complicated than what<br />

I, hopefully charitably, attribute to him. A literal reading of Bostrom’s passage suggests he intends the<br />

following principle.<br />

∀e: Cr(e* | Human) − Cr(e* | Sim) = Cr(e | Human) − Cr(e | Sim) (B)<br />

The quantifier here ranges over possible experiences e, e* is the actual experience Rat has, and Cr is the<br />

credence function at the ‘time’ when Rat merely knows that she is human-like and f_Sim is greater than 0.9.<br />

I suggested a simpler assumption:<br />

Cr(Human | e*) = Cr(Sim | e*) (I)<br />

Bostrom needs something a little stronger than (I) to get his desired conclusion, for he needs this to hold<br />

not just for Rat’s experience e*, but for your experience and mine as well. But we will not press that point.<br />

Given that point, though, (I) is all he needs. And presumably the reason he adopts (B) is because it looks<br />

like it entails (I). And indeed it does entail (I) given some fairly innocuous background assumptions.



way from knowing that the supervenience claim holds. And if the supervenience<br />

claim does not hold, then even if Sims and humans have the same kind of experiences,<br />

they may not have the same kind of evidence. And if that is true, it is open to us to<br />

hold that Rat’s non-experiential evidence entails that she is not a Sim (as both Williamson<br />

and Quine suggest), so her evidence will not be independent of the question<br />

of whether she is a Sim.<br />

Secondly, even if every one of Rat’s experiences is probabilistically independent<br />

of the hypothesis that she is a Sim, that doesn’t give us a sufficient reason to believe<br />

that her total evidence is so independent. Just because e_1 and e_2 are both probabilistically<br />

independent of H, the conjunction e_1 ∧ e_2 might not be independent of H. So<br />

possibly our reasons for accepting P2 involve a tacit scope confusion. 4<br />
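That pairwise independence does not carry over to the conjunction can be shown with a standard toy model (my own illustration, not from the paper): let H and e_1 be independent fair coin flips, and let e_2 be their XOR.<br />

```python
from fractions import Fraction

# Four equally likely worlds: H is a fair coin, e1 a second fair coin,
# and e2 = H XOR e1.
worlds = [(h, e1, h ^ e1) for h in (0, 1) for e1 in (0, 1)]

def pr(pred):
    """Probability of pred under the uniform measure on worlds."""
    return Fraction(sum(1 for w in worlds if pred(w)), len(worlds))

p_h = pr(lambda w: w[0] == 1)
# e1 and e2 are each probabilistically independent of H ...
assert pr(lambda w: w[1] == 1 and w[0] == 1) == pr(lambda w: w[1] == 1) * p_h
assert pr(lambda w: w[2] == 1 and w[0] == 1) == pr(lambda w: w[2] == 1) * p_h
# ... but their conjunction is not: e1 = e2 = 1 forces H = 0.
lhs = pr(lambda w: w[1] == 1 and w[2] == 1 and w[0] == 1)
rhs = pr(lambda w: w[1] == 1 and w[2] == 1) * p_h
print(lhs == rhs)  # False
```

Here each piece of evidence taken alone tells us nothing about H, yet the two together settle it completely, which is the scope confusion the text warns against.<br />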

Finally, we might wonder just why we’d even think that Rat’s evidence is probabilistically<br />

independent of the hypothesis that she is human. To be sure, her evidence<br />

does not entail that she is human. But that cannot be enough to show that it is probabilistically<br />

independent. For the evidence also does not entail that she is suman. And<br />

if P2 is true, then the evidence must have quite a bit of bearing on whether she is<br />

suman. For Rat’s prior credence in being suman is above 0.85, but apparently her posterior<br />

credence in it should be below 0.15. So the mere fact that the evidence does not<br />

entail that she is human cannot show that it is probabilistically independent of her<br />

being human, for the same reasoning would show it is probabilistically independent<br />

of her being suman.<br />

More generally, we still need a distinction here between the property of being<br />

human and the property of being suman that shows why ordinary evidence should<br />

be independent of the first property but not the second. One might think the distinction<br />

can reside in the fact that being human is a natural property, while being<br />

suman is gruesome. The lesson of Goodman’s riddle of induction is that we have to<br />

give a privileged position in our epistemic framework to natural properties like being<br />

human, and this explains the distinction. This response gets the status of privileged<br />

and gruesome properties back-to-front. The real lesson of Goodman’s riddle is that<br />

credences in hypotheses involving natural properties should be distinctively sensitive<br />

to new evidence. Our evidence should make us quite confident that all emeralds are<br />

green, while giving us little reason to think that all emeralds are grue. What P2 says<br />

is that a rather natural hypothesis, that Rat is human, is insensitive to all the evidence<br />

Rat has, while a rather gruesome hypothesis, that Rat is suman, is sensitive to this<br />

evidence. The riddle of induction gives us no reason to believe that should happen.<br />

It seems, though this is a little speculative, that the only reason for accepting<br />

P2 involves a simple fallacy. It is true that we have no reason to think that some<br />

evidence, say C, is more or less likely given that Rat is human rather than a Sim. But<br />

from this we should not conclude that we have a reason to think it is not more or less<br />

likely given that Rat is human rather than a Sim, which is what P2 requires. Indeed,<br />

drawing this kind of conclusion will quickly lead to a contradiction, for we can use<br />

the same ‘reasoning’ to conclude that we have a reason to think her evidence is not<br />

more or less likely given that Rat is a suman rather than a him.<br />

4 Thanks to Jamie Dreier for reminding me of this point.



5 Conclusion<br />

Nothing I have said here implies that Rat should have a high credence in her being<br />

human. But it does make one argument that she should not have a high credence in<br />

this look rather tenuous. Further, it is quite plausible that if there is no good reason<br />

not to give high credence to a hypothesis, then it is rationally permissible to give<br />

it such a high credence. It may not be rationally mandatory to give it such a high<br />

credence, but it is permissible. If Rat is very confident that she is human, even while<br />

knowing that most human-like beings are Sims, she has not violated any norms of<br />

reasoning, and hence is not thereby irrational. In that respect she is a bit like you and<br />

me.


Humeans Aren’t Out of Their Minds<br />

Humeanism is “the thesis that the whole truth about a world like ours supervenes<br />

on the spatiotemporal distribution of local qualities.” (Lewis, 1994a, 473) Since the<br />

whole truth about our world contains truths about causation, causation must be<br />

located in the mosaic of local qualities that the Humean says constitute the whole<br />

truth about the world. The most natural ways to do this involve causation being<br />

in some sense extrinsic. To take the simplest possible Humean analysis, we might<br />

say that c causes e iff throughout the mosaic events of the same type as c are usually<br />

followed by events of type e. For short, the causal relation is the constant conjunction<br />

relation. Whether this obtains is determined by the mosaic, so this is a Humean<br />

theory, but it isn’t determined just by c and e themselves, so whether c causes e is<br />

extrinsic to the pair. Now this is obviously a bad theory of causation, but the fact<br />

that causation is extrinsic is retained even by good Humean theories of causation.<br />

John Hawthorne (2004a) objects to this feature of Humeanism. I’m going to argue<br />

that his arguments don’t work, but first we need to clear up three preliminaries about<br />

causation and intrinsicness.<br />

First, my wording so far has been cagey because I haven’t wanted to say that<br />

Humeans typically take causation to be an extrinsic relation. That’s because the greatest<br />

Humean of them all, David Lewis, denies that causation is a relation at all, and<br />

hence that it is an extrinsic relation (Lewis, 2004c). We can go some way to avoiding<br />

this complication by talking, as Hawthorne does, about properties of regions,<br />

and asking whether the property of containing a duplicate of c that causes a duplicate of e is<br />

intrinsic or extrinsic. 1 Humeans typically take causation to be extrinsic in this sense.<br />

Second, nothing in Humeanism requires that causation is extrinsic in that sense.<br />

If one analysed causation as that intrinsic relation that actually most tightly correlates<br />

with the constant conjunction relation, then one would have guaranteed that causation<br />

was an intrinsic relation. Moreover, one would have a perfectly Humean theory of<br />

causation. (A perfectly awful theory, to be sure, but still a Humean one.) Peter<br />

Menzies (1996, 1999) develops a more sophisticated version of such a theory, and<br />

though Menzies describes his view as anti-Humean, one can locate the relation we’ve<br />

defined here in the Humean mosaic, so such an approach might be consistent with<br />

Humeanism in the intended sense.<br />

Third, there is good reason, independent of Humeanism, to accept that causation<br />

is extrinsic. As Ned Hall (2004) argues, it is very hard to square the intrinsicness of<br />

causation with the possibility of causation by omission. Given the choice between<br />

these two, I’m going to accept causation by omission without much hesitation. There<br />

is one powerful objection to the possibility of causation by omission, namely that if<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Noûs 41<br />

(2007): 529-535.<br />

1This move requires the not wholly uncontroversial assumption that regions are the kinds of things<br />

that can have properties. But I’ll happily make that assumption here. Note that the formulation here<br />

allows that the property denoted might be intrinsic for some c and e and extrinsic for others. I’ll say<br />

causation is extrinsic if the property denoted is extrinsic for some choice of c and e, even if it is intrinsic<br />

for others, as it might be if, for example, no region could possess the property because c is a part of e.


Humeans Aren’t Out of Their Minds 701<br />

there is any causation by omission then there is a lot more than is intuitively plausible.<br />

But since Sarah McGrath (2005) has a good response to that objection, I feel<br />

happy accepting there is causation by omission. So I accept that causation is extrinsic,<br />

for reasons totally independent of Humeanism. Since Hawthorne appeals to no<br />

feature of Humeanism beyond the Humean’s acceptance of the extrinsicness of causation,<br />

we can take his argument to be an argument against the causal extrinsicalist,<br />

i.e. the theorist who accepts causation is extrinsic in the above sense. To see that the<br />

argument doesn’t go through, we need to consider what exactly the causal extrinsicalist<br />

is committed to. I’ll explore this by looking at some other examples of properties<br />

of regions.<br />

Some regions contain uncles and some do not. This seems to be an extrinsic<br />

property of regions. My house does not contain any uncles right now, but there are<br />

duplicates of it, in worlds where my brothers have children, where it does contain<br />

an uncle, namely my counterpart. Consider the smallest region containing the earth<br />

from the stratosphere in, from the earth’s formation to its destruction. Call this region<br />

e. Any duplicate of e also contains uncles, including several uncles of mine. You<br />

can’t duplicate the earth without producing a duplicate of me who is, in the duplicate<br />

world, the nephew of the duplicates of my uncles. So it is intrinsic to e that it contain<br />

an uncle, even though this is an extrinsic property of regions. (There is much<br />

more discussion of extrinsic properties that are possessed intrinsically in Humberstone<br />

(1996).)<br />

This possibility, that a region might intrinsically possess an extrinsic property,<br />

poses a problem for Hawthorne’s argument. Here is his presentation of it.<br />

(1) An intrinsic duplicate of any region wholly containing me will contain<br />

a being with my conscious life.<br />

(2) There are causal requirements on my conscious life.<br />

Therefore, Humeanism is false. (Hawthorne, 2004a, 351-352)<br />

The problem is that this argument isn’t valid. What follows from (1) and (2) is that<br />

any region containing Hawthorne must possess some causal properties intrinsically.<br />

(As Hawthorne argues on page 356.) And what Humeanism entails is that causal<br />

properties are extrinsic properties of regions. But there is no incompatibility here,<br />

for it is possible that extrinsic properties are possessed intrinsically, as we saw in the<br />

discussion of uncles.<br />

Hawthorne’s argument would go through if Humeans, and causal extrinsicalists<br />

more generally, were committed to the stronger claim that regions never possess<br />

causal properties intrinsically. But it doesn’t seem that Humeans should be committed<br />

to this claim. Consider again e and all its duplicates. Any such duplicate will<br />

contain a duplication of the event of Booth’s shooting Lincoln, and Lincoln dying. 2<br />

2 There is a potential complication here: arguably in some such worlds, e.g. worlds where there<br />

is another planet on the opposite side of the sun to duplicate-earth, people are immediately ‘resurrected’<br />

when they ‘die’ on duplicate-earth. In such a world you might say that duplicate-Lincoln doesn’t<br />

really die on duplicate-earth, but merely has the duplicate-earth part of his life ended. We’ll understand<br />

‘dying’ in such a way that this counts as dying.


Will it also be the case that duplicate-Booth’s shooting in this world causes duplicate-<br />

Lincoln’s dying? If so, and this seems true, then it is intrinsic to e that it contains<br />

an event of a shooting causing a dying, even though the property of containing a<br />

shooting causing a dying is extrinsic.<br />

It would be a bad mistake to offer the following epistemological argument that in<br />

all duplicates of e, duplicate-Booth’s shooting causes duplicate-Lincoln’s dying.<br />

1. If there was a duplicate of e where duplicate-Booth’s shooting does not cause<br />

duplicate-Lincoln’s dying, then we would not know whether Booth’s shooting<br />

causes Lincoln’s dying without investigating what happens outside e.<br />

2. We can know that Booth’s shooting caused Lincoln’s dying without investigating<br />

outside e.<br />

3. So, there is no duplicate of e where duplicate-Booth’s shooting does not cause<br />

duplicate-Lincoln’s dying.<br />

The problem with this argument is that even if there are worlds containing such duplicates,<br />

we might know a priori that we do not live in such a world, just as we know a<br />

priori that we do not live in a world where certain kinds of sceptical scenarios unfold<br />

(Hawthorne, 2002; <strong>Weatherson</strong>, 2005b).<br />

A better argument against the existence of such a world is that if it is possible, it<br />

should be conceivable. But it is basically impossible to conceive such a world. Even<br />

if throughout the universe shootings like Booth’s are usually followed by something<br />

other than dying (say, shootings in most parts of the universe cause diseases to be<br />

cured), the large-scale regularity within e (or its duplicate) of shootings being followed<br />

by dying suffices to ground the claim that shootings cause dyings in a good Humean<br />

theory. The crucial assumption here is that local regularities count for more than<br />

global regularities. If the local regularities deviate too far from the global regularities,<br />

then Humeans can and should say that different nomic claims (and hence causal<br />

claims) are true in this part of the world to the rest of the universe. If they say this,<br />

they can say that regions can have causal features (such as containing a shooting causing<br />

a dying) intrinsically even though causal features are extrinsic properties.<br />

To illustrate the kind of Humean theory that would have such a consequence,<br />

consider the following variant on the constant conjunction theory of causation. The<br />

theory I’m imagining says that c causes e iff whenever an event of the same type as c<br />

occurs within a 50 mile radius of where c occurred, it was followed by an event of type<br />

e. Call this the 50 mile constant conjunction theory of causation. 3 On the 50 mile<br />

constant conjunction theory of causation, it won’t be intrinsic to Ford’s Theatre that<br />

it contained a causal event involving Booth shooting Lincoln, but it will be intrinsic<br />

to any sphere of radius 50 miles or more centred on the theatre that it contains such a<br />

causal event. So on this theory causal properties can be intrinsic to a region, though<br />

they are still extrinsic properties of such a region.<br />

3 I assume here that events can be properly said to have locations. Spelling out this assumption in detail<br />

will require some serious metaphysics, particularly when it comes to omissions. Where exactly does my<br />

omission to stop the Iraq War take place? Here at my kitchen table where I am? In Iraq, where the war is?<br />

In Washington, if that’s where I’d be were I doing something to stop the war? These questions are hard,<br />

though not so hard that we should give up on the very natural idea that events have locations.
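The 50 mile constant conjunction theory lends itself to a toy formalisation. The following sketch (in Python; the event representation and the Euclidean notion of distance are illustrative assumptions, not part of the theory as stated) treats c as causing e just in case every c-type event within 50 miles is followed by an e-type event:<br />

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str        # event type, e.g. "shooting"
    location: tuple  # (x, y) position, in miles
    time: float

def distance(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def causes_50mile(c, e, history):
    """c causes e iff every event of c's type occurring within 50 miles
    of where c occurred is followed by an event of e's type."""
    nearby = [ev for ev in history
              if ev.kind == c.kind and distance(ev.location, c.location) <= 50]
    return all(any(later.kind == e.kind and later.time > ev.time
                   for later in history)
               for ev in nearby)
```

On this toy theory, whether c causes e depends only on what happens within 50 miles of c; that is why causal properties come out intrinsic to any sufficiently large region while remaining extrinsic properties of regions generally.<br />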


That’s a very implausible Humean theory, but when we look at the details of<br />

David Lewis’s Humean picture, we can see the outlines of a more plausible theory<br />

with the same consequences. Lewis of course doesn’t offer a simple regularity theory<br />

of causation. Rather, he first argues that laws are the extremely simple, extremely<br />

informative true propositions (Lewis, 1973b, 73). That is, he offers a sophisticated<br />

regularity theory of laws. Then he analyses counterfactual dependence in terms of<br />

lawhood (Lewis, 1973b, 1979b). Finally he analyses causation in terms of counterfactual<br />

dependence (Lewis, 2004a). The philosophical theory meets the Humean mosaic<br />

most closely on the issue of what a law is. If we can offer a theory of laws that allows<br />

extra sensitivity to local facts, while remaining Humean, we can plug this back into<br />

Lewis’s theories concerning counterfactual dependence and hence causation without<br />

upsetting its Humean credentials.<br />

Now there is a good reason to think that a Humean theory of laws should be<br />

locally sensitive. (I’m indebted here to long ago conversations with James Chase.)<br />

Humeans typically believe in fairly unrestricted principles of recombination. And<br />

they believe that laws are not necessarily true. So there could be worlds with very<br />

different laws. So there is a world which ‘patches’ together part of the world with<br />

laws L₁ with a world with laws L₂. If the parts are large and isolated enough, it would<br />

be foolish to say that within those parts nothing is law-governed, or that within those<br />

regions there is no counterfactual dependence, or no causation. Much better to say<br />

that regularities obtaining within such a region are sufficiently simple and informative<br />

to count as laws. In our patchwork world, the laws might simply say In r₁, L₁<br />

and in r₂, L₂. Provided the terms denoting the regions are not too gruesome, these<br />

will plausibly be Humean, even Lewisian, laws.<br />

Let’s bring all this back to Hawthorne’s example. Hawthorne argues that certain<br />

causal facts are intrinsic to the region containing his body. The challenge for the<br />

Humean is to say how this could be so when Hawthorne could be embedded in<br />

a world where very different regularities obtain. The simple answer is to say that<br />

in such worlds, laws like In r, L, where r picks out the region Hawthorne’s body<br />

occupies, and L picks out a real-world law, will be true, simple and informative. It is<br />

informative because any duplicate of Hawthorne’s body is a very complicated entity,<br />

containing billions of billions of particles interacting in systematic ways, ways that<br />

are nicely summarised by real-world laws. Simplicity is a little harder to make out,<br />

but note that there is a reasonably sharp boundary between Hawthorne’s body and<br />

the rest of the world (Lewis, 1993), so there should be a natural enough way to pick it<br />

out. In other words, even if we embed a Hawthorne duplicate in a world with very<br />

different regularities, Humeans will still have good reason to say that the laws, and<br />

hence the facts about counterfactual dependence and causation, inside that duplicate<br />

are not changed. So not only is it logically possible that Hawthorne’s premises are<br />

true and his conclusion false, we can motivate a Humean position that endorses the<br />

truth of Hawthorne’s premises and the falsity of the conclusion.<br />

Since Hawthorne’s argument is invalid then, we can accept the premises without<br />

giving up Humeanism. But I think it is worthwhile to note that his (1) also can be<br />

questioned. Hawthorne notes that it is rejected by those such as Dretske and Lewis<br />

who say that phenomenal character is determined in part by kind membership. (See


Lycan (2001) for a longer defence of this kind of rejection of (1).) Hawthorne thinks<br />

that the intuitive plausibility of (1) constitutes a serious objection to those views. But<br />

by reflecting a little on the phenomenology of what I’ll call totality qualia, we can<br />

undermine the intuitive case for (1).<br />

Tweedledee is facing a perfectly symmetrical scene. His visual field is symmetric,<br />

with two gentle mountains rising to his left and his right and a symmetric plain in<br />

between them. All he can hear are two birds singing in perfect harmony, one behind<br />

his left ear and one behind his right ear. The smells of the field seem to envelop him<br />

rather than coming from any particular direction. There is a cool breeze blowing<br />

directly on his face. It’s a rather pleasant scene, and the overwhelming feeling is one<br />

of symmetry.<br />

Tweedledum is very much like Tweedledee. Indeed, Tweedledum contains a duplicate<br />

of Tweedledee as a proper part. But Tweedledum also has some sensors in<br />

his skin, and brain cells in what corresponds to a suspiciously empty part of Tweedledee’s<br />

brain, which allow him to detect, and feel, where the magnetic fields are in<br />

the vicinity. And sadly, though Tweedledum is facing a duplicate of the scene facing<br />

Tweedledee, there is a major disturbance in the magnetic field just to Tweedledum’s<br />

left. This produces a jarring sensation in Tweedledum’s left side. As a consequence,<br />

Tweedledum does not share Tweedledee’s feeling of symmetry.<br />

Whether a picture is symmetric is a property of its internal features, but it is also<br />

a feature that can be destroyed without changing the internal features by just adding<br />

more material to one side. It is a totality property of pictures, a property the picture<br />

has because it stops just where it does. 4 Similarly, totality qualia are qualia that we<br />

have in part because we don’t have any more feelings than we actually do. Feelings<br />

of symmetry are totality qualia in this sense, as are many of the feelings of calm and<br />

peacefulness associated with Tweedledee’s state. It is not intuitive that totality qualia<br />

should be intrinsic to a region. Indeed, it seems intuitive that a duplicate of me that<br />

was extended to produce more sensory features would lack these feelings. Hence a<br />

duplicate of me would not share my conscious life in all respects, so Hawthorne’s<br />

premise (1) is also false. To be sure, these totality qualia are a somewhat speculative<br />

suggestion, but the Humean does not need them since Hawthorne’s anti-Humean<br />

argument is invalid.<br />

4 Ted Sider (2001b, 2003) stresses the importance to a theory of intrinsicness of properties that are<br />

instantiated in virtue of the object not bearing relations to other objects. My example here is closely<br />

modeled on examples from his papers.


Nine Objections to Steiner and Wolff on Land<br />

Disputes<br />

In the July 2003 Analysis, Hillel Steiner and Jonathan Wolff (2003) propose a framework<br />

for “resolving disputed land claims between competing nations or ethnic groups.”<br />

The idea is that we should auction off the land, with the loser of the auction getting<br />

the money. While this might mean that the richer party will normally end up with<br />

the land, and this is normally not thought to be a good thing, if the auction is conducted<br />

as they specify “it will turn out that the other party ends up with something<br />

which, in the circumstances, it prefers to the land: lots of money.”<br />

Actually, it isn’t so clear that this is what will result. Let’s say we have a particular<br />

parcel of land that groups A and B want. They each want it quite strongly, but B<br />

has deeper pockets than A, so while A would be prepared to pay 8 for the land, B<br />

would be prepared to pay 12. For the auction process to function, there must be a<br />

minimum bid increment; I’ll say it is ½. Assume that B has just bid 4; A must now<br />

choose whether to bid 4½ or accept B’s bid. And assume for now that A is not bidding<br />

tactically: it only makes a bid if it would prefer to win the auction with that bid rather than<br />

accept B’s bid. This assumption will be relaxed below.<br />

So for now, A must decide whether it prefers to be given 4, or to get the land for<br />

4½. Since it values the land at 8, and since it will give up 8½ to buy the land (the 4½<br />

it will pay, plus the 4 it would have received from B) it may well decide to just accept<br />

the bid. But now it has ended up with something it definitely does not prefer to the<br />

land, since it just accepted a bid for 4. There are two assumptions at play here. One<br />

is that A doesn’t bid tactically, which I shall return to in a bit. The other is that how<br />

much A will pay for the land is not affected by receiving B’s 4. That is, I assume that<br />

the marginal utility of money is relatively constant for A over the ranges of money at<br />

play in the auction. This assumption might be false if we’re dealing with a very large<br />

or valuable body of land, but it’s not unreasonable in most circumstances. (Space<br />

prevents a complete study of what happens if we take the declining marginal utility<br />

of money completely into account. Roughly, the effect is that some of my criticisms<br />

are slightly vitiated.) Now while these assumptions might be false, Steiner and Wolff<br />

give us no reason to be certain they are false. So for all they’ve said we could have a<br />

situation just like this one, where the poorer party ends up with something it wants<br />

much less than the land. Hence<br />

Objection 1. There is no guarantee that the losing party will end up<br />

with something they prefer to the land.<br />
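The arithmetic behind Objection 1 can be checked directly. This is only a sketch of the example’s numbers (A values the land at 8, B has bid 4, the increment is ½), not part of Steiner and Wolff’s proposal:<br />

```python
# A values the land at 8; B has just bid 4; the minimum increment is 1/2.
value_to_A = 8.0
b_bid = 4.0
increment = 0.5

# Option 1: accept B's bid and take the money.
utility_accept = b_bid  # A ends up with 4

# Option 2: bid 4 1/2 and (suppose B stops there) win the land.
# Equivalently: winning means giving up 8 1/2 (the 4 1/2 paid plus the
# 4 forgone from B's bid) for land A values at 8.
a_bid = b_bid + increment
utility_win = value_to_A - a_bid  # 8 - 4.5 = 3.5

assert utility_accept > utility_win  # A accepts, despite valuing the land at 8
```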

While this contradicts an alleged benefit of Steiner and Wolff’s plan, it might not be<br />

thought to be a deep problem. After all, A gets half as much as they wanted, and if<br />

they are only one of two equal claimants to the land, then this is a fair result. This<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Analysis<br />

63 (2003): 321-8.


Nine Objections to Steiner and Wolff on Land Disputes 706<br />

may be true, but note that the assumption that each party has an equal claim to the<br />

land is doing a lot of work here. If A’s claim is stronger, then only getting half of the<br />

value of the land is quite unfair. If the two claims are incommensurable, there may<br />

be no fact of the matter whether it is fair that A receives 4. If we cannot tell which of<br />

the moral claims is stronger, which is very often the case in land disputes, it may be<br />

impossible to tell whether A’s receiving 4 is fair or not. Hence<br />

Objection 2. The proposal is only appropriate where each party has a<br />

genuinely equal moral claim to the land. This doesn’t happen often, and<br />

it is quite rare that we know it happens.<br />

While Steiner and Wolff note that they are leaving questions about enforcement and<br />

compliance to another place, so it isn’t fair to press them too strongly on these topics,<br />

it is worth noting how this feature of their proposal makes compliance harder to<br />

enforce. If by participating in the auction both parties are tacitly agreeing that the<br />

other party has an equal claim to the land, and I think the above suggests they are<br />

doing just this, that will reduce the legitimacy of the auction process in the eyes of<br />

members of the losing group. And that will lead to enforcement difficulties down the<br />

line.<br />

There is an administrative problem lurking around here. Since each party will<br />

end up with something from this process once the auction begins, we must have a<br />

way of determining whether the competing claims warrant an auction, or whether<br />

one party should receive the land, or whether some kind of negotiation is possible.<br />

And once we set up a process to do that, it could easily encourage relatively spurious<br />

land claims. Unless there is a serious cost to suggesting that one should be party to<br />

an auction of some block of land, there is a large incentive to get into these auctions<br />

wherever and whenever possible. Perhaps some method could be designed to offset<br />

this incentive, and perhaps even the desire groups have to be approved by the court of<br />

public opinion will offset it at times, but it seems to be a problem with the proposal<br />

as formulated.<br />

To be sure, if A accepts B’s bid, then both parties do end up with something from<br />

the auction. A gets 4, and B gets some land that it values at 12 for 4, a gain of 8. Note<br />

that B does much better out of the auction than A. If the auction stops when the<br />

richer party makes a bid at or above half the price the poorer party would pay, then<br />

the richer party will always end up with a higher ‘utility surplus’. Hence<br />

Objection 3. If there’s no tactical bidding the utility surplus is given<br />

entirely to the richer party.<br />

Let’s relax the assumption that A does not bid tactically. Indeed, let’s make things<br />

as good as could be realistically expected for A. It knows that B values the land at<br />

12 and does not bid tactically, so B will make bids up to 6, and accept any bid over<br />

6. Hence the auction proceeds as follows: A bids 4½, B bids 5, A bids 5½, B bids 6,<br />

A accepts. Now things could go better for A, but it would require some luck and<br />

courage. A could bid 6½ and B could reply with a bid of 7, but since this requires B<br />


acting against its own interests (it is better off accepting the bid of 6½ after all), and<br />

hence also requires A making a risky move that will yield dividends only if B<br />

acts against its own interests in just this way, such an outcome seems unlikely. So in<br />

practice the best case scenario for A is that B pays 6 for the land. In this case A ends<br />

up with 6, and B ends up paying 6 for land it values at 12, a gain of 6. Hence<br />

Objection 4. Among the realistic outcomes, the best case scenario for<br />

the poorer party is that it ends up with as large a utility surplus as the<br />

richer party.<br />

Best cases don’t often happen, so in practice we should normally expect a result somewhere<br />

between the ‘no tactical bargaining’ option, where B receives a larger share of<br />

the surplus, and this ‘best case scenario’ where the two parties get an equal share<br />

of the surplus. Hence in almost all cases, the richer party will get a larger surplus<br />

than the poorer party. This seems like a flaw in the proposal, but worse is to come.<br />

Most of the ways in which B can realistically increase its share of the surplus involve<br />

behaviour that we should not want to encourage.<br />
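The surplus split in the two scenarios just described can be made explicit (again using the example’s valuations of 8 and 12):<br />

```python
# 'No tactical bidding': B wins at 4, A takes the money.
surplus_no_tactics = {"A": 4, "B": 12 - 4}   # A: 4, B: 8
# Realistic best case for A: B wins at 6.
surplus_best_case = {"A": 6, "B": 12 - 6}    # A: 6, B: 6

# Any realistic outcome lies between these, so B's surplus is at least A's.
assert surplus_no_tactics["B"] > surplus_no_tactics["A"]
assert surplus_best_case["B"] == surplus_best_case["A"]
```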

Consider again A’s decision to reject the bid of 5 and bid 5½. Assume, for simplicity,<br />

that A plans to accept a bid of 6, but drop the assumption that A knows that<br />

B will reject a bid of 5½, if it is made. So before A makes its decision, there are three<br />

possible outcomes it faces:<br />

Accept the bid In this case it receives 5.<br />

Bid 5½ and have it accepted In this case it gets the land (value 8) for 5½, net gain 2½.<br />

Bid 5½ and have it rejected In this case B bids 6, and A accepts, so it gets 6.<br />

A’s expected utility is higher if it bids 5½ rather than accepts B’s bid iff its degree of<br />

belief that B will bid 6 is over 5/7. If it is less confident than that that B will bid 6, it<br />

should accept the bid of 5. As it happens, B is going to reject a bid of 5½ and bid 6,<br />

so it is better off if A accepts the bid of 5. If A knows B’s plans, this will not happen.<br />

But if A is ignorant of B’s intentions, it is possible it will accept the bid of 5. Indeed,<br />

since A’s confidence that B will decline must be as high as 5/7 before it makes the bid<br />

of 5½, it might be quite likely in this case that A will just accept the bid.<br />
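The 5/7 figure can be recovered by equating the expected value of bidding with the value of accepting. A minimal sketch of that calculation, using the outcome values from the example:<br />

```python
from fractions import Fraction

# Outcomes for A of bidding 5 1/2:
gain_if_rejected = Fraction(6)      # B bids 6 and A accepts
gain_if_accepted = Fraction(5, 2)   # land worth 8, bought for 5 1/2
gain_accept_now = Fraction(5)       # simply accept B's bid of 5

def expected_gain_of_bidding(p_reject):
    """Expected gain of bidding 5 1/2, given credence p_reject that B rejects."""
    return p_reject * gain_if_rejected + (1 - p_reject) * gain_if_accepted

# Break-even credence: bidding beats accepting iff p_reject exceeds this.
threshold = ((gain_accept_now - gain_if_accepted)
             / (gain_if_rejected - gain_if_accepted))
assert threshold == Fraction(5, 7)
```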

Not surprisingly, we get the result that B is better off if its bargaining plans are<br />

kept secret than if they are revealed to A. That in itself may not be objectionable. But<br />

remember that the agents here are not individuals, they are states. And the decisions<br />

about how to bid involve policy questions that will often be the most important issue<br />

the state in question faces for many a year. Ideally, decisions about how to approach<br />

the auction should be decided as democratically as possible. But democratic decision<br />

making requires openness, and it is impossible that all the stakeholders in B, including,<br />

one imagines, the citizens, can participate in the decision about how to approach the<br />

auction without B tipping its hand. In the modern world it’s impossible to involve<br />

everyone in B without opening the debate to agents of A. And this, as we’ve seen,<br />

probably has costs. Since B is better off if it does not make decisions about how to<br />

approach the auction in the open, we have


Objection 5. The proposal favours secretive governments over open<br />

democratic governments.<br />

Assume that B has been somewhat secretive, but A is still fairly confident that B will<br />

not accept a bid of 5½. Its degree of belief that such a bid will be rejected is ¾, let’s<br />

say, so it is disposed to gamble and make that bid. But now B starts making some<br />

noises about what it will do with any money it gets from A. The primary beneficiary<br />

of this windfall will be B’s military. And the primary use of this military is to engage<br />

in military conflicts with A. While some of these engagements will be defensive, if A<br />

gets the land under dispute many will be offensive. (I don’t think these assumptions<br />

are particularly fanciful in many of the land disputes we see in the modern world.)<br />

A must take this into account when making its decisions. It seems reasonable to say<br />

that every 1 that A gives B has a disutility of 1.2 for A, 1 for the cost of the money it<br />

gives up, and 0.2 for the extra damage it may suffer when that money is turned into<br />

weaponry turned back against A. Now the utility calculations are quite different. If<br />

B accepts A’s bid of 5 1<br />

, A’s balance sheet will look like this:<br />

2<br />

Gain The land, value 8.<br />

Cost 5 1<br />

1<br />

paid to B, value 5 2 2<br />

Cost B’s extra military capability, value (a little over) 1.<br />

Net gain Roughly 1 1<br />

2 .<br />

So now the expected utility of bidding 5 1<br />

2 is:<br />

Prob(Bid Accepted) × Utility(Bid Accepted) + Prob(Bid Rejected) ×<br />

Utility(Bid Rejected)<br />

≈ 1 1 3<br />

× 1 + × 6<br />

4 2 4<br />

= 4 7<br />

8<br />

Hence A’s expected utility for accepting B’s bid of 5, i.e. 5, is higher than its expected<br />

utility of bidding 5 1<br />

, so it will accept the bid, just as B wanted it to do. So if B<br />

2<br />

indicates that it will use any payments from A to attack A, it may well be able to get<br />

the land for less. Hence<br />

Objection 6. The proposal favours belligerent governments over peaceful<br />

governments.<br />

One qualification to this objection is that what matters here is what A thinks B will<br />

do, not what B actually does. So the objection is not that the proposal rewards offensive<br />

behaviour, but that it rewards belligerence, or indications of offensive behaviour.<br />

This isn’t as bad as rewarding military action, but it is still objectionable.<br />
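The expected-utility comparison behind Objection 6 can be checked the same way. This sketch uses the text’s stipulated disutility of 1.2 per unit paid to B and A’s credence of ¾ that its bid would be rejected:<br />

```python
from fractions import Fraction

land_value = Fraction(8)
bid = Fraction(11, 2)                  # A's contemplated bid of 5 1/2
disutility_per_unit = Fraction(6, 5)   # 1 in money plus 0.2 in rearmament risk
p_reject = Fraction(3, 4)

# If accepted: A gets the land, but every unit paid to B costs it 1.2.
gain_if_accepted = land_value - bid * disutility_per_unit  # 8 - 6.6 = 1.4
# If rejected: B bids 6 and A accepts, receiving 6.
gain_if_rejected = Fraction(6)

expected_gain_of_bidding = (p_reject * gain_if_rejected
                            + (1 - p_reject) * gain_if_accepted)
# Roughly 4.85, in line with the text's approximate 4 7/8; below the 5 from
# accepting B's bid, so A accepts, just as B intended.
assert expected_gain_of_bidding < 5
```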

Throughout I have used a particular example to make the points clearer, but none of<br />

the arguments turns on the details of this example. What matters is that in any case<br />

where one party is able to spend more for the land in question simply because they<br />

are richer, the richer party will almost inevitably have a higher utility surplus, and


this party can increase their expected utility surplus by being more secretive about<br />

their plans, and by adopting a more belligerent tone towards their rivals before<br />

and during the auction. So it seems the proposal systematically rewards behaviour<br />

we should be discouraging.<br />

The remaining objections concern the implementation of Steiner and Wolff’s<br />

proposal. While I don’t have a demonstrative proof that any of these concerns presents<br />

insurmountable difficulties, they all suggest ways in which the proposal must be qualified<br />

if it is to be just.<br />

The proposal seems to assume that the parties to the dispute agree over whether<br />

the land in question can be divided. As Steiner and Wolff put it, “The auction can<br />

thus be viewed as a device for achieving a fair settlement for the disposition of a<br />

good when neither division nor joint ownership is acceptable to the parties.” In<br />

some conflicts at least part of what is at issue is whether the land can be divided.<br />

For instance, if we were applying this proposal as a way of settling the war between<br />

Britain and Ireland in 1921, would we say that all of Ireland should be auctioned<br />

off, or just that the six counties that became Northern Ireland should be auctioned?<br />

Assuming the British had decided that governing southern Ireland had become too<br />

much trouble and were only interested in retaining the north, they may not have<br />

wanted to pay for the whole country just to protect their interests in the north. But<br />

at least some of the Irish would have been unwilling to accept a process that may<br />

have led to the division of the country, as would have obtained had the south been<br />

granted Home Rule, but the north left subject to an auction. (The historical facts<br />

are, obviously, somewhat more complicated than I’ve sketched here, but even when<br />

those complications are considered the difficulties that must be overcome before we<br />

know how to apply the proposal to a real situation are formidable.) Hence<br />

Objection 7. The proposal assumes a mechanism for determining which<br />

land is indivisible, and in some cases developing such a process is no easier<br />

than settling the dispute.<br />

Steiner and Wolff assume that the groups, A and B, are easily identifiable. In practice,<br />

this may not be so easy. For example, at least some people in Scotland would prefer<br />

that Scotland was independent. For now most people prefer devolution to independence<br />

(and some would prefer rule from Westminster) but we can easily imagine<br />

circumstances in which the nationalist support would rise to a level where it became<br />

almost a majority. If a majority in Scotland wants to secede, and the British government<br />

is willing to do this, then presumably they will just secede. But what are we<br />

to do if a narrow majority in Scotland wants to secede, and the British government<br />

(or people) do not want them to go? Presumably Steiner and Wolff’s proposal is<br />

that some sort of auction should be held to determine who should be in charge of<br />

the land. But who exactly are meant to be the parties? On the Westminster side, is<br />

the party Britain as a whole, or Britain except for Scotland? On the Scottish side,<br />

is it the Scottish people? The Scottish government, which for now is a creature that<br />

exists at the pleasure of the British Parliament? Those people who support Scottish<br />

independence? If the last, how shall we determine just who these people are? Perhaps


some one or other of these answers can be defended, but the proposal is seriously<br />

incomplete, hence<br />

Objection 8. There is no mechanism for determining who shall count<br />

as a member of the groups in question.<br />

Finally, the proposal simply assumes that we can agree upon the currency in which<br />

the auction shall be conducted, but it is not at all clear that this can be done. Usually,<br />

the two parties to a dispute will use different currencies, so to avoid conflicts it<br />

would be best if the auction were conducted in a neutral currency. But finding such<br />

a currency may be non-trivial. There are only a handful of currencies in the world<br />

whose supply is sufficiently abundant to conduct an auction of this size, and most of<br />

the time those currencies will be backed by governments who favour one side in the<br />

dispute. If they use this favouritism to provide access to credit denominated in their<br />

currency at a discounted rate, that threatens the fairness of the auction. Hence<br />

Objection 9. The proposal assumes a given currency in which to conduct<br />

the auction, but in practice any choice of currency may favour one<br />

side.<br />

The last three objections are, as mentioned, somewhat administrative. It is possible<br />

that in a particular situation they could be overcome, though I think that it is more<br />

likely that they would pose serious difficulties to a would-be auction-wielding pacifier.<br />

But that’s not the serious problem with the proposal. The real problem, as the<br />

first six objections show, is that it favours rich, secretive, belligerent states that are<br />

disposed to make spurious land claims over poor, democratic, pacifist states that only<br />

make genuine land claims.


Misleading Indexicals<br />

In ‘Now the French are invading England’, Komarine Romdenh-Romluc (2002) offers<br />

a new theory of the relationship between recorded indexicals and their content.<br />

Romdenh-Romluc proposes that Kaplan’s basic idea, that reference is determined<br />

by applying a rule to a context, is correct, but we have to be careful about what the<br />

context is, since it is not always the context of utterance. A few well known examples<br />

illustrate this. The ‘here’ and ‘now’ in ‘I am not here now’ on an answering machine<br />

do not refer to the time and place of the original utterance, but to the time the message<br />

is played back, and the place its attached telephone is located. Any occurrence of<br />

‘today’ in a newspaper or magazine refers not to the day the story in which it appears<br />

was written, nor to the day the newspaper or magazine was printed, but to the cover<br />

date of that publication.<br />

Still, it is plausible that for each (token of an) indexical there is a salient context,<br />

and that ‘today’ refers to the day of its context, ‘here’ to the place of its context,<br />

and so on. Romdenh-Romluc takes this to be true, and then makes a proposal about<br />

what the salient context is. It is ‘the context that Ac would identify on the basis of<br />

cues that she would reasonably take U to be exploiting’ (2002, 39). Ac is the relevant<br />

audience, ‘the individual who it is reasonable to take the speaker to be addressing’,<br />

and who is assumed to be linguistically competent and attentive. (So Ac might not<br />

be the person U intends to address. This will not matter for what follows.) The<br />

proposal seems to suggest that it is impossible to trick a reasonably attentive hearer<br />

about what the referent of a particular indexical is. Since such trickery does seem<br />

possible, Romdenh-Romluc’s theory needs (at least) supplementation. Here are two<br />

examples of such tricks.<br />

Example One<br />

Imagine that at my university, the email servers are down, so all communication<br />

from the office staff is by written notes left in our mailboxes. I<br />

notice that one of my colleagues, Bruce, has a rather full mailbox, and<br />

hence must not have been checking his messages for the last day or two.<br />

I also know that Bruce is a forgetful type, and if someone told him that<br />

he’d forgotten about a faculty meeting yesterday, he’d probably believe<br />

them. In fact he hasn’t forgotten; the meeting is for later today. So I decide<br />

to play a little trick on him. I write an official looking note saying<br />

‘There is a faculty meeting today’, leave it undated, and put it in Bruce’s<br />

mailbox underneath several other messages, so it looks like it has been<br />

there for a day or two. When Bruce sees it he is appropriately tricked,<br />

and for an instant panics about the meeting that he has missed.<br />

It seems to me that what I wrote on the note was true. It was horribly misleading,<br />

to be sure, but still true. And as a few people have pointed out over the years, most<br />

† Penultimate draft only. Please cite published version if possible. Final version published in Analysis<br />

62 (2002): 308-10. Thanks to Europa Malynicz, Adam Sennet and Ted Sider for helpful comments.



prominently Bill Clinton I guess, it is possible to mislead people with the truth. But<br />

on Romdenh-Romluc’s proposal, what I said was false, since my audience (Bruce)<br />

reasonably took the context to be a day earlier in the week.<br />

Example Two<br />

This example is closely based on a recent TV commercial. Jack leaves the<br />

following message on Jill’s answering machine late one Saturday night.<br />

‘Hi Jill, it’s Jack. I’m at Rick’s. This place is wild. There’s lots of cute<br />

girls here, but I’m just thinking about you.’ In the background loud<br />

music is playing, as if Jack were at a nightclub, indeed as if Jack were<br />

at Rick’s, so Jill reasonably concludes that Jack was at Rick’s when he<br />

sent the message, and hence that ‘here’ refers to Rick’s. In fact Jack was<br />

home alone, but wanted to hide this fact, so he turned the stereo up to<br />

full volume while leaving the message. Despite the fact that a reasonable<br />

and attentive member of the target audience inferred on the basis<br />

of contextual clues left by Jack that the context was Rick’s, it was not.<br />

The context was Jack’s house, and ‘here’ in Jack’s message referred to his<br />

house. Jack’s trick may be less morally reprehensible than mine, but at<br />

least I managed to avoid lying, something Jack failed to do.<br />

In Example One I said something true even though what the hearer took me to say<br />

was false. In Example Two Jack says something false, though what the hearer takes<br />

him to say may well be true, assuming that there are a lot of cute girls at Rick’s.<br />

Romdenh-Romluc’s theory predicts that neither of these things is possible, so it does<br />

not work as it stands. This, of course, is not to say that anyone else (myself included)<br />

has a better theory readily available, so it is unclear whether the right lesson to draw<br />

from these examples is that Romdenh-Romluc’s theory needs to have some epicycles<br />

added, or that we need to try a rather different approach. One simple epicycle makes<br />

the theory extensionally adequate, but philosophically uninteresting. Consider modifying<br />

the theory to require Ac to be not just reasonable and attentive, but informed<br />

of U’s circumstances. Then the context identified by Ac will be the salient context<br />

for determining the referent of U’s indexicals. But saying this is not to offer a theory<br />

of content for recorded indexicals, it is merely to say that ideally placed observers<br />

have access to all the relevant semantic facts. Even this might be wrong if epistemicism<br />

about vagueness is correct, but if that is true then Romdenh-Romluc’s theory<br />

is probably radically mistaken, for then there are facts about content that cannot be<br />

reasonably believed, even by an attentive and informed observer. We still seem to be<br />

a fair distance from having an acceptable theory.


Bibliography<br />

Adams, Douglas. 1980. The Restaurant at the End of the Universe. London: Pan<br />

Macmillan. Reprinted in Adams (2002). References to Reprint.<br />

—. 2002. The Ultimate Hitchhiker’s Guide to the Galaxy. New York: Ballantyne.<br />

Adams, Ernest. 1998. A Primer on Probability Logic. Palo Alto: CSLI.<br />

Adams, Robert. 1974. “Theories of Actuality.” Noûs 8:211–231.<br />

Adams, Robert Merrihew. 1985. “Involuntary Sins.” Philosophical Review 94:3–31.<br />

Alston, William. 1988. “The Deontological Conception of Epistemic Justification.”<br />

Philosophical Perspectives 2:115–152.<br />

Armstrong, D. M. 1978. Universals and Scientific Realism. Cambridge: Cambridge<br />

University Press.<br />

—. 2000. “Black Swans: The Formative Influences in Australian Philosophy.” In Berit<br />

Brogaard and Barry Smith (eds.), Rationality and Irrationality, 11–17. Kirchberg:<br />

Austrian Ludwig Wittgenstein Society.<br />

Armstrong, David. 1980. “Identity Through Time.” In Peter van Inwagen (ed.), Time<br />

and Cause: Essays Presented to Richard Taylor, 67–78. Dordrecht: Reidel.<br />

Arntzenius, Frank. 2008. “No Regrets; or, Edith Piaf Revamps Decision Theory.”<br />

Erkenntnis 68:277–297.<br />

Austin, J. L. 1962. Sense and Sensibilia. Oxford: Oxford University Press.<br />

Ayer, Alfred. 1936. Language, Truth and Logic. London: Gollancz.<br />

Bach, Kent. 1985. “A Rationale for Reliabilism.” The Monist 68:246–263.<br />

—. 1994. “Conversational Impliciture.” Mind and Language 9:124–62.<br />

—. 2010. “Knowledge In and Out of Context.” In Joseph Keim Campbell, Michael<br />

O’Rourke, and Harry S. Silverstein (eds.), Knowledge and Skepticism.<br />

Cambridge, MA: MIT Press.



Barker, Stephen. 1997. “Material Implication and General Indicative Conditionals.”<br />

Philosophical Quarterly 47:195–211.<br />

Bartha, Paul and Hitchcock, Christopher. 1999. “The Shooting-Room Paradox and<br />

Conditionalizing on Measurably Challenged Sets.” Synthese 118:403–37.<br />

Barwise, Jon and Cooper, Robin. 1981. “Generalized Quantifiers and Natural Language.”<br />

Linguistics and Philosophy 4:159–220.<br />

Barwise, Jon and Perry, John. 1983. Situations and Attitudes. Cambridge: MIT Press.<br />

Bateman, Bradley. 1996. Keynes’s Uncertain Revolution. Ann Arbor: University of<br />

Michigan Press.<br />

Bealer, George. 1998. “Intuition and the Autonomy of Philosophy.” In DePaul and<br />

Ramsey (1998), 201–240.<br />

Bennett, Jonathan. 1984. “Counterfactuals and Temporal Direction.” Philosophical<br />

Review 93:57–91.<br />

—. 1988. “Farewell to the Phlogiston Theory of Conditionals.” Mind 97:509–527.<br />

—. 1995. “Classifying Conditionals: The Traditional Way is Correct.” Mind 104:331–<br />

354.<br />

—. 2003. A Philosophical Guide to Conditionals. Oxford: Oxford University Press.<br />

Bernadete, Jose. 1964. Infinity: An Essay in Metaphysics. Oxford: Clarendon Press.<br />

Bertrand, Joseph Louis François. 1888. Calcul des probabilités. Paris: Gauthier-Villars<br />

et fils.<br />

Black, Max. 1952. “Saying and Disbelieving.” Analysis 13:25–33.<br />

Blome-Tillmann, Michael. 2009a. “Contextualism, Subject-Sensitive Invariantism,<br />

and the Interaction of ‘Knowledge’-Ascriptions with Modal and Temporal Operators.”<br />

Philosophy and Phenomenological Research 79:315–331, doi:10.1111/j.1933-<br />

1592.2009.00280.x.<br />

—. 2009b. “Knowledge and Presuppositions.” Mind 118:241–294,<br />

doi:10.1093/mind/fzp032.<br />

Bochvar, D. A. 1939. “On a Three Valued Calculus and Its Application to the Analysis<br />

of Contradictories.” Matematicheskii Sbornik 4:287–308.<br />

Bonini, Nicolao, Osherson, Daniel, Viale, Riccardo, and Williamson, Timothy. 1999.<br />

“On the Psychology of Vague Predicates.” Mind and Language 14:377–393.<br />

BonJour, Laurence. 1997. In Defense of Pure Reason. Cambridge: Cambridge University<br />

Press.



—. 1999. “Foundationalism and the External World.” Philosophical Perspectives<br />

13:229–249.<br />

Borel, Emile. 1924. “A propos d’un Traité de Probabilités.” Revue Philosophique<br />

98:321–336.<br />

Bostrom, Nick. 2003. “Are You Living in a Computer Simulation.” Philosophical<br />

Quarterly 53:243–255.<br />

Bovens, Luc and Hawthorne, James. 1999. “The Preface, the Lottery, and the Logic<br />

of Belief.” Mind 108:241–264.<br />

Boyd, Richard. 1988. “How to Be a Moral Realist.” In Geoffrey Sayre-McCord (ed.),<br />

Essays in Moral Realism, 181–228. Ithaca: Cornell University Press.<br />

Braddon-Mitchell, David and Jackson, Frank. 2007. The Philosophy of Mind and Cognition,<br />

second edition. Malden, MA: Blackwell.<br />

Braddon-Mitchell, David and Nola, Robert. 1997. “Ramsification and Glymour’s<br />

Counterexample.” Analysis 57:167–169.<br />

Bradford, Wylie and Harcourt, Geoff. 1997. “Definitions and Units.” In Harcourt<br />

and Riach (1997), 107–131.<br />

Bradley, Richard. 2000. “A Preservation Condition for Conditionals.” Analysis<br />

60:219–222.<br />

Braun, David and Sider, Theodore. 2007. “Vague, So Untrue.” Noûs 41:133–156.<br />

Brewer, Bill. 1999. Perception and Reason. Oxford: Oxford University Press.<br />

Briggs, Rachael. 2009. “Distorted Reflection.” Philosophical Review 118:59–85.<br />

Brown, Jessica. 2008. “Knowledge and Practical Reason.” Philosophy Compass 3:1135–<br />

1152, doi:10.1111/j.1747-9991.2008.00176.x.<br />

Burge, Tyler. 1979. “Individualism and the Mental.” Midwest Studies in Philosophy<br />

4:73–121.<br />

Burgess, J. A. and Humberstone, I. L. 1987. “Natural Deduction Rules for a Logic of<br />

Vagueness.” Erkenntnis 27:197–229.<br />

Burgess, John. 2001. “Vagueness, Epistemicism and Response-Dependence.” Australasian<br />

Journal of Philosophy 79:507–24.<br />

Butterfield, Jeremy. 2006. “Against Pointillisme about Mechanics.” British Journal for<br />

the Philosophy of Science 57:709–753.<br />

Byrne, Alex. 1993. “Truth in Fiction - The Story Continued.” Australasian Journal<br />

of Philosophy 71:24–35.



Campbell, John. 2002. Reference and Consciousness. Oxford: Oxford University<br />

Press.<br />

Cappelen, Herman and Lepore, Ernest. 1997. “On an Alleged Connection between<br />

Indirect Quotation and Semantic Theory.” Mind and Language 12:278–96.<br />

Cappelen, Herman. 2008. “Content Relativism.” In Manuel Garcia-Carpintero and<br />

Max Kölbel (eds.), Relativising Utterance Truth, 265–286. Oxford: Oxford University<br />

Press.<br />

Cappelen, Herman and Hawthorne, John. 2009. Relativism and Monadic Truth. Oxford:<br />

Oxford University Press.<br />

Cappelen, Herman and Lepore, Ernest. 2005. Insensitive Semantics: A Defence of<br />

Semantic Minimalism and Speech Act Pluralism. Oxford: Blackwell.<br />

Carnap, Rudolf. 1950. Logical Foundations of Probability. Chicago: University of<br />

Chicago Press.<br />

Chalmers, David. 2006. “Foundations of Two-Dimensional Semantics.” In Manuel<br />

Garcia-Carpintero and Josep Macià (eds.), Two-Dimensional Semantics, 55–140.<br />

Oxford: Oxford University Press.<br />

Chambers, Robert and Quiggin, John. 2000. Uncertainty, Production, Choice, And<br />

Agency: The State-Contingent Approach. Cambridge: Cambridge University Press.<br />

Christensen, David. 1996. “Dutch-Book Arguments De-Pragmatized.” Journal of<br />

Philosophy 93:450 – 479.<br />

—. 2005. Putting Logic in Its Place. Oxford: Oxford University Press.<br />

—. 2007. “Epistemology of disagreement: The good news.” Philosophical Review<br />

116:187–217.<br />

—. 2010. “Disagreement, Question-Begging and Epistemic Self-Criticism.” Philosophers’<br />

Imprint.<br />

Coates, John. 1996. The Claims of Common Sense. Cambridge: Cambridge University<br />

Press.<br />

—. 1997. “Keynes, Vague Concepts and Fuzzy Logic.” In Harcourt and Riach (1997),<br />

244–260.<br />

Cohen, L. Jonathan. 1977. The Probable and the Provable. Oxford: Clarendon Press.<br />

Cohen, Stewart. 1984. “Justification and Truth.” Philosophical Studies 46:279–295.<br />

—. 1986. “Knowledge and Context.” The Journal of Philosophy 83:574–583.<br />

—. 1988. “How to be a Fallibilist.” Philosophical Perspectives 2:91–123.



—. 2005. “Why Basic Knowledge is Easy Knowledge.” Philosophy and Phenomenological<br />

Research 70:417–430.<br />

Cottingham, John. 2002. “Descartes and the Voluntariness of Belief.” The Monist<br />

85:343–360.<br />

Cummins, Robert. 1998. “Reflection on Reflective Equilibrium.” In DePaul and<br />

Ramsey (1998), 113–128.<br />

Currie, Gregory. 1990. The Nature of Fiction. Cambridge: Cambridge University<br />

Press.<br />

—. 2002. “Desire in Imagination.” In Gendler and Hawthorne (2002), 201–221.<br />

David, Marian. 2002. “Content Essentialism.” Acta Analytica 17:103–114.<br />

Davidson, Donald. 1963. “Actions, Reasons and Causes.” Journal of Philosophy<br />

60:685–700.<br />

—. 1970. “Mental Events.” In Lawrence Foster and J. W. Swanson (eds.), Experience<br />

and Theory, 79–101. London: Duckworth.<br />

Davidson, Paul. 1988. “A Technical Definition of Uncertainty and the Long-Run<br />

Non-neutrality of Money.” Cambridge Journal of Economics 12:329–338.<br />

—. 1991. “Is Probability Theory Relevant for Uncertainty? A Post Keynesian Perspective.”<br />

Journal of Economic Perspectives 5:129–144.<br />

Davies, Martin. 1981. Meaning, Quantification, Necessity: Themes in Philosophical<br />

Logic. London: Routledge.<br />

Davies, Martin and Humberstone, I. L. 1980. “Two Notions of Necessity.” Philosophical<br />

Studies 38:1–31.<br />

Davis, John. 1994. Keynes’s Philosophical Development. Cambridge: Cambridge University<br />

Press.<br />

—. 1995. “Keynes’ Later Philosophy.” History of Political Economy 27:237–260.<br />

Davis, Wayne A. 1979. “Indicative and Subjunctive Conditionals.” Philosophical<br />

Review 88:544–64.<br />

de Finetti, Bruno. 1974. Theory of Probability. New York: Wiley.<br />

Dempster, Arthur. 1967. “Upper and Lower Probabilities Induced by a Multivalued<br />

Mapping.” Annals of Mathematical Statistics 38:325–339.<br />

—. 1968. “A Generalisation of Bayesian Inference.” Journal of the Royal Statistical<br />

Society Series B 30:205–247.<br />

Denby, David. 2001. “Determinable Nominalism.” Philosophical Studies 102:297–327.



DePaul, Michael and Ramsey, William (eds.). 1998. Rethinking Intuition. Lanham:<br />

Rowman & Littlefield.<br />

DeRose, Keith. 1991. “Epistemic Possibilities.” Philosophical Review 100:581–605.<br />

—. 1995. “Solving the Skeptical Problem.” Philosophical Review 104:1–52.<br />

—. 1996. “Knowledge, Assertion and Lotteries.” Philosophical Review 74:568–79.<br />

—. 1998. “Simple Might’s, Indicative Possibilities, and the Open Future.” The Philosophical<br />

Quarterly 48:67–82.<br />

—. 2002. “Assertion, Knowledge and Context.” Philosophical Review 111.<br />

—. 2004. “Single Scoreboard Semantics.” Philosophical Studies 119:1–21.<br />

Descartes, René. 1641/1996. Meditations on First Philosophy, tr. John Cottingham.<br />

Cambridge: Cambridge University Press.<br />

—. 1644/2003. The Principles of Philosophy, tr. John Veitch. Champaign, IL: Project<br />

Gutenberg.<br />

Deutsch, Max. 2009. “Experimental Philosophy and the Theory of Reference.” Mind<br />

and Language 24:445–466.<br />

Devitt, Michael. 2010. “Experimental Semantics.” Philosophy and Phenomenological<br />

Research Forthcoming.<br />

Dietz, Richard. 2008. “Epistemic Modals and Correct Disagreement.” In Manuel<br />

Garcia-Carpintero and Max Kölbel (eds.), Relative Truth, 239–264. Oxford: Oxford<br />

University Press.<br />

Dogramaci, Sinan. 2010. “Knowledge of Validity.” Noûs Forthcoming.<br />

Dorr, Cian. 2003. “Vagueness without Ignorance.” Philosophical Perspectives 17:83–<br />

113.<br />

Douven, Igor. 2006. “Assertion, Knowledge and Rational Credibility.” Philosophical<br />

Review 115.<br />

Dreier, James. 2001. “Boundless Good.” Unpublished manuscript.<br />

Dretske, Fred. 1971. “Conclusive Reasons.” Australasian Journal of Philosophy 49:1–<br />

22.<br />

Driver, Julia. 2001. Uneasy Virtues. Cambridge: Cambridge University Press.<br />

Dudman, V. H. 1994. “Against the Indicative.” Australasian Journal of Philosophy<br />

72:17–26.<br />

Dummett, Michael. 1959. “Truth.” Proceedings of the Aristotelian Society New Series<br />

59:141–62.



—. 1975. “Wang’s Paradox.” Synthese 30:301–24.<br />

—. 1991. The Logical Basis of Metaphysics. Cambridge, MA: Harvard University Press.<br />

Dunn, J. Michael. 1990. “Relevant Predication 2: Intrinsic Properties and Internal<br />

Relations.” Philosophical Studies 60:177–206.<br />

Edgington, Dorothy. 1995. “On Conditionals.” Mind 104:235–327.<br />

—. 1996. “Lowe on Conditional Probability.” Mind 105:617–630.<br />

Egan, Andy. 2004. “Second-Order Predication and the Metaphysics of Properties.”<br />

Australasian Journal of Philosophy 82:48 – 66.<br />

—. 2007a. “Epistemic Modals, Relativism and Assertion.” Philosophical Studies 133:1–<br />

22.<br />

—. 2007b. “Some Counterexamples to Causal Decision Theory.” Philosophical Review<br />

116.<br />

—. 2009. “Billboards, Bombs and Shotgun Weddings.” Synthese 166.<br />

Egan, Andy and Elga, Adam. 2005. “I Can’t Believe I’m Stupid.” Philosophical Perspectives<br />

19:77–93.<br />

Egan, Andy, Hawthorne, John, and Weatherson, Brian. 2005. “Epistemic Modals<br />

in Context.” In Gerhard Preyer and Georg Peter (eds.), Contextualism in Philosophy:<br />

Knowledge, Meaning, and Truth, 131–170. Oxford University Press.<br />

Einheuser, Iris. 2008. “Three Forms of Truth-Relativism.” In Manuel Garcia-<br />

Carpintero and Max Kölbel (eds.), Relativising Utterance Truth, 187–203. Oxford:<br />

Oxford University Press.<br />

Eklund, Matti. 2002. “Inconsistent Languages.” Philosophy and Phenomenological<br />

Research 64:251–75.<br />

—. 2005. “What Vagueness Consists in.” Philosophical Studies 125:27–60.<br />

Elga, Adam. 2000a. “Self-Locating Belief and the Sleeping Beauty Problem.” Analysis<br />

60:143–147.<br />

—. 2000b. “Self-Locating Belief and the Sleeping Beauty Problem.” Analysis 60:143–7.<br />

—. 2004a. “Defeating Dr. Evil with Self-Locating Belief.” Philosophy and Phenomenological<br />

Research 69:383–396.<br />

—. 2004b. “Defeating Dr. Evil with Self-Locating Belief.” Philosophy and Phenomenological<br />

Research 69:383–396.<br />

—. 2007. “Reflection and disagreement.” Noûs 41:478–502.<br />

—. 2010. “How to disagree about how to disagree.” In Warfield and Feldman (2010).



Ellis, <strong>Brian</strong>. 1991. “Scientific Essentialism.” Paper presented to the 1991 conference<br />

of the Australasian Association for the History and Philosophy of Science.<br />

—. 2001. Scientific Essentialism. Cambridge: Cambridge University Press.<br />

Engel, Mylan. 1992. “Personal and Doxastic Justification in Epistemology.” Philosophical<br />

Studies 67:133–150.<br />

Euclid. 1956. The Thirteen Books of the Elements, tr. Thomas L. Heath. New York:<br />

Dover.<br />

Evans, Gareth. 1978. “Can There be Vague Objects?” Analysis 38:208.<br />

—. 1979. “Reference and Contingency.” The Monist 62:161–89.<br />

Evnine, Simon. 1999. “Believing Conjunctions.” Synthese 118:201–227,<br />

doi:10.1023/A:1005114419965.<br />

Fantl, Jeremy and McGrath, Matthew. 2002. “Evidence, Pragmatics, and Justification.”<br />

Philosophical Review 111:67–94.<br />

—. 2009. Knowledge in an Uncertain World. Oxford: Oxford University Press.<br />

Fara, Delia Graff. 2000. “Shifting Sands: An Interest-Relative Theory of Vagueness.”<br />

Philosophical Topics 28:45–81. This paper was published under the name ‘Delia<br />

Graff’.<br />

—. 2001. “Phenomenal Continua and the Sorites.” Mind 110:905–35. This paper was<br />

first published under the name ‘Delia Graff’.<br />

—. 2002. “An Anti-Epistemicist Consequence of Margin for Error Semantics for<br />

Knowledge.” Philosophy and Phenomenological Research 64:127–142. This paper<br />

was first published under the name ‘Delia Graff’.<br />

Feldman, Fred. 1998. “Hyperventilating about Intrinsic Value.” Journal of Ethics<br />

2:339–354.<br />

Feldman, Richard. 2005. “Respecting the evidence.” Philosophical Perspectives 19:95–<br />

119.<br />

—. 2006. “Epistemological puzzles about disagreement.” In Epistemology Futures,<br />

216–226. Oxford University Press.<br />

Feltz, Adam and Zarpentine, Chris. forthcoming. “Do You Know More When<br />

It Matters Less?” Philosophical Psychology. Retrieved from http://faculty.<br />

schreiner.edu/adfeltz/Papers/Know%20more.pdf.<br />

Field, Hartry. 1973. “Theory Change and the Indeterminacy of Reference.” Journal<br />

of Philosophy 70:462–81.



—. 1986. “The Deflationary Conception of Truth.” In Graham Macdonald (ed.), Fact,<br />

Science and Morality, 55–117. Oxford: Blackwell.<br />

—. 2000. “Indeterminacy, Degree of Belief, and Excluded Middle.” Noûs 34:1–30.<br />

Fine, Kit. 1975a. “Critical Notice of Counterfactuals.” Mind 84:451–458.<br />

—. 1975b. “Vagueness, Truth and Logic.” Synthese 30:265–300.<br />

—. 1994. “Compounds and Aggregates.” Noûs 28:137–58.<br />

von Fintel, Kai and Gillies, Anthony S. 2008. “CIA Leaks.” Philosophical Review<br />

117:77–98.<br />

Fodor, Jerry A. 1983. The Modularity of Mind. Cambridge, MA: MIT Press.<br />

—. 1987. Psychosemantics. Cambridge, MA: MIT Press.<br />

—. 1998. Concepts: Where Cognitive Science Went Wrong. Oxford: Oxford University<br />

Press.<br />

Fodor, Jerry A. and Lepore, Ernest. 1992. Holism: A Shopper’s Guide. Cambridge:<br />

Blackwell.<br />

—. 1996. “What Cannot Be Evaluated Cannot Be Evaluated, and It Cannot Be Supervaluated<br />

Either.” Journal of Philosophy 93:516–535.<br />

Foley, Richard. 1993. Working Without a Net. Oxford: Oxford University Press.<br />

Forrest, Peter. 1982. “Occam’s Razor and Possible Worlds.” Monist 65:456–464.<br />

Forrest, Peter and Armstrong, D. M. 1984. “An Argument Against David Lewis’<br />

Theory of Possible Worlds.” Australasian Journal of Philosophy 62:164–168.<br />

van Fraassen, Bas. 1989. Laws and Symmetry. Oxford: Clarendon Press.<br />

—. 1990. “Figures in a Probability Landscape.” In J. M. Dunn and A. Gupta (eds.),<br />

Truth or Consequences, 345–356. Amsterdam: Kluwer.<br />

Francescotti, Robert. 1999. “How to Define Intrinsic Properties.” Noûs 33:590–609.<br />

Frankfurt, Harry. 1971. “Freedom of the Will and the Concept of a Person.” Journal<br />

of Philosophy 68:5–20.<br />

Friedman, M. and Savage, L. 1952. “The Expected Utility Hypothesis and the Measurability<br />

of Utility.” Journal of Political Economy 60:463–474.<br />

Friedman, Milton. 1953. “The Methodology of Positive Economics.” In Essays in<br />

Positive Economics, 3–43. Chicago: University of Chicago Press.<br />

Gärdenfors, Peter and Sahlin, Nils-Eric. 1982. “Unreliable probabilities, risk taking<br />

and decision making.” Synthese 53:361–386.



Geach, P. T. 1969. God and the Soul. London: Routledge.<br />

Gendler, Tamar Szabó. 2000. “The Puzzle of Imaginative Resistance.” Journal of<br />

Philosophy 97:55–81.<br />

Gendler, Tamar Szabó and Hawthorne, John (eds.). 2002. Conceivability and Possibility.<br />

Oxford: Oxford University Press.<br />

Gendler, Tamar Szabó and Hawthorne, John. 2005. “The Real Guide to Fake Barns:<br />

A Catalogue of Gifts for Your Epistemic Enemies.” Philosophical Studies 124:331–<br />

352, doi:10.1007/s11098-005-7779-8.<br />

Gettier, Edmund L. 1963. “Is Justified True Belief Knowledge?” Analysis 23:121–123,<br />

doi:10.2307/3326922.<br />

Gibbard, Allan. 1975. “Contingent Identity.” Journal of Philosophical Logic 4:187–<br />

222.<br />

—. 1981. “Two Recent Theories of Conditionals.” In William Harper, Robert C.<br />

Stalnaker, and Glenn Pearce (eds.), Ifs, 211–247. Dordrecht: Reidel.<br />

—. 1990. Wise Choices, Apt Feelings: A Theory of Normative Judgment. Cambridge,<br />

MA: Harvard University Press.<br />

Gibbard, Allan and Harper, William. 1978. “Counterfactuals and Two Kinds of Expected<br />

Utility.” In C. A. Hooker, J. J. Leach, and E. F. McClennen (eds.), Foundations<br />

and Applications of Decision Theory, 125–162. Dordrecht: Reidel.<br />

Gibbons, John. 1993. “Identity without Supervenience.” Philosophical Studies 70:59–<br />

79.<br />

Gilbert, Daniel T. 1991. “How Mental Systems Believe.” American Psychologist<br />

46:107–119.<br />

Gilbert, Daniel T., Krull, Douglas S., and Malone, Patrick S. 1990. “Unbelieving<br />

the Unbelievable: Some problems in the rejection of false information.” Journal of<br />

Personality and Social Psychology 59:601–613.<br />

Gilbert, Daniel T., Tafarodi, Romin W., and Malone, Patrick S. 1993. “You Can’t<br />

Not Believe Everything You Read.” Journal of Personality and Social Psychology<br />

65:221–233.<br />

Gillies, Anthony S. 2009. “On Truth-Conditions for If (but Not Quite Only If ).”<br />

Philosophical Review 118:325–349.<br />

Ginet, Carl. 1985. “Contra Reliabilism.” Philosophical Studies 68:175–187.<br />

—. 2001. “Deciding to Believe.” In Matthias Steup (ed.), Knowledge, Truth and Duty,<br />

63–76. Oxford: Oxford University Press.



Glanzberg, Michael. 2007. “Context, Content and Relativism.” Philosophical Studies<br />

136:1–29.<br />

Goldblatt, Robert. 1992. Logics of Time and Computation. Palo Alto: CSLI.<br />

Goldman, Alvin. 2002. What is Social Epistemology? A Smorgasbord of Projects, 182–<br />

204. Oxford: Oxford University Press.<br />

Goodman, Nelson. 1951. The Structure of Appearance. Cambridge, MA: Harvard<br />

University Press.<br />

—. 1955. Fact, Fiction and Forecast. Cambridge: Harvard University Press.<br />

Greenough, Patrick. 2003. “Vagueness: A Minimal Theory.” Mind 112:235–81.<br />

Grice, H. Paul. 1989. Studies in the Way of Words. Cambridge, MA.: Harvard University<br />

Press.<br />

Haack, Susan. 1974. Deviant Logic. Chicago: University of Chicago Press.<br />

Hacking, Ian. 1967. “Possibility.” Philosophical Review 76:343–368.<br />

Hájek, Alan. 2000. “Objecting Vaguely to Pascal’s Wager.” Philosophical Studies 98.<br />

—. 2003. “What Conditional Probability Could Not Be.” Synthese 137:273–323.<br />

—. 2005. “Scotching Dutch Books.” Philosophical Perspectives 19:139–151.<br />

—. 2010. “David Lewis.” In The New Dictionary of Scientific Biography. New York:<br />

Scribners.<br />

Hall, Ned. 1994. “Correcting the Guide to Objective Chance.” Mind 103:505–518.<br />

—. 2004. “Causation and the Price of Transitivity.” In John Collins, Ned Hall, and<br />

L. A. Paul (eds.), Causation and Counterfactuals, 181–203. Cambridge: MIT Press.<br />

Halpern, Joseph. 2004. “Sleeping Beauty Reconsidered: Conditioning and Reflection<br />

in Asynchronous Systems.” In Oxford Studies in Epistemology, volume 1, 111–142.<br />

Oxford: Oxford University Press.<br />

Ham, Sandra A., Martin, Sarah, and Kohl III, Harold W. 2008. “Changes in the<br />

percentage of students who walk or bike to school-United States, 1969 and 2001.”<br />

Journal of Physical Activity and Health 5:205–215.<br />

Hammond, Peter J. 1988. “Consequentialist Foundations for Expected Utility.” Theory<br />

and Decision 25:25–78, doi:10.1007/BF00129168.<br />

Harcourt, G. C. and Riach, P. A. (eds.). 1997. A ‘Second Edition’ of the General Theory.<br />

London: Routledge.<br />

Hare, R. M. 1951. The Language of Morals. Oxford: Oxford University Press.


BIBLIOGRAPHY 724<br />

Harman, Gilbert. 1973. Thought. Princeton: Princeton University Press.<br />

—. 1983. “Problems with Probabilistic Semantics.” In Alex Orenstein and Rafael<br />

Stern (eds.), Developments in Semantics, 243–237. New York: Haven.<br />

—. 1986. Change in View. Cambridge, MA: Bradford.<br />

Hart, A. G. 1942. “Risk, Uncertainty and the Unprofitability of Compounding Probabilities.”<br />

In O. Lange, F. McIntyre, and T. O. Yntema (eds.), Studies in Mathematical<br />

Economics and Econometrics. Chicago: University of Chicago Press.<br />

Hart, H. L. A. 1961. The Concept of Law. Oxford: Clarendon Press.<br />

Haslanger, Sally. 1989. “Endurance and Temporary Intrinsics.” Analysis 49:119–125.<br />

Hasson, Uri, Simmons, Joseph P., and Todorov, Alexander. 2005. “Believe It or Not:<br />

On the possibility of suspending belief.” Psychological Science 16:566–571.<br />

Hausman, Daniel. 1992. “Why Look Under the Hood?” In Essays in Philosophy and<br />

Economic Methodology, 70–73. Cambridge: Cambridge University Press.<br />

Hawthorne, John. 1990. “A Note on Languages and Language.” Australasian Journal<br />

of Philosophy 68:116–118.<br />

—. 2001. “Intrinsic Properties and Natural Relations.” Philosophy and Phenomenological<br />

Research 63:399–403.<br />

—. 2002. “Deeply Contingent A Priori Knowledge.” Philosophy and Phenomenological<br />

Research 65:247–269.<br />

—. 2004a. “Humeans Are Out of Their Minds.” Noûs 38:351–8.<br />

—. 2004b. Knowledge and Lotteries. Oxford: Oxford University Press.<br />

—. 2006. “Quantity in Lewisian Metaphysics.” In Metaphysical Essays, 229–237. Oxford:<br />

Oxford University Press.<br />

Hawthorne, John and Stanley, Jason. 2008. “Knowledge and Action.” Journal of<br />

Philosophy 105:571–90.<br />

Heller, Mark. 1996. “Against Metaphysical Vagueness.” Philosophical Perspectives<br />

10:177–85.<br />

—. 2000. “Hobartian Voluntarism: Grounding a Deontological Conception of Epistemological<br />

Justification.” Pacific Philosophical Quarterly 81:130–141.<br />

Hellman, Geoffrey. 1997. “Bayes and Beyond.” Philosophy of Science 64:191–221.<br />

Hetherington, Stephen. 2001. Good Knowledge, Bad Knowledge: on two dogmas of<br />

epistemology. Oxford: Oxford University Press.<br />

Hicks, John. 1967. Critical Essays in Monetary Theory. Oxford: Clarendon Press.<br />


Hieronymi, Pamela. 2008. “Responsibility for Believing.” Synthese 161:357–373.<br />

Higginbotham, James and May, Robert. 1981. “Questions, Quantifiers and Crossing.”<br />

Linguistic Review 1:41–80.<br />

Hindriks, Frank. 2007. “The Status of the Knowledge Account of Assertion.” Linguistics<br />

and Philosophy 30:393–406.<br />

Holton, Richard. 1997. “Some Telling Examples: Reply to Tsohatzidis.” Journal of<br />

Pragmatics 28:625–8.<br />

—. 1999. “Intention and Weakness of Will.” The Journal of Philosophy 96:241–262.<br />

—. 2003. “How is Strength of Will Possible?” In Stroud and Tappolet (2003), 39–67.<br />

—. 2004. “Rational Resolve.” Philosophical Review 113:507–535.<br />

Holton, Richard and Shute, Stephen. 2007. “Self-Control in the Modern Provocation<br />

Defence.” Oxford Journal of Legal Studies 27:49–73.<br />

Horgan, Terrence. 1995. “Transvaluationism: a Dionysian Approach to Vagueness.”<br />

Southern Journal of Philosophy 33 (Spindel Conference Supplement):97–125.<br />

Horn, Laurence. 1989. A Natural History of Negation. Chicago: University of<br />

Chicago Press.<br />

Horowitz, Tamara. 1998. “Philosophical Intuitions and Psychological Theory.”<br />

Ethics 108:367–85.<br />

Horwich, Paul. 1999. Meaning. Oxford: Oxford University Press.<br />

Howson, Colin and Urbach, Peter. 1989. Scientific Reasoning. La Salle: Open Court.<br />

Humberstone, I. L. 1996. “Intrinsic/Extrinsic.” Synthese 108:205–67.<br />

Hume, David. 1757. “On the Standard of Taste.” In Essays: Moral, Political and Literary,<br />

227–249. Indianapolis: Liberty Press.<br />

Humphreys, Paul and Fetzer, James (eds.). 1998. The New Theory of Reference. Dordrecht:<br />

Kluwer.<br />

Hunter, Daniel. 1996. “On the Relation Between Categorical and Probabilistic Belief.”<br />

Noûs 30:75–98, doi:10.2307/2216304.<br />

Hurka, Thomas. 2001. “Vices as Higher-Level Evils.” Utilitas 13:195–212.<br />

Hyde, Dominic. 1997. “From Heaps and Gaps to Heaps of Gluts.” Mind 106:641–<br />

660.<br />

Jackson, Frank. 1987. Conditionals. Oxford: Blackwell.<br />

—. 1990. “Classifying Conditionals.” Analysis 50:134–147.<br />


—. 1991. “Decision Theoretic Consequentialism and the Nearest and Dearest Objection.”<br />

Ethics 101:461–82.<br />

—. 1998. From Metaphysics to Ethics: A Defence of Conceptual Analysis. Oxford:<br />

Clarendon Press.<br />

—. 2001. “Responses.” Philosophy and Phenomenological Research 62:653–64.<br />

Jacobson, Pauline. 1999. “Towards a Variable Free Semantics.” Linguistics and Philosophy<br />

22:117–184.<br />

Jaffray, J. Y. 1994. “Decision Making with Belief Functions.” In Yager et al. (1994),<br />

331–352.<br />

Jeffrey, Richard. 1983a. “Bayesianism with a Human Face.” In J. Earman (ed.),<br />

Testing Scientific Theories. Minneapolis: University of Minnesota Press.<br />

Jeffrey, Richard C. 1983b. The Logic of Decision. Chicago: University of Chicago<br />

Press, 2nd edition.<br />

Jehle, David. 2009. Some Results in Bayesian Confirmation Theory with Applications.<br />

Ph.D. thesis, Cornell University.<br />

Jehle, David and Fitelson, Branden. 2009. “What is the ‘Equal Weight View’?” Episteme<br />

6:280–293.<br />

Jenkins, C. S. 2005. “Sleeping Beauty: A Wake-Up Call.” Philosophia Mathematica<br />

(III) 13:194–201.<br />

Jeshion, Robin. 2002. “Acquaintanceless De Re Belief.” In Joseph Keim Campbell,<br />

Michael O’Rourke, and David Shier (eds.), Meaning and Truth: Investigations in<br />

Philosophical Semantics, 53–74. New York: Seven Bridges Press.<br />

Johnston, Mark. 1987. “Is There a Problem About Persistence?” Proceedings of the<br />

Aristotelian Society, Supplementary Volume 61:107–135.<br />

Jones, Matthew L. 2006. The Good Life in the Scientific Revolution: Descartes, Pascal,<br />

Leibniz and the Cultivation of Virtue. Chicago: University of Chicago Press.<br />

Joyce, James. 1914/2000. Dubliners. Oxford: Oxford University Press.<br />

—. 1922/1993. Ulysses. Oxford: Oxford University Press.<br />

—. 1944/1963. Stephen Hero. Norfolk, CT: New Directions.<br />

Joyce, James M. 1998. “A Non-Pragmatic Vindication of Probabilism.” Philosophy of<br />

Science 65:575–603.<br />

Kagan, Shelly. 1998. “Rethinking Intrinsic Value.” Journal of Ethics 2:277–297.<br />

Kaplan, David. 1989. “Demonstratives.” In Joseph Almog, John Perry, and Howard<br />

Wettstein (eds.), Themes from Kaplan, 481–563. Oxford: Oxford University Press.<br />

Kaplan, Mark. 1993. “Confessions of a Modest Bayesian.” Canadian Journal of Philosophy<br />

s19:315–337.<br />

—. 1996. Decision Theory as Philosophy. Cambridge: Cambridge University Press.<br />

Keefe, Rosanna. 2000. Theories of Vagueness. Cambridge: Cambridge University<br />

Press.<br />

Keller, Simon. 2005. “Patriotism as Bad Faith.” Ethics 115:563–592.<br />

Kelly, Thomas. 2005. “The Epistemic Significance of Disagreement.” In Oxford Studies<br />

in Epistemology, Volume 1, 167–196. Oxford: Oxford University Press.<br />

—. 2010. “Peer Disagreement and Higher Order Evidence.” In Warfield and Feldman<br />

(2010).<br />

Kennett, Jeanette and Smith, Michael. 1996a. “Frog and Toad Lose Control.” Analysis<br />

56:63–73, doi:10.1111/j.0003-2638.1996.00063.x.<br />

—. 1996b. “Philosophy and Commonsense: The Case of Weakness of<br />

Will.” In Michaelis Michael and John O’Leary-Hawthorne (eds.), The<br />

Place of Philosophy in the Study of Mind, 141–157. Norwell, MA: Kluwer,<br />

doi:10.1017/CBO9780511606977.005.<br />

Keynes, John Maynard. 1909. “The Method of Index Numbers with Special Reference<br />

to the Measurement of General Exchange Value.” In Keynes (1971-1989),<br />

50–156.<br />

—. 1921. Treatise on Probability. London: Macmillan.<br />

—. 1931. “Review of Foundations of Mathematics by Frank Ramsey.” The New Statesman<br />

and Nation 2:407. Reprinted in (Keynes, 1971-1989, X 336-339).<br />

—. 1934. Draft of the General Theory, 423–449. Volume XIII of Keynes (1971-1989).<br />

—. 1936. The General Theory of Employment, Interest and Money. London: Macmillan.<br />

—. 1937a. “The Ex Ante Theory of the Rate of Interest.” Economic Journal 47:663–<br />

668. Reprinted in (Keynes, 1971-1989, XIV 215-223), references to reprint.<br />

—. 1937b. “The General Theory of Employment.” Quarterly Journal of Economics<br />

51:209–223. Reprinted in (Keynes, 1971-1989, XIV 109-123), references to reprint.<br />

—. 1938a. Letter to Hugh Townshend dated 7 December, 293–294. Volume XIV of Keynes<br />

(1971-1989).<br />

—. 1938b. “My Early Beliefs.” In Keynes (1971-1989), 433–451.<br />


—. 1971-1989. The Collected Writings of John Maynard Keynes. London: Macmillan.<br />

Khamara, E. J. 1988. “Indiscernibles and the Absolute Theory of Space and Time.”<br />

Studia Leibnitiana 20:140–159.<br />

Kidd, John. 1988. “The Scandal of ‘Ulysses’.” The New York Review of Books 35:32–<br />

39.<br />

Kieran, Matthew and Lopes, Dominic McIver (eds.). 2003. Imagination, Philosophy<br />

and the Arts. London: Routledge.<br />

Kim, Jaegwon. 1973. “Causes and Counterfactuals.” Journal of Philosophy 70:570–<br />

572.<br />

—. 1982. “Psychophysical Supervenience.” Philosophical Studies 41:51–70.<br />

King, Jeff and Stanley, Jason. 2005. “Semantics, Pragmatics and the Role of Semantic<br />

Content.” In Zoltan Szabó (ed.), Semantics vs Pragmatics, 111–164. Oxford:<br />

Oxford University Press.<br />

King, Jeffrey. 1994. “Can Propositions be Naturalistically Acceptable?” Midwest<br />

Studies in Philosophy 19:53–75.<br />

—. 1995. “Structured Propositions and Complex Predicates.” Noûs 29:516–35.<br />

—. 1998. “What is a Philosophical Analysis?” Philosophical Studies 90:155–179.<br />

Knight, Frank. 1921. Risk, Uncertainty and Profit. Chicago: University of Chicago<br />

Press.<br />

Kölbel, Max. 2004. “Indexical Relativism vs Genuine Relativism.” International<br />

Journal of Philosophical Studies 12:297–313.<br />

—. 2009. “The Evidence for Relativism.” Synthese 166:375–395.<br />

Krebs, Angelika. 1999. Ethics of Nature: A Map. Hawthorne: de Gruyter.<br />

Kripke, Saul. 1965. “Semantical Analysis of Intuitionistic Logic.” In Michael Dummett<br />

and John Crossley (eds.), Formal Systems and Recursive Functions. Amsterdam:<br />

North-Holland.<br />

—. 1975. “Outline of a Theory of Truth.” Journal of Philosophy 72:690–716.<br />

—. 1980. Naming and Necessity. Cambridge: Harvard University Press.<br />

—. 1982. Wittgenstein on Rules and Private Language. Oxford: Basil Blackwell.<br />

Kyburg, Henry. 1974. The Logical Foundations of Statistical Inference. Dordrecht:<br />

Reidel.<br />


Langton, Rae and Lewis, David. 1998. “Defining ‘Intrinsic’.” Philosophy and Phenomenological<br />

Research 58:333–345. Reprinted in Papers in Metaphysics and Epistemology,<br />

pp. 116-132.<br />

—. 2001. “Marshall and Parsons on ‘Intrinsic’.” Philosophy and Phenomenological<br />

Research 63:353–355.<br />

Lasersohn, Peter. 2005. “Context Dependence, Disagreement and Predicates of Personal<br />

Taste.” Linguistics and Philosophy 28:643–686.<br />

Laurence, Stephen and Margolis, Eric. 2001. “The Poverty of the Stimulus Argument.”<br />

British Journal for the Philosophy of Science 52:217–76.<br />

Leeds, Stephen. 2000. “A Disquotationalist Looks at Vagueness.” Philosophical Topics<br />

28:107–28.<br />

Leibniz, Gottfried Wilhelm. 1998. Philosophical Texts. Oxford: Oxford University<br />

Press.<br />

Levi, Isaac. 1974. “On Indeterminate Probabilities.” Journal of Philosophy 71:391–418.<br />

—. 1980. The Enterprise of Knowledge. Cambridge, MA.: MIT Press.<br />

—. 1982. “Ignorance, Probability and Rational Choice.” Synthese 53:387–417.<br />

—. 1996. For the Sake of the Argument: Ramsey Test Conditionals, Inductive Inference<br />

and Nonmonotonic Reasoning. Cambridge: Cambridge University Press.<br />

Levinson, Stephen. 2000. Presumptive Meanings. Cambridge, MA: MIT Press.<br />

Lewis, David. 1966. “An Argument for the Identity Theory.” Journal of Philosophy<br />

63:17–25. Reprinted with additions as Lewis (1971a).<br />

—. 1968. “Counterpart Theory and Quantified Modal Logic.” Journal of Philosophy<br />

65:113–126. Reprinted in Philosophical Papers, Volume I, pp. 26-39.<br />

—. 1969a. Convention: A Philosophical Study. Cambridge: Harvard University Press.<br />

—. 1969b. “Lucas against Mechanism.” Philosophy 44:231–3. Reprinted in Papers in<br />

Philosophical Logic, pp. 166-169.<br />

—. 1970a. “Anselm and Actuality.” Noûs 4:175–188. Reprinted in Philosophical Papers,<br />

Volume I, pp. 10-20.<br />

—. 1970b. “How to Define Theoretical Terms.” Journal of Philosophy 67:427–446.<br />

Reprinted in Philosophical Papers, Volume I, pp. 78-95.<br />

—. 1971a. “An Argument for the Identity Theory.” In David M. Rosenthal (ed.),<br />

Materialism and the Mind-Body Problem, 162–171. Englewood Cliffs, NJ: Prentice-<br />

Hall. Reprinted in Philosophical Papers, Volume I, pp. 99-107.<br />


—. 1971b. “Counterparts of Persons and Their Bodies.” Journal of Philosophy 68:203–<br />

211. Reprinted in Philosophical <strong>Papers</strong>, Volume I, pp. 47-54.<br />

—. 1972. “Psychophysical and Theoretical Identifications.” Australasian Journal of<br />

Philosophy 50:249–58. Reprinted in Papers in Metaphysics and Epistemology, pp.<br />

248-261.<br />

—. 1973a. “Causation.” Journal of Philosophy 70:556–567. Reprinted in Philosophical<br />

Papers, Volume II, pp. 159-172.<br />

—. 1973b. Counterfactuals. Oxford: Blackwell Publishers.<br />

—. 1974a. “Radical Interpretation.” Synthese 27:331–344. Reprinted in Philosophical<br />

Papers, Volume I, pp. 108-118.<br />

—. 1974b. “’Tensions.” In Milton K. Munitz and Peter K. Unger (eds.), Semantics and<br />

Philosophy, 49–61. New York: New York University Press. Reprinted in Philosophical<br />

Papers, Volume I, pp. 250-260.<br />

—. 1975a. “Adverbs of Quantification.” In Formal Semantics of Natural Language, 3–<br />

15. Cambridge: Cambridge University Press. Reprinted in Papers in Philosophical<br />

Logic, pp. 5-20.<br />

—. 1975b. “Languages and Language.” In Minnesota Studies in the Philosophy of Science,<br />

volume 7, 3–35. Minneapolis: University of Minnesota Press. Reprinted in<br />

Philosophical Papers, Volume I, pp. 163-188.<br />

—. 1976a. “The Paradoxes of Time Travel.” American Philosophical Quarterly 13:145–<br />

152. Reprinted in Philosophical Papers, Volume II, pp. 67-80.<br />

—. 1976b. “Probabilities of Conditionals and Conditional Probabilities.” Philosophical<br />

Review 85:297–315. Reprinted in Philosophical Papers, Volume II, pp. 133-152.<br />

—. 1978a. “Reply to McMichael.” Analysis 38:85–86. Reprinted in Papers in Ethics<br />

and Social Philosophy, pp. 34-36.<br />

—. 1978b. “Truth in Fiction.” American Philosophical Quarterly 15:37–46. Reprinted<br />

in Philosophical Papers, Volume I, pp. 261-275.<br />

—. 1979a. “Attitudes De Dicto and De Se.” Philosophical Review 88:513–543. Reprinted<br />

in Philosophical Papers, Volume I, pp. 133-156.<br />

—. 1979b. “Counterfactual Dependence and Time’s Arrow.” Noûs 13:455–476.<br />

Reprinted in Philosophical Papers, Volume II, pp. 32-52.<br />

—. 1979c. “Lucas against Mechanism II.” Canadian Journal of Philosophy 9:373–6.<br />

Reprinted in Papers in Philosophical Logic, pp. 170-173.<br />

—. 1979d. “Prisoners’ Dilemma is a Newcomb Problem.” Philosophy and Public<br />

Affairs 8:235–240. Reprinted in Philosophical Papers, Volume II, pp. 299-304.<br />


—. 1979e. “A Problem about Permission.” In Esa Saarinen, Risto Hilpinen, Illka<br />

Niiniluoto, and Merrill Provence (eds.), Essays in Honour of Jaakko Hintikka on<br />

the occasion of his fiftieth birthday on January 12, 1979, 163–175. Dordrecht: Reidel.<br />

Reprinted in Papers in Ethics and Social Philosophy, pp. 20-33.<br />

—. 1979f. “Scorekeeping in a Language Game.” Journal of Philosophical Logic 8:339–<br />

359. Reprinted in Philosophical Papers, Volume I, pp. 233-249.<br />

—. 1980a. “Index, Context, and Content.” In Stig Kanger and Sven Öhman (eds.),<br />

Philosophy and Grammar, 79–100. Dordrecht: Reidel. Reprinted in Papers in Philosophical<br />

Logic, pp. 21-44.<br />

—. 1980b. “Mad Pain and Martian Pain.” In Ned Block (ed.), Readings in the Philosophy<br />

of Psychology, volume I, 216–232. Cambridge: Harvard University Press.<br />

Reprinted in Philosophical Papers, Volume I, pp. 122-130.<br />

—. 1980c. “A Subjectivist’s Guide to Objective Chance.” In Studies in Inductive<br />

Logic and Probability, volume 2, 83–132. Berkeley: University of California Press.<br />

Reprinted in Philosophical Papers, Volume II, pp. 83-113.<br />

—. 1980d. “Veridical Hallucination and Prosthetic Vision.” Australasian Journal of<br />

Philosophy 58:239–249. Reprinted in Philosophical Papers, Volume II, pp. 273-286.<br />

—. 1981a. “Are we Free to Break the Laws?” Theoria 47:113–121. Reprinted in<br />

Philosophical Papers, Volume II, pp. 291-298.<br />

—. 1981b. “Causal Decision Theory.” Australasian Journal of Philosophy 59:5–30.<br />

Reprinted in Philosophical Papers, Volume II, pp. 305-337.<br />

—. 1982. “Logic for Equivocators.” Noûs 16:431–441. Reprinted in Papers in Philosophical<br />

Logic, pp. 97-110.<br />

—. 1983a. “Extrinsic Properties.” Philosophical Studies 44:197–200. Reprinted in<br />

Papers in Metaphysics and Epistemology, pp. 111-115.<br />

—. 1983b. “Individuation by Acquaintance and by Stipulation.” Philosophical Review<br />

92:3–32. Reprinted in Papers in Metaphysics and Epistemology, pp. 373-402.<br />

—. 1983c. “New Work for a Theory of Universals.” Australasian Journal of Philosophy<br />

61:343–377. Reprinted in Papers in Metaphysics and Epistemology, pp. 8-55.<br />

—. 1983d. Philosophical Papers, volume I. Oxford: Oxford University Press.<br />

—. 1984a. “Devil’s Bargains and the Real World.” In Douglas Maclean (ed.), The<br />

Security Gamble: Deterrence in the Nuclear Age, 141–154. Totowa, NJ: Rowman<br />

and Allenheld. Reprinted in Papers in Ethics and Social Philosophy, pp. 201-218.<br />

—. 1984b. “Putnam’s Paradox.” Australasian Journal of Philosophy 62:221–236.<br />

Reprinted in Papers in Metaphysics and Epistemology, pp. 56-77.<br />


—. 1986a. “Events.” In Philosophical Papers, volume II, 241–269. Oxford: Oxford University Press.<br />

—. 1986b. On the Plurality of Worlds. Oxford: Blackwell Publishers.<br />

—. 1986c. Philosophical Papers, volume II. Oxford: Oxford University Press.<br />

—. 1986d. “Probabilities of Conditionals and Conditional Probabilities II.” Philosophical<br />

Review 95:581–589. Reprinted in Papers in Philosophical Logic, pp. 57-65.<br />

—. 1988a. “Ayer’s First Empiricist Criterion of Meaning: Why Does it Fail?” Analysis<br />

48:1–3. Reprinted in Papers in Philosophical Logic, pp. 156-158.<br />

—. 1988b. “Desire as Belief.” Mind 97:323–32. Reprinted in Papers in Ethics and Social<br />

Philosophy, pp. 42-54.<br />

—. 1988c. “Rearrangement of Particles: Reply to Lowe.” Analysis 48:65–72.<br />

Reprinted in Papers in Metaphysics and Epistemology, pp. 187-195.<br />

—. 1988d. “Vague Identity: Evans Misunderstood.” Analysis 48:128–130.<br />

—. 1988e. “What Experience Teaches.” Proceedings of the Russellian Society 13:29–57.<br />

Reprinted in Papers in Metaphysics and Epistemology, pp. 262-290.<br />

—. 1989a. “Academic Appointments: Why Ignore the Advantage of Being Right?”<br />

Ormond Papers 6. Reprinted in Papers in Ethics and Social Philosophy, pp. 187-200.<br />

—. 1989b. “Dispositional Theories of Value.” Proceedings of the Aristotelian Society<br />

Supplementary Volume 63:113–137. Reprinted in Papers in Ethics and Social<br />

Philosophy, pp. 68-94.<br />

—. 1989c. “Mill and Milquetoast.” Australasian Journal of Philosophy 67:152–171.<br />

Reprinted in Papers in Ethics and Social Philosophy, pp. 159-186.<br />

—. 1990. “Noneism or Allism?” Mind 99:23–31. Reprinted in Papers in Metaphysics<br />

and Epistemology, pp. 152-163.<br />

—. 1991. Parts of Classes. Oxford: Blackwell.<br />

—. 1992. “Meaning without Use: Reply to Hawthorne.” Australasian Journal of<br />

Philosophy 70:106–110. Reprinted in Papers in Ethics and Social Philosophy, pp.<br />

145-151.<br />

—. 1993a. “Evil for Freedom’s Sake?” Philosophical Papers 22:149–172. Reprinted in<br />

Papers in Ethics and Social Philosophy, pp. 101-127.<br />

—. 1993b. “Many, But Almost One.” In Keith Campbell, John Bacon, and<br />

Lloyd Reinhardt (eds.), Ontology, Causality, and Mind: Essays on the Philosophy<br />

of D. M. Armstrong, 23–38. Cambridge: Cambridge University Press,<br />

doi:10.1017/CBO9780511625343.010. Reprinted in Papers in Metaphysics and Epistemology,<br />

pp. 164-182.<br />


—. 1993c. “Mathematics is Megethology.” Philosophia Mathematica 3:3–23. Reprinted<br />

in <strong>Papers</strong> in Philosophical Logic, pp. 203-230.<br />

—. 1994a. “Humean Supervenience Debugged.” Mind 103:473–490. Reprinted in<br />

Papers in Metaphysics and Epistemology, pp. 224-247.<br />

—. 1994b. “Reduction of Mind.” In Samuel Guttenplan (ed.), A<br />

Companion to the Philosophy of Mind, 412–431. Oxford: Blackwell,<br />

doi:10.1017/CBO9780511625343.019. Reprinted in Papers in Metaphysics<br />

and Epistemology, pp. 291-324.<br />

—. 1995. “Should a Materialist Believe in Qualia?” Australasian Journal of Philosophy<br />

73:140–44. Reprinted in Papers in Metaphysics and Epistemology, pp. 325-331.<br />

—. 1996a. “Desire as Belief II.” Mind 105:303–13. Reprinted in Papers in Ethics and<br />

Social Philosophy, pp. 55-67.<br />

—. 1996b. “Elusive Knowledge.” Australasian Journal of Philosophy 74:549–567,<br />

doi:10.1080/00048409612347521. Reprinted in Papers in Metaphysics and Epistemology,<br />

pp. 418-446.<br />

—. 1997a. “Do We Believe in Penal Substitution?” Philosophical Papers 26:203–209.<br />

Reprinted in Papers in Ethics and Social Philosophy, pp. 128-135.<br />

—. 1997b. “Finkish Dispositions.” Philosophical Quarterly 47:143–158. Reprinted in<br />

Papers in Metaphysics and Epistemology, pp. 133-151.<br />

—. 1997c. “Naming the Colours.” Australasian Journal of Philosophy 75:325–42.<br />

Reprinted in <strong>Papers</strong> in Metaphysics and Epistemology, pp. 332-358.<br />

—. 1998. Papers in Philosophical Logic. Cambridge: Cambridge University Press.<br />

—. 1999a. Papers in Metaphysics and Epistemology. Cambridge: Cambridge University<br />

Press.<br />

—. 1999b. “Why Conditionalize?” In Papers in Metaphysics and Epistemology, 403–<br />

407. Cambridge: Cambridge University Press. Originally written as a course handout in 1972.<br />

—. 2000. Papers in Ethics and Social Philosophy. Cambridge: Cambridge University<br />

Press.<br />

—. 2001a. “Redefining ‘Intrinsic’.” Philosophy and Phenomenological Research 63:381–<br />

398.<br />

—. 2001b. “Sleeping Beauty: Reply to Elga.” Analysis 61:171–176.<br />

—. 2001c. “Truthmaking and Difference-Making.” Noûs 35:602–615.<br />

—. 2002. “Tensing the Copula.” Mind 111:1–14.<br />


—. 2003. “Things qua Truthmakers.” In Hallvard Lillehammer and Gonzalo<br />

Rodriguez-Pereyra (eds.), Real Metaphysics: Essays in Honour of D. H. Mellor, 25–38.<br />

London: Routledge.<br />

—. 2004a. “Causation as Influence.” In John Collins, Ned Hall, and L. A. Paul (eds.),<br />

Causation and Counterfactuals, 75–106. Cambridge: MIT Press.<br />

—. 2004b. “How Many Lives has Schrödinger’s Cat?” Australasian Journal of Philosophy<br />

82:3–22.<br />

—. 2004c. “Void and Object.” In John Collins, Ned Hall, and L. A. Paul (eds.),<br />

Causation and Counterfactuals, 277–290. Cambridge: MIT Press.<br />

—. 2007. “Divine Evil.” In Louise Antony (ed.), Philosophers Without Gods, 231–242.<br />

Oxford: Oxford University Press.<br />

Lewis, David and Lewis, Stephanie. 1970. “Holes.” Australasian Journal of Philosophy<br />

48:206–212. Reprinted in Philosophical Papers, Volume I, pp. 3-9.<br />

Lipsey, R. G. and Lancaster, Kelvin. 1956-1957. “The General Theory of Second<br />

Best.” Review of Economic Studies 24:11–32, doi:10.2307/2296233.<br />

Littlejohn, Clayton. 2009. “The Externalist’s Demon.” Canadian Journal of Philosophy<br />

39:399–434.<br />

López de Sa, Dan. 2007a. “(Indexical) Relativism about Values: A Presuppositional<br />

Defense.”<br />

—. 2007b. “The Many Relativisms and the Question of Disagreement.” International<br />

Journal of Philosophical Studies 15:339–348.<br />

—. 2008. “Presuppositions of Commonality.” In Manuel Garcia-Carpintero and Max<br />

Kölbel (eds.), Relativising Utterance Truth, 297–310. Oxford: Oxford University Press.<br />

Lowe, E. J. 1988. “The Problems of Intrinsic Change: Rejoinder to Lewis.” Analysis<br />

48:72–77.<br />

Ludwig, Kirk. 2007. “The Epistemology of Thought Experiments.” Midwest Studies<br />

in Philosophy 31:128–159.<br />

Lycan, William. 1993. “MPP, RIP.” Philosophical Perspectives 7:411–428.<br />

—. 2001. “The Case for Phenomenal Externalism.” Philosophical Perspectives 15:17–<br />

35.<br />

MacFarlane, John. 2003. “Future Contingents and Relative Truth.” Philosophical<br />

Quarterly 53:321–336.<br />


—. 2005a. “The Assessment Sensitivity of Knowledge Attributions.” Oxford Studies<br />

in Epistemology 1:197–233.<br />

—. 2005b. “Making Sense of Relative Truth.” Proceedings of the Aristotelian Society<br />

105:321–339.<br />

—. 2007. “Relativism and Disagreement.” Philosophical Studies 132:17–31.<br />

—. 2007. “Semantic Minimalism and Nonindexical Contextualism.”<br />

In Gerhard Preyer and Georg Peter (eds.), Context-Sensitivity and Semantic Minimalism:<br />

New Essays on Semantics and Pragmatics, 240–250. Oxford: Oxford University<br />

Press.<br />

—. 2009. “Nonindexical Contextualism.” Synthese 166:231–250.<br />

Machina, Kenton. 1976. “Truth, Belief and Vagueness.” Journal of Philosophical Logic<br />

5:47–78, doi:10.1007/BF00263657.<br />

Mackie, John. 1977. Ethics: Inventing Right and Wrong. London: Penguin.<br />

Maher, Patrick. 1993. Betting on Theories. Cambridge: Cambridge University Press.<br />

—. 1997. “Depragmatised Dutch Book Arguments.” Philosophy of Science 64:291–305.<br />

Maitra, Ishani. 2007. “How and Why to Be a Moderate Contextualist.” In Gerhard<br />

Preyer and Georg Peter (eds.), Context Sensitivity and Semantic Minimalism: New<br />

Essays on Semantics and Pragmatics, 111–132. Oxford: Oxford University Press.<br />

—. 2010. “Assertion, Norms and Games.”<br />

Mallon, Ron, Machery, Edouard, Nichols, Shaun, and Stich, Stephen. 2009. “Against<br />

Arguments from Reference.” Philosophy and Phenomenological Research 79:332–<br />

356.<br />

Marshall, Dan and Parsons, Josh. 2001. “Langton and Lewis on ‘Intrinsic’.” Philosophy<br />

and Phenomenological Research 63:347–351.<br />

Martí, Genoveva. 2009. “Against Semantic Multi-Culturalism.” Analysis 69:42–48.<br />

Matravers, Derek. 2003. “Fictional Assent and the (So-Called) ‘Puzzle of Imaginative<br />

Resistance’.” In Kieran and Lopes (2003), 91–108.<br />

Maudlin, Tim. 2007. The Metaphysics Within Physics. Oxford: Oxford University<br />

Press.<br />

May, Joshua, Sinnott-Armstrong, Walter, Hull, Jay G., and Zimmerman, Aaron.<br />

forthcoming. “Practical Interests, Relevant Alternatives, and Knowledge Attributions:<br />

an Empirical Study.” Review of Philosophy and Psychology, doi:10.1007/s13164-009-0014-3.<br />


McCawley, James. 1996. “Conversational Scorekeeping and the Interpretation of<br />

Conditional Sentences.” In Masayoshi Shibatani and Sandra Thompson (eds.),<br />

Grammatical Constructions, 77–101. Oxford: Clarendon Press.<br />

McDowell, John. 1996. Mind and World. Cambridge, MA: Harvard University Press.<br />

McGee, Vann. 1985. “A Counterexample to Modus Ponens.” Journal of Philosophy<br />

82:462–471.<br />

—. 1991. Truth, Vagueness and Paradox. Indianapolis: Hackett.<br />

—. 1999. “An Airtight Dutch Book.” Analysis 59:257–265.<br />

McGee, Vann and McLaughlin, Brian. 1995. “Distinctions Without a Difference.”<br />

Southern Journal of Philosophy 33 (Supp):203–51.<br />

—. 1998. “Review of Timothy Williamson’s Vagueness.” Linguistics and Philosophy<br />

21:221–231.<br />

—. 2000. “The Lessons of the Many.” Philosophical Topics 28:129–51.<br />

McGrath, Sarah. 2005. “Causation by Omission: A Dilemma.” Philosophical Studies<br />

123:125–48.<br />

McKay, Thomas. 2006. Plural Predication. Oxford: Oxford University Press.<br />

McKinnon, Neil. 2002. “Supervaluations and the Problem of the Many.” Philosophical<br />

Quarterly 52:320–339.<br />

Melia, Joseph. 1992. “A Note on Lewis’s Ontology.” Analysis 52:191–192.<br />

Melia, Joseph and Saatsi, Juha. 2006. “Ramseyfication and Theoretical Content.”<br />

British Journal for the Philosophy of Science 57:561–585.<br />

Menzies, Peter. 1996. “Probabilistic Causation and the Pre-emption Problem.” Mind<br />

105:85–117.<br />

—. 1999. “Intrinsic versus Extrinsic Conceptions of Causation.” In Howard Sankey<br />

(ed.), Causation and Laws of Nature, 313–329. Dordrecht: Kluwer.<br />

Merricks, Trenton. 2001. “Varieties of Vagueness.” Philosophy and Phenomenological<br />

Research 62:145–57.<br />

Milne, Peter. 1991. “Scotching the Dutch Book Argument.” Erkenntnis 32:105–26.<br />

Moggridge, Donald. 1992. Maynard Keynes: An Economist’s Biography. London:<br />

Routledge.<br />

Montague, Richard. 1970. “Universal Grammar.” Theoria 36:373–398.<br />


—. 1973. “The Proper Treatment of Quantification in Ordinary English.” In K. J. J.<br />

Hintikka, J. M. E. Moravcsik, and P. Suppes (eds.), Approaches to Natural Language,<br />

221–242. Dordrecht: Reidel.<br />

Monton, Bradley. 2002. “Reflections on Sleeping Beauty.” Analysis 62:47–53.<br />

Moore, G. E. 1903. Principia Ethica. Cambridge: Cambridge University Press.<br />

Moran, Richard. 1995. “The Expression of Feeling in Imagination.” Philosophical<br />

Review 103:75–106.<br />

Morgan, Charles and Leblanc, Hugues. 1983a. “Probabilistic Semantics for Formal Logic.” Notre Dame Journal of Formal Logic 24:161–180.<br />

—. 1983b. “Probability Theory, Intuitionism, Semantics and the Dutch Book Argument.”<br />

Notre Dame Journal of Formal Logic 24:289–304.<br />

Morgan, Charles and Mares, Edward. 1995. “Conditionals, Probability and Non-Triviality.” Journal of Philosophical Logic 24:455–467.<br />

Neale, Stephen. 1990. Descriptions. Cambridge, MA: MIT Press.<br />

Nelkin, Dana. 2000. “The Lottery Paradox, Knowledge, and Rationality.” Philosophical<br />

Review 109:373–409.<br />

Nerlich, Graham. 1979. “Is Curvature Intrinsic to Physical Space?” Philosophy of<br />

Science 46:439–458.<br />

Neta, Ram. 2007. “Anti-intellectualism and the Knowledge-Action Principle.” Philosophy and Phenomenological Research 75:180–187, doi:10.1111/j.1933-1592.2007.00069.x.<br />

Newton, Isaac. 1952. Opticks. New York: Dover Press.<br />

Nolan, Daniel. 1998. “Impossible Worlds: A Modest Approach.” Notre Dame Journal<br />

of Formal Logic 38:535–573.<br />

—. 2003. “Defending a Possible-Worlds Account of Indicative Conditionals.” Philosophical<br />

Studies 116.<br />

—. 2005. David Lewis. Chesham: Acumen Publishing.<br />

—. 2006. “Selfless Desires.” Philosophy and Phenomenological Research 73:665–679.<br />


O’Donnell, Rod. 1989. Keynes: Philosophy, Economics and Politics. London: Macmillan.<br />

—. 1991. “Reply.” In Rod O’Donnell (ed.), Keynes as Philosopher-Economist, 78–102.<br />

London: Macmillan.



—. 1997. “Keynes and Formalism.” In Harcourt and Riach (1997), 131–165.<br />

Owens, David. 2000. Reason Without Freedom: The Problem of Epistemic Responsibility.<br />

New York: Routledge.<br />

Pagin, Peter. 2005. “Compositionality and Context.” In Gerhard Preyer and Georg<br />

Peter (eds.), Contextualism in Philosophy: Knowledge, Meaning, and Truth, 303–348.<br />

Oxford: Oxford University Press.<br />

Paris, J. B. 1994. The Uncertain Reasoner’s Companion: A Mathematical Perspective.<br />

Cambridge: Cambridge University Press.<br />

Parsons, Josh. 2007. “Is Everything a World?” Philosophical Studies 134:165–181.<br />

Parsons, Terence. 2000. Indeterminate Identity. Oxford: Oxford University Press.<br />

Partee, Barbara. 1989. “Binding Implicit Variables in Quantified Contexts.” In Caroline Wiltshire, Randolph Graczyk, and Bradley Music (eds.), Papers from the Twenty-fifth Regional Meeting of the Chicago Linguistic Society, 342–356. Chicago: Chicago Linguistic Society. Reprinted in Partee (2004).<br />

—. 2004. Compositionality in Formal Semantics: Selected Papers by Barbara H. Partee. Oxford: Blackwell.<br />

Perry, John. 1979. “The Problem of the Essential Indexical.” Noûs 13:3–21.<br />

Pettit, Philip and Sugden, Robert. 1989. “The backward induction paradox.” Journal<br />

of Philosophy 86:169–182.<br />

Plantinga, Alvin. 1974. The Nature of Necessity. Oxford: Oxford University Press.<br />

Ponsonnet, Jean-Marc. 1996. “The Best and the Worst in G. L. S. Shackle’s Decision Theory.” In Christian Schmidt (ed.), Uncertainty in Economic Thought, 169–196. Cheltenham: Edward Elgar.<br />

Poole, David, Mackworth, Alan, and Goebel, Randy. 1998. Computational Intelligence:<br />

A Logical Approach. Oxford: Oxford University Press.<br />

Priest, Graham. 1998. “What Is So Bad About Contradictions?” Journal of Philosophy<br />

95:410–426.<br />

—. 1999. “Sylvan’s Box: A Short Story and Ten Morals.” Notre Dame Journal of Formal Logic 38:573–582.<br />

—. 2001. An Introduction to Non-Classical Logic. Cambridge: Cambridge University<br />

Press.<br />

Pryor, James. 2000a. “The Sceptic and the Dogmatist.” Noûs 34:517–549, doi:10.1111/0029-4624.00277.<br />




—. 2001. “Highlights of Recent Epistemology.” British Journal for the Philosophy of<br />

Science 52:95–124.<br />

—. 2004a. “Is Moore’s Argument an Example of Transmission Failure?” Philosophical<br />

Issues 14:349–378.<br />

—. 2004b. “What’s Wrong with Moore’s Argument?” Philosophical Issues 14:349–378.<br />

Putnam, Hilary. 1973. “Meaning and Reference.” Journal of Philosophy 70:699–711.<br />

—. 1980. “Models and Reality.” Journal of Symbolic Logic 45:464–82.<br />

—. 1981. Reason, Truth and History. Cambridge: Cambridge University Press.<br />

Quine, W. V. O. 1960. Word and Object. Cambridge, MA: MIT Press.<br />

—. 1969. “Propositional Objects.” In Ontological Relativity and Other Essays, 139–<br />

160. New York: Columbia University Press.<br />

—. 1973. The Roots of Reference. La Salle: Open Court.<br />

Rabinowicz, Wlodzimierz. 1979. Universalizability. Dordrecht: Reidel.<br />

Raffman, Diana. 1994. “Vagueness Without Paradox.” Philosophical Review 103:41–<br />

74.<br />

Ramsey, Frank. 1926. “Truth and Probability.” In Ramsey (1990), 52–94.<br />

—. 1929. “Probability and Partial Belief.” In Ramsey (1990), 95–96.<br />

—. 1931. The Foundations of Mathematics and other Logical Essays. London: Routledge.<br />

—. 1990. Philosophical <strong>Papers</strong>. Cambridge: Cambridge University Press.<br />

Ramsey, William. 1998. “Prototypes and Conceptual Analysis.” In DePaul and Ramsey<br />

(1998), 161–177.<br />

Reichenbach, Hans. 1956. The Direction of Time. Berkeley: University of California<br />

Press.<br />

Richard, Mark. 2004. “Contextualism and Relativism.” Philosophical Studies 119:215–<br />

242.<br />

Richter, Reed. 1984. “Rationality Revisited.” Australasian Journal of Philosophy 62.<br />

Robertson, Teresa. 2000. “On Soames’s Solution to the Sorites Paradox.” Analysis<br />

60:328–34.



Romdenh-Romluc, Komarine. 2002. “Now the French are invading England.” Analysis<br />

62:34–41.<br />

Rosch, Eleanor and Mervis, Carolyn. 1975. “Family Resemblances: Studies in the Internal Structure of Categories.” Cognitive Psychology 7:573–605.<br />

Ross, Jacob. forthcoming. “Countable Additivity, Dutch Books and the Sleeping Beauty Problem.” Philosophical Review.<br />

Runde, Jochen. 1990. “Keynesian Uncertainty and the Weight of Arguments.” Economics<br />

and Philosophy 6:275–93.<br />

—. 1994a. “Keynes After Ramsey: In Defence of ‘A Treatise on Probability’.” Studies<br />

in the History and Philosophy of Science 25:97–124.<br />

—. 1994b. “Keynesian Uncertainty and Liquidity Preference.” Cambridge Journal of<br />

Economics 18:129–144.<br />

Russell, Bertrand. 1923. “Vagueness.” Australasian Journal of Psychology and Philosophy 1:84–92.<br />

—. 1948. Human Knowledge: Its Scope and Limits. London: Allen and Unwin.<br />

Russell, Gillian and Doris, John M. 2009. “Knowledge by Indifference.” Australasian<br />

Journal of Philosophy 86:429–437, doi:10.1080/00048400802001996.<br />

Ryan, Sharon. 2003. “Doxastic Compatibilism and the Ethics of Belief.” Philosophical<br />

Studies 114:47–79.<br />

Ryle, Gilbert. 1949. The Concept of Mind. New York: Barnes and Noble.<br />

—. 1954. Dilemmas. Cambridge: Cambridge University Press.<br />

Sainsbury, Mark. 1991. “Is There Higher-Order Vagueness?” Philosophical Quarterly<br />

41:167–82.<br />

—. 1996. “Vagueness, Ignorance and Margin for Error.” British Journal for the Philosophy<br />

of Science 46:589–601.<br />

Salmon, Nathan. 1981. Reference and Essence. Princeton: Princeton University Press.<br />

Sanford, David. 1976. “Competing Semantics of Vagueness: Many Values Versus<br />

Super Truth.” Synthese 33:195–210.<br />

Savage, Leonard. 1954. The Foundations of Statistics. New York: John Wiley.<br />

Sayre-McCord, Geoffrey. 1991. “Being a Realist about Relativism (in Ethics).” Philosophical<br />

Studies 61:155–176.<br />

Schaffer, Jonathan. 2000. “Trumping Preemption.” Journal of Philosophy 97:165–181.<br />



Schelling, Thomas. 1960. The Strategy of Conflict. Cambridge: Harvard University<br />

Press.<br />

Schick, Frederick. 1986. “Dutch Bookies and Money Pumps.” Journal of Philosophy 83:112–119.<br />

Schiffer, Stephen. 1987. Remnants of Meaning. Cambridge, MA: MIT Press.<br />

—. 1998. “Two Issues of Vagueness.” Monist 81:193–214.<br />

—. 2000a. “Replies.” Philosophical Issues 10:320–43.<br />

—. 2000b. “Vagueness and Partial Belief.” Philosophical Issues 10:220–57.<br />

Schlenker, Philippe. 2003. “Indexicality, Logophoricity, and Plural Pronouns.” In<br />

Jacqueline Lecarme (ed.), Afroasiatic Grammar II: Selected <strong>Papers</strong> from the Fifth<br />

Conference on Afroasiatic Languages, Paris, 2000, 409–428. Amsterdam: John Benjamins.<br />

Schlesinger, George. 1990. “Qualitative Identity and Uniformity.” Noûs 24:529–541.<br />

Schmeidler, David. 1989. “Subjective Probability and Expected Utility without Additivity.”<br />

Econometrica 57:571–589.<br />

Schwarz, Wolfgang. 2009. David Lewis: Metaphysik und Analyse. Paderborn: Mentis Verlag.<br />

Sedivy, Julie, Tanenhaus, Michael, Chambers, Craig, and Carlson, Gregory. 1999. “Achieving incremental semantic interpretation through contextual representation.” Cognition 71:109–47.<br />

Seidenfeld, Teddy. 1994. “When Normal and Extensive Form Decisions Differ.” In Dag Prawitz, Brian Skyrms, and Dag Westerståhl (eds.), Logic, Methodology and Philosophy of Science, 451–463. Amsterdam: Elsevier.<br />

Shackle, George. 1949. Expectation in Economics. Cambridge: Cambridge University<br />

Press.<br />

Shafer, Glenn. 1976. A Mathematical Theory of Evidence. Princeton: Princeton University<br />

Press.<br />

—. 1981. “Constructive Probability.” Synthese 48:1–60.<br />

Shah, Nishi. 2002. “Clearing Space for Doxastic Voluntarism.” The Monist 85:436–<br />

445.<br />

Shapiro, Nina. 1997. “Imperfect Competition and Keynes.” In Harcourt and Riach<br />

(1997), 83–92.<br />

Shin, Hyun Song. 1989. “Non-partitional Information on Dynamic State Spaces and the Possibility of Speculation.” Working Paper 90-11, Center for Research on Economic and Social Theory, University of Michigan.<br />



Shoemaker, Sydney. 1984. Cause and Mind. Cambridge: Cambridge University Press.<br />

Shope, Robert. 1983. The Analysis of Knowledge. Princeton: Princeton University<br />

Press.<br />

Sider, Theodore. 1993. “Asymmetry and Self-Sacrifice.” Philosophical Studies 70.<br />

—. 1996. “All the World’s a Stage.” Australasian Journal of Philosophy 74:433–453.<br />

—. 2001a. “Maximality and Intrinsic Properties.” Philosophy and Phenomenological<br />

Research 63:357–364.<br />


—. 2002. “The Ersatz Pluriverse.” Journal of Philosophy 99:279–315.<br />

—. 2003. “Maximality and Microphysical Supervenience.” Philosophy and Phenomenological<br />

Research 66:139–149.<br />

Silins, Nicholas. 2005. “Deception and Evidence.” Philosophical Perspectives 19:375–<br />

404.<br />

Skidelsky, Robert. 1983. John Maynard Keynes. Vol. I: Hopes Betrayed, 1883-1920.<br />

London: Macmillan.<br />

—. 1992. John Maynard Keynes. Vol. II: The Economist as Saviour, 1920-1937. London:<br />

Macmillan.<br />

Skinner, B. F. 1948. Walden Two. New York: Macmillan.<br />

Smart, J. J. C. 1959. “Sensations and Brain Processes.” Philosophical Review 68:141–<br />

156.<br />

Smith, Angela M. 2005. “Responsibility for Attitudes: Activity and Passivity in Mental<br />

Life.” Ethics 115:236–271.<br />

Smith, Michael. 1994. The Moral Problem. Oxford: Blackwell.<br />

—. 1997. “A Theory of Freedom and Responsibility.” In Garrett Cullity and Berys<br />

Gaut (eds.), Ethics and Practical Reason, 293–317. Oxford: Oxford University Press.<br />

—. 2003. “Rational Capacities.” In Stroud and Tappolet (2003), 17–38.<br />

Soames, Scott. 1987. “Direct Reference, Propositional Attitudes and Semantic Content.” Philosophical Topics 15:47–87.<br />

—. 1998a. “More Revisionism about Reference.” In Humphreys and Fetzer (1998),<br />

65–87.<br />

—. 1998b. “Revisionism about Reference: A Reply to Smith.” In Humphreys and<br />

Fetzer (1998), 13–35.



—. 1999. Understanding Truth. New York: Oxford University Press.<br />

—. 2002. Beyond Rigidity. Oxford: Oxford University Press.<br />

—. 2003. Philosophical Analysis in the Twentieth Century. Princeton: Princeton University<br />

Press.<br />

Sorensen, Roy. 2000. “Direct Reference and Vague Identity.” Philosophical Topics<br />

28:175–94.<br />

—. 2001. Vagueness and Contradiction. Oxford: Oxford University Press.<br />

Sosa, Ernest. 1991. Knowledge in Perspective. New York: Cambridge University<br />

Press.<br />

—. 1997. “Reflective Knowledge in the Best Circles.” Journal of Philosophy 94:410–<br />

430.<br />

—. 1998. “Minimal Intuition.” In DePaul and Ramsey (1998), 257–269.<br />

—. 1999. “How to Defeat Opposition to Moore.” Philosophical Perspectives 13:137–49.<br />

Stalnaker, Robert. 1973. “Presuppositions.” Journal of Philosophical Logic 2.<br />

—. 1975. “Indicative Conditionals.” Philosophia 5:269–286.<br />

—. 1984. Inquiry. Cambridge, MA: MIT Press.<br />

—. 1998. “Belief revision in games: forward and backward induction.” Mathematical Social Sciences 36:31–56, doi:10.1016/S0165-4896(98)00007-9.<br />

—. 1999. “Extensive and strategic forms: Games and models for games.” Research in Economics 53:293–319, doi:10.1006/reec.1999.0200.<br />

—. 2008a. Our Knowledge of the Internal World. Oxford: Oxford University Press.<br />


Stalnaker, Robert C. 1968. “A Theory of Conditionals.” In Nicholas Rescher (ed.),<br />

Studies in Logical Theory, 98–112. Oxford: Blackwell.<br />

—. 1976. “Possible Worlds.” Noûs 10:65–75.<br />

—. 1978. “Assertion.” Syntax and Semantics 9:315–332.<br />

—. 1981. “A Defence of Conditional Excluded Middle.” In William Harper,<br />

Robert C. Stalnaker, and Glenn Pearce (eds.), Ifs, 87–104. Dordrecht: Reidel.<br />

—. 1996. “Knowledge, Belief and Counterfactual Reasoning in Games.” Economics<br />

and Philosophy 12:133–163, doi:10.1017/S0266267100004132.



Stanley, Jason. 2000. “Context and Logical Form.” Linguistics and Philosophy 23:391–<br />

434.<br />

—. 2002. “Nominal Restriction.” In Georg Peter and Gerhard Preyer (eds.), Logical<br />

Form and Language, 365–388. Oxford: Oxford University Press.<br />

—. 2005. Knowledge and Practical Interests. Oxford: Oxford University Press.<br />

—. 2007. Language in Context: Selected Essays. Oxford: Oxford University Press.<br />

—. 2008. “Knowledge and Certainty.” Philosophical Issues 18:35–57.<br />

Stanley, Jason and Szabó, Zoltán Gendler. 2000. “On Quantifier Domain Restriction.”<br />

Mind and Language 15:219–61, doi:10.1111/1468-0017.00130.<br />

Steiner, Hillel and Wolff, Jonathan. 2003. “A general framework for resolving disputed<br />

land claims.” Analysis 63:188–189.<br />

Stephenson, Tamina. 2007a. “Judge Dependence, Epistemic Modals, and Predicates<br />

of Personal Taste.” Linguistics and Philosophy 30:487–525.<br />


Steup, Matthias. 2000. “Doxastic Voluntarism and Epistemic Deontology.” Acta<br />

Analytica 15:25–56.<br />

—. 2008. “Doxastic Freedom.” Synthese 161:375–392.<br />

Stich, Stephen. 1978. “Beliefs and Subdoxastic States.” Philosophy of Science 45:499–<br />

518.<br />

—. 1988. “Reflective Equilibrium, Analytic Epistemology and the Problem of Cognitive<br />

Diversity.” Synthese 74:391–413.<br />

—. 1992. “What is a Theory of Mental Representation?” Mind 101:243–63.<br />

Stock, Kathleen. 2003. “The Tower of Goldbach and Other Impossible Tales.” In<br />

Kieran and Lopes (2003), 107–124.<br />

Strat, Thomas. 1990. “Decision analysis using Belief Functions.” International Journal of Approximate Reasoning 4:391–417.<br />

Stroud, Sarah and Tappolet, Christine (eds.). 2003. Weakness of Will and Varities of<br />

Practical Irrationality. Oxford: Oxford University Press.<br />

Szabó, Zoltán Gendler. 2000. “Descriptions and Uniqueness.” Philosophical Studies 101:29–57.<br />

Tappenden, Jamie. 1993. “The Liar and Sorites Paradoxes: Toward a Unified Treatment.” Journal of Philosophy 90:551–77.<br />



Taylor, Barry. 1993. “On Natural Properties in Metaphysics.” Mind 102:81–100.<br />

Teller, Paul. 1972. “Epistemic Possibility.” Philosophia 2:303–320.<br />

—. 1973. “Conditionalization and Observation.” Synthese 26:218–258.<br />

Tennant, Neil. 1992. Autologic. Edinburgh: Edinburgh University Press.<br />

Thau, Michael. 1994. “Undermining and Admissibility.” Mind 103:491–504.<br />

Tintner, Gerhard. 1941. “The Theory of Choice Under Subjective Risk and Uncertainty.”<br />

Econometrica 9:298–304.<br />

Titelbaum, Michael. 2008. “The Relevance of Self-Locating Beliefs.” Philosophical Review 117:555–605.<br />

Turri, John. 2010. “On the Relationship between Propositional and Doxastic Justification.” Philosophy and Phenomenological Research 80:312–326, doi:10.1111/j.1933-1592.2010.00331.x.<br />

Tye, Michael. 1990. “Vague Objects.” Mind 99:535–57.<br />

—. 1994. “Sorites Paradoxes and the Semantics of Vagueness.” Philosophical Perspectives<br />

8:189–206.<br />

Unger, Peter. 1996. Living High and Letting Die. Oxford: Oxford University Press.<br />

Vallentyne, Peter. 1997. “Intrinsic Properties Defined.” Philosophical Studies 88:209–<br />

219.<br />

van Fraassen, Bas. 1966. “Singular Terms, Truth-Value Gaps and Free Logic.” Journal of Philosophy 63:481–95.<br />

—. 1995. “Belief and the Problem of Ulysses and the Sirens.” Philosophical Studies<br />

77:7–37.<br />

von Fintel, Kai and Iatridou, Sabine. 2003. “Epistemic Containment.” Linguistic<br />

Inquiry 34:173–98.<br />

Walley, Peter. 1991. Statistical Reasoning with Imprecise Probabilities. London: Chapman & Hall.<br />

—. 1996. “Inferences from Multinomial Data: Learning about a bag of marbles (with discussion).” Journal of the Royal Statistical Society Series B 58:3–57.<br />

Walton, Kendall. 1990. Mimesis as Make Believe. Cambridge, MA: Harvard University<br />

Press.<br />

—. 1994. “Morals in Fiction and Fictional Morality.” Proceedings of the Aristotelian Society, Supplementary Volume 68:27–50.<br />

Wang, Hao. 1987. Reflections on Kurt Gödel. Cambridge, MA: MIT Press.



Warfield, Ted and Feldman, Richard (eds.). 2010. Disagreement. Oxford: Oxford University Press.<br />

Warnock, G. J. 1989. J. L. Austin. London: Routledge.<br />

Watson, Gary. 1977. “Skepticism about Weakness of Will.” Philosophical Review<br />

86:316–339, doi:10.2307/2183785.<br />

Weatherson, Brian. 2001a. “Indicative and Subjunctive Conditionals.” Philosophical Quarterly 51:200–216.<br />

—. 2001b. “Intrinsic Properties and Combinatorial Principles.” Philosophy and Phenomenological<br />

Research 63:365–380.<br />

—. 2002. “Misleading Indexicals.” Analysis 62:308–310.<br />

—. 2003a. “Epistemicism, Parasites, and Vague Names.” Australasian Journal of Philosophy 81:276–279.<br />

—. 2003b. “Review of Rosanna Keefe, Theories of Vagueness.” Philosophy and Phenomenological<br />

Research 67:491–494.<br />

—. 2003c. “What Good Are Counterexamples?” Philosophical Studies 115:1–31,<br />

doi:10.1023/A:1024961917413.<br />

—. 2004a. “From Classical to Intuitionistic Probability.” Notre Dame Journal of Formal Logic 44:111–123.<br />

—. 2004b. “Luminous Margins.” Australasian Journal of Philosophy 82:373–383.<br />

—. 2005a. “Can We Do Without Pragmatic Encroachment?” Philosophical Perspectives<br />

19:417–443, doi:10.1111/j.1520-8583.2005.00068.x.<br />

—. 2005b. “Scepticism, Rationalism and Externalism.” Oxford Studies in Epistemology<br />

1:311–331.<br />

—. 2005c. “True, Truer, Truest.” Philosophical Studies 123:47–70, doi:10.1007/s11098-004-5218-x.<br />

—. 2006a. “Intrinsic Vs. Extrinsic Properties.” In Edward N. Zalta (ed.), The Stanford<br />

Encyclopedia of Philosophy (Fall 2006 Edition).<br />

—. 2006b. “Questioning Contextualism.” In Stephen Cade Hetherington (ed.), Epistemology<br />

Futures, 133–147. Oxford: Oxford University Press.<br />

—. 2007. “The Bayesian and the Dogmatist.” Proceedings of the Aristotelian Society<br />

107:169–185, doi:10.1111/j.1467-9264.2007.00217.x.<br />

—. 2009. “Conditionals and Indexical Relativism.” Synthese 166:333–357.<br />

—. forthcoming. “Stalnaker on Sleeping Beauty.” Philosophical Studies.<br />



Wedgwood, Ralph. 2007. The Nature of Normativity. Oxford: Oxford University<br />

Press.<br />

Weinberg, Jonathan, Stich, Stephen, and Nichols, Shaun. 2001. “Normativity and<br />

Epistemic Intuitions.” Philosophical Topics 29:429–460.<br />

Weiner, Matthew. 2005. “Must We Know What We Say?” Philosophical Review 114:227–251.<br />

Weirich, Paul. 2008. “Causal Decision Theory.” In Edward N. Zalta (ed.), The Stanford Encyclopedia of Philosophy (Winter 2008 Edition), http://plato.stanford.edu/archives/win2008/entries/decision-causal/.<br />

Wettstein, Howard. 2004. The Magic Prism. Oxford: Oxford University Press.<br />

White, Roger. 2006. “Problems for Dogmatism.” Philosophical Studies 131:525–557.<br />

Williams, Bernard. 1976. “Deciding to Believe.” In Problems of the Self, 136–151.<br />

Cambridge: Cambridge University Press.<br />

Williams, J. R. G. 2010. “Gradational accuracy and non-classical semantics.”<br />

Manuscript.<br />

Williams, J. Robert G. 2007. “Eligibility and Inscrutability.” Philosophical Review<br />

116:361–399.<br />

Williamson, Timothy. 1994. Vagueness. London: Routledge.<br />

—. 1995. “Definiteness and Knowability.” Southern Journal of Philosophy 33<br />

(Supp):171–191.<br />

—. 1996. “Knowing and Asserting.” Philosophical Review 105:489–523.<br />

—. 1998. “Conditionalizing on Knowledge.” British Journal for the Philosophy of<br />

Science 49:89–121, doi:10.1093/bjps/49.1.89.<br />

—. 2000a. Knowledge and its Limits. Oxford: Oxford University Press.<br />

—. 2000b. “Scepticism and Evidence.” Philosophy and Phenomenological Research<br />

60:613–628.<br />

—. 2004. “Reply to McGee and McLaughlin.” Linguistics and Philosophy 27.<br />

—. 2007. The Philosophy of Philosophy. Oxford: Blackwell.<br />

Witmer, D. Gene, Butchard, William, and Trogdon, Kelly. 2005. “Intrinsicality without Naturalness.” Philosophy and Phenomenological Research 70:326–350.<br />

Wittgenstein, Ludwig. 1953. Philosophical Investigations. London: Macmillan.<br />

—. 1956. Remarks on the Foundations of Mathematics. New York: Macmillan.



Wolfe, Tom. 2000. Hooking Up. New York: Farrar, Straus and Giroux.<br />

Wright, Crispin. 1975. “On the Coherence of Vague Predicates.” Synthese 30:325–65.<br />

—. 2004. “Warrant for Nothing (And Foundations for Free)?” Proceedings of the<br />

Aristotelian Society, Supplementary Volume 78:167–212.<br />

Yablo, Stephen. 1993. “Intrinsicness.” Philosophical Topics 26:479–505.<br />


—. 2002. “Coulda, Woulda, Shoulda.” In Gendler and Hawthorne (2002), 441–492.<br />

Yager, R., Fedrizzi, M., and Kacprzyk, J. (eds.). 1994. Advances in the Dempster-Shafer Theory of Evidence. New York: John Wiley.<br />

Yaglom, I. M. 1962. Geometric Transformations I. New York: Random House.<br />

Zadeh, Lotfi A. 1965. “Fuzzy Sets.” Information and Control 8:338–53.<br />

—. 1978. “Fuzzy Sets as a Basis for a Theory of Possibility.” Fuzzy Sets and Systems 1:3–28.<br />

Zimmerman, Dean. 1996. “Could Extended Objects Be Made Out of Simple Parts? An Argument for Atomless Gunk.” Philosophy and Phenomenological Research 56:1–29.<br />

—. 1997. “Immanent Causation.” Philosophical Perspectives 11:433–471.
