Negative evidence and the raw frequency fallacy* - CiteSeerX
62 A. Stefanowitsch
In this note, I would like to take issue with (a large part of) their
argument. I will argue that the idea that corpora do not contain negative
evidence is simply a special case of what I have termed the observed-frequency
(or raw-frequency) fallacy, i.e., the belief that “[o]bserved frequencies
of occurrence represent relevant facts for scientific analysis”
(Stefanowitsch 2005: 296). When approached with the right methodological
tools, corpora do provide negative evidence, i.e., evidence that
allows us, in principle, to distinguish between constructions that did not
occur but could have (these could be referred to as ‘accidentally absent’
structures), and constructions that did not occur and could not have (these can be
referred to as ‘significantly absent’ structures). Thus, while I do agree
that linguists cannot (and should not) ‘eschew introspection entirely’, I
will argue that they can (and largely should) eschew introspective judgments
of acceptability.
Collostructional analysis and the significance of absence
In this section, I will address the general issue of how significant absences
of a particular configuration of linguistic elements can be distinguished
from accidental ones, using as an example the ‘ability’ or ‘inability’ of
English verbs to occur with ditransitive complementation. The choice of
this example is motivated primarily by practical considerations: as will
presently become clear, the method I will use requires the researcher to
extract exhaustively from a corpus all occurrences of the grammatical
phenomenon in question. Ditransitive complementation happens to be
one of the features that is relatively uncontroversially tagged in the
largest grammatically annotated balanced corpus currently available, the
British component of the International Corpus of English (ICE-GB, cf.
Nelson et al. 2002). However, it is a welcome coincidence that this is
precisely the complementation pattern that McEnery and Wilson chose
to demonstrate the need for grammaticality judgments.¹
The relevant method is one of several that Gries and I have developed
in a series of publications specifically for the purpose of investigating
the relationship between grammatical constructions and the words occurring
in them, and that we refer to collectively as collostructional
analysis (cf. e.g., Stefanowitsch and Gries 2003, 2005, to appear a; Gries
and Stefanowitsch 2004a, b, to appear).² The most basic of these methods,
simple collexeme analysis, allows the researcher to identify words
that occur significantly more or less frequently than expected in a given
slot of a construction. This is done on the basis of a standard 2-by-2
contingency table containing four observed frequencies: (a) the frequency
of a given word in a particular slot of a given construction, (b)
the frequency of the same word in the corresponding slots of all other