Negative evidence and the raw frequency fallacy* - CiteSeerX


whether zero deviates significantly from this expected frequency (the sufficient condition for upholding the hypothesis). This information will likely be more difficult to obtain or estimate than information about complementation patterns, but to do so is by no means impossible. Any hypothesis about possible and impossible structures in language is ultimately a hypothesis about the incompatibility of two (or more) linguistic categories. As long as these categories can be operationalized in such a way that they can be exhaustively annotated (or identified spontaneously) in a corpus of naturally occurring language, and as long as the corpus is large enough, this corpus can provide both positive and negative evidence. The first condition should always be met: if a category cannot be operationalized for objective identification, it has no place in a linguistic theory. The second condition is not currently met. There are several syntactically annotated corpora (for example, the Penn Treebank, Sampson’s SUSANNE and CHRISTINE corpora, and the ICE-GB used in this note), but they are either too small for many research questions, or their annotation scheme is too coarse or too unreliable, or both. However, this cannot seriously be used as a defense of the introspective method. Instead, it must be used as an argument for the funding and the human resources necessary for the construction of large grammatically annotated corpora. A discipline can only get so far by thought experiments (if that is what acceptability judgments are). It begins to make substantial headway only when it faces up to the problem of data scarcity and solves it. Astronomers have built radio telescopes, physicists have built particle colliders, and geneticists have sequenced the human genome; linguists should be able to construct large, balanced, syntactically annotated corpora of at least the world’s major languages.
But even before this goal is reached (or, more likely, if it is never reached), corpora can yield both positive and negative evidence for the construction of linguistic theories.

Final remarks: the occurring and the non-occurring

The main point of this note was to show that corpora contain negative evidence and that this negative corpus evidence can, and should, replace introspective acceptability judgments. It seems appropriate, however, to discuss the most important theoretical implications of such a step.

First, from the perspective advocated here, the non-occurrence of a particular linguistic structure is merely the limiting case; it is not qualitatively different from very rare occurrences. This may seem to be a problem for an approach that argues for an absolute distinction between possible and impossible configurations of linguistic categories (for example, between grammatical and ungrammatical structures). This problem may be more apparent than real, however. The continuum between significantly rare and significantly absent structures is not fundamentally different from the continuum between various degrees of unacceptability that is regularly found for acceptability ratings. In both cases, the data must be viewed in light of one’s theory of language in order to make sense of this continuum. Also, it may well be possible to identify a degree of improbability that is close enough to impossibility to be indistinguishable from it.

Second, while the statistically significant absence (or rareness) of a particular configuration of grammatical categories can be taken as evidence that this configuration is impossible (i.e., very improbable), it does not, in itself, provide any clues as to why this should be the case. Again, the same is true of introspective judgments. Chomsky pointed this out early on: “The notion ‘acceptable’ is not to be confused with ‘grammatical’. Acceptability belongs to the study of performance whereas grammaticalness belongs to the study of competence” (Chomsky 1965: 11). A linguistic structure may give rise to introspective judgments of unacceptability for a number of reasons, of which ungrammaticality (or, more generally, failure to conform to general linguistic rules) is just one. What that reason is must be determined independently of the acceptability judgment.

The same is true of significantly absent (or rare) structures: determining significant absence/rareness is just the first step of a linguistic analysis. The second step is to determine the reasons for the significant absence/rareness. This step can be much closer to traditional linguistic argumentation. First, it may involve the search for authentic counterexamples (as in the case of “whisper” above) in order to test the extent of this absence. This may uncover variation in the data (panchronic, regional, social, etc.) or particular contexts in which seemingly impossible structures become possible. Second, it may involve constructing examples in order to determine whether the significant absence is semantically determined. If the constructed examples are not interpretable, the absence may simply be due to semantic incompatibility. For example, no interpretation can be assigned to “He knew her the answer” or “She saw him the light”. If the constructed examples are interpretable, their absence cannot be due to semantic incompatibility but may instead have purely formal reasons. For example, “He said her the answer” or “She put him the book” are straightforwardly interpretable (of course, there may be more fine-grained semantic restrictions, as the huge literature on ditransitives shows).

In other words, while I argue against the use of acceptability judgments as a linguistic method, I do not argue against the use of interpretation. There is good reason for this distinction, which I am not the first to point out: interpreting utterances is a natural human activity, judging their acceptability is not.
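
As an illustration of the first step described above (deciding whether an observed frequency of zero deviates significantly from the expected frequency), the following is a minimal sketch rather than the procedure used in this note. It assumes that the two categories are modeled as independent under the null hypothesis and that a one-tailed exact (hypergeometric) test is appropriate; the function names and all counts are hypothetical and chosen purely for illustration.

```python
from math import comb


def expected_frequency(verb_total, construction_total, corpus_total):
    """Expected number of co-occurrences if the verb and the
    construction were distributed independently of each other."""
    return verb_total * construction_total / corpus_total


def p_zero_cooccurrence(verb_total, construction_total, corpus_total):
    """Probability of observing zero co-occurrences under independence.

    Assumes a hypergeometric model: of corpus_total relevant tokens,
    verb_total contain the verb and construction_total instantiate the
    construction. For an observed count of 0, this probability is also
    the one-tailed exact p-value, since no smaller count is possible.
    """
    return (comb(corpus_total - verb_total, construction_total)
            / comb(corpus_total, construction_total))


if __name__ == "__main__":
    # Hypothetical counts, not taken from any real corpus.
    verb, construction, corpus = 500, 1000, 100_000
    print("expected frequency:", expected_frequency(verb, construction, corpus))
    print("p(observed = 0):  ", p_zero_cooccurrence(verb, construction, corpus))
```

In this sketch, an expected frequency well above zero combined with a small probability of observing zero would count as significant absence; the result says nothing about why the structure is absent, which is the task of the second step.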