Negative evidence and the raw frequency fallacy* - CiteSeerX
68 A. Stefanowitsch
Table 5. (continued)

Collexeme     F(Corpus)   F_O(Ditr)   F_E(Ditr)   FYE p-value
provide       380         0           5.08        5.99E-03
live          378         0           5.05        6.16E-03
remember      373         0           4.98        6.59E-03
produce       328         0           4.38        1.21E-02
speak         323         0           4.31        1.29E-02
hope          316         0           4.22        1.42E-02
run           309         0           4.13        1.56E-02
change        306         0           4.09        1.63E-02
meet          303         0           4.05        1.69E-02
help          301         0           4.02        1.74E-02
start         294         0           3.93        1.91E-02
move          291         0           3.89        1.99E-02
seem          285         0           3.81        2.16E-02
agree         279         0           3.73        2.34E-02
lead          271         0           3.62        2.60E-02
expect        265         0           3.54        2.82E-02
consider      264         0           3.53        2.86E-02
suggest       259         0           3.46        3.06E-02
describe      259         0           3.46        3.06E-02
decide        259         0           3.46        3.06E-02
understand    250         0           3.34        3.46E-02
hold          249         0           3.33        3.50E-02
require       244         0           3.26        3.75E-02
involve       242         0           3.23        3.85E-02
suppose       241         0           3.22        3.90E-02
include       236         0           3.15        4.17E-02
occur         233         0           3.11        4.35E-02
develop       233         0           3.11        4.35E-02
go on         231         0           3.09        4.46E-02
follow        227         0           3.03        4.71E-02

Two things about this table require discussion. First, it demonstrates
that even a one-million-word corpus is too small to allow us to identify
significant absences for more than a handful of cases (at least for a
relatively rare pattern such as ditransitive complementation). I will discuss
this problem in the remainder of this section and in the next section.
Second, the results only tell us that a particular structure is significantly
absent; they do not, as pointed out in the introduction, tell us why it is
significantly absent. I will return to this problem in the final section.
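
The link between corpus frequency, expected frequency, and the exact p-value for a verb that never occurs in the construction can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a reproduction of the procedure actually used: the totals `DITRANSITIVES = 1824` and `CORPUS_VERBS = 136551` are not given in this section and are assumed here only because they reproduce the F_E column of Table 5; the function names are hypothetical; and the p-value is taken to be the one-tailed hypergeometric probability of observing zero co-occurrences.

```python
from math import prod

# Assumed totals (not stated in this section): number of ditransitive
# tokens and number of verb tokens in the corpus. They are chosen because
# they reproduce the expected frequencies listed in Table 5.
DITRANSITIVES = 1824
CORPUS_VERBS = 136551


def expected_frequency(verb_freq, construction_freq=DITRANSITIVES,
                       corpus_verbs=CORPUS_VERBS):
    """Expected co-occurrence frequency under independence."""
    return verb_freq * construction_freq / corpus_verbs


def p_zero_occurrences(verb_freq, construction_freq=DITRANSITIVES,
                       corpus_verbs=CORPUS_VERBS):
    """One-tailed exact probability of zero co-occurrences.

    Hypergeometric P(X = 0): every one of the verb's tokens has to fall
    among the tokens that are not instances of the construction.
    """
    other = corpus_verbs - construction_freq
    return prod((other - i) / (corpus_verbs - i) for i in range(verb_freq))


def min_frequency_for_significance(alpha=0.05,
                                   construction_freq=DITRANSITIVES,
                                   corpus_verbs=CORPUS_VERBS):
    """Smallest corpus frequency at which zero occurrences gives p < alpha."""
    other = corpus_verbs - construction_freq
    freq, p = 0, 1.0
    while p >= alpha:
        p *= (other - freq) / (corpus_verbs - freq)
        freq += 1
    return freq


if __name__ == "__main__":
    for verb, freq in [("provide", 380), ("follow", 227)]:
        print(verb, round(expected_frequency(freq), 2),
              f"{p_zero_occurrences(freq):.2E}")
    print("minimum frequency:", min_frequency_for_significance())
```

Under these assumptions, a verb must occur well over two hundred times in the corpus before its complete absence from the ditransitive reaches the 5% level, which is why only a handful of verbs can appear in such a table at all.
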
The problem of insufficient corpus size can ultimately only be solved
by the creation of larger grammatically annotated corpora. However, in
many individual cases it is possible to arrive at a fairly safe conclusion
using currently available non-annotated corpora. Take the case of ex-