
Negative evidence and the raw frequency fallacy* - CiteSeerX

68 A. Stefanowitsch

Table 5. (continued)

Collexeme     F(Corpus)   F_O (Ditr)   F_E (Ditr)   FYE p-value
provide       380         0            5.08         5.99E-03
live          378         0            5.05         6.16E-03
remember      373         0            4.98         6.59E-03
produce       328         0            4.38         1.21E-02
speak         323         0            4.31         1.29E-02
hope          316         0            4.22         1.42E-02
run           309         0            4.13         1.56E-02
change        306         0            4.09         1.63E-02
meet          303         0            4.05         1.69E-02
help          301         0            4.02         1.74E-02
start         294         0            3.93         1.91E-02
move          291         0            3.89         1.99E-02
seem          285         0            3.81         2.16E-02
agree         279         0            3.73         2.34E-02
lead          271         0            3.62         2.60E-02
expect        265         0            3.54         2.82E-02
consider      264         0            3.53         2.86E-02
suggest       259         0            3.46         3.06E-02
describe      259         0            3.46         3.06E-02
decide        259         0            3.46         3.06E-02
understand    250         0            3.34         3.46E-02
hold          249         0            3.33         3.50E-02
require       244         0            3.26         3.75E-02
involve       242         0            3.23         3.85E-02
suppose       241         0            3.22         3.90E-02
include       236         0            3.15         4.17E-02
occur         233         0            3.11         4.35E-02
develop       233         0            3.11         4.35E-02
go on         231         0            3.09         4.46E-02
follow        227         0            3.03         4.71E-02
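The expected-frequency and p-value columns of Table 5 can be approximated with a short calculation. The sketch below is illustrative only. It assumes that the p-value is a one-tailed Fisher-Yates exact probability, which for an observed frequency of zero reduces to the hypergeometric probability of drawing no ditransitive token at all. The corpus totals (1,824 ditransitive tokens among 136,551 verb tokens) are not stated in this excerpt; they are assumptions, chosen because they reproduce the expected frequencies listed in the table. The function names are hypothetical.

```python
# ASSUMED totals (not given in this excerpt); chosen because they
# reproduce the expected frequencies listed in Table 5.
DITRANSITIVE_TOKENS = 1824
VERB_TOKENS = 136551


def expected_frequency(verb_freq, constr_freq, corpus_size):
    """Expected co-occurrence frequency if verb and construction are independent."""
    return verb_freq * constr_freq / corpus_size


def p_zero(verb_freq, constr_freq, corpus_size):
    """Hypergeometric probability that none of the verb's tokens is ditransitive.

    With an observed frequency of zero, the one-tailed exact test
    reduces to this single probability.
    """
    p = 1.0
    for i in range(verb_freq):
        # Probability that the (i + 1)-th token of the verb is also
        # non-ditransitive, given that the previous i tokens were.
        p *= (corpus_size - constr_freq - i) / (corpus_size - i)
    return p


for verb, freq in [("provide", 380), ("follow", 227)]:
    f_e = expected_frequency(freq, DITRANSITIVE_TOKENS, VERB_TOKENS)
    p = p_zero(freq, DITRANSITIVE_TOKENS, VERB_TOKENS)
    print(f"{verb}: expected = {f_e:.2f}, p = {p:.2E}")
```

Under these assumptions, the values for provide and follow come out close to the first and last rows of the table.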

Two things about this table require discussion. First, it demonstrates that even a one-million-word corpus is too small to allow us to identify significant absences for more than a handful of cases (at least for a relatively rare pattern such as ditransitive complementation). I will discuss this problem in the remainder of this section and in the next section. Second, the results only tell us that a particular structure is significantly absent; they do not, as pointed out in the introduction, tell us why it is significantly absent. I will return to this problem in the final section.
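The corpus-size limitation can be made concrete by asking how frequent a verb must be before a complete absence from the ditransitive reaches significance. The following sketch is a rough illustration, not the author's procedure. It assumes the same one-tailed exact test with an observed frequency of zero and the same assumed corpus totals (1,824 ditransitive tokens among 136,551 verb tokens, neither of which is stated in this excerpt). The function name is hypothetical.

```python
def min_frequency_for_significant_absence(constr_freq, corpus_size, alpha=0.05):
    """Smallest verb frequency at which zero co-occurrences give p < alpha.

    Returns None if no frequency up to the corpus size reaches significance.
    """
    p = 1.0
    for freq in range(1, corpus_size + 1):
        drawn = freq - 1  # verb tokens already accounted for
        # Extend the hypergeometric P(X = 0) by one more verb token.
        p *= (corpus_size - constr_freq - drawn) / (corpus_size - drawn)
        if p < alpha:
            return freq
    return None


# ASSUMED totals (not given in this excerpt).
threshold = min_frequency_for_significant_absence(1824, 136551)
print(threshold)
```

Under these assumed totals the threshold is 223 tokens, which is consistent with Table 5 ending at follow (227 tokens). Any verb below that frequency cannot be shown to be significantly absent at the 5% level, however strongly it resists the ditransitive.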

The problem of insufficient corpus size can ultimately only be solved by the creation of larger grammatically annotated corpora. However, in many individual cases it is possible to arrive at a fairly safe conclusion using currently available non-annotated corpora. Take the case of ex-
