25.10.2012 Views

Laurie Bauer - WordPress.com — Get a Free Blog Here


STATISTICS

numbers do not add up to 100 per cent because there are other possibilities). Thus 17 per cent is our best guess at this proportion among the whole population of written restrictive relative clauses. However, since we have only taken a sample of the population our result must be expressed tentatively, acknowledging the uncertainty introduced by examining only a subset of the population rather than taking a census (in this particular case, a census would not even be possible). So rather than reporting a simple value, statisticians often prefer to quote a confidence interval (i.e. a range of values within which we are reasonably confident the true population value lies).
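
To make the idea of a confidence interval concrete, here is a minimal sketch in Python. It uses the Wilson score interval, which is one common way of putting a range around a sample proportion; other methods exist and may well be preferred in a given study. The sample size is not given in the passage above, so the figures used here (34 hits in 200 clauses, i.e. 17 per cent) are purely hypothetical, and the function name `wilson_interval` is our own.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Return (lower, upper) bounds of a Wilson score interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")

    # Two-sided critical value of the standard normal distribution
    # (about 1.96 for 95 per cent confidence).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)

    p_hat = successes / n  # the sample proportion, our best single guess
    denominator = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denominator
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denominator
    return centre - half_width, centre + half_width


# HYPOTHETICAL figures: 34 of 200 sampled clauses show the feature (17 per cent).
lower, upper = wilson_interval(34, 200)
print(f"sample proportion: {34 / 200:.2f}")
print(f"95% confidence interval: {lower:.3f} to {upper:.3f}")
```

With the same proportion but a larger sample, the interval narrows, which is exactly the sense in which a bigger sample reduces our uncertainty about the population value.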

Statistical inference<br />

When you have collected your data, and made some estimates of various quantities of interest, you may simply wish to report those estimates, with their uncertainties. Most commonly, however, you want to go a step further and answer a specific research question: e.g. do the speakers in two different suburbs really behave in a different way? In particular do the mean values of a particular vowel differ? You will have collected samples from both suburbs and calculated means in the two samples. Those sample means will undoubtedly differ, at least slightly. But does that mean that the population means are different? You will probably have individuals in each sample whose behaviour is atypical of the population they came from, for reasons such as those suggested above, but you want to know whether the differences between the two samples are due entirely to chance (because of the individuals you recorded when trying to find out the answer to your question) or whether the two populations they came from are really behaving differently. Statisticians prefer us to ask this question in a very particular way. Let us assume, they say, that there is no difference between our two populations (i.e. the people from the different suburbs, the very tall and not-so-tall people, are in principle behaving identically). This is the null hypothesis. It is, in a sense, the hypothesis of the person who is sceptical about your expectations and believes that the two groups do not differ in terms of the feature we are trying to measure. Statistical tests are then framed so as to answer the question: what evidence do we have that the null hypothesis is wrong? In a statistical test we start from the null hypothesis, and look to see if we have evidence to persuade us out of that position.
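
One way to see this logic at work is a permutation test, sketched below in Python. This is only one of several tests that could be used for comparing two means, and it is not necessarily the one a statistician would recommend for your data. The reasoning is that if the null hypothesis is true, the suburb labels carry no information, so we may shuffle them at random and see how often chance alone produces a difference in means at least as large as the one we observed. The measurements below are invented for illustration, and the function name `permutation_test` is our own.

```python
import random
from statistics import fmean


def permutation_test(sample_a, sample_b, n_permutations=10_000, seed=0):
    """Return (observed difference in means, share of shuffles at least as extreme)."""
    rng = random.Random(seed)  # fixed seed so that the result is reproducible
    observed = fmean(sample_a) - fmean(sample_b)

    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    at_least_as_extreme = 0

    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so any random split of the pooled data is as plausible as the real one.
        rng.shuffle(pooled)
        difference = fmean(pooled[:n_a]) - fmean(pooled[n_a:])
        if abs(difference) >= abs(observed):
            at_least_as_extreme += 1

    # Count the observed arrangement itself, so the share is never exactly zero.
    share = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, share


# HYPOTHETICAL vowel measurements (in Hz) for speakers from two suburbs.
suburb_a = [300, 310, 305, 295, 308, 302]
suburb_b = [350, 345, 360, 355, 348, 352]

observed, share = permutation_test(suburb_a, suburb_b)
print(f"observed difference in means: {observed:.1f} Hz")
print(f"share of shuffles at least as extreme: {share:.4f}")
```

A small share means that a difference of this size would rarely arise if the two populations really behaved identically, which is evidence against the null hypothesis. A large share means that the sceptic's position remains perfectly tenable.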

p-values<br />

At this stage we have stopped asking about the nature of the sample, and we are trying to use the sample to draw inferences about the population as a whole. The null hypothesis is that the two samples we are dealing with really belong
