27.10.2014 Views

Russel-Research-Method-in-Anthropology


Univariate Analysis

Both groups have an average age of 35, but one of them obviously has a lot more variation than the other. Consider MFRATIO in table 19.8. The mean ratio of males to females in the world was 100.2 in 2000. One measure of variation in this mean across the 50 countries in our sample is the range. Inspecting the column for MFRATIO in table 19.8, we see that Latvia had the lowest ratio, with 83 males for every 100 females, and the United Arab Emirates (UAE) had the highest ratio, with 172. The range, then, is 172 − 83 = 89. The range is a useful statistic, but it is affected strongly by extreme scores. Without the UAE, the range is 27, a drop of nearly 70%.
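
The arithmetic for the range can be sketched in a few lines of Python. This is only an illustration: the figures 83, 172, and 27 come from the text, the value 110 is inferred from them (83 + 27, assuming Latvia remains the lowest case), and the other values in the list are hypothetical fill-ins, not data from table 19.8.

```python
def score_range(scores):
    """Range: the highest score minus the lowest score."""
    return max(scores) - min(scores)

# Males per 100 females. Only 83 (Latvia) and 172 (UAE) come from the text.
# 110 is inferred (83 + 27); the remaining values are hypothetical fill-ins.
mfratio = [83, 96, 98, 100, 101, 110, 172]

full_range = score_range(mfratio)  # 172 - 83
range_without_uae = score_range([r for r in mfratio if r != 172])  # 110 - 83

# Proportional drop in the range when the single extreme case is removed
drop = (full_range - range_without_uae) / full_range

print(full_range, range_without_uae, f"{drop:.1%}")  # 89 27 69.7%
```

Removing one case out of the whole sample cuts the range by more than two-thirds, which is what it means for the range to be strongly affected by extreme scores.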

The interquartile range avoids extreme scores, either high or low. The 75th percentile for MFRATIO is 101 and the 25th percentile is 96, so the interquartile range is 101 − 96 = 5. This tightens the range of scores, but sometimes it is the extreme scores that are of interest. The interquartile range of freshmen SAT scores at major universities tells you about the middle 50% of the incoming class. It doesn’t tell you if the university is recruiting athletes whose SAT scores are in the bottom 25%, the middle 50%, or the top 25% of scores at those universities (see Klein 1999).
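
As a rough sketch of how the interquartile range behaves, the Python snippet below uses the standard library's `statistics.quantiles`. The list is hypothetical: it was chosen so that the quartiles match the 96 and 101 reported in the text, and it is not the data in table 19.8. Quartile values also depend on the interpolation method; the `inclusive` method is an assumption here, and other methods can give slightly different cut points.

```python
import statistics

def interquartile_range(scores):
    """75th percentile minus 25th percentile (inclusive quantile method)."""
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q3 - q1

# Hypothetical sex ratios, built so that the quartiles are 96 and 101.
with_extreme = [83, 95, 96, 97, 99, 100, 101, 103, 172]

# Same list, with the extreme high score replaced by a more moderate one.
with_moderate = [83, 95, 96, 97, 99, 100, 101, 103, 110]

print(interquartile_range(with_extreme))   # 5.0
print(interquartile_range(with_moderate))  # 5.0
```

The highest score changes from 172 to 110, yet the interquartile range stays at 5, because only the middle 50% of the scores enter the calculation.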

Measures of Dispersion II: Variance and the Standard Deviation

The best-known and most useful measure of dispersion for a sample of interval data is the standard deviation, usually written just s or sd. The sd is a measure of how much, on average, the scores in a distribution deviate from the mean score. It gives you a feel for how homogeneous or heterogeneous a population is. (We use s or sd for the standard deviation of a sample; we use the lower-case Greek sigma, σ, for the standard deviation of a population.)

The sd is calculated from the variance, written s², which is the average squared deviation from the mean of the measures in a set of data. To find the variance in a distribution: (1) Subtract the mean of the set of observations from each observation; (2) Square each difference, thus getting rid of negative numbers; (3) Sum the squared differences; and (4) Divide that sum by the sample size minus 1 (n − 1).

Here is the formula for calculating the variance:

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \qquad \text{Formula 19.4}$$

where s² is the variance, x represents the raw scores in a distribution of interval-level observations, x̄ is the mean of the distribution of raw scores, and n is the total number of observations.
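
The calculation in formula 19.4 can be sketched in Python as follows. The two age lists are hypothetical, made up so that both groups have a mean age of 35 but very different spreads; the function names are likewise just illustrative. The standard deviation is taken as the square root of the variance.

```python
import math
import statistics

def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    squared_deviations = [(x - mean) ** 2 for x in scores]
    return sum(squared_deviations) / (n - 1)

def sample_sd(scores):
    """Standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(scores))

# Hypothetical ages: both groups average 35, but group_b varies far more.
group_a = [33, 34, 35, 36, 37]
group_b = [15, 25, 35, 45, 55]

print(sample_variance(group_a), round(sample_sd(group_a), 2))  # 2.5 1.58
print(sample_variance(group_b), round(sample_sd(group_b), 2))  # 250.0 15.81

# Cross-check against the standard library, which also divides by n - 1.
assert sample_variance(group_a) == statistics.variance(group_a)
```

The two groups share a mean, so the mean alone cannot tell them apart; the sd of about 1.6 years versus about 15.8 years is what shows how much more heterogeneous the second group is.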
