27.10.2014 Views

Russel-Research-Method-in-Anthropology


690 Chapter 21

I’m going to explain cluster analysis in some detail. It’s a very important descriptive tool, but as we go through the next few paragraphs, keep in mind two things: First, clustering is just a technique for finding the similarity chunks. It doesn’t label those chunks. That part is also a Rorschach test: You stare at the output and decide what the meaning is. Second, as with so many methods, different treatments of your data produce different outcomes. The next few pages will make it very clear just how much you and only you are responsible for every choice you make in data analysis.

How Cluster Analysis Works

Consider the following example from de Ghett (1978:121):

1 3 7 9 14 20 21 25<br />

This set of numbers has no meaning at all, so we can concentrate on the method of clustering them, without any interpretation getting in the way. When we get through with this example, we’ll move on to a set of numbers that does have meaning. The distance between 1 and 3 is 2. The distance between 21 and 25 is 4. So, in a numerical sense, 1 and 3 are twice as similar to one another as 21 and 25 are to one another. Table 21.29 shows the dissimilarity matrix for these numbers.

TABLE 21.29
Dissimilarity Matrix for Clustering

       1    3    7    9   14   20   21   25
 1     0
 3     2    0
 7     6    4    0
 9     8    6    2    0
14    13   11    7    5    0
20    19   17   13   11    6    0
21    20   18   14   12    7    1    0
25    24   22   18   16   11    5    4    0

SOURCE: “Hierarchical Cluster Analysis” by V. J. de Ghett, 1978, Quantitative Ethology, ed. by P. W. Colgan. Reprinted by permission of John Wiley & Sons, Inc.
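Each cell of the dissimilarity matrix is just the absolute difference between two of the eight numbers. As a minimal sketch in Python (the names `values` and `dissimilarity_matrix` are illustrative, not from the source), the lower triangle of the table can be rebuilt like this:

```python
# The eight example numbers from de Ghett (1978:121).
values = [1, 3, 7, 9, 14, 20, 21, 25]


def dissimilarity_matrix(xs):
    """Return the full symmetric matrix of absolute differences |a - b|."""
    return [[abs(a - b) for b in xs] for a in xs]


matrix = dissimilarity_matrix(values)

# Print only the lower triangle (including the zero diagonal),
# which is the layout the table uses.
for i, label in enumerate(values):
    row = " ".join(f"{matrix[i][j]:>2}" for j in range(i + 1))
    print(f"{label:>2} | {row}")
```

Absolute difference is the distance assumed here because it matches the distances quoted in the text (2 between 1 and 3, 4 between 21 and 25). Other kinds of data would call for a different dissimilarity measure.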

There are several ways to find clusters in this matrix. Two of them are called single-link or closest-neighbor analysis and complete-link or farthest-neighbor analysis (there are others, but I won’t go into them here). In single-link clustering, we use only the numbers adjacent to the diagonal: 2, 4, 2, 5,
