01.06.2013 Views

Statistical Methods in Medical Research 4ed


22 Describing data

cannot be less than zero, and there is presumably some lower limit above zero and some upper limit, but it would be difficult to say exactly what these limits are. The distinction between discrete and continuous variables is not always clear, because all continuous measurements are in practice rounded off; for instance, a series of heights might be recorded to the nearest centimetre and so appear discrete. Any ambiguity rarely matters, since the same statistical methods can often be safely applied to both continuous and discrete variables, particularly if the scale used for the latter is fairly finely subdivided. On the other hand, there are some special methods applicable to counts, which as we have seen must be positive whole numbers. The problems of summarizing quantitative data are much more complex than those for qualitative data, and the remainder of this chapter will be devoted almost entirely to them.

Sometimes a continuous or a discrete quantitative variable may be summarized by dividing the range of values into a number of categories, or grouping intervals, and producing a table of frequencies. For example, for age a number of age groups could be created and each individual put into one of the groups. The variable, age, has then been transformed into a new variable, age group, which has all the characteristics of an ordered categorical variable. Such a variable may be called an interval variable.
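The transformation of age into age group can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ages are invented, the 10-year interval boundaries are chosen to resemble those of Table 2.4, and the function name `age_group` is hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical ages in years (illustrative values, not data from the text).
ages = [28, 41, 47, 52, 58, 59, 63, 66, 71, 74]

# Assumed grouping intervals: 25-34, 35-44, 45-54, 55-64, 65-74.
lower_bounds = [25, 35, 45, 55, 65]
upper_limit = 75  # exclusive upper limit of the last interval


def age_group(age):
    """Return the label of the grouping interval that contains `age`."""
    if not lower_bounds[0] <= age < upper_limit:
        raise ValueError(f"age {age} lies outside the grouping intervals")
    # Index of the last lower bound that does not exceed the age.
    i = bisect_right(lower_bounds, age) - 1
    lo = lower_bounds[i]
    hi = (lower_bounds[i + 1] if i + 1 < len(lower_bounds) else upper_limit) - 1
    return f"{lo}-{hi}"


# The new ordered categorical variable, and its table of frequencies.
groups = [age_group(a) for a in ages]
frequency = Counter(groups)

for label in sorted(frequency):
    print(label, frequency[label])
```

The labels sort correctly here only because every lower bound has two digits; with wider ranges it would be safer to order the groups by their lower bounds.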

A useful first step in summarizing a fairly large collection of quantitative data is the formation of a frequency distribution. This is a table showing the number of observations, or frequency, at different values or within certain ranges of values of the variable. For a discrete variable with a few categories the frequency may be tabulated at each value, but, if there is a wide range of possible values, it will be convenient to subdivide the range into categories. An example is shown in Table 2.3. (In this example the reader should note the distinction between two types of count: the variable, which is the number of lesions on an individual chorioallantoic membrane, and the frequency, which is the number of membranes on which the variable falls within a specified range.) With continuous measurements one must form grouping intervals (Table 2.4). In Table 2.4 the cumulative relative

Table 2.4 Frequency distribution of age for 1357 male patients with lung cancer.

Age (years)   Frequency              Relative        Cumulative relative
              (number of patients)   frequency (%)   frequency (%)

25–                17                  1.3               1.3
35–               116                  8.5               9.8
45–               493                 36.3              46.1
55–               545                 40.2              86.3
65–74             186                 13.7             100.0

Total            1357                100.0
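The relative and cumulative relative frequencies in Table 2.4 follow directly from the frequencies, as the short Python sketch below shows. The frequencies are those of the table; the variable names are illustrative, and the cumulative percentages are computed from the cumulative counts, which is an assumption about how the table was prepared.

```python
from itertools import accumulate

# Age groups and frequencies as given in Table 2.4.
age_groups = ["25-", "35-", "45-", "55-", "65-74"]
frequencies = [17, 116, 493, 545, 186]

total = sum(frequencies)  # number of patients

# Relative frequency: each frequency as a percentage of the total.
relative = [100 * f / total for f in frequencies]

# Cumulative relative frequency: running total of the frequencies,
# expressed as a percentage of the overall total.
cumulative = [100 * c / total for c in accumulate(frequencies)]

for group, f, r, c in zip(age_groups, frequencies, relative, cumulative):
    print(f"{group:<6} {f:>5} {r:>6.1f} {c:>6.1f}")
print(f"{'Total':<6} {total:>5} {sum(relative):>6.1f}")
```

Working from the cumulative counts keeps rounding errors in the individual percentages from building up in the last column.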
