
Learning Statistics with R - A tutorial for psychology students and other beginners, 2018a


> x <- TRUE
> class(x)
[1] "logical"
> x <- 100
> class(x)
[1] "numeric"

Exciting, no?

4.7 Factors

Okay, it’s time to start introducing some of the data types that are somewhat more specific to statistics. If you remember back to Chapter 2, when we assign numbers to possible outcomes, these numbers can mean quite different things depending on what kind of variable we are attempting to measure. In particular, we commonly make the distinction between nominal, ordinal, interval and ratio scale data. How do we capture this distinction in R? Currently, we only seem to have a single numeric data type. That’s probably not going to be enough, is it?

A little thought suggests that the numeric variable class in R is perfectly suited for capturing ratio scale data. For instance, if I were to measure response time (RT) for five different events, I could store the data in R like this:

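> RT <- c(342, 401, 590, 391, 554)

With ratio scale data, both multiplying the response times by a constant and adding a constant to them give meaningful results:

> 2 * RT
[1] 684 802 1180 782 1108
> RT + 1000
[1] 1342 1401 1590 1391 1554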

And to a lesser extent, the “numeric” class is okay for interval scale data, as long as we remember that multiplication and division aren’t terribly interesting for these sorts of variables. That is, if my IQ score is 110 and yours is 120, it’s perfectly okay to say that you’re 10 IQ points smarter than me,[18] but it’s not okay to say that I’m only 92% as smart as you are, because intelligence doesn’t have a natural zero.[19] We might even be willing to tolerate the use of numeric variables to represent ordinal scale variables, such as those that you typically get when you ask people to rank order items (e.g., like we do in Australian elections), though as we will see R actually has a built-in tool for representing ordinal data (see Section 7.11.2). However, when it comes to nominal scale data, using numeric variables becomes completely unacceptable, because almost all of the “usual” rules for what you’re allowed to do with numbers don’t apply to nominal scale data. It is for this reason that R has factors.
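To make the problem with nominal data concrete, here is a minimal sketch. The `group` variable is hypothetical (it does not come from the text) and simply codes three groups as 1, 2 and 3. Stored as numeric, R will happily average the codes. Converted to a factor with `as.factor()`, R treats them as labels and declines to do arithmetic on them.

```r
# Hypothetical group codes (an assumption for illustration):
# 1, 2 and 3 are labels for three groups, not quantities.
group <- c(1, 1, 2, 2, 3, 3)

# Stored as numeric, R treats the labels as numbers and averages them.
# The result is 2, which means nothing for nominal labels.
numeric_mean <- mean(group)

# Stored as a factor, the values become category labels.
group_factor <- as.factor(group)
group_levels <- levels(group_factor)

# Arithmetic on a factor produces NA (plus a warning, suppressed here)
# rather than a misleading number.
shifted <- suppressWarnings(group_factor + 2)

print(numeric_mean)
print(group_levels)
print(shifted)
```

The warning that R would normally print for the last operation says that `+` is not meaningful for factors, which is exactly the protection we want for nominal scale data.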

[18] Taking all the usual caveats that attach to IQ measurement as a given, of course.

[19] Or, more precisely, we don’t know how to measure it. Arguably, a rock has zero intelligence. But it doesn’t make sense to say that the IQ of a rock is 0 in the same way that we can say that the average human has an IQ of 100. And without knowing what the IQ value is that corresponds to a literal absence of any capacity to think, reason or learn, then we really can’t multiply or divide IQ scores and expect a meaningful answer.
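The point about the missing natural zero can be sketched with a little arithmetic. The two scores are the ones used in the text. The `shift` value of 50 is purely hypothetical, standing in for an unknown "true zero" somewhere on the IQ scale. Differences survive a change of origin, but ratios do not.

```r
# IQ scores from the example in the text
me  <- 110
you <- 120

# Differences and ratios on the scale as given
difference <- you - me    # 10 IQ points
ratio      <- me / you    # about 0.92, the "92%" figure

# Hypothetical: suppose the true zero point sat at 50 on this scale.
# The value 50 is arbitrary and chosen only for illustration.
shift <- 50
shifted_difference <- (you - shift) - (me - shift)   # still 10
shifted_ratio      <- (me - shift) / (you - shift)   # about 0.86

print(c(difference, shifted_difference))
print(round(c(ratio, shifted_ratio), 2))
```

Because the ratio depends on where zero is placed and the difference does not, only addition and subtraction are safe for interval scale data.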
