06.09.2021 Views

Learning Statistics with R - A tutorial for psychology students and other beginners, 2018a


Table 7.5: An illustration of two different ways of indexing a 2 × 3 matrix. On the left we see the row
and column version, which is identical to the corresponding indexing scheme for a data frame of the same
size. On the right we see the single-index version, which is quite different to what we would get with a
data frame. The reason for this is that, for both data frames and matrices, the “row and column” version
exists to allow the human user to interact with the object in the psychologically meaningful way: since
both data frames and matrices are basically just tables of data, it’s the same in each case. However,
the single-index version is really a method for you to interact with the object in terms of its internal
structure, and the internals for data frames and matrices are quite different.

Left: row and column version

           column
row      1      2      3
1      [1,1]  [1,2]  [1,3]
2      [2,1]  [2,2]  [2,3]

Right: single-index version

           column
row      1      2      3
1       [1]    [3]    [5]
2       [2]    [4]    [6]
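The two indexing schemes in Table 7.5 can be checked directly at the console. The sketch below is illustrative only: the matrix M and the data frame df are hypothetical objects made up for this example, filled with the numbers 1 to 6 so that each value equals its own single-index position.

```r
# Hypothetical 2 x 3 matrix (not from the text). R fills matrices column
# by column, so each cell's value equals its single-index position.
M <- matrix(1:6, nrow = 2, ncol = 3)

M[2, 1]   # row-and-column index: row 2, column 1
M[2]      # single index: the same cell, because [2] sits at [2,1]
M[1, 2]   # row 1, column 2 ...
M[3]      # ... is the cell with single index [3]

# A data frame of the same size agrees on row-and-column indexing,
# but a single index selects a whole column rather than one cell.
df <- as.data.frame(M)
df[2, 1]  # same cell as M[2, 1]
df[2]     # the entire second column, returned as a data frame
```

The single index walks down the first column before moving to the second, which is why [3] lands in row 1, column 2 rather than row 1, column 3.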

(which can be qualitatively different to each other) and rows represent cases (which cannot). Matrices
are intended to be thought of in a different way. At a fundamental level, a matrix really is just one
variable: it just happens that this one variable is formatted into rows and columns. If you want a matrix
of numeric data, every single element in the matrix must be a number. If you want a matrix of character
strings, every single element in the matrix must be a character string. If you try to mix data of different
types together, then R will either spit out an error, or quietly coerce the underlying data into a single
common type (such as character strings or, in some cases, a list). If
you want to find out what class R secretly thinks the data within the matrix is, you need to do something
like this:

> class( M[1] )
[1] "numeric"

You can’t type class(M), because all that will happen is R will tell you that M is a matrix: we’re not
interested in the class of the matrix itself, we want to know what class the underlying data is assumed
to be. Anyway, to give you a sense of how R enforces this, let’s try to change one of the elements of our
numeric matrix into a character string:

> M[1,2] <- "text"
> M
      col.1 col.2  col.3
row.1 "2"   "text" "1"
row.2 "5"   "6"    "7"

It looks as if R has coerced all of the data in our matrix into character strings. And in fact, if we now
typed in class(M[1]) we’d see that this is exactly what has happened. If you alter the contents of one
element in a matrix, R will change the underlying data type as necessary.
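To see the coercion from start to finish, here is a minimal sketch. The original definition of M is not shown in this section, so the starting value in row 1, column 2 (here 3) is an assumption. The other five values and the dimension names are read off the printed matrix.

```r
# Rebuild a numeric matrix like M; the value 3 in [1,2] is an assumption,
# since the text never shows what that cell held before being overwritten.
M <- rbind(c(2, 3, 1), c(5, 6, 7))
rownames(M) <- c("row.1", "row.2")
colnames(M) <- c("col.1", "col.2", "col.3")

class(M[1])        # "numeric": every element is a number

M[1, 2] <- "text"  # overwrite a single element with a character string
class(M[1])        # "character": every element has been coerced
M                  # all six values now print in quotes
```

Note that R gives no warning when this happens, so it is worth checking class(M[1]) after editing a matrix by hand.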

There’s only one more thing I want to talk about regarding matrices. The concept behind a matrix
is very much a mathematical one, and in mathematics a matrix is most definitely a two-dimensional
object. However, when doing data analysis, we often have reasons to want to use higher dimensional
tables (e.g., sometimes you need to cross-tabulate three variables against each other). You can’t do this
with matrices, but you can do it with arrays. An array is just like a matrix, except it can have more
than two dimensions if you need it to. In fact, as far as R is concerned a matrix is just a special kind of
array, in much the same way that a data frame is a special kind of list. I don’t want to talk about arrays
too much, but I will very briefly show you an example of what a 3D array looks like. To that end, let’s
cross tabulate the speaker and utterance variables from the nightgarden.Rdata data file, but we’ll add
