Lecture Notes on Compositional Data Analysis - Sedimentology ...
Chapter 5. Exploratory data analysis
Figure 5.1: Simulated data set before (left) and after (right) centring.
with the same relative contribution of each log-ratio in the variation array. This is a
significant difference from conventional standardisation: with real vectors, the relative
contribution of each variable is an artifact of its units, and should usually
be ignored; in contrast, in compositional vectors, all parts share the same “units”, and
their relative contribution to total variation is rich information.
5.4 The biplot: a graphical display
Gabriel (1971) introduced the biplot to represent simultaneously the rows and columns
of any matrix by means of a rank-2 approximation. Aitchison (1997) adapted it for
compositional data and proved it to be a useful exploratory and expository tool. Here
we first briefly describe the philosophy and mathematics of this technique, and then its
interpretation in depth.
5.4.1 Construction of a biplot
Consider the data matrix X with n rows and D columns; that is, D measurements have
been obtained from each one of n samples. Centre the data set as described in Section
5.3, and find the coefficients Z in clr coordinates (Eq. 4.3). Note that Z is of the same
order as X, i.e. it has n rows and D columns, and recall that clr coordinates preserve
distances. Thus, we can apply standard results to Z, in particular the fact that the
best rank-2 approximation Y to Z in the least squares sense is provided by the singular
value decomposition of Z (Krzanowski, 1988, pp. 126-128).
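The construction just described can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes NumPy, uses a randomly generated (hypothetical) composition matrix, and assumes that centring in the sense of Section 5.3 amounts to subtracting the column means in clr coordinates. The function names `clr` and `rank2_approximation` are invented for this example.

```python
import numpy as np

def clr(X):
    """Centred log-ratio coefficients of each row of a strictly positive matrix."""
    log_x = np.log(X)
    # Subtract the row-wise mean of the logs (log of the geometric mean).
    return log_x - log_x.mean(axis=1, keepdims=True)

def rank2_approximation(X):
    """Return centred clr coefficients Z, their best rank-2 approximation Y,
    and the singular values of Z."""
    Z = clr(X)
    # Assumption: centring the compositions corresponds to removing
    # the column means of the clr coefficients.
    Z = Z - Z.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # Keep only the two largest singular values and their vectors.
    Y = (U[:, :2] * s[:2]) @ Vt[:2, :]
    return Z, Y, s

# Hypothetical data: n = 20 samples, D = 4 parts, closed to unit sum.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(20, 4))
X = X / X.sum(axis=1, keepdims=True)

Z, Y, s = rank2_approximation(X)
# Squared approximation error equals the sum of the discarded squared singular values.
error = np.linalg.norm(Z - Y) ** 2
```

Because every row of Z sums to zero, Z has rank at most D - 1, so the last singular value is numerically zero; the quality of the rank-2 display depends on how small the remaining discarded singular values are relative to the first two.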
The singular value decomposition of a matrix of coefficients is obtained from the
matrix of eigenvectors L of $ZZ'$, the matrix of eigenvectors M of $Z'Z$ and the square
roots of the s positive eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_s$ of either $ZZ'$ or $Z'Z$, which are the