Lecture Notes on Compositional Data Analysis - Sedimentology ...

Chapter 5. Exploratory data analysis

Figure 5.1: Simulated data set before (left) and after (right) centring.

with the same relative contribution of each log-ratio in the variation array. This is a significant difference from conventional standardisation: with real vectors, the relative contribution of each variable is an artifact of the units of that variable, and should usually be ignored; in contrast, in compositional vectors, all parts share the same “units”, and their relative contribution to total variation is rich information.

5.4 The biplot: a graphical display

Gabriel (1971) introduced the biplot to represent simultaneously the rows and columns of any matrix by means of a rank-2 approximation. Aitchison (1997) adapted it for compositional data and proved it to be a useful exploratory and expository tool. Here we first briefly describe the philosophy and mathematics of this technique, and then its interpretation in depth.

5.4.1 Construction of a biplot

Consider the data matrix $X$ with $n$ rows and $D$ columns. Thus, $D$ measurements have been obtained from each one of $n$ samples. Centre the data set as described in Section 5.3, and find the coefficients $Z$ in clr coordinates (Eq. 4.3). Note that $Z$ is of the same order as $X$, i.e. it has $n$ rows and $D$ columns, and recall that clr coordinates preserve distances. Thus, we can apply standard results to $Z$, in particular the fact that the best rank-2 approximation $Y$ to $Z$ in the least squares sense is provided by the singular value decomposition of $Z$ (Krzanowski, 1988, pp. 126-128).

The singular value decomposition of a matrix of coefficients is obtained from the matrix of eigenvectors $L$ of $ZZ'$, the matrix of eigenvectors $M$ of $Z'Z$, and the square roots of the $s$ positive eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_s$ of either $ZZ'$ or $Z'Z$, which are the same. As a result, taking $k_i = \sqrt{\lambda_i}$, we can write

\[
Z = L
\begin{pmatrix}
k_1 & 0 & \cdots & 0 \\
0 & k_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & k_s
\end{pmatrix}
M' ,
\]

where $s$ is the rank of $Z$ and the singular values $k_1, k_2, \ldots, k_s$ are in descending order of magnitude. Usually $s = D - 1$. Both matrices $L$ and $M$ are orthonormal. The rank-2 approximation is then obtained by simply replacing all singular values with index larger than 2 by zero, thus keeping

\[
Y =
\begin{pmatrix} l_1' & l_2' \end{pmatrix}
\begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix}
\begin{pmatrix} m_1 \\ m_2 \end{pmatrix}
=
\begin{pmatrix}
l_{11} & l_{21} \\
l_{12} & l_{22} \\
\vdots & \vdots \\
l_{1n} & l_{2n}
\end{pmatrix}
\begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix}
\begin{pmatrix}
m_{11} & m_{12} & \cdots & m_{1D} \\
m_{21} & m_{22} & \cdots & m_{2D}
\end{pmatrix} .
\]

The proportion of variability retained by this approximation is

\[
\frac{\lambda_1 + \lambda_2}{\sum_{i=1}^{s} \lambda_i} .
\]

To obtain a biplot, it is first necessary to write $Y$ as the product of two matrices $GH'$, where $G$ is an $(n \times 2)$ matrix and $H$ is a $(D \times 2)$ matrix. There are different possibilities to obtain such a factorisation, one of which is

\[
Y =
\begin{pmatrix}
\sqrt{n-1}\, l_{11} & \sqrt{n-1}\, l_{21} \\
\sqrt{n-1}\, l_{12} & \sqrt{n-1}\, l_{22} \\
\vdots & \vdots \\
\sqrt{n-1}\, l_{1n} & \sqrt{n-1}\, l_{2n}
\end{pmatrix}
\begin{pmatrix}
\dfrac{k_1 m_{11}}{\sqrt{n-1}} & \dfrac{k_1 m_{12}}{\sqrt{n-1}} & \cdots & \dfrac{k_1 m_{1D}}{\sqrt{n-1}} \\[2ex]
\dfrac{k_2 m_{21}}{\sqrt{n-1}} & \dfrac{k_2 m_{22}}{\sqrt{n-1}} & \cdots & \dfrac{k_2 m_{2D}}{\sqrt{n-1}}
\end{pmatrix}
=
\begin{pmatrix} g_1 \\ g_2 \\ \vdots \\ g_n \end{pmatrix}
\begin{pmatrix} h_1 & h_2 & \cdots & h_D \end{pmatrix} .
\]

The biplot consists simply of representing the $n + D$ vectors $g_i$, $i = 1, 2, \ldots, n$, and $h_j$, $j = 1, 2, \ldots, D$, in a plane. The vectors $g_1, g_2, \ldots, g_n$ are termed the row markers of $Y$ and correspond to the projections of the $n$ samples on the plane defined by the first two eigenvectors of $ZZ'$. The vectors $h_1, h_2, \ldots, h_D$ are the column markers, which correspond to the projections of the $D$ clr-parts on the plane defined by the first two eigenvectors of $Z'Z$. Both planes can be superposed for a visualisation of the relationship between samples and parts.
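
The construction of the row and column markers can be sketched numerically as follows. This is a minimal sketch, not the authors' implementation: it assumes NumPy, a strictly positive data matrix, and that the centring of Section 5.3 amounts to removing the geometric centre of the data set, which in clr coefficients is the same as subtracting the column means. The function names `clr` and `biplot_markers` are illustrative and do not appear in the notes.

```python
import numpy as np


def clr(X):
    """Centred log-ratio coefficients of each row of a strictly positive matrix."""
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)


def biplot_markers(X):
    """Row markers G (n x 2), column markers H (D x 2), and retained proportion.

    Assumption: X holds n compositions (rows) of D strictly positive parts.
    """
    n = X.shape[0]
    Z = clr(X)
    # Centring: subtracting clr column means removes the geometric centre.
    Z = Z - Z.mean(axis=0, keepdims=True)

    # Thin SVD, Z = L diag(k) M'; NumPy returns M' and sorts k in descending order.
    L, k, Mt = np.linalg.svd(Z, full_matrices=False)

    # Factorisation Y = G H' with the sqrt(n - 1) scaling used in the text.
    G = np.sqrt(n - 1) * L[:, :2]
    H = (Mt[:2, :].T * k[:2]) / np.sqrt(n - 1)

    # lambda_i = k_i**2, so this is (lambda_1 + lambda_2) / sum_i lambda_i.
    retained = (k[:2] ** 2).sum() / (k ** 2).sum()
    return G, H, retained


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(0.1, 1.0, size=(20, 4))  # hypothetical data, for illustration only
    G, H, retained = biplot_markers(X)
    print(G.shape, H.shape, round(float(retained), 3))
```

The signs of the singular vectors returned by an SVD routine are arbitrary, so a biplot drawn from `G` and `H` may appear reflected about an axis relative to another implementation; the relative geometry of the markers is unaffected. With $D = 3$ parts the rank of $Z$ is at most 2, so in that case the product of `G` and the transpose of `H` reproduces the centred clr matrix exactly.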