14.03.2014 Views

Modeling and Multivariate Methods - SAS


Chapter 17 Correlations and Multivariate Techniques

Computations and Statistical Details

where K is a constant equal to the 0.75 quantile of a chi-square distribution with degrees of freedom equal to the number of columns in the data table, and

$$Q = (y_i - \mu)^T (S^2)^{-1} (y_i - \mu)$$

where $y_i$ is the response for the ith observation, $\mu$ is the current estimate of the mean vector, $S^2$ is the current estimate of the covariance matrix, and $T$ denotes the transpose matrix operation. The final step is a bias reduction of the variance matrix.
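As a rough illustration of the quantity Q defined above, the following Python sketch computes it for a few rows. The helper names (`solve_linear`, `mahalanobis_q`) and the sample mean, covariance, and data values are hypothetical and chosen only for this example. The closed form used for K holds only for a two-column table, because a chi-square distribution with 2 degrees of freedom has the CDF 1 - exp(-x/2). How Q and K are combined into weights is not shown in this excerpt, so the sketch stops at computing both values.

```python
import math


def solve_linear(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def mahalanobis_q(y, mu, cov):
    """Return Q = (y - mu)^T cov^{-1} (y - mu) for one observation y."""
    d = [yi - mi for yi, mi in zip(y, mu)]
    z = solve_linear(cov, d)  # z = cov^{-1} d, without forming the inverse
    return sum(di * zi for di, zi in zip(d, z))


# Hypothetical current estimates and data rows (two columns).
mu = [0.0, 0.0]
cov = [[4.0, 0.0], [0.0, 1.0]]
rows = [[2.0, 0.0], [0.0, 1.0], [6.0, 3.0]]

q_values = [mahalanobis_q(y, mu, cov) for y in rows]

# 0.75 quantile of a chi-square with 2 degrees of freedom (two columns only).
k = -2.0 * math.log(1.0 - 0.75)

print(q_values, k)
```

For tables with more than two columns, K would need a general chi-square quantile function from a statistics library.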

The tradeoff of this method is that the variance estimates can be higher when the data do not have many outliers, but the variances can be estimated much more precisely when the data do have outliers.

Pearson Product-Moment Correlation

The Pearson product-moment correlation coefficient measures the strength of the linear relationship between two variables. For response variables X and Y, it is denoted as r and computed as

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$

If there is an exact linear relationship between two variables, the correlation is 1 or -1, depending on whether the variables are positively or negatively related. If there is no linear relationship, the correlation tends toward zero.
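The Pearson computation and its two boundary cases can be sketched in a few lines of Python. The function name `pearson_r` and the sample values are hypothetical. The sketch raises an error for a constant column, where the denominator is zero and the coefficient is undefined.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of cross products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator terms: sums of squared deviations.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


x = [1.0, 2.0, 3.0, 4.0]
r_up = pearson_r(x, [2 * v + 1 for v in x])     # exact increasing line
r_down = pearson_r(x, [-3 * v + 5 for v in x])  # exact decreasing line
print(r_up, r_down)
```

The two calls use exact linear relationships with positive and negative slope, which is the case where the text says the coefficient reaches 1 or -1.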

Nonparametric Measures of Association

For the Spearman, Kendall, or Hoeffding correlations, the data are first ranked. Computations are then performed on the ranks of the data values. Average ranks are used in case of ties.

Spearman’s ρ (rho) Coefficients

Spearman’s ρ correlation coefficient is computed on the ranks of the data using the formula for Pearson’s correlation previously described.
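A minimal Python sketch of this two-step procedure follows: rank each column using average ranks for ties, then apply the Pearson formula to the ranks. The function names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data are hypothetical and used only for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value equals the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


tied_ranks = average_ranks([10, 20, 20, 30])
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
print(tied_ranks, rho)
```

The second example is monotone but not linear, so it illustrates why a rank-based coefficient can equal 1 where the Pearson coefficient on the raw values would not.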

Kendall’s $\tau_b$ Coefficients

Kendall’s $\tau_b$ coefficients are based on the number of concordant and discordant pairs. A pair of rows for two variables is concordant if the two variables agree on which row has the greater value. Otherwise the pair is discordant or tied.

The formula

$$\tau_b = \frac{\sum_{i<j} \operatorname{sgn}(x_i - x_j)\,\operatorname{sgn}(y_i - y_j)}{\sqrt{(T_0 - T_1)(T_0 - T_2)}}$$
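The excerpt ends before the terms $T_0$, $T_1$, and $T_2$ are defined. The Python sketch below assumes the usual $\tau_b$ definitions: $T_0 = n(n-1)/2$ is the total number of pairs, and $T_1$ and $T_2$ sum $t(t-1)/2$ over the groups of tied values in x and y respectively. These definitions, the function names, and the sample data are assumptions for illustration, not taken from the text above.

```python
import math
from collections import Counter


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau_b(x, y):
    """Kendall's tau-b using the pairwise sign formula (O(n^2) pairs)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    # Numerator: +1 for each concordant pair, -1 for discordant, 0 for ties.
    s = sum(
        sign(x[i] - x[j]) * sign(y[i] - y[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    # Assumed standard definitions of the tie-correction terms.
    t0 = n * (n - 1) / 2
    t1 = sum(c * (c - 1) / 2 for c in Counter(x).values())
    t2 = sum(c * (c - 1) / 2 for c in Counter(y).values())
    denom = math.sqrt((t0 - t1) * (t0 - t2))
    if denom == 0:
        raise ValueError("tau-b is undefined when a column is constant")
    return s / denom


tau_perfect = kendall_tau_b([1, 2, 3, 4], [1, 2, 3, 4])
tau_tied = kendall_tau_b([1, 2, 2, 3], [1, 3, 2, 4])
print(tau_perfect, tau_tied)
```

In the second call, one pair is tied in x, so it contributes 0 to the numerator and reduces the first factor of the denominator from 6 to 5.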
