Lies, Damned Lies, or Statistics- How to Tell the Truth with Statistics, 2017a
2. BI-VARIATE STATISTICS: BASICS
2.3. Correlation
As before (in §§1.4 and 1.5), when we moved from describing histograms with words
(like symmetric) to describing them with numbers (like the mean), we now will build a
numeric measure of the strength and direction of a linear association in a scatterplot.
DEFINITION 2.3.1. Given bivariate quantitative data $\{(x_1, y_1), \dots, (x_n, y_n)\}$, the [Pearson]
correlation coefficient of this dataset is
$$r = \frac{1}{n-1} \sum \frac{(x_i - \bar{x})(y_i - \bar{y})}{s_x s_y}$$
where $s_x$ and $s_y$ are the standard deviations of the x and y datasets, respectively, each taken by itself.
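To make the definition concrete, here is a minimal Python sketch that evaluates the formula term by term. The function names `sample_std` and `pearson_r` and the five data points are illustrative choices, not part of the text. The sketch also assumes that the standard deviations are sample standard deviations (dividing by n − 1), which matches the factor 1/(n − 1) in the definition.

```python
import math


def sample_std(values):
    """Sample standard deviation (assumed here to divide by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def pearson_r(xs, ys):
    """Correlation coefficient r, computed directly from the definition."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length datasets with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_x = sample_std(xs)
    s_y = sample_std(ys)
    # Sum of products of deviations from the two means.
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / ((n - 1) * s_x * s_y)


# Made-up example data, purely for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 3))  # prints 0.775
```

If every x value (or every y value) is the same, the corresponding standard deviation is 0 and the division fails, so r is undefined for such a dataset.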
We collect some basic information about the correlation coefficient in the following
FACT 2.3.2. For any bivariate quantitative dataset $\{(x_1, y_1), \dots, (x_n, y_n)\}$ with correlation
coefficient r, we have
(1) −1 ≤ r ≤ 1 is always true;
(2) if |r| is near 1 – meaning that r is near ±1 – then the linear association between x
and y is strong;
(3) if r is near 0 – meaning that r is positive or negative, but near 0 – then the linear
association between x and y is weak;
(4) if r > 0 then the linear association between x and y is positive, while if r < 0 then the linear association between x and y is negative.