Lies, Damned Lies, or Statistics: How to Tell the Truth with Statistics, 2017a

3.1. THE LEAST SQUARES REGRESSION LINE

DEFINITION 3.1.1. Given a bivariate quantitative dataset $\{(x_1, y_1), \dots, (x_n, y_n)\}$ and a candidate line $\hat{y} = mx + b$ passing through this dataset, a residual is the difference in y-coordinates of an actual data point $(x_i, y_i)$ and the line’s y value at the same x-coordinate. That is, if the y-coordinate of the line when $x = x_i$ is $\hat{y}_i = mx_i + b$, then the residual is the measure of error given by $error_i = y_i - \hat{y}_i$.

Note we use the convention here and elsewhere of writing $\hat{y}$ for the y-coordinate on an approximating line, while the plain $y$ variable is left for actual data values, like $y_i$.

Here is an example of what residuals look like:

[Figure: an example of what residuals look like]

Now we are in a position to state the following definition.

DEFINITION 3.1.2. Given a bivariate quantitative dataset, the least squares regression line, almost always abbreviated to LSRL, is the line for which the sum of the squares of the residuals is the smallest possible.

FACT 3.1.3. If a bivariate quantitative dataset $\{(x_1, y_1), \dots, (x_n, y_n)\}$ has LSRL given by $\hat{y} = mx + b$, then
(1) The slope of the LSRL is given by $m = r \frac{s_y}{s_x}$, where $r$ is the correlation coefficient of the dataset and $s_x$ and $s_y$ are the standard deviations of the x- and y-values.
(2) The LSRL passes through the point $(\bar{x}, \bar{y})$.
(3) It follows that the y-intercept of the LSRL is given by $b = \bar{y} - \bar{x} m = \bar{y} - \bar{x} r \frac{s_y}{s_x}$.

It is possible to find the (coefficients of the) LSRL using the above information, but it is often more convenient to use a calculator or other electronic tool. Such tools also make it very easy to graph the LSRL right on top of the scatterplot. When the linear association is strong (as will be indicated by the correlation coefficient), it is also often fairly easy to sketch what the LSRL will likely look like by just making a good guess, using visual intuition.

EXAMPLE 3.1.4. Here is some data where the individuals are 23 students in a statistics class, the independent variable is the students’ total score on their homeworks, while the dependent variable is their final total course points, both out of 100.

x : 65 65 50 53 59 92 86 84 29
y : 74 71 65 60 83 90 84 88 48

x : 29  9 64 31 69 10 57 81 81
y : 54 25 79 58 81 29 81 94 86

x : 80 70 60 62 59
y : 95 68 69 83 70

Here is the resulting scatterplot, made with LibreOffice Calc (a free equivalent of Microsoft Excel):

[Figure: scatterplot of the data above]

It seems pretty clear that there is quite a strong linear association between these two variables, as is borne out by the correlation coefficient, $r = .935$ (computed with LibreOffice Calc’s CORREL). Using STDEV.S and AVERAGE, we then find that the coefficients of the LSRL for this data, $\hat{y} = mx + b$, are

$m = r \frac{s_y}{s_x} = .935 \cdot \frac{18.701}{23.207} = .754$ and $b = \bar{y} - \bar{x} m = 26.976$,

where $\bar{x} \approx 58$ and $\bar{y} \approx 71$. The means are shown here rounded to whole numbers; the value of $b$ was evidently computed from unrounded figures, so substituting the rounded means directly gives a slightly different result.
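The calculation in Example 3.1.4 can also be sketched in a few lines of Python, using only the standard library in place of the spreadsheet functions CORREL, STDEV.S and AVERAGE. This is a minimal sketch, not the method used in the text: the function names `correlation`, `lsrl` and `sum_squared_residuals` are illustrative choices, and the data is typed in exactly as printed above, so the computed values may differ slightly from the rounded figures quoted in the example.

```python
from statistics import mean, stdev

# Homework scores (x) and final course points (y), as printed in Example 3.1.4.
x = [65, 65, 50, 53, 59, 92, 86, 84, 29, 29, 9, 64,
     31, 69, 10, 57, 81, 81, 80, 70, 60, 62, 59]
y = [74, 71, 65, 60, 83, 90, 84, 88, 48, 54, 25, 79,
     58, 81, 29, 81, 94, 86, 95, 68, 69, 83, 70]


def correlation(xs, ys):
    """Sample correlation coefficient r (hypothetical helper name)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))


def lsrl(xs, ys):
    """Slope and intercept of the LSRL, following Fact 3.1.3."""
    m = correlation(xs, ys) * stdev(ys) / stdev(xs)  # (1) m = r * s_y / s_x
    b = mean(ys) - mean(xs) * m                      # (3) b = y_bar - x_bar * m
    return m, b


def sum_squared_residuals(xs, ys, m, b):
    """Sum of squared residuals (y_i - y_hat_i)^2 for the line y_hat = m x + b."""
    return sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(xs, ys))


r = correlation(x, y)
m, b = lsrl(x, y)
print(f"r = {r:.3f}, m = {m:.3f}, b = {b:.3f}")
print(f"sum of squared residuals = {sum_squared_residuals(x, y, m, b):.1f}")
```

Calling `sum_squared_residuals` with a slightly different slope or intercept gives a larger total, which is the defining property in Definition 3.1.2.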