STAT170 Workshop Notes prepared by Nan Carter for Numeracy ...


Week 11 STAT170 workshop on CATEGORICAL variables.

Prepared for Numeracy Centre Macquarie University, by Nan Carter. Examples taken from various sources and adapted for STAT170.

CATEGORICAL VARIABLES: WE CONSIDER ONLY ONE RANDOM SAMPLE IN STAT170

SUMMARY:

1. ONE CATEGORICAL VARIABLE OBSERVED ON EACH ITEM

• TWO CATEGORIES: use EITHER

z-TEST OF PROPORTIONS for H0: π = π0, where π0 has a particular value, provided that nπ0 > 5 and n(1 - π0) > 5:

z = (p - π0) / SE(p)

where p is the sample estimate of π and, under H0, SE(p) = √[π0(1 - π0)/n].

The 95% confidence interval for π uses the estimated SE(p) = √[p(1 - p)/n] and is given by

(p - 1.96 SE(p), p + 1.96 SE(p))

OR

χ² GOODNESS OF FIT TEST: The null hypothesis is the same as for the z-test, and it is used to find the expected values, E, for both categories. The observed COUNTS are denoted by 'O'. The χ²-test is valid if all expected values, E, are > 5:

χ² = Σ[(O - E)²/E], with 1 degree of freedom.

• MORE THAN TWO CATEGORIES: use only the χ² goodness of fit test, with degrees of freedom = (no. of categories - 1). Use the information in the question to form the null hypothesis, and from it the expected values. The observed counts are given in the data.

WARNING: SMALL EXPECTED FREQUENCIES: The chi-square test is not valid when any expected frequency is less than 5. To counter this problem, we group classes together to create larger observed counts and larger expected counts until every expected count is large enough. There is a good example of the need to do this in the 1998 mid-year exam paper (see below: the question on no. of children and level of education of working women).

2. TWO CATEGORICAL VARIABLES OBSERVED ON EACH ITEM, and we are INTERESTED IN WHETHER THE VARIABLES ARE INDEPENDENT. THE DATA ARE A TABLE OF OBSERVED COUNTS, O.

• χ² TEST OF INDEPENDENCE: H0: the variables are independent.

For the 'expected values', E, use the totals of the observed counts in the table:

E = (row total)(column total) / (grand total)

WE STILL REQUIRE THAT ALL Es BE > 5. Then

χ² = Σ[(O - E)²/E]

where the sum is over all cells, and the degrees of freedom are (r - 1)(c - 1) for a table with r rows and c columns. If H0 is rejected, then the variables are not independent: they are associated (or related).
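
The formulas in the summary above can be checked by hand or with a short script. The following is a minimal sketch in Python (standard library only), not part of the original notes. The function names and the example counts (60 out of 100, and the 2 x 2 table) are made up for illustration.

```python
import math


def z_test_proportion(successes, n, pi0):
    """z statistic for H0: pi = pi0, and the 95% confidence interval for pi."""
    # Validity condition from the notes: n*pi0 > 5 and n*(1 - pi0) > 5.
    if not (n * pi0 > 5 and n * (1 - pi0) > 5):
        raise ValueError("need n*pi0 > 5 and n*(1 - pi0) > 5")
    p = successes / n
    se_null = math.sqrt(pi0 * (1 - pi0) / n)  # SE under H0, used in the test
    se_est = math.sqrt(p * (1 - p) / n)       # estimated SE, used in the interval
    z = (p - pi0) / se_null
    return z, (p - 1.96 * se_est, p + 1.96 * se_est)


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells; every E must be > 5."""
    if any(e <= 5 for e in expected):
        raise ValueError("an expected count is not > 5: group categories first")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def goodness_of_fit(observed, null_proportions):
    """Chi-square goodness of fit: returns (statistic, degrees of freedom)."""
    n = sum(observed)
    expected = [n * q for q in null_proportions]
    return chi_square_statistic(observed, expected), len(observed) - 1


def independence(table):
    """Chi-square test of independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    # E = (row total)(column total) / (grand total), in the same cell order
    # as the flattened observed counts (row by row).
    observed = [o for row in table for o in row]
    expected = [r * c / grand for r in row_totals for c in col_totals]
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square_statistic(observed, expected), df


# Illustrative (made-up) data, not taken from the workshop exercises.
z, ci = z_test_proportion(60, 100, 0.5)
print(round(z, 3), [round(x, 3) for x in ci])
print(goodness_of_fit([60, 40], [0.5, 0.5]))
print(independence([[30, 20], [20, 30]]))
```

With two categories the two approaches agree: for these made-up counts the goodness of fit statistic equals the square of the z statistic, so either test leads to the same conclusion.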
