12.07.2015 Views

STATISTICS – UNDERSTANDING HYPOTHESIS TESTS


WORKSHOP: STATISTICS – UNDERSTANDING HYPOTHESIS TESTS: SINGLE PROPORTIONS AND DIFFERENCES BETWEEN TWO PROPORTIONS
for Semester One, 2014

Statistics Programme

Student Learning offers a workshop and one-to-one assistance programme for undergraduate students requiring help with basic skills and concepts in statistics. Students MUST REGISTER for each workshop with Student Learning.

© Libraries and Learning Services - Student Learning Services (Tā te Ākonga), The University of Auckland


Statistical help available at Student Learning

Each semester, Student Learning offers statistical help through a number of workshops and one-to-one advisory sessions by appointment. Please contact Student Learning reception (09-373 7599 ext. 88850; sls@auckland.ac.nz; Room 320, third floor, Kate Edgar Information Commons Building) to book a one-to-one advisory session.

Student Learning Statistics Workshops

Any questions regarding statistics workshops should be forwarded to:
Leila Boyle
Student Learning Undergraduate Statistics Assistance
l.boyle@auckland.ac.nz; (09) 923-9045; 021 447-018

Workshops are run in a relaxed environment and allow plenty of time for questions. In fact, this is encouraged.

Preparation at the beginning of the semester:
Introductory workshop
o Basic maths and calculator skills for Statistics

First half of the semester
Statistics Theory Workshops
Please make sure you bring your calculator with you to these workshops. Either a scientific calculator or a graphics calculator is fine - it should be able to calculate the mean and standard deviation of a set of data.
o Polls, surveys, experiments and observational studies
o Basic data analysis
o Proportions and probabilities
o Understanding confidence intervals for:
• Proportions (single proportion and difference between two proportions)
• Means (single mean and difference between two means)

Computer Workshop
o Introducing SPSS in one hour


Understanding Hypothesis Tests: Proportions

This material builds on a number of workshops already held in the first half of this semester, which you may or may not have attended.

If you want to learn more about:
• How we collect our sample data, see the Polls, Surveys, Experiments and Observational Studies workshop material.
• How we explore, display and describe our sample data, see the Basic Data Analysis workshop material.
• How we extract a proportion/probability from a two-way table of counts, see the Proportions and Probabilities workshop material.
• How we quantify the size of a single proportion or difference between two proportions, see the Understanding Confidence Intervals for Proportions workshop material.

Types of Variables

Quantitative, or Scale (SPSS) (measurements and counts)
• Continuous (measurements – few repeated values): no gaps between possible values, if measured sufficiently accurately. Examples: weight; height; temperature; age [large range].
• Discrete (usually counts – many repeated values): gaps between each of the values it can take. Examples: number of siblings; age [small range].

Qualitative (define groups)
• Categorical, or Nominal (SPSS) (no idea of order). Examples: gender (male, female); ethnicity.
• Ordinal (SPSS) (fall in natural order). Examples: size (small, medium, large); age group (


Exploring Relationships between Two Variables

Two quantitative
Exploratory tool: scatter plot.
Features to look for:
• Trend
• Scatter
• Strength of relationship
• Association
• Outliers
• Groupings

Quantitative vs qualitative
Exploratory tool: side-by-side plots on the same scale.
Features to look for:
• Any group differences:
o averages/centres
o variability/spread
o shapes (symmetric/skewed; modes)
• Details of individual groups:
o outliers, gaps, clusters, groupings
Think about reasons why these differences, similarities and features are seen.

Two qualitative
Exploratory tools: two-way table of counts and/or bar graphs of proportions (on rows and/or columns).
Features to look for:
• Most common and least common combinations
• Differences in distributions (e.g. row or column bar graphs)


Recall that:

A probability (or proportion) is a number between 0 and 1 that quantifies uncertainty.

There are two main sources of probabilities:
1. Probabilities using a model – some models that may involve equally likely outcomes are tossing a coin and rolling a die
2. Probabilities from data

t-tests by Hand – One and Two Proportion/s

We use statistics to find out about the real world and aspects of it specific to our area of interest. Statistical tools allow us to deal with the uncertainty present in all samples due to sampling variation, which occurs because we are unable to survey the entire population of interest.

We are usually unable to survey the entire population (take a census) as it is too large and/or there are:
• budget constraints
• time limits
• logistical barriers

This means we are unable to establish the parameters of interest within our population, such as:
2. Population proportion, p, or
4. Difference in population proportions, p1 – p2

This means that the parameter of interest is an unknown numerical characteristic for that particular population.

Population: p; p1 – p2
Sample: p̂; p̂1 – p̂2


To estimate an unknown numerical characteristic (parameter) for our population of interest, we take a sample and find a sample estimate from it (that is, we make a statistical inference). The sample estimates of the above population parameters are:
2. Sample proportion, p̂
4. Difference in sample proportions, p̂1 – p̂2

Usually ^HATS or BARS are used to distinguish between sample estimates and population parameters.

Population of Interest: NEW ZEALANDERS (N = 4.5 million people)
Parameter of Interest: p = proportion of left-handed NEW ZEALANDERS

Sample: NEW ZEALANDERS (n = 500 people)
Estimate: p̂ = proportion of left-handed NEW ZEALANDERS in our sample

We use sample data to make inferences (draw conclusions) about population parameters by carrying out hypothesis tests and constructing confidence intervals.

• A significance test tests one possible value for the parameter, called the hypothesised value. We determine the strength of evidence provided by the data against the null hypothesis, H0.
• A confidence interval gives a range of plausible values for the parameter of interest that is consistent with the data (at the specified level of confidence).

A significance test determines the strength of the evidence against the hypothesised value, while a confidence interval determines the size of the effect or difference.


Significance testing and confidence intervals are methods used to deal with the uncertainty about the true value of a parameter caused by the sampling variation in estimates.

Step-by-Step Guide to Performing a t-test and confidence interval by Hand

1. State the parameter to be estimated (symbol and words). Is it µ, p, µ1 – µ2, or p1 – p2?
2. State the null hypothesis, H0, i.e. H0: parameter = hyp. val.
3. State the alternative hypothesis, H1, i.e. H1: parameter ≠ hyp. val. OR H1: parameter < hyp. val. OR H1: parameter > hyp. val.
4. State the estimate, and its value.
5. Calculate the t-test statistic. Use:

$$t_0 = \frac{\text{estimate} - \text{hypothesised value}}{\text{std error}}$$

(see back page for Formulae Sheet)
Use the estimate from Step 4 and the hypothesised value from Steps 1 & 2. Find the appropriate standard error and the df (using Formulae Sheet).
6. Calculate the P-value (using Excel or the Student's t-table).
7. Interpret the P-value (see Step 7).
8. Calculate the confidence interval. Use: estimate ± t × se(estimate) (see back page for Formulae Sheet)
Use the estimate from Step 4 and the standard error from Step 5. Find the appropriate t-multiplier (using Excel or the Student's t-table).
9. Interpret the confidence interval using plain English.

There are four different types of problem:
1. Single mean
2. Single proportion
3. Difference between two means
4. Difference between two proportions:
Situation (a) Proportions from two independent samples
Situation (b) One sample of size n, several response categories
Situation (c) One sample of size n, many yes/no items


3 sampling situations for the difference between two proportions

Situation (a): Proportions from two independent samples
[Diagram: Sample 1 and Sample 2, each classified by "A occurs?" (Yes / No). Compare proportions.]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 1999.

Situation (b): One sample of size n, several response categories
[Diagram: a single sample split into Cat. 1 to Cat. 6. Compare proportions.]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.

Situation (c): One sample of size n, many yes/no items
[Diagram: a single sample classified by "A occurs?", "B occurs?" and "C occurs?" (Yes / No for each).]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.


Step 1

The parameter of interest we are investigating depends on the problem type:

Problem type                                                   Parameter
1. Single mean                                                 µ
2. Single proportion                                           p
3. Difference between two means (independent samples)          µ1 – µ2
4. Difference between two proportions                          p1 – p2

Steps 2 & 3

The null hypothesis, H0
• It is our best guess as to what we think the parameter of interest is – a single plausible value.
• General form: H0: parameter = hypothesised value (some number)
2. H0: p =
4. H0: p1 – p2 =
• It's the boring thing – there is no effect or difference.
• The hypothesised value is not the parameter of interest. Remember that the parameter of interest is an unknown quantity.

The alternative hypothesis, H1
• Specifies the type of departure from H0 that we expect to detect.
• Corresponds to the research hypothesis.
• There are three different types:
o H1: parameter ≠ hypothesised value (some number)
o H1: parameter < hypothesised value (some number)
o H1: parameter > hypothesised value (some number)
2. H1: p
4. H1: p1 – p2
• ONLY use a one-sided alternative when you have prior information or a theory.
• It's the interesting thing – there is an effect or difference.


Step 4 (and Step 8)

The estimate is based on the parameter of interest we are investigating:

Parameter                                                          Estimate
1. Single mean µ                                                   x̄
2. Single proportion p                                             p̂
3. Difference between two means µ1 – µ2 (independent samples)      x̄1 – x̄2
4. Difference between two proportions p1 – p2                      p̂1 – p̂2

Step 5 (and Step 8)

The standard error is based on the estimate, the number of samples and sample size(s):

1. Estimate x̄:

$$\mathrm{se}(\bar{x}) = \frac{s}{\sqrt{n}}$$

2. Estimate p̂:

$$\mathrm{se}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

3. Estimate x̄1 – x̄2:

$$\mathrm{se}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

4. Estimate p̂1 – p̂2:

Situation (a) Proportions from two independent samples

$$\mathrm{se}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

Situation (b) One sample of size n, several response categories

$$\mathrm{se}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 + \hat{p}_2 - (\hat{p}_1 - \hat{p}_2)^2}{n}}$$

Situation (c) One sample of size n, many yes/no items

$$\mathrm{se}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\min(\hat{p}_1 + \hat{p}_2,\ \hat{q}_1 + \hat{q}_2) - (\hat{p}_1 - \hat{p}_2)^2}{n}}$$

where $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{q}_2 = 1 - \hat{p}_2$.
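The proportion standard errors above can be sketched in code. This is a minimal illustration only: the function names are made up for this example, and the inputs are assumed to be sample proportions between 0 and 1 together with the relevant sample sizes.

```python
import math


def se_single_proportion(p_hat, n):
    """Standard error of a single sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def se_diff_independent(p1, n1, p2, n2):
    """Situation (a): proportions from two independent samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def se_diff_response_categories(p1, p2, n):
    """Situation (b): one sample of size n, several response categories."""
    return math.sqrt((p1 + p2 - (p1 - p2) ** 2) / n)


def se_diff_yes_no_items(p1, p2, n):
    """Situation (c): one sample of size n, many yes/no items."""
    q1, q2 = 1 - p1, 1 - p2
    return math.sqrt((min(p1 + p2, q1 + q2) - (p1 - p2) ** 2) / n)


# Hypothetical numbers, chosen only to exercise the functions.
print(se_single_proportion(0.5, 100))
print(se_diff_independent(0.5, 100, 0.3, 100))
print(se_diff_response_categories(0.5, 0.3, 100))
print(se_diff_yes_no_items(0.7, 0.6, 100))
```

Choosing the right function is the sampling-situation decision from the list of problem types; the arithmetic itself is the same square-root pattern each time.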


The degrees of freedom are based on the problem type:

Estimate              Degrees of Freedom
1. x̄                 df = n – 1
2. p̂                 df = ∞
3. x̄1 – x̄2          df = minimum(n1 – 1, n2 – 1)
4. p̂1 – p̂2          df = ∞

i.e. for proportions, assume the degrees of freedom is infinity, hence replace the t distribution with the z distribution (i.e. the standard Normal).

The t-test statistic
• Is the number of standard errors our estimate is from the hypothesised value.
• We calculate it using:

$$t_0 = \frac{\text{estimate} - \text{hypothesised value}}{\text{std error}}$$

(see back page for Formulae Sheet)

Step 6

The P-value:
• Is the probability that sampling variation would produce an estimate that is further away from the hypothesised value than the estimate we obtained from our data, assuming that the null hypothesis is true.
• It is a measure of evidence against H0.
• We calculate the P-value using the t-test statistic and the appropriate Student's t distribution, T ~ Student(df).

Alternative hypothesis                                   P-value = area of shaded region
H1: parameter ≠ hypothesised value (2-sided)             2-tailed test
H1: parameter > hypothesised value (1-sided)             1-tailed test
H1: parameter < hypothesised value (1-sided)             1-tailed test
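As a rough illustration of the test statistic, the sketch below computes t0 for a single proportion. The counts (60 left-handed people in a sample of 500) and the hypothesised value 0.10 are invented for this example and do not come from the workshop material; `t_statistic` is likewise just an illustrative name.

```python
import math


def t_statistic(estimate, hypothesised_value, std_error):
    """Number of standard errors the estimate lies from the hypothesised value."""
    return (estimate - hypothesised_value) / std_error


# Hypothetical data: 60 left-handed people in a sample of n = 500.
n = 500
p_hat = 60 / n
se = math.sqrt(p_hat * (1 - p_hat) / n)  # se of a single proportion

t0 = t_statistic(p_hat, 0.10, se)        # hypothesised value 0.10 is made up
print(round(t0, 3))
```

A positive t0 means the estimate is above the hypothesised value; a negative t0 means it is below.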


Example:
Find the P-value when the t-test statistic is t0 = –1.25 and the degrees of freedom, df = ∞, for a two-tailed test using Excel 2010.

Excel:

For a 1-sided test, the P-value is ____________.
For a 2-sided test, the P-value is ____________.
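The same lookup can be done outside Excel. The sketch below is one possible approach using Python's standard library, relying on the handout's rule that df = ∞ means the standard Normal distribution is used; the function name and the `alternative` labels are assumptions made for this example.

```python
from statistics import NormalDist


def p_value(t0, alternative="two-sided"):
    """P-value from a test statistic when df is infinite (standard Normal)."""
    z = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":
        return 2 * z.cdf(-abs(t0))   # area in both tails
    if alternative == "greater":
        return 1 - z.cdf(t0)         # area in the upper tail
    if alternative == "less":
        return z.cdf(t0)             # area in the lower tail
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


t0 = -1.25
print(p_value(t0, "two-sided"))
print(p_value(t0, "less"))
```

Running it lets you check your own answers to the two blanks above.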


Step 7

The P-value measures the strength of evidence against the null hypothesis, H0. The smaller the P-value, the stronger the evidence against H0.

There are two ways of interpreting the P-value:

1. As a description of the strength of evidence against H0:

P-value       Evidence against H0
> 0.12        None
0.10          Weak
0.05          Some
0.01          Strong
< 0.001       Very Strong

2. As a description of the test result as (statistically) significant or nonsignificant.

A test result is significant when the P-value is "small enough"; usually we opt for any P-values less than 0.05 (5%).

Testing at a 5% level of significance:

P-value       Test result        Action
< 0.05        Significant        Reject H0 in favour of H1
> 0.05        Nonsignificant     Do not reject H0

Testing can be done at any level of significance; 1% is common, but 5% is what most researchers use.

The level of significance is an error rate and can be thought of as the false alarm rate: i.e. it is the proportion of the time that a true null hypothesis will be rejected. It does not tell us how often a false null hypothesis will fail to be rejected; that depends on the true value of the parameter and on the sample size.


Step 8

estimate ± t × se(estimate)

The t-multiplier (means) / z-multiplier (proportions) is based on:
• Whether we are investigating means or proportions
• The desired level of confidence
• The degrees of freedom

Estimate              Degrees of Freedom
1. x̄                 df = n – 1
2. p̂                 df = ∞
3. x̄1 – x̄2          df = minimum(n1 – 1, n2 – 1)
4. p̂1 – p̂2          df = ∞

i.e. for proportions, assume the degrees of freedom is infinity, hence replace the t distribution with the z distribution (i.e. the standard Normal).

Example:
Find the t-multiplier for a 95% confidence interval for a single proportion using Excel 2010. It is a proportion scenario so degrees of freedom, df = __________.

Answer:
Excel:

The t-multiplier is ____________.
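For proportions, the multiplier can also be found without Excel. The following is a small sketch using Python's standard library, under the handout's assumption that df = ∞ (so the standard Normal is used); `z_multiplier` is an illustrative name, not something from the workshop material.

```python
from statistics import NormalDist


def z_multiplier(confidence_level):
    """Multiplier for a two-sided confidence interval when df is infinite."""
    tail_area = (1 - confidence_level) / 2      # area left in each tail
    return NormalDist().inv_cdf(1 - tail_area)  # standard Normal quantile


for level in (0.80, 0.90, 0.95, 0.99):
    print(level, round(z_multiplier(level), 3))
```

The multiplier depends only on the confidence level, not on the sample size, because the standard Normal has no degrees of freedom to adjust.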


Step 9

• A confidence interval gives a range of plausible values for the parameter of interest that is consistent with the data (at the specified level of confidence).
• It determines the size of the effect or difference.
• You can construct confidence intervals at any level: 90%, 95%, 99%...

Increasing the confidence level will increase the width of the interval.

80% CI: x̄ ± 1.282 se(x̄)
90% CI: x̄ ± 1.645 se(x̄)
95% CI: x̄ ± 1.960 se(x̄)
99% CI: x̄ ± 2.576 se(x̄)

Figure 8.1.3: The greater the confidence level, the wider the interval.
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.

Increasing the sample size will make the confidence interval more precise.
• To double the accuracy of the confidence interval we need 4 times as many observations.
• To triple the accuracy of the confidence interval we need 9 times as many observations.

95% confidence interval
• Range of plausible values for the parameter of interest that contains the true value of our parameter of interest for 95% of samples taken.
• 5% of samples taken will not have the parameter within the calculated confidence interval.
• We do not know if the sample we have taken is one of the 95% that contains the true unknown parameter. All we can say is that 95% of the time it will.
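The "4 times as many observations" rule follows from the square root of n in the standard error. A quick sketch, assuming a single proportion and a made-up sample proportion of 0.3, shows the margin of error (multiplier × standard error) halving and then dropping to a third:

```python
import math


def margin_of_error(p_hat, n, multiplier=1.96):
    """Half-width of the confidence interval for a single proportion."""
    return multiplier * math.sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.3  # hypothetical sample proportion, held fixed for comparison
base = margin_of_error(p_hat, 100)
for n in (100, 400, 900):
    moe = margin_of_error(p_hat, n)
    print(n, round(moe, 4), round(moe / base, 3))
```

In practice the sample proportion would change a little from sample to sample, so the ratios are only approximately 1/2 and 1/3 with real data.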


• If you take 1000 samples, based on the same sampling protocol, then you can expect the confidence intervals from approximately 950 of these samples to contain the true value (e.g. true mean, true difference between means) of the population.

[Figure 8.1.2: Samples of size 10 from a Normal(µ = 24.83, σ = 0.005) distribution and their 95% confidence intervals for µ, for the 1st to the 1000th sample, with the true mean marked and the coverage to date shown alongside (100% for the first eight samples, 88.9% after the 9th, 90.0% after the 10th, 94.0% after the 100th, about 96.0% around the 500th and 95.2% by the 1000th).]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 1999.
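A simulation along the lines of Figure 8.1.2 can be sketched as follows. It assumes the figure's setup (samples of size 10 from a Normal distribution with mean 24.83 and standard deviation 0.005) and uses the t-multiplier 2.262 for df = 9 from a Student's t-table; the seed and function name are arbitrary choices for this example, so the exact coverage will differ from the figure's 95.2%.

```python
import math
import random
import statistics

T_MULTIPLIER_DF9 = 2.262  # 95% t-multiplier for df = 10 - 1 = 9 (from t-table)


def coverage(n_samples=1000, n=10, mu=24.83, sigma=0.005, seed=1):
    """Fraction of 95% confidence intervals for the mean that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        lower = x_bar - T_MULTIPLIER_DF9 * se
        upper = x_bar + T_MULTIPLIER_DF9 * se
        if lower <= mu <= upper:
            hits += 1
    return hits / n_samples


print(coverage())
```

Any single interval either contains the true mean or it does not; the 95% describes the long-run behaviour of the method over many samples.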


Interpreting the CI limits

Step 9 for story type 4: CIs for the difference between two proportions:
• If the CI contains 0 (i.e. one negative and one positive limit), there may be no difference between the two proportions.
• If both limits of the CI are positive, then p1 is higher/larger than p2.
• If both limits of the CI are negative, then p1 is lower/smaller than p2.

Examples:
(–.15, .09)
(.09, .15)
(–.15, –.09)

Practical significance versus Statistical significance

Statistical significance
• Relates to having evidence of the existence of an effect or difference.
• Determined by examining the P-value of your significance test.
To be statistically significant at the 5% level, the P-value must be greater than / less than 0.05 (5%).

Practical significance
• Depends on the size of the effect or difference.
• Determined by examining the confidence interval in relation to the research in question.

The link between the p-value and the confidence interval

Let's say we're interested in statistical significance at the 5% level. Then the "magic number" for the p-value is 0.05 and any p-value less than 0.05 is significant at the 5% level. Then we also know, before calculating the 95% confidence interval, that the hypothesised value will be inside / outside it. If our p-value is more than 0.05, it is not significant at the 5% level. Then we also know, before calculating the 95% CI, that the hypothesised value will be inside / outside it.

Let's say we're interested in statistical significance at the 1% level. Then the "magic number" for the p-value is 0.01 and any p-value less than 0.01 is significant at the 1% level. Then we also know, before calculating the 99% confidence interval, that the hypothesised value will be inside / outside it. If our p-value is more than 0.01, it is not significant at the 1% level. Then we also know, before calculating the 99% CI, that the hypothesised value will be inside / outside it.
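The link can be explored numerically. The sketch below uses invented data (a sample proportion of 0.12 from n = 500) and the single-proportion standard error from the Formulae Sheet for both the test and the interval; the function name and the hypothesised values tried are assumptions for illustration only.

```python
import math
from statistics import NormalDist


def p_value_and_ci(p_hat, n, hypothesised, level=0.95):
    """Two-sided P-value and confidence interval for a single proportion."""
    z = NormalDist()
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    t0 = (p_hat - hypothesised) / se
    p_val = 2 * z.cdf(-abs(t0))
    multiplier = z.inv_cdf(1 - (1 - level) / 2)
    return p_val, (p_hat - multiplier * se, p_hat + multiplier * se)


# Hypothetical sample: p_hat = 0.12 from n = 500 people.
for hyp in (0.08, 0.10, 0.15):
    p_val, (lower, upper) = p_value_and_ci(0.12, 500, hyp)
    significant = p_val < 0.05
    inside = lower <= hyp <= upper
    print(hyp, round(p_val, 4), significant, inside)
```

Comparing the last two columns of the output for each hypothesised value is a way to check your answers to the inside / outside statements above. The match is exact here because the test and the interval use the same standard error and the same distribution.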


Student's t-distribution
• Parameter: Degrees of Freedom (df).
• Smooth, symmetric, bell-shaped curve centred at 0 like the Standard Normal distribution [Z ~ Normal(µ = 0, σ = 1)] but it's more variable (it's more spread out).
• As df becomes larger, the Student(df) distribution becomes more and more like the Standard Normal distribution.
• Student's t-distribution (df = ∞) and Normal(0, 1) are the same distribution.
• Methods based on this distribution work very well even for small samples that are from very non-Normal distributions such as those samples which are not symmetric but skewed in some way, possibly quite severely.


Check you understand! – Practice Questions

1. Which one of the following statements about hypothesis testing is false?
(1) The larger the P-value, the stronger the evidence against the null hypothesis.
(2) The P-value is the probability that, if the null hypothesis were true, sampling variation would produce an estimate that is further away from the hypothesised value than our data estimate.
(3) We cannot establish a hypothesised value for a parameter, we can only determine whether there is evidence to reject a hypothesised value.
(4) H0 is typically a sceptical reaction to a research hypothesis.
(5) The P-value measures the strength of evidence against the null hypothesis.

2. Which one of the following statements about significance tests is false?
(1) Formal tests can help determine whether effects we see in our data may just be due to sampling error.
(2) The P-value associated with a two-sided alternative hypothesis is obtained by doubling the P-value associated with a one-sided alternative hypothesis.
(3) The P-value says nothing about the size of an effect.
(4) The data should be carefully examined in order to determine whether the alternative hypothesis needs to be one-sided or two-sided.
(5) A large P-value says the null hypothesis is believable based on the evidence (the data) presented.

3. Which one of the following statements about hypothesis testing is false?
(1) We make hypotheses about sample estimates.
(2) In the t-test, the null hypothesis, H0, always involves an "=" sign.
(3) We investigate whether a hypothesised value is plausible in light of our sample data.
(4) If we get a t-test statistic with a value of 3, we know the sample estimate is 3 standard errors above the hypothesised value.
(5) A large P-value does not imply that H0 is true.


Questions 4 to 6 refer to the following information.

Death Penalty Survey Results
"Should convicted murderers be put to death?"

              Australia    N.Z.
Yes           46%          42%
No            39%          41%
Can't Say     15%          17%
[Polls of 1307 Australians & 1010 New Zealanders]

4. Based on previous studies, a researcher believes that the proportion of New Zealanders who agree that convicted murderers should be put to death would be more than forty percent. The hypotheses for this test would be:
(1) H0: p = 0.40; H1: p ≠ 0.40
(2) H0: p < 0.40; H1: p = 0.40
(3) H0: p = 0.40; H1: p > 0.40
(4) H0: p < 0.40; H1: p   0.40
(5) H0: p < 0.40; H1: p > 0.40

5. Let's assume instead that the researcher tested H0: p = 0.40 versus H1: p ≠ 0.40 where p = the proportion of New Zealanders who agree that convicted murderers should be put to death. The value of the t-test statistic, t0, and the degrees of freedom, df, to be used with 95% confidence are given by:
(1) t0 = 1.288, df = ∞
(2) t0 = 1.288, df = 1009
(3) t0 = -1.288, df = ∞
(4) t0 = 1.288, df = 1.96
(5) t0 = -1.288, df = 1009

Another difference considered by the researcher was between the proportion of Australians supporting the death penalty for convicted murderers and the proportion of New Zealanders supporting the death penalty for convicted murderers.

6. To test for a difference in the two proportions given above the hypotheses would be:
(1) H0: μ1  μ2  0 vs H1: μ1  μ2  0
(2) H0: p̂1  p̂2  0 vs H1: p̂1  p̂2  0
(3) H0: p1  p2  0 vs H1: p1  p2  0
(4) H0: p̂1  p̂2  0 vs H1: p̂1  p̂2  0
(5) H0: p1  p2  0 vs H1: p1  p2  0


Question 7 refers to the following information.

Results from two independent polls of voting intentions, taken in July and August 1993, were reported in TIME (20 September, 1993). There were 950 voters sampled in each poll. The results were:

Preferred Party (%)
            Alliance    Labour    National    NZ First
July        18          36        38          -
August      12.5        38        36.5        11.5

There was some interest in whether the NZ First party had a bigger impact on the Labour or the National vote. To investigate this, a test was conducted of the null hypothesis that there was no change in pL – pN, the Labour-National difference, between the two polls. The resulting P-value was 0.76.

7. Which of the following statements is true?
(1) There is no evidence at the 5% level that NZ First has had an impact on support for either Labour or National.
(2) The probability that NZ First has had no impact on the difference between National and Labour is 0.76.
(3) A 95% confidence interval for the difference between National and Labour would not contain zero.
(4) There is no evidence at the 5% significance level that the difference between National and Labour has changed.
(5) The probability that NZ First has had an impact on the difference between National and Labour is 0.76.

8. Which one of the following statements is false?
(1) In hypothesis testing, statistical significance does not imply practical significance.
(2) In a hypothesis test for no difference between two proportions, a very small P-value indicates a very large difference in the proportions.
(3) In hypothesis testing, a non-significant test result does not imply that H0 is true.
(4) In hypothesis testing, large samples can lead to small P-values without the results having any practical significance.
(5) In a hypothesis test for no difference between two proportions, a two-sided test should be used when the idea of doing the test has been triggered as a result of looking at the data.


Questions 9 to 12 refer to the following information.

Four single-sex and two co-educational schools in Melbourne, Australia, were asked to participate in a recent study designed to examine adolescents' attitudes towards confidentiality in the school counselling situation. All six schools were private schools. Three of the single-sex schools agreed to take part; one of the single-sex schools and both of the co-educational schools declined to take part in the study.

The students were advised that participation was voluntary and anonymous, and that they were free to withdraw from the study at any time. Questionnaires were completed in school. Some results from the study are given in Table 4 below. It shows the percentage of students (aged 14–18 years) agreeing, disagreeing, or unsure as to whether the school counsellor should tell parents in situations of contraceptive use, and/or pregnancy. There were 221 male respondents and 174 female respondents.

                      Response
Situation         Agree %    Disagree %    Unsure %    Sample size
Contraception
  males           33         52            15          221
  females         13         79            8           174
Pregnancy
  males           41         43            16          221
  females         15         74            11          174

Table: Adolescents' Attitudes Towards Confidentiality

Let p_agree be the proportion of all Australian male secondary school students (aged 14–18 years) who agree that a counsellor should tell parents in situations of pregnancy and p_disagree be the corresponding proportion who disagree.

The results from the study are used to conduct a 2-tailed test for no difference between p_agree and p_disagree.

9. An estimate of the difference between p_agree and p_disagree is:
(1) −1.9
(2) −0.02
(3) −0.2
(4) −0.59
(5) −0.19


10. For the purpose of calculating se(p̂_agree − p̂_disagree), the sampling situation can be described as:
(1) one sample of size 395, several response categories.
(2) one sample of size 395, many yes/no items.
(3) two independent samples of sizes 221 and 174.
(4) one sample of size 221, several response categories.
(5) one sample of size 221, many yes/no items.

11. The expression for evaluating the test statistic for the null hypothesis, H0: p_agree − p_disagree = 0, is:

(1) $\dfrac{\hat{p}_{agree} - \hat{p}_{disagree}}{\mathrm{se}(\hat{p}_{agree} - \hat{p}_{disagree})}$

(2) $\dfrac{\hat{p}_{agree} - \hat{p}_{disagree}}{\mathrm{se}(\hat{p}_{agree})^2 - \mathrm{se}(\hat{p}_{disagree})^2}$

(3) $\dfrac{p_{agree} - p_{disagree}}{\mathrm{se}(\hat{p}_{agree}) + \mathrm{se}(\hat{p}_{disagree})}$

(4) $\dfrac{p_{agree} - p_{disagree}}{\mathrm{se}(\hat{p}_{agree} - \hat{p}_{disagree})}$

(5) $\dfrac{\hat{p}_{agree} - \hat{p}_{disagree}}{\mathrm{se}(\hat{p}_{agree}) + \mathrm{se}(\hat{p}_{disagree})}$

12. Let p_contra and p_preg be the proportions of Australian female students (aged 14–18 years) who disagree that a counsellor should tell parents in situations of contraceptive use, and pregnancy, respectively. Information from Table 4 is used to construct a 95% confidence interval for the difference p_contra − p_preg.
The formula for the standard error of the estimate, se(p̂_contra − p̂_preg), is:
(1)(2)(3)(4)(5)pˆ( 1 - pˆ ) pˆ ( 1 pˆ+)contra preg preg174174contra-22pˆcontrapˆpreg174-174( 1 - pˆcontra) + ( 1 - pˆpreg) - ( pˆcontra- pˆpreg)174(contra preg contra-pregpˆ+ pˆ22pˆcontrapˆpreg174+174) - (pˆ174pˆ)22


13. The general formula for a confidence interval for the difference between two proportions is:

p̂1 − p̂2 ± z × se(p̂1 − p̂2)

Which one of the following statements about confidence intervals for the difference between two proportions is false?
(1) The value of the z-multiplier depends on the confidence level.
(2) The confidence interval method for proportions works only if the sample size is sufficiently large.
(3) The confidence interval is centred on p̂1 − p̂2.
(4) The value of the z-multiplier depends on the sample size.
(5) The size of the standard error depends on the sampling situation.

Questions 14 to 18 refer to the following information.

A survey of 2171 men and 2412 women in Auckland in the early 1990s found that 10% of men abstained from drinking alcohol compared with 16% of women.

We wish to compare the proportion of female abstainers, p_female, with the proportion of male abstainers, p_male.

14. The sampling situation is best described as:
(1) two independent samples.
(2) one sample, several response categories.
(3) one sample, many yes/no items.
(4) two samples, several response categories.
(5) two samples, many yes/no items.

15. Based on the data, a 95% confidence interval for p_female − p_male is (0.041, 0.079). Which one of the following statements is false?
(1) Based on the data, a 99% confidence interval would be wider than 0.038.
(2) The point estimate of p_female − p_male is 0.06.
(3) We are confident that the proportion of female abstainers is larger than the proportion of male abstainers.
(4) Zero is a plausible value for p_female − p_male.
(5) Based on the data, a 95% confidence interval for p_male − p_female is (−0.079, −0.041).


16. Consider the P-value associated with a two-tailed test for no difference between $p_{\text{female}}$ and $p_{\text{male}}$. Based on the confidence interval in Question 15, which one of the following statements is true?
(1) The P-value is much less than 5%.
(2) The P-value is around 10%.
(3) We do not have enough information to determine the approximate P-value.
(4) The P-value is greater than 5%.
(5) The P-value is just below 5%.

Questions 17 and 18 refer to the following additional information.
Overall, 13% of the 4583 people surveyed abstained from alcohol. We are interested in $p_{\text{abstain}}$, the proportion of people who abstain from alcohol.
A t-test of the hypotheses:
$H_0: p_{\text{abstain}} = 0.1$
$H_1: p_{\text{abstain}} \neq 0.1$
gives a test statistic of 6.04 and a P-value of 0.000.

17. Which one of the following statements is false?
(1) The test is significant at the 1% level of significance.
(2) If the null hypothesis is true, it is extremely unlikely that sampling variability would give values further away from the hypothesised value, 0.1, than our sample estimate.
(3) The sample estimate, $\hat{p}_{\text{abstain}}$, is approximately 6 standard errors above the hypothesised value, 0.1.
(4) If the null hypothesis is true, sampling variability could never give values further away from the hypothesised value, 0.1, than our sample estimate.
(5) The hypothesised value, 0.1, would be outside a 99% confidence interval for $p_{\text{abstain}}$.

18. Which one of the following statements gives the best interpretation of the hypothesis test result?
(1) There is some evidence that the true population proportion, $p_{\text{abstain}}$, is not 0.1.
(2) There is very strong evidence that the true population proportion, $p_{\text{abstain}}$, is not 0.1.
(3) There is no evidence that the true population proportion, $p_{\text{abstain}}$, is not 0.1.
(4) There is very strong evidence that the true sample proportion, $\hat{p}_{\text{abstain}}$, is not 0.1.
(5) There is very strong evidence that the true population proportion, $p_{\text{abstain}}$, is 0.1.
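The test statistic of 6.04 quoted for Questions 17 and 18 can be reproduced as (estimate − hypothesised value) / standard error. This is a minimal sketch that assumes the standard error is computed from the sample proportion; the function name is hypothetical.

```python
from math import sqrt

def one_proportion_test_statistic(p_hat, n, p0):
    """Number of standard errors between the estimate and the hypothesised value."""
    # Assumption: the standard error uses the sample proportion p_hat.
    se = sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - p0) / se

# 13% of 4583 people abstained; the hypothesised value is 0.1.
t0 = one_proportion_test_statistic(0.13, 4583, 0.1)
print(round(t0, 2))  # 6.04
```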


Questions 19 to 21 refer to the following information.
In an April 1995 survey 655 New Zealanders were asked to choose their favourite advertisement. Listed below are the five most popular advertisements for that month together with the percentage of the sample that chose that advertisement.

Air NZ - the birds             9%
Telecom-Spot                   7%
Bluebird-chips/penguins        6%
Telecom-animals                5%
Anchor-family series           5%
Some other advertisement      31%
No favourite advertisement    37%

A two-sided t-test is conducted to see if there is any evidence of a difference between $p_{\text{birds}}$ and $p_{\text{animals}}$. Let $p_{\text{birds}}$ be the proportion who have "Air NZ - the birds" as their favourite and $p_{\text{animals}}$ be the proportion who have "Telecom - animals" as their favourite. (This is the t-test referred to in the questions below.)

19. Which one of the following gives the correct null and alternative hypotheses?
(1) $H_0: p_{\text{animals}} = .05$
    $H_1: p_{\text{birds}} > .05$
(2) $H_0: p_{\text{animals}}$ […] $p_{\text{birds}}$
    $H_1: p_{\text{animals}} > p_{\text{birds}}$
(3) $H_0: p_{\text{birds}} - p_{\text{animals}} = 0$
    $H_1: p_{\text{birds}} - p_{\text{animals}} \neq 0$
(4) $H_0: p_{\text{birds}} - p_{\text{animals}} = .04$
    $H_1: p_{\text{birds}} - p_{\text{animals}} \neq .04$
(5) $H_0: \hat{p}_{\text{birds}} - \hat{p}_{\text{animals}} = 0$
    $H_1: \hat{p}_{\text{birds}} - \hat{p}_{\text{animals}} \neq 0$

20. The expression used to calculate the t-test statistic, $t_0$, is:
(1) $-4 \,/\, \text{se}(\hat{p}_{\text{birds}} - \hat{p}_{\text{animals}})$
(2) $-0.04 \,/\, \text{se}(\hat{p}_{\text{birds}} - \hat{p}_{\text{animals}})$
(3) $0.04 \times \text{se}(\hat{p}_{\text{birds}} - \hat{p}_{\text{animals}})$
(4) $0.04 \,/\, \text{se}(\hat{p}_{\text{birds}} - \hat{p}_{\text{animals}})$
(5) $4 \,/\, \text{se}(\hat{p}_{\text{birds}} - \hat{p}_{\text{animals}})$


21. The P-value of the t-test is 0.006. Which one of the following statements is true?
(1) There is no evidence of a difference between $p_{\text{birds}}$ and $p_{\text{animals}}$.
(2) The test is significant at the 5% level.
(3) $\Pr(p_{\text{birds}} > p_{\text{animals}}) = 0.006$
(4) Since the P-value is so small the null hypothesis must be false.
(5) Since the P-value is so small the null hypothesis must be true.

22. Which one of the following statements about a confidence interval for a parameter p is false?
(1) A two-standard-error interval will always capture the true value of p.
(2) Large samples tend to yield narrower 95% confidence intervals than small samples.
(3) In the long run, if we repeatedly take samples and calculate a 95% confidence interval from each sample, we expect that 95% of the intervals will contain the true value of p.
(4) If I plan to do a study in the future in which I will take a random sample and calculate a 90% confidence interval, there is a 90% chance that I will catch the true value of p in my interval.
(5) If a large number of researchers independently perform studies to estimate p, about 95% of them will catch the true value of p in their 95% confidence intervals.

23. Which one of the following statements is true?
(1) A point estimate is preferred to a confidence interval because the interval summarises the uncertainty due to sampling variation.
(2) The standard error used to construct the interval will be identical for all samples of the same size.
(3) If I plan to do a study in the future in which I will take a random sample and calculate a 95% confidence interval, there is a 95% chance that I will catch the true value of p in my interval.
(4) The size of the t-multiplier depends on only the sample size and not the desired confidence level.
(5) The process of using a population parameter to construct an interval for the data estimate is an example of statistical inference.


Questions 24 to 28 refer to the following information.
The Washington Post, The Henry J Kaiser Family Foundation and Harvard University conducted a poll (8 March – 22 April, 2001) 'to gauge the racial attitudes of American adults'. The telephone poll surveyed 1709 adults including 779 whites, 323 African Americans, 315 Hispanics and 254 Asian Americans. Assume this sample of 1709 adults is a random sample of American adults. Two of the questions in the survey were:

Question 1: Do you feel that African Americans have more, less or about the same opportunities in life as whites have?
and
Question 2: Do you feel that Asian Americans have more, less or about the same opportunities in life as whites have?

The percentage results for these two questions are shown in Table 7 below.

                       Response
                       More %   Less %   Same %   Unsure %   Sample size
Question 1
  White                  13       27       58        2          779
  African American        1       74       23        2          323
  Hispanic                8       46       44        2          315
  Asian American         10       44       39        7          254
  Total Sample           11       35       51        2         1709
Question 2
  White                  13       14       70        4          779
  African American       15       38       39        8          323
  Hispanic               18       24       55        3          315
  Asian American          7       34       53        5          254
  Total Sample           14       18       63        4         1709

Table 7: Americans' responses to racial attitudes survey

Let $p_{\text{more}}$ be the proportion of whites who feel that African Americans have more opportunities in life than whites have and $p_{\text{less}}$ be the proportion of whites who feel that African Americans have less opportunities in life than whites have. (That is, whites' responses to Question 1.)

24. An estimate of the difference between $p_{\text{more}}$ and $p_{\text{less}}$ is:
(1) 0.018
(2) −0.01
(3) −0.14
(4) 0.01
(5) −0.10


25. Information from Table 7 above is used to construct a 95% confidence interval for the difference $p_{\text{more}} - p_{\text{less}}$. For the purpose of calculating $\text{se}(\hat{p}_{\text{more}} - \hat{p}_{\text{less}})$, the sampling situation can be described as:
(1) two independent samples of sizes 779 and 254.
(2) one sample of size 779, several response categories.
(3) one sample of size 1709, many yes/no items.
(4) one sample of size 1709, several response categories.
(5) one sample of size 779, many yes/no items.

26. A 95% confidence interval for the difference $p_{\text{more}} - p_{\text{less}}$ is (−0.1833, −0.09668). The best interpretation of this interval is:
With 95% confidence, the percentage of whites who feel that African Americans have more opportunities in life than whites have is somewhere between:
(1) 10% higher than and 18% lower than the percentage who feel that African Americans have less opportunities in life than whites have.
(2) 10% and 18%.
(3) 10% and 18% higher than the percentage who feel that African Americans have less opportunities in life than whites have.
(4) 10% lower than and 18% higher than the percentage who feel that African Americans have less opportunities in life than whites have.
(5) 10% and 18% lower than the percentage who feel that African Americans have less opportunities in life than whites have.

Questions 27 and 28 refer to the following additional information.
Let:
$p_{\text{question1}}$ be the proportion of Asian Americans who feel that African Americans have more opportunities in life than whites have
and
$p_{\text{question2}}$ be the proportion of Asian Americans who feel that Asian Americans have more opportunities in life than whites have.
Information from Table 7 is used to conduct a 2-tailed test for no difference between $p_{\text{question1}}$ and $p_{\text{question2}}$.

27. The expression for evaluating the test statistic for the null hypothesis, $H_0: p_{\text{question1}} - p_{\text{question2}} = 0$, is:
(1) $\dfrac{\hat{p}_{\text{question1}} - \hat{p}_{\text{question2}}}{\text{se}(\hat{p}_{\text{question1}}) + \text{se}(\hat{p}_{\text{question2}})}$
(2) $\dfrac{\hat{p}_{\text{question1}} - \hat{p}_{\text{question2}}}{\text{se}(\hat{p}_{\text{question1}})^2 - \text{se}(\hat{p}_{\text{question2}})^2}$
(3) $\dfrac{p_{\text{question1}} - p_{\text{question2}}}{\text{se}(\hat{p}_{\text{question1}}) + \text{se}(\hat{p}_{\text{question2}})}$
(4) $\dfrac{\hat{p}_{\text{question1}} - \hat{p}_{\text{question2}}}{\text{se}(\hat{p}_{\text{question1}} - \hat{p}_{\text{question2}})}$
(5) $\dfrac{p_{\text{question1}} - p_{\text{question2}}}{\text{se}(\hat{p}_{\text{question1}} - \hat{p}_{\text{question2}})}$
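The interval quoted in Question 26 can be reproduced from Table 7. The sketch below assumes the "one sample, several response categories" standard error, $\sqrt{(\hat{p}_1 + \hat{p}_2 - (\hat{p}_1 - \hat{p}_2)^2)/n}$, and a multiplier of 1.96; the function name is hypothetical.

```python
from math import sqrt

def ci_one_sample_categories(p1, p2, n, z=1.96):
    """CI for p1 - p2 when both proportions are categories of one question."""
    # Each respondent falls in at most one of the two categories,
    # so the two estimates are negatively correlated.
    se = sqrt((p1 + p2 - (p1 - p2) ** 2) / n)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Whites' responses to Question 1: 13% "more", 27% "less", n = 779.
lower, upper = ci_one_sample_categories(0.13, 0.27, 779)
print(round(lower, 4), round(upper, 4))  # -0.1833 -0.0967
```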


28. The formula for the standard error of the estimate, $\text{se}(\hat{p}_{\text{question1}} - \hat{p}_{\text{question2}})$, is:
(1) $\sqrt{\dfrac{(1 - \hat{p}_{\text{question1}}) + (1 - \hat{p}_{\text{question2}}) - (\hat{p}_{\text{question1}} - \hat{p}_{\text{question2}})^2}{254}}$
(2) $\sqrt{\dfrac{(\hat{p}_{\text{question1}} + \hat{p}_{\text{question2}}) - (\hat{p}_{\text{question1}} + \hat{p}_{\text{question2}})^2}{254}}$
(3) $\sqrt{\dfrac{\hat{p}_{\text{question1}}(1 - \hat{p}_{\text{question1}})}{254} + \dfrac{\hat{p}_{\text{question2}}(1 - \hat{p}_{\text{question2}})}{254}}$
(4) $\sqrt{\dfrac{\hat{p}_{\text{question1}}^2}{254} - \dfrac{\hat{p}_{\text{question2}}^2}{254}}$
(5) $\sqrt{\dfrac{(\hat{p}_{\text{question1}} + \hat{p}_{\text{question2}}) - (\hat{p}_{\text{question1}} - \hat{p}_{\text{question2}})^2}{254}}$

Questions 29 to 31 refer to the following information.
During 1999, students enrolled in stage one statistics at the University of Auckland were surveyed regarding their access to, and experience with, computers. The survey was included as a question in an assignment, and students were given marks for completing it (irrespective of the answers they gave). Staff administering the courses wished to use the results of this survey to draw conclusions about future stage one statistics students.
One question asked: 'At the start of the course, how would you describe your Excel experience?'. A total of 918 students answered this question. Each of the 918 answers was classified according to the response given by the student, and the stream the student attended. The results are given in the table below, where 107, 108 and 101 refer to the various streams.

                   Stream
Response        107    108    101    Total
None             15     36    102     153
Very Little      44     89    119     252
Some             74    150    200     424
Lots              9     29     51      89
Total           142    304    472     918

29. A statistical test is performed on the data for no difference between the proportion of 107 students who responded None and the proportion of 101 students who responded None. Which one of the following statements about this test is true?
(1) The degrees of freedom used depends on the number of 107 and 101 students in the sample.
(2) The Wilcoxon rank-sum test could be used.


(3) The test should be two-tailed.
(4) The test could only be used to show a difference existed in the sample proportions.
(5) An appropriate null hypothesis is that the difference between the proportion of 107 students who responded None and the proportion of 101 students who responded None, is not zero.

30. The standard error for the difference in the proportions tested in question 29 is:
(1) 0.033
(2) 0.022
(3) 0.023
(4) 0.032
(5) 0.184

31. The P-value for the statistical test mentioned in question 29 is 0.004. Which one of the following statements gives the best interpretation of this P-value?
(1) There is some evidence that the underlying proportions of 107 students and 101 students that would respond None are different.
(2) There is strong evidence that the sample proportions of 107 students and 101 students that responded None are different.
(3) There is strong evidence that the underlying proportions of 107 students and 101 students that would respond None are different.
(4) There is no evidence that the underlying proportions of 107 students and 101 students that would respond None are different.
(5) There is weak evidence that the underlying proportions of 107 students and 101 students that would respond None are different.

32. Which one of the following statements is false?
(1) In a t-test for no difference between two proportions, being able to demonstrate that the difference was of practical significance (importance) would almost always imply statistical significance.
(2) In hypothesis testing, large samples can lead to small P-values without the results having any practical significance (importance).
(3) In hypothesis testing, statistical significance does not imply practical significance (importance).
(4) In a hypothesis test for no difference between two proportions, a very small P-value always indicates a very large difference in the proportions.
(5) In hypothesis testing, a nonsignificant test result does not imply that the null hypothesis is true.


Questions 33 to 38 refer to the following information.
New Zealand drug survey results obtained in 1998 and 2001 were compared by the Alcohol and Public Health Research Unit (NZ). In 1998, the nationwide study contained 5,475 randomly selected respondents and in 2001 the study contained 5,504 randomly selected respondents.
In both surveys, drug users were asked if they had wanted rehabilitation for their drug use but had not received it. Respondents who felt they had not received the rehabilitation they needed were then asked the reasons why. Each person could give more than one reason and their responses are listed in Table 12.

                                                      1998        2001
Drug rehabilitation barriers                         n = 146     n = 166
Didn't know where to go                                33%         32%
Social pressure                                        28%          8%
Fear of what would happen on contacting service        23%         23%
No time/too busy                                       20%         19%
Services too expensive                                 17%         14%
Fear of losing friends                                 15%          9%
Fear of law/Police                                     14%         29%
No local services available                            11%         10%
Transport problems                                      7%          6%
Services weren't ongoing                                4%          1%
Others                                                 18%         24%

Table 12: Barriers said to have prevented drug users obtaining rehabilitation

33. Let $p_{\text{pressure}}$ be the proportion of people who felt social pressure prevented them receiving drug rehabilitation in 1998, and $p_{\text{friends}}$ the proportion of people who felt the fear of losing friends prevented them receiving drug rehabilitation in 1998. An estimate of the difference between $p_{\text{pressure}}$ and $p_{\text{friends}}$ is:
(1) 0.19
(2) 0.07
(3) 0.01
(4) 0.13
(5) 0.00

34. In order to calculate the standard error of the difference between $\hat{p}_{\text{pressure}}$ and $\hat{p}_{\text{friends}}$, the sampling situation can be described as:
(1) Two independent samples of size 146 and 166.
(2) One sample of size 146, many yes/no items.
(3) One sample of size 146, several response categories.
(4) One sample of size 166, several response categories.
(5) Two independent samples, each of size 146.

Questions 35 to 38 refer to the following additional information.
Let $p_{98}$ be the proportion of people who said social pressure prevented them receiving drug rehabilitation in 1998 and $p_{01}$ be the proportion of people who said social pressure prevented them receiving drug rehabilitation in 2001.


35. The correct standard error of the difference between $\hat{p}_{98}$ and $\hat{p}_{01}$ is:
(1) $\sqrt{\dfrac{\hat{p}_{98}(1 - \hat{p}_{98})}{146} - \dfrac{\hat{p}_{01}(1 - \hat{p}_{01})}{166}}$
(2) $\sqrt{\dfrac{\hat{p}_{98}(1 - \hat{p}_{98})}{146} + \dfrac{\hat{p}_{01}(1 - \hat{p}_{01})}{166}}$
(3) $\sqrt{\dfrac{(\hat{p}_{98} + \hat{p}_{01}) - (\hat{p}_{98} - \hat{p}_{01})^2}{146}}$
(4) $\sqrt{\dfrac{(\hat{p}_{98} + \hat{p}_{01}) - (\hat{p}_{98} + \hat{p}_{01})^2}{166}}$
(5) $\sqrt{\dfrac{(1 - \hat{p}_{98}) + (1 - \hat{p}_{01}) - (\hat{p}_{98} - \hat{p}_{01})^2}{146}}$

36. In a test for no difference between $p_{98}$ and $p_{01}$ (i.e. $H_0: p_{98} - p_{01} = 0$), the estimate of the difference is 0.2, the test statistic is 4.68 and the P-value is 0.000. Which one of the following statements is false?
(1) It is impossible for sampling variability alone to result in an estimate of a difference larger than 0.2.
(2) There is very strong evidence of a difference between $p_{98}$ and $p_{01}$.
(3) Social pressure was stated significantly less often as a reason for not receiving drug rehabilitation in 2001 compared to 1998.
(4) The P-value indicates that the t-test is significant at the 1% level of significance.
(5) The estimate of the difference is approximately 4.7 standard errors above the hypothesised difference.

37. The 95% confidence interval for the difference $p_{98} - p_{01}$ is (0.0972, 0.3028). The best interpretation of this interval is:
With 95% confidence, the proportion of drug users who felt social pressure prevented them receiving drug rehabilitation:
(1) fell by somewhere between 10 and 30 percentage points between 1998 and 2001.
(2) was somewhere between 10 percentage points higher and 30 percentage points lower in 2001 compared with 1998.
(3) increased by somewhere between 10 and 30 percentage points between 1998 and 2001.
(4) was 10 percentage points in 1998 and 30 percentage points in 2001.
(5) was somewhere between 10 percentage points lower and 30 percentage points higher in 2001 compared with 1998.

38. The formula for the t-test statistic for $H_0: p_{98} - p_{01} = 0$ is:
(1) $\dfrac{\hat{p}_{98} - \hat{p}_{01}}{\text{se}(\hat{p}_{98})^2 - \text{se}(\hat{p}_{01})^2}$
(2) $\dfrac{p_{98} - p_{01}}{\text{se}(\hat{p}_{98} - \hat{p}_{01})}$
(3) $\dfrac{\hat{p}_{98} - \hat{p}_{01}}{\text{se}(\hat{p}_{98} - \hat{p}_{01})}$
(4) $\dfrac{\hat{p}_{98} - \hat{p}_{01}}{\text{se}(\hat{p}_{98}) + \text{se}(\hat{p}_{01})}$
(5) $\dfrac{p_{98} - p_{01}}{\text{se}(\hat{p}_{98}) + \text{se}(\hat{p}_{01})}$


Questions 39 to 43 refer to the following information.
In 1999 the Marketing Department of the University of Auckland released results of a survey of New Zealand bank customers. A random sample of 761 customers was selected and these customers were asked a wide range of questions about banks and the services the banks provide. From the responses, measurements were made on many variables. Some of these variables were:

Bank: The main bank used by the customer: ANZ, BNZ, Westpac (for WestpacTrust), Other (all other banks)
Closeness: The customer's opinion of the closeness of their relationship with their main bank: Not Close, Quite Close, Very Close
Performance: The customer's opinion of the overall performance of their main bank: Poor/Fair, Good, Excellent

Two of the questions in the bank survey, each together with a table showing some of the percentage results, are given below.

Closeness: How close is the relationship you have with your main bank?

                   Response (Closeness)
Main Bank          Not Close %   Quite Close %   Very Close %   Sample size
ANZ                  57.3%          33.1%            9.6%           157
BNZ                  48.3%          40.8%           10.8%           120
Westpac              48.7%          39.1%           12.2%           230
Other                39.4%          44.9%           15.7%           254
Total Sample         47.3%          40.0%           12.6%           761

Table 10: Responses to closeness of relationship with main bank


Performance: How would you describe the overall level of performance of your main bank to date?

                   Response (Performance)
Main Bank          Poor/Fair %   Good %   Excellent %   Sample size
ANZ                  29.9%        52.9%      17.2%          157
BNZ                  32.5%        53.3%      14.2%          120
Westpac              24.3%        59.6%      16.1%          230
Other                13.8%        57.1%      29.1%          254
Total Sample         23.3%        56.4%      20.4%          761

Table 11: Responses to main bank's performance

Questions 39 to 41 refer to the following additional information.
Let:
$p_{\text{ANZ}}$ be the proportion of bank customers, with ANZ as their main bank, who would describe their relationship with ANZ as 'Not Close'
and
$p_{\text{Westpac}}$ be the proportion of bank customers, with WestpacTrust (Westpac) as their main bank, who would describe their relationship with Westpac as 'Not Close'.

39. From the information in Table 10, an estimate of the difference $p_{\text{ANZ}} - p_{\text{Westpac}}$ is:
(1) 0.86
(2) 0.086
(3) 0.54
(4) 0.054
(5) 0.004


40. A 95% confidence interval is constructed for the difference between $p_{\text{ANZ}}$ and $p_{\text{Westpac}}$. For the purpose of calculating $\text{se}(\hat{p}_{\text{ANZ}} - \hat{p}_{\text{Westpac}})$, the sampling situation can be described as:
(1) one sample of size 761, several response categories.
(2) one sample of size 387, several response categories.
(3) one sample of size 761, many yes/no items.
(4) two independent samples of sizes 157 and 230.
(5) one sample of size 387, many yes/no items.

41. A 95% confidence interval for the difference $p_{\text{ANZ}} - p_{\text{Westpac}}$ is (−0.014, 0.187).
The best interpretation of this interval is:
With 95% confidence, the percentage of bank customers, with ANZ as their main bank, who would describe their relationship with ANZ as 'Not Close' is somewhere between:
(1) 1.4% and 18.7% higher than the percentage of bank customers, with Westpac as their main bank, who would describe their relationship with Westpac as 'Not Close'.
(2) 1.4% lower and 18.7% higher than the percentage of bank customers, with Westpac as their main bank, who would describe their relationship with Westpac as 'Not Close'.
(3) 1.4% and 18.7% lower than the percentage of bank customers, with Westpac as their main bank, who would describe their relationship with Westpac as 'Not Close'.
(4) 1.4% higher and 18.7% lower than the percentage of bank customers, with Westpac as their main bank, who would describe their relationship with Westpac as 'Not Close'.
(5) −1.4% and 18.7%.

Questions 42 and 43 refer to the following additional information.
Consider only customers with ANZ as their main bank.
Let:
$p_{\text{Close}}$ be the proportion who would describe their relationship with ANZ as either 'Quite Close' or 'Very Close'


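and
$p_{\text{Perform}}$ be the proportion who would describe the ANZ performance as 'Good' or 'Excellent'.
Information from Tables 10 and 11 is used to conduct a two-tailed t-test for no difference between $p_{\text{Close}}$ and $p_{\text{Perform}}$.

42. The formula for the standard error of the estimate, $\text{se}(\hat{p}_{\text{Close}} - \hat{p}_{\text{Perform}})$, is:
(1) $\sqrt{\dfrac{\hat{p}_{\text{Close}}^2}{157} + \dfrac{\hat{p}_{\text{Perform}}^2}{157}}$
(2) $\sqrt{\dfrac{(\hat{p}_{\text{Close}} + \hat{p}_{\text{Perform}}) - (\hat{p}_{\text{Close}} - \hat{p}_{\text{Perform}})^2}{157}}$
(3) $\sqrt{\dfrac{(\hat{p}_{\text{Close}} + \hat{p}_{\text{Perform}}) - (\hat{p}_{\text{Close}} + \hat{p}_{\text{Perform}})^2}{157}}$
(4) $\sqrt{\dfrac{(1 - \hat{p}_{\text{Close}}) + (1 - \hat{p}_{\text{Perform}}) - (\hat{p}_{\text{Close}} - \hat{p}_{\text{Perform}})^2}{157}}$
(5) $\sqrt{\dfrac{\hat{p}_{\text{Close}}(1 - \hat{p}_{\text{Close}})}{157} + \dfrac{\hat{p}_{\text{Perform}}(1 - \hat{p}_{\text{Perform}})}{157}}$

43. The expression for evaluating the test statistic for the null hypothesis, $H_0: p_{\text{Close}} - p_{\text{Perform}} = 0$, is:
(1) $\dfrac{p_{\text{Close}} - p_{\text{Perform}}}{\text{se}(\hat{p}_{\text{Close}}) + \text{se}(\hat{p}_{\text{Perform}})}$
(2) $\dfrac{\hat{p}_{\text{Close}} - \hat{p}_{\text{Perform}}}{\text{se}(\hat{p}_{\text{Close}} - \hat{p}_{\text{Perform}})}$
(3) $\dfrac{\hat{p}_{\text{Close}} - \hat{p}_{\text{Perform}}}{\text{se}(\hat{p}_{\text{Close}})^2 - \text{se}(\hat{p}_{\text{Perform}})^2}$
(4) $\dfrac{p_{\text{Close}} - p_{\text{Perform}}}{\text{se}(\hat{p}_{\text{Close}} - \hat{p}_{\text{Perform}})}$
(5) $\dfrac{\hat{p}_{\text{Close}} - \hat{p}_{\text{Perform}}}{\text{se}(\hat{p}_{\text{Close}}) + \text{se}(\hat{p}_{\text{Perform}})}$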
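When two proportions come from different yes/no items answered by the same people, the overlap between the two "yes" groups is unknown. The sketch below assumes the conservative standard error that uses whichever of $\hat{p}_1 + \hat{p}_2$ and $(1 - \hat{p}_1) + (1 - \hat{p}_2)$ is smaller, which is the pattern the formula options above appear to draw on. The function name is hypothetical, and the ANZ percentages are read from the closeness and performance tables.

```python
from math import sqrt

def se_many_yes_no(p1, p2, n):
    """Conservative se of p1_hat - p2_hat: one sample, two yes/no items."""
    # Assumption: use the smaller of (p1 + p2) and (q1 + q2), where q = 1 - p.
    smaller_sum = min(p1 + p2, (1 - p1) + (1 - p2))
    return sqrt((smaller_sum - (p1 - p2) ** 2) / n)

# ANZ customers (n = 157):
p_close = 0.331 + 0.096     # 'Quite Close' or 'Very Close'
p_perform = 0.529 + 0.172   # 'Good' or 'Excellent'
se = se_many_yes_no(p_close, p_perform, 157)
print(round(se, 4))  # 0.0712
```

Here the two proportions sum to more than 1, so the $(1 - \hat{p})$ form is the smaller one and is the one used.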


Questions 44 to 46 refer to the following information.
In 2008, as part of the first World Internet Project New Zealand survey, the Institute of Culture, Discourse and Communication published The Internet in New Zealand 2007 Final Report.
A random sample of 1430 people aged 12 and over were surveyed via telephone and were asked many questions in order to ascertain New Zealanders' usage of and attitudes towards the Internet in 2007.
One of the questions asked:
'How important is the Internet in your daily life: important, neutral, not important?'
The respondents were also categorised by ethnicity. Table 2 shows the responses to this question.

              Importance of the Internet
Ethnicity     Important   Neutral   Not important   Total
Pakeha           476        137         302           915
Maori             46         25          44           115
Pasifika          50         26          10            86
Asian            130         16          11           157
Other             77         33          47           157
Total            779        237         414          1430

Table 2: Importance of the Internet in daily life

Assume these 1430 respondents form a random sample from the population of all New Zealanders aged 12 and over.
Let:
$p_{\text{Pasifika}}$ be the true proportion of Pasifika New Zealanders aged 12 and over who think that the Internet is important in their daily life
and
$p_{\text{Pakeha}}$ be the true proportion of Pakeha New Zealanders aged 12 and over who think that the Internet is important in their daily life.


44. An estimate of $p_{\text{Pasifika}} - p_{\text{Pakeha}}$, to 2 decimal places, is:
(1) 0.30
(2) 0.06
(3) −0.30
(4) −0.58
(5) −0.06

45. The sampling situation associated with $\text{se}(\hat{p}_{\text{Pasifika}} - \hat{p}_{\text{Pakeha}})$ is best described as:
(1) one single sample of size 779, several response categories.
(2) two independent samples, of sizes 86 and 915.
(3) one single sample of size 779, many yes/no items.
(4) one single sample of size 1001, several response categories.
(5) two independent samples, of sizes 50 and 476.

46. A 95% confidence interval for $p_{\text{Pasifika}} - p_{\text{Pakeha}}$ is (−0.0480, 0.1704). Which one of the following statements is true?
(1) It would be surprising to see a different sample of 1430 New Zealanders aged 12 and over produce a poll result with $\hat{p}_{\text{Pasifika}}$ smaller than $\hat{p}_{\text{Pakeha}}$.
(2) The difference $\hat{p}_{\text{Pasifika}} - \hat{p}_{\text{Pakeha}}$ is significant at the 5% level of significance.
(3) The estimated difference $p_{\text{Pasifika}} - p_{\text{Pakeha}}$ could just be sampling error.
(4) The margin of error for this 95% confidence interval for $p_{\text{Pasifika}} - p_{\text{Pakeha}}$ is approximately 22%.
(5) Since $\hat{p}_{\text{Pasifika}}$ is bigger than $\hat{p}_{\text{Pakeha}}$, we can claim that $p_{\text{Pasifika}}$ is bigger than $p_{\text{Pakeha}}$.

ANSWERS
 1. (1)    2. (4)    3. (1)    4. (3)    5. (1)    6. (3)
 7. (4)    8. (2)    9. (2)   10. (4)   11. (1)   12. (3)
13. (4)   14. (1)   15. (4)   16. (1)   17. (4)   18. (2)
19. (3)   20. (4)   21. (2)   22. (1)   23. (3)   24. (3)
25. (2)   26. (5)   27. (4)   28. (5)   29. (3)   30. (4)
31. (3)   32. (4)   33. (4)   34. (2)   35. (2)   36. (1)
37. (1)   38. (3)   39. (2)   40. (4)   41. (2)   42. (4)
43. (2)   44. (2)   45. (2)   46. (3)




Finding a P-value using Excel – Calculating t Probabilities

Example: Find the P-value when the t-test statistic, $t_0$, = –1.25 and the degrees of freedom, df, = ∞ using Excel 2010:

1. Click in cell A1.
2. Click the Insert Function button beside the formula bar.
3. In the Insert Function dialog box (Figure 1):
   a. Choose Statistical from the Or select a category menu.
   b. Choose T.DIST.2T from the Select a function list and click OK.
4. In the Function Arguments dialog box (Figure 2):
   a. Enter the t-test statistic in the X field. Note: If your t-test statistic is negative, DON'T type the negative sign.
   b. Type in the degrees of freedom (use either n – 1 or a large number such as 99999 for ∞/infinity).
   c. Click OK. (The value of 0.211302 should appear in cell A1.)
   d. Round the P-value to 3 or 4 decimal places by using the Decrease Decimal button from the Number section of the Home tab.
   e. If your test is one-sided and the test statistic lies in the direction of the alternative hypothesis, you will need to halve your P-value by clicking in cell A2, typing =A1/2 and pressing Enter.
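The same P-value can be cross-checked outside Excel. The sketch below is an illustration, not part of the workshop method: it assumes df = ∞, so the standard Normal distribution from Python's standard library stands in for the t-distribution, and the function name is hypothetical.

```python
from statistics import NormalDist

def two_tailed_p_value(t0):
    """Two-tailed P-value for a test statistic, assuming df = infinity."""
    # With infinite degrees of freedom the t-distribution is the standard Normal.
    return 2 * (1 - NormalDist().cdf(abs(t0)))

p = two_tailed_p_value(-1.25)
print(round(p, 4))  # 0.2113
```

Excel's 0.211302 differs slightly in the sixth decimal place because it was computed with a large but finite number of degrees of freedom.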


Finding a t-multiplier using Excel – Calculating the Inverse of the Student t-distribution

Example: Find the t-multiplier for a 95% confidence interval for a single proportion using Excel 2010. It is a proportion scenario so degrees of freedom, df = ∞.

1. Click on cell A1.
2. Click the Insert Function button beside the formula bar.
3. Choose Statistical from the Or select a category box in the Insert Function dialog box.
4. Choose T.INV.2T from the Select a function box (Figure 3) and click OK.
5. Fill in the T.INV.2T dialog box (Figure 4): for a 95% confidence interval, enter 0.05 (that is, 1 – 0.95) in the Probability field and a large number in the Deg_freedom field.
6. Click OK. (The value 1.960201264 should appear in cell A1.)
7. Round the t-multiplier to 3 or 4 decimal places by using the Decrease Decimal button from the Number section of the Home tab.
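The multiplier can also be cross-checked outside Excel. This sketch is an illustration only: it assumes df = ∞, so the standard Normal distribution from Python's standard library replaces the t-distribution, and the function name is hypothetical.

```python
from statistics import NormalDist

def z_multiplier(confidence_level):
    """Multiplier for a two-sided confidence interval, assuming df = infinity."""
    # Split the leftover probability equally between the two tails.
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

z = z_multiplier(0.95)
print(round(z, 3))  # 1.96
```

A higher confidence level gives a larger multiplier (for example, `z_multiplier(0.99)` is larger than `z_multiplier(0.95)`), while the sample size does not enter the calculation at all.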
