MATH 227 STATISTICS FINAL EXAM - West Los Angeles College
<strong>MATH</strong> <strong>227</strong> <strong>STATISTICS</strong><br />
<strong>FINAL</strong> <strong>EXAM</strong><br />
M227 Spring 2007<br />
NAME:______________<br />
Thursday, June 7, 2007 11:30AM-1:30PM<br />
Instructor: M. Robertson<br />
Directions: Use the SCANTRON form for all problems 1-100. Show all of your work in the test<br />
booklet. You may use your calculator, but indicate what you did. Good luck! You have 2 hours.<br />
1-45) TRUE(A) or FALSE(B) 1 point each<br />
1. For any two events A and B, P(A ∪ B) = P(A) + P(B) − P(A ∩ B).<br />
2. If two events A and B are mutually exclusive, then P(A ∩ B) = P(A) × P(B).<br />
3. Zip codes are an example of numerical (not categorical) data.<br />
4. The Chi-Squared χ² test statistic is used to perform a hypothesis test on σ² (or on σ).<br />
5. For a data set having a positively skewed unimodal histogram, the median will be to the right<br />
of the mean.<br />
6. In a sample of size 25, the median is the average of the 12th and 13th largest values.<br />
7. The variable χ² is normally distributed.<br />
8. For any continuous probability distribution, P(x = c) = 0 for all values of c.<br />
9. For a standard normal distribution, and for some point a on the number line,<br />
P(z < −a) = P(z > a).<br />
10. If the null hypothesis is not rejected, then there is strong statistical evidence that the null<br />
hypothesis is absolutely true.<br />
11. In the multiple regression model y = β₀ + β₁x₁ + β₂x₂, β₂ can be interpreted as the amount<br />
y will be expected to change when the value of the predictor variable x₂ is increased by one unit,<br />
as long as the predictor variable x₁ is held constant.<br />
12. The large sample z test for µ₁ − µ₂ can be used as long as at least one of the two sample<br />
sizes, n₁ and n₂, is greater than 30.<br />
13. The p-value, or observed significance level, represents the probability, assuming H₀ is false,<br />
of obtaining a value of the test statistic at least as contradictory to H₀ as what actually resulted.<br />
14. If you toss a “fair” coin 100 times, you will observe exactly 50 heads.<br />
15. If you toss a “fair” coin 100 times, the expected number of heads will be 50.<br />
16. x̄, s² and σ are sample statistics.<br />
17. Regarding the regression equation y = β₀ + β₁x, the sample regression coefficient β̂₁ is the<br />
unbiased estimator for population regression coefficient β₀.<br />
18. The standard deviation σ_x̄ of the sampling distribution of x̄ increases as n increases.<br />
19. A standard normal distribution has µ = 1 and σ = 0.<br />
20. The mean and the standard deviation of the sampling distribution of x̄ are µ and σ/√n,<br />
respectively.<br />
21. The mean of a numerical random variable must always equal one of the possible values that<br />
the variable can take on.<br />
22. The 50th percentile of a normal distribution is equal to the mean of the distribution.<br />
23. In simple linear regression, the model utility test has as its null hypothesis H₀: β₁ = 0.<br />
24. In simple linear regression, if the null hypothesis is rejected in a model utility test, there is a<br />
useful linear relationship between x and y , so that values of x may help predict y .<br />
25. In the simple linear regression model y = β₀ + β₁x, β₁ can be interpreted as the amount<br />
y will be expected to change when the value of the predictor variable x is increased by one unit.<br />
26. When a scatterplot is used to graph a bivariate data set, the variable plotted on the y-axis is<br />
often called the response variable while the variable plotted on the x-axis is called the predictor<br />
(explanatory) variable.<br />
27. In Hypothesis Testing, a small p-value indicates that the observed sample results are<br />
inconsistent with the null hypothesis.<br />
28. The null hypothesis should be rejected when the p-value is larger than the significance level<br />
of the test.<br />
29. In a simple linear regression model y = β₀ + β₁x, β₀ represents the probability of a type II error<br />
when performing a hypothesis test for β₁.<br />
30. The Chi-Squared χ² test statistic is used to test independence between categorical data sets.<br />
31. In testing the utility of a simple linear regression model, the test statistic is a t-ratio.<br />
32. In categorical data analysis, a small value of the observed test statistic χ² indicates that the<br />
observed cell counts are reasonably similar to those expected when H₀ (categories are<br />
independent) is true.<br />
33. Categorical data used in a test for independence is often summarized in a two-way<br />
contingency table.<br />
34. The expected cell count for the row 1 and column 1 entry in a contingency table is equal to<br />
the product of the row 1 and column 1 “marginal” totals.<br />
35. Two outcomes are independent if the chance that one outcome occurs is unaffected by<br />
knowledge of whether or not the other occurred.<br />
36. Binomial and Poisson random variables are all examples of discrete random variables.<br />
37. A discrete numerical variable is one whose possible values form an interval along the number<br />
line.<br />
38. The mean is the middle value of an ordered data set.<br />
39. The Empirical Rule can only be used when the histogram of the data set can be closely<br />
approximated by a normal curve.<br />
40. s² is called the sample standard deviation.<br />
41. A z-score tells how many standard deviations a value is from the mean.<br />
42. If the Pearson’s correlation coefficient r between x and y is 0, then there is no relationship<br />
between x and y.<br />
43. A Normal curve is a skewed curve that can be used to approximate most histograms.<br />
44. A value of Pearson’s correlation coefficient r that is close to 1 indicates that there is a strong<br />
linear relationship between two variables.<br />
45. The Poisson distribution is used to determine the probability of a given number of successes<br />
in a fixed period of time.<br />
46-100) Multiple Choice Questions 2 points each<br />
46. A manufacturer of cellular phones has decided that an assembly line is operating satisfactorily<br />
if less than 3% of the phones produced per day are defective. To check the quality of a day's<br />
production, the company decides to randomly sample 30 phones from a day's production to test<br />
for defects. Define the population of interest to the manufacturer.<br />
a. All the phones produced during the day in question.<br />
b. The 30 phones sampled and tested.<br />
c. The 30 responses: defective or not defective.<br />
d. The 3% of the phones that are defective.<br />
47. For a random variable z which has a standard normal distribution, P(z < 2.10) =<br />
a) .4821 b) .0179 c) .9821 d) none of these<br />
48. Suppose the random variable X has a normal distribution with mean 9.0 and variance 49.<br />
The probability that X takes on a value of at least 18 is approximately equal to<br />
a) .0985 b) .4015 c) .9015 d) correct approx. answer not given<br />
49. Could the population be normally distributed?<br />
a. No, since the relationship between the expected z-values and the observed values is not linear.<br />
b. Yes, since the relationship between the expected z-values and the observed values is<br />
approximately linear.<br />
c. No, since the relationship between the expected z-values and the observed values is<br />
approximately linear.<br />
d. Yes, since the relationship between the expected z-values and the observed values is not linear.<br />
50. If the slope of the regression line is negative and the coefficient of determination is .64, then<br />
the correlation coefficient is<br />
a. .64<br />
b. .8<br />
c. -.64<br />
d. -.8<br />
51. Chebychev's Rule states that the proportion of observations that are within 3 standard<br />
deviations of the mean is at least<br />
a. 1/3<br />
b. 2/3<br />
c. 1/9<br />
d. 8/9<br />
e. correct answer not shown<br />
52. The percentage of points falling below the 75th percentile is<br />
a. 25%<br />
b. 75%<br />
c. can’t say<br />
53. Suppose a 95% confidence interval is computed for µ resulting in the interval (112.4, 121.6).<br />
Then it would be accurate to say<br />
a. 95% of the time, µ falls within the interval (112.4, 121.6).<br />
b. there is a 95% chance that µ will fall within the interval (112.4, 121.6).<br />
c. When using this method, 95% of all the possible samples will produce intervals that do<br />
capture µ .<br />
d. 95% of all the possible values for µ fall within the interval (112.4, 121.6).<br />
54. Which of the following topics did we spend the least amount of time studying in class?<br />
a. Box-and-Whisker Plots<br />
b. Normal Distributions<br />
c. Confidence Intervals<br />
d. Hypothesis Testing<br />
55. The diameter of ball bearings produced in a manufacturing process can be explained using a<br />
uniform distribution over the interval 3.5 to 5.5 millimeters. What is the probability that a<br />
randomly selected ball bearing has a diameter greater than 4 millimeters? Hint: Draw the<br />
distribution, shade appropriate area.<br />
a. .7272<br />
b. .4444<br />
c. .75<br />
d. .50<br />
56. For air travelers, one of the biggest complaints is of the waiting time between when the<br />
airplane taxis away from the terminal until the flight takes off. This waiting time is known to have<br />
a skewed right distribution with a mean of 10 minutes and a standard deviation of 8 minutes.<br />
Suppose 100 flights have randomly been sampled. Describe the sampling distribution of the mean<br />
waiting time between when the airplane taxis away from the terminal until the flight takes off for<br />
these 100 flights.<br />
a. Distribution skewed right, Mean = 10 minutes, standard deviation = .8 minutes<br />
b. Distribution skewed right, Mean = 10 minutes, standard deviation = 8 minutes<br />
c. Distribution normal, Mean = 10 minutes, standard deviation = 8 minutes<br />
d. Distribution normal, Mean = 10 minutes, standard deviation = .8 minutes<br />
57. Which of the following statements about the sampling distribution of x̄ is incorrect?<br />
a. The sampling distribution is approximately normal whenever the sample size is sufficiently<br />
large (n > 30).<br />
b. The sampling distribution is generated by repeatedly taking samples of size n and<br />
computing the sample means.<br />
c. The mean of the sampling distribution is µₓ.<br />
d. The standard deviation of the sampling distribution is σₓ.<br />
58. The average score of all pro golfers for a particular course has a mean of 70 and a standard<br />
deviation of 3.0. Suppose 36 golfers played the course today. Find the probability that the<br />
average score of the 36 golfers exceeded 71. Hint: This is P(x̄ > 71).<br />
a. .1293<br />
b. .4772<br />
c. .3707<br />
d. .0228<br />
59. If we repeatedly sample from a population samples of size n, and calculate the sample<br />
median for each sample, the accumulation of these sample medians would result in<br />
a. a confidence interval for the population variation<br />
b. the sampling distribution for the population mean<br />
c. an estimate of the sample median<br />
d. the sampling distribution of the sample median<br />
60. In the construction of confidence intervals, if all other quantities are unchanged, an increase in<br />
the sample size will lead to a _________ interval.<br />
a. narrower<br />
b. wider<br />
c. less significant<br />
d. biased<br />
61. It is desired to estimate which of the two soft drinks, Coke or Pepsi, West Los Angeles<br />
College students prefer. A random sample of 167 students produced the following 95%<br />
confidence interval for the proportion of students who prefer Pepsi: (.344, .494). Identify the<br />
point estimate for estimating the true proportion of West Los Angeles College students who<br />
prefer Pepsi.<br />
a. .494<br />
b. 1.96<br />
c. 0.95<br />
d. .344<br />
e. .419<br />
62. Which of the calculated values of a test statistic would have the smallest p-value for an<br />
upper-tailed test?<br />
a. t = 3.05 with 10 degrees of freedom<br />
b. t = 3.05 with 20 degrees of freedom<br />
c. z = 3.05<br />
d. z = 1.46<br />
63. In a two-tailed large sample test with calculated test statistic z = 1.68, the p-value is<br />
a. .0930<br />
b. .0465<br />
c. .9170<br />
d. .9535<br />
64. The probability that a normal variable X falls within 2 standard deviations of the mean is<br />
a. .0228<br />
b. .9772<br />
c. .0456<br />
d. .9544<br />
65. The Central Limit Theorem predicts that<br />
a. the sampling distribution of µ_x̄ will be approximately normal for reasonably large samples.<br />
b. the sampling distribution of x̄ will be approximately normal for reasonably large samples.<br />
c. the mean of the sampling distribution of x̄ will tend to be close to µ for reasonably large<br />
samples.<br />
d. the mean of the sampling distribution of µ_x̄ will tend to be close to µ for reasonably large<br />
samples.<br />
For problems 66 through 69, determine the form of the test statistic in each<br />
hypothesis test situation. Assume the correct answer corresponds to classroom<br />
discussion.<br />
66. The mean GPA of West Los Angeles College football players is less than 2.3. Assume n = 15<br />
and GPAs are normally distributed.<br />
a) z (standard Normal) b) t c) χ² d) F e) λ<br />
67. The heart rates of second-born identical twins are different from the heart rates of first-born<br />
identical twins. The sample consists of 90 pairs of twins.<br />
a) z (standard Normal) b) t c) χ² d) F e) λ<br />
68. The standard deviation for the volume of Odwalla soda is greater than 0.03 ounces.<br />
a) z (standard Normal) b) t c) χ² d) F e) λ<br />
69. Whether occupation is independent of religious belief, using a 2-way contingency table.<br />
a) z (standard Normal) b) t c) χ² d) F e) λ<br />
For problems 70 through 80, consider the following data set which consists of<br />
measurements of the daily emission of sulfur oxides (in tons) for an industrial plant.<br />
{ 15.8, 18.7, 6.2, 17.5, 11.0, 19.0, 26.4, 13.9, 14.7 }<br />
The results of some of the calculations for this data set are presented below. You<br />
may use these results to answer the following questions.<br />
∑x = 143.20 ∑x² = 2532.3<br />
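The two summary values can be checked directly from the listed measurements. The following is a minimal Python sketch, not part of the original exam; the variable names are illustrative assumptions.

```python
# Daily sulfur oxide emissions (tons), copied from the data set above.
emissions = [15.8, 18.7, 6.2, 17.5, 11.0, 19.0, 26.4, 13.9, 14.7]

# Sum of the observations and sum of the squared observations.
sum_x = sum(emissions)
sum_x_squared = sum(x * x for x in emissions)

print(f"sum of x   = {sum_x:.2f}")
print(f"sum of x^2 = {sum_x_squared:.1f}")
```

Rounded to the precision used on the exam, the printed values should agree with the two sums given above.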
70. What is n, the size of the sample?<br />
a) 9 b) 10 c) 8 d) 143.20 e) can’t say<br />
71. What is x̄, the sample mean?<br />
a) 17.9 b) 15.91 c) 143.20 d) can’t say<br />
72. x̄, the sample mean, and s, the sample standard deviation, are examples of<br />
a) statistics b) parameters c) neither of these<br />
73. What is µ, the population mean?<br />
a) 17.9 b) 15.91 c) 143.20 d) can’t say<br />
74. x̄ is a ________ estimator of µ.<br />
a) biased b) interval c) unbiased d) point e) both c and d<br />
75. What is s², the sample variance?<br />
a) 253.83 b) 28.20 c) 31.73 d) 5.63 e) 5.31<br />
76. What is s, the sample standard deviation?<br />
a) 253.83 b) 28.20 c) 31.73 d) 5.63 e) 5.31<br />
77. Find the 50th percentile of the above data set. What is another name for this value?<br />
a) 15.8, mode b) 15.8, median c) 20.2, range d) can’t tell<br />
78. Considering this sample of data provided, would you describe this data set as skewed? If so,<br />
why? If not, why not?<br />
a) no, since x̄ = P₅₀<br />
b) yes, skewed left, since x̄ < P₅₀<br />
c) yes, skewed right, since x̄ < P₅₀<br />
d) yes, skewed right, since x̄ > P₅₀<br />
e) none of the above<br />
79. Which formula should be used here to find a confidence interval for the population<br />
mean µₓ?<br />
a) x̄ ± z_c · σₓ/√n b) x̄ ± z_c · s/√n c) x̄ ± t_c · s/√n d) none of these<br />
80. Construct the 95% confidence interval for the population mean µₓ.<br />
a) 15.91 ± 4.33 b) 15.91 ± 3.68 c) 15.91 ± 1.44 d) 15.91 ± 4.245 e) 15.91 ± 4.49<br />
81. In the formula for s², the denominator is n − 1 rather than n because<br />
a) dividing by n underestimates σₓ²<br />
b) dividing by n overestimates σₓ²<br />
c) it becomes an unbiased estimator of σₓ²<br />
d) a and c<br />
e) b and c<br />
82. Which of the following is a measure of relative standing?<br />
a) standard deviation b) range c) z-score d) t-distribution<br />
83. If the distribution of the random variable is _______, it is a continuous random variable.<br />
a) binomial b) normal c) t d) a or b e) b or c<br />
84. If X is a continuous random variable with mean µₓ = 10.0, is the following statement true?<br />
Explain why or why not.<br />
P(X = 10.0) = .5<br />
a) yes, the probability of the mean is always .5.<br />
b) yes, according to the Central Limit Theorem<br />
c) no, the probability should be 0, because the “area” of a line is 0.<br />
d) no, since we don’t know n, we can’t compute the binomial coefficient (n choose 10)<br />
For problems 85 through 90, consider the following situation - A coin is tossed 10 times.<br />
Suppose that the coin is not fair, and that for each toss of the coin, the probability of getting<br />
a HEAD is 0.6. Let X = the number of heads observed in the 10 tosses.<br />
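The situation above can be explored numerically. The sketch below is not part of the original exam; it assumes the tosses are independent, so that X follows the binomial probability formula P(X = k) = C(n, k) p^k (1 − p)^(n − k), and the helper name `binomial_pmf` is an illustrative choice.

```python
import math

n, p = 10, 0.6  # number of tosses and P(HEAD) as stated in the problem

def binomial_pmf(k: int) -> float:
    """Probability of exactly k heads in n independent tosses."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Full probability distribution of X over its possible values 0..n.
pmf = [binomial_pmf(k) for k in range(n + 1)]

# Mean and variance computed from the definition, sum of x * p(x), etc.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(f"P(X = 5) = {pmf[5]:.3f}")
print(f"mean = {mean:.2f}, standard deviation = {math.sqrt(variance):.2f}")
```

Computing the mean and variance from the definition, rather than from np and np(1 − p), gives an independent check on those shortcut formulas.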
85. Is the random variable X discrete or continuous?<br />
a) discrete b) continuous<br />
86. What possible values does X take on?<br />
a) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10<br />
b) 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10<br />
c) All real numbers in the interval [0, 10]<br />
d) .6 or .4 only<br />
87. What type of distribution does the random variable X have?<br />
a) binomial b) uniform c) normal d) a and b e) b and c<br />
88. Find P(X = 5).<br />
a) .367 b) .201 c) 0 d) 1 e) 0.6<br />
89. Find the mean of the random variable X. Show the best formula that you should use.<br />
a) µₓ = npq, so µₓ = 2.4<br />
b) µₓ = np, so µₓ = 6.0<br />
c) µₓ = ∑ x p(x), so µₓ = 2.4<br />
d) µₓ = ∑ x p(x), so µₓ = 6.0<br />
e) a and c are both correct<br />
90. Find the standard deviation of the random variable X. Show the formula that you used.<br />
a) σₓ = √(np(1 − p)) = 1.55<br />
b) σₓ = np(1 − p) = 2.4<br />
c) s = √(∑(x − x̄)² / (n − 1)) = 5.5<br />
d) none of these<br />
For problems 91 through 94, consider the following situation -<br />
An economist claims that, in California, the average Hispanic-owned business<br />
generates revenues of less than $70,000 annually. A random survey of 10 Hispanic-owned<br />
business firms revealed a sample mean annual revenue of $58,600 with a<br />
sample standard deviation of $18,000. Assume a normal population. We wish to test<br />
the economist's claim, at a 0.10 significance level. Is there evidence that the<br />
economist is correct?<br />
91. The alternative hypothesis that is appropriate here is<br />
a) µ > 70000 b) µ > 70000 c) µ < 70000 d) µ = 70000<br />
92. The distributional form of the correct test statistic to use in this situation is<br />
a) z (standard Normal) b) t c) χ² d) F<br />
93. The decision for this hypothesis test will imply that<br />
a) there is evidence that the economist is correct<br />
b) there is no evidence that the economist is correct<br />
c) correct answer not given<br />
94. The p-value for the observed value of the test statistic is<br />
a) greater than .1<br />
b) between .05 and .1<br />
c) less than .05<br />
d) none of the choices provided are correct.<br />
For problems 95 through 97, consider the following MINITAB output for analysis of a<br />
bivariate quantitative data set. Note here that the variable Hours is being used to predict<br />
the variable Wear.<br />
Regression Analysis<br />
The regression equation is<br />
Wear = 0.0028 + 0.00657 Hours<br />
Predictor Coef Stdev t-ratio p<br />
Hours 0.0065723 0.0001460 45.02 0.000<br />
R-sq = 99.1%<br />
95. For this analysis, the coefficient of determination is<br />
a) .991 b) .995 c) .00657 d) .0028 e) none of these<br />
96. For this bivariate data set, the approximate value of the correlation coefficient is<br />
a) .991 b) .995 c) .00657 d) -0.991 e) none of these<br />
97. The output indicates that Hours is a useful linear predictor of Wear (using a .05 level of<br />
significance).<br />
a) true b) false c) can't tell with the output provided<br />
_________________________________________________________________________<br />
98. The types of discrete random variables discussed in class were:<br />
a) binomial<br />
b) geometric<br />
c) Poisson<br />
d) all of the above<br />
e) none of these<br />
99. Which of the following is a measure of the variability of a distribution?<br />
a. Skewness b. Median c. Standard deviation d. z-score<br />
100. A type I error is made by<br />
a. rejecting H₀ when it is true<br />
b. rejecting H₀ when it is false<br />
c. failing to reject H₀ when it is true<br />
d. failing to reject H₀ when it is false<br />
PART III: SHOW ALL WORK FOR FULL CREDIT<br />
95 points total<br />
1.) Are Women Getting Taller? A researcher claims that the average height of a woman<br />
aged 20 years or older is greater than the 1994 mean height of 63.7 inches, on the basis of<br />
data obtained from the Centers for Disease Control and Prevention’s Advance Data<br />
Report, No. 347. She obtains a random sample of 45 women and finds the sample mean<br />
height to be 63.9 inches. Assume that the population standard deviation is 3.5 inches.<br />
Test the researcher’s claim at the 0.05 level of significance.<br />
1. Describe the population parameter about which hypotheses are to be tested.<br />
2. The null hypothesis is H₀:<br />
3. The alternative hypothesis is Hₐ: Determine what type of test (upper, lower, or two-tailed).<br />
4. Select the significance level α for the test. Draw the appropriate distribution, label α, and determine the<br />
Rejection Region.<br />
5. Display the test statistic to be used, with substitution of the hypothesized value identified in step 2 but without any<br />
computations at this point. Also state any assumptions.<br />
6. Compute all quantities appearing in the test statistic and then the value of the test statistic itself.<br />
7. Determine the p-value associated with the observed value of the test statistic. Draw a picture of the appropriate<br />
distribution, shading the p-value area.<br />
8. State the conclusion.<br />
9. State what a TYPE II error would be in the context of this problem. For the above hypothesis test, is it<br />
possible to find the probability of making this type of error? If so, what is it?<br />
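Steps 5 through 8 above can be sketched in code. The following is an illustrative Python outline, not part of the original exam; it assumes a one-sample z test with known σ and an upper-tailed alternative, and all variable names are assumptions made for this sketch.

```python
import math
from statistics import NormalDist

mu_0 = 63.7    # hypothesized mean height (inches), from the problem
x_bar = 63.9   # sample mean
sigma = 3.5    # population standard deviation, assumed known
n = 45         # sample size
alpha = 0.05   # significance level

# Test statistic: z = (x_bar - mu_0) / (sigma / sqrt(n))
standard_error = sigma / math.sqrt(n)
z = (x_bar - mu_0) / standard_error

# Upper-tailed test: the p-value is the area to the right of z.
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The same comparison can be made with a rejection region instead: for an upper-tailed test at level α, the critical value is `NormalDist().inv_cdf(1 - alpha)`.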
2.) Modern medical practice tells us not to encourage babies to become too fat. Fourteen females took<br />
part in a lifelong experiment, where bivariate data were collected on each of the subjects. Let the<br />
predictor (explanatory) variable x represent the weight (in lbs) of the subject as a 1-year old baby.<br />
Let the response variable y represent the weight (in lbs) of the subject as a 30-year old adult. The<br />
summary statistics are as follows:<br />
∑x = 300 ∑y = 1775 ∑xy = 38,220 ∑x² = 6572 ∑y² = 226,125<br />
a.) Calculate SS_xx, SS_yy, and SS_xy.<br />
b.) Calculate r . Interpret this value in the context of the problem.<br />
c.) Assume the equation of the sample regression line is y = 99.25 + 1.285x. What is the<br />
value of β̂₁? Interpret this value in the context of the problem.<br />
d.) Use x = 20 in the regression line to find the corresponding value for y.<br />
e.) Give two interpretations of this corresponding y value, in terms of the above problem. Be<br />
as specific as possible.<br />
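The quantities in parts a through c can be sketched from the summary statistics. The Python outline below is not part of the original exam; it assumes the usual shortcut formulas SS_xx = ∑x² − (∑x)²/n, SS_yy = ∑y² − (∑y)²/n, and SS_xy = ∑xy − (∑x)(∑y)/n, and the variable names are illustrative.

```python
import math

# Summary statistics as given in the problem (n = 14 subjects).
n = 14
sum_x, sum_y = 300, 1775
sum_xy = 38_220
sum_x2, sum_y2 = 6_572, 226_125

# Shortcut formulas for the sums of squares and cross products.
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

r = ss_xy / math.sqrt(ss_xx * ss_yy)       # Pearson correlation coefficient
slope = ss_xy / ss_xx                       # least-squares slope
intercept = sum_y / n - slope * sum_x / n   # least-squares intercept

print(f"SS_xx = {ss_xx:.3f}, SS_yy = {ss_yy:.3f}, SS_xy = {ss_xy:.3f}")
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.2f}")
```

The computed slope and intercept can be compared with the regression line stated in part c as a consistency check on the arithmetic.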
3.) Sample bivariate data indicates a correlation between the number of cigarettes a person<br />
smokes, and the incidence of pancreatic cancer. The correlation coefficient, r, is 0.93.<br />
Circle TRUE or FALSE to the following statements, and EXPLAIN YOUR ANSWERS.<br />
a) There is a strong linear relationship between the number of cigarettes a person smokes and the<br />
incidence of pancreatic cancer.<br />
TRUE or FALSE<br />
Explanation<br />
____________________________________________________________________<br />
b) The information indicates that smoking causes pancreatic cancer. TRUE or FALSE<br />
Explanation<br />
____________________________________________________________________<br />
c) This regression equation relating smoking and pancreatic cancer is more useful than a<br />
regression equation using data for which r = -0.98.<br />
TRUE or FALSE<br />
Explanation<br />
____________________________________________________________________<br />
d) The slope of the regression equation is negative. TRUE or FALSE<br />
Explanation<br />
_____________________________________________________________________<br />
e) About 86% of the variation in the incidence of pancreatic cancer is due to the variation in the<br />
number of cigarettes a person smokes.<br />
TRUE or FALSE<br />
Explanation<br />
_____________________________________________________________________<br />
f) If you were testing H₀: β₁ = 0 vs. Hₐ: β₁ ≠ 0, you would probably reject H₀.<br />
TRUE or FALSE<br />
Explanation<br />
_____________________________________________________________________<br />
4. For each of the population parameters, write the symbol for the corresponding sample<br />
statistic (its point estimator), and the name of the estimator.<br />
Parameter Estimator Symbol Estimator Name<br />
σ _______ ________________________<br />
µ _______ ________________________<br />
p _______ ________________________<br />
σ² _______ ________________________<br />
µ₁ − µ₂ _______ ________________________<br />
5. IQs are calibrated so that they have a mean of 100 and a standard deviation of 15.<br />
a) Find the probability that a randomly selected person has an IQ between 106 and<br />
116. Be sure to draw a sketch.<br />
Ans. _____________<br />
b) A person with an IQ at or below the 5th percentile is considered “developmentally<br />
disabled.” What IQ corresponds to the cutoff for this designation?<br />
Ans. ______________<br />
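Both parts of problem 5 are normal-distribution lookups. The sketch below is not part of the original exam; it assumes IQ scores are normally distributed and uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of a printed z table.

```python
from statistics import NormalDist

# IQ model from the problem: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# Part a: P(106 < X < 116) as a difference of cumulative probabilities.
p_between = iq.cdf(116) - iq.cdf(106)

# Part b: the 5th percentile is the value with 5% of the area to its left.
cutoff = iq.inv_cdf(0.05)

print(f"P(106 < X < 116) = {p_between:.4f}")
print(f"5th percentile   = {cutoff:.1f}")
```

A hand calculation with a z table may differ slightly in the last digit because the table rounds z to two decimal places.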
EXTRA CREDIT) Many people believe that criminals who plead guilty tend to get lighter<br />
sentences than those who are convicted in trials. The data below summarizes randomly selected<br />
sample data for San Francisco defendants in burglary cases. At the 0.05 significance level, test<br />
the claim that the type of plea is independent of whether a person is sent to prison.<br />
Note: First find row and column TOTALS, along with the GRAND TOTAL. To find “expected”<br />
quantities, you may use your calculator for the calculations, but show how one of these is<br />
calculated.<br />
Guilty Plea Not Guilty Plea TOTAL<br />
Sent to Prison 392 38<br />
Not Sent to Prison 584 16<br />
TOTAL<br />
1. Describe the population parameter about which hypotheses are to be tested.<br />
2. The null hypothesis is<br />
3. The alternative hypothesis is<br />
4. Select the significance level α for the test. Draw the appropriate distribution, label α , and determine the<br />
Rejection Region.<br />
5. Display the test statistic to be used, with substitution of the hypothesized value identified in step 2 but<br />
without any computations at this point. Also state any assumptions.<br />
6. Compute all quantities appearing in the test statistic and then the value of the test statistic itself.<br />
7. Determine the p-value associated with the observed value of the test statistic. Draw a picture of the<br />
appropriate distribution, shading the p-value area.<br />
8. State the conclusion.<br />
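The totals, expected counts, and test statistic described in the note above can be sketched as follows. This Python outline is not part of the original exam; it assumes the usual rule that each expected count equals (row total × column total) / grand total, and the variable names are illustrative.

```python
# Observed counts from the table: rows are prison outcome, columns are plea.
observed = [
    [392, 38],   # sent to prison: guilty plea, not guilty plea
    [584, 16],   # not sent to prison: guilty plea, not guilty plea
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(f"row totals = {row_totals}, column totals = {col_totals}")
print(f"chi-squared = {chi_sq:.2f} with {df} degree(s) of freedom")
```

The p-value and rejection region still come from a χ² table with the printed degrees of freedom, since the standard library has no χ² distribution.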
EXTRA CREDIT: Sears claims that the “new and improved” batteries have a mean lifetime that<br />
is now greater than 17 months, but a leading consumer group thinks the lifetime is still 17 months<br />
(or less).<br />
So, the hypotheses are H₀: µ = 17<br />
Hₐ: µ > 17<br />
a) Who is more likely to want a larger significance level, Sears or the consumer group?<br />
Explain.<br />
Ans. ___________________________________________________________________<br />
____________________________________________________________________<br />
_____________________________________________________________________<br />
_____________________________________________________________________<br />
b) Suppose both parties decide on α = 0.05. A hypothesis test is conducted from sample<br />
data, and the p-value = 0.03. What would be the conclusion to this hypothesis test?<br />
Why? Explain in the context of the problem. Your explanation should be complete and<br />
specific.<br />
Ans. _____________________________________________________________<br />
____________________________________________________________________<br />
_____________________________________________________________________<br />
_____________________________________________________________________<br />
_____________________________________________________________________<br />
_____________________________________________________________________<br />