MATH1725 Introduction to Statistics: Worked examples


Worked Example: Lectures 1–2 

The lifetimes of 400 light-bulbs were found to the nearest hour. The results were recorded as 

follows. 

Lifetime (hours) 0–199 200–399 400–599 600–799 800–999 1000–1199 1200–1999 

Frequency 143 97 64 51 14 14 17 

Construct a histogram and cumulative frequency polygon for these data. Estimate the percentage 

of bulbs with lifetime less than 480 hours. 

Answer: Lifetimes cannot be negative so class intervals are [0,199.5), [199.5,399.5), [399.5,599.5), 

and so on. 

[Figure: histogram of the light-bulb lifetimes. Horizontal axis: lifetime (hours), 0 to 2000; vertical axis: frequency per 200-hour class.]

Adjust the height of the rectangle for the “1200–1999” class so that histogram area is proportional to frequency. If the vertical axis is “frequency per interval of 200 hours”, the height of the [0,199.5) class is 143 × 200/199.5 = 143.4, to allow for the first class not being of width 200. Similarly, the [1199.5,1999.5) class has width 800, so its height is 17 × 200/800 = 4.25.

Lifetime (hours) 0.0 199.5 399.5 599.5 799.5 999.5 1199.5 1999.5

Cumulative frequency 0 143 240 304 355 369 383 400 

Make the cumulative frequency at time zero equal to 0. 

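[Figure: cumulative frequency polygon. Horizontal axis: lifetime (hours), 0 to 2000; vertical axis: cumulative frequency, 0 to 400. An inset enlarges the segment from 400 to 600 hours and marks a cumulative frequency of 265.8 at 480 hours.]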


Estimated number of light-bulbs with lifetime less than 480 hours is

240 + ((480 − 399.5)/200) × (304 − 240) = 265.8.

Required percentage is (265.8/400) × 100 = 66.4%.
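The interpolation along the cumulative frequency polygon can be checked in R. This is only a sketch: the variable names are illustrative, and the boundaries assume the classes end at 199.5, 399.5, ..., 1199.5, 1999.5.

```r
# Class boundaries and cumulative frequencies for the light-bulb data.
boundary <- c(0, 199.5, 399.5, 599.5, 799.5, 999.5, 1199.5, 1999.5)
cum_freq <- c(0, 143, 240, 304, 355, 369, 383, 400)

# approx() interpolates linearly between the plotted points,
# which is exactly what reading off the polygon does.
n_below <- approx(boundary, cum_freq, xout = 480)$y
pct_below <- 100 * n_below / max(cum_freq)

round(c(number = n_below, percentage = pct_below), 1)
```

Rounded to one decimal place, the two values should agree with the hand calculation above.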


The Christmas cactus Zygocactus truncatus has branches made up of separate segments. For one 

such cactus the number of segments in each branch were counted. 

Number x of segments 1 2 3 4 5 6 7 8 9 

Number of branches with x segments 3 0 6 7 8 18 8 0 2 

Construct a cumulative frequency polygon to represent these data. 

Answer: The data is discrete so cumulative frequency plot is a step function. 

Number x of segments 1 2 3 4 5 6 7 8 9 

Number of branches with ≤ x segments 3 3 9 16 24 42 50 50 52 


[Figure: cumulative frequency step function. Horizontal axis: number of segments, 0 to 10; vertical axis: cumulative number of branches, 0 to 60.]


The following data give one hundred measurement errors made during the mapping of the American 

state of Massachusetts during the last century. 

Error X (in minutes ′ of arc) (−4, −2] (−2,0] (0,+2] (+2,+4] (+4,+6] 

Frequency 10 43 39 5 3 

Show that the sample mean and sample standard deviation for these data are ¯x = −0.04 ′ and 

s = 1.717 ′ respectively. 

Answer: 

Class          Class frequency f   Class mid-point x    fx    fx²
−4 < x ≤ −2           10                  −3            −30    90
−2 < x ≤ 0            43                  −1            −43    43
 0 < x ≤ +2           39                  +1             39    39
+2 < x ≤ +4            5                  +3             15    45
+4 < x ≤ +6            3                  +5             15    75
Totals              n = 100                              −4   292


¯x = −4/100 = −0.04′.

s² = (1/99)(292 − 100 × (−0.04)²) = 2.9479, so s = √(s²) = √2.9479 = 1.717′.
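The grouped-data formulas can be reproduced in R as a check. This is a minimal sketch in which each observation is treated as sitting at its class mid-point, as in the table above; the variable names are illustrative.

```r
# Class mid-points and frequencies for the mapping errors.
mid <- c(-3, -1, 1, 3, 5)
f   <- c(10, 43, 39, 5, 3)

n    <- sum(f)                                    # sample size
xbar <- sum(f * mid) / n                          # grouped sample mean
s2   <- (sum(f * mid^2) - n * xbar^2) / (n - 1)   # grouped sample variance
s    <- sqrt(s2)                                  # grouped sample standard deviation

round(c(mean = xbar, variance = s2, sd = s), 4)
```

Note that the correction term is n¯x², that is 100 × (−0.04)² = 0.16, not 100 × (−0.04).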


The times between arrivals of 60 patients at an intensive care unit were recorded to the nearest hour.

The data are shown below. 

Time (hours) 0–19 20–39 40–59 60–79 80–99 100–119 120–139 140–159 160–179 

Frequency 16 13 17 4 4 3 1 1 1 

Determine the median and semi-interquartile range. Explain why this pair of statistics might be 

preferred to the mean and standard deviation for these data. 

Answer: 

Time (hours) 0.0 19.5 39.5 59.5 79.5 99.5 119.5 139.5 159.5 179.5 

Cumulative frequency 0 16 29 46 50 54 57 58 59 60 

Median lies in “40–59” class, corresponding to cumulative frequency 30. 

Lower quartile is in “0–19” class, corresponding to cumulative frequency 15. Notice that this 

class has width 19.5 hours, not 20 hours. 

Upper quartile is in “40–59” class, corresponding to cumulative frequency 45. 

Median = 39.5 + ((30 − 29)/(46 − 29)) × 20 = 40.7 hours.

Lower quartile = 0.0 + ((15 − 0)/(16 − 0)) × 19.5 = 18.3 hours.

Upper quartile = 39.5 + ((45 − 29)/(46 − 29)) × 20 = 58.3 hours.

Semi-interquartile range = ½(58.3 − 18.3) = 20.0 hours.

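The histogram for these data is positively skewed: a few very long inter-arrival times form a long right-hand tail, which inflates the mean and standard deviation but has little effect on the quartiles. The median and semi-interquartile range might therefore be preferred to the mean and standard deviation as measures of location and dispersion respectively.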
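The quartile calculations for the inter-arrival times amount to inverse interpolation on the cumulative frequency polygon, which can be sketched in R as follows (variable names are illustrative):

```r
# Class boundaries and cumulative frequencies for the inter-arrival times.
boundary <- c(0, 19.5, 39.5, 59.5, 79.5, 99.5, 119.5, 139.5, 159.5, 179.5)
cum_freq <- c(0, 16, 29, 46, 50, 54, 57, 58, 59, 60)

# Swap the roles of the axes: read off the time at which the
# cumulative frequency reaches n/4 = 15, n/2 = 30 and 3n/4 = 45.
q <- approx(cum_freq, boundary, xout = c(15, 30, 45))$y
names(q) <- c("lower quartile", "median", "upper quartile")

siqr <- unname(q[3] - q[1]) / 2   # semi-interquartile range

round(q, 1)
round(siqr, 1)
```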

[Figure: histogram of inter-arrival time (hours), 0 to 200; vertical axis: frequency per 20-hour class.]



A firm investigates the length of telephone conversations of their office staff. Ten consecutive 

conversations had lengths, in minutes: 

10.7, 9.5, 11.1, 7.8, 11.9, 4.1, 10.0, 9.2, 6.5, 9.2. 

Derive a 95% confidence interval for the mean conversation length. Test whether the mean length 

of a conversation is eight minutes. 

Answer: 

¯x = (1/n) Σ_{i=1}^{n} x_i = 90/10 = 9 minutes.

s² = (1/(n − 1)) { Σ_{i=1}^{n} x_i² − n¯x² } = 5.42.

Estimate the population variance σ² by s², with s = √5.42 = 2.33. Then

(¯X − µ)/(s/√n) ∼ t_{n−1}.

95% confidence interval for µ is ¯x ± t_9(2.5%) s/√10. Here s/√10 = 0.737 and t_9(2.5%) = 2.262, so

¯x ± t_9(2.5%) s/√10 = 9 ± (2.262 × 0.737) = 9 ± 1.667 = (7.3, 10.7).

Since 8 minutes lies inside the 95% confidence interval we would accept H 0 in testing H 0 : µ = 

8 vs. H 1 : µ ≠ 8 at the 5% significance level. 
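The interval and the test can both be obtained from R's t.test(). A short sketch, with illustrative variable names:

```r
# Lengths (minutes) of the ten telephone conversations.
x <- c(10.7, 9.5, 11.1, 7.8, 11.9, 4.1, 10.0, 9.2, 6.5, 9.2)

mean(x)            # sample mean
var(x)             # sample variance s^2
qt(0.975, df = 9)  # t_9(2.5%) percentage point

# One-sample t-test of H0: mu = 8 against a two-sided alternative;
# the result also carries the 95% confidence interval for mu.
res <- t.test(x, mu = 8, conf.level = 0.95)
res$conf.int
res$p.value
```

A p-value above 0.05 corresponds to 8 lying inside the 95% interval.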


A population has a Poisson distribution but it is not known whether the mean µ is 1 or 4. To 

test the hypothesis H 0 : µ = 1 vs. H 1 : µ = 4 on the basis of one observation X the following test

procedure is considered: reject H 0 if X ≥ i. 

Type I error is defined to be “rejecting H 0 when H 0 is true”. Find the probability of type I 

error for the three cases i = 2, 3, 4. 

Answer: If H 0 is true, µ = 1 and

pr{X = x} = e^{−1}/x!, x = 0,1,2,... ,

so that pr{Type I error} = pr{X ≥ i}.

If i = 2,

pr{Type I error} = pr{X ≥ 2} = 1 − pr{X < 2} = 1 − pr{X = 0} − pr{X = 1} = 1 − e^{−1} − e^{−1} = 0.264.

Similarly if i = 3,

pr{Type I error} = pr{X ≥ 3} = 1 − pr{X < 3} = 0.080.

If i = 4,

pr{Type I error} = pr{X ≥ 4} = 0.019.

Notice that an exact 5% or 10% significance level test does not exist for this discrete distribution. 
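The three type I error probabilities are upper-tail Poisson probabilities, so they can be checked with ppois(). A minimal sketch:

```r
# Candidate critical values i: reject H0 if X >= i.
i <- 2:4

# Under H0 the mean is 1. pr{X >= i} = pr{X > i - 1}, which is what
# ppois(..., lower.tail = FALSE) returns.
alpha <- ppois(i - 1, lambda = 1, lower.tail = FALSE)

round(setNames(alpha, paste0("i=", i)), 3)
```

Because X is discrete, the attainable significance levels jump from one of these values to the next, which is why no exact 5% or 10% test exists.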



A sample of size 64 is drawn by simple random sampling from a normal population which has 

known variance 4. The sample mean is −0.45. Test the hypothesis H 0 : µ = 0 vs. H 1 : µ ≠ 0 at 

the 5% level of significance. Repeat for testing H 0 : µ = 0 vs. H 1 : µ < 0.

Answer: Here ¯X ∼ N(µ,σ 2 /n) with σ 2 = 4, n = 64, so σ 2 /n = 0.0625 and ¯X ∼ N(µ,0.0625). 

Test statistic is

Z = (¯X − µ)/(σ/√n) = ¯X/√0.0625 = ¯X/0.25,

where Z ∼ N(0,1) if H 0 is true. 

For α = 0.05 with a two-sided test, z α/2 = 1.96. Critical region is Z < −1.96 and Z > 1.96. 

Observed value is z = −0.45/0.25 = −1.8. This does not lie in critical region so accept H 0 . 

For α = 0.05 with a one-sided test, z α = 1.645. Critical region is Z < −1.645. Observed value 

is z = −1.8 which lies in critical region so reject H 0 . 
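The same conclusions can be reached through p-values. The sketch below uses illustrative variable names; the one-sided p-value is the lower-tail one, matching the critical region Z < −1.645 used in the answer.

```r
xbar   <- -0.45   # observed sample mean
sigma2 <- 4       # known population variance
n      <- 64      # sample size
mu0    <- 0       # value of the mean under H0

z <- (xbar - mu0) / sqrt(sigma2 / n)   # observed test statistic

p_two_sided <- 2 * pnorm(-abs(z))      # two-sided alternative
p_lower     <- pnorm(z)                # one-sided alternative, lower tail

c(z = z, two_sided = p_two_sided, lower_tail = p_lower)
qnorm(c(0.975, 0.95))                  # critical values 1.96 and 1.645
```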

Worked Example: Lecture 6 

The absenteeism rates (in days and parts of days) for nine employees of a large company were 

recorded in two consecutive years. 

Employee 1 2 3 4 5 6 7 8 9 

Year 1 3.0 6.7 11.3 5.0 9.4 15.7 8.0 10.0 9.7 

Year 2 2.8 5.1 8.4 5.0 6.2 12.2 10.0 6.8 6.0 

Is there any evidence that the average absenteeism rate is different for the two years?

Answer: Data paired as same employee studied in each of the two years. 

Form difference d i = (year 1) i − (year 2) i . Need to estimate variance σ 2 d . 

Test H 0 : µ d = 0 vs. H 1 : µ d ≠ 0. See lecture 6. 
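The answer above stops at setting up the paired test. One way to carry out the arithmetic is sketched below in R; the variable names are illustrative and the approach assumes the differences are a random sample from a normal distribution.

```r
year1 <- c(3.0, 6.7, 11.3, 5.0, 9.4, 15.7, 8.0, 10.0, 9.7)
year2 <- c(2.8, 5.1, 8.4, 5.0, 6.2, 12.2, 10.0, 6.8, 6.0)

d <- year1 - year2                                # paired differences d_i
t_stat <- mean(d) / sqrt(var(d) / length(d))      # test statistic for H0: mu_d = 0
t_crit <- qt(0.975, df = length(d) - 1)           # t_8(2.5%)

c(t = t_stat, critical = t_crit)

# The built-in paired test should give the same statistic.
res <- t.test(year1, year2, paired = TRUE)
res$statistic
res$p.value
```

Reject H 0 at the 5% level if |t| exceeds the critical value.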


Which of the phrases (i)-(iv) below apply to the sample correlation coefficient r XY ?

(i) measures linear association between two variables, 

(ii) is never negative, 

(iii) has positive slope, 

(iv) depends on the units of measurement of X and Y . 

Answer: i only. 


The tensile strength of a glued joint is related to the glue thickness. A sample of six values gave 

the following results: 

Glue Thickness (inches) 0.12 0.12 0.13 0.13 0.14 0.14 

Tensile Strength (lbs.) 49.8 46.1 46.5 45.8 44.3 45.9 

Calculate the sample correlation coefficient r for these data. 

Use the fitted least squares regression line to predict the tensile strength of a joint for a glue 

thickness of 0.14 inches. 

Using scatter-diagrams, sketch the form of regression line expected in the three cases when r 

takes the values −1, 0, and +1. 


Answer: Let X denote the glue thickness and Y the joint strength. 

           x       y       x²        y²        xy
         0.12    49.8    0.0144   2480.04    5.976
         0.12    46.1    0.0144   2125.21    5.532
         0.13    46.5    0.0169   2162.25    6.045
         0.13    45.8    0.0169   2097.64    5.954
         0.14    44.3    0.0196   1962.49    6.202
         0.14    45.9    0.0196   2106.81    6.426
Totals   0.78   278.4    0.1018  12934.44   36.135

¯x = 0.78/6 = 0.13, ȳ = 278.4/6 = 46.4,

s²_X = (1/5){0.1018 − 6(0.13)²} = 0.00008,

s²_Y = (1/5){12934.44 − 6(46.4)²} = 3.336,

s_XY = (1/5){36.135 − 6(0.13)(46.4)} = −0.0114.

r_XY = s_XY/(s_X s_Y) = −0.0114/√(0.00008 × 3.336) = −0.698.

Regression line:

y = ȳ + (x − ¯x) s_XY/s²_X = 46.4 + (x − 0.13)(−0.0114/0.00008) = 64.925 − 142.5x.

At x = 0.14″ this gives y = 44.975 lbs.
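The correlation, fitted line and prediction for the glue data can be checked in R. A short sketch, with illustrative variable names:

```r
x <- c(0.12, 0.12, 0.13, 0.13, 0.14, 0.14)   # glue thickness (inches)
y <- c(49.8, 46.1, 46.5, 45.8, 44.3, 45.9)   # tensile strength (lbs.)

r <- cor(x, y)        # sample correlation coefficient
fit <- lm(y ~ x)      # least squares regression of y on x
coef(fit)             # intercept and slope

# Predicted strength at a thickness of 0.14 inches.
pred <- predict(fit, newdata = data.frame(x = 0.14))
c(r = r, prediction = unname(pred))
```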

Scatter-plots: 

r = −1: data lies on a straight line with negative slope. 

r = +1: data lies on a straight line with positive slope. 

r = 0: data randomly scattered (X and Y independent) or could show case with X and Y having 

a non-linear dependence as in the lecture notes. You could even show both of these cases! 


A coin is tossed three times. Let X denote the number of heads and Y the length of the longest 

run of heads or tails. Thus HTT gives X = 1 and Y = 2, THT gives X = 1 and Y = 1. 

(a) Obtain the joint probabilities of X and Y . 

(b) Obtain the marginal probability distribution of X and Y . 

(c) If X = 1, what is the distribution of Y ?

Answer: (a and b) All eight outcomes are equally likely, so occur with probability 1/8. 

Outcome HHH HHT HTH HTT THH THT TTH TTT 

X 3 2 2 1 2 1 1 0 

Y 3 2 1 2 2 1 2 3 

Probability 1/8 1/8 1/8 1/8 1/8 1/8 1/8 1/8 

                  Y
           1     2     3    p_X(x)
      0    0     0    1/8    1/8
X     1   1/8   1/4    0     3/8
      2   1/8   1/4    0     3/8
      3    0     0    1/8    1/8
p_Y(y)    1/4   1/2   1/4   Total = 1


Joint probabilities p(x,y) are found by summing probabilities for each outcome giving rise to 

(X = x,Y = y). Thus p(1,2) = pr{HTT or TTH} = 1/4. 

Marginal probabilities are found by forming row or column sum. For example 

pr{X = 2} = p(2,1) + p(2,2) + p(2,3) = 3/8.

(c) If X = 1, then

pr{Y = y|X = 1} = p(1,y)/p_X(1) = p(1,y)/(3/8).

Thus

pr{Y = 1|X = 1} = (1/8)/(3/8) = 1/3, pr{Y = 2|X = 1} = (2/8)/(3/8) = 2/3, pr{Y = 3|X = 1} = 0.

If X = 1, then the outcome is one of HTT, THT, TTH. In one out of these three cases we observe 

Y = 1 and in two out of three we observe Y = 2. 
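The joint, marginal and conditional probabilities can also be found by brute-force enumeration of the eight outcomes. The R sketch below is one way to do this; the object names are illustrative.

```r
# All 2^3 equally likely outcomes of three tosses.
tosses <- expand.grid(t1 = c("H", "T"), t2 = c("H", "T"), t3 = c("H", "T"),
                      stringsAsFactors = FALSE)

X <- rowSums(tosses == "H")                                  # number of heads
Y <- apply(tosses, 1, function(s) max(rle(s)$lengths))       # longest run of H or T

joint <- table(X, Y) / nrow(tosses)   # joint probabilities p(x, y)
p_X <- rowSums(joint)                 # marginal distribution of X
p_Y <- colSums(joint)                 # marginal distribution of Y

cond_Y <- joint["1", ] / p_X["1"]     # distribution of Y given X = 1

joint
cond_Y
```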


Suppose X and Y are independent continuous random variables which are each uniformly distributed 

on the interval (0,1). 

(a) Find the probability that 0 < X + Y < z for values z ∈ (0,2). 

(b) If Z = X + Y , deduce the form of the probability density function f(z) of Z. 

Hints: In (a), think about the area on the x-y plane corresponding to 0 < x + y < z. In (b), first 

find the cumulative distribution function F(z) = pr{Z ≤ z}. 

Answer: As X and Y are uniformly distributed on the interval (0,1) they have pdf

f_X(x) = 1 if 0 < x < 1, and 0 otherwise; f_Y(y) = 1 if 0 < y < 1, and 0 otherwise.

(a) Joint probability density is f(x,y) = f_X(x) f_Y(y) by independence of X and Y. Hence f(x,y) = 1, a constant, for 0 < x < 1 and 0 < y < 1.

Probability of an event A is the volume under the pdf with base area given by A. Here A is the region for which 0 < X + Y < z.

Consider the two cases z < 1 and z > 1 separately.

[Figure: the unit square in the x-y plane with the region A below the line x + y = z shaded, drawn for the case z < 1 (a triangle with legs of length z) and for the case z > 1 (the square minus a triangle with legs of length 2 − z).]

From the figure above,

pr{0 < X + Y < z} = ½z² if 0 < z < 1, and 1 − ½(2 − z)² if 1 ≤ z < 2.

An alternative derivation uses integration. For example, in the case z < 1,

pr{0 < X + Y < z} = ∫∫_A f(x,y) dx dy = ∫_0^z ∫_0^{z−y} dx dy = ∫_0^z (z − y) dy = ½z².
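The formula for pr{0 < X + Y < z} can be checked by simulation. The R sketch below is only illustrative: the function name cdf_sum, the seed and the number of simulated pairs are arbitrary choices.

```r
# Area formula derived above, as a function of z in (0, 2).
cdf_sum <- function(z) ifelse(z < 1, z^2 / 2, 1 - (2 - z)^2 / 2)

set.seed(1)                                # fixed seed so the run is repeatable
z_sim <- runif(100000) + runif(100000)     # simulated values of Z = X + Y

z_grid <- c(0.5, 1.0, 1.5)
empirical <- sapply(z_grid, function(z) mean(z_sim < z))

cbind(z = z_grid, formula = cdf_sum(z_grid), simulated = empirical)
```

The simulated proportions should agree with the formula to about two decimal places.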

Answer: 

E[T] = E[a_1 X_1 + a_2 X_2] = a_1 E[X_1] + a_2 E[X_2] = a_1 µ + a_2 µ = (a_1 + a_2)µ.

If we require E[T] = µ, then a_1 + a_2 = 1, so that a_2 = 1 − a_1.

Since E[T] = µ, T is said to be an unbiased estimator of the mean µ.

Var[T] = Var[a_1 X_1 + a_2 X_2] = a_1² Var[X_1] + a_2² Var[X_2] = a_1² σ² + a_2² σ² = (a_1² + a_2²)σ².

Since a_2 = 1 − a_1, Var[T] = {a_1² + (1 − a_1)²}σ² = (2a_1² − 2a_1 + 1)σ². Differentiate this with respect to a_1 to find the minimum:

(d/da_1) Var[T] = (4a_1 − 2)σ²,

which is zero when a_1 = ½. Hence Var[T] is a minimum when a_1 = a_2 = ½, so T = ½(X_1 + X_2).

Alternative derivation: write a_1 = ½ + ε, a_2 = ½ − ε. Then

Var[T] = (a_1² + a_2²)σ² = {(½ + ε)² + (½ − ε)²}σ² = (½ + 2ε²)σ²,

and is a minimum if ε = 0.

What does this question show? In part (a) you chose a 2 to restrict attention to linear combinations

of the X i which were unbiased estimators of the mean µ, so E[T] = µ. In part (b) you then 

showed that of all such unbiased estimators, the sample mean ¯X is the one with smallest variance, 

so giving values closest to the true mean µ. 
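The minimisation can also be checked numerically. In the R sketch below, var_factor is an illustrative name for the multiple of σ² in Var[T] once a 2 = 1 − a 1 has been substituted, and the search interval is an arbitrary choice.

```r
# Var[T] / sigma^2 as a function of a1, with a2 = 1 - a1.
var_factor <- function(a1) a1^2 + (1 - a1)^2

# Numerical one-dimensional minimisation over a generous interval.
opt <- optimize(var_factor, interval = c(-1, 2))

opt$minimum     # minimising value of a1
opt$objective   # minimum of Var[T] / sigma^2
```

The numerical minimum should sit at a 1 = ½ with Var[T] = ½σ², in agreement with both derivations.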

Worked Example: Lecture 15. 

The following data give the noise level (in decibels) generated by fourteen different chain saws 

powered in one of two different ways. 

Petrol-powered chain saws 103 103 105 106 108 105 106 

Electric-powered chain saws 97 95 94 93 91 95 94 

At the 5% level of significance, test whether the average noise level of petrol-powered chain saws 

is higher than for electric-powered chain saws. 

Answer: Testing H 0 : µ 1 = µ 2 vs. H 1 : µ 1 > µ 2 , i.e. H 0 : µ 1 − µ 2 = 0 vs. H 1 : µ 1 − µ 2 > 0. 

Have two independent samples with unknown variance. Need to assume variances are equal. 
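The answer above only sets up the test. One way to complete it, assuming equal variances and normality as stated, is the pooled two-sample t-test sketched in R below (variable names are illustrative):

```r
petrol   <- c(103, 103, 105, 106, 108, 105, 106)   # noise levels (decibels)
electric <- c(97, 95, 94, 93, 91, 95, 94)

# Pooled-variance t-test of H0: mu1 = mu2 against H1: mu1 > mu2.
res <- t.test(petrol, electric, alternative = "greater", var.equal = TRUE)

res$statistic         # observed t
res$parameter         # degrees of freedom, n1 + n2 - 2
qt(0.95, df = 12)     # one-sided 5% critical value
res$p.value
```

Reject H 0 at the 5% level if the observed t exceeds the critical value.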

Worked Example: Lecture 15. 

The following data give the length (in mm.) of cuckoo (Cuculus canorus) eggs found in nests

belonging to wrens (A) and reed warblers (B). 

A: 19.8 22.1 21.5 20.9 22.0 21.0 22.3 21.0 20.3 20.9 

B: 23.2 22.0 22.2 21.2 21.6 21.6 21.9 22.0 22.9 22.8 

Assuming the variances for each group are the same, is there any evidence at the 5% level to 

suggest that the egg size differs between the two host species?


Answer: Have two independent normal distributions with unknown variances. 

Wrens: ¯x 1 = 21.18 mm., s 2 1 = 0.6418, n 1 = 10. 

Reed warblers: ¯x 2 = 22.14 mm., s 2 2 = 0.4116, n 2 = 10. 

Assume σ_1² = σ_2² = σ² (unknown). Estimate σ² using

s² = {(n_1 − 1)s_1² + (n_2 − 1)s_2²}/(n_1 + n_2 − 2) = (9s_1² + 9s_2²)/18 = 0.5267.

Also ¯x_1 − ¯x_2 = 21.18 − 22.14 = −0.96, s²(1/n_1 + 1/n_2) = 0.1053, t_18(2.5%) = 2.101.

If µ_1 = µ_2 then the two groups of eggs have the same mean length.

To test H_0: µ_1 = µ_2 vs. H_1: µ_1 ≠ µ_2 at 5% level, reject H_0 if

|¯x_1 − ¯x_2| / √(s²(1/n_1 + 1/n_2)) ≥ t_18(2.5%).

Here

|¯x_1 − ¯x_2| / √(s²(1/n_1 + 1/n_2)) = |−0.96|/√0.1053 = 2.96 ≥ 2.101, so reject the null hypothesis of equal means at 5% level. The two groups of eggs are significantly different at 5% level.

This does not necessarily imply cuckoos can control their egg size. It has been proposed that a 

cuckoo lays its egg in the particular nest for which it is best adapted. For further information see: 

Wyllie, I. (1981) The Cuckoo. Batsford: London. 

Davies, N.B. and Brooke, M. Coevolution of the cuckoo and its host, Scientific American, January 

1991, p.66-73. 
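The cuckoo egg calculation can be checked in R with a pooled two-sample t-test. A short sketch, with illustrative variable names:

```r
wren <- c(19.8, 22.1, 21.5, 20.9, 22.0, 21.0, 22.3, 21.0, 20.3, 20.9)
reed <- c(23.2, 22.0, 22.2, 21.2, 21.6, 21.6, 21.9, 22.0, 22.9, 22.8)

c(mean(wren), var(wren))   # sample mean and variance, wrens
c(mean(reed), var(reed))   # sample mean and variance, reed warblers

# Two-sided test of equal means, assuming a common variance.
res <- t.test(wren, reed, var.equal = TRUE)

res$statistic          # observed t (negative, since wren eggs are shorter)
res$parameter          # degrees of freedom, 18
qt(0.975, df = 18)     # t_18(2.5%)
res$p.value
```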


Question (lecture 1-2). 

For values 1, 3, 4, 5, 6 obtain the sample mean, sample median, sample variance and sample 

standard deviation. 

Answer: 1 

Question (lecture 1-2). 

The number of insurance policies sold by a small firm per week is 7, 8, 5, 6, 6, 7, 9, 5, 7, 8, 4, 7, 6, 

7, 7, 5, 8, 6, 7, 6, 6. Obtain the sample mean, sample median, sample variance, sample standard 

deviation. Check your values using R. 

Answer: 2 

Question (lecture 3). 

For Z ∼ N(0,1), calculate pr{Z ≤ 0.55}, pr{Z > 2.25}, pr{Z ≤ −0.15}, pr{−1.50 < Z ≤ 2.25}. 

Answer: 3 


For Z ∼ N(0,1), calculate pr{Z ≤ 0.63}. 

Answer: 4 


For Z ∼ N(0,1), determine the value of z such that: pr{Z ≤ z} = 0.8944, pr{Z > −z} = 0.9713, 

pr{−z < Z ≤ z} = 0.9108. 

Answer: 5 


An advertising company requires all of its job applicants to take a psychometric test. Based on 

recent studies, it is believed that the test score follows a normal distribution with mean 100 and 

standard deviation 15. Determine the probability that a job applicant will receive a test score 

below 118, above 112, between 100 and 112. 

Answer: 6 


If X ∼ t 5 , for what value of x is pr{X > x} = 0.05?

Answer: 7 


If T ∼ t 8 , for what value t is pr{T > t} = 0.025? For what value t is pr{T ≤ t} = 0.05?

Answer: 8 


1 3.8, 4, 3.7, 1.92. 

2 6.524, 7.0 (middle ordered value), 1.462, 1.209. 

3 pr{Z ≤ 0.55} = Φ(0.55) = 0.7088, pr{Z > 2.25} = 1 − pr{Z ≤ 2.25} = 1 − Φ(2.25) = 0.0122, pr{Z ≤ −0.15} =

1 − pr{Z ≤ 0.15} = 1 − Φ(0.15) = 0.4404, pr{−1.50 < Z ≤ 2.25} = pr{Z ≤ 2.25} − pr{Z ≤ −1.50} = 0.9210. Recall 

that pr{Z > z} = 1 − pr{Z ≤ z}, pr{Z < −z} = pr{Z > z} by symmetry, and also pr{X < b} = pr{X < a} + 

pr{a < X < b}. 

4 Using interpolation in the tables Φ(0.63) = 0.7356. 

5 pr{Z ≤ 1.25} = 0.8944, pr{Z > −1.90} = pr{Z ≤ 1.90} = 0.9713, pr{−z < Z ≤ z} = Φ(z) − Φ(−z) = 2Φ(z) − 

1 = 0.9108 so Φ(z) = 0.9554 and z = 1.70. 

6 0.8849, 0.2119, 0.2881. Hint: If X ∼ N(µ, σ²), then pr{X ≤ x} = Φ((x − µ)/σ).

7 From tables, x = 2.015.

8 t 8(2.5%) = 2.306. pr{T > 1.860} = 0.05 so pr{T ≤ −1.860} = 0.05 by symmetry. Thus t = −1.860. 
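The table look-ups in answers 3 to 8 can be checked in R, where pnorm() and qnorm() play the role of the normal tables and qt() gives percentage points of the t distribution. A brief sketch:

```r
pnorm(0.55)                        # pr{Z <= 0.55}
pnorm(2.25, lower.tail = FALSE)    # pr{Z > 2.25}
pnorm(-0.15)                       # pr{Z <= -0.15}
pnorm(2.25) - pnorm(-1.50)         # pr{-1.50 < Z <= 2.25}

qnorm(0.8944)                      # z with pr{Z <= z} = 0.8944

pnorm(118, mean = 100, sd = 15)    # pr{score < 118} for N(100, 15^2)

qt(0.95, df = 5)                   # x with pr{X > x} = 0.05 for t_5
qt(0.975, df = 8)                  # t_8(2.5%)
qt(0.05, df = 8)                   # t with pr{T <= t} = 0.05 for t_8
```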


If T ∼ t 10 , what is pr{T ≤ −2.228}? What is pr{−2.228 < T ≤ 2.228}?

Answer: 9 


If 15, 13, 16, 18, 20 forms a random sample from a normal population with known variance σ 2 = 4, 

obtain a 95% confidence interval for the mean µ. 

Answer: 10 


A firm investigates the length of time staff spend answering telephone calls. Nine consecutive 

conversations had lengths in minutes 10.3, 9.4, 9.9, 7.5, 11.7, 3.4, 7.8, 11.0, 10.0. If these form

a random sample from a normal population with unknown variance σ 2 , obtain a 95% confidence 

interval for the mean µ. 

Answer: 11 


It is required to obtain a 95% confidence interval for the mean µ of a normal population. Previous 

work has suggested that the variance σ 2 = 16. How large should the sample size n be if it is 

required to ensure that the width of the 95% confidence interval for µ is less than 0.5?

Answer: 12 


For observations 3, 6, 5, 2 from a normal distribution with mean µ and known variance σ 2 = 4, 

test the hypothesis H 0 : µ = 0 against the alternative hypothesis H 1 : µ ≠ 0 at the 5% level. 

Answer: 13 


At a gambling establishment I notice that a particular die gives 25 sixes in 100 rolls of the die. 

Is the die a fair one (is it biased)? (Test whether the probability θ of a six occurring equals 1/6

against the alternative hypothesis that θ ≠ 1/6.) 

Answer: 14 


For observations 3, 6, 5, 2 from a normal distribution with mean µ and unknown variance σ 2 , test 

the hypothesis H 0 : µ = 1 against the alternative hypothesis H 1 : µ ≠ 1 at the 5% level. 

9 0.025, 0.95. 

10 n = 5, ¯x = 16.4, 1.96σ/ √ n = 1.753, so 95% interval is 16.4 ± 1.75 = (14.65, 18.15). 

11 n = 9, ¯x = 9.0, s² = 6.25, t_8(2.5%) = 2.306. Interval is ¯x ± t_8(2.5%) s/√n = 9.0 ± 1.92.

R code which could be used to obtain required quantities:

x=c(10.3,9.4,9.9,7.5,11.7,3.4,7.8,11.0,10.0) 

mean(x) 

var(x) 

qt(0.975,8) # Gives 2.5% percentage point for t(8) pdf. 

12 Width of interval is 2 × (1.96σ/√n). Thus require 2 × (1.96 × 4)/√n < 0.5, so n > 16² × 1.96² = 983.45, so take n = 984.

13 n = 4, ¯x = 4, σ² = 4, µ_0 = 0, σ²/n = 1. Test statistic is z = (¯x − µ_0)/(σ/√n) = 4. Test rule is reject H_0 if |z| > 1.96. Thus reject H_0 at 5% level.

14 Let X be number of sixes in 100 throws, so X ∼ Bin(n = 100, θ = 1/6) if H_0 true. X ≈ N(µ = 16.667, σ² = 13.889) if H_0 true. Test statistic is z = (x − 16.667)/√13.889 = 2.236. Test rule is reject H_0 if |z| > 1.96, so reject H_0 at 5% level.


Answer: 15 


For values (x,y) as given below, obtain the sample correlation r. 

Answer: 16 

x i 1.1 2.2 3.4 4.5 5.0 

y i 3.3 6.1 7.0 10.4 11.5 


For values (x,y) as given below, obtain the line of regression for y given x. What does the residual 

at the first data point x 1 = 1.1 equal? If x = 4, what is the predicted value of y?

Answer: 17 

x i 1.1 2.2 3.4 4.5 5.0 

y i 3.3 6.1 7.0 10.4 11.5 


For values (x,y) as given below, a line of regression for y given x is fitted. 

Test the hypothesis that the slope β equals zero. 

Answer: 18 

x i 1.1 2.2 3.4 4.5 5.0 

y i 3.3 6.1 7.0 10.4 11.5 


Suppose pr{X = x} = x/10 for x = 1,2,3,4. Check that the probability function is valid (is 0 ≤ pr{X = x} ≤ 1 for all x and does Σ_x pr{X = x} = 1?). Calculate E[X] and Var[X].

15 n = 4, ¯x = 4, s² = 3.333, µ_0 = 1, s²/n = 0.8333. Test statistic is t = (¯x − µ_0)/(s/√n) = (4 − 1)/√0.8333 = 3.286. Test rule is reject H_0 if |t| > t_3(2.5%). As t_3(2.5%) = 3.182, reject H_0 at 5% level.

16 ¯x = 3.24, s²_x = (1/(n − 1)) Σ(x_i − ¯x)² = (1/(n − 1))(Σx_i² − n¯x²) = 2.593,

ȳ = 7.66, s²_y = (1/(n − 1)) Σ(y_i − ȳ)² = (1/(n − 1))(Σy_i² − nȳ²) = 11.033,

s_xy = (1/(n − 1)) Σ(x_i − ¯x)(y_i − ȳ) = (1/(n − 1))(Σx_i y_i − n¯xȳ) = 5.2645, r_XY = s_xy/√(s²_x s²_y) = 0.984.

Check your answer using R! 

x=c(1.1,2.2,3.4,4.5,5.0)
y=c(3.3,6.1,7.0,10.4,11.5) # Set up y similarly, from the table in the question.
cor(x,y) # Sample correlation coefficient r.

17 ¯x = 3.24, ȳ = 7.66, s²_x = 2.593, s²_y = 11.033, s_xy = 5.2645. Regression line is y = α + βx where ˆβ = s_xy/s²_x = 2.030, ˆα = ȳ − ˆβ¯x = 1.082 so fitted line is y = 1.082 + 2.030x. If x_1 = 1.1, predict ŷ_1 = 3.315. At x = 1.1, residual is r_1 = y_1 − ŷ_1 = 3.3 − 3.315 = −0.015. If x = 4, predict y = 9.203. Check your answers using R!


lm(y~x) # Gives parameter estimates.
model=lm(y~x) # Stores regression model output as model.
model$residuals[1] # First residual value.
predict(model, data.frame(x=4)) # Predicted value of y at x = 4.

18 If H_0: β = 0, then ˆβ/√(ˆσ²/S_xx) ∼ t_{n−2}, where S_xx = Σ(x_i − ¯x)² = (n − 1)s²_x. Here √(ˆσ²/S_xx) = 0.2105 where S_xx = (n − 1)s²_x = 10.372. Thus t = 9.646. t_3(2.5%) = 3.182. As |t| > 3.182, reject H_0 at 5% level. Check your answers using R!


model=lm(y~x)
summary(model) # Can you find your answers in the R output?


Answer: 19 


Suppose (X,Y ) take values (0,0), (0,1), (1,0), (1,1) with probabilities 0.2, 0.5, 0.2, 0.1 respectively. 

Obtain the marginal probabilities for X, and the conditional probabilities for Y given X = 1. 

Obtain E[XY ]. Are X and Y independent?

Answer: 20 


Suppose f XY (x,y) = 4xy for 0 < x < 1 and 0 < y < 1. Obtain the marginal pdf f X (x). Obtain 

E[XY ]. Are X and Y independent?

Answer: 21 


The table below gives the joint probability function for (X,Y ). 

            Y
         0     1     2
    0   0.1   0.1   0.1
X   1   0.2   0.0   0.2
    2   0.1   0.0   0.2

Obtain the marginal probabilities p X (x) and p Y (y) for X and Y . Hence obtain E[X], E[Y ], Var[X], 

Var[Y ]. Obtain cov(X,Y ) and corr(X,Y ). 

Answer: 22 


If cov(X,Y ) = 0.5 and Var[X] = 2, what is cov(X,X + Y )?

Answer: 23 


If cov(X + Y,X − Y ) = 12, Var[X + Y ] = 20 and Var[X − Y ] = 16, obtain σ 2 X = Var[X], 

σ 2 Y = Var[Y ], σ XY = cov(X,Y ) and so obtain corr(X,Y ). 

Answer: 24 


A fair die is rolled 100 times and the number X of ones and the number Y of twos are counted. What distribution does X have? What distribution does Y have? If Z = X + Y is the total number of ones or twos in the 100 rolls of the die, what distribution does Z have? What is the variance of X, Y and Z? Hence obtain cov(X,Y ) and corr(X,Y ).

Answer: 25 

19 Yes, 3, 1. 

20 p_X(0) = 0.7, p_X(1) = 0.3, pr{Y = 0|X = 1} = 2/3, pr{Y = 1|X = 1} = pr{X = 1 ∩ Y = 1}/pr{X = 1} = 1/3. E[XY ] = 0.1. No.

21 f_X(x) = ∫_y f_XY(x,y) dy = 2x for 0 < x < 1. E[XY ] = 4/9. Yes.

22 Marginal probabilities for X are 0.3, 0.4, 0.3, and for Y they are 0.4, 0.1, 0.5. E[X] = 1, E[Y ] = 1.1, Var[X] = 0.6, Var[Y ] = 0.89, cov(X,Y ) = 0.1, corr(X,Y ) = 0.137.

23 Var[X] + cov(X,Y ) = 2.5.

24 σ_X² − σ_Y² = 12, σ_X² + 2σ_XY + σ_Y² = 20, σ_X² − 2σ_XY + σ_Y² = 16, so 2σ_X² + 2σ_Y² = 36 and 4σ_XY = 4. Thus σ_X² = 15, σ_Y² = 3, σ_XY = 1 and corr(X,Y ) = 1/√45.

25 X ∼ Bin(n = 100, θ = 1/6). Similarly for Y . Z ∼ Bin(100, θ = 1/3). Var[X] = Var[Y ] = 500/36, Var[Z] = 200/9 = σ_X² + 2σ_XY + σ_Y². Hence cov(X,Y ) = −100/36 so corr(X,Y ) = −1/5. Notice X and Y are not uncorrelated. If you have a lot of ones, you would expect fewer twos!
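The moments quoted in answer 22 can be checked directly from the joint probability table. The R sketch below uses illustrative object names; rows of the matrix correspond to X = 0,1,2 and columns to Y = 0,1,2.

```r
# Joint probability function p(x, y), entered row by row.
p <- matrix(c(0.1, 0.1, 0.1,
              0.2, 0.0, 0.2,
              0.1, 0.0, 0.2),
            nrow = 3, byrow = TRUE,
            dimnames = list(X = as.character(0:2), Y = as.character(0:2)))
x <- 0:2
y <- 0:2

p_X <- rowSums(p)                      # marginal distribution of X
p_Y <- colSums(p)                      # marginal distribution of Y

EX <- sum(x * p_X)
EY <- sum(y * p_Y)
VX <- sum(x^2 * p_X) - EX^2
VY <- sum(y^2 * p_Y) - EY^2

EXY     <- sum(outer(x, y) * p)        # E[XY] = sum over x, y of x*y*p(x, y)
cov_XY  <- EXY - EX * EY
corr_XY <- cov_XY / sqrt(VX * VY)

round(c(EX = EX, EY = EY, VX = VX, VY = VY, cov = cov_XY, corr = corr_XY), 3)
```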



If Var[X] = 4 and Var[Y ] = 9 and corr(X,Y ) = 0.1, obtain cov(X + 2Y,X − Y ). 

Answer: 26 


If X ∼ N(1,9) and Y ∼ N(1,16) and X and Y are independent, what is pr{|X − Y | < 5}?

Answer: 27 


Suppose that X 1 ,X 2 ,... ,X n are independent and identically distributed random variables with 

common mean E[X i ] = µ and common variance Var[X i ] = σ 2 . Let ¯X denote the mean of the X i 

with mean µ and variance σ 2 /n. By writing (X i − ¯X) 2 = ({X i − µ} − { ¯X − µ}) 2 and expanding 

the bracket, show that 

S² = (1/(n − 1)) Σ_{i=1}^{n} (X_i − ¯X)²

has mean E[S²] = σ².

Answer: 28


In January 2011 Durham police reported a “significant increase in road accidents during December 

[2010] ...mainly due to severe weather”. During December 2010 there were 336 reported collisions, 

up from 308 in the previous December. By fitting a suitable model to these data, test whether 

there is indeed a significant difference in the number of accidents between December 2009 and 

December 2010. Source: http://www.bbc.co.uk/news/uk-england-12261462 

(This is harder than you would get in the examination – I have not done anything like this in the 

module. Use the approximation that if X ∼ Poisson(µ) and µ is large, then X ≈ N(µ,σ 2 = µ).) 

Answer: 29 


Two independent samples gave values 3, 6, 5, 2 for sample 1 and 2, 2, 3, 3, 5 for sample 2. 

Assuming that the samples come from independent normal distributions with known variances 4 

and 1 respectively, test at the 5% level whether the difference in mean equals zero against the 

alternative that it does not equal zero. 

Answer: 30 

26 cov(X,Y ) = corr(X,Y ) × √(Var[X]Var[Y ]) so cov(X,Y ) = 0.6 and cov(X + 2Y, X − Y ) = Var[X] + cov(X,Y ) − 2Var[Y ] = −13.4.

27 X − Y ∼ N(0,25) so we want pr{−5 < X − Y ≤ +5}. pr{X − Y ≤ 5} = Φ(1) = 0.8413 so pr{X − Y > 5} = 0.1587 and answer is 0.6826.

28 Recall that Var[X_i] = E[(X_i − µ)²] = σ² and Var[¯X] = E[(¯X − µ)²] = σ²/n. Also notice that ({X_i − µ} − {¯X − µ})² = (X_i − µ)² + (¯X − µ)² − 2(X_i − µ)(¯X − µ) and Σ_i (X_i − µ) = n(¯X − µ). Thus Σ_i (X_i − ¯X)² = Σ_i (X_i − µ)² − n(¯X − µ)². Now take expectations.

29 A suitable model is to assume accidents occur randomly and independently in time. Assuming a constant level of car usage we are using a Poisson process model. Thus the number X_1 of accidents in December 2010 satisfies X_1 ∼ Poisson(µ_1). Similarly the number X_2 of accidents in December 2009 satisfies X_2 ∼ Poisson(µ_2). We want to test whether µ_1 = µ_2. For µ_i large, X_i ≈ N(µ_i, µ_i) for i = 1,2 independently so X_1 − X_2 ≈ N(µ_1 − µ_2, µ_1 + µ_2). Thus if H_0 is true, and µ_1 = µ_2 = µ,

U = (X_1 − X_2)/√(2µ) ≈ N(0,1).

Assuming the null hypothesis is true, we would estimate µ by ˆµ = ½(336 + 308) = 322. Thus, replacing µ by ˆµ = 322 we obtain U = 1.103. Since |U| < 1.96, we accept the null hypothesis at the 5% level. The observed increase in accidents was not significant!

30 n 1 = 4, ¯x 1 = 4, σ1 2 = 4, n 2 = 5, ¯x 2 = 3, σ2 2 = 1. Testing H 0: µ 1 − µ 2 = 0 vs. H 1: µ 1 − µ 2 ≠ 0. Test statistic is 



Two independent samples gave values 3, 6, 5, 2 for sample 1 and 2, 2, 3, 3, 5 for sample 2. Assuming 

that the samples come from independent normal distributions with common unknown variance σ 2 , 

test at the 5% level whether the difference in mean equals zero against the alternative that it does 

not equal zero. 

Answer: 31 


Five randomly selected remuneration packages for US oil and gas CEOs in 2008 were (in thousands 

of US dollars) 21333, 7294, 6712, 5727, 7087. Five randomly selected remuneration packages for 

US health care CEOs in 2008 were (in thousands of dollars) 14262, 8381, 7245, 10211, 1817. Test 

at the 5% level whether the difference in mean remuneration equals zero against the alternative 

hypothesis that it does not equal zero. You can assume that the two populations have common 

(unknown) variance σ 2 . 

Answer: 32 


A quarter of insurance claims are incomplete in some way. If you have 250 forms to process, what 

is the approximate probability that you will find fewer than 50 of them incomplete?

Answer: 33 


In n = 100 tosses of a coin I obtain X = 72 heads. Obtain an approximate 95% confidence interval 

for the probability θ of a head. 

Answer: 34 


In December 2010 two analysts suggested several shares as likely to rise in 2011. By the end of 

October 2011 one (Neil Woodford) had four out of n 1 = 7 “share tips” showing a rise while the 

other (Harry Nummo) had three out of n 2 = 10 “share tips” showing a rise. Test at the 5% level 

whether the two success proportions are significantly different. 

Answer: 35 

z = (¯x_1 − ¯x_2)/√(σ_1²/n_1 + σ_2²/n_2) = (4 − 3)/√(4/4 + 1/5) = 0.913. Test rule is reject H_0 if |z| > 1.96. Thus accept H_0 at 5% level.

31 n_1 = 4, ¯x_1 = 4, s_1² = 3.333, n_2 = 5, ¯x_2 = 3, s_2² = 1.5, pooled estimate of σ² is s² = (3s_1² + 4s_2²)/7 = 2.2857. Testing H_0: µ_1 − µ_2 = 0 vs. H_1: µ_1 − µ_2 ≠ 0. Test statistic is t = (¯x_1 − ¯x_2)/(s√(1/n_1 + 1/n_2)) = (4 − 3)/(1.5119 × √(1/4 + 1/5)) = 0.986. Test rule is reject H_0 if |t| > t_7(2.5%). As t_7(2.5%) = 2.365, accept H_0 at 5% level.

32 Data source: http://graphicsweb.wsj.com/php/CEOPAY09.html. 

n_1 = 5, ¯x_1 = 9630.6, s_1² = 43158021, n_2 = 5, ¯x_2 = 8383.2, s_2² = 20577907, n_1 + n_2 − 2 = 8, t_8(2.5%) = 2.306. If variances are equal to σ², estimate σ² using s² = {(n_1 − 1)s_1² + (n_2 − 1)s_2²}/(n_1 + n_2 − 2) = 31867964. Test statistic is t = |¯x_1 − ¯x_2|/√(s²(1/n_1 + 1/n_2)) = 1247.4/3570.32 = 0.349. Since t_8(2.5%) = 2.306, then |t| < t_8(2.5%) so accept H_0 that µ_1 = µ_2 against the alternative µ_1 ≠ µ_2 at the 5% level.

33 If X is the number of incomplete forms, X ∼ Bin(n = 250, θ = 1/4) ≈ N(µ = 62.5, σ² = 46.875). You require pr{X < 50} = pr{X ≤ 49} = Φ((49 + ½ − µ)/σ) = Φ(−1.899) = 0.0288. Notice we have used a continuity correction.

34 Number of heads X ∼ Bin(n = 100, θ). Here n = 100, X = 72 observed, ˆθ = X/n = 72/100 = 0.72. Approximate 95% confidence interval is ˆθ ± 1.96√(ˆθ(1 − ˆθ)/n) = 0.72 ± 0.088.

35 Data source: http://www.thisismoney.co.uk/money/investing/article-1709914/Stock-market-predict 



In January 2011 Durham police were reported as disappointed by the increase in the number 

of people arrested for drinking and driving. Between December 1st 2010 and December 

31st 2010 they had 52 positive breath tests out of 1799 breath tests administered, while for 

the same period in 2009 they had 41 positive tests out of 1433 administered. Construct a 

95% confidence interval for the difference in proportion of drivers who tested positive. Source: 

http://www.bbc.co.uk/news/uk-england-12261462 

Answer: 36 


I observe two dice. For one die I notice that it gives a six 20 times out of 100 and for the second 

die I notice that it gives a six 22 times out of 80. Test at the 5% level whether the two dice give 

the same probability of showing a six. 

Answer: 37 


If X ∼ χ 2 4 , for what value of x is pr{X > x} = 0.05?

Answer: 38 


I roll a die 100 times and observe the following results. 

Test at the 5% level whether the die is fair. 

Answer: 39 

Outcome i 1 2 3 4 5 6 

Observed frequency 16 15 16 15 15 23 

ions-tips-2011.html 

Two binomial proportions here. ˆθ_1 = 4/7 = 0.571, ˆθ_2 = 3/10 = 0.300, n_1 = 7, n_2 = 10. Common estimated proportion is ˆθ = (7ˆθ_1 + 10ˆθ_2)/17 = 0.412. Approximate test statistic is z = |ˆθ_1 − ˆθ_2|/√(ˆθ(1 − ˆθ)(1/n_1 + 1/n_2)) = 1.119. Reject H_0 at 5% level if |z| > 1.96, so here accept the hypothesis that the two proportions are equal.

36 Two binomial proportions again. $\hat\theta_1 = 52/1799 = 0.028905$, $\hat\theta_2 = 41/1433 = 0.028611$, $n_1 = 1799$, $n_2 = 1433$. Common estimated proportion is $\hat\theta = \dfrac{1799\hat\theta_1 + 1433\hat\theta_2}{3232} = 0.0288$. (This is very small so the normal approximation is doubtful. In practice we would transform to give approximate normality.) Approximate test statistic is

$$z = \frac{|\hat\theta_1 - \hat\theta_2|}{\sqrt{\hat\theta(1 - \hat\theta)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = 0.0496.$$

Reject $H_0$ at 5% level if $|z| > 1.96$, so here accept the hypothesis that the two proportions are equal.
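The pooled two-proportion statistic used in footnotes 35 and 36 can be sketched as a small function. The function name `pooled_z` is hypothetical; the data are the counts quoted in those two footnotes.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """Approximate test statistic |p1 - p2| / SE under H0: equal proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)            # common estimated proportion
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return abs(p1 - p2) / se

# Footnote 35: 4 out of 7 versus 3 out of 10.
z_35 = pooled_z(4, 7, 3, 10)

# Footnote 36: 52 positive tests out of 1799 versus 41 out of 1433.
z_36 = pooled_z(52, 1799, 41, 1433)

for label, z in [("footnote 35", z_35), ("footnote 36", z_36)]:
    decision = "reject H0" if z > 1.96 else "do not reject H0"
    print(f"{label}: z = {z:.4f}, {decision} at the 5% level")
```

Both statistics fall well below 1.96, matching the conclusions in the text.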

37 $n_1 = 100$, $x_1 = 20$, $\hat\theta_1 = 20/100 = 0.200$, $n_2 = 80$, $x_2 = 22$, $\hat\theta_2 = 22/80 = 0.275$. We test $H_0: \theta_1 = \theta_2 (= \theta)$ vs. $H_1: \theta_1 \ne \theta_2$. This is equivalent to testing $H_0: \theta_1 - \theta_2 = 0$ vs. $H_1: \theta_1 - \theta_2 \ne 0$. Assuming $H_0$ is true, the common proportion $\theta$ is estimated by

$$\hat\theta = \frac{n_1\hat\theta_1 + n_2\hat\theta_2}{n_1 + n_2} = \frac{20 + 22}{180} = 0.2333.$$

Test statistic is

$$z = \frac{\hat\theta_1 - \hat\theta_2}{\sqrt{\frac{\hat\theta(1-\hat\theta)}{n_1} + \frac{\hat\theta(1-\hat\theta)}{n_2}}} = \frac{0.200 - 0.275}{\sqrt{0.0017889 + 0.0022361}} = -1.18.$$

Test rule is reject $H_0$ if $|z| > 1.96$, so accept $H_0$ at 5% level.

38 From tables, x = 9.488. 
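The table value in footnote 38 can be reproduced without tables. For a chi-squared distribution with 4 degrees of freedom the upper tail has the closed form $\mathrm{pr}\{X > x\} = e^{-x/2}(1 + x/2)$, so the 5% point can be found by bisection. This is a sketch only; the bracketing interval and iteration count are arbitrary choices.

```python
import math

def chi2_4_upper_tail(x):
    """pr{X > x} for X ~ chi-squared with 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)

# Bisection: the tail probability decreases from 1 (at x = 0) towards 0.
lo, hi = 0.0, 50.0
for _ in range(60):
    mid = (lo + hi) / 2
    if chi2_4_upper_tail(mid) > 0.05:
        lo = mid      # tail still too large, so the 5% point lies to the right
    else:
        hi = mid
x_crit = (lo + hi) / 2

print(f"5% point of chi-squared(4) is approximately {x_crit:.3f}")
```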

39 Let $X$ denote the outcome of the die. We test whether $\mathrm{pr}\{X = i\} = 1/6$ for all $i$. Expected frequency for any outcome would then be $100 \times \tfrac{1}{6} = 16.667$.

Outcome i                 1        2        3        4        5        6
Observed frequency O_i    16       15       16       15       15       23
Expected frequency E_i    16.67    16.67    16.67    16.67    16.67    16.67
(O_i − E_i)²/E_i          0.0267   0.1667   0.0267   0.1667   0.1667   2.407     sum = 2.960



I live 55 miles commuting distance from the University. Over 35 car journeys I count the number 

X of road accidents observed and obtain the following data. 

Number of accidents observed per journey X 0 1 2 

Observed frequency 28 5 2 

Test at the 5% level whether a Poisson distribution gives a good fit to the data. Why is the Poisson distribution a suitable model for these data?

Answer: 40 


Two surveys were conducted about a certain product and the following results obtained. 

            Like   OK   Dislike   Total
Survey A    44     23   33        100
Survey B    30     20   30        80
Total       74     43   63        180

Test whether the like/OK/dislike population proportions for the two surveys are equal. 

Answer: 41 

Number of cells is 6; number of estimated parameters is 0; number of constraints on expected frequencies is 1. Number of degrees of freedom is $k = 6 - 0 - 1 = 5$. Test statistic is $\chi^2_{\mathrm{obs}} = 2.960$. Reject $H_0$ if $\chi^2_{\mathrm{obs}} > \chi^2_5(5\%)$. As $\chi^2_5(5\%) = 11.071$, we accept the null hypothesis that the die is fair.
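The goodness-of-fit calculation for the die (footnote 39) can be sketched in a few lines. The critical value 11.071 is copied from the text rather than computed, and the variable names are illustrative.

```python
# Observed frequencies for outcomes 1..6 from 100 rolls.
observed = [16, 15, 16, 15, 15, 23]
n = sum(observed)

# Under H0 (fair die) every outcome has expected frequency n/6.
expected = [n / 6] * len(observed)

# Pearson chi-squared statistic: sum of (O - E)^2 / E over cells.
chi2_obs = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# cells - estimated parameters - constraints = 6 - 0 - 1.
df = len(observed) - 0 - 1
critical = 11.071   # chi-squared 5% point on 5 d.f., as quoted in the text

print(f"chi2_obs = {chi2_obs:.3f} on {df} d.f.; reject H0: {chi2_obs > critical}")
```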

40 $\bar{x} = 9/35 = 0.257$. Best fitting Poisson distribution is $X \sim \mathrm{Poisson}(\mu = 0.257)$. Fitted probabilities are $\mathrm{pr}\{X = x\} = \dfrac{\mu^x e^{-\mu}}{x!}$ for $x = 0, 1, 2, \ldots$. Fitted frequencies are $E_x = 35 \times \mathrm{pr}\{X = x\}$ for $x = 0, 1, 2, \ldots$.

Number of accidents X     0        1        ≥ 2
Observed frequency O_i    28       5        2
Expected frequency E_i    27.06    6.959    0.977 (by difference)
(O_i − E_i)²/E_i          0.0324   0.5516   1.0723    sum = 1.656

Number of cells is 3; number of estimated parameters is 1; number of constraints on expected frequencies is 1. Number of degrees of freedom is $k = 3 - 1 - 1 = 1$. Test statistic is $\chi^2_{\mathrm{obs}} = 1.656$. Reject $H_0$ if $\chi^2_{\mathrm{obs}} > \chi^2_1(5\%)$. As $\chi^2_1(5\%) = 3.841$, we accept the Poisson distribution fit.
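The Poisson fit in footnote 40 can be reproduced as follows. This is a sketch assuming, as in the table, that the last cell is "2 or more" with its expected frequency obtained by difference. The critical value 3.841 is taken from the text.

```python
import math

# Observed frequencies for 0, 1 and >= 2 accidents over 35 journeys.
observed = [28, 5, 2]
n = sum(observed)

# Sample mean estimates the Poisson parameter mu (one estimated parameter).
mu = (0 * 28 + 1 * 5 + 2 * 2) / n

# Fitted probabilities; the final cell is obtained by difference.
p0 = math.exp(-mu)
p1 = mu * math.exp(-mu)
expected = [n * p0, n * p1, n * (1 - p0 - p1)]

chi2_obs = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 1   # cells - estimated parameters - constraints
critical = 3.841             # chi-squared 5% point on 1 d.f., from the text

print("expected:", [round(e, 3) for e in expected])
print(f"chi2_obs = {chi2_obs:.3f} on {df} d.f.; reject H0: {chi2_obs > critical}")
```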

The Poisson distribution is a sensible model if we think of a Poisson process: if accidents happen randomly and independently in time, the number in one journey of an hour has a Poisson distribution.

In practice we might pool cells to ensure all expected frequencies are at least five. Also for $\chi^2$-tests with one degree of freedom a better test uses Yates’s continuity correction so that

$$\chi^2_{\mathrm{obs}} = \sum_i \frac{\left(|O_i - E_i| - \frac{1}{2}\right)^2}{E_i}.$$

41 Expected frequencies are (row total) × (column total) / (grand total). Thus:

            Like                      OK      Dislike   Total
Survey A    (100 × 74)/180 = 41.11    23.89   35.00     100
Survey B    32.89                     19.11   28.00     80
Total       74                        43      63        180

Number of degrees of freedom is k = (3 − 1)(2 − 1) = 2. 

(O_i − E_i)²/E_i    Like     OK       Dislike
Survey A            0.2030   0.0331   0.1143
Survey B            0.2538   0.0413   0.1429

Test statistic is $\chi^2_{\mathrm{obs}} = \sum_i (O_i - E_i)^2/E_i = 0.7883$. Reject $H_0$ if $\chi^2_{\mathrm{obs}} > \chi^2_2(5\%)$. As $\chi^2_2(5\%) = 5.991$, we accept the hypothesis that the two surveys have the same overall proportions for each category.

Note that the test is whether the proportion liking is the same for surveys A and B, and the proportions saying OK 

are the same for A and B, and the proportions disliking are the same for A and B. 
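The contingency-table calculation in footnote 41 can be sketched with plain lists. Expected frequencies follow the (row total) × (column total) / (grand total) rule, and the critical value 5.991 is copied from the text. The variable names are illustrative.

```python
# Observed counts: rows are Survey A and Survey B; columns are Like, OK, Dislike.
observed = [
    [44, 23, 33],
    [30, 20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell under equal population proportions.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic summed over all six cells.
chi2_obs = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(col_totals) - 1) * (len(row_totals) - 1)
critical = 5.991   # chi-squared 5% point on 2 d.f., as quoted in the text

print("expected:", [[round(e, 2) for e in row] for row in expected])
print(f"chi2_obs = {chi2_obs:.4f} on {df} d.f.; reject H0: {chi2_obs > critical}")
```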
