5.05d - OCR Spec

CAIE Further Paper 4 2020 November Q1

7 marks Moderate -0.3

1 Kayla is investigating the lengths of the leaves of a certain type of tree found in two forests $X$ and $Y$. She chooses a random sample of 40 leaves of this type from forest $X$ and records their lengths, $x \mathrm {~cm}$. She also records the lengths, $y \mathrm {~cm}$, for a random sample of 60 leaves of this type from forest $Y$. Her results are summarised as follows. $$\sum x = 242.0 \quad \sum x ^ { 2 } = 1587.0 \quad \sum y = 373.2 \quad \sum y ^ { 2 } = 2532.6$$ Find a $90 \%$ confidence interval for the difference between the population mean lengths of leaves in forests $X$ and $Y$.
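
One way to check the arithmetic for this question is the short Python sketch below. It assumes the samples of 40 and 60 are large enough for a normal (z) interval with separate, unpooled variance estimates. The helper name `diff_ci_from_sums` is hypothetical and this is not the official mark scheme.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci_from_sums(n_x, sum_x, sum_x2, n_y, sum_y, sum_y2, level):
    """Large-sample z interval for mu_X - mu_Y from summary sums (hypothetical helper)."""
    mean_x, mean_y = sum_x / n_x, sum_y / n_y
    # Unbiased variance estimates: (sum of squares - (sum)^2 / n) / (n - 1)
    var_x = (sum_x2 - sum_x ** 2 / n_x) / (n_x - 1)
    var_y = (sum_y2 - sum_y ** 2 / n_y) / (n_y - 1)
    # Standard error of the difference of two independent sample means
    se = sqrt(var_x / n_x + var_y / n_y)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    diff = mean_x - mean_y
    return diff - z * se, diff + z * se


low, high = diff_ci_from_sums(40, 242.0, 1587.0, 60, 373.2, 2532.6, 0.90)
print(f"90% CI for mu_X - mu_Y: ({low:.3f}, {high:.3f})")
```

With the figures above this gives an interval of roughly (-0.78, 0.44) cm, which contains zero.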

CAIE Further Paper 4 2020 November Q6

12 marks Challenging +1.2

6 Nassa is researching the lengths of a particular type of snake in two countries, $A$ and $B$.

He takes a random sample of 10 snakes of this type from country $A$ and measures the length, $x \mathrm {~m}$, of each snake. He then calculates a $90 \%$ confidence interval for the population mean length, $\mu \mathrm { m }$, for snakes of this type, assuming that snake lengths have a normal distribution. This confidence interval is $3.36 \leqslant \mu \leqslant 4.22$. Find the sample mean and an unbiased estimate for the population variance.
Nassa also measures the lengths, $y \mathrm {~m}$, of a random sample of 8 snakes of this type taken from country $B$. His results are summarised as follows. $$\sum y = 27.86 \quad \sum y ^ { 2 } = 98.02$$ Nassa claims that the mean length of snakes of this type in country $B$ is less than the mean length of snakes of this type in country $A$. Nassa assumes that his sample from country $B$ also comes from a normal distribution, with the same variance as the distribution from country $A$. Test at the $10 \%$ significance level whether there is evidence to support Nassa's claim.
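
A minimal sketch of both parts, assuming the interval for country $A$ was built from a $t$-distribution with 9 degrees of freedom and that the test uses a pooled variance with 16 degrees of freedom. The critical values are copied from standard $t$ tables rather than computed, and all variable names are illustrative.

```python
from math import sqrt

# Critical values copied from standard t tables (assumed, not computed here).
T9_UPPER_5PCT = 1.833    # 9 df, used for the 90% two-sided interval
T16_UPPER_10PCT = 1.337  # 16 df, used for the one-tailed 10% test

# Part 1: undo the interval 3.36 <= mu <= 4.22 for n = 10.
n_a = 10
lower, upper = 3.36, 4.22
mean_a = (lower + upper) / 2               # the interval is centred on the sample mean
half_width = (upper - lower) / 2           # equals t * s / sqrt(n)
var_a = (half_width * sqrt(n_a) / T9_UPPER_5PCT) ** 2  # unbiased estimate s^2

# Part 2: summary statistics for country B.
n_b, sum_y, sum_y2 = 8, 27.86, 98.02
mean_b = sum_y / n_b
syy = sum_y2 - sum_y ** 2 / n_b            # sum of squared deviations for B

# Pooled variance under the stated equal-variance assumption.
pooled_var = ((n_a - 1) * var_a + syy) / (n_a + n_b - 2)
t_stat = (mean_b - mean_a) / sqrt(pooled_var * (1 / n_a + 1 / n_b))

# H0: mu_B = mu_A against H1: mu_B < mu_A, so reject for large negative t.
reject_h0 = t_stat < -T16_UPPER_10PCT
print(f"mean_a={mean_a:.2f}, var_a={var_a:.3f}, t={t_stat:.3f}, reject={reject_h0}")
```

Under these assumptions the sample mean is 3.79, the variance estimate is about 0.550, and the test statistic is about -1.06, which does not reach the critical value of -1.337.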

CAIE Further Paper 4 2021 November Q1

7 marks Standard +0.3

1 The times taken for students at a college to run 200 m have a normal distribution with mean $\mu \mathrm { s }$. The times, $x$ s, are recorded for a random sample of 10 students from the college. The results are summarised as follows, where $\bar { x }$ is the sample mean. $$\bar { x } = 25.6 \quad \sum ( x - \bar { x } ) ^ { 2 } = 78.5$$

Find a 90\% confidence interval for $\mu$.
A test of the null hypothesis $\mu = k$ is carried out on this sample, using a $10 \%$ significance level. The test does not support the alternative hypothesis $\mu < k$.
Find the greatest possible value of $k$.
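
The two parts can be sketched in Python as follows. The critical values for 9 degrees of freedom are taken from standard $t$ tables (an assumption of this sketch, not something computed here), and the variable names are illustrative.

```python
from math import sqrt

T9_UPPER_5PCT = 1.833   # 9 df, 90% two-sided interval (from t tables)
T9_UPPER_10PCT = 1.383  # 9 df, 10% one-tailed test (from t tables)

n, mean, sxx = 10, 25.6, 78.5
# Unbiased variance estimate is sxx / (n - 1); dividing by n gives the variance of the mean.
se = sqrt(sxx / (n - 1) / n)

ci = (mean - T9_UPPER_5PCT * se, mean + T9_UPPER_5PCT * se)

# H1: mu < k is not supported when (mean - k) / se >= -t_crit,
# which rearranges to k <= mean + t_crit * se.
k_max = mean + T9_UPPER_10PCT * se

print(f"90% CI: ({ci[0]:.2f}, {ci[1]:.2f}), greatest k: {k_max:.2f}")
```

This gives an interval of about (23.89, 27.31) and a greatest value of $k$ of about 26.9.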

CAIE Further Paper 4 2021 November Q1

7 marks Standard +0.3

1 The number, $x$, of pine trees was counted in each of 40 randomly chosen regions of equal size in country $A$. The number, $y$, of pine trees was counted in each of 60 randomly chosen regions of the same equal size in country $B$. The results are summarised as follows. $$\sum x = 752 \quad \sum x ^ { 2 } = 14320 \quad \sum y = 1548 \quad \sum y ^ { 2 } = 40200$$ Find a 95\% confidence interval for the difference between the mean number of pine trees in regions of this size in countries $A$ and $B$.

CAIE Further Paper 4 2021 November Q6

10 marks Standard +0.8

6 A scientist is investigating the masses of a particular type of fish found in lakes $A$ and $B$. He chooses a random sample of 10 fish of this type from lake $A$ and records their masses, $x \mathrm {~kg}$, as follows.
$$\begin{array} { l l l l l l l l l l } 0.9 & 1.8 & 1.8 & 1.9 & 2.1 & 2.4 & 2.6 & 2.2 & 2.5 & 3.0 \end{array}$$
The scientist also chooses a random sample of 12 fish of this type from lake $B$, but he only has a summary of their masses, $y \mathrm {~kg}$, as follows. $$\sum y = 24.48 \quad \sum y ^ { 2 } = 53.75$$ Test at the $10 \%$ significance level whether the mean mass of fish of this type in lake $A$ is greater than the mean mass of fish of this type in lake $B$. You should state any assumptions that you need to make for the test to be valid.

CAIE Further Paper 4 2022 November Q1

7 marks Standard +0.3

1 Jasmine is researching the heights of pine trees in forests in two regions $A$ and $B$. She chooses a random sample of 50 pine trees in region $A$ and records their heights, $x \mathrm {~m}$. She also chooses a random sample of 60 pine trees in region $B$ and records their heights, $y \mathrm {~m}$. Her results are summarised as follows. $$\sum x = 1625 \quad \sum x ^ { 2 } = 53200 \quad \sum y = 1854 \quad \sum y ^ { 2 } = 57900$$ Find a $95 \%$ confidence interval for the difference between the population mean heights of pine trees in regions $A$ and $B$.

CAIE Further Paper 4 2022 November Q1

6 marks Challenging +1.2

1 A basketball club has a large number of players. The heights, $x \mathrm {~m}$, of a random sample of 10 of these players are measured. A $90 \%$ confidence interval for the population mean height, $\mu \mathrm { m }$, of players in this club is calculated. It is assumed that heights are normally distributed. The confidence interval is $1.78 \leqslant \mu \leqslant 2.02$. Find the values of $\sum x$ and $\sum x ^ { 2 }$ for this sample.

CAIE Further Paper 4 2023 November Q1

6 marks Standard +0.3

1 A factory produces small bottles of natural spring water. Two different machines, $X$ and $Y$, are used to fill empty bottles with the water. A quality control engineer checks the volumes of water in the bottles filled by each of the machines. He chooses a random sample of 60 bottles filled by machine $X$ and a random sample of 75 bottles filled by machine $Y$. The volumes of water, $x$ and $y$ respectively, in millilitres, are summarised as follows. $$\sum x = 6345 \quad \sum ( x - \bar { x } ) ^ { 2 } = 243.8 \quad \sum y = 7614 \quad \sum ( y - \bar { y } ) ^ { 2 } = 384.9$$ $\bar { x }$ and $\bar { y }$ are the sample means of the volume of water in the bottles filled by machines $X$ and $Y$ respectively. Find a $95 \%$ confidence interval for the difference between the mean volume of water in bottles filled by machine $X$ and the mean volume of water in bottles filled by machine $Y$.

CAIE Further Paper 4 2024 November Q1

4 marks Standard +0.3

1 A scientist is investigating the lengths of the leaves of a certain type of plant. The scientist assumes that the lengths of the leaves of this type of plant are normally distributed. He measures the lengths, $x \mathrm {~cm}$, of the leaves of a random sample of 8 plants of this type. His results are as follows. $\begin{array} { l l l l l l l l } 3.5 & 4.2 & 3.8 & 5.2 & 2.9 & 3.7 & 4.1 & 3.2 \end{array}$ Find a $90 \%$ confidence interval for the population mean length of leaves of this type of plant.

OCR S2 2007 June Q8

13 marks Standard +0.3

8 A random variable $Y$ is normally distributed with mean $\mu$ and variance 12.25. Two statisticians carry out significance tests of the hypotheses $\mathrm { H } _ { 0 } : \mu = 63.0 , \mathrm { H } _ { 1 } : \mu > 63.0$.

Statistician $A$ uses the mean $\bar { Y }$ of a sample of size 23, and the critical region for his test is $\bar { Y } > 64.20$. Find the significance level for $A$'s test.
Statistician $B$ uses the mean of a sample of size 50 and a significance level of $5 \%$.
1. Find the critical region for $B$'s test.
2. Given that $\mu = 65.0$, find the probability that $B$'s test results in a Type II error.
3. Given that, when $\mu = 65.0$, the probability that $A$'s test results in a Type II error is 0.1365, state with a reason which test is better.
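
The significance level and the Type II error probability can be checked with Python's standard-library `NormalDist`. This is a sketch under the question's own assumptions (known variance 12.25, so the sample mean is exactly normal); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma = sqrt(12.25)          # population standard deviation, 3.5
mu0, mu1 = 63.0, 65.0        # mean under H0 and the alternative value considered
std_normal = NormalDist()

# Statistician A: n = 23, rejects H0 when the sample mean exceeds 64.20.
se_a = sigma / sqrt(23)
alpha_a = 1 - std_normal.cdf((64.20 - mu0) / se_a)   # P(reject H0 | H0 true)

# Statistician B: n = 50, 5% one-tailed test.
se_b = sigma / sqrt(50)
crit_b = mu0 + std_normal.inv_cdf(0.95) * se_b       # reject when mean > crit_b
# Type II error: the sample mean falls below crit_b although mu = 65.0.
beta_b = NormalDist(mu1, se_b).cdf(crit_b)

print(f"alpha_A={alpha_a:.4f}, B rejects for mean > {crit_b:.2f}, beta_B={beta_b:.4f}")
```

The sketch gives a significance level of about 5% for $A$, a critical region of roughly $\bar { Y } > 63.81$ for $B$, and a Type II error probability of about 0.008 for $B$, far below the 0.1365 quoted for $A$ at the same significance level.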

OCR S3 2006 January Q1

6 marks Moderate -0.3

1 In order to judge the support for a new method of collecting household waste, a city council arranged a survey of 400 householders selected at random. The results showed that 186 householders were in favour of the new method.

Calculate a 95\% confidence interval for the proportion of all householders who are in favour of the new method. A city councillor said he believed that as many householders were in favour of the new method as were against it.
Comment on the councillor's belief.

OCR S3 2006 January Q6

13 marks Standard +0.3

6 A company with a large fleet of cars compared two types of tyres, $A$ and $B$. They measured the stopping distances of cars when travelling at a fixed speed on a dry road. They selected 20 cars at random from the fleet and divided them randomly into two groups of 10, one group being fitted with tyres of type $A$ and the other group with tyres of type $B$. One of the cars fitted with tyres of type $A$ broke down so these tyres were tested on only 9 cars. The stopping distances, $x$ metres, for the two samples are summarised by $$n _ { A } = 9 , \quad \bar { x } _ { A } = 17.30 , \quad s _ { A } ^ { 2 } = 0.7400 , \quad n _ { B } = 10 , \quad \bar { x } _ { B } = 14.74 , \quad s _ { B } ^ { 2 } = 0.8160 ,$$ where $s _ { A } ^ { 2 }$ and $s _ { B } ^ { 2 }$ are unbiased estimates of the two population variances.
It is given that the two populations have the same variance.

Show that an unbiased estimate of this variance is 0.780, correct to 3 decimal places. The population mean stopping distances for cars with tyres of types $A$ and $B$ are denoted by $\mu _ { A }$ metres and $\mu _ { B }$ metres respectively.
Stating any further assumption you need to make, calculate a $98 \%$ confidence interval for $\mu _ { A } - \mu _ { B }$. The manufacturers of Type $B$ tyres assert that $\mu _ { B } < \mu _ { A } - 2$.
Carry out a significance test of this assertion at the $5 \%$ significance level.
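
As a worked check of the first two parts, the pooled estimate weights each unbiased estimate by its degrees of freedom. The symbol $s_p^2$ is introduced here for illustration and does not appear in the question.

$$s_p^2 = \frac{(n_A - 1) s_A^2 + (n_B - 1) s_B^2}{n_A + n_B - 2} = \frac{8 \times 0.7400 + 9 \times 0.8160}{17} = \frac{13.264}{17} \approx 0.780$$

Assuming both populations are normal, and taking the upper 1% point of $t$ with 17 degrees of freedom as 2.567 from standard tables, the $98 \%$ interval is approximately

$$(\bar{x}_A - \bar{x}_B) \pm 2.567 \sqrt{s_p^2 \left( \frac{1}{9} + \frac{1}{10} \right)} = 2.56 \pm 2.567 \times 0.4059 \approx 2.56 \pm 1.04$$

that is, roughly $1.52 \leqslant \mu_A - \mu_B \leqslant 3.60$.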

OCR S3 2007 January Q4

10 marks Standard +0.3

4 A machine is set to produce metal discs with mean diameter 15.4 mm. In order to test the correctness of the setting, a random sample of 12 discs was selected and the diameters, $x \mathrm {~mm}$, were measured. The results are summarised by $\Sigma x = 177.6$ and $\Sigma x ^ { 2 } = 2640.40$. Diameters may be assumed to be normally distributed with mean $\mu \mathrm { mm }$.

Find a $95 \%$ confidence interval for $\mu$.
Test, at the $5 \%$ significance level, the null hypothesis $\mu = 15.4$ against the alternative hypothesis $\mu < 15.4$.

OCR S3 2007 January Q5

11 marks Standard +0.3

5 Each person in a random sample of 1200 people was asked whether he or she approved of certain proposals to reduce atmospheric pollution. It was found that 978 people approved. The proportion of people in the whole population who would approve is denoted by $p$.

Write down an estimate $\hat { p }$ of $p$.
Find a 90\% confidence interval for $p$.
Explain, in the context of the question, the meaning of a $90 \%$ confidence interval.
Estimate the sample size that would give a value for $\hat { p }$ that differs from the value of $p$ by less than 0.01 with probability $90 \%$.
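
The interval and the sample-size estimate can be sketched in Python. The sketch assumes the usual normal approximation to the sample proportion and, for the sample size, replaces the unknown $p$ by $\hat { p }$; variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

approved, n = 978, 1200
p_hat = approved / n                      # point estimate of p
z = NormalDist().inv_cdf(0.95)            # two-sided 90% critical value

# Approximate 90% confidence interval for p.
se = sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - z * se, p_hat + z * se)

# Sample size m such that z * sqrt(p(1-p)/m) < 0.01, using p_hat in place of p.
margin = 0.01
m_required = ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)

print(f"p_hat={p_hat:.3f}, CI=({ci[0]:.4f}, {ci[1]:.4f}), sample size about {m_required}")
```

This gives $\hat { p } = 0.815$, an interval of about (0.797, 0.833), and a required sample size of approximately 4080. The last figure is only an estimate because it depends on $\hat { p }$ standing in for $p$.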

OCR S3 2008 January Q1

6 marks Standard +0.3

1 A blueberry farmer increased the amount of water sprayed over his berries to see what effect this had on their weight. The farmer weighed each of a random sample of 80 berries of the previous season's crop and each of a random sample of 100 berries of the new crop. The results are summarised in the following table, in which $\bar { x }$ denotes the sample mean weight in grams, and $s ^ { 2 }$ denotes an unbiased estimate of the relevant population variance.

	Sample size	$\bar { x }$	$s ^ { 2 }$
Previous season's crop $( P )$	80	1.24	0.00356
New crop $( N )$	100	1.36	0.00340

Calculate an estimate of $\operatorname { Var } \left( \bar { X } _ { N } - \bar { X } _ { P } \right)$.
Calculate a $95 \%$ confidence interval for the difference in population mean weights.
Give a reason why it is unnecessary to use a $t$-distribution in calculating the confidence interval.

OCR S3 2008 January Q2

8 marks Standard +0.3

2 The times taken for customers' phone complaints to be handled were monitored regularly by a company. During a particular week a researcher checked a random sample of 20 complaints and the times, $x$ minutes, taken to handle the complaints are summarised by $\Sigma x = 337.5$. Handling times may be assumed to have a normal distribution with mean $\mu$ minutes and standard deviation 3.8 minutes.

Calculate a $98 \%$ confidence interval for $\mu$. During the same week two other researchers each calculated a $98 \%$ confidence interval for $\mu$ based on independent samples.
Calculate the probability that at least one of the three intervals does not contain $\mu$.
State two ways in which the calculation in part (i) would differ if the standard deviation were unknown.
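
A short sketch of the first two parts, assuming the known standard deviation of 3.8 minutes and that the three intervals are independent, as the question states. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sum_x, sigma = 20, 337.5, 3.8
mean = sum_x / n

# Known sigma, so the interval uses the normal distribution rather than t.
z = NormalDist().inv_cdf(0.99)            # two-sided 98% critical value
half_width = z * sigma / sqrt(n)
ci = (mean - half_width, mean + half_width)

# Each independent 98% interval contains mu with probability 0.98,
# so all three contain mu with probability 0.98 ** 3.
p_at_least_one_miss = 1 - 0.98 ** 3

print(f"98% CI: ({ci[0]:.2f}, {ci[1]:.2f}), P(at least one miss)={p_at_least_one_miss:.4f}")
```

The interval is about (14.90, 18.85) minutes and the probability that at least one interval misses $\mu$ is about 0.0588.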

OCR S3 2008 January Q5

11 marks Standard +0.3

5 Of two brands of lawnmower, $A$ and $B$, brand $A$ was claimed to take less time, on average, than brand $B$ to mow similar stretches of lawn. In order to test this claim, 9 randomly selected gardeners were each given the task of mowing two regions of lawn, one with each brand of mower. All the regions had the same size and shape and had grass of the same height. The times taken, in seconds, are given in the table.

Gardener	1	2	3	4	5	6	7	8	9
Brand $A$	412	386	389	401	396	394	397	411	391
Brand $B$	422	394	385	408	394	399	397	410	397

Test the claim using a paired-sample $t$-test at the $5 \%$ significance level. State a distributional assumption required for the test to be valid.
Give a reason why a paired-sample $t$-test should be used, rather than a 2-sample $t$-test, in this case.
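
The paired test can be sketched in Python from the table above. It assumes the differences (brand $A$ minus brand $B$) are normally distributed and uses the upper 5% point of $t$ with 8 degrees of freedom, 1.860, copied from standard tables. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

brand_a = [412, 386, 389, 401, 396, 394, 397, 411, 391]
brand_b = [422, 394, 385, 408, 394, 399, 397, 410, 397]

# Pairing by gardener removes gardener-to-gardener variation.
diffs = [a - b for a, b in zip(brand_a, brand_b)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)                         # sample standard deviation (n - 1 divisor)
t_stat = d_bar / (s_d / sqrt(n))

T8_UPPER_5PCT = 1.860                      # from t tables, 8 degrees of freedom
# H0: mean difference = 0, H1: mean difference < 0 (brand A is quicker).
reject_h0 = t_stat < -T8_UPPER_5PCT

print(f"mean diff={d_bar:.3f}, t={t_stat:.3f}, reject H0: {reject_h0}")
```

The statistic comes out at about -1.93, just inside the critical region, so on these assumptions the claim is supported at the $5 \%$ level.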

OCR S3 2011 January Q1

5 marks Moderate -0.8

1 A random variable has a normal distribution with unknown mean $\mu$ and known standard deviation 0.19. In order to estimate $\mu$ a random sample of five observations of the random variable was taken. The values were as follows. $$\begin{array} { l l l l l } 5.44 & 4.93 & 5.12 & 5.36 & 5.40 \end{array}$$ Using these five values, calculate,

an estimate of $\mu$,
a 95\% confidence interval for $\mu$.

OCR S3 2011 January Q5

9 marks Moderate -0.3

5 An experiment with hybrid corn resulted in yellow kernels and purple kernels. Of a random sample of 90 kernels, 18 were yellow and 72 were purple.

Calculate an approximate $90 \%$ confidence interval for the proportion of yellow kernels produced in all such experiments.
Deduce an approximate $90 \%$ confidence interval for the proportion of purple kernels produced in all such experiments.
Explain what is meant by a $90 \%$ confidence interval for a population proportion.
Mendel's theory of inheritance predicts that $25 \%$ of all such kernels will be yellow. State, giving a reason, whether or not your calculations support the theory.

OCR S3 2011 January Q8

16 marks Standard +0.3

8 State circumstances under which it would be necessary to calculate a pooled estimate of variance when carrying out a two-sample hypothesis test.
An investigation into whether passive smoking affects lung capacity considered a random sample of 20 children whose parents did not smoke and a random sample of 22 children whose parents did smoke. None of the children themselves smoked. The lung capacity, in litres, of each child was measured and the results are summarised as follows. For the children whose parents did not smoke: $n _ { 1 } = 20 , \Sigma x _ { 1 } = 42.4$ and $\Sigma x _ { 1 } ^ { 2 } = 90.43$.
For the children whose parents did smoke: $\quad n _ { 2 } = 22 , \Sigma x _ { 2 } = 42.5$ and $\Sigma x _ { 2 } ^ { 2 } = 82.93$.
The means of the two populations are denoted by $\mu _ { 1 }$ and $\mu _ { 2 }$ respectively.
1. State conditions for which a $t$-test would be appropriate for testing whether $\mu _ { 1 }$ exceeds $\mu _ { 2 }$.
2. Assuming the conditions are valid, carry out the test at the $1 \%$ significance level and comment on the result.
3. Calculate a 99\% confidence interval for $\mu _ { 1 } - \mu _ { 2 }$.

OCR S3 2007 June Q6

12 marks Standard +0.3

6 Random samples of 200 'Alpha' and 150 'Beta' vacuum cleaners were monitored for reliability. It was found that 62 Alpha and 35 Beta cleaners required repair during the guarantee period of one year. The proportions of all Alpha and Beta cleaners that require repair during the guarantee period are $p _ { \alpha }$ and $p _ { \beta }$ respectively.

Find a $95 \%$ confidence interval for $p _ { \alpha }$.
Give a reason why, apart from rounding, the interval is approximate.
Test, at the $5 \%$ significance level, whether $p _ { \alpha }$ differs from $p _ { \beta }$.

OCR S3 2007 June Q8

14 marks Standard +0.3

8 The continuous random variable $Y$ has a distribution with mean $\mu$ and variance 20. A random sample of 50 observations of $Y$ is selected and these observations are summarised in the following grouped frequency table.

Values	$y < 20$	$20 \leqslant y < 25$	$25 \leqslant y < 30$	$y \geqslant 30$
Frequency	3	27	12	8

Assuming that $Y \sim \mathrm {~N} ( 25,20 )$, show that the expected frequency for the interval $20 \leqslant y < 25$ is 18.41, correct to 2 decimal places, and obtain the remaining expected frequencies.
Test, at the $5 \%$ significance level, whether the distribution $\mathrm { N } ( 25,20 )$ fits the data.
Given that the sample mean is 24.91, find a $98 \%$ confidence interval for $\mu$.
Does the outcome of the test in part (ii) affect the validity of the confidence interval found in part (iii)? Justify your answer.
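
The expected frequencies, the goodness-of-fit statistic and the interval can be sketched in Python. The sketch assumes 3 degrees of freedom (four cells, no parameters estimated from the data) and takes the 5% critical value 7.815 from standard chi-squared tables. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

model = NormalDist(25, sqrt(20))           # hypothesised N(25, 20)
n = 50
observed = [3, 27, 12, 8]
edges = [20, 25, 30]                       # class boundaries from the table

# Cumulative probabilities at the boundaries, padded with 0 and 1 for the open-ended cells.
cum = [0.0] + [model.cdf(e) for e in edges] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cum, cum[1:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
CHI2_3DF_5PCT = 7.815                      # from chi-squared tables
reject_h0 = chi2 > CHI2_3DF_5PCT

# 98% interval for mu using the known variance 20 and sample mean 24.91.
z = NormalDist().inv_cdf(0.99)
half_width = z * sqrt(20 / n)
ci = (24.91 - half_width, 24.91 + half_width)

print([round(e, 2) for e in expected], round(chi2, 2), reject_h0)
print(f"98% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The expected frequencies are about 6.59, 18.41, 18.41 and 6.59, the statistic is about 8.5, which exceeds 7.815, and the interval is roughly (23.44, 26.38).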

OCR S3 2011 June Q2

8 marks Standard +0.3

2 The population proportion of all men with red-green colour blindness is denoted by $p$. Each of a random sample of 80 men was tested and it was found that 6 had red-green colour blindness.

Calculate an approximate $95 \%$ confidence interval for $p$.
For a different random sample of men, the proportion with red-green colour blindness is denoted by $p _ { s }$. Estimate the sample size required in order that $\left| p _ { s } - p \right| \leqslant 0.05$ with probability $95 \%$.
Give one reason why the calculated sample size is an estimate.

OCR S3 2011 June Q6

13 marks Standard +0.3

6 The Body Mass Index (BMI) of each of a random sample of 100 army recruits from a large intake in 2008 was measured. The results are summarised by $$\Sigma x = 2605.0 , \quad \Sigma x ^ { 2 } = 68636.41 .$$ It may be assumed that BMI has a normal distribution.

Find a 98\% confidence interval for the mean BMI of all recruits in 2008.
Estimate the percentage of the intake with a BMI greater than 30.0.
The BMIs of two randomly chosen recruits are denoted by $\boldsymbol { B } _ { 1 }$ and $\boldsymbol { B } _ { 2 }$. Estimate $\mathrm { P } \left( \boldsymbol { B } _ { 1 } - \boldsymbol { B } _ { 2 } < 5 \right)$.
State, giving a reason, for which of the above calculations the normality assumption is unnecessary.

OCR S3 Specimen Q3

8 marks Standard +0.3

3 A random sample of 80 precision-engineered cylindrical components is checked as part of a quality control process. The diameters of the cylinders should be 25.00 cm. Accurate measurements of the diameters, $x \mathrm {~cm}$, for the sample are summarised by $$\Sigma ( x - 25 ) = 0.44 , \quad \Sigma ( x - 25 ) ^ { 2 } = 0.2287 .$$

Calculate a $99 \%$ confidence interval for the population mean diameter of the components.
For the calculation in part (i) to be valid, is it necessary to assume that component diameters are normally distributed? Justify your answer.

5.05d Confidence intervals: using normal distribution
