5.05a - OCR Spec

Edexcel S3 Q7

16 marks Standard +0.3

7. A telephone company believes that, for young people, the average length of a telephone call on a land line is longer than on a mobile, due to the difference in price. The company collected data on the time, $t$ minutes, of 500 calls made by young people on mobiles and the data is summarised by $$\Sigma t = 7335 , \quad \Sigma t ^ { 2 } = 172040 .$$

Calculate unbiased estimates of the mean and variance of $t$.
For 200 calls made on land lines by the same young people, unbiased estimates of the mean and variance of the call length were 15.9 minutes and 108.5 minutes$^{2}$ respectively.
Stating your hypotheses clearly, test at the $5 \%$ level whether or not there is evidence that longer calls are made on land lines than on mobiles.
(9 marks)
Explain the importance of the central limit theorem in carrying out the test in part (b).
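As a check on the arithmetic, the following Python sketch computes the unbiased estimates from the summary statistics and then a two-sample $z$ statistic for the one-tailed test. It assumes that the large sample sizes justify a normal approximation for the difference in sample means; the variable names are illustrative and not part of the question.

```python
from math import sqrt
from statistics import NormalDist

# Summary data for the mobile calls, taken from the question.
n_mobile, sum_t, sum_t2 = 500, 7335, 172040

# Unbiased estimates of the mean and variance of t.
mean_mobile = sum_t / n_mobile
var_mobile = (sum_t2 - sum_t**2 / n_mobile) / (n_mobile - 1)

# Land-line estimates quoted in the question.
n_land, mean_land, var_land = 200, 15.9, 108.5

# H0: mu_land = mu_mobile, H1: mu_land > mu_mobile (one-tailed, 5% level).
std_error = sqrt(var_land / n_land + var_mobile / n_mobile)
z = (mean_land - mean_mobile) / std_error
critical_value = NormalDist().inv_cdf(0.95)

print(f"mean = {mean_mobile:.2f}, variance = {var_mobile:.2f}")
print(f"z = {z:.3f}, critical value = {critical_value:.3f}")
print("reject H0" if z > critical_value else "do not reject H0")
```

The test statistic is compared with the upper 5% point of the standard normal distribution because the alternative hypothesis is one-sided.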

Edexcel S4 2007 June Q5

7 marks Challenging +1.2

5. The number of tornadoes per year to hit a particular town follows a Poisson distribution with mean $\lambda$. A weatherman claims that due to climate changes the mean number of tornadoes per year has decreased. He records the number of tornadoes $x$ to hit the town last year. To test the hypotheses $\mathrm { H } _ { 0 } : \lambda = 7$ and $\mathrm { H } _ { 1 } : \lambda < 7$, a critical region of $x \leq 3$ is used.

Find, in terms of $\lambda$, the power function of this test.
Find the size of this test.
Find the probability of a Type II error when $\lambda = 4$.
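The three parts all reduce to evaluating $\mathrm{P}(X \leq 3)$ for a Poisson variable, which the sketch below does numerically. The function name `power` and its parameters are illustrative choices, not part of the question.

```python
from math import exp, factorial

def power(lam: float, c: int = 3) -> float:
    """P(X <= c) for X ~ Po(lam): the probability of landing in the critical region."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(c + 1))

# Size: probability of rejecting H0 when H0 (lambda = 7) is true.
size = power(7)

# Type II error at lambda = 4: probability of not rejecting H0 when it is false.
type_ii_at_4 = 1 - power(4)

print(f"size = {size:.4f}, P(Type II | lambda = 4) = {type_ii_at_4:.4f}")
```

Written out, the sum inside `power` is $e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2} + \frac{\lambda^3}{6}\right)$.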

Edexcel S4 2008 June Q6

12 marks Standard +0.3

A drug is claimed to produce a cure to a certain disease in $35 \%$ of people who have the disease. To test this claim a sample of 20 people having this disease is chosen at random and given the drug. If the number of people cured is between 4 and 10 inclusive the claim will be accepted. Otherwise the claim will not be accepted.
1. Write down suitable hypotheses to carry out this test.
2. Find the probability of making a Type I error.
The table below gives the value of the probability of the Type II error, to 4 decimal places, for different values of $p$ where $p$ is the probability of the drug curing a person with the disease.
P (cure)	0.2	0.3	0.4	0.5
P (Type II error)	0.5880	$r$	0.8565	$s$
Calculate the value of $r$ and the value of $s$.
Calculate the power of the test for $p = 0.2$ and $p = 0.4$
Comment, giving your reasons, on the suitability of this test procedure.
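A short Python sketch of the probabilities involved, assuming $X \sim \mathrm{B}(20, p)$ with the claim accepted when $4 \leq X \leq 10$. The helper names are hypothetical. The two Type II values printed in the table (for $p = 0.2$ and $p = 0.4$) can be used to check the code.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_accept(p: float, n: int = 20, lo: int = 4, hi: int = 10) -> float:
    """Probability that the number cured lies in the acceptance region lo..hi."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))

# Type I error: the claim p = 0.35 is true but is not accepted.
type_i = 1 - prob_accept(0.35)

# Type II error: the claim is accepted although p differs from 0.35.
type_ii = {p: prob_accept(p) for p in (0.2, 0.3, 0.4, 0.5)}

# Power is the complement of the Type II error probability.
power = {p: 1 - beta for p, beta in type_ii.items()}

print(f"P(Type I) = {type_i:.4f}")
for p, beta in type_ii.items():
    print(f"p = {p}: P(Type II) = {beta:.4f}, power = {power[p]:.4f}")
```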

Edexcel S4 2009 June Q3

12 marks Standard +0.3

Define, in terms of $\mathrm { H } _ { 0 }$ and/or $\mathrm { H } _ { 1 }$,
1. the size of a hypothesis test,
2. the power of a hypothesis test.
The probability of getting a head when a coin is tossed is denoted by $p$.
This coin is tossed 12 times in order to test the hypotheses $\mathrm { H } _ { 0 } : p = 0.5$ against $\mathrm { H } _ { 1 } : p \neq 0.5$, using a 5\% level of significance.
Find the largest critical region for this test, such that the probability in each tail is less than 2.5\%.
Given that $p = 0.4$
1. find the probability of a type II error when using this test,
2. find the power of this test.
Suggest two ways in which the power of the test can be increased.
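The critical region can be found by a direct search over the binomial tails. The sketch below does this for $X \sim \mathrm{B}(12, 0.5)$ and then evaluates the power at $p = 0.4$; the function and variable names are illustrative.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p); an empty sum (k < 0) gives 0."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

n, tail = 12, 0.025

# Largest c_lower with P(X <= c_lower) < 2.5% under H0.
c_lower = max(k for k in range(n + 1) if binom_cdf(k, n, 0.5) < tail)

# Smallest c_upper with P(X >= c_upper) < 2.5% under H0.
c_upper = min(k for k in range(n + 1) if 1 - binom_cdf(k - 1, n, 0.5) < tail)

# Power at p = 0.4 is the probability of landing in either tail.
power = binom_cdf(c_lower, n, 0.4) + 1 - binom_cdf(c_upper - 1, n, 0.4)
type_ii = 1 - power

print(f"critical region: X <= {c_lower} or X >= {c_upper}")
print(f"power = {power:.4f}, P(Type II) = {type_ii:.4f}")
```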

Edexcel S4 2010 June Q3

12 marks Standard +0.3

A manager in a sweet factory believes that the machines are working incorrectly and the proportion $p$ of underweight bags of sweets is more than $5 \%$. He decides to test this by randomly selecting a sample of 5 bags and recording the number $X$ that are underweight. The manager sets up the hypotheses $\mathrm { H } _ { 0 } : p = 0.05$ and $\mathrm { H } _ { 1 } : p > 0.05$ and rejects the null hypothesis if $x > 1$.
1. Find the size of the test.
2. Show that the power function of the test is
$$1 - ( 1 - p ) ^ { 4 } ( 1 + 4 p )$$
The manager goes on holiday and his deputy checks the production by randomly selecting a sample of 10 bags of sweets. He rejects the hypothesis that $p = 0.05$ if more than 2 underweight bags are found in the sample.
Find the probability of a Type I error using the deputy's test.
The table below gives some values, to 2 decimal places, of the power function for the deputy's test.
$p$	0.10	0.15	0.20	0.25
Power	0.07	$s$	0.32	0.47
Find the value of $s$.
The graph of the power function for the manager's test is shown in Figure 1.
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{0bc6c296-9cbe-498b-89d9-c034b1b246e4-08_1157_1436_847_260} \captionsetup{labelformat=empty} \caption{Figure 1}
\end{figure}
On the same axes, draw the graph of the power function for the deputy's test.
1. State the value of $p$ where these graphs intersect.
2. Compare the effectiveness of the two tests if $p$ is greater than this value.
The deputy suggests that they should use his sampling method rather than the manager's.
Give a reason why the manager might not agree to this change.
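Both tests reject for large binomial counts, so a single helper covers the manager's test ($n = 5$, reject if $X > 1$) and the deputy's test ($n = 10$, reject if $X > 2$). This is a sketch with hypothetical names. It also checks numerically that the binomial tail agrees with the closed form $1 - (1-p)^4(1+4p)$ quoted in the question.

```python
from math import comb

def reject_prob(p: float, n: int, c: int) -> float:
    """P(X > c) for X ~ B(n, p): the power of the test at p."""
    return 1 - sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))

def manager_closed_form(p: float) -> float:
    """Power function of the manager's test as stated in the question."""
    return 1 - (1 - p) ** 4 * (1 + 4 * p)

manager_size = reject_prob(0.05, n=5, c=1)
deputy_type_i = reject_prob(0.05, n=10, c=2)

# Deputy's power function at the tabulated values of p, to 2 decimal places.
deputy_power = {p: round(reject_prob(p, n=10, c=2), 2) for p in (0.10, 0.15, 0.20, 0.25)}

print(f"manager size = {manager_size:.4f}, deputy P(Type I) = {deputy_type_i:.4f}")
print(deputy_power)
```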

OCR MEI Further Statistics B AS 2018 June Q4

15 marks Easy -1.2

4 The random variable $X$ has a continuous uniform distribution on [ 0,10 ].

Find $\mathrm { P } ( 3 < X < 6 )$.

Find each of the following.

$\mathrm { E } ( X )$
$\operatorname { Var } ( X )$

Marisa is investigating the sample mean, $Y$, of 8 independent values of $X$. She designs a simulation shown in the spreadsheet in Fig. 4.1. Each of the 25 rows below the heading row consists of 8 values of $X$ together with the value of $Y$. All of the values in the spreadsheet have been rounded to 2 decimal places. \begin{table}[h]

1	A	B	C	D	E	F	G	H	I	J
1	$X _ { 1 }$	$X _ { 2 }$	$X _ { 3 }$	$X _ { 4 }$	$X _ { 5 }$	$X _ { 6 }$	$X _ { 7 }$	$X _ { 8 }$	$Y$
2	6.31	2.45	3.27	3.06	4.16	1.53	0.43	7.99	3.65
3	1.70	1.52	7.10	8.93	6.44	2.70	9.96	7.83	5.77
4	9.15	0.52	4.95	6.99	6.52	3.15	0.81	5.35	4.68
5	0.65	2.71	7.92	9.65	0.50	4.87	6.46	2.67	4.43
6	3.09	6.11	3.96	0.09	0.18	4.67	0.67	6.20	3.12
7	7.06	5.84	1.97	3.60	9.36	1.97	4.48	3.47	4.72
8	1.46	1.57	5.45	0.37	3.76	7.56	8.48	9.12	4.72
9	9.42	1.85	4.91	1.61	1.94	8.00	1.77	5.34	4.36
10	2.98	5.32	2.91	4.12	9.16	1.76	9.97	6.88	5.39
11	2.83	3.44	3.28	7.85	1.00	0.93	8.77	4.03	4.01
12	4.51	0.59	5.84	9.87	8.65	3.94	7.18	0.23	5.10
13	4.49	0.69	3.65	8.78	4.96	8.96	3.77	1.43	4.59
14	6.57	8.08	4.85	6.75	7.92	0.27	9.69	4.04	6.02
15	8.35	1.09	8.63	8.04	7.23	2.12	2.57	9.59	5.95
16	5.24	9.53	6.08	8.21	3.61	7.07	6.65	7.63	6.75
17	7.89	5.50	3.09	0.71	6.47	5.49	6.47	4.95	5.07
18	8.36	7.27	2.35	9.04	0.58	2.26	3.01	7.90	5.10
19	3.76	1.01	9.61	9.65	7.89	9.98	6.28	4.34	6.56
20	9.94	6.84	3.38	5.53	0.26	8.53	5.72	5.12	5.66
21	7.25	9.10	0.34	2.88	4.66	2.65	6.37	7.63	5.11
22	7.18	7.14	5.38	0.04	4.09	6.47	4.96	4.23	4.94
23	8.69	5.04	4.90	2.94	2.00	4.23	4.13	0.97	4.11
24	3.46	6.33	0.48	9.35	0.23	1.18	7.97	6.37	4.42
25	2.37	7.26	7.16	1.24	5.26	2.80	3.55	3.84	4.19
26	2.16	8.30	7.17	3.32	2.96	1.30	9.11	0.31	4.33
27

\captionsetup{labelformat=empty} \caption{Fig. 4.1}

\end{table}

Use the spreadsheet to estimate $\mathrm { P } ( 3 < Y < 6 )$.
Explain why it is not surprising that this estimated probability is substantially greater than the value which you calculated in part (i).
Marisa wonders whether, even though the sample size is only 8, use of the Central Limit Theorem will provide a good approximation to $\mathrm { P } ( 3 < Y < 6 )$.
Calculate an estimate of $\mathrm { P } ( 3 < Y < 6 )$ using the Central Limit Theorem.
A Normal probability plot of the 25 simulated values of $Y$ is shown in Fig. 4.2.
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{0c58d4d7-10e9-473a-888a-b407ec90bf08-5_800_1291_306_386} \captionsetup{labelformat=empty} \caption{Fig. 4.2}
\end{figure}
Explain what the Normal probability plot suggests about the use of the Central Limit Theorem to approximate $\mathrm { P } ( 3 < Y < 6 )$.
Marisa now decides to use a spreadsheet with 1000 rows below the heading row, rather than the 25 which she used in the initial simulation shown in Fig. 4.1. She uses a counter to count the number of values of $Y$ between 3 and 6. This value is 808.
Explain whether the value 808 supports the suggestion that the Central Limit Theorem provides a good approximation to $\mathrm { P } ( 3 < Y < 6 )$.
Marisa decides to repeat each of her two simulations many times in order to investigate how variable the probability estimates are in each case.
Explain whether you would expect there to be more, the same or less variability in the probability estimates based on 1000 rows than in the probability estimates based on 25 rows.
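The three estimates of $\mathrm{P}(3 < Y < 6)$ in this question can be placed side by side with a short Python sketch. The list `y_values` is the $Y$ column of Fig. 4.1 copied by hand, and the normal model for $Y$ is the Central Limit Theorem approximation discussed in the question, not an exact distribution.

```python
from math import sqrt
from statistics import NormalDist

# Column Y of Fig. 4.1 (25 simulated sample means).
y_values = [3.65, 5.77, 4.68, 4.43, 3.12, 4.72, 4.72, 4.36, 5.39, 4.01,
            5.10, 4.59, 6.02, 5.95, 6.75, 5.07, 5.10, 6.56, 5.66, 5.11,
            4.94, 4.11, 4.42, 4.19, 4.33]

# Estimate from the 25-row spreadsheet: proportion of Y values in (3, 6).
spreadsheet_estimate = sum(3 < y < 6 for y in y_values) / len(y_values)

# X ~ U[0, 10]: E(X) = 5 and Var(X) = (10 - 0)^2 / 12.
mean_x = (0 + 10) / 2
var_x = (10 - 0) ** 2 / 12

# CLT approximation: Y is roughly N(E(X), Var(X) / 8).
y_model = NormalDist(mean_x, sqrt(var_x / 8))
clt_estimate = y_model.cdf(6) - y_model.cdf(3)

# Estimate from the 1000-row simulation reported in the question.
large_simulation_estimate = 808 / 1000

print(spreadsheet_estimate, round(clt_estimate, 4), large_simulation_estimate)
```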

OCR MEI Further Statistics B AS 2019 June Q1

6 marks Standard +0.3

1 It is known that the red blood cell count of adults in a particular country, measured in suitable units, has mean 4.96 and variance 0.15.

Find the probability that the mean red blood cell count of a random sample of 50 adults from this country is at least 5.00.
Explain how you can find the probability in part (a) despite the fact that you do not know the distribution of red blood cell counts.

OCR MEI Further Statistics B AS 2022 June Q3

8 marks Standard +0.3

3 A local council collects domestic kitchen waste for composting. Householders place their kitchen waste in a 'compost bin' and this is emptied weekly by the council. The average weight of kitchen waste collected per household each week is known to be 3.4 kg. The council runs a campaign to try to increase the amount of kitchen waste per household which is put in the compost bin. After the campaign, a random sample of 40 households is selected and the weights in kg of kitchen waste in their compost bins are measured. A hypothesis test is carried out in order to investigate whether the campaign has been successful, using software to analyse the sample. The output from the software is shown below.

Z Test of a Mean
Null Hypothesis	$\mu = 3.4$
Alternative Hypothesis	$<$	$>$	$\neq$
Sample

Mean	3.565
s	1.05
N	40

Result

Z Test of a Mean

Mean	3.565
S	1.05
SE	0.1660
N	40
Z	0.994
p	0.160

Explain why the test is based on the Normal distribution even though the distribution of the population of amounts of kitchen waste per household is not known.
Using the output from the software, complete the test at the $5 \%$ significance level.
Show how the value of $Z$ in the software output was calculated.
Calculate the least value of the sample mean which would have resulted in the conclusion of the test in part (b) being different. You should assume that the standard error is unchanged.
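The figures in the software output can be reproduced with a few lines of Python. This sketch assumes an upper-tail alternative ($\mu > 3.4$), which matches the aim of the campaign and the printed $p$-value; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, s, mu_0 = 40, 3.565, 1.05, 3.4

# Standard error of the sample mean and the resulting test statistic.
standard_error = s / sqrt(n)
z = (sample_mean - mu_0) / standard_error

# Upper-tail p-value (assumed alternative: mu > 3.4).
p_value = 1 - NormalDist().cdf(z)

# Least sample mean that would give a significant result at the 5% level,
# keeping the standard error unchanged.
critical_value = NormalDist().inv_cdf(0.95)
least_mean = mu_0 + critical_value * standard_error

print(f"SE = {standard_error:.4f}, Z = {z:.3f}, p = {p_value:.3f}")
print(f"least significant sample mean = {least_mean:.3f}")
```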

OCR MEI Further Statistics B AS Specimen Q6

8 marks Standard +0.3

6 The table below shows the mean and variance of the test scores of a random sample of 70 girls who are starting an A level Mathematics course.

Sample mean	Sample variance
118.86	86.57

Showing your working, find a $95 \%$ confidence interval for the population mean.
Explain why you can construct the interval in part (i) despite no information about the distribution of the parent population being given.
The same random sample of girls repeats the test. The mean improvement in score is 0.9 . The $95 \%$ confidence interval for the improvement is $[ - 1.5,3.3 ]$. What is the sample variance for the improvement in score?
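A sketch of the interval calculation, assuming the sample size of 70 is large enough for a normal-based interval. The last step works backwards from the half-width of the interval $[-1.5, 3.3]$ to the sample variance of the improvements. The names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sample_var = 70, 118.86, 86.57

# Two-sided 95% interval uses the 97.5% point of N(0, 1).
z = NormalDist().inv_cdf(0.975)
half_width = z * sqrt(sample_var / n)
lower, upper = sample_mean - half_width, sample_mean + half_width

# Improvement interval [-1.5, 3.3]: half-width = z * s / sqrt(n), solved for s^2.
improvement_half_width = (3.3 - (-1.5)) / 2
var_improvement = (improvement_half_width * sqrt(n) / z) ** 2

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print(f"sample variance of improvement = {var_improvement:.1f}")
```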

OCR MEI Further Statistics Major 2019 June Q4

7 marks Moderate -0.3

4 Shellfish in the sea near nuclear power stations are regularly monitored for levels of radioactivity. On a particular occasion, the levels of caesium-137 (a radioactive isotope) in a random sample of 8 cockles, measured in becquerels per kilogram, were as follows.
$\begin{array} { l l l l l l l l } 2.36 & 2.97 & 2.69 & 3.00 & 2.51 & 2.45 & 2.21 & 2.63 \end{array}$
Software is used to produce a 95\% confidence interval for the level of caesium-137 in the cockles. The output from the software is shown in Fig. 4. The value for 'SE' has been deliberately omitted.
T Estimate of a Mean
Confidence Level	0.95
Sample
Mean	2.6025
s	0.2793
N	8
Result
T Estimate of a Mean
\begin{table}[h]

Mean	2.6025
s	0.2793
SE
N	8
df	7
Interval	$2.6025 \pm 0.2335$

\captionsetup{labelformat=empty} \caption{Fig. 4}

\end{table}

State an assumption necessary for the use of the $t$ distribution in the construction of this confidence interval.
State the confidence interval which the software gives in the form $a < \mu < b$.
In the software output shown in Fig. 4, SE stands for standard error. Find the standard error in this case.
Show how the value of 0.2335 in the confidence interval was calculated.
State how, using this sample, a wider confidence interval could be produced.
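The software output can be reproduced from the eight data values with Python's `statistics` module. The critical value `t_crit = 2.365` is an assumed table value for the upper 2.5% point of the $t$ distribution with 7 degrees of freedom; it is hard-coded because the standard library has no $t$ quantile function.

```python
from math import sqrt
from statistics import mean, stdev

levels = [2.36, 2.97, 2.69, 3.00, 2.51, 2.45, 2.21, 2.63]
n = len(levels)

sample_mean = mean(levels)
s = stdev(levels)  # sample standard deviation (divisor n - 1)

# Standard error of the mean: the value omitted from Fig. 4.
standard_error = s / sqrt(n)

# Assumed table value: t with 7 degrees of freedom, upper 2.5% point.
t_crit = 2.365
half_width = t_crit * standard_error

print(f"mean = {sample_mean:.4f}, s = {s:.4f}, SE = {standard_error:.4f}")
print(f"interval: {sample_mean:.4f} +/- {half_width:.4f}")
```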

OCR MEI Further Statistics Major 2023 June Q4

11 marks Standard +0.3

4 A machine manufactures batches of 100 titanium sheets. The thickness of every sheet in a batch is Normally distributed with mean $\mu$ mm and standard deviation 0.03 mm. You should assume that each sheet is of uniform thickness and that the thicknesses of different sheets are independent of each other. The values of $\mu$ for three different batches, A, B and C, are 3.125, 3.117 and 3.109 respectively.

Determine the probability that the total thickness of 10 sheets from Batch A is less than 31.0 mm.
Determine the probability that, if a single sheet from Batch A is cut into pieces and 10 of the pieces are stacked together, the total thickness of the stack is less than 31.0 mm.
Determine the probability that, if one sheet from each of Batches A, B and C are stacked together, the total thickness of the stack is at least 9.4 mm.
Determine the probability that the total thickness of 10 sheets from Batch A is less than the total thickness of 10 sheets from Batch B.
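The four parts differ only in how the variances combine, which the sketch below makes explicit. It assumes that ten pieces cut from one sheet all share that sheet's thickness, so their total is $10X$ and not a sum of independent values; this is an interpretation of the "uniform thickness" statement. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma = 0.03
mu_a, mu_b, mu_c = 3.125, 3.117, 3.109

# Ten independent sheets from A: variance 10 * sigma^2.
p_ten_sheets = NormalDist(10 * mu_a, sigma * sqrt(10)).cdf(31.0)

# Ten pieces of one sheet (total = 10X): standard deviation 10 * sigma.
p_ten_pieces = NormalDist(10 * mu_a, 10 * sigma).cdf(31.0)

# One sheet from each of A, B and C: variance 3 * sigma^2.
p_stack_abc = 1 - NormalDist(mu_a + mu_b + mu_c, sigma * sqrt(3)).cdf(9.4)

# Total of 10 from A minus total of 10 from B: variance 20 * sigma^2.
p_a_less_than_b = NormalDist(10 * (mu_a - mu_b), sigma * sqrt(20)).cdf(0)

print(round(p_ten_sheets, 4), round(p_ten_pieces, 4),
      round(p_stack_abc, 4), round(p_a_less_than_b, 4))
```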

OCR MEI Further Statistics Major 2024 June Q5

10 marks Standard +0.3

5 A researcher is investigating whether doing yoga has any effect on quality of sleep in older people. The researcher selects a random sample of 40 older people, who then complete a yoga course. Before they start the course and again at the end, the 40 people fill in a questionnaire which measures their perceived sleep quality. The higher the score, the better is the perceived quality of sleep. The researcher uses software to produce a 90\% confidence interval for the difference in mean sleep quality (sleep quality after the course minus sleep quality before the course). The output from the software is shown below.
Z Estimate of a Mean
Confidence level	0.9
Sample

Mean	0.586
s	2.14
N	40

Result
Z Estimate of a Mean

Mean	0.586
s	2.14
SE	0.3384
N	40
Lower limit	0.029
Upper limit	1.143
Interval	$0.586 \pm 0.557$

Explain why the confidence interval is based on the Normal distribution even though the distribution of the population of differences is not known.
Explain whether the confidence interval suggests that the mean sleep qualities before and after completing a yoga course are different.
In the output from the software, SE stands for 'standard error'.
1. Explain what standard error is.
2. Show how the standard error was calculated in this case.
A colleague of the researcher suggests that the confidence level should have been $95 \%$ rather than $90 \%$. Determine whether this would have made a difference to your answer to part (b).

OCR MEI Further Statistics Major 2024 June Q11

11 marks Challenging +1.2

11 The discrete random variable $X$ has a uniform distribution over the set of all integers between 25 and $n$ inclusive, where $n$ is a positive integer with $n > 25$.

Determine $\mathrm { P } \left( \mathrm { X } < \frac { \mathrm { n } + 25 } { 2 } \right)$ in each of the following cases.

OCR MEI Further Statistics Major 2020 November Q10

12 marks Standard +0.3

10 The discrete random variables $X$ and $Y$ have distributions as follows: $X \sim \mathrm {~B} ( 20,0.3 )$ and $Y \sim \operatorname { Po } ( 3 )$. The spreadsheet in Fig. 10 shows a simulation of the distributions of $X$ and $Y$. Each of the 20 rows below the heading row consists of a value of $X$, a value of $Y$, and the value of $X - 2 Y$. \begin{table}[h]

1	A	B	C
1	X	Y	$X - 2 Y$
2	6	6	-6
3	5	4	-3
4	8	1	6
5	6	5	-4
6	6	3	0
7	8	1	6
8	6	4	-2
9	5	4	-3
10	7	4	-1
11	8	3	2
12	6	2	2
13	5	1	3
14	6	1	4
15	5	4	-3
16	7	2	3
17	5	2	1
18	4	4	-4
19	5	0	5
20	5	1	3
21	4	2	0

\captionsetup{labelformat=empty} \caption{Fig. 10}

\end{table}

Use the spreadsheet to estimate each of the following.
The mean of 50 values of $X - 2 Y$ is denoted by the random variable $W$.
Calculate an estimate of $\mathrm { P } ( W > 1 )$.

OCR MEI Further Statistics Major 2021 November Q1

6 marks Standard +0.3

1 When babies are born, their head circumferences are measured. A random sample of 50 newborn female babies is selected. The sample mean head circumference is 34.711 cm. The sample standard deviation head circumference is 1.530 cm.

Determine a 95\% confidence interval for the population mean head circumference of newborn female babies.
Explain why you can calculate this interval even though the distribution of the population of head circumferences of newborn female babies is unknown.

OCR MEI Further Statistics Major 2021 November Q11

11 marks Challenging +1.8

11 The continuous random variable $X$ has probability density function given by $f ( x ) = \begin{cases} a x ^ { 2 } & 0 \leqslant x < 2 , \\ b ( 3 - x ) ^ { 2 } & 2 \leqslant x \leqslant 3 , \\ 0 & \text { otherwise } \end{cases}$ where $a$ and $b$ are positive constants.

Given that $\mathrm { E } ( X ) = 2$, determine the values of $a$ and $b$.
Determine the median value of $X$.
A random sample of 50 observations of $X$ is selected. Given that $\operatorname { Var } ( X ) = 0.2$, determine an estimate of the probability that the mean value of the 50 observations is less than 1.9.

WJEC Further Unit 5 2023 June Q5

13 marks Standard +0.3

5. The masses, $X$, in kg, of men who work for a large company are normally distributed with mean 75 and standard deviation 10.

Find the probability that the mean mass of a random sample of 5 men is less than 70 kg.
The mean mass, in kg, of a random sample of $n$ men drawn from this distribution is $\bar { X }$. Given that $\mathrm { P } ( \bar { X } > 80 )$ is approximately $0 \cdot 007$, find $n$.
The masses, in kg, of women who work for the company are normally distributed with mean 68 and standard deviation 6. A lift in the company building will not move if the total mass in the lift is more than 500 kg.
A random sample of 3 men and 4 women get in the lift. Find the probability that the lift will not move.
State a modelling assumption you have made in calculating your answer for part (c).
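A sketch of the three calculations, assuming that all masses are independent (which is one possible modelling assumption for the lift). The names are illustrative. The value of $n$ is rounded to the nearest integer because the stated probability is only approximate.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# Mean of 5 men: N(75, 10^2 / 5).
p_mean_below_70 = NormalDist(75, 10 / sqrt(5)).cdf(70)

# P(Xbar > 80) = 0.007 gives (80 - 75) / (10 / sqrt(n)) = z, so n = (10 z / 5)^2.
z = std_normal.inv_cdf(1 - 0.007)
n = round((z * 10 / 5) ** 2)

# Total mass of 3 men and 4 women, assuming independence:
# mean 3*75 + 4*68, variance 3*10^2 + 4*6^2.
total = NormalDist(3 * 75 + 4 * 68, sqrt(3 * 10**2 + 4 * 6**2))
p_lift_stuck = 1 - total.cdf(500)

print(round(p_mean_below_70, 4), n, round(p_lift_stuck, 4))
```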

Edexcel FS1 2019 June Q3

6 marks Standard +0.8

A biased spinner can land on the numbers $1,2,3,4$ or 5 with the following probabilities.

Number on spinner	1	2	3	4	5
Probability	0.3	0.1	0.2	0.1	0.3

The spinner will be spun 80 times and the mean of the numbers it lands on will be calculated. Find an estimate of the probability that this mean will be greater than 3.25
(6)
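The estimate relies on the Central Limit Theorem applied to the mean of 80 spins. The sketch below computes the mean and variance of one spin from the table and then uses a normal model for the sample mean, without a continuity correction (an assumption of this sketch). Names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Probability distribution of a single spin, from the table.
distribution = {1: 0.3, 2: 0.1, 3: 0.2, 4: 0.1, 5: 0.3}

mean = sum(x * p for x, p in distribution.items())
variance = sum(x**2 * p for x, p in distribution.items()) - mean**2

# CLT: the mean of 80 spins is approximately N(mean, variance / 80).
n_spins = 80
sample_mean_model = NormalDist(mean, sqrt(variance / n_spins))
p_estimate = 1 - sample_mean_model.cdf(3.25)

print(f"E(X) = {mean:.2f}, Var(X) = {variance:.2f}, P(mean > 3.25) = {p_estimate:.4f}")
```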

Edexcel FS1 2020 June Q7

15 marks Challenging +1.2

A six-sided die has sides labelled $1,2,3,4,5$ and 6

The random variable $S$ represents the score when the die is rolled.
Alicia rolls the die 45 times and the mean score, $\bar { S }$, is calculated.
Assuming the die is fair and using a suitable approximation,

find, to 3 significant figures, the value of $k$ such that $\mathrm { P } ( \bar { S } < k ) = 0.05$
Explain the relevance of the Central Limit Theorem in part (a).
Alicia considers the following hypotheses:
$\mathrm { H } _ { 0 }$ : The die is fair
$\mathrm { H } _ { 1 }$ : The die is not fair
If $\bar { S } < 3.1$ or $\bar { S } > 3.9$, then $\mathrm { H } _ { 0 }$ will be rejected.
Given that the true distribution of $S$ has mean 4 and variance 3
find the power of this test.
Describe what would happen to the power of this test if Alicia were to increase the number of rolls of the die.
Give a reason for your answer.
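A sketch of the two numerical parts, using the Central Limit Theorem to model $\bar{S}$ as normal for 45 rolls. The variable names are illustrative and no continuity correction is applied (an assumption of this sketch).

```python
from math import sqrt
from statistics import NormalDist

n_rolls = 45

# Fair die: E(S) = 3.5 and Var(S) = 35/12.
fair_model = NormalDist(3.5, sqrt((35 / 12) / n_rolls))

# k such that P(S_bar < k) = 0.05 under the fair-die model.
k = fair_model.inv_cdf(0.05)

# Power: probability of rejecting H0 when S has mean 4 and variance 3.
true_model = NormalDist(4, sqrt(3 / n_rolls))
power = true_model.cdf(3.1) + (1 - true_model.cdf(3.9))

print(f"k = {k:.3g}, power = {power:.4f}")
```

Re-running the last two lines with a larger `n_rolls` shows how the power responds to the number of rolls.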

Edexcel FS1 2021 June Q3

4 marks Standard +0.8

A courier delivers parcels. The random variable $X$ represents the number of parcels delivered successfully each day by the courier where $X \sim \mathrm {~B} ( 400,0.64 )$

A random sample $X _ { 1 } , X _ { 2 } , \ldots X _ { 100 }$ is taken.
Estimate the probability that the mean number of parcels delivered each day by the courier is greater than 257

Edexcel FS2 2022 June Q5

8 marks Standard +0.8

The concentration of an air pollutant is measured in micrograms $/ \mathrm { m } ^ { 3 }$

Samples of air were taken at two different sites and the concentration of this particular air pollutant was recorded. For Site $A$ the summary statistics are shown below.

	number of samples	$S _ { A } ^ { 2 }$
Site $A$	13	6.39

For Site $B$ there were 9 samples of air taken.
A test of the hypothesis $\mathrm { H } _ { 0 } : \sigma _ { A } ^ { 2 } = \sigma _ { B } ^ { 2 }$ against the hypothesis $\mathrm { H } _ { 1 } : \sigma _ { A } ^ { 2 } \neq \sigma _ { B } ^ { 2 }$ is carried out using a $2 \%$ level of significance.

State a necessary assumption required to carry out the test.
Given that the assumption in part (a) holds,
find the set of values of $s _ { B } ^ { 2 }$ that would lead to the null hypothesis being rejected,
find a 99\% confidence interval for the variance of the concentration of the air pollutant at Site A.

Edexcel FS2 2023 June Q2

12 marks Standard +0.3

Camilo grows two types of apple, green apples and red apples.

The standard deviation of the weights of green apples is known to be 3.5 grams.
A random sample of 80 green apples has a mean weight of 128 grams.

Find a 98\% confidence interval for the mean weight of the population of green apples. Show your working clearly and give the confidence interval limits to 2 decimal places.
Camilo believes that the mean weight of the population of green apples is more than 10 grams greater than the mean weight of the population of red apples.
A random sample of $n$ red apples has a mean weight of 117 grams.
The standard deviation of the weights of the red apples is known to be 4 grams.
A test of Camilo's belief is carried out at the 5\% level of significance.
State the null and alternative hypotheses for this test.
Find the smallest value of $n$ for which the null hypothesis will be rejected.
Explain the relevance of the Central Limit Theorem in parts (a) and (c).
Given that $n = 85$, state the conclusion of the hypothesis test.
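The confidence interval and the search for the smallest $n$ can be sketched as follows. It assumes hypotheses $\mathrm{H}_0 : \mu_G - \mu_R = 10$ against $\mathrm{H}_1 : \mu_G - \mu_R > 10$ and a normal model for the difference in sample means; the function `z_stat` and the other names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# 98% interval for the green-apple mean (known sigma = 3.5, n = 80).
half_width = std_normal.inv_cdf(0.99) * 3.5 / sqrt(80)
ci = (128 - half_width, 128 + half_width)

def z_stat(n_red: int) -> float:
    """Test statistic for H0: mu_G - mu_R = 10 with n_red red apples."""
    return (128 - 117 - 10) / sqrt(3.5**2 / 80 + 4**2 / n_red)

# One-tailed test at the 5% level.
critical = std_normal.inv_cdf(0.95)

# Smallest n for which H0 is rejected (the statistic increases with n).
smallest_n = next(n for n in range(1, 1000) if z_stat(n) > critical)
reject_at_85 = z_stat(85) > critical

print(f"98% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"smallest n = {smallest_n}, reject H0 at n = 85: {reject_at_85}")
```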

OCR S2 2007 June Q4

6 marks Moderate -0.3

State two conditions needed for $X$ to be well modelled by a normal distribution.
It is given that $X \sim \mathrm {~N} \left( 50.0,8 ^ { 2 } \right)$. The mean of 20 random observations of $X$ is denoted by $\bar { X }$. Find $\mathrm { P } ( \bar { X } > 47.0 )$.

5 The number of system failures per month in a large network is a random variable with the distribution $\operatorname { Po } ( \lambda )$. A significance test of the null hypothesis $\mathrm { H } _ { 0 } : \lambda = 2.5$ is carried out by counting $R$, the number of system failures in a period of 6 months. The result of the test is that $\mathrm { H } _ { 0 }$ is rejected if $R > 23$ but is not rejected if $R \leqslant 23$.

State the alternative hypothesis.
Find the significance level of the test.
Given that $\mathrm { P } ( R > 23 ) < 0.1$, use tables to find the largest possible actual value of $\lambda$. You should show the values of any relevant probabilities.

6 In a rearrangement code, the letters of a message are rearranged so that the frequency with which any particular letter appears is the same as in the original message. In ordinary German the letter $e$ appears $19 \%$ of the time. A certain encoded message of 20 letters contains one letter $e$.

Using an exact binomial distribution, test at the $10 \%$ significance level whether there is evidence that the proportion of the letter $e$ in the language from which this message is a sample is less than in German, i.e., less than $19 \%$.
Give a reason why a binomial distribution might not be an appropriate model in this context.

7 Two continuous random variables $S$ and $T$ have probability density functions as follows.
$$\begin{array} { l l } S : & f ( x ) = \begin{cases} \frac { 1 } { 2 } & - 1 \leqslant x \leqslant 1 \\ 0 & \text { otherwise } \end{cases} \\ T : & g ( x ) = \begin{cases} \frac { 3 } { 2 } x ^ { 2 } & - 1 \leqslant x \leqslant 1 \\ 0 & \text { otherwise } \end{cases} \end{array}$$

Sketch on the same axes the graphs of $y = \mathrm { f } ( x )$ and $y = \mathrm { g } ( x )$. [You should not use graph paper or attempt to plot points exactly.]
Explain in everyday terms the difference between the two random variables.
Find the value of $t$ such that $\mathrm { P } ( T > t ) = 0.2$.

8 A random variable $Y$ is normally distributed with mean $\mu$ and variance 12.25. Two statisticians carry out significance tests of the hypotheses $\mathrm { H } _ { 0 } : \mu = 63.0 , \mathrm { H } _ { 1 } : \mu > 63.0$.

Statistician $A$ uses the mean $\bar { Y }$ of a sample of size 23, and the critical region for his test is $\bar { Y } > 64.20$. Find the significance level for $A$'s test.
Statistician $B$ uses the mean of a sample of size 50 and a significance level of $5 \%$.
1. Find the critical region for $B$'s test.
2. Given that $\mu = 65.0$, find the probability that $B$'s test results in a Type II error.
3. Given that, when $\mu = 65.0$, the probability that $A$'s test results in a Type II error is 0.1365, state with a reason which test is better.

9 (a) The random variable $G$ has the distribution $\mathrm { B } ( n , 0.75 )$. Find the set of values of $n$ for which the distribution of $G$ can be well approximated by a normal distribution.
  (b) The random variable $H$ has the distribution $\mathrm { B } ( n , p )$. It is given that, using a normal approximation, $\mathrm { P } ( H \geqslant 71 ) = 0.0401$ and $\mathrm { P } ( H \leqslant 46 ) = 0.0122$.
  1. Find the mean and standard deviation of the approximating normal distribution.
  2. Hence find the values of $n$ and $p$.

AQA S1 2005 January Q2

9 marks Moderate -0.3

2 The volume, in millilitres, of lemonade in mini-cans may be assumed to be normally distributed with a standard deviation of 3.5. The volumes, in millilitres, of lemonade in a random sample of 12 mini-cans were as follows.

155	148	156	149	147	156
157	156	150	154	148	154

Construct a $98 \%$ confidence interval for the mean volume of lemonade in a mini-can, giving the limits to one decimal place.
On each mini-can is printed "150 ml". Comment on this, using the given sample and your confidence interval in part (a).
State why, in part (a), use of the Central Limit Theorem was not necessary.

AQA S1 2005 January Q4

15 marks Moderate -0.3

4 Chopped lettuce is sold in bags nominally containing 100 grams.
The weight, $X$ grams, of chopped lettuce, delivered by the machine filling the bags, may be assumed to be normally distributed with mean $\mu$ and standard deviation 4.

Assuming that $\mu = 106$, determine the probability that a randomly selected bag of chopped lettuce:
1. weighs less than 110 grams;
2. is underweight.
Determine the minimum value of $\mu$ so that at most 2 per cent of bags of chopped lettuce are underweight. Give your answer to one decimal place.
Boxes each contain 10 bags of chopped lettuce. The mean weight of a bag of chopped lettuce in a box is denoted by $\bar { X }$. Given that $\mu = 108.5$ :
1. write down values for the mean and variance of $\bar { X }$;
2. determine the probability that $\bar { X }$ exceeds 110 .


5.05a Sample mean distribution: central limit theorem
