| Exam Board | Edexcel |
|---|---|
| Module | S3 (Statistics 3) |
| Year | 2022 |
| Session | January |
| Marks | 14 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Normal |
| Difficulty | Standard +0.3. This is a standard chi-squared goodness-of-fit test against a fully specified normal distribution. Part (a) is a routine normal probability calculation scaled by the total frequency of 100. Part (b) is a routine hypothesis test in which most of the test statistic is given. Part (c) tests understanding of how the degrees of freedom change when parameters are estimated. All steps are textbook procedures with no novel problem-solving required, making it slightly easier than average. |
| Spec | 2.04e Normal distribution: as model N(mu, sigma^2); 2.04f Find normal probabilities: Z transformation; 5.06c Fit other distributions: discrete and continuous |
# Question 6:
## Part (a):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $P(S < 303.5) = P\left(Z < \frac{303.5-310}{4}\right)$ or $P(S > 315.5) = P\left(Z > \frac{315.5-310}{4}\right)$ | M1 | For standardising with 303.5 or 315.5, 310 and 4 |
| $= 0.05208$ or $0.084565\ldots$ | A1 | awrt 0.052 or awrt 0.084/0.085 |
| So $a = 5.2$ or $b = 8.5$ | A1 | Either correct value: awrt 5.2 or awrt 8.4/8.5 |
| e.g. $b = 100 - 10.6 - 16.3 - 19.6 - 18.4 - 13.6 - 7.8 - \text{'5.2'}$ | M1 | A complete method to find the second missing value |
| Both $a = 5.2$ and $b = 8.5$ | A1 | Both correct values: awrt 5.2/5.3 and awrt 8.4/8.5 |
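
The standardisation in the mark scheme for part (a) can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `normal_cdf` and the variable names are illustrative choices, not part of the mark scheme.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), written in terms of the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


n = 100        # total number of baskets
mu = 310.0     # proposed mean (grams)
sigma = 4.0    # proposed standard deviation (grams)

# Expected frequencies for the two open-ended cells.
a = n * normal_cdf(303.5, mu, sigma)          # s <= 303.5
b = n * (1.0 - normal_cdf(315.5, mu, sigma))  # s > 315.5

# Alternative route for b: subtract the other expected frequencies from 100.
given_middle = [7.8, 13.6, 18.4, 19.6, 16.3, 10.6]
b_by_subtraction = n - sum(given_middle) - round(a, 1)

print(round(a, 1), round(b, 1), round(b_by_subtraction, 1))
```

Both routes to $b$ agree to one decimal place, which is why the mark scheme accepts either a second standardisation or subtraction from 100.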
## Part (b):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $H_0$: The normal distribution $N(310, 16)$ is a suitable model/data are consistent with the model. $H_1$: The normal distribution $N(310, 16)$ is not a suitable model/data are not consistent with the model. | B1 | Both hypotheses correct. If mentioning normal, must mention $N(310,16)$ at least once |
| $\left[X^2 =\right]\sum\frac{(O-E)^2}{E} = \frac{(5-\text{'5.2'})^2}{\text{'5.2'}} + \frac{(4-\text{'8.5'})^2}{\text{'8.5'}} + 9.71$ | M1 M1 | M1 for either term shown; M1 for complete method to find $\sum\frac{(O-E)^2}{E}$, e.g. $9.71 +$ 2 additional terms (independent of 1st M1) |
| $= 12.10\ldots$ | A1 | awrt 12.0 to 12.1 |
| $\nu = 7$ | B1 | $\nu = 7$ implied by correct critical value of 14.067 |
| $\chi^2_7(0.05) = 14.067$ | B1ft | 14.067; follow through on their $\nu$, so may see 5.991, 7.815, 9.488, 11.070 or 12.592 (for $\nu = 2$ to $6$) |
| Not in CR/not significant/do not reject $H_0$. There is not sufficient evidence to suggest $N(310,16)$ is not a suitable model/model is suitable/data are consistent with the model | A1 | Dependent on 2nd M1. Correct contextualised conclusion stating model is suitable, consistent with their $X^2$ and $\chi^2$ critical value. If no hypotheses or hypotheses wrong way round do not award. |
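
The test statistic in part (b) can be reproduced from the observed and expected frequencies. This is a minimal sketch, assuming $a = 5.2$ and $b = 8.5$ from part (a); the critical value 14.067 is copied from the mark scheme rather than computed, and the function and variable names are illustrative.

```python
# Observed frequencies from the question, in cell order.
observed = [5, 13, 10, 18, 25, 20, 5, 4]
# Expected frequencies, with a = 5.2 and b = 8.5 filled in from part (a).
expected = [5.2, 7.8, 13.6, 18.4, 19.6, 16.3, 10.6, 8.5]


def chi_squared_terms(obs, exp):
    """Return the individual (O - E)^2 / E contributions, one per cell."""
    return [(o - e) ** 2 / e for o, e in zip(obs, exp)]


terms = chi_squared_terms(observed, expected)

# The six inner cells; the question gives this partial sum as 9.71.
middle_cells = sum(terms[1:-1])
# Full test statistic, including the two open-ended cells.
test_statistic = sum(terms)

# Critical value for 7 degrees of freedom at the 5% level (from the mark scheme).
critical_value = 14.067
reject_h0 = test_statistic > critical_value

print(round(middle_cells, 2), round(test_statistic, 2), reject_h0)
```

Because the statistic falls below the critical value, the sketch reaches the same decision as the mark scheme: do not reject $H_0$.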
## Part (c):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $\nu = 8 - 3 = 5$ / two parameters are estimated, so two additional degrees of freedom are subtracted | M1 | A statement that implies 2 additional degrees of freedom are subtracted |
| Therefore the critical value is reduced/now 11.070 | A1 | A correct conclusion from correct reasoning |
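
The degrees-of-freedom reasoning in part (c) can be written out as a short calculation. This is a sketch under the assumption that all 8 cells are kept (the question states that no expected frequency falls below 5, so no pooling is needed); the function name and the small lookup table are illustrative, with the critical values taken from the mark scheme.

```python
def degrees_of_freedom(cells, estimated_parameters=0):
    """Cells minus one constraint for the total, minus one per estimated parameter."""
    return cells - 1 - estimated_parameters


# 5% critical values quoted in the mark scheme, keyed by degrees of freedom.
critical_5_percent = {5: 11.070, 7: 14.067}

# Original model: mean and standard deviation are specified, nothing is estimated.
df_original = degrees_of_freedom(8)
# Alternative model: mean and standard deviation are both estimated from the data.
df_alternative = degrees_of_freedom(8, estimated_parameters=2)

print(df_original, critical_5_percent[df_original])
print(df_alternative, critical_5_percent[df_alternative])
```

Estimating the two parameters removes two degrees of freedom, so the critical value drops from 14.067 to 11.070.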
\begin{enumerate}
\item A farmer sells strawberries in baskets. The contents of each of 100 randomly selected baskets were weighed and the results, given to the nearest gram, are shown below.
\end{enumerate}
\begin{center}
\begin{tabular}{|l|l|}
\hline
Weight of strawberries (grams) & Number of baskets \\
\hline
302-303 & 5 \\
\hline
304-305 & 13 \\
\hline
306-307 & 10 \\
\hline
308-309 & 18 \\
\hline
310-311 & 25 \\
\hline
312-313 & 20 \\
\hline
314-315 & 5 \\
\hline
316-317 & 4 \\
\hline
\end{tabular}
\end{center}
The farmer proposes that the weight of strawberries per basket, in grams, should be modelled by a normal distribution with a mean of 310 g and a standard deviation of 4 g.
Using his model, the farmer obtains the following expected frequencies.
\begin{center}
\begin{tabular}{|l|l|}
\hline
Weight of strawberries (s, grams) & Expected frequency \\
\hline
$s \leqslant 303.5$ & $a$ \\
\hline
$303.5 < s \leqslant 305.5$ & 7.8 \\
\hline
$305.5 < s \leqslant 307.5$ & 13.6 \\
\hline
$307.5 < s \leqslant 309.5$ & 18.4 \\
\hline
$309.5 < s \leqslant 311.5$ & 19.6 \\
\hline
$311.5 < s \leqslant 313.5$ & 16.3 \\
\hline
$313.5 < s \leqslant 315.5$ & 10.6 \\
\hline
$s > 315.5$ & $b$ \\
\hline
\end{tabular}
\end{center}
(a) Find the value of $a$ and the value of $b$. Give your answers correct to one decimal place.
Before $s \leqslant 303.5$ and $s > 315.5$ are included, for the remaining cells,
$$\sum \frac{(O - E)^{2}}{E} = 9.71$$
(b) Using a 5\% significance level, test whether the data are consistent with the model. You should state your hypotheses, the test statistic and the critical value used.
An alternative model uses estimates for the population mean and standard deviation from the data given.
Using these estimated values, no expected frequency is below 5.\\
Another test is to be carried out, using a 5\% significance level, to assess whether the data are consistent with this alternative model.\\
(c) State the effect, if any, on the critical value for this test. Give a reason for your answer.
\hfill \mbox{\textit{Edexcel S3 2022 Q6 [14]}}