| Exam Board | Edexcel |
|---|---|
| Module | FS1 AS (Further Statistics 1 AS) |
| Year | 2022 |
| Session | June |
| Marks | 9 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Binomial |
| Difficulty | Standard +0.8. A substantial multi-part chi-squared question requiring understanding of the test conditions, the degrees of freedom when the parameter is given rather than estimated, and execution of the test. Part (d) is the most challenging: students must use the data to estimate a better parameter value for the binomial model, a non-routine extension beyond standard textbook exercises that demands statistical insight. |
| Spec | 5.06b Fit prescribed distribution: chi-squared test; 5.06c Fit other distributions: discrete and continuous |
# Question 3:
## Part (a)
| Answer/Working | Mark | Guidance |
|---|---|---|
| Not all the expected frequencies are likely to be over 5, or the sample size is too small. | B1 | For recognising the limitations of using a chi-squared model on small sample sizes e.g. 20 is not large, not enough data, sample needs to be larger, you may need to combine cells. |
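
As a rough illustration of the point in part (a), the sketch below computes binomial expected frequencies for only 20 games. It assumes a fair coin ($p = 0.5$) as the model being tested, which the question does not state explicitly at this stage; the helper name `binomial_expected` is hypothetical.

```python
from math import comb

def binomial_expected(games, n, p):
    """Expected frequency of each head count 0..n over `games` plays of B(n, p)."""
    return [games * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Assumption: fair coin (p = 0.5); part (a) of the question does not fix p.
expected_20 = binomial_expected(20, 5, 0.5)

# Head counts whose expected frequency falls below the usual threshold of 5.
small_cells = [k for k, e in enumerate(expected_20) if e < 5]

print(expected_20)
print(small_cells)
```

Under this assumption, four of the six cells have expected frequencies below 5, so cells would need to be combined before the chi-squared approximation could be relied on.
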
## Part (b)
| Answer/Working | Mark | Guidance |
|---|---|---|
| 5 degrees of freedom since the parameter is not estimated from the data [and the totals agree] | B1 | For 5 [dof] and a correct reason indicating parameter (probability) is not estimated. Condone missing comment about totals |
## Part (c)
| Answer/Working | Mark | Guidance |
|---|---|---|
| $H_0$: B(5, 0.6) is a suitable model; $H_1$: B(5, 0.6) is not a suitable model | B1 | Both hypotheses correct. Must have B(5,0.6) or binomial with $n=5$ and $p=0.6$ (in at least 1) and be attached to $H_0$ and $H_1$ the right way round. |
| $\sum\dfrac{(O-E)^2}{E} = \dfrac{(2-5.12)^2}{5.12} + ... + \dfrac{(51-38.88)^2}{38.88}$ | M1 | Attempting to find the test statistic $\sum\dfrac{(O-E)^2}{E}$ (at least two correct expressions, fractions or decimals) or $\chi^2 = \sum\dfrac{O^2}{E} = \dfrac{(2)^2}{"5.12"} + ... + \dfrac{51^2}{38.88} - 500$ |
| $= 15.8063...$ awrt 16 | A1 | awrt 16 |
| $[15.8 >]\ \chi^2_{5,(0.05)} = 11.070$ | B1ft | Allow 11.07 or awrt 11.070. For correct CV, ft their answer to (b). **NB** dof 3 is 7.815, dof 4 is 9.488 |
| B(5, 0.6) is not a suitable model [for the number of heads spun] | A1ft | Ft their test statistic and their CV or $p$ value. Correct conclusion independent of hypotheses. Do not accept contradicting statements. |
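
The arithmetic in part (c) can be checked with a short sketch like the one below. It uses the observed and expected frequencies given in the question and takes the critical value 11.070 from the mark scheme instead of computing it; the variable names are illustrative only.

```python
observed = [2, 27, 93, 181, 146, 51]
expected = [5.12, 38.40, 115.20, 172.80, 129.60, 38.88]

# Test statistic: sum of (O - E)^2 / E over the six cells.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Alternative form from the mark scheme: sum of O^2 / E, minus the total N.
# This agrees with the first form because the expected frequencies also sum to N.
shortcut = sum(o**2 / e for o, e in zip(observed, expected)) - sum(observed)

# Critical value for 5 degrees of freedom at the 5% level (from the mark scheme).
CRITICAL_VALUE = 11.070
reject_h0 = statistic > CRITICAL_VALUE

print(round(statistic, 4), round(shortcut, 4), reject_h0)
```

Both forms give approximately 15.8, which exceeds 11.070, matching the mark scheme's conclusion that B(5, 0.6) is not a suitable model.
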
## Part (d)
| Answer/Working | Mark | Guidance |
|---|---|---|
| $\dfrac{[0\times2]+(1\times27)+(2\times93)+(3\times181)+(4\times146)+(5\times51)}{500} [= 3.19]$ | M1 | For a correct method using the data to improve the model. Implied by 3.19 |
| $\text{B}([5],\ p = \dfrac{3.19}{5} = 0.638)$ | A1 | Correct model. Condone use of any value of $n$. Accept Binomial with $p = 0.638$ |
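
The estimate in part (d) can be reproduced with a minimal sketch such as the following. The mean and $p = 0.638$ follow the mark scheme; the refitted expected frequencies at the end are an extra illustration that the mark scheme does not ask for, and all variable names are hypothetical.

```python
from math import comb

heads = [0, 1, 2, 3, 4, 5]
observed = [2, 27, 93, 181, 146, 51]
n_spins = 5

total_games = sum(observed)

# Sample mean number of heads per game.
mean_heads = sum(h * f for h, f in zip(heads, observed)) / total_games

# For B(n, p) the mean is n * p, so estimate p as mean / n.
p_hat = mean_heads / n_spins

# Illustration only: expected frequencies under the refitted model B(5, p_hat).
refitted = [
    total_games * comb(n_spins, k) * p_hat**k * (1 - p_hat) ** (n_spins - k)
    for k in heads
]

print(mean_heads, round(p_hat, 3))
print([round(e, 2) for e in refitted])
```

If this refitted model were itself tested, $p$ would have been estimated from the data, so one further degree of freedom would be lost compared with part (b).
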
---
\begin{enumerate}
\item In a game, a coin is spun 5 times and the number of heads obtained is recorded. Tao suggests playing the game 20 times and carrying out a chi-squared test to investigate whether the coin might be biased.\\
(a) Explain why playing the game only 20 times may cause problems when carrying out the test.
\end{enumerate}
Chris decides to play the game 500 times. The results are as follows:
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Number of heads & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
Observed frequency & 2 & 27 & 93 & 181 & 146 & 51 \\
\hline
\end{tabular}
\end{center}
Chris decides to test whether or not the data can be modelled by a binomial distribution, with the probability of a head on each spin being 0.6.
She calculates the expected frequencies, to 2 decimal places, as follows:
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Number of heads & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
Expected frequency & 5.12 & 38.40 & 115.20 & 172.80 & 129.60 & 38.88 \\
\hline
\end{tabular}
\end{center}
(b) State the number of degrees of freedom in Chris' test, giving a reason for your answer.\\
(c) Carry out the test at the $5 \%$ level of significance. You should state your hypotheses, test statistic, critical value and conclusion clearly.\\
(d) Showing your working, find an alternative model which would better fit Chris' data.
\hfill \mbox{\textit{Edexcel FS1 AS 2022 Q3 [9]}}