| Exam Board | Edexcel |
|---|---|
| Module | FS1 (Further Statistics 1) |
| Year | 2023 |
| Session | June |
| Marks | 15 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Binomial |
| Difficulty | Standard +0.8. A comprehensive chi-squared goodness of fit question that combines several techniques: using binomial symmetry to find expected frequencies, conducting a full hypothesis test with pooling, estimating a parameter from data, and adjusting the degrees of freedom when a parameter is estimated. Each step is standard for Further Statistics 1, but the multi-part structure, the parameter estimation, and the reasoning about degrees of freedom make it moderately challenging while staying within the expected scope for UFM. |
| Spec | 5.02b Expectation and variance: discrete random variables; 5.02c Linear coding: effects on mean and variance; 5.06b Fit prescribed distribution: chi-squared test; 5.06c Fit other distributions: discrete and continuous |
| Answer | Marks | Guidance |
|---|---|---|
**(a)** $[X \sim \text{B}(5, 0.5)]$ $\text{P}(X = 0) = \text{P}(X = 5) = 0.03125$ or $\text{P}(X = 2)$ or $\text{P}(X = 3) = 0.3125$ or $0.5^5$ or ${}^5C_2 \, 0.5^5$ | M1 | 1.1b |
[multiply by 170 to get] $r = \mathbf{5.31(25)}$ ; $s = \mathbf{53.1(25)}$ | A1; A1 | 1.1b(x2) | 1st A1 for $r = \text{awrt } 5.31$ (condone $\frac{85}{16}$)
**(b)** $H_0 : \text{B}(5, 0.5) \text{ is a suitable model}$ $H_1 : \text{B}(5, 0.5) \text{ is NOT a } \ldots$ | B1 | 2.5 |
| $(O_i-E_i)^2$ | $(3-5.31)^2$ | $(10-26.56)^2$ | $(45-53.13)^2$ | $(62-53.1)^2$ | $(38-26.56)^2$ | $(12-5.31)^2$ |
| --- | --- | --- | --- | --- | --- | --- |
| / $E_i$ | / 5.31 | / 26.56 | / 53.1 | / 53.1 | / 26.56 | / 5.31 |
| | $= 1.00...$ | $= 10.3...$ | $= 1.23...$ | $= 1.48...$ | $= 4.92...$ | $= 8.41...$ |
| $O_i^2 / E_i$ | $= 1.69...$ | $= 3.76...$ | $= 38.1...$ | $= 72.3...$ | $= 54.3...$ | $= 27.1...$ |
| M1 | 1.1b |
$\sum \frac{(O_i - E_i)^2}{E_i}$ or $\sum \frac{O_i^2}{E_i} - 170 = 27.4\ldots$ awrt $\mathbf{27.4}$ or awrt **27.5** | A1 | 1.1b |
Degrees of freedom is $6 - 1 = \mathbf{5}$, and critical value is $\mathbf{11.07(0)}$ $[\text{Significant result}]$ Marcus' model/B$(5, 0.5)$ is not a good fit. (o.e.) | B1 ft B1 ft, A1 | 1.1b(x2), 2.2b | (6)
**(c)** $\hat{p} = \left[\frac{0 \times 3 + 1 \times 10 + \ldots + 5 \times 12}{170 \times 5}\right] = 0.58588\ldots$ awrt $\mathbf{0.586}$ | B1 | 1.1b | (1)
**(d)(i)** Need to pool (first 2) cells (0 and 1 since $E(0) < 5$) and use of $\hat{p}$ | M1, A1 | 2.4, 1.1b |
Degrees of freedom: 5 groups $- 2$ constraints $= \mathbf{3}$ | B1 ft | 1.1b |
**(ii)** Critical value is $\mathbf{7.815}$ | | (3)
**(e)(i)** Nima's model is a good fit (since $1.62 < '7.815'$)/Marcus' not and this suggests coin is biased/probability of head approx. 0.6 | B1 | 2.4 |
**(ii)** Nima's test suggests binomial is a good model and therefore independence of spins is a reasonable assumption | B1 | 2.2b | (2)
**(Total: 15 marks)**
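
The arithmetic in parts (a) to (d) can be checked numerically. The sketch below is an illustration only and is not part of the mark scheme: the helper names `expected_binomial` and `chi_squared` are hypothetical, only the Python standard library is used, and the critical values 11.070 and 7.815 are copied from the mark scheme rather than computed.

```python
from math import comb

observed = [3, 10, 45, 62, 38, 12]   # frequencies for 0..5 heads
n_days, n_spins = sum(observed), 5   # 170 days, 5 spins per day


def expected_binomial(p):
    """Expected frequencies under B(5, p), scaled to the 170 days."""
    return [n_days * comb(n_spins, k) * p**k * (1 - p)**(n_spins - k)
            for k in range(n_spins + 1)]


def chi_squared(obs, exp):
    """Pearson statistic: sum of (O - E)^2 / E over the cells."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


# Parts (a) and (b): Marcus' model B(5, 0.5); every expected count exceeds 5,
# so no pooling is needed and there are 6 - 1 = 5 degrees of freedom.
exp_marcus = expected_binomial(0.5)
r, s = exp_marcus[0], exp_marcus[2]
stat_marcus = chi_squared(observed, exp_marcus)

# Part (c): estimate p as total heads divided by total spins.
p_hat = sum(k * f for k, f in enumerate(observed)) / (n_days * n_spins)

# Part (d): Nima's model B(5, p_hat); E(0) < 5, so pool the 0 and 1 cells.
# That leaves 5 cells and 2 constraints (total and estimated p): 3 d.o.f.
exp_nima = expected_binomial(p_hat)
obs_pooled = [observed[0] + observed[1]] + observed[2:]
exp_pooled = [exp_nima[0] + exp_nima[1]] + exp_nima[2:]
stat_nima = chi_squared(obs_pooled, exp_pooled)

print(f"r = {r:.4f}, s = {s:.3f}")
print(f"Marcus: X^2 = {stat_marcus:.2f} (critical value 11.070, 5 d.o.f.)")
print(f"p_hat = {p_hat:.4f}")
print(f"Nima:   X^2 = {stat_nima:.2f} (critical value 7.815, 3 d.o.f.)")
```

Comparing each statistic with its critical value should reproduce the conclusions above: Marcus' statistic exceeds 11.070, while Nima's is well below 7.815.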
---
\begin{enumerate}
\item In a class experiment, each day for 170 days, a child is chosen at random and spins a large cardboard coin 5 times and the number of heads is recorded.\\
The results are summarised in the following table.
\end{enumerate}
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Number of heads & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
Frequency & 3 & 10 & 45 & 62 & 38 & 12 \\
\hline
\end{tabular}
\end{center}
Marcus believes that a $\mathrm { B } ( 5,0.5 )$ distribution can be used to model these data and he calculates expected frequencies, to 2 decimal places, as follows
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Number of heads & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
Expected frequency & $r$ & 26.56 & $s$ & $s$ & 26.56 & $r$ \\
\hline
\end{tabular}
\end{center}
(a) Find the value of $r$ and the value of $s$\\
(b) Carry out a suitable test, at the $5 \%$ level of significance, to determine whether or not the $\mathrm { B } ( 5,0.5 )$ distribution is a good model for these data.\\
You should state clearly your hypotheses, the test statistic and the critical value used.
Nima believes that a better model for these data would be $\mathrm { B } ( 5 , p )$\\
(c) Find a suitable estimate for $p$
To test her model, Nima uses this value of $p$, to calculate expected frequencies as follows
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Number of heads & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
Expected frequency & 2.07 & 14.65 & 41.44 & 58.63 & 41.47 & 11.74 \\
\hline
\end{tabular}
\end{center}
The test statistic for Nima's test is 1.62 (to 3 significant figures)\\
(d) State,\\
(i) giving your reasons, the degrees of freedom\\
(ii) the critical value\\
that Nima should use for a test at the 5\% significance level.\\
(e) With reference to Marcus' and Nima's test results, comment on\\
(i) the probability of the coin landing on heads,\\
(ii) the independence of the spins of the coin.
Give reasons for your answers.
\hfill \mbox{\textit{Edexcel FS1 2023 Q3 [15]}}