| Exam Board | Edexcel |
|---|---|
| Module | S3 (Statistics 3) |
| Year | 2021 |
| Session | January |
| Marks | 18 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Normal |
| Difficulty | Standard +0.3. This is a standard chi-squared goodness of fit test with clearly specified parameters. Part (a) requires routine calculation of expected frequencies from the given normal distribution parameters, computation of the chi-squared statistic, and comparison with the critical value. Parts (b) to (d) involve standard formulas for unbiased estimates and straightforward normal probability calculations. All techniques are textbook exercises with no novel problem-solving required, making it slightly easier than average for S3 level. |
| Spec | 5.05b Unbiased estimates: of population mean and variance; 5.06b Fit prescribed distribution: chi-squared test |
| Answer/Working | Marks | Guidance |
|---|---|---|
| $H_0: \text{N}(6, 0.75^2) \text{ is a suitable model for the length of fallen pine cones}$ | B1 | 1st B1 for both hypotheses. Must include the model and mention "length(s)" and "cones" |
| $H_1: \text{N}(6, 0.75^2) \text{ is NOT a suitable model for the lengths of the pine cones}$ | | |
| e.g. $E_i: 5 \le x < 5.5 = 80 \times P(5 \le X < 5.5) = 80 \times P(-\frac{4}{3} \le Z < -\frac{2}{3}) = [12.77-12.90]$ OR $E_i: 6 \le x < 6.5 = 80 \times P(0 \le Z < \frac{2}{3}) = [19.80-19.89]$ OR $E_i: 5.5 \le x < 6 = 19.80-19.89$ or $x \ge 6.5 = 40 - "19.80" = 20.11-20.20$ | M1 A1 | 1st A1 for a middle value e.g. awrt 12.77–12.90 inclusive (12.77 is from tables, 12.90 calc); 2nd M1 for use of symmetry to get $E_i$ for $5.5 \le x < 6$ (same as $6 \le x < 6.5$) or $x \ge 6.5$ ($40 - ...$) |
| Table showing $E_i$ values in ranges | M1 | |
| $\sum \frac{(O_i - E_i)^2}{E_i}$ or $\sum \frac{O_i^2}{E_i} - 80 = 8.308...$; answer in **[8.15 – 8.4]** | dM1; A1 | dM1 (dep on 1st M1) for correct attempt to find test statistic...at least one correct term; 3rd A1 for answer in range 8.15-8.4 (inclusive) |
| $\nu = 5 - 1 = 4 \Rightarrow \chi_4^2(10\%) = 7.779$ | B1; B1ft | 2nd B1 for degrees of freedom = 4; 3rd B1ft for correct 10% critical value using their degrees of freedom |
| (significant result so) the data do not support Chrystal's belief | A1ft | 4th A1ft dep on M3 and cv = awrt 7.78 for contextual conclusion: length, cones (the parameters $\mu, \sigma$ of N are not needed) or Chrystal's belief |
| **Subtotal: (10 marks)** | | |
| $\hat{\mu} = \frac{464}{80} = \mathbf{5.8}$ (cm); $s^2 = \frac{2722.59 - 80 \times "5.8"^2}{79}$ | B1; M1 | B1 for 5.8; M1 for correct expression (ft their mean) |
| $s^2 = 0.39734...$ awrt **0.397** (cm²) | A1 | A1 for awrt 0.397 (Condone $\frac{2722.59 - 80 \times "5.8"^2}{80}$) |
| **Subtotal: (3 marks)** | | |
| $\nu = 5 - 3 = 2$; so $\chi_2^2(10\%) = 4.605$ | B1; B1ft | 1st B1 for degrees of freedom = 2; 2nd B1ft for correct cv (different from their part (a)) ft their df |
| (Not sig') so a normal distribution is a plausible model for length of pine cones | B1ft | 3rd B1ft for correct conclusion in context ft cv ("length" and "cones") Ignore any $\mu$ or $\sigma$ |
| **Subtotal: (3 marks)** | | |
| $P(X > 7 \mid \mu = 5.8 \text{ and } s = \sigma = 0.63035...) = P\left(Z > \frac{7 - "5.8"}{\sqrt{0.397}...}\right) = P(Z > 1.90...)$ | M1 | M1 for standardising with 7, their 5.8 (≠ 6) and their s.d. from (b). Ignore any ×80 |
| $= \mathbf{0.028-0.029}$ | A1 | A1 for correct proportion of 0.028 or 0.029. (ISW if correct ans followed by ×80) |
| **Subtotal: (2 marks)** | | |
**Total: [18 marks]**
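
The part (a) working in the mark scheme can be checked numerically. The following is a minimal sketch, assuming Python 3.8+ and using only the standard library; the variable names are illustrative, and the critical value is copied from the mark scheme rather than computed. It uses exact normal probabilities (calculator values), so the expected frequencies may differ slightly from those obtained with printed tables.

```python
from statistics import NormalDist

# Model under H0, taken from the question: X ~ N(6, 0.75^2), sample of 80 cones.
model = NormalDist(mu=6, sigma=0.75)
n = 80

# Class boundaries and observed frequencies from the question's table.
upper_bounds = [5.0, 5.5, 6.0, 6.5]   # the last class, x >= 6.5, is open-ended
observed = [6, 14, 24, 26, 10]

# Cumulative probabilities at each boundary, padded with 0 and 1 for the open tails.
cdf = [0.0] + [model.cdf(b) for b in upper_bounds] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

# Pearson's statistic: sum of (O - E)^2 / E over the five classes.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# No parameters were estimated from the data, so df = 5 - 1 = 4.
critical_value = 7.779  # 10% point of chi-squared(4), as quoted in the mark scheme

print([round(e, 2) for e in expected])
print(round(chi_sq, 3))
print("reject H0" if chi_sq > critical_value else "do not reject H0")
```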
---
5. Chrystal is studying the lengths of pine cones that have fallen from a tree. She believes that the length, $X$ cm, of the pine cones can be modelled by a normal distribution with mean 6 cm and standard deviation 0.75 cm.
She collects a random sample of 80 pine cones and their lengths are recorded in the table below.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | }
\hline
Length, $x$ cm & $x < 5$ & $5 \leqslant x < 5.5$ & $5.5 \leqslant x < 6$ & $6 \leqslant x < 6.5$ & $x \geqslant 6.5$ \\
\hline
Frequency & 6 & 14 & 24 & 26 & 10 \\
\hline
\end{tabular}
\end{center}
\begin{enumerate}[label=(\alph*)]
\item Stating your hypotheses clearly and using a $10 \%$ level of significance, test Chrystal's belief. Show your working clearly and state the expected frequencies, the test statistic and the critical value used.\\
(10)
Chrystal's friend David asked for more information about the lengths of the 80 pine cones. Chrystal told him that
$$\sum x = 464 \quad \text { and } \quad \sum x ^ { 2 } = 2722.59$$
\item Calculate unbiased estimates of the mean and variance of the lengths of the pine cones.
David used the calculations from part (b) to test whether or not the lengths of the pine cones are normally distributed using Chrystal's sample. His test statistic was 3.50 (to 3 significant figures) and he did not pool any classes.
\item Using a $10 \%$ level of significance, complete David's test stating the critical value and the degrees of freedom used.
\item Estimate, to 2 significant figures, the proportion of pine cones from the tree that are longer than 7 cm.
\end{enumerate}
\hfill \mbox{\textit{Edexcel S3 2021 Q5 [18]}}
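
Parts (b) to (d) reduce to a few lines of arithmetic, which can be sketched as below. This is an illustrative check rather than a model answer: it assumes Python 3.8+ with the standard library only, the variable names are hypothetical, and the critical value 4.605 is taken from chi-squared tables. The p-value line relies on the fact that the upper tail of a chi-squared distribution with 2 degrees of freedom is $e^{-x/2}$.

```python
from math import exp, sqrt
from statistics import NormalDist

# Summary statistics given in the question.
n = 80
sum_x = 464
sum_x2 = 2722.59

# Part (b): unbiased estimates of the population mean and variance.
mean_hat = sum_x / n
var_hat = (sum_x2 - n * mean_hat ** 2) / (n - 1)

# Part (c): 5 classes, minus 1 for the fixed total and 2 for the estimated parameters.
df = 5 - 1 - 2
test_stat = 3.50          # David's statistic, given in the question
critical_value = 4.605    # 10% point of chi-squared(2), from tables
p_value = exp(-test_stat / 2)  # closed-form upper tail, valid for df = 2 only

# Part (d): P(X > 7) under the fitted model N(mean_hat, var_hat).
fitted = NormalDist(mu=mean_hat, sigma=sqrt(var_hat))
prop_over_7 = 1 - fitted.cdf(7)

print(round(mean_hat, 2), round(var_hat, 3))
print(df, test_stat < critical_value, round(p_value, 3))
print(round(prop_over_7, 3))
```

Since the test statistic is below the critical value (equivalently, the p-value exceeds 0.10), the result is not significant, which matches the mark scheme's conclusion that a normal model is plausible.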