| Exam Board | Edexcel |
|---|---|
| Module | FS1 (Further Statistics 1) |
| Year | 2019 |
| Session | June |
| Marks | 19 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Binomial |
| Difficulty | Standard +0.3. This is a standard chi-squared goodness of fit question with straightforward application of the test procedure. Part (a) requires identifying a binomial model (routine); part (b) involves pooling classes, calculating the test statistic from the given expected frequencies and comparing it to a critical value (mechanical); part (c) requires finding two missing expected frequencies from the Poisson model Po(3.3) (routine calculation); parts (d) to (f) ask for hypotheses, a second test with a given test statistic, and a comparison of the two results in context. All steps are textbook procedures with no novel insight required, making it slightly easier than average. |
| Spec | 2.04b Binomial distribution: as model B(n,p); 5.06b Fit prescribed distribution: chi-squared test; 5.06c Fit other distributions: discrete and continuous |
# Question 4:
## Part (a)
| Answer | Mark | Guidance |
|--------|------|----------|
| $[T = \text{no. of oak trees in a square}]$ $T \sim \text{Binomial}$ | M1 | For choosing binomial |
| $T \sim B(6, p)$ | A1 | A1 for $B(6,p)$; can be in words, allow $B(6, 0.55)$ |
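
The value $\hat{p} = 0.55$ allowed in the guidance can be recovered from Table 1. The following is a minimal sketch, assuming $p$ is estimated by equating the binomial mean $6p$ to the sample mean; the variable names are illustrative and not part of the mark scheme.

```python
# Observed frequencies from Table 1 (number of oak trees -> number of squares).
observed = {0: 1, 1: 4, 2: 21, 3: 23, 4: 13, 5: 11, 6: 7}

n_squares = sum(observed.values())                      # total number of squares
total_trees = sum(k * f for k, f in observed.items())   # total number of oak trees
mean_trees = total_trees / n_squares                    # sample mean per square

# Assumption: estimate p by matching the binomial mean 6p to the sample mean.
p_hat = mean_trees / 6

print(n_squares, round(mean_trees, 2), round(p_hat, 2))
```

The same sample mean is the parameter of Simone's Poisson model in part (c).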
## Part (b)
| Answer | Mark | Guidance |
|--------|------|----------|
| Expected frequency for 6 is less than 5 so pool: new $E_i = 13.08$ | M1 | For pooling last 2 classes ($E_i = 13.08$ but accept 13.1) |
| $\frac{(O_i - E_i)^2}{E_i}$: 0.051, 2.51, 0.0654, 3.84, 1.85 | M1 | 2nd M1 for at least 3 correct values |
| or $\frac{O_i^2}{E_i}$: 4.521, 29.617, 21.805, 7.599, 24.771 | | |
| $\sum \frac{(O_i - E_i)^2}{E_i} = 8.313$ or $\sum \frac{O_i^2}{E_i} - 80 = 8.313$ | A1 | 1st A1 for awrt 8.31 |
| $p$ needed estimating ($\hat{p} = 0.55$) so $\nu = 5 - 2 = 3$; cv 7.815 | B1, B1ft | 1st B1 for 3 degrees of freedom; 2nd B1ft for cv 7.815 (e.g. $\nu = 4$ use 9.488) |
| Significant result, so Liam's model is not suitable | M1, A1 | 3rd M1 for correct conclusion; 2nd A1 for conclusion in context with all other marks scored |
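
The pooling and test-statistic steps above can be checked numerically. This is a sketch only: the helper `pool_small_tail` is a hypothetical name, the "expected frequency at least 5" rule is applied to the final class only, and the critical value is copied from the mark scheme rather than computed.

```python
# Observed frequencies grouped as in Table 2 (classes 0 and 1 combined).
observed = [5, 21, 23, 13, 11, 7]
# Liam's expected frequencies from Table 2.
expected = [5.53, 14.89, 24.26, 22.24, 10.87, 2.21]


def pool_small_tail(obs, exp, minimum=5.0):
    """Merge the last class into its neighbour while its expected frequency is below `minimum`."""
    obs, exp = list(obs), list(exp)
    while len(exp) > 1 and exp[-1] < minimum:
        last_exp = exp.pop()
        last_obs = obs.pop()
        exp[-1] += last_exp
        obs[-1] += last_obs
    return obs, exp


pooled_obs, pooled_exp = pool_small_tail(observed, expected)

# Contribution of each class to the chi-squared statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp)]
chi_squared = sum(contributions)

# Two restrictions: the totals agree and p was estimated from the data.
degrees_of_freedom = len(pooled_exp) - 2
critical_value = 7.815  # 5% value for 3 degrees of freedom, as quoted in the mark scheme

print([round(c, 4) for c in contributions])
print(round(chi_squared, 3), degrees_of_freedom, chi_squared > critical_value)
```

The final comparison is true here, which matches the "significant result" conclusion in the table.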
## Part (c)
| Answer | Mark | Guidance |
|--------|------|----------|
| $[R = \text{no. of oak trees in a square for Simone's model}]$ $R \sim Po(3.3)$ | M1 | For selecting correct model $Po(3.3)$; allow $Po(\text{awrt } 3.3)$ |
| Correct expression for $s$ or $t$ using Poisson | M1 | For use of model with expression or correct value for $s$ or $t$ |
| $s = \mathbf{17.67}$ and $t = \mathbf{9.62}$ | A1, A1 | 1st A1 for one correct; 2nd A1 for both correct (allow awrt 2dp) |
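
The "correct expression for $s$ or $t$" is $80 \times P(R = 3)$ or $80 \times P(R = 5)$ with $R \sim Po(3.3)$. A minimal sketch of that calculation follows; `poisson_pmf` is an illustrative helper, not something named in the mark scheme.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """P(R = k) for R ~ Po(mean)."""
    return exp(-mean) * mean ** k / factorial(k)


n_squares = 80
mean = 3.3  # sample mean of Table 1, used as the Poisson parameter

s = n_squares * poisson_pmf(3, mean)  # expected frequency for 3 oak trees
t = n_squares * poisson_pmf(5, mean)  # expected frequency for 5 oak trees

# Cross-check against Table 3: the "6 or more" class is whatever probability remains.
tail = n_squares * (1 - sum(poisson_pmf(k, mean) for k in range(6)))

print(f"s = {s:.2f}, t = {t:.2f}, 6 or more = {tail:.2f}")
```

Only one of $s$ and $t$ can be obtained by subtracting from 80, so at least one Poisson probability has to be evaluated directly.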
## Part (d)
| Answer | Mark | Guidance |
|--------|------|----------|
| $H_0$: Poisson is a good fit (for no. of oak trees per square) | B1 | For correct hypotheses; must mention Poisson: use of $Po(3.3)$ is B0 |
| $H_1$: Poisson is not a good fit (for no. of oak trees per square) | | |
## Part (e)
| Answer | Mark | Guidance |
|--------|------|----------|
| No pooling needed so degrees of freedom is $6 - 2 = 4$ | B1 | For correct degrees of freedom $\nu = 4$ only |
| Critical value is 9.488 (accept 9.49) | B1 | For selecting correct critical value (9.488 only) |
| Not significant so Poisson (or Simone's) model is suitable | B1 | For not significant conclusion based on 8.749 vs cv (condone use of $Po(3.3)$) |
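
As a check on the given test statistic and the conclusion, the sketch below recomputes the statistic from Table 3 (using $s = 17.67$ and $t = 9.62$) and compares it with the tabulated critical value. The closed-form tail probability is valid for 4 degrees of freedom only, and `chi2_sf_4df` is an illustrative name.

```python
from math import exp

# Observed frequencies grouped as in Table 3, and Simone's expected frequencies.
observed = [5, 21, 23, 13, 11, 7]
expected = [12.69, 16.07, 17.67, 14.58, 9.62, 9.37]

test_statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# No expected frequency is below 5, so no pooling; the mean was estimated from the data.
degrees_of_freedom = len(expected) - 2
critical_value = 9.488  # 5% value for 4 degrees of freedom, as quoted in the mark scheme


def chi2_sf_4df(x):
    """Upper-tail probability of a chi-squared variable with exactly 4 degrees of freedom."""
    return exp(-x / 2) * (1 + x / 2)


p_value = chi2_sf_4df(test_statistic)
reject = test_statistic > critical_value

print(round(test_statistic, 3), degrees_of_freedom, round(p_value, 3), reject)
```

The statistic falls below the critical value, so the Poisson model is not rejected at the 5% level.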
## Part (f)
| Answer | Mark | Guidance |
|--------|------|----------|
| Poisson model has better fit so suggests oak trees occur at random **or** binomial suggests deliberately planted or cultivated | B1 | For choosing Poisson as better or stating Poisson implies wild or binomial implies cultivated |
| Therefore the forest is likely to be wild not cultivated | B1 | Dependent on rejecting binomial and accepting Poisson; if tests give same results then 2nd B0 automatically |
---
\begin{enumerate}
\item Liam and Simone are studying the distribution of oak trees in some woodland. They divided the woodland into 80 equal squares and recorded the number of oak trees in each square. The results are summarised in Table 1 below.
\end{enumerate}
\begin{table}[h]
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | }
\hline
Number of oak trees in a square & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 or more \\
\hline
Frequency & 1 & 4 & 21 & 23 & 13 & 11 & 7 & 0 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Table 1}
\end{center}
\end{table}
Liam believes that the oak trees were deliberately planted, with 6 oak trees per square and that a constant proportion $p$ of the oak trees survived.\\
(a) Suggest the model Liam should use to describe the number of oak trees per square.
Liam decides to test whether or not his model is suitable and calculates the expected frequencies given in Table 2.
\begin{table}[h]
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Number of oak trees in a square & 0 or 1 & 2 & 3 & 4 & 5 & 6 \\
\hline
Expected frequency & 5.53 & 14.89 & 24.26 & 22.24 & 10.87 & 2.21 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Table 2}
\end{center}
\end{table}
(b) Showing your working clearly, complete the test using a $5 \%$ level of significance. You should state your critical value and conclusion clearly.
Simone believes that a Poisson distribution could be used to model the number of oak trees per square. She calculates the expected frequencies given in Table 3.
\begin{table}[h]
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Number of oak trees in a square & 0 or 1 & 2 & 3 & 4 & 5 & 6 or more \\
\hline
Expected frequency & 12.69 & 16.07 & $s$ & 14.58 & $t$ & 9.37 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Table 3}
\end{center}
\end{table}
(c) Find the value of $s$ and the value of $t$, giving your answers to 2 decimal places.\\
(d) Write down hypotheses to test the suitability of Simone's model.
The test statistic for this test is 8.749.\\
(e) Complete the test. Use a $5 \%$ level of significance and state your critical value and conclusion clearly.\\
(f) Using the results of these tests, explain whether the origin of this woodland is likely to be cultivated or wild.
\hfill \mbox{\textit{Edexcel FS1 2019 Q4 [19]}}