| Exam Board | Edexcel |
|---|---|
| Module | FS1 (Further Statistics 1) |
| Year | 2020 |
| Session | June |
| Marks | 13 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Binomial |
| Difficulty | Standard +0.3. A straightforward chi-squared goodness of fit test with a standard structure: verify a given proportion, carry out a hypothesis test with the test statistic provided (no calculation needed), and discuss qualitatively how a change in the data affects the test statistic. Part (a) is simple arithmetic, part (b) requires the standard hypothesis test procedure with a justification of the degrees of freedom (a routine FS1 skill), and part (c) requires conceptual understanding but no calculation. This is easier than average A-level maths because it is a textbook application with the hardest calculation already done. |
| Spec | 2.04b Binomial distribution: as model B(n,p); 2.04c Calculate binomial probabilities; 5.06b Fit prescribed distribution: chi-squared test |
# Question 5:
## Part (a):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $p = \frac{(0)+11+14+6+(0)+5+(0)}{6\times 40}$ | M1 | Correct expression for $p$; allow $\frac{36}{240}$ but not $\frac{6}{40}$ on its own |
| $p = \mathbf{0.15}$* | A1*cso | $p=0.15$ stated and no incorrect working seen |
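
The arithmetic in part (a) can be checked with a short sketch. The variable names below are illustrative and not part of the mark scheme; the frequencies are those given in the question.

```python
# Observed frequencies from the question; the list index is the
# number of defective pins found in a sample of 6.
observed = [19, 11, 7, 2, 0, 1, 0]
pins_per_sample = 6

n_samples = sum(observed)                                     # number of samples
total_defective = sum(x * f for x, f in enumerate(observed))  # 0*19 + 1*11 + 2*7 + ...
p_hat = total_defective / (pins_per_sample * n_samples)       # proportion defective

print(n_samples, total_defective, p_hat)
```

The numerator reproduces the $0 + 11 + 14 + 6 + 0 + 5 + 0 = 36$ in the mark scheme, and the denominator is $6 \times 40 = 240$.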
## Part (b):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $X \sim B(6, 0.15)$; expected frequencies calculated | M1 | Attempting to find expected frequencies, at least 2 correct |
| $40 \times P(X=x)$ for $x = 2, \ldots, 6$: $7.04\ldots, 1.65\ldots, 0.219\ldots, 0.015\ldots, 0.00\ldots$ | M1 | Recognising need to combine cells (sight of awrt 8.94 implies M1M1) |
| Combine last 5 cells / only 3 cells in total | A1 | Combining cells for $X \geq 2$ (to make 3 cells) |
| 2 is subtracted (2 restrictions): the proportion $p$ is estimated from the data, and the observed and expected totals must be equal | B1 | Justifying why 2 is subtracted with $p$ calculated from data |
| $3 - 2 = 1$ degree of freedom | A1 | |
| $H_0$: Binomial distribution is a suitable model; $H_1$: Binomial distribution is not a suitable model | B1 | 0.15 must **not** be included |
| Critical value $\chi^2_{(1,0.10)} = 2.705$ or $2.706$ | B1ft | Correct critical value (ft their df) |
| Test statistic not in critical region, insufficient evidence to reject $H_0$ ($2.689 < 2.705$ or $2.706$); data consistent with binomial/engineer's model | B1ft | Correct inference (ft comparison of CV with 2.689) |
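
As a rough check on part (b), the sketch below recomputes the expected frequencies, the combined cell, the test statistic and the critical value using only the Python standard library. It assumes the cells for $X \geq 2$ are combined, as in the mark scheme, and it uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so this shortcut for the critical value only applies when there is 1 degree of freedom. All names are illustrative.

```python
from math import comb
from statistics import NormalDist

observed = [19, 11, 7, 2, 0, 1, 0]   # frequencies from the question
n, p, n_samples = 6, 0.15, 40        # B(6, 0.15) fitted to 40 samples

# Expected frequencies 40 * P(X = x) for x = 0, ..., 6
expected = [n_samples * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# The cells for x = 3, ..., 6 have a combined expected frequency below 5,
# so they are merged with x = 2, leaving three cells: 0, 1 and >= 2.
obs_cells = observed[:2] + [sum(observed[2:])]
exp_cells = expected[:2] + [sum(expected[2:])]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs_cells, exp_cells))

# 3 cells, minus 1 for equal totals, minus 1 for estimating p from the data
df = len(obs_cells) - 2

# Upper 10% point of chi-squared with 1 df = (95% standard normal quantile)^2
critical = NormalDist().inv_cdf(0.95) ** 2

print([round(e, 3) for e in exp_cells], round(chi_sq, 2), df, round(critical, 3))
print("reject H0" if chi_sq > critical else "do not reject H0")
```

The combined expected frequency agrees with the awrt 8.94 quoted in the guidance, and the statistic agrees with the 2.689 given in the question to within rounding.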
## Part (c):
| Answer/Working | Mark | Guidance |
|---|---|---|
| Total proportion of defective pins remains the same | M1 | Determining number/proportion ($p=0.15$) of defective pins has not changed |
| Cells for $X \geq 2$ are still combined in the test | M1 | Understanding cells for $X \geq 2$ still combined |
| So there is no change to the value of the test statistic | A1 | Dep on both M1s |
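
The reasoning in part (c) can be illustrated by running the same calculation on both data sets. This is a minimal sketch, assuming the cells for $X \geq 2$ are combined in both cases; the helper `summarise` is a hypothetical name introduced here for illustration.

```python
from math import comb

original = [19, 11, 7, 2, 0, 1, 0]    # data first recorded
corrected = [19, 11, 6, 3, 1, 0, 0]   # corrected data from part (c)

def summarise(freqs, pins_per_sample=6):
    """Return (estimated p, combined cells, chi-squared statistic) with X >= 2 merged."""
    n_samples = sum(freqs)
    p = sum(x * f for x, f in enumerate(freqs)) / (pins_per_sample * n_samples)
    expected = [
        n_samples * comb(pins_per_sample, x) * p**x * (1 - p) ** (pins_per_sample - x)
        for x in range(pins_per_sample + 1)
    ]
    obs_cells = freqs[:2] + [sum(freqs[2:])]
    exp_cells = expected[:2] + [sum(expected[2:])]
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs_cells, exp_cells))
    return p, obs_cells, chi_sq

p_old, cells_old, stat_old = summarise(original)
p_new, cells_new, stat_new = summarise(corrected)

print(p_old == p_new, cells_old == cells_new, stat_old == stat_new)
```

Both data sets contain 36 defective pins in total and both give a combined observed frequency of 10 for $X \geq 2$, so the estimate of $p$, the expected frequencies and the test statistic are all unchanged.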
---
\begin{enumerate}
\item A factory produces pins.
\end{enumerate}
An engineer selects 40 independent random samples of 6 pins produced at the factory and records the number of defective pins in each sample.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | }
\hline
Number of defective pins & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
\hline
Observed frequency & 19 & 11 & 7 & 2 & 0 & 1 & 0 \\
\hline
\end{tabular}
\end{center}
(a) Show that the proportion of defective pins in the 40 samples is 0.15
The engineer suggests that the number of defective pins in a sample of 6 can be modelled using a binomial distribution. Using the information from the sample above, a test is to be carried out at the $10 \%$ significance level, to see whether the data are consistent with the engineer's suggested model.
The value of the test statistic for this test is 2.689\\
(b) Justifying the degrees of freedom used, carry out the test, at the $10 \%$ significance level, to see whether the data are consistent with the engineer's suggested model. State your hypotheses clearly.
The engineer later discovers that the previously recorded information was incorrect. The data should have been as follows.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | }
\hline
Number of defective pins & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
\hline
Observed frequency & 19 & 11 & 6 & 3 & 1 & 0 & 0 \\
\hline
\end{tabular}
\end{center}
(c) Describe the effect this would have on the value of the test statistic that should be used for the hypothesis test.\\
Give reasons for your answer.
\hfill \mbox{\textit{Edexcel FS1 2020 Q5 [13]}}