| Exam Board | Edexcel |
|---|---|
| Module | S3 (Statistics 3) |
| Year | 2021 |
| Session | June |
| Marks | 16 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Binomial |
| Difficulty | Standard +0.3. This is a standard chi-squared goodness of fit test for a binomial distribution with straightforward parts: stating assumptions (recall), calculating a mean (basic arithmetic), finding expected frequencies from the binomial probabilities, and performing a hypothesis test following a standard procedure. The only mild challenge is combining cells to meet the expected frequency condition, but this is routine S3 material requiring no novel insight. |
| Spec | 2.05a Hypothesis testing language: null, alternative, p-value, significance; 5.02b Expectation and variance: discrete random variables; 5.02c Linear coding: effects on mean and variance; 5.06b Fit prescribed distribution: chi-squared test |

| Number of successes | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Number of practices | 4 | 6 | 3 | 12 | 10 | 7 | 4 | 2 | 2 |

| Number of successes | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Expected frequency | 0.47 | 2.96 | 8.23 | 13.07 | \(f\) | 8.23 | 3.27 | 0.74 | \(g\) |
# Question 5:
## Part 5(a):
| Answer/Working | Mark | Guidance Notes |
|---|---|---|
| Relief of symptoms is either a "success" or a "failure". The probability of the medicine being a success is constant. Samples from different medical practices are independent. | B1 B1 | Any 2. Context required in one assumption. |
## Part 5(b):
| Answer/Working | Mark | Guidance Notes |
|---|---|---|
| $\text{Mean} = \frac{0\times4 + 1\times6 + 2\times3 + ... + 8\times2}{50} = 3.54^*$ | M1, A1cso | At least two correct terms on numerator and 50 on denominator, fully correct expression or $\frac{177}{50}$. dep on M1 scored cso. |
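The arithmetic behind the given mean can be checked with a short script. This is only a sketch: the variable names are illustrative, and the frequencies are copied from the table in the question.

```python
from fractions import Fraction

# Observed data from the question: number of practices for 0..8 successes.
successes = range(9)
practices = [4, 6, 3, 12, 10, 7, 4, 2, 2]

total_practices = sum(practices)
total_successes = sum(x * f for x, f in zip(successes, practices))

# Keep the mean as an exact fraction so it can be compared with 177/50.
mean = Fraction(total_successes, total_practices)
print(total_successes, total_practices, float(mean))
```

Using `Fraction` avoids any floating-point rounding when confirming that the mean is exactly 177/50.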
## Part 5(c):
| Answer/Working | Mark | Guidance Notes |
|---|---|---|
| $p = \frac{3.54}{8} = 0.4425$ | B1 | Can be implied by at least 1 correct value for $f$ or $g$. |
| $f = 50 \times \binom{8}{4} \times 0.4425^4 \times 0.5575^4 = 12.96$ | M1A1A1 | Use of Bin(8, $p$) for M1. Allow awrt 12.96, awrt 0.07 |
| $g = 50 \times 0.4425^8 = 0.07$ | | |
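As a check on part (c), the expected frequencies can be reproduced from the binomial probabilities. The sketch below assumes the model is Bin(8, $p$) with $p = 3.54/8$, scaled by the 50 practices; the helper name `expected_frequency` is hypothetical.

```python
from math import comb

n_trials = 8          # people in each sample
n_practices = 50      # number of samples
p_hat = 3.54 / n_trials  # estimate of p from the sample mean

def expected_frequency(k):
    """Expected number of practices with k successes under Bin(8, p_hat)."""
    prob = comb(n_trials, k) * p_hat**k * (1 - p_hat) ** (n_trials - k)
    return n_practices * prob

expected = [expected_frequency(k) for k in range(n_trials + 1)]
f = round(expected[4], 2)
g = round(expected[8], 2)
print(f, g)
```

The full list `expected` sums to 50, which is a quick sanity check that no class has been missed.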
## Part 5(d):
| Answer/Working | Mark | Guidance Notes |
|---|---|---|
| $H_0$: Binomial distribution is a suitable model. $H_1$: Binomial distribution is not a suitable model. | B1 | Both hypotheses correct. If parameters used then B0. |
| Combining 0,1,2 or 5,6,7,8 | M1 | |
| $(O-E)^2/E$ values: 0.154, 0.088, 0.676/7, 0.588/7. Total = 1.506 | M1 | M1 for attempting $\frac{(O-E)^2}{E}$ or $\frac{O^2}{E}$ with at least 2 correct expressions or 2 correct values to 2sf. |
| $\sum\frac{(O-E)^2}{E} = \sum\frac{O^2}{E} - 50 = 1.50...$ | A1 | awrt 1.5 (calculator: 1.50498…) |
| $\nu = 4-2 = 2$, $\chi^2_2(10\%) = 4.605$ | B1, B1f.t. | 2 can be implied by 4.605 seen. Only f.t. $\nu = r-2$ |
| Insufficient evidence to reject $H_0$ | M1 | For correct non-contextual statement linking their test statistic and their cv. |
| Data is consistent with a binomial distribution (oe) | A1 | A correct comment suggesting binomial model is suitable/good fit. Hypotheses wrong way around scores A0. Condone parameters here. |
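The test in part (d) can be sketched in code as follows. It assumes the tabulated expected frequencies rounded to 2 decimal places (so the statistic differs slightly from the unrounded calculator value) and pools classes 0 to 2 and 5 to 8. The critical value uses a closed form that holds only for 2 degrees of freedom; it is a shortcut for this case, not a general method.

```python
import math

observed = [4, 6, 3, 12, 10, 7, 4, 2, 2]
expected = [0.47, 2.96, 8.23, 13.07, 12.96, 8.23, 3.27, 0.74, 0.07]

# Pool 0-2 and 5-8 so that every expected frequency is at least 5.
groups = [(0, 1, 2), (3,), (4,), (5, 6, 7, 8)]
pooled_observed = [sum(observed[k] for k in group) for group in groups]
pooled_expected = [sum(expected[k] for k in group) for group in groups]
assert all(e >= 5 for e in pooled_expected)

statistic = sum(
    (o - e) ** 2 / e for o, e in zip(pooled_observed, pooled_expected)
)

# One constraint for the total of 50, one for estimating p from the data.
degrees_of_freedom = len(groups) - 2

# For 2 degrees of freedom only, P(X > x) = exp(-x / 2), so the upper-tail
# critical value at level alpha is -2 * ln(alpha).
alpha = 0.10
critical_value = -2 * math.log(alpha)

reject_h0 = statistic > critical_value
print(round(statistic, 2), round(critical_value, 3), reject_h0)
```

The statistic comes out at about 1.5, well below the critical value of about 4.605, so $H_0$ is not rejected, in agreement with the mark scheme.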
---
\begin{enumerate}
\item A researcher is looking into the effectiveness of a new medicine for the relief of symptoms. He collects random samples of 8 people who are taking the medicine from each of 50 different medical practices. The number of people who say that the medicine is a success, in each sample, is recorded. The results are summarised in the table below.
\end{enumerate}
\begin{center}
\begin{tabular}{ | l | l | l | l | l | l | l | l | l | l | }
\hline
Number of successes & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
Number of practices & 4 & 6 & 3 & 12 & 10 & 7 & 4 & 2 & 2 \\
\hline
\end{tabular}
\end{center}
The researcher decides to model this data using a binomial distribution.\\
(a) State two necessary assumptions that the researcher made in order to use this model.\\
(b) Show that the mean number of successes per sample is 3.54\\
He decides to use this mean to calculate expected frequencies. The results are shown in the table below.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | c | }
\hline
Number of successes & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
Expected frequency & 0.47 & 2.96 & 8.23 & 13.07 & $f$ & 8.23 & 3.27 & 0.74 & $g$ \\
\hline
\end{tabular}
\end{center}
(c) Calculate the value of $f$ and the value of $g$. Give your answers to 2 decimal places.\\
(d) Stating your hypotheses clearly, test at the $10 \%$ level of significance, whether or not the binomial distribution is a suitable model for the number of successes in samples of 8 people.
\hfill \mbox{\textit{Edexcel S3 2021 Q5 [16]}}