| Exam Board | Edexcel |
|---|---|
| Module | S3 (Statistics 3) |
| Year | 2014 |
| Session | June |
| Marks | 13 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Binomial |
| Difficulty | Standard +0.3. This is a standard chi-squared goodness-of-fit test with a binomial distribution, typical of the S3 specification. Parts (a)-(c) involve routine recall and calculation (stating binomial conditions, computing a mean, using binomial probabilities). Part (d) is a standard hypothesis test procedure. The question is slightly easier than average because it provides six of the eight expected frequencies, so it requires less calculation than a typical chi-squared question. |
| Spec | 2.04b Binomial distribution: as model B(n,p); 5.06b Fit prescribed distribution: chi-squared test; 5.06c Fit other distributions: discrete and continuous |
# Question 5:
## Part (a)
| Answer | Mark | Guidance |
|--------|------|----------|
| Seeds are **independent** / There are a **fixed number** of seeds in a row / There are only **two outcomes** / The **probability** of a seed germinating is **constant** | B1 B1 | Any two conditions, at least one must have context; 2 correct no context: B1B0; do not award B0B1 |
## Part (b)
| Answer | Mark | Guidance |
|--------|------|----------|
| $\frac{(0\times2)+(1\times6)+(2\times11)+(3\times19)+(4\times25)+(5\times32)+(6\times16)+(7\times9)}{120\times7} = \frac{504}{840} = 0.6$ | M1, A1cso | M1 requires at least two correct terms in numerator **and** $/(120\times7)$ or $/120$ then $/7$; A1 cso as given answer |
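
The arithmetic in part (b) can be checked with a short Python sketch. The variable names (`observed`, `seeds_per_row`, `p_hat`) are illustrative and not part of the mark scheme; the data are the observed frequencies from the question.

```python
# Observed data from the question: observed[k] is the number of rows
# in which exactly k of the 7 seeds germinated.
observed = [2, 6, 11, 19, 25, 32, 16, 9]
seeds_per_row = 7

rows = sum(observed)                                      # total rows planted
germinated = sum(k * f for k, f in enumerate(observed))   # total seeds that germinated

# Estimated probability that a single seed germinates.
p_hat = germinated / (rows * seeds_per_row)
print(rows, germinated, p_hat)  # 120 504 0.6
```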
## Part (c)
| Answer | Mark | Guidance |
|--------|------|----------|
| $s = 120 \times 21q^5p^2 = 120 \times 21 \times 0.4^5 \times 0.6^2 = 9.29$ | B1 | Cao |
| $t = 120 \times 35q^3p^4 = 120 \times 35 \times 0.4^3 \times 0.6^4 = 34.84$ | B1 | Cao |
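
As a rough check on part (c), the sketch below recomputes all eight expected frequencies under $B(7, 0.6)$, taking $p = 0.6$ and $q = 1 - p = 0.4$ as in the mark scheme. The helper `expected_rows` is a hypothetical name introduced only for this illustration.

```python
from math import comb

n, p, rows = 7, 0.6, 120  # seeds per row, germination probability, number of rows


def expected_rows(k):
    """Expected number of rows with exactly k germinations under B(n, p)."""
    return rows * comb(n, k) * p**k * (1 - p) ** (n - k)


# Round to 2 decimal places, as the question does.
expected = [round(expected_rows(k), 2) for k in range(n + 1)]
s, t = expected[2], expected[4]
print(expected)  # [0.2, 2.06, 9.29, 23.22, 34.84, 31.35, 15.68, 3.36]
```

The entries at positions 2 and 4 are $s$ and $t$; the other six agree with the table given in the question.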
## Part (d)
| Answer | Mark | Guidance |
|--------|------|----------|
| $H_0$: A binomial distribution is a suitable model; $H_1$: A binomial distribution is not a suitable model | B1 | B0 if 0.6 included; condone $X \sim B(n,p)$ etc |
| Combined columns and expected values calculated | M1 | M1 for using some combined columns ($<8$) |
| $\nu = 5 - 2 = 3$ | B1ft | Follows from 'their number of columns' $-2$ |
| Critical value $\chi^2 = 11.345$ | B1ft | Follows from degrees of freedom |
| $\sum\frac{(O-E)^2}{E} = 10.23$ or $\sum\frac{O^2}{E} - N = 130.23 - 120 = 10.23$ | M1A1 | M1 for attempting with at least $2^{nd}$ (3 seeds) and $4^{th}$ (5 seeds) accurate to 2sf; A1 awrt 10.2 |
| $10.23 < 11.345$ therefore do not reject $H_0$; a binomial is a suitable model | A1 | 2nd A1 dependent on 2nd M1; correct comment that binomial model is suitable; **no follow through** |
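
The test statistic in part (d) can be reproduced with the sketch below. It assumes the usual convention of pooling adjacent classes until every expected frequency is at least 5, which gives the groups {0, 1, 2}, {3}, {4}, {5}, {6, 7} and matches the 5 classes used in the mark scheme. The critical value is copied from the mark scheme rather than computed, and the names `groups`, `obs` and `exp` are illustrative.

```python
observed = [2, 6, 11, 19, 25, 32, 16, 9]
expected = [0.20, 2.06, 9.29, 23.22, 34.84, 31.35, 15.68, 3.36]

# Assumed pooling: half-open index ranges for {0,1,2}, {3}, {4}, {5}, {6,7}.
groups = [(0, 3), (3, 4), (4, 5), (5, 6), (6, 8)]
obs = [sum(observed[a:b]) for a, b in groups]
exp = [sum(expected[a:b]) for a, b in groups]

# Pearson chi-squared statistic over the pooled classes.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs, exp))

# Degrees of freedom: classes - 1 (fixed total) - 1 (p estimated from the data).
dof = len(groups) - 2

critical = 11.345  # 1% critical value for 3 degrees of freedom, from the mark scheme
reject = chi_sq > critical
print(round(chi_sq, 2), dof, reject)  # 10.23 3 False
```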
---
5. A research station is doing some work on the germination of a new variety of genetically modified wheat.
They planted 120 rows containing 7 seeds in each row.\\
The number of seeds germinating in each row was recorded. The results are as follows
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | }
\hline
Number of seeds germinating in each row & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\
\hline
Observed number of rows & 2 & 6 & 11 & 19 & 25 & 32 & 16 & 9 \\
\hline
\end{tabular}
\end{center}
\begin{enumerate}[label=(\alph*)]
\item Write down two reasons why a binomial distribution may be a suitable model.
\item Show that the probability of a randomly selected seed from this sample germinating is 0.6\\
The research station used a binomial distribution with probability 0.6 of a seed germinating. The expected frequencies were calculated to 2 decimal places. The results are as follows
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | }
\hline
Number of seeds germinating in each row & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\
\hline
Expected number of rows & 0.20 & 2.06 & $s$ & 23.22 & $t$ & 31.35 & 15.68 & 3.36 \\
\hline
\end{tabular}
\end{center}
\item Find the value of $s$ and the value of $t$.
\item Stating your hypotheses clearly, test, at the $1 \%$ level of significance, whether or not the data can be modelled by a binomial distribution.
\end{enumerate}
\hfill \mbox{\textit{Edexcel S3 2014 Q5 [13]}}