| Exam Board | OCR |
|---|---|
| Module | Further Statistics AS |
| Year | 2023 |
| Session | June |
| Marks | 12 |
| Topic | Chi-squared goodness of fit |
| Type | Assess model suitability before testing |
| Difficulty | Standard +0.3. This is a straightforward chi-squared goodness of fit test with given ratios. Students must calculate expected frequencies from ratios (standard procedure), complete a partially filled table, perform the test at the given significance level, and make a basic interpretation. The only slight elevation above average is part (a), which requires comparing the mean and variance to assess binomial validity; overall this follows a standard template with no novel problem-solving required. |
| Spec | 5.06c Fit other distributions: discrete and continuous; 5.06d Goodness of fit: chi-squared test |
# Question 6:
## Part (a):
| Answer | Marks | Guidance |
|--------|-------|----------|
| *Either* $6p = 3.35 \Rightarrow p = 0.558(33)$ | **M1** | Use $np$ and $npq$. Attempt to use Poisson: M0 |
| $\Rightarrow$ variance should be 1.48 (1.47958) | **A1** | Correct relevant calculation, e.g. $q = 1.0125...$, or $p = -0.0125$ or solve $6p^2 - 6p + 3.392 = 0$ to get both $p \approx 1.4$ or $-0.4$, but *not* from $p = 0.5$ |
| Not close to 3.392 so $B(6,p)$ not a good model | **A1 [3]** | Validly deduce that $B(6,p)$ not valid, e.g. $0 < p < 1$, and state conclusion. SC: 0.5 used: M1A0A1 |
| *Or* $npq > np$; so $q > 1$ which is impossible. Hence $B(6,p)$ not a good model | **M1A1 A1** | (qualitative argument) |
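
The arithmetic behind both routes in part (a) can be checked with a short sketch. The variable names below are illustrative only and do not come from the paper or the mark scheme.

```python
# Part (a): compare the variance implied by B(6, p) with the sample variance.
n = 6
sample_mean = 3.35
sample_var = 3.392

# "Either" route: estimate p from the mean (np = 3.35), then find npq.
p_hat = sample_mean / n
model_var = n * p_hat * (1 - p_hat)

# "Or" route: npq / np gives the q that the data would require.
q_implied = sample_var / sample_mean

print(round(p_hat, 5))      # estimate of p from the mean
print(round(model_var, 5))  # variance a binomial model would predict
print(round(q_implied, 4))  # exceeds 1, which is impossible for a probability
```

The predicted variance is far below the sample variance of 3.392, and the implied $q$ is greater than 1, so either route leads to the conclusion that $B(6,p)$ is not a good model.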
## Part (b):
| Answer | Marks | Guidance |
|--------|-------|----------|
| Expected frequencies 10, 12, 14, 16, 14, 12, 10 | **B1** | |
| Use $\frac{(O-E)^2}{E}$ | **M1** | Allow from at least one of $0.083(...)$ and 1.6 correct |
| $0.083(3...)$, 1.6 and total 3.3362 or 3.3363 | **A1 [3]** | Allow 3.34, 3.336 or better. If total omitted, or "0", in **(b)**, can be recovered from **(c)** ("0" probably comes from misunderstanding "Total") |
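
As a rough check of the table entries in part (b), the expected frequencies and contributions can be reproduced from the observed data and the ratio $5 : 6 : 7 : 8 : 7 : 6 : 5$ given in the question. This is a minimal sketch; the variable names are illustrative assumptions.

```python
# Part (b): expected frequencies and chi-squared contributions.
observed = [7, 10, 16, 15, 15, 11, 14]
ratio = [5, 6, 7, 8, 7, 6, 5]

total = sum(observed)  # 88 recorded values of X
# Scale the ratio so that the expected frequencies sum to the same total.
expected = [total * r / sum(ratio) for r in ratio]

# Each cell contributes (O - E)^2 / E to the test statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(contributions)

print(expected)
print([round(c, 4) for c in contributions])
print(round(chi_sq, 4))
```

Summing the unrounded contributions gives a total that rounds to 3.3363; summing the four-decimal table values gives 3.3362, which is why the mark scheme accepts either.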
## Part (c):
| Answer | Marks | Guidance |
|--------|-------|----------|
| $H_0$: data consistent with proposed model, $H_1$: not so | **B1** | Allow "data follows …" but *not* "data is in ratio …" nor "evidence that …" |
| $3.336(2) < 10.64$ | **B1ft** | Compare *their* 3.336 with correct CV (3.336 may be from calculator) |
| Do not reject $H_0$ | **M1ft** | Correct first conclusion, FT on their TS and on CV 9.236 or 12.59 |
| Insufficient evidence that proposed model does not fit data | **A1ft [4]** | Contextualised, not over-assertive. Needs 'double negative', *not* "significant evidence that data is consistent", etc. A0 if hypotheses wrong way round |
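
The decision in part (c) can be sketched as follows. The degrees of freedom are taken as the 7 categories minus 1, which matches the critical value 10.64 quoted in the mark scheme. The helper `upper_tail` is a hypothetical name; it uses the closed-form upper-tail probability of the chi-squared distribution that holds for an even number of degrees of freedom, so no statistics library is assumed.

```python
import math

def upper_tail(x, df):
    """P(chi-squared_df > x), valid only when df is even."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

test_statistic = 3.3363   # total from part (b)
df = 7 - 1                # seven categories, no parameters estimated
critical_value = 10.64    # 10% critical value as quoted in the mark scheme

reject = test_statistic > critical_value
p_value = upper_tail(test_statistic, df)

print(reject)                                   # False: do not reject H0
print(round(upper_tail(critical_value, df), 3)) # close to 0.10, as expected
print(p_value > 0.10)                           # True, consistent with the decision
```

Because the test statistic is well below the critical value, the sketch agrees with the mark scheme: there is insufficient evidence that the proposed model does not fit the data.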
## Part (d):
| Answer | Marks | Guidance |
|--------|-------|----------|
| Inferences from a hypothesis test are not "definite" | **B1** | "Definite" stated to be too strong, oe (not *just* "Rosa is wrong") |
| All we have is evidence / Sample size is small / other experiments might produce different results | **B1 [2]** | Relevant valid comment, e.g. "data might be misleading", "second model likely to be correct", "either could be correct", and no wrong extras. "Neither/both good" etc, from wrong conclusion to **(a)** or **(c)**: max B1B0 |
---
6 A machine is used to toss a coin repeatedly. Rosa believes that the outcome of each toss made by the machine is not independent of the previous toss. Rosa gets the machine to toss a coin 6 times and record the number of heads, $X$, obtained. After recording the number of heads obtained, Rosa resets the machine and gets it to toss the coin 6 more times. Rosa again records the number of heads obtained and she repeats this procedure until she has recorded 88 independent values of $X$.
\begin{enumerate}[label=(\alph*)]
\item The sample mean and sample variance of $X$ are 3.35 and 3.392 respectively.
Explain what these results suggest about the validity of a binomial model $\mathrm { B } ( 6 , p )$ for the data.
Rosa uses a computer spreadsheet to work out the probabilities for a more sophisticated model in which the outcome of each toss is dependent on the outcome of the previous toss. Her model suggests that the probabilities $\mathrm { P } ( X = x )$, for $x = 0,1,2,3,4,5,6$, are approximately in the ratio $5 : 6 : 7 : 8 : 7 : 6 : 5$. She carries out a $\chi ^ { 2 }$ test to investigate whether this model is a good fit for the data.
The following table shows the full results of the experiments, together with some of the calculations needed for the test.
\begin{center}
\begin{tabular}{|l|l|l|l|l|l|l|l|l|}
\hline
$x$ & 0 & 1 & 2 & 3 & 4 & 5 & 6 & Total \\
\hline
Observed frequency & 7 & 10 & 16 & 15 & 15 & 11 & 14 & 88 \\
\hline
Expected frequency & & & & & & & & \\
\hline
Contribution to $\chi ^ { 2 }$ statistic & 0.9 & 0.3333 & 0.2857 & 0.0625 & 0.0714 & & & \\
\hline
\end{tabular}
\end{center}
\item In the Printed Answer Booklet, complete the table.
\item Carry out the test, using a 10\% significance level.
\item Rosa says that the results definitely show that one of the two proposed models is correct.
Comment on this statement.
\end{enumerate}
\hfill \mbox{\textit{OCR Further Statistics AS 2023 Q6 [12]}}