OCR Further Statistics 2018 December — Question 7 12 marks

Exam Board: OCR
Module: Further Statistics (Further Statistics)
Year: 2018
Session: December
Marks: 12
Topic: Chi-squared goodness of fit
Type: Chi-squared goodness of fit: Given ratios
Difficulty: Standard +0.8. This is a Further Maths Statistics chi-squared test requiring calculation of expected frequencies from Geo(0.4), combining categories appropriately, computing the test statistic with correct degrees of freedom (accounting for no estimated parameters), and critically evaluating the model fit to suggest improvements. The geometric distribution context and model criticism in part (b) elevate this above a routine chi-squared test.
Spec: 5.02g Geometric probabilities: P(X=r) = p(1-p)^(r-1); 5.06c Fit other distributions: discrete and continuous

7 Sasha tends to forget his passwords. He investigates whether the number of attempts he needs to log on to a system with a password can be modelled by a geometric distribution. On 60 occasions he records the number of attempts he needs to log on, and the results are shown in the table.

| Number of attempts | 1 | 2 | 3 | 4 or more |
| --- | --- | --- | --- | --- |
| Frequency | 20 | 19 | 13 | 3 |

  1. Test at the \(1 \%\) significance level whether the results are consistent with the distribution Geo(0.4). [9]
  2. Suggest which two probabilities should be changed, and in what way, to produce an improved model. (Numerical values are not required.) You should give a reason for your suggestion. [3]

## (a)
$H_0$: results are consistent with Geo(0.4)

$H_1$: results not consistent

Probabilities 0.4, 0.24, 0.144 and 0.216
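
These probabilities follow from the geometric formula $P(X=r) = p(1-p)^{r-1}$ with $p = 0.4$, taking the last cell as the tail probability:

$$P(X=1) = 0.4, \quad P(X=2) = 0.4 \times 0.6 = 0.24, \quad P(X=3) = 0.4 \times 0.6^2 = 0.144, \quad P(X \ge 4) = 0.6^3 = 0.216.$$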

Expected frequencies 24, 14.4, 8.64, 12.96

$\chi^2 = 0.6667 + 1.4694 + 2.2002 + 7.6544$

$= 11.99\ldots$

Critical value for 3 degrees of freedom at the $1\%$ level: $\chi^2 = 11.34$, and $11.99 > 11.34$

Reject $H_0$.

Significant evidence that data not well modelled by Geo(0.4)

| Marks | Guidance |
| --- | --- |
| B1, B1, B1, M1, M1, A1, A1, M1ft, A1ft | Both. Or equivalent; (no additional guidance); (no additional guidance); (no additional guidance); (no additional guidance); Correct first conclusion; Context needed, not too definite [not e.g. "Geo(4) is not a good model"]; FT on 11.99 only; FT on 11.99 only |
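
The arithmetic in part (a) can be checked with a short script. This is only a sketch: the variable names are illustrative and not part of the mark scheme, the expected frequencies are scaled to the 60 occasions stated in the question, and the critical value is hard-coded from the working rather than computed.

```python
# Sketch of the part (a) calculation; all names are illustrative assumptions.
p = 0.4                      # parameter of the hypothesised Geo(0.4)
n = 60                       # number of occasions stated in the question
observed = [20, 19, 13, 3]   # frequencies for 1, 2, 3 and "4 or more" attempts, as tabulated

# P(X = r) = p(1 - p)^(r - 1) for r = 1, 2, 3; the last cell is the tail P(X >= 4) = (1 - p)^3.
probabilities = [p * (1 - p) ** (r - 1) for r in (1, 2, 3)] + [(1 - p) ** 3]
expected = [n * q for q in probabilities]

# Each cell contributes (O - E)^2 / E to the test statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_squared = sum(contributions)

critical_value = 11.34       # 1% point of chi-squared with 3 degrees of freedom, taken from the working
reject_h0 = chi_squared > critical_value

print([round(e, 2) for e in expected])
print(round(chi_squared, 2), reject_h0)
```

No parameter is estimated from the data, so the degrees of freedom are the number of cells minus one, which is 3 here.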

## (b)
Increase P(3)

Decrease P(≥ 4)

These are the two cells with the largest contribution to $\chi^2$

| Marks | Guidance |
| --- | --- |
| M1, A1, B1 | (no additional guidance); (no additional guidance); (no additional guidance) |
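
The reasoning in part (b) can be illustrated by ranking the cells by their contribution to $\chi^2$. The sketch below is one possible way to do this, not the mark scheme's method. The names are illustrative, and the "increase"/"decrease" label is a simple assumption: a cell whose observed frequency exceeds its expected frequency suggests the model probability is too low, and vice versa.

```python
# Observed frequencies from the table and expected frequencies from part (a).
observed = {"1": 20, "2": 19, "3": 13, "4 or more": 3}
expected = {"1": 24.0, "2": 14.4, "3": 8.64, "4 or more": 12.96}

summary = []
for cell in observed:
    o, e = observed[cell], expected[cell]
    contribution = (o - e) ** 2 / e                     # this cell's share of chi-squared
    direction = "increase" if o > e else "decrease"     # suggested change to the model probability
    summary.append((contribution, cell, direction))

# Largest contributions first; the top two cells are the ones to adjust.
summary.sort(reverse=True)
two_largest = [(cell, direction) for _, cell, direction in summary[:2]]
print(two_largest)
```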

---
7 Sasha tends to forget his passwords. He investigates whether the number of attempts he needs to log on to a system with a password can be modelled by a geometric distribution. On 60 occasions he records the number of attempts he needs to log on, and the results are shown in the table.

\begin{center}
\begin{tabular}{ | c | c | c | c | c | }
\hline
Number of attempts & 1 & 2 & 3 & 4 or more \\
\hline
Frequency & 20 & 19 & 13 & 3 \\
\hline
\end{tabular}
\end{center}
\begin{enumerate}[label=(\alph*)]
\item Test at the $1 \%$ significance level whether the results are consistent with the distribution Geo(0.4). [9]
\item Suggest which two probabilities should be changed, and in what way, to produce an improved model. (Numerical values are not required.) You should give a reason for your suggestion. [3]
\end{enumerate}

\hfill \mbox{\textit{OCR Further Statistics 2018 Q7 [12]}}