Edexcel S3 — Question 6 15 marks

Exam Board: Edexcel
Module: S3 (Statistics 3)
Marks: 15
Topic: Chi-squared test of independence
Type: Contingency table construction from description
Difficulty: Standard +0.3. This is a standard chi-squared test of independence with straightforward contingency table construction. The calculations (expected frequencies, test statistic) are routine, and the conceptual understanding required (why a 2 × 2 table has one degree of freedom, the effect of the omitted data) is basic S3 material. The only mild challenge is part (c), which requires the insight that the omitted data widens the existing difference in proportions and so cannot overturn the significant result; this is a relatively simple conceptual point.
Spec: 5.06a Chi-squared: contingency tables

A survey found that of the 320 people questioned who had passed their driving test aged under twenty-five, 104 had been involved in an accident in the two years following their test. Of the 80 people in the survey who were aged twenty-five or over when they passed their test, 16 had been involved in an accident in the following two years.
  (a) Draw up a contingency table showing this information. [2]
It is desired to test whether the proportion of drivers having accidents within two years of passing their test is different for those who were aged under twenty-five at the time of passing their test than for those aged twenty-five or over.
  (b) (i) Stating your hypotheses clearly, carry out the test at the 5% level of significance.
      (ii) Explain clearly why there is only one degree of freedom. [11]
It is found that 12 people who were aged under twenty-five when they took their test and had been involved in an accident in the following two years had been omitted from the information given.
  (c) Explain why you do not need to repeat the calculation to know the correct result of the test. [2]

Answer | Marks | Guidance
**Part (a)** | M1 A1 |

| | accident | no accident | total |
|---|---|---|---|
| < 25 yrs | 104 | 216 | 320 |
| ≥ 25 yrs | 16 | 64 | 80 |
| total | 120 | 280 | 400 |

**Part (b)(i)** | M1 A1, A1, B1 | expected frequency for < 25/accident = $\frac{120 \times 320}{400} = 96$; giving expected frequencies 96, 224, 24, 56; $H_0$: no association between age at passing the test and having an accident in the next 2 years; $H_1$: there is an association between age at passing the test and having an accident in the next 2 years

| Cell | $O$ | $E$ | $O - E$ | $\frac{(O-E)^2}{E}$ |
|---|---|---|---|---|
| < 25, accident | 104 | 96 | 8 | 0.6667 |
| < 25, no accident | 216 | 224 | −8 | 0.2857 |
| ≥ 25, accident | 16 | 24 | −8 | 2.6667 |
| ≥ 25, no accident | 64 | 56 | 8 | 1.1429 |

| M1 A2, M1 A1, A1 | $\therefore \sum \frac{(O-E)^2}{E} = 4.762$; $\nu = 1$, $\chi^2_{\text{crit}}(5\%) = 3.841$; $4.762 > 3.841$ $\therefore$ significant, reject $H_0$; evidence of an association between age at passing the test and having an accident in the next 2 years
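
The arithmetic in part (b)(i) can be checked with a short script. This is a minimal sketch rather than part of the mark scheme: the function names `expected_frequencies` and `chi_squared_statistic` are illustrative, no continuity correction is applied (matching the working above), and the critical value is hard-coded from the figure quoted in the mark scheme rather than computed.

```python
# Observed counts from the contingency table in part (a).
# Rows: under 25, 25 or over. Columns: accident, no accident.
observed = [[104, 216], [16, 64]]


def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_statistic(table):
    """Sum of (O - E)^2 / E over all cells, with no continuity correction."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Tabulated 5% critical value for 1 degree of freedom, as quoted in the mark scheme.
CRITICAL_5_PERCENT_1_DF = 3.841

expected = expected_frequencies(observed)
statistic = chi_squared_statistic(observed)

print("expected frequencies:", expected)
print("test statistic:", round(statistic, 3))
print("significant at 5%:", statistic > CRITICAL_5_PERCENT_1_DF)
```

Writing the expected frequencies as a function of the row and column totals mirrors the hand calculation $\frac{120 \times 320}{400}$ and makes it easy to rerun the check on a different table.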

**Part (b)(ii)** | B1 | the row and column totals are fixed and the expected frequencies must agree with them, so once one value is known all the others can be calculated
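
The same conclusion follows from the standard degrees-of-freedom formula for a contingency table. The symbols $r$ (number of rows) and $c$ (number of columns) are notation introduced here and are not used in the mark scheme:

$$\nu = (r-1)(c-1) = (2-1)(2-1) = 1$$

For example, once the expected frequency 96 is fixed, the totals force the rest: $320 - 96 = 224$, $120 - 96 = 24$ and $80 - 24 = 56$.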

**Part (c)** | B2, (15) | the higher proportion of accidents in the < 25 group led to the significant result; the extra data increases this difference, so the result is still significant
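
Part (c) asks for a reasoned argument, not a recalculation, but the claim can be sanity-checked numerically. The sketch below assumes the 12 omitted people are all added to the < 25/accident cell (so 104 becomes 116 and the row total 320 becomes 332); the helper name `chi_squared_statistic` is illustrative and the statistic is computed without a continuity correction.

```python
def chi_squared_statistic(table):
    """Sum of (O - E)^2 / E over all cells, with no continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected_count = row_totals[i] * col_totals[j] / grand_total
            total += (observed_count - expected_count) ** 2 / expected_count
    return total


# Rows: under 25, 25 or over. Columns: accident, no accident.
original = [[104, 216], [16, 64]]
# Assumption: all 12 omitted people belong in the under-25 / accident cell.
corrected = [[104 + 12, 216], [16, 64]]

# Tabulated 5% critical value for 1 degree of freedom, as quoted in the mark scheme.
CRITICAL_5_PERCENT_1_DF = 3.841

original_stat = chi_squared_statistic(original)
corrected_stat = chi_squared_statistic(corrected)

# Accident proportions in the under-25 group before and after the correction.
prop_before = original[0][0] / sum(original[0])
prop_after = corrected[0][0] / sum(corrected[0])

print("under-25 accident proportion:", prop_before, "->", prop_after)
print("test statistic:", original_stat, "->", corrected_stat)
print("still significant:", corrected_stat > CRITICAL_5_PERCENT_1_DF)
```

Under this assumption the accident proportion in the under-25 group rises while the 25-or-over proportion stays at $16/80$, so the test statistic moves further above the critical value.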
A survey found that of the 320 people questioned who had passed their driving test aged under twenty-five, 104 had been involved in an accident in the two years following their test. Of the 80 people in the survey who were aged twenty-five or over when they passed their test, 16 had been involved in an accident in the following two years.

\begin{enumerate}[label=(\alph*)]
\item Draw up a contingency table showing this information. [2]
\end{enumerate}

It is desired to test whether the proportion of drivers having accidents within two years of passing their test is different for those who were aged under twenty-five at the time of passing their test than for those aged twenty-five or over.

\begin{enumerate}[label=(\alph*)]
\setcounter{enumi}{1}
\item \begin{enumerate}[label=(\roman*)]
\item Stating your hypotheses clearly, carry out the test at the 5\% level of significance.

\item Explain clearly why there is only one degree of freedom. [11]
\end{enumerate}
\end{enumerate}

It is found that 12 people who were aged under twenty-five when they took their test and had been involved in an accident in the following two years had been omitted from the information given.

\begin{enumerate}[label=(\alph*)]
\setcounter{enumi}{2}
\item Explain why you do not need to repeat the calculation to know the correct result of the test. [2]
\end{enumerate}

\hfill \mbox{\textit{Edexcel S3  Q6 [15]}}