Edexcel S3 2023 January — Question 3 9 marks

Exam Board: Edexcel
Module: S3 (Statistics 3)
Year: 2023
Session: January
Marks: 9
Topic: Chi-squared goodness of fit
Type: Chi-squared test of independence
Difficulty: Moderate -0.8. This is a straightforward chi-squared test question requiring standard expected frequency calculations using (row total × column total)/grand total, followed by a routine hypothesis test. The calculations are simple arithmetic, the test procedure is entirely standard, and part of the test statistic is already provided. This is easier than average A-level content as it requires only direct application of learned procedures with no problem-solving or insight.
Spec: 5.06a Chi-squared: contingency tables

# Question 3:

## Part (a)
| Answer/Working | Mark | Guidance |
|---|---|---|
| $\frac{86 \times 300}{1200}$ or $\frac{1114 \times 300}{1200}$ | M1 | A correct method for finding one expected value; implied by one correct value |
| 21.5 and 278.5 | A1 | Correct answer for both 21.5 and 278.5 |
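
The part (a) method can be sketched in a few lines of Python. This is only an illustration of the (row total × column total)/grand total rule using the totals from the question's table; the function name `expected_frequency` and the variable names are hypothetical, not part of the mark scheme.

```python
def expected_frequency(row_total, column_total, grand_total):
    """Expected count under independence: row total x column total / grand total."""
    return row_total * column_total / grand_total


# Totals read from the question's contingency table.
claim_total = 86          # customers who made a claim in 2020
no_claim_total = 1114     # customers who made no claim in 2020
grand_total = 1200        # whole sample
age_51_plus_total = 300   # row total for "51 years and over"

e_claim = expected_frequency(age_51_plus_total, claim_total, grand_total)
e_no_claim = expected_frequency(age_51_plus_total, no_claim_total, grand_total)

print(e_claim, e_no_claim)
```

The two expected frequencies add up to the row total of 300, which is a quick sanity check on the arithmetic.
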

## Part (b)
| Answer/Working | Mark | Guidance |
|---|---|---|
| $H_0$: Making a claim and age are independent (not associated); $H_1$: Making a claim and age are not independent (associated) | B1 | Both hypotheses correct; must mention claim and age at least once; use of "relationship" or "correlation" or "connection" is B0 |
| $\frac{(14 - \text{"21.5"})^2}{\text{"21.5"}} = 2.6162...$ and $\frac{(286 - \text{"278.5"})^2}{\text{"278.5"}} = 0.20197...$ | M1 | A correct method for finding both contributions to the $\chi^2$ value or awrt 2.62 or awrt 0.202; allow truncated answers of 2.61 and 0.201; may be implied by awrt 9.94 |
| $\sum \frac{(O-E)^2}{E} = 7.123 + \text{"2.616..."} + \text{"0.2019..."}$ | M1 | Adding their two values to 7.123 (may be implied by a full $\chi^2$ calculation, with at least 3 correct expressions or values; do not ISW) |
| $= 9.941...$ awrt 9.94 | A1 | awrt 9.94 |
| $\nu = (2-1)(3-1) = 2$ | B1 | $\nu = 2$; this mark can be implied by a correct critical value of 9.21 or better |
| $\chi^2_2(0.01) = 9.210 \Rightarrow$ CR: $X^2 \geq 9.21[0]$ | B1ft | 9.21[0] or better, ft their degrees of freedom; a common alternative is $\nu = 3$, which gives 11.345 |
| [In the CR/significant/Reject $H_0$] There is sufficient evidence to suggest that making a **claim** is not independent of **age** | dA1ft | Independent of hypotheses but dependent on both M marks; correct contextual conclusion compatible with their values, which has the words claim and age; e.g. if they have 11.345 and 9.94 they should say it is independent/not associated; do not allow contradicting statements |
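
The part (b) working can be checked with a short Python sketch. It assumes the given partial sum of 7.123 and the expected frequencies 21.5 and 278.5 from part (a). The critical value is obtained from the closed form $-2\ln\alpha$, which holds only for a chi-squared distribution with 2 degrees of freedom (its upper-tail probability is $e^{-x/2}$); for any other number of degrees of freedom, statistical tables or a statistics library would be needed. The helper name `contribution` is hypothetical.

```python
import math


def contribution(observed, expected):
    """One cell's contribution (O - E)^2 / E to the test statistic."""
    return (observed - expected) ** 2 / expected


# Sum over the four classes aged 17 to 50, as stated in the question.
given_partial_sum = 7.123

# Contributions from the two "51 years and over" cells.
extra = contribution(14, 21.5) + contribution(286, 278.5)
test_statistic = given_partial_sum + extra

# Degrees of freedom for a 3 x 2 contingency table.
rows, cols = 3, 2
dof = (rows - 1) * (cols - 1)

# Closed form valid ONLY for 2 degrees of freedom: P(X > x) = exp(-x / 2).
alpha = 0.01
critical_value = -2 * math.log(alpha)

reject_h0 = test_statistic >= critical_value
print(round(test_statistic, 2), dof, round(critical_value, 3), reject_h0)
```

The test statistic (about 9.94) exceeds the critical value (about 9.210), so the sketch agrees with the mark scheme's conclusion that $H_0$ is rejected at the 1% level.
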

3. A mobile phone company offers an insurance policy to its customers when they purchase a mobile phone. The company conducted a survey on the age of the customers and whether or not claims were made.

A random sample of 1200 customers from this company was investigated for 2020 and the results are shown in the table below.

\begin{center}
\begin{tabular}{|l|l|l|l|l|}
\hline
\multicolumn{2}{|c|}{} & Claim made in 2020 & No claim made in 2020 & Total \\
\hline
\multirow{3}{*}{Age} & 17-20 years & 24 & 176 & 200 \\
\hline
 & 21-50 years & 48 & 652 & 700 \\
\hline
 & 51 years and over & 14 & 286 & 300 \\
\hline
 & Total & 86 & 1114 & 1200 \\
\hline
\end{tabular}
\end{center}

The data are to be used to determine whether or not making a claim is independent of age.
\begin{enumerate}[label=(\alph*)]
\item Calculate the expected frequencies for the age group 51 years and over that
\begin{enumerate}[label=(\roman*)]
\item made a claim in 2020
\item did not make a claim in 2020
\end{enumerate}

The 4 classes of customers aged between 17 and 50 give a value of $\sum \frac{(O - E)^2}{E} = 7.123$ correct to 3 decimal places.
\item Test, at the $1\%$ level of significance, whether or not making a claim is independent of age. Show your working clearly, stating your hypotheses, the degrees of freedom, the test statistic and the critical value used.
\end{enumerate}
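
The value 7.123 quoted for the four classes aged 17 to 50 can be reproduced directly from the observed table. The following Python sketch does so under the usual independence model; the dictionary layout and the helper name `row_contribution` are illustrative choices, not part of the question.

```python
# Observed counts from the question: (claim made, no claim made) per age group.
observed = {
    "17-20 years": (24, 176),
    "21-50 years": (48, 652),
    "51 years and over": (14, 286),
}

# Column totals and grand total derived from the observed counts.
column_totals = [sum(row[j] for row in observed.values()) for j in range(2)]
grand_total = sum(column_totals)


def row_contribution(row):
    """Sum of (O - E)^2 / E over the cells of one age-group row."""
    row_total = sum(row)
    total = 0.0
    for o, column_total in zip(row, column_totals):
        e = row_total * column_total / grand_total
        total += (o - e) ** 2 / e
    return total


# The four classes aged 17 to 50 are the cells of the first two rows.
partial = (row_contribution(observed["17-20 years"])
           + row_contribution(observed["21-50 years"]))
full = partial + row_contribution(observed["51 years and over"])

print(round(partial, 3), round(full, 3))
```

Because this version carries the unrounded partial sum, the full statistic can differ in the fourth decimal place from one built on the rounded 7.123, but both round to 9.94.
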

\hfill \mbox{\textit{Edexcel S3 2023 Q3 [9]}}