| Exam Board | Edexcel |
|---|---|
| Module | S3 (Statistics 3) |
| Year | 2023 |
| Session | January |
| Marks | 9 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared test of independence |
| Difficulty | Moderate -0.8. This is a straightforward chi-squared test question requiring standard expected frequency calculations using (row total × column total)/grand total, followed by a routine hypothesis test. The calculations are simple arithmetic, the test procedure is entirely standard, and part of the test statistic is already provided. It is easier than average A-level content because it requires only direct application of learned procedures, with no problem-solving or insight. |
| Spec | 5.06a Chi-squared: contingency tables |

| Age | Claim made in 2020 | No claim made in 2020 | Total |
|---|---|---|---|
| 17-20 years | 24 | 176 | 200 |
| 21-50 years | 48 | 652 | 700 |
| 51 years and over | 14 | 286 | 300 |
| Total | 86 | 1114 | 1200 |

# Question 3:
## Part (a)
| Answer/Working | Mark | Guidance |
|---|---|---|
| $\frac{86 \times 300}{1200}$ or $\frac{1114 \times 300}{1200}$ | M1 | A correct method for finding one expected value; implied by one correct value |
| 21.5 and 278.5 | A1 | Correct answer for both 21.5 and 278.5 |
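
The expected frequencies in part (a) come from (row total × column total) / grand total. A minimal Python sketch of that arithmetic follows; the variable and function names are chosen here for illustration and are not part of the mark scheme.

```python
# Totals taken from the contingency table in the question.
row_total_51_plus = 300      # customers aged 51 years and over
col_total_claim = 86         # customers who made a claim in 2020
col_total_no_claim = 1114    # customers who made no claim in 2020
grand_total = 1200


def expected_frequency(row_total, col_total, grand_total):
    """Expected count under independence: row total x column total / grand total."""
    return row_total * col_total / grand_total


expected_claim = expected_frequency(row_total_51_plus, col_total_claim, grand_total)
expected_no_claim = expected_frequency(row_total_51_plus, col_total_no_claim, grand_total)

print(expected_claim, expected_no_claim)  # 21.5 278.5
```

The two expected values sum to the row total of 300, which is a quick check on the arithmetic.
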
## Part (b)
| Answer/Working | Mark | Guidance |
|---|---|---|
| $H_0$: Making a claim and age are independent (not associated); $H_1$: Making a claim and age are not independent (associated) | B1 | Both hypotheses correct; must mention claim and age at least once; use of "relationship" or "correlation" or "connection" is B0 |
| $\frac{(14 - \text{"21.5"})^2}{\text{"21.5"}} = 2.6162...$ and $\frac{(286 - \text{"278.5"})^2}{\text{"278.5"}} = 0.20197...$ | M1 | A correct method for finding both contributions to the $\chi^2$ value or awrt 2.62 or awrt 0.202; allow truncated answers of 2.61 and 0.201; may be implied by awrt 9.94 |
| $\sum \frac{(O-E)^2}{E} = 7.123 + \text{"2.616..."} + \text{"0.2019..."}$ | M1 | Adding their two values to 7.123 (may be implied by a full $\chi^2$ calculation, with at least 3 correct expressions or values; do not ISW) |
| $= 9.941...$ awrt 9.94 | A1 | awrt 9.94 |
| $\nu = (2-1)(3-1) = 2$ | B1 | $\nu = 2$; this mark can be implied by a correct critical value of 9.21 or better |
| $\chi^2_2(0.01) = 9.210 \Rightarrow$ CR: $X^2 \geq 9.21[0]$ | B1ft | 9.21[0] or better; ft their degrees of freedom, e.g. $\nu = 3$ gives 11.345 |
| [In the CR/significant/Reject $H_0$] There is sufficient evidence to suggest that making a **claim** is not independent of **age** | dA1ft | Independent of hypotheses but dependent on both M marks; correct contextual conclusion compatible with their values, which has the words claim and age; e.g. if they have 11.345 and 9.94 they should say it is independent/not associated; do not allow contradicting statements |
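
The steps in part (b) can be traced in a short Python sketch, offered as an illustration rather than a required method. It assumes the expected values 21.5 and 278.5 from part (a) and the partial sum 7.123 supplied in the question. The critical value uses the fact that, for 2 degrees of freedom only, the chi-squared upper tail probability is $e^{-x/2}$; in an exam the value would be read from the tables.

```python
import math

# Observed and expected counts for the "51 years and over" row.
observed = [14, 286]
expected = [21.5, 278.5]

# Contributions (O - E)^2 / E for the two remaining cells.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# The question supplies the sum over the other four cells.
given_partial_sum = 7.123
test_statistic = given_partial_sum + sum(contributions)

# Degrees of freedom for an r x c contingency table: (r - 1)(c - 1).
rows, cols = 3, 2
dof = (rows - 1) * (cols - 1)

# For 2 degrees of freedom the upper tail is exp(-x / 2), so the
# critical value at level alpha is -2 ln(alpha). This shortcut is
# valid only when dof == 2.
alpha = 0.01
critical_value = -2 * math.log(alpha)

# Reject H0 when the statistic lies in the critical region.
reject_h0 = test_statistic >= critical_value
print(round(test_statistic, 2), dof, round(critical_value, 3), reject_h0)
```

The statistic agrees with the mark scheme's awrt 9.94 and the critical value with 9.210, so the statistic falls in the critical region and $H_0$ is rejected.
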
3 A mobile phone company offers an insurance policy to its customers when they purchase a mobile phone. The company conducted a survey on the age of the customers and whether or not claims were made.
A random sample of 1200 customers from this company was investigated for 2020 and the results are shown in the table below.
\begin{center}
\begin{tabular}{|l|l|l|l|l|}
\hline
\multicolumn{2}{|c|}{} & Claim made in 2020 & No claim made in 2020 & Total \\
\hline
\multirow{3}{*}{Age} & 17-20 years & 24 & 176 & 200 \\
\hline
& 21-50 years & 48 & 652 & 700 \\
\hline
& 51 years and over & 14 & 286 & 300 \\
\hline
& Total & 86 & 1114 & 1200 \\
\hline
\end{tabular}
\end{center}
The data are to be used to determine whether or not making a claim is independent of age.
\begin{enumerate}[label=(\alph*)]
\item Calculate the expected frequencies for the age group 51 years and over that
\begin{enumerate}[label=(\roman*)]
\item made a claim in 2020
\item did not make a claim in 2020
\end{enumerate}
The 4 classes of customers aged between 17 and 50 give a value of $\sum \frac{(O - E)^{2}}{E} = 7.123$ correct to 3 decimal places.
\item Test, at the $1\%$ level of significance, whether or not making a claim is independent of age. Show your working clearly, stating your hypotheses, the degrees of freedom, the test statistic and the critical value used.
\end{enumerate}
\hfill \mbox{\textit{Edexcel S3 2023 Q3 [9]}}
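
The value 7.123 quoted in the question for the four classes aged 17 to 50 can be checked directly from the table. The Python sketch below is one way to do so; the dictionary keys and the helper name `row_contribution` are illustrative choices, not part of the original paper.

```python
# Observed counts from the table: (claim, no claim) for each age group.
observed = {
    "17-20": (24, 176),
    "21-50": (48, 652),
    "51+": (14, 286),
}

# Column totals and grand total, computed from the observed counts.
col_totals = [sum(row[j] for row in observed.values()) for j in range(2)]
grand_total = sum(col_totals)


def row_contribution(row):
    """Sum of (O - E)^2 / E across one row, with E = row total x column total / grand total."""
    row_total = sum(row)
    total = 0.0
    for o, col_total in zip(row, col_totals):
        e = row_total * col_total / grand_total
        total += (o - e) ** 2 / e
    return total


# Contribution of the four cells for customers aged 17 to 50.
partial_sum = row_contribution(observed["17-20"]) + row_contribution(observed["21-50"])
print(round(partial_sum, 3))
```

Adding `row_contribution(observed["51+"])` to this partial sum gives the full test statistic for part (b).
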