| Exam Board | CAIE |
|---|---|
| Module | Further Paper 4 |
| Year | 2023 |
| Session | November |
| Marks | 16 |
| Topic | Wilcoxon tests |
| Type | Wilcoxon rank-sum test (Mann-Whitney U test) |
| Difficulty | Standard +0.8. This Further Maths statistics question requires carrying out both a Wilcoxon rank-sum test and a \(t\)-test on small samples, plus a conceptual understanding of when the two tests agree. The calculations are systematic rather than requiring deep insight, but the dual-test requirement, the ranking procedure, and the need to interpret relative test power elevate it above a standard single-test question. |
| Spec | 5.05c Hypothesis test: normal distribution for population mean; 5.07d Paired vs two-sample: selection; 5.07e Test medians |
## Question 5(a):
| Answer | Mark | Guidance |
|--------|------|----------|
| $H_0$: **population** medians are equal or $m_x = m_y$; $H_1$: **population** median for $X$ < **population** median for $Y$ or $m_x < m_y$ | B1 | Do not accept 'difference between population medians < 0' without $X$ or $Y$ specified |
| Rankings table: $X$ values ranked 1,3,4,5,7,9,12,14 and $Y$ values ranked 2,6,8,10,11,13,15,16,17; Sum of $X$ ranks = 55, Sum of $Y$ ranks = 98 | M1 | Rankings, allow at most 3 errors |
| Test statistic = 55 | A1 | |
| Tabular value for $m=8$, $n=9$ is 54 | B1 | |
| $55 > 54$, accept $H_0$/not significant | M1 | Ft *their* '55' must come from ranks; ft *their* '54' must come from table |
| Insufficient evidence to support manager's claim. Insufficient evidence to suggest that the median time of machine $X$ is less than the median time of machine $Y$ | A1 | Correct conclusion in context, following correct work, level of uncertainty in language. A0 if hypotheses the wrong way round or missing |
| **Total: 6** | | |
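
The ranking step in part (a) can be checked with a short script. This is a minimal sketch in plain Python, not part of the mark scheme: the helper name `rank_sums` is hypothetical, it assumes there are no tied values (true for this data), and the critical value 54 is simply the tabulated figure quoted above.

```python
# Times in minutes, copied from the question.
x = [4.0, 4.6, 4.7, 4.8, 5.0, 5.2, 5.6, 5.8]
y = [4.5, 4.9, 5.1, 5.3, 5.4, 5.7, 5.9, 6.3, 6.4]


def rank_sums(a, b):
    """Return (rank sum of a, rank sum of b) within the pooled sample.

    Hypothetical helper; assumes no tied values, which holds here.
    """
    pooled = sorted(a + b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    return sum(rank[v] for v in a), sum(rank[v] for v in b)


w_x, w_y = rank_sums(x, y)

# Tabulated critical value for m = 8, n = 9, as quoted in the mark scheme.
critical = 54

# One-tailed test: reject H0 only if the X rank sum is at or below the table value.
reject_h0 = w_x <= critical

print(w_x, w_y, reject_h0)  # 55 98 False
```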
---
## Question 5(b):
| Answer | Mark | Guidance |
|--------|------|----------|
| $H_0: \mu_x = \mu_y \quad H_1: \mu_x < \mu_y$ | B1 | |
| $\sum x = 39.7,\ \sum x^2 = 199.33,\ \sum y = 49.5,\ \sum y^2 = 275.47$; $s_x^2 = \frac{1}{7}\left(199.33 - \frac{39.7^2}{8}\right) = 0.33125$ | B1 | |
| $s_y^2 = \frac{1}{8}\left(275.47 - \frac{49.5^2}{9}\right) = 0.4025$ | B1 | |
| Pooled variance $s^2 = \frac{7 \times 0.33125 + 8 \times 0.4025}{8+9-2}$ | M1 | |
| $= 0.36925$ | A1 | |
| $t = \dfrac{\frac{39.7}{8} - \frac{49.5}{9}}{s\sqrt{\frac{1}{8}+\frac{1}{9}}}$ | M1 | |
| $t = -1.82$ | A1 | |
| Tabular value $= 1.753$ (one-tailed, 15 degrees of freedom): $\quad |t| = 1.82 > 1.753$ | M1 | |
| Reject $H_0$, sufficient evidence that mean for machine $X$ is less than mean for machine $Y$ | A1 | CWO. Correct conclusion in context, following correct work, level of uncertainty in language |
| **Total: 9** | | |
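
The arithmetic in part (b) can be reproduced in the same way. The sketch below is illustrative only: the function name `pooled_t` is hypothetical, and the critical value 1.753 is taken from the mark scheme rather than computed.

```python
from math import sqrt

# Times in minutes, copied from the question.
x = [4.0, 4.6, 4.7, 4.8, 5.0, 5.2, 5.6, 5.8]
y = [4.5, 4.9, 5.1, 5.3, 5.4, 5.7, 5.9, 6.3, 6.4]


def pooled_t(a, b):
    """Return (pooled variance, t statistic) for a two-sample t-test.

    Hypothetical helper; assumes equal population variances, as the question states.
    """
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    ss_a = sum((v - mean_a) ** 2 for v in a)  # equals (n_a - 1) * s_a^2
    ss_b = sum((v - mean_b) ** 2 for v in b)  # equals (n_b - 1) * s_b^2
    pooled_var = (ss_a + ss_b) / (n_a + n_b - 2)
    t = (mean_a - mean_b) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return pooled_var, t


pooled_var, t = pooled_t(x, y)

# One-tailed 5% critical value for 15 degrees of freedom, as quoted in the mark scheme.
critical = 1.753

# H1 is mu_x < mu_y, so reject H0 when t falls below -critical.
reject_h0 = t < -critical

print(round(pooled_var, 5), round(t, 2), reject_h0)
```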
---
## Question 5(c):
| Answer | Mark | Guidance |
|--------|------|----------|
| The $t$-test assumes normally distributed populations with equal variances, which may not hold, so there is no reason to expect the results to be the same. Outliers affect part **(b)** but not part **(a)** | B1 | Not specific to data in question. Mention of normal distribution is not enough |
| **Total: 1** | | |
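
The remark about outliers can be illustrated numerically. In the sketch below, the largest machine $Y$ time is replaced by an extreme value of 60.0. That value is hypothetical, not exam data, and the helper names are assumptions too. Under this change the rank sum for $X$ should stay at 55, while the $t$ statistic moves towards zero and no longer exceeds 1.753 in magnitude.

```python
from math import sqrt

# Times in minutes, copied from the question.
x = [4.0, 4.6, 4.7, 4.8, 5.0, 5.2, 5.6, 5.8]
y = [4.5, 4.9, 5.1, 5.3, 5.4, 5.7, 5.9, 6.3, 6.4]

# Hypothetical variant: the largest Y time replaced by an extreme outlier.
y_outlier = y[:-1] + [60.0]


def rank_sum(a, b):
    """Rank sum of sample a within the pooled sample (assumes no ties)."""
    pooled = sorted(a + b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    return sum(rank[v] for v in a)


def pooled_t(a, b):
    """Two-sample t statistic using a pooled variance estimate."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    ss = sum((v - mean_a) ** 2 for v in a) + sum((v - mean_b) ** 2 for v in b)
    pooled_var = ss / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var * (1 / n_a + 1 / n_b))


# The rank sum ignores how extreme the outlier is; the t statistic does not.
for label, sample in (("original", y), ("with outlier", y_outlier)):
    print(label, rank_sum(x, sample), round(pooled_t(x, sample), 2))
```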
5 A company is deciding which of two machines, $X$ and $Y$, can make a certain type of electrical component more quickly. The times taken, in minutes, to make one component of this type are recorded for a random sample of 8 components made by machine $X$ and a random sample of 9 components made by machine $Y$. These times are as follows.
\begin{center}
\begin{tabular}{ l l l l l l l l l l }
Machine $X$ & 4.0 & 4.6 & 4.7 & 4.8 & 5.0 & 5.2 & 5.6 & 5.8 & \\
Machine $Y$ & 4.5 & 4.9 & 5.1 & 5.3 & 5.4 & 5.7 & 5.9 & 6.3 & 6.4 \\
\end{tabular}
\end{center}
The manager claims that on average the time taken by machine $X$ to make one component is less than that taken by machine $Y$.
\begin{enumerate}[label=(\alph*)]
\item Carry out a Wilcoxon rank-sum test at the $5 \%$ significance level to test whether the manager's claim is supported by the data.
\item Assuming that the times taken to produce the components by the two machines are normally distributed with equal variances, carry out a $t$-test at the $5 \%$ significance level to test whether the manager's claim is supported by the data.
\item In general, would you expect the conclusions from the tests in parts (a) and (b) to be the same? Give a reason for your answer.
\end{enumerate}
\hfill \mbox{\textit{CAIE Further Paper 4 2023 Q5 [16]}}