CAIE Further Paper 4 2023 November — Question 5 16 marks

Exam Board: CAIE
Module: Further Paper 4
Year: 2023
Session: November
Marks: 16
Topic: Wilcoxon tests
Type: Wilcoxon rank-sum test (Mann-Whitney U test)
Difficulty: Standard +0.8. This is a Further Maths statistics question requiring execution of both Wilcoxon rank-sum and t-test procedures with small samples, plus conceptual understanding of when these tests agree. While the calculations are systematic rather than requiring deep insight, the dual-test requirement, the ranking procedure, and the need to interpret relative test power elevate this above a standard single-test question.
Spec: 5.05c Hypothesis test: normal distribution for population mean; 5.07d Paired vs two-sample: selection; 5.07e Test medians

5 A company is deciding which of two machines, \(X\) and \(Y\), can make a certain type of electrical component more quickly. The times taken, in minutes, to make one component of this type are recorded for a random sample of 8 components made by machine \(X\) and a random sample of 9 components made by machine \(Y\). These times are as follows.

| Machine | Times (minutes) |
|---------|-----------------|
| \(X\) | 4.0, 4.6, 4.7, 4.8, 5.0, 5.2, 5.6, 5.8 |
| \(Y\) | 4.5, 4.9, 5.1, 5.3, 5.4, 5.7, 5.9, 6.3, 6.4 |

The manager claims that on average the time taken by machine \(X\) to make one component is less than that taken by machine \(Y\).

(a) Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to test whether the manager's claim is supported by the data.

(b) Assuming that the times taken to produce the components by the two machines are normally distributed with equal variances, carry out a \(t\)-test at the \(5 \%\) significance level to test whether the manager's claim is supported by the data.

(c) In general, would you expect the conclusions from the tests in parts (a) and (b) to be the same? Give a reason for your answer.

## Question 5(a):

| Answer | Mark | Guidance |
|--------|------|----------|
| $H_0$: **population** medians are equal or $m_x = m_y$; $H_1$: **population** median for $X$ < **population** median for $Y$ or $m_x < m_y$ | B1 | Do not accept 'difference between population medians < 0' without $X$ or $Y$ specified |
| Rankings table: $X$ values ranked 1,3,4,5,7,9,12,14 and $Y$ values ranked 2,6,8,10,11,13,15,16,17; Sum of $X$ ranks = 55, Sum of $Y$ ranks = 98 | M1 | Rankings, allow at most 3 errors |
| Test statistic = 55 | A1 | |
| Tabular value for $m=8$, $n=9$ is 54 | B1 | |
| $55 > 54$, accept $H_0$/not significant | M1 | Ft *their* '55' must come from ranks; ft *their* '54' must come from table |
| Insufficient evidence to support manager's claim. Insufficient evidence to suggest that the median time of machine $X$ is less than the median time of machine $Y$ | A1 | Correct conclusion in context, following correct work, level of uncertainty in language. A0 if hypotheses the wrong way round or missing |
| **Total: 6** | | |

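As a cross-check on the ranking in part (a), the rank sums can be recomputed in a few lines of Python. This is a minimal sketch and not part of the mark scheme: the function name `rank_sums` is illustrative, the code assumes there are no tied values (true for this data), and the critical value 54 is copied from the mark scheme rather than computed. The decision rule shown (reject $H_0$ only when the rank sum for $X$ is at most the tabular value) is the reading implied by the scheme's "$55 > 54$, accept $H_0$".

```python
# Times in minutes, copied from the question.
x = [4.0, 4.6, 4.7, 4.8, 5.0, 5.2, 5.6, 5.8]
y = [4.5, 4.9, 5.1, 5.3, 5.4, 5.7, 5.9, 6.3, 6.4]


def rank_sums(x, y):
    """Return (rank sum of x, rank sum of y) within the pooled sample.

    Assumes no tied values, which holds for this data set.
    """
    # Tag each value with its sample, then sort the pooled list by value.
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    sums = {"x": 0, "y": 0}
    for rank, (_, label) in enumerate(pooled, start=1):
        sums[label] += rank
    return sums["x"], sums["y"]


w_x, w_y = rank_sums(x, y)

# Tabular value quoted in the mark scheme for m = 8, n = 9.
critical_value = 54
reject_h0 = w_x <= critical_value

print(w_x, w_y, reject_h0)  # 55 98 False
```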
---

## Question 5(b):

| Answer | Mark | Guidance |
|--------|------|----------|
| $H_0: \mu_x = \mu_y \quad H_1: \mu_x < \mu_y$ | B1 | |
| $\sum x = 39.7,\ \sum x^2 = 199.33,\ \sum y = 49.5,\ \sum y^2 = 275.47$; $s_x^2 = \frac{1}{7}\left(199.33 - \frac{39.7^2}{8}\right) = 0.33125$ | B1 | |
| $s_y^2 = \frac{1}{8}\left(275.47 - \frac{49.5^2}{9}\right) = 0.4025$ | B1 | |
| Pooled variance $s^2 = \frac{7 \times 0.33125 + 8 \times 0.4025}{8+9-2}$ | M1 | |
| $= 0.36925$ | A1 | |
| $t = \dfrac{\frac{39.7}{8} - \frac{49.5}{9}}{s\sqrt{\frac{1}{8}+\frac{1}{9}}}$ | M1 | |
| $t = -1.82$ | A1 | |
| Tabular value (15 degrees of freedom) $= 1.753$: $\quad |t| = 1.82 > 1.753$ | M1 | |
| Reject $H_0$, sufficient evidence that mean for machine $X$ is less than mean for machine $Y$ | A1 | CWO. Correct conclusion in context, following correct work, level of uncertainty in language |
| **Total: 9** | | |

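The arithmetic in part (b) can be checked numerically. The sketch below uses only the Python standard library; the function name `pooled_t` is illustrative, and the critical value 1.753 is copied from the mark scheme rather than computed. `statistics.variance` returns the unbiased sample variance (divisor $n - 1$), which matches $s_x^2$ and $s_y^2$ in the table above.

```python
from math import sqrt
from statistics import mean, variance

# Times in minutes, copied from the question.
x = [4.0, 4.6, 4.7, 4.8, 5.0, 5.2, 5.6, 5.8]
y = [4.5, 4.9, 5.1, 5.3, 5.4, 5.7, 5.9, 6.3, 6.4]


def pooled_t(x, y):
    """Return (pooled variance, t statistic) for an equal-variance two-sample test."""
    n_x, n_y = len(x), len(y)
    # Pooled estimate of the common variance, with n_x + n_y - 2 degrees of freedom.
    s2 = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    t = (mean(x) - mean(y)) / sqrt(s2 * (1 / n_x + 1 / n_y))
    return s2, t


s2, t = pooled_t(x, y)

# Tabular value quoted in the mark scheme; H1 is mu_x < mu_y,
# so the rejection region is the lower tail.
critical_value = 1.753
reject_h0 = t < -critical_value

print(round(s2, 5), round(t, 2), reject_h0)  # 0.36925 -1.82 True
```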
---

## Question 5(c):

| Answer | Mark | Guidance |
|--------|------|----------|
| The $t$-test assumes normal distributions with equal variances. This may not be true, so there is no reason to expect the results to be the same. Outliers affect part **(b)** but not part **(a)** | B1 | Answer must not be specific to the data in the question. Mention of normal distribution alone is not enough |
| **Total: 1** | | |
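
The remark that outliers affect part (b) but not part (a) can be illustrated with a small experiment. In the sketch below, the largest $Y$ time is replaced by a made-up extreme value (16.4 is hypothetical and does not come from the question); the helper names are illustrative. Because the altered value is still the largest observation, every rank is unchanged, while the $t$ statistic moves because both the mean and the pooled variance change. This is an illustration under those assumptions, not a model answer, since the mark scheme asks for a general reason.

```python
from math import sqrt
from statistics import mean, variance

# Times in minutes, copied from the question.
x = [4.0, 4.6, 4.7, 4.8, 5.0, 5.2, 5.6, 5.8]
y = [4.5, 4.9, 5.1, 5.3, 5.4, 5.7, 5.9, 6.3, 6.4]

# Hypothetical variant: the largest Y time (6.4) becomes an extreme outlier.
y_outlier = y[:-1] + [16.4]


def rank_sum_x(x, y):
    """Rank sum of the x sample in the pooled ordering (assumes no ties)."""
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    return sum(rank for rank, (_, label) in enumerate(pooled, start=1) if label == "x")


def pooled_t(x, y):
    """Equal-variance two-sample t statistic."""
    n_x, n_y = len(x), len(y)
    s2 = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    return (mean(x) - mean(y)) / sqrt(s2 * (1 / n_x + 1 / n_y))


w_original, t_original = rank_sum_x(x, y), pooled_t(x, y)
w_outlier, t_outlier = rank_sum_x(x, y_outlier), pooled_t(x, y_outlier)

print("original:     W =", w_original, " t =", round(t_original, 2))
print("with outlier: W =", w_outlier, " t =", round(t_outlier, 2))
```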