| Exam Board | OCR MEI |
|---|---|
| Module | S3 (Statistics 3) |
| Year | 2007 |
| Session | June |
| Marks | 18 |
| Topic | Wilcoxon tests |
| Type | Paired t-test |
| Difficulty | Standard +0.3. This is a straightforward paired t-test application with standard hypothesis setup, calculation of differences, and confidence interval construction. It requires multiple steps and careful arithmetic, but all techniques are routine S3 procedures with no novel problem-solving or conceptual challenges beyond textbook exercises. |
| Spec | 5.05c Hypothesis test: normal distribution for population mean; 5.05d Confidence intervals: using normal distribution |
**Part (a)(i):**
| Answer | Marks | Guidance |
|---|---|---|
| $H_0: \mu_D = 0$ | B1 | Both. Accept alternatives, e.g. $\mu_D < 0$ for $H_1$, or $\mu_A - \mu_B$ etc., provided adequately defined. |
| $H_1: \mu_D > 0$ | B1 | Allow absence of "population" if the correct notation $\mu$ is used, but do NOT allow "$\bar{x} = \ldots$" or similar unless $\bar{x}$ is clearly and explicitly stated to be a population mean. Hypotheses in words only must include "population". |
| where $\mu_D$ is the (population) mean reduction in absenteeism. | B1 | |
| Must assume Normality ... ... of differences. | B1 | |

Total: 4 marks
**Part (a)(ii):**
Differences (reductions), before $-$ after: 1.7, 0.7, 0.6, $-1.3$, 0.1, $-0.9$, 0.6, $-0.7$, 0.4, 2.7, 0.9

| Answer | Marks | Guidance |
|---|---|---|
| $\bar{x} = 0.4364$, $s_{n-1} = 1.1518$ ($s_{n-1}^2 = 1.3265$) | B1, M1 | Do not allow $s_n = 1.098$ ($s_n^2 = 1.205$). Allow c's $\bar{x}$ and/or $s_{n-1}$. Allow alternative: $0 \pm (\text{c's } 1.812) \times \frac{1.1518}{\sqrt{11}}$ $(= -0.6293, 0.6293)$ for subsequent comparison with $\bar{x}$. (Or $\bar{x} \pm (\text{c's } 1.812) \times \frac{1.1518}{\sqrt{11}}$ $(= -0.1929, 1.0657)$ for comparison with 0.) |
| Test statistic is $\frac{0.4364 - 0}{1.1518 / \sqrt{11}} = 1.256(56\ldots)$ | A1 | c.a.o. but ft from here in any case if wrong. Use of $0 - \bar{x}$ scores M1A0, but ft. |
| Refer to $t_{10}$. Upper 5% point is 1.812. | M1, A1 | No ft from here if wrong (applies to both marks). For alternative $H_1$ expect $-1.812$ unless it is clear that absolute values are being used. |
| $1.256 < 1.812$, $\therefore$ result is not significant. Seems there has been no reduction in mean absenteeism. | E1, E1 | ft only c's test statistic (applies to both marks). Special case: ($t_{11}$ and 1.796) can score 1 of these last 2 marks if either form of conclusion is given. |

Total: 7 marks
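
The arithmetic in part (a)(ii) can be checked with a short script. This is only a sketch using Python's standard library: the function name `paired_t_statistic` and the variable names are illustrative assumptions, and the critical value 1.812 is copied from the mark scheme rather than computed. It also reproduces the alternative comparison of $\bar{x}$ with the critical bound $1.812 \times s_{n-1}/\sqrt{11}$.

```python
import math
import statistics

# Data from the question: % of working days lost per shop (A to K).
before = [3.5, 5.0, 3.5, 3.2, 4.5, 4.9, 4.1, 6.0, 6.8, 8.1, 6.0]
after = [1.8, 4.3, 2.9, 4.5, 4.4, 5.8, 3.5, 6.7, 6.4, 5.4, 5.1]

# Upper 5% point of t with 10 degrees of freedom, as quoted in the mark scheme.
T10_UPPER_5 = 1.812


def paired_t_statistic(x, y):
    """Return (mean, sd, standard error, t) for the differences x - y.

    Hypothetical helper for illustration; tests H0: mu_D = 0.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample sd, divisor n - 1
    std_err = sd_d / math.sqrt(n)
    return mean_d, sd_d, std_err, (mean_d - 0) / std_err


mean_d, sd_d, std_err, t_stat = paired_t_statistic(before, after)

# Alternative form: largest sample mean consistent with H0 at the 5% level.
critical_mean = T10_UPPER_5 * std_err

print(f"mean = {mean_d:.4f}, sd = {sd_d:.4f}, t = {t_stat:.4f}")
print(f"critical bound for the mean = {critical_mean:.4f}")
print("significant" if t_stat > T10_UPPER_5 else "not significant")
```

Using `statistics.stdev` (divisor $n-1$) rather than `statistics.pstdev` (divisor $n$) matches the mark scheme's requirement for $s_{n-1}$.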
**Part (b):**
For "days lost after"
| Answer | Marks | Guidance |
|---|---|---|
| $\bar{x} = 4.6182$, $s_{n-1} = 1.4851$ ($s_{n-1}^2 = 2.2056$) | B1, M1, B1, M1 | Do not allow $s_n = 1.4160$ ($s_n^2 = 2.0051$). ft c's $\bar{x} \pm$. ft c's $s_{n-1}$. |
| CI is given by $4.6182 \pm 2.228 \times \frac{1.4851}{\sqrt{11}} = 4.6182 \pm 0.9976 = (3.620(6), 5.615(8))$ | A1 | c.a.o. Must be expressed as an interval. ZERO if not same distribution as test. Same wrong distribution scores maximum M1B0M1A0. Recovery to $t_{10}$ is OK. |
| Assume Normality of population of "days lost after". Since 3.5 lies outside the interval it seems that the target has not been achieved. | E1, E1 | |

Total: 7 marks
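
The confidence interval in part (b) can be reproduced in the same way. Again this is a sketch under stated assumptions: `t_confidence_interval` is a hypothetical helper, and the two-sided 5% point 2.228 for $t_{10}$ is taken from the mark scheme, not computed.

```python
import math
import statistics

# "% days lost after" for shops A to K, from the question.
after = [1.8, 4.3, 2.9, 4.5, 4.4, 5.8, 3.5, 6.7, 6.4, 5.4, 5.1]

# Two-sided 5% point of t with 10 degrees of freedom, as quoted in the mark scheme.
T10_TWO_SIDED_5 = 2.228


def t_confidence_interval(sample, t_crit):
    """Return (lower, upper) for mean +/- t_crit * s_{n-1} / sqrt(n).

    Hypothetical helper for illustration.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    half_width = t_crit * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width, mean + half_width


lower, upper = t_confidence_interval(after, T10_TWO_SIDED_5)
target = 3.5  # management's target mean percentage

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("target inside interval" if lower <= target <= upper
      else "target outside interval")
```

The interval is built from the "after" values alone, so the assumption needed is Normality of that population, not of the paired differences used in part (a).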
3 The management of a large chain of shops aims to reduce the level of absenteeism among its workforce by means of an incentive bonus scheme. In order to evaluate the effectiveness of the scheme, the management measures the percentage of working days lost before and after its introduction for each of a random sample of 11 shops. The results are shown below.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | c | c | c | }
\hline
Shop & A & B & C & D & E & F & G & H & I & J & K \\
\hline
\% days lost before & 3.5 & 5.0 & 3.5 & 3.2 & 4.5 & 4.9 & 4.1 & 6.0 & 6.8 & 8.1 & 6.0 \\
\hline
\% days lost after & 1.8 & 4.3 & 2.9 & 4.5 & 4.4 & 5.8 & 3.5 & 6.7 & 6.4 & 5.4 & 5.1 \\
\hline
\end{tabular}
\end{center}
\begin{enumerate}[label=(\alph*)]
\item The management decides to carry out a $t$ test to investigate whether there has been a reduction in absenteeism.
\begin{enumerate}[label=(\roman*)]
\item State clearly the hypotheses that should be used together with any necessary assumptions.
\item Carry out the test using a $5 \%$ significance level.
\end{enumerate}
\item Find a 95\% confidence interval for the true mean percentage of days lost after the introduction of the incentive scheme and state any assumption needed. The management has set a target that the mean percentage should be 3.5. Do you think this has been achieved? Explain your answer.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI S3 2007 Q3 [18]}}