| Exam Board | OCR MEI |
|---|---|
| Module | S3 (Statistics 3) |
| Year | 2006 |
| Session | June |
| Marks | 18 |
| Topic | Wilcoxon tests |
| Type | Wilcoxon matched-pairs signed-rank test |
| Difficulty | Moderate (-0.3). A standard application of the Wilcoxon matched-pairs signed-rank test followed by a routine paired-sample t confidence interval. Part (i) requires calculating differences, ranking their absolute values, summing the ranks and comparing with tables. Part (ii) requires a straightforward paired t-interval calculation, a statement of the normality assumption, and recognition that a small sample with unknown population standard deviation calls for the t distribution. The question demands careful arithmetic and knowledge of when each procedure applies, but it follows textbook methods without requiring problem-solving insight or novel approaches, so it is slightly easier than average. |
| Spec | 5.05d Confidence intervals: using normal distribution; 5.07b Sign test: and Wilcoxon signed-rank |
# Question 4:
## Part (i)
| Answer | Mark | Guidance |
|--------|------|----------|
| Differences and Rank of \|diff\|: $-2\to2,\ -1\to1,\ -6\to5,\ -3\to3,\ 4\to4,\ -12\to9,\ 7\to6,\ -8\to7,\ -10\to8$ | M1 | For differences. ZERO in this section if differences not used. |
| | M1, A1 | For ranks. FT from here if ranks wrong. |
| $T = 4 + 6 = 10$ (or $1+2+3+5+7+8+9 = 35$) | B1 | |
| Refer to tables of Wilcoxon paired (/single sample) statistic. | M1 | No ft from here if wrong. |
| Lower (or upper if 35 used) 5% tail is needed. | M1 | i.e. a 1-tail test. No ft from here if wrong. |
| Value for $n = 9$ is 8 (or 37 if 35 used). | A1 | No ft from here if wrong. |
| Result is not significant. | A1 | ft only c's test statistic. |
| No evidence to suggest a real change. | A1 | ft only c's test statistic. |
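
The ranking steps in the table above can be checked with a short sketch. It assumes differences are taken as after minus before, as in the mark scheme, and uses only the Python standard library; the function name `signed_rank_sums` is illustrative, and the critical value 8 is copied from the mark scheme rather than computed.

```python
# Trespass incidents per week for factories A..I (from the question).
before = [8, 12, 6, 4, 14, 22, 4, 13, 14]
after = [6, 11, 0, 1, 18, 10, 11, 5, 4]


def signed_rank_sums(x, y):
    """Return (sum of ranks of positive differences, sum for negative ones).

    Differences are y - x. This sketch assumes no zero differences and no
    tied absolute differences, which holds for this data set; ties would
    need averaged ranks.
    """
    diffs = [b - a for a, b in zip(x, y)]
    ordered = sorted(diffs, key=abs)  # rank 1 = smallest |difference|
    w_plus = sum(r for r, d in enumerate(ordered, start=1) if d > 0)
    w_minus = sum(r for r, d in enumerate(ordered, start=1) if d < 0)
    return w_plus, w_minus


w_plus, w_minus = signed_rank_sums(before, after)

# One-tail 5% critical value for n = 9, as quoted in the mark scheme.
CRITICAL_VALUE = 8

# Fewer incidents afterwards would make the positive rank sum small,
# so the result is significant only if w_plus <= CRITICAL_VALUE.
significant = w_plus <= CRITICAL_VALUE

print(w_plus, w_minus, significant)  # 10 35 False
```

Since 10 exceeds the tabulated value of 8, the sketch agrees with the mark scheme's conclusion that the result is not significant.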
## Part (ii)
| Answer | Mark | Guidance |
|--------|------|----------|
| Normality of differences is required. | B1 | |
| CI MUST be based on DIFFERENCES. Differences are 53, 15, 32, 13, 61, 82, 70 | | ZERO/6 for the CI if differences not used. Accept negatives throughout. |
| $\bar{d} = 46.5714,\quad s_{n-1} = 27.0485$ | B1 | Accept $s_{n-1}^2 = 731.62\ldots$ [$s_n = 25.0420$, but do NOT allow this here or in construction of CI.] |
| CI is given by $46.5714 \pm 3.707 \times \dfrac{27.0485}{\sqrt{7}}$ | M1, B1, B1 | M1: allow c's $\bar{d} \pm \ldots$. B1: $t_6$ used. B1: 99% 2-tail point for c's $t$ distribution (independent of previous mark). |
| | M1 | Allow c's $s_{n-1}$. |
| $= 46.5714 \pm 37.8980 = (8.67(34),\ 84.47)$ | A1 | c.a.o. Must be expressed as an interval. [Upper boundary is 84.4694] |
| Cannot base CI on Normal distribution because sample is small and population s.d. is not known | E1, E1 | Insist on "population", but allow "$\sigma$". |
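
The arithmetic for the interval can be reproduced with the sketch below. It assumes differences are taken as after minus before (the mark scheme accepts negatives throughout) and takes the $t_6$ percentage point 3.707 from the mark scheme rather than computing it; variable names are illustrative.

```python
import math
import statistics

# Costs of damage (in pounds) for factories T..Z (from the question).
before = [1215, 95, 546, 467, 2356, 236, 550]
after = [1268, 110, 578, 480, 2417, 318, 620]

# The interval must be built from the paired differences.
diffs = [a - b for b, a in zip(before, after)]
n = len(diffs)

d_bar = statistics.mean(diffs)
s = statistics.stdev(diffs)  # sample s.d. with divisor n - 1

# 99% two-tail point of t with n - 1 = 6 degrees of freedom,
# read from tables as in the mark scheme.
T_CRIT = 3.707

half_width = T_CRIT * s / math.sqrt(n)
lower, upper = d_bar - half_width, d_bar + half_width

print(diffs)                          # [53, 15, 32, 13, 61, 82, 70]
print(f"({lower:.2f}, {upper:.2f})")  # (8.67, 84.47)
```

Using `statistics.pstdev` (divisor $n$) instead would give the $s_n$ value that the mark scheme explicitly disallows.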
4 A company has many factories. It is concerned about incidents of trespassing and, in the hope of reducing if not eliminating these, has embarked on a programme of installing new fencing.\\
(i) Records for a random sample of 9 factories of the numbers of trespass incidents in typical weeks before and after installation of the new fencing are as follows.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | c | }
\hline
Factory & A & B & C & D & E & F & G & H & I \\
\hline
Number before installation & 8 & 12 & 6 & 4 & 14 & 22 & 4 & 13 & 14 \\
\hline
Number after installation & 6 & 11 & 0 & 1 & 18 & 10 & 11 & 5 & 4 \\
\hline
\end{tabular}
\end{center}
Use a Wilcoxon test to examine at the $5 \%$ level of significance whether it appears that, on the whole, the number of trespass incidents per week is lower after the installation of the new fencing than before.\\
(ii) Records are also available of the costs of damage from typical trespass incidents before and after the introduction of the new fencing for a random sample of 7 factories, as follows (in £).
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | }
\hline
Factory & T & U & V & W & X & Y & Z \\
\hline
Cost before installation & 1215 & 95 & 546 & 467 & 2356 & 236 & 550 \\
\hline
Cost after installation & 1268 & 110 & 578 & 480 & 2417 & 318 & 620 \\
\hline
\end{tabular}
\end{center}
Stating carefully the required distributional assumption, provide a two-sided $99 \%$ confidence interval based on a $t$ distribution for the population mean difference between costs of damage before and after installation of the new fencing.
Explain why this confidence interval should not be based on the Normal distribution.
\hfill \mbox{\textit{OCR MEI S3 2006 Q4 [18]}}