| Exam Board | OCR MEI |
|---|---|
| Module | S4 (Statistics 4) |
| Year | 2009 |
| Session | June |
| Marks | 24 |
| Topic | T-tests (unknown variance) |
| Type | Two-sample t-test equal variance |
| Difficulty | Standard +0.3. This question combines a two-sample t-test and a paired t-test, each requiring routine calculations from given data. It has multiple parts and needs some conceptual understanding of paired versus unpaired designs, but the procedures are textbook applications with no novel problem-solving. The calculations are straightforward for S4 level, making it slightly easier than average. |
| Spec | 5.05c Hypothesis test: normal distribution for population mean |
# Question 3:
## Part (i):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $\bar{x} = 126.2$, $s = 8.7002$, $s^2 = 75.693$; $\bar{y} = 133.9$, $s = 10.4760$, $s^2 = 109.746$ | A1 | A1 if all correct. No mark for use of $s_n$ (8.2537 and 9.6989) |
| $H_0: \mu_A = \mu_B$; $H_1: \mu_A \neq \mu_B$ (where $\mu_A, \mu_B$ are population means) | 1, 1 | Do not accept $\bar{X} = \bar{Y}$ |
| Pooled $s^2 = \frac{9 \times 75.693 + 6 \times 109.746}{15} = \frac{681.24 + 658.48}{15} = 89.3146$ $[s = \sqrt{89.3146} = 9.4506]$ | B1 | |
| Test statistic: $\frac{126.2 - 133.9}{\sqrt{89.3146}\sqrt{\frac{1}{10}+\frac{1}{7}}} = \frac{-7.7}{4.6573} = -1.653$ | M1, A1 | |
| Refer to $t_{15}$ | 1 | No FT if wrong |
| Double-tailed 10% point is 1.753; Not significant | 1, 1 | No FT if wrong |
| No evidence that population mean concentrations differ | 1 | |

**[10 marks]**
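
The part (i) working can be checked numerically. The sketch below is only an illustration, not part of the mark scheme: it assumes Python's standard `statistics` module, the names `method_a`, `method_b` and `pooled_t` are made up for this example, and the critical value 1.753 is copied from the table above rather than computed.

```python
from math import sqrt
from statistics import mean, variance

# Data from part (i) of the question.
method_a = [124.8, 136.4, 116.6, 129.1, 140.7, 120.2, 124.6, 127.5, 111.8, 130.3]
method_b = [130.4, 136.2, 119.8, 150.6, 143.5, 126.1, 130.7]


def pooled_t(x, y):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    n_x, n_y = len(x), len(y)
    df = n_x + n_y - 2
    # statistics.variance uses the n - 1 divisor, i.e. s^2 rather than s_n^2.
    pooled_var = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return pooled_var, t, df


pooled_var, t_stat, df = pooled_t(method_a, method_b)

# Two-tailed 10% point of t_15, taken from the mark scheme (not computed here).
CRITICAL_10PCT_T15 = 1.753
significant = abs(t_stat) > CRITICAL_10PCT_T15

print(f"pooled s^2 = {pooled_var:.4f}, t = {t_stat:.3f}, df = {df}")
print("significant at 10%:", significant)
```

Using the sample variance with divisor $n - 1$ matches the mark scheme's note that $s_n$ earns no credit.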
## Part (ii):
| Answer/Working | Marks | Guidance |
|---|---|---|
| There may be consistent differences between days (days of week, types of rubbish, ambient conditions,...) which should be allowed for | E1, E1 | |
| Assumption: Normality of population of differences | 1 | |
| Differences are $7.4, -1.2, 11.1, 5.5, 6.2, 3.7, -0.3, 1.8, 3.6$; $[\bar{d} = 4.2, s = 3.862\ (s^2 = 14.915)]$ | M1 | A1 can be awarded here if NOT awarded in part (i). Use of $s_n (= 3.641)$ is not acceptable |
| Test statistic: $\frac{4.2 - 0}{3.862/\sqrt{9}} = 3.26$ | M1, A1 | |
| Refer to $t_8$; Double-tailed 5% point is 2.306 | 1, 1 | No FT if wrong |
| Significant; Seems population means differ | 1, 1 | |

**[10 marks]**
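
The paired calculation in part (ii) can be reproduced in the same way. This is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value 2.306 is the tabulated figure quoted in the mark scheme, not something the code derives.

```python
from math import sqrt
from statistics import mean, stdev

# Paired data from part (ii) of the question (pairs I to IX).
method_a = [119.6, 127.6, 141.3, 139.5, 141.3, 124.1, 116.6, 136.2, 128.8]
method_b = [112.2, 128.8, 130.2, 134.0, 135.1, 120.4, 116.9, 134.4, 125.2]

# Work with the within-pair differences A - B.
differences = [a - b for a, b in zip(method_a, method_b)]
n = len(differences)

d_bar = mean(differences)
s_d = stdev(differences)  # n - 1 divisor, as the mark scheme requires

# Under H0 the mean difference is 0, so the statistic is d_bar / (s_d / sqrt(n)).
t_stat = (d_bar - 0) / (s_d / sqrt(n))

# Two-tailed 5% point of t_8, taken from the mark scheme (not computed here).
CRITICAL_5PCT_T8 = 2.306
significant = abs(t_stat) > CRITICAL_5PCT_T8

print(f"d_bar = {d_bar:.1f}, s = {s_d:.3f}, t = {t_stat:.2f}, df = {n - 1}")
print("significant at 5%:", significant)
```

Differencing first removes the day-to-day variation that the statistician's criticism refers to, which is why the degrees of freedom drop to $n - 1 = 8$.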
## Part (iii):
| Answer/Working | Marks | Guidance |
|---|---|---|
| Wilcoxon rank sum test | B1 | |
| Wilcoxon signed rank test | B1 | |
| $H_0$: $\text{median}_A = \text{median}_B$; $H_1$: $\text{median}_A \neq \text{median}_B$ | 1, 1 | Or more formal statements |

**[4 marks]**
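
Part (iii) only asks for the names of the procedures, but it may help to see what each one computes. The sketch below is an illustration that goes beyond the mark scheme: `ranks` is a hypothetical helper written for this example, and no critical values are given in the source, so the code stops at the rank statistics and draws no conclusion.

```python
def ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


# Wilcoxon rank sum statistics for the unpaired data of part (i).
a1 = [124.8, 136.4, 116.6, 129.1, 140.7, 120.2, 124.6, 127.5, 111.8, 130.3]
b1 = [130.4, 136.2, 119.8, 150.6, 143.5, 126.1, 130.7]
combined_ranks = ranks(a1 + b1)
w_a = sum(combined_ranks[:len(a1)])  # rank sum for method A
w_b = sum(combined_ranks[len(a1):])  # rank sum for method B

# Wilcoxon signed rank statistics for the paired data of part (ii).
a2 = [119.6, 127.6, 141.3, 139.5, 141.3, 124.1, 116.6, 136.2, 128.8]
b2 = [112.2, 128.8, 130.2, 134.0, 135.1, 120.4, 116.9, 134.4, 125.2]
diffs = [a - b for a, b in zip(a2, b2) if a != b]  # zero differences are dropped
abs_ranks = ranks([abs(d) for d in diffs])
t_plus = sum(r for r, d in zip(abs_ranks, diffs) if d > 0)
t_minus = sum(r for r, d in zip(abs_ranks, diffs) if d < 0)

print("rank sums:", w_a, w_b)
print("signed rank sums:", t_plus, t_minus)
```

Either statistic would then be compared with tabulated critical values for the relevant sample sizes.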
---
3 (i) At a waste disposal station, two methods for incinerating some of the rubbish are being compared. Of interest is the amount of particulates in the exhaust, which can be measured over the working day in a convenient unit of concentration. It is assumed that the underlying distributions of concentrations of particulates are Normal. It is also assumed that the underlying variances are equal. During a period of several months, measurements are made for method A on a random sample of 10 working days and for method B on a separate random sample of 7 working days, with results, in the convenient unit, as follows.
\begin{center}
\begin{tabular}{ l l l l l l l l l l l }
Method A & 124.8 & 136.4 & 116.6 & 129.1 & 140.7 & 120.2 & 124.6 & 127.5 & 111.8 & 130.3 \\
Method B & 130.4 & 136.2 & 119.8 & 150.6 & 143.5 & 126.1 & 130.7 & & & \\
\end{tabular}
\end{center}
Use a $t$ test at the $10 \%$ level of significance to examine whether either method is better in resulting, on the whole, in a lower concentration of particulates. State the null and alternative hypotheses under test.\\
(ii) The company's statistician criticises the design of the trial in part (i) on the grounds that it is not paired. Summarise the arguments the statistician will have used. A new trial is set up with a paired design, measuring the concentrations of particulates on a random sample of 9 paired occasions. The results are as follows.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | c | }
\hline
Pair & I & II & III & IV & V & VI & VII & VIII & IX \\
\hline
Method A & 119.6 & 127.6 & 141.3 & 139.5 & 141.3 & 124.1 & 116.6 & 136.2 & 128.8 \\
\hline
Method B & 112.2 & 128.8 & 130.2 & 134.0 & 135.1 & 120.4 & 116.9 & 134.4 & 125.2 \\
\hline
\end{tabular}
\end{center}
Use a $t$ test at the $5 \%$ level of significance to examine the same hypotheses as in part (i). State the underlying distributional assumption that is needed in this case.\\
(iii) State the names of procedures that could be used in the situations of parts (i) and (ii) if the underlying distributional assumptions could not be made. What hypotheses would be under test?
\hfill \mbox{\textit{OCR MEI S4 2009 Q3 [24]}}