| Exam Board | OCR MEI |
|---|---|
| Module | S4 (Statistics 4) |
| Year | 2007 |
| Session | June |
| Marks | 24 |
| Paper | Download PDF ↗ |
| Mark scheme | Download PDF ↗ |
| Topic | One-way analysis of variance (ANOVA) |
| Type | One-way ANOVA with experimental design |
| Difficulty | Standard +0.8. This is a multi-part Further Maths Statistics question requiring understanding of ANOVA concepts (residual mean square interpretation, Type II errors), experimental design principles (randomised block design for fertility gradients), and computational execution of one-way ANOVA. While the calculations are standard, the conceptual parts require deeper statistical reasoning than typical A-level questions, placing it moderately above average difficulty. |
| Spec | 2.01c Sampling techniques: simple random, opportunity, etc.; 2.01d Select/critique sampling: in context; 5.06b Fit prescribed distribution: chi-squared test |
# Question 4:
## Part (i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| There might be some consistent source of plot-to-plot variation that has inflated the residual and which the design has failed to cater for | E2 | E1 – Some reference to extra variation; E1 – Some indication of a reason |
## Part (ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Variation between the fertilisers should be compared with experimental error | E1 | |
| If the residual is inflated so that it measures more than experimental error, the comparison of between-fertilisers variation with it is less likely to reach significance | E2 | (E1, E1) |
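
The argument above can be written in terms of the test statistic. As a sketch (the subscripted symbols are notation assumed here, not taken from the mark scheme):

$$F = \frac{\text{MS}_{\text{between}}}{\text{MS}_{\text{residual}}}$$

Anything that enlarges the denominator without touching the numerator pulls $F$ down. Purely as an illustration, if the part (iv) residual mean square of 20.143 were three times as large (a hypothetical figure), the ratio would fall from $169.433 / 20.143 \approx 8.41$ to $169.433 / 60.429 \approx 2.80$, which is below the 5% point of 3.06, so a real difference between fertilisers would go undetected.
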
## Part (iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Randomised blocks | 1 | |
| Blocks (strips) clearly correctly oriented w.r.t. fertility gradient | E1 | |
| All fertilisers appear in a block | E1 | |
| Different (random) arrangements in the blocks | E1 | |
| SPECIAL CASE: Latin Square scores a maximum of $\frac{2}{4}$ | (1, E1) | |
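
The three layout points above can be sketched in code. This is a minimal illustration, not the mark scheme's diagram: the function name `randomised_block_layout`, the choice of 4 blocks, and the seed are all assumptions made for the example. Each block is taken to be a north-south strip, so that the west-to-east fertility gradient varies between blocks rather than within them.

```python
import random


def randomised_block_layout(treatments, n_blocks, seed=None):
    """Return one independently shuffled copy of the treatments per block.

    Hypothetical helper for illustration only. Each inner list is one
    block (a north-south strip), listed from west to east.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = list(treatments)   # every fertiliser appears once per block
        rng.shuffle(block)         # separate random arrangement in each block
        layout.append(block)
    return layout


# Assumed values: 4 blocks and seed=1 are arbitrary choices for the sketch.
layout = randomised_block_layout("ABCDE", n_blocks=4, seed=1)

print("West -> East (each column is one block)")
for row in zip(*layout):           # transpose so blocks print as columns
    print("  ".join(row))
```

Reading the printout column by column gives the blocks. Every column contains all five fertilisers, and the order within each column comes from its own shuffle.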
## Part (iv)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Totals: 95.0, 123.2, 86.8, 130.2, 67.4 (each from sample of size 4), Grand total 502.6 | | |
| CF $= \dfrac{502.6^2}{20} = 12630.338$ | | |
| Total SS $= 13610.22 - \text{CF} = 979.882$ | | |
| Between fertilisers SS $= \dfrac{95.0^2}{4} + \cdots + \dfrac{67.4^2}{4} - \text{CF}$ | M1 | |
| $= 13308.07 - \text{CF} = 677.732$ | M1 | For correct method for any two |
| Residual SS (by subtraction) $= 979.882 - 677.732 = 302.15$ | A1 | If each calculated SS is correct |
| ANOVA table, between fertilisers: SS = 677.732, df = 4, MS = 169.433, MS ratio = 8.41 | M1, M1, 1, A1, 1 | |
| ANOVA table, residual: SS = 302.15, df = 15, MS = 20.143 | | |
| ANOVA table, total: SS = 979.882, df = 19 | | |
| Refer to $F_{4,15}$ | 1 | No FT if wrong |
| Upper 5% point is 3.06 | 1 | No FT if wrong |
| Significant | 1 | |
| Seems effects of fertilisers are not all the same | 1 | |
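
The arithmetic in this part can be checked with a short script. The following is a sketch using only the standard library and the correction-factor method shown above. The function name `one_way_anova` and the dictionary keys are assumptions for the example, and the critical value 3.06 is copied from the mark scheme rather than computed.

```python
# Yields from the glasshouse trial, one list per fertiliser (from the question).
yields = {
    "A": [23.6, 18.2, 32.4, 20.8],
    "B": [26.0, 35.3, 30.5, 31.4],
    "C": [18.8, 16.7, 23.0, 28.3],
    "D": [29.0, 37.2, 32.6, 31.4],
    "E": [17.7, 16.5, 12.8, 20.4],
}


def one_way_anova(groups):
    """Compute the one-way ANOVA table entries via the correction factor.

    Hypothetical helper for illustration; returns a dict of the quantities.
    """
    values = [x for g in groups.values() for x in g]
    n, k = len(values), len(groups)

    cf = sum(values) ** 2 / n                        # correction factor
    total_ss = sum(x * x for x in values) - cf
    between_ss = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cf
    residual_ss = total_ss - between_ss              # by subtraction

    df_between, df_residual = k - 1, n - k
    ms_between = between_ss / df_between
    ms_residual = residual_ss / df_residual

    return {
        "cf": cf,
        "total_ss": total_ss,
        "between_ss": between_ss,
        "residual_ss": residual_ss,
        "df_between": df_between,
        "df_residual": df_residual,
        "ms_between": ms_between,
        "ms_residual": ms_residual,
        "f_ratio": ms_between / ms_residual,
    }


result = one_way_anova(yields)

F_CRITICAL = 3.06  # upper 5% point of F(4, 15), taken from the mark scheme
significant = result["f_ratio"] > F_CRITICAL

for name, value in result.items():
    print(f"{name:12s} {value:.3f}")
print("significant at 5%:", significant)
```

The residual sum of squares is obtained by subtraction, as in the mark scheme, rather than from within-group deviations; the two routes agree up to rounding.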
## Part (v)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Independent $N(0, \sigma^2\text{[constant]})$ | 1, 1, 1 | |
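
One common way to write the model behind these assumptions is given below. The symbols $x_{ij}$, $\mu$, $\alpha_i$ and $e_{ij}$ are notation assumed here for illustration and do not appear in the mark scheme:

$$x_{ij} = \mu + \alpha_i + e_{ij}, \qquad e_{ij} \sim N(0, \sigma^2) \text{ independently}, \qquad i = 1, \dots, 5, \; j = 1, \dots, 4$$

Here $x_{ij}$ is the $j$th yield under fertiliser $i$, $\mu$ is an overall mean, $\alpha_i$ is the effect of fertiliser $i$, and $e_{ij}$ is the experimental error. The three marks correspond to the three properties of $e_{ij}$: independence, Normality with mean zero, and a common variance $\sigma^2$ across all fertilisers.
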
4 An agricultural company conducts a trial of five fertilisers (A, B, C, D, E) in an experimental field at its research station. The fertilisers are applied to plots of the field according to a completely randomised design. The yields of the crop from the plots, measured in a standard unit, are analysed by the one-way analysis of variance, from which it appears that there are no real differences among the effects of the fertilisers.
A statistician notes that the residual mean square in the analysis of variance is considerably larger than had been anticipated from knowledge of the general behaviour of the crop, and therefore suspects that there is some inadequacy in the design of the trial.\\
(i) Explain briefly why the statistician should be suspicious of the design.\\
(ii) Explain briefly why an inflated residual leads to difficulty in interpreting the results of the analysis of variance, in particular that the null hypothesis is more likely to be accepted erroneously.
Further investigation indicates that the soil at the west side of the experimental field is naturally more fertile than that at the east side, with a consistent 'fertility gradient' from west to east.\\
(iii) What experimental design can accommodate this feature? Provide a simple diagram of the experimental field indicating a suitable layout.
The company decides to conduct a new trial in its glasshouse, where experimental conditions can be controlled so that a completely randomised design is appropriate. The yields are as follows.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | }
\hline
Fertiliser A & Fertiliser B & Fertiliser C & Fertiliser D & Fertiliser E \\
\hline
23.6 & 26.0 & 18.8 & 29.0 & 17.7 \\
18.2 & 35.3 & 16.7 & 37.2 & 16.5 \\
32.4 & 30.5 & 23.0 & 32.6 & 12.8 \\
20.8 & 31.4 & 28.3 & 31.4 & 20.4 \\
\hline
\end{tabular}
\end{center}
[The sum of these data items is 502.6 and the sum of their squares is 13610.22.]\\
(iv) Construct the usual one-way analysis of variance table. Carry out the appropriate test, using a $5 \%$ significance level. Report briefly on your conclusions.\\
(v) State the assumptions about the distribution of the experimental error that underlie your analysis in part (iv).
\hfill \mbox{\textit{OCR MEI S4 2007 Q4 [24]}}