OCR MEI S4 2007 June — Question 4 24 marks

Exam Board: OCR MEI
Module: S4 (Statistics 4)
Year: 2007
Session: June
Marks: 24
Topic: One-way analysis of variance (ANOVA)
Type: Completely randomised design, five fertilisers with four yields each
Difficulty: Standard +0.8. This is a multi-part Further Maths Statistics question requiring understanding of ANOVA concepts (interpretation of the residual mean square, Type II errors), experimental design principles (a randomised block design for a fertility gradient), and execution of a one-way ANOVA. The calculations are standard, but the conceptual parts require deeper statistical reasoning than typical A-level questions, which places it moderately above average difficulty.
Spec: 2.01c Sampling techniques: simple random, opportunity, etc.; 2.01d Select/critique sampling: in context; 5.06b Fit prescribed distribution: chi-squared test

4 An agricultural company conducts a trial of five fertilisers (A, B, C, D, E) in an experimental field at its research station. The fertilisers are applied to plots of the field according to a completely randomised design. The yields of the crop from the plots, measured in a standard unit, are analysed by the one-way analysis of variance, from which it appears that there are no real differences among the effects of the fertilisers.

A statistician notes that the residual mean square in the analysis of variance is considerably larger than had been anticipated from knowledge of the general behaviour of the crop, and therefore suspects that there is some inadequacy in the design of the trial.

  (i) Explain briefly why the statistician should be suspicious of the design.
  (ii) Explain briefly why an inflated residual leads to difficulty in interpreting the results of the analysis of variance, in particular that the null hypothesis is more likely to be accepted erroneously.

Further investigation indicates that the soil at the west side of the experimental field is naturally more fertile than that at the east side, with a consistent 'fertility gradient' from west to east.

  (iii) What experimental design can accommodate this feature? Provide a simple diagram of the experimental field indicating a suitable layout.

The company decides to conduct a new trial in its glasshouse, where experimental conditions can be controlled so that a completely randomised design is appropriate. The yields are as follows.

| Fertiliser A | Fertiliser B | Fertiliser C | Fertiliser D | Fertiliser E |
|--------------|--------------|--------------|--------------|--------------|
| 23.6 | 26.0 | 18.8 | 29.0 | 17.7 |
| 18.2 | 35.3 | 16.7 | 37.2 | 16.5 |
| 32.4 | 30.5 | 23.0 | 32.6 | 12.8 |
| 20.8 | 31.4 | 28.3 | 31.4 | 20.4 |

[The sum of these data items is 502.6 and the sum of their squares is 13610.22.]

  (iv) Construct the usual one-way analysis of variance table. Carry out the appropriate test, using a \(5\%\) significance level. Report briefly on your conclusions.
  (v) State the assumptions about the distribution of the experimental error that underlie your analysis in part (iv).

# Question 4:

## Part (i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| There might be some consistent source of plot-to-plot variation that has inflated the residual and which the design has failed to cater for | E2 | E1 – Some reference to extra variation; E1 – Some indication of a reason |

## Part (ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Variation between the fertilisers should be compared with experimental error | E1 | |
| If the residual is inflated so that it measures more than experimental error, the comparison of between-fertilisers variation with it is less likely to reach significance | E2 | (E1, E1) |
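
The argument can be put in symbols. As a sketch in notation that is not part of the mark scheme, write $\sigma^2$ for the true experimental error variance and $\delta > 0$ for the extra plot-to-plot variation that the design fails to remove:

$$
F = \frac{\text{MS}_{\text{between}}}{\text{MS}_{\text{residual}}}, \qquad \text{MS}_{\text{residual}} \text{ estimates } \sigma^2 + \delta \text{ rather than } \sigma^2 .
$$

For a given between-fertilisers mean square, the larger denominator gives a smaller $F$, so real differences are less likely to exceed the critical value and the null hypothesis is more likely to be accepted in error (a Type II error).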

## Part (iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Randomised blocks | 1 | |
| Blocks (strips) clearly correctly oriented w.r.t. fertility gradient | E1 | |
| All fertilisers appear in a block | E1 | |
| Different (random) arrangements in the blocks | E1 | |
| SPECIAL CASE: Latin Square, maximum 2 marks out of 4 | (1, E1) | |
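
A layout of the kind these marks describe can be sketched in Python. This is only an illustration, not part of the mark scheme: the number of blocks (four), the seed, and the names `randomised_block_layout` and `format_field` are assumptions. Each block is drawn as a north-south strip, so all plots in a block sit at roughly the same point on the west-to-east fertility gradient, and each fertiliser appears once per block in an independently shuffled order.

```python
import random

FERTILISERS = ["A", "B", "C", "D", "E"]


def randomised_block_layout(n_blocks=4, seed=None):
    """Return one independently shuffled copy of the fertilisers per block.

    n_blocks and seed are illustrative assumptions, not values from the question.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = FERTILISERS[:]  # every fertiliser appears exactly once per block
        rng.shuffle(block)      # separate random arrangement within each block
        layout.append(block)
    return layout


def format_field(layout):
    """Draw each block as a column, running from west (left) to east (right)."""
    header = "  ".join(f"B{i + 1}" for i in range(len(layout)))
    rows = [
        "   ".join(block[r] for block in layout)
        for r in range(len(FERTILISERS))
    ]
    return "\n".join(["West -> East", header] + rows)


if __name__ == "__main__":
    print(format_field(randomised_block_layout(n_blocks=4, seed=1)))
```

Because the blocks are strips across the gradient, differences in fertility fall between blocks and can be separated from the residual in the analysis.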

## Part (iv)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Totals: 95.0, 123.2, 86.8, 130.2, 67.4 (each from sample of size 4), Grand total 502.6 | | |
| CF $= \dfrac{502.6^2}{20} = 12630.338$ | | |
| Total SS $= 13610.22 - \text{CF} = 979.882$ | | |
| Between fertilisers SS $= \dfrac{95.0^2}{4} + \cdots + \dfrac{67.4^2}{4} - \text{CF}$ | M1 | |
| $= 13308.07 - \text{CF} = 677.732$ | M1 | For correct method for any two |
| Residual SS (by subtraction) $= 979.882 - 677.732 = 302.15$ | A1 | If each calculated SS is correct |
| ANOVA table: Between fertilisers: SS = 677.732, df = 4, MS = 169.433, MS ratio = 8.41 | M1, M1, 1, A1, 1 | |
| Residual: SS = 302.15, df = 15, MS = 20.143 | | |
| Total: SS = 979.882, df = 19 | | |
| Refer to $F_{4,15}$ | 1 | No FT if wrong |
| Upper 5% point is 3.06 | 1 | No FT if wrong |
| Significant | 1 | |
| Seems effects of fertilisers are not all the same | 1 | |
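
The arithmetic above can be checked with a short Python sketch. It follows the same correction-factor method as the scheme and uses only the standard library. The function name `one_way_anova` and the dictionary layout are assumptions made for illustration, and the critical value 3.06 is copied from the scheme rather than computed.

```python
# Yields from the glasshouse trial, as given in the question.
YIELDS = {
    "A": [23.6, 18.2, 32.4, 20.8],
    "B": [26.0, 35.3, 30.5, 31.4],
    "C": [18.8, 16.7, 23.0, 28.3],
    "D": [29.0, 37.2, 32.6, 31.4],
    "E": [17.7, 16.5, 12.8, 20.4],
}

# Upper 5% point of F(4, 15), taken from the mark scheme (not computed here).
F_CRITICAL_5PC = 3.06


def one_way_anova(groups):
    """One-way ANOVA using the correction-factor method."""
    values = [x for group in groups.values() for x in group]
    n, k = len(values), len(groups)
    cf = sum(values) ** 2 / n                       # correction factor
    total_ss = sum(x * x for x in values) - cf
    between_ss = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cf
    residual_ss = total_ss - between_ss             # by subtraction
    df_between, df_residual = k - 1, n - k
    ms_between = between_ss / df_between
    ms_residual = residual_ss / df_residual
    return {
        "cf": cf,
        "total_ss": total_ss,
        "between_ss": between_ss,
        "residual_ss": residual_ss,
        "df_between": df_between,
        "df_residual": df_residual,
        "ms_between": ms_between,
        "ms_residual": ms_residual,
        "f_ratio": ms_between / ms_residual,
    }


if __name__ == "__main__":
    result = one_way_anova(YIELDS)
    for name, value in result.items():
        print(f"{name:12s} {value:.3f}")
    print("significant at 5%:", result["f_ratio"] > F_CRITICAL_5PC)
```

Run on the data above, the sketch should reproduce the sums of squares, mean squares and MS ratio in the table, and the ratio exceeds 3.06, in line with the scheme's conclusion.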

## Part (v)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Independent, $N(0, \sigma^2)$ with constant variance $\sigma^2$ | 1, 1, 1 | |
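
One way to write the model behind these assumptions, in notation assumed here rather than taken from the mark scheme, with $x_{ij}$ the $j$th yield for fertiliser $i$, $\mu$ the overall mean and $\alpha_i$ the effect of fertiliser $i$:

$$
x_{ij} = \mu + \alpha_i + e_{ij}, \qquad e_{ij} \sim N(0, \sigma^2) \text{ independently}, \quad i = 1, \dots, 5, \; j = 1, \dots, 4 .
$$

Under this model the test in part (iv) has null hypothesis $\alpha_1 = \dots = \alpha_5$.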