Question 4 - A-Level Maths

OCR MEI S4 2016 June — Question 4

Exam Board	OCR MEI
Module	S4 (Statistics 4)
Year	2016
Session	June

4 The cardiovascular unit of a hospital is studying the effect on patients' heart rates of three different light exercises, A, B and C. Patients are given an exercise to do and the increases in their pulse rates are measured after 5 minutes. There are 16 patients in the study: 5 are chosen randomly and allocated to exercise A, 6 to exercise B, and 5 to exercise C. The data obtained are as follows.

A	B	C
63	69	56
41	72	44
42	52	65
51	64	48
47	54	53
	57	

	A	B	C
Sum of data	244	368	266
Sum of squares	12224	22910	14410
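
As a rough check on the summary table, the group means and sample variances can be recovered from the sums and sums of squares alone. The sketch below is illustrative only and is not part of the question paper; the helper name `mean_and_variance` is made up for this example.

```python
# (n, sum of data, sum of squares) for each exercise, copied from the summary table.
summaries = {
    "A": (5, 244, 12224),
    "B": (6, 368, 22910),
    "C": (5, 266, 14410),
}


def mean_and_variance(n, total, total_sq):
    """Return the sample mean and unbiased sample variance from n, sum x and sum x^2."""
    mean = total / n
    corrected_ss = total_sq - total ** 2 / n  # equals the sum of (x - mean)^2
    return mean, corrected_ss / (n - 1)


group_stats = {name: mean_and_variance(*s) for name, s in summaries.items()}

for name, (mean, variance) in group_stats.items():
    print(f"{name}: mean = {mean:.2f}, sample variance = {variance:.2f}")
```

The three sample variances come out fairly close to one another (roughly 65 to 79), which is at least not in conflict with a common-variance assumption.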

(i) State the usual one-way analysis of variance model. Explain what the terms in the model mean in this context.
State the distributional assumptions required for the standard test.
Carry out the test at the \(5\%\) level of significance and report your conclusions.
(ii) Someone unfamiliar with analysis of variance analysed these data. They used three \(t\) tests to compare A with B, B with C, and C with A. The test comparing A with B was significant at the \(5\%\) level; the other two tests were not significant at the \(5\%\) level. Comment on this analysis, explaining whether it is better than, worse than or equivalent to the analysis carried out in part (i). Your comments should include consideration of the independence of the \(t\) tests and the overall level of significance of the procedure.
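
The outcomes quoted for the three \(t\) tests can be reproduced from the summary statistics. The sketch below assumes two-tailed, pooled-variance two-sample \(t\) tests (the question does not say which form was used) and takes the 5% critical values 2.262 (9 df) and 2.306 (8 df) from standard tables; the function name `pooled_t` is hypothetical.

```python
from math import sqrt

# (n, sum of data, sum of squares) for each exercise, from the summary table.
summaries = {"A": (5, 244, 12224), "B": (6, 368, 22910), "C": (5, 266, 14410)}

# Two-tailed 5% critical values of Student's t from standard tables (an assumption
# of this sketch; the question paper does not quote them).
T_CRITICAL = {8: 2.306, 9: 2.262}


def pooled_t(first, second):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n1, s1, q1 = summaries[first]
    n2, s2, q2 = summaries[second]
    df = n1 + n2 - 2
    # Pool the corrected sums of squares of the two groups being compared.
    pooled_var = ((q1 - s1 ** 2 / n1) + (q2 - s2 ** 2 / n2)) / df
    t = (s1 / n1 - s2 / n2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


results = {}
for pair in [("A", "B"), ("B", "C"), ("C", "A")]:
    t, df = pooled_t(*pair)
    results[pair] = (t, abs(t) > T_CRITICAL[df])
    print(f"{pair[0]} vs {pair[1]}: t = {t:.3f}, df = {df}, significant: {results[pair][1]}")
```

Under these assumptions only the A versus B comparison exceeds its critical value, in line with the outcomes described in the question.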


Part (i)

Answer	Marks	Guidance
\(Y_{ij}=\mu+\alpha_i+\varepsilon_{ij}\)	B1
where \(Y_{ij}\) is the \(j\)th value in the \(i\)th group	B1
\(\mu\) is the global mean in the underlying population	B1
\(\alpha_i\) is the 'treatment effect' in the \(i\)th group	B1	Or \(\mu_i-\mu\)
\(\varepsilon_{ij}\) is a random error term	B1	Accept "residual"
In this context, \(\mu\) is the overall mean increase in pulse rate across the exercise regimes, and \(\alpha_i\) is the amount by which the mean increase for regime \(i\) differs from \(\mu\)	E1	Context explained at least once; 'Groups' are exercise regimes
\(\varepsilon_{ij}\) iid \(N(0,\sigma^2)\)	B1	Distributional assumption
\(H_0\): the three exercise regimes give the same (population) mean increase in pulse rate	B1	Or: \(\alpha_1=\alpha_2=\alpha_3(=0)\)
\(H_1\): the three exercise regimes do not all give the same (population) mean increase in pulse rate		Or: not all \(\alpha_i\) the same
\(\sum\frac{T_i^2}{n_i}-\frac{T^2}{n}=\frac{244^2}{5}+\frac{368^2}{6}+\frac{266^2}{5}-\frac{878^2}{16}=448.8167\)	M1A1
\(\sum\sum y_{ij}^2-\frac{T^2}{n}=49544-\frac{878^2}{16}=1363.75\)	M1A1
ANOVA table, between groups: SS = 448.8167, df = 2, MS = 224.41, F ratio = 3.1885, F critical = 3.8056	A1Ft, B1	A1Ft for within-groups sum of squares (Ft their Total SS - BGSS); B1 for all three df
ANOVA table, within groups: SS = 914.9333, df = 13, MS = 70.379	A1Ft	For F ratio (Ft their sums of squares)
ANOVA table, total: SS = 1363.75, df = 15	B1	For F critical (3.81 from tables)
Result not significant	M1
Insufficient evidence to suppose that the exercise regimes have different effects on pulse rate on average	A1
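
The arithmetic in the mark scheme above can be checked from the summary statistics alone. The following is a minimal sketch, not an official solution: variable names are chosen for this example, and the critical value is simply the one quoted in the mark scheme rather than computed.

```python
# (n, sum of data, sum of squares) for each exercise, from the summary table.
summaries = {"A": (5, 244, 12224), "B": (6, 368, 22910), "C": (5, 266, 14410)}

F_CRITICAL = 3.8056  # upper 5% point of F(2, 13), as quoted in the mark scheme

n_total = sum(n for n, _, _ in summaries.values())
grand_total = sum(t for _, t, _ in summaries.values())
correction = grand_total ** 2 / n_total  # T^2 / n

# Between-groups and total sums of squares; within-groups follows by subtraction.
ss_between = sum(t ** 2 / n for n, t, _ in summaries.values()) - correction
ss_total = sum(q for _, _, q in summaries.values()) - correction
ss_within = ss_total - ss_between

df_between = len(summaries) - 1
df_within = n_total - len(summaries)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"Between: SS = {ss_between:.4f}, df = {df_between}, MS = {ms_between:.2f}")
print(f"Within:  SS = {ss_within:.4f}, df = {df_within}, MS = {ms_within:.3f}")
print(f"Total:   SS = {ss_total:.2f}, df = {df_between + df_within}")
print(f"F = {f_ratio:.4f}, significant at 5%: {f_ratio > F_CRITICAL}")
```

Because the F ratio falls below the critical value, the sketch agrees with the mark scheme's conclusion that the result is not significant.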

Part (ii)

Answer	Marks	Guidance
The analysis using three tests is not equivalent to ANOVA, and the multiple comparisons procedure is worse than ANOVA	B1	Other points could be made; e.g. Multiple comparisons are likely to generate more type I errors than the nominal significance level would suggest
The three tests are not independent	B1
The significance level of the whole procedure is therefore impossible to assess	B1	However, multiple comparisons are useful post hoc to identify where the largest differences have occurred
A comparison with the different result obtained in (i)	B1
and why this may be so	B1
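
The point about the overall significance level can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not part of the mark scheme: it draws all three groups from the same N(0, 1) distribution (so the null hypothesis is true), applies three two-tailed pooled-variance \(t\) tests with critical values 2.262 (9 df) and 2.306 (8 df) taken from standard tables, and counts how often at least one test is wrongly significant. The names `pooled_t` and `familywise_error_rate` are hypothetical.

```python
import random
from math import sqrt

GROUP_SIZES = {"A": 5, "B": 6, "C": 5}  # sample sizes used in the study
PAIRS = [("A", "B"), ("B", "C"), ("C", "A")]
T_CRITICAL = {8: 2.306, 9: 2.262}  # two-tailed 5% points, standard tables (assumed)


def pooled_t(x, y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    df = n1 + n2 - 2
    return (m1 - m2) / sqrt(ss / df * (1 / n1 + 1 / n2)), df


def familywise_error_rate(n_sims=20000, seed=1):
    """Estimate P(at least one of the three tests is significant) when H0 is true."""
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(n_sims):
        # All groups share one distribution, so any significant result is a type I error.
        samples = {g: [rng.gauss(0, 1) for _ in range(n)] for g, n in GROUP_SIZES.items()}
        for a, b in PAIRS:
            t, df = pooled_t(samples[a], samples[b])
            if abs(t) > T_CRITICAL[df]:
                false_alarms += 1
                break  # count each simulated study at most once
    return false_alarms / n_sims


rate = familywise_error_rate()
independent_case = 1 - 0.95 ** 3  # what the rate would be if the tests were independent
print(f"Simulated overall type I error rate: {rate:.3f}")
print(f"Reference values: single test 0.05, independent tests {independent_case:.4f}, upper bound 0.15")
```

The simulated rate should lie well above the nominal 5% of a single test and below the bound of 15%; it need not match the independent-tests figure, because the three comparisons share data.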


