OCR MEI S2 2007 June — Question 2 19 marks

Exam Board: OCR MEI
Module: S2 (Statistics 2)
Year: 2007
Session: June
Marks: 19
Topic: Hypothesis test of Spearman's rank correlation coefficient
Type: Hypothesis test for positive correlation
Difficulty: Standard +0.3. This is a standard textbook exercise on Spearman's rank correlation with routine calculations: ranking data, applying the formula, and performing a one-tailed hypothesis test using tables. Part (iii) uses given summary statistics in the PMCC formula (plug-and-chug), and part (iv) requires recall of when PMCC is preferred. All steps are algorithmic with no novel insight required, making it slightly easier than average.
Spec: 5.08a Pearson correlation: calculate PMCC; 5.08d Hypothesis test: Pearson correlation; 5.08e Spearman rank correlation; 5.08f Hypothesis test: Spearman rank

2 A medical student is trying to estimate the birth weight of babies using pre-natal scan images. The actual weights, \(x \mathrm {~kg}\), and the estimated weights, \(y \mathrm {~kg}\), of ten randomly selected babies are given in the table below.
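
| \(x\) | 2.61 | 2.73 | 2.87 | 2.96 | 3.05 | 3.14 | 3.17 | 3.24 | 3.76 | 4.10 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 3.2 | 2.6 | 3.5 | 3.1 | 2.8 | 2.7 | 3.4 | 3.3 | 4.4 | 4.1 |
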
  1. Calculate the value of Spearman's rank correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) level to determine whether there is positive association between the student's estimates and the actual birth weights of babies in the underlying population.
  3. Calculate the value of the product moment correlation coefficient of the sample. You may use the following summary statistics in your calculations: $$\Sigma x = 31.63 , \quad \Sigma y = 33.1 , \quad \Sigma x ^ { 2 } = 101.92 , \quad \Sigma y ^ { 2 } = 112.61 , \quad \Sigma x y = 106.51 .$$
  4. Explain why, if the underlying population has a bivariate Normal distribution, it would be preferable to carry out a hypothesis test based on the product moment correlation coefficient. Comment briefly on the significance of the product moment correlation coefficient in relation to that of Spearman's rank correlation coefficient.

# Question 2:

## Part (i)

| Answer/Working | Marks | Guidance |
|---|---|---|
| Both $x$ and $y$ values ranked | M1 | Allow all ranks reversed |
| $d^2$ values calculated correctly | M1 | For $d^2$ |
| $\Sigma d^2 = 68$ | A1 | |
| $r_s = 1 - \frac{6\Sigma d^2}{n(n^2-1)} = 1 - \frac{6 \times 68}{10 \times 99}$ | M1 | Method for $r_s$ |
| $= 0.588$ (to 3 s.f.) [allow 0.59 to 2 s.f.] | A1 | f.t. for $\lvert r_s \rvert < 1$; NB no ranking attempted scores zero |
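
The ranking and $\Sigma d^2$ steps above can be checked with a short script. This is a minimal sketch rather than part of the mark scheme: the helper name `ranks` is an illustrative choice, and it assumes no tied values, which holds for this data set.

```python
# Data from the question: actual weights x and estimated weights y (kg).
x = [2.61, 2.73, 2.87, 2.96, 3.05, 3.14, 3.17, 3.24, 3.76, 4.10]
y = [3.2, 2.6, 3.5, 3.1, 2.8, 2.7, 3.4, 3.3, 4.4, 4.1]


def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


rx, ry = ranks(x), ranks(y)

# Squared rank differences and their sum.
d_squared = [(a - b) ** 2 for a, b in zip(rx, ry)]
sum_d2 = sum(d_squared)

# Spearman's formula as used in the mark scheme.
n = len(x)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rx)             # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(ry)             # [5, 1, 8, 4, 3, 2, 7, 6, 10, 9]
print(sum_d2)         # 68
print(round(r_s, 3))  # 0.588
```

Reversing both sets of ranks leaves every difference $d$ unchanged in size, which is why the guidance allows reversed ranks.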

## Part (ii)

| Answer/Working | Marks | Guidance |
|---|---|---|
| $H_0$: no association between $x$ and $y$ | B1 | In context |
| $H_1$: positive association between $x$ and $y$ | B1 | In context; NB $H_0$, $H_1$ not in terms of $\rho$ |
| Critical value for a one-tailed test at the 5% level with $n = 10$ is 0.5636 | B1 | For $\pm 0.5636$ |
| Since $0.588 > 0.5636$, sufficient evidence to reject $H_0$ | M1 | Sensible comparison with c.v., provided $\lvert r_s \rvert < 1$ |
| Positive association between true weight $x$ and estimated weight $y$ | A1 | Conclusion in words & in context, f.t. their $r_s$ and sensible cv |
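
The decision step in the test can be sketched as a single comparison. The critical value is copied from the mark scheme (taken from tables) rather than computed, and the variable names are illustrative.

```python
# Test statistic from part (i): sum of d^2 is 68 with n = 10.
n = 10
sum_d2 = 68
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Tabulated one-tailed 5% critical value for n = 10, as quoted in the mark scheme.
critical_value = 0.5636

# One-tailed test for positive association: reject H0 if r_s exceeds the critical value.
reject_h0 = r_s > critical_value

if reject_h0:
    print("Reject H0: evidence of positive association")
else:
    print("Do not reject H0: insufficient evidence of positive association")
# prints "Reject H0: evidence of positive association"
```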

## Part (iii)

| Answer/Working | Marks | Guidance |
|---|---|---|
| $S_{xy} = \Sigma xy - \frac{1}{n}\Sigma x \Sigma y = 106.51 - \frac{1}{10} \times 31.63 \times 33.1 = 1.8147$ | M1 | Method for $S_{xy}$ |
| $S_{xx} = \Sigma x^2 - \frac{1}{n}(\Sigma x)^2 = 101.92 - \frac{1}{10} \times 31.63^2 = 1.8743$ | M1 | Method for at least one of $S_{xx}$ or $S_{yy}$ |
| $S_{yy} = \Sigma y^2 - \frac{1}{n}(\Sigma y)^2 = 112.61 - \frac{1}{10} \times 33.1^2 = 3.049$ | A1 | At least one of $S_{xy}$, $S_{xx}$, $S_{yy}$ correct |
| $r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{1.8147}{\sqrt{1.8743 \times 3.049}} = 0.759$ | M1 A1 | M1 for structure of $r$; A1 (awrt 0.76) |
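
The arithmetic in part (iii) can be reproduced from the summary statistics given in the question. This is a minimal sketch with illustrative variable names.

```python
from math import sqrt

# Summary statistics given in the question.
n = 10
sum_x, sum_y = 31.63, 33.1
sum_x2, sum_y2, sum_xy = 101.92, 112.61, 106.51

# Corrected sums of squares and products.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n

# Product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

print(round(s_xy, 4))  # 1.8147
print(round(s_xx, 4))  # 1.8743
print(round(s_yy, 4))  # 3.049
print(round(r, 3))     # 0.759
```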

## Part (iv)

| Answer/Working | Marks | Guidance |
|---|---|---|
| PMCC uses actual values, not just ranks | E1 | Has values not just ranks |
| Contains more information than Spearman's, more discriminatory | E1 | Contains more information; allow alternatives |
| Critical value for the PMCC test (one-tailed, 5% level, $n = 10$) is $0.5494$ | B1 | For a cv |
| PMCC is very highly significant whereas Spearman's is only just significant | E1 | Dependent mark |
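
One rough way to see the contrast in the final row is to compare how far each statistic lies above its own 5% critical value. The sketch below uses the values quoted in the mark scheme. The "margin" is only an informal illustration, not a formal measure of significance.

```python
# Values quoted in the mark scheme (both tests one-tailed, 5% level, n = 10).
r_s, cv_spearman = 0.588, 0.5636
r, cv_pmcc = 0.759, 0.5494

# Distance of each statistic above its critical value.
margin_spearman = r_s - cv_spearman
margin_pmcc = r - cv_pmcc

print(round(margin_spearman, 4))  # 0.0244
print(round(margin_pmcc, 4))      # 0.2096
```

Spearman's coefficient only just clears its critical value, while the PMCC clears its critical value by a wide margin.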
