| Exam Board | OCR MEI |
|---|---|
| Module | S2 (Statistics 2) |
| Year | 2007 |
| Session | June |
| Marks | 19 |
| Topic | Hypothesis test of Spearman’s rank correlation coefficient |
| Type | Hypothesis test for positive correlation |
| Difficulty | Standard +0.3. This is a standard textbook exercise on Spearman's rank correlation with routine calculations: ranking data, applying the formula, and performing a one-tailed hypothesis test using tables. Part (iii) substitutes the given summary statistics directly into the PMCC formula, and part (iv) requires recall of when the PMCC is preferred. All steps are algorithmic with no novel insight required, making it slightly easier than average. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08d Hypothesis test: Pearson correlation; 5.08e Spearman rank correlation; 5.08f Hypothesis test: Spearman rank |
# Question 2:
## Part (i)
| Answer/Working | Marks | Guidance |
|---|---|---|
| Both $x$ and $y$ ranked | M1 | Allow all ranks reversed |
| $d^2$ values calculated correctly | M1 | For $d^2$ |
| $\Sigma d^2 = 68$ | A1 | |
| $r_s = 1 - \frac{6\Sigma d^2}{n(n^2-1)} = 1 - \frac{6 \times 68}{10 \times 99}$ | M1 | Method for $r_s$ |
| $= 0.588$ (to 3 s.f.) [allow 0.59 to 2 s.f.] | A1 | f.t. for $\lvert r_s \rvert < 1$; NB no ranking scores zero |
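
The ranking and $\Sigma d^2$ steps above can be checked numerically. The following is a minimal Python sketch, not part of the mark scheme; the helper names `ranks` and `spearman` are illustrative, and the ranking helper assumes there are no tied values (true for this data set).

```python
# Data from the question: actual weights x (kg) and estimated weights y (kg).
x = [2.61, 2.73, 2.87, 2.96, 3.05, 3.14, 3.17, 3.24, 3.76, 4.10]
y = [3.2, 2.6, 3.5, 3.1, 2.8, 2.7, 3.4, 3.3, 4.4, 4.1]


def ranks(values):
    """Rank each value from 1 (smallest) upward; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Return (sum of d^2, r_s) using r_s = 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n * n - 1))


sum_d2, r_s = spearman(x, y)
print(sum_d2, round(r_s, 3))  # 68 0.588
```

Reversing both sets of ranks leaves every $d^2$ unchanged, which is why the guidance allows it.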
## Part (ii)
| Answer/Working | Marks | Guidance |
|---|---|---|
| $H_0$: no association between $x$ and $y$ | B1 | In context |
| $H_1$: positive association between $x$ and $y$ | B1 | In context; NB $H_0$, $H_1$ not in terms of $\rho$ |
| Critical value at 5% level is 0.5636 | B1 | For $\pm 0.5636$ |
| Since $0.588 > 0.5636$, sufficient evidence to reject $H_0$ | M1 | Sensible comparison with c.v., provided $\lvert r_s \rvert < 1$ |
| Positive association between true weight $x$ and estimated weight $y$ | A1 | Conclusion in words & in context, f.t. their $r_s$ and sensible cv |
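
The decision rule in this part is a single comparison. A minimal Python sketch, assuming the tabulated critical value 0.5636 quoted above (one-tailed, 5% level, $n = 10$); the function name `reject_h0_one_tailed` is illustrative only.

```python
def reject_h0_one_tailed(statistic, critical_value):
    """One-tailed test for positive association: reject H0 if the
    sample statistic exceeds the tabulated critical value."""
    return statistic > critical_value


# r_s from part (i): sum of d^2 = 68 with n = 10.
r_s = 1 - 6 * 68 / (10 * (10**2 - 1))

# Critical value taken from the mark scheme, not computed here.
CRITICAL_VALUE_5PCT_N10 = 0.5636

reject = reject_h0_one_tailed(r_s, CRITICAL_VALUE_5PCT_N10)
print(reject)  # True
```

The test is one-tailed because $H_1$ specifies positive association, so only large positive values of $r_s$ count as evidence against $H_0$.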
## Part (iii)
| Answer/Working | Marks | Guidance |
|---|---|---|
| $S_{xy} = \Sigma xy - \frac{1}{n}\Sigma x \Sigma y = 106.51 - \frac{1}{10} \times 31.63 \times 33.1 = 1.8147$ | M1 | Method for $S_{xy}$ |
| $S_{xx} = \Sigma x^2 - \frac{1}{n}(\Sigma x)^2 = 101.92 - \frac{1}{10} \times 31.63^2 = 1.8743$ | M1 | Method for at least one of $S_{xx}$ or $S_{yy}$ |
| $S_{yy} = \Sigma y^2 - \frac{1}{n}(\Sigma y)^2 = 112.61 - \frac{1}{10} \times 33.1^2 = 3.049$ | A1 | At least one of $S_{xy}$, $S_{xx}$, $S_{yy}$ correct |
| $r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{1.8147}{\sqrt{1.8743 \times 3.049}} = 0.759$ | M1 A1 | M1 for structure of $r$; A1 (awrt 0.76) |
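
The arithmetic above can be reproduced from the summary statistics given in the question. This is a minimal Python sketch for checking the working; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics as given in the question.
n = 10
sum_x, sum_y = 31.63, 33.1
sum_x2, sum_y2, sum_xy = 101.92, 112.61, 106.51

# Corrected sums of products and squares.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x**2 / n
s_yy = sum_y2 - sum_y**2 / n

# Product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

print(round(s_xy, 4), round(s_xx, 4), round(s_yy, 4), round(r, 3))
```

Carrying full precision through to the final division, rather than rounding $S_{xx}$ and $S_{yy}$ first, still gives $r = 0.759$ to 3 s.f.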
## Part (iv)
| Answer/Working | Marks | Guidance |
|---|---|---|
| PMCC uses actual values, not just ranks | E1 | Has values not just ranks |
| Contains more information than Spearman's, more discriminatory | E1 | Contains more information; allow alternatives |
| Critical value for the PMCC test at the 5% level is 0.5494 | B1 | For a cv |
| PMCC is very highly significant whereas Spearman's is only just significant | E1 | Dependent mark |
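
One rough way to see the contrast in the final comment is to compare how far each statistic lies above its own 5% critical value. The sketch below is only an informal illustration using the values quoted in the mark scheme; the margin is not a p-value, and the variable names are illustrative.

```python
# Sample statistics and 5% one-tailed critical values (n = 10),
# all taken from the mark scheme above.
r_s, cv_spearman = 0.588, 0.5636
r, cv_pmcc = 0.759, 0.5494

# Margin by which each statistic exceeds its critical value.
margin_spearman = r_s - cv_spearman
margin_pmcc = r - cv_pmcc

print(round(margin_spearman, 4), round(margin_pmcc, 4))  # 0.0244 0.2096
```

Spearman's coefficient clears its critical value only narrowly, while the PMCC clears its critical value by a wide margin, which matches the comment that the PMCC result is much more strongly significant.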
---
2 A medical student is trying to estimate the birth weight of babies using pre-natal scan images. The actual weights, $x$ kg, and the estimated weights, $y$ kg, of ten randomly selected babies are given in the table below.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | c | c | }
\hline
$x$ & 2.61 & 2.73 & 2.87 & 2.96 & 3.05 & 3.14 & 3.17 & 3.24 & 3.76 & 4.10 \\
\hline
$y$ & 3.2 & 2.6 & 3.5 & 3.1 & 2.8 & 2.7 & 3.4 & 3.3 & 4.4 & 4.1 \\
\hline
\end{tabular}
\end{center}
(i) Calculate the value of Spearman's rank correlation coefficient.\\
(ii) Carry out a hypothesis test at the $5 \%$ level to determine whether there is positive association between the student's estimates and the actual birth weights of babies in the underlying population.\\
(iii) Calculate the value of the product moment correlation coefficient of the sample. You may use the following summary statistics in your calculations:
$$\Sigma x = 31.63, \quad \Sigma y = 33.1, \quad \Sigma x^2 = 101.92, \quad \Sigma y^2 = 112.61, \quad \Sigma xy = 106.51.$$
(iv) Explain why, if the underlying population has a bivariate Normal distribution, it would be preferable to carry out a hypothesis test based on the product moment correlation coefficient.
Comment briefly on the significance of the product moment correlation coefficient in relation to that of Spearman's rank correlation coefficient.
\hfill \mbox{\textit{OCR MEI S2 2007 Q2 [19]}}