| Exam Board | Edexcel |
|---|---|
| Module | S3 (Statistics 3) |
| Marks | 16 |
| Topic | Hypothesis test of Spearman’s rank correlation coefficient |
| Type | Justify use of Spearman's |
| Difficulty | Standard +0.3. This is a standard S3 hypothesis testing question requiring routine application of Spearman's rank correlation. Parts (a) and (c) involve straightforward hypothesis tests with critical value comparison. Part (b) requires ranking data and calculating $r_s$ using the formula: mechanical but multi-step. Part (d) asks for standard textbook reasoning about when rank correlation is preferred. While lengthy (16 marks), it requires no novel insight, just careful execution of learned procedures, making it slightly easier than average. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08b Linear coding: effect on pmcc; 5.08c Pearson: measure of straight-line fit; 5.08d Hypothesis test: Pearson correlation; 5.08e Spearman rank correlation; 5.08f Hypothesis test: Spearman rank; 5.08g Compare: Pearson vs Spearman |
## Part (a)
| Answer | Marks |
|---|---|
| $H_0: \rho = 0$, $H_1: \rho > 0$ | B1 B1 |
| $\alpha = 0.01$, critical value = 0.7887 | B1 |
| Since 0.774 is not in the critical region, there is insufficient evidence of positive correlation. | M1 A1 |

**Total: 5 marks**
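
The decision in part (a) can be checked numerically. The Python sketch below recomputes the product moment correlation coefficient from the marks given in the question and compares it with the critical value 0.7887 quoted in the mark scheme. The function name `pmcc` and the variable names are illustrative assumptions, not part of the mark scheme.

```python
import math

# Marks copied from the question's table (gymnasts A to H).
technical = [8.5, 8.6, 9.5, 7.5, 6.8, 9.1, 9.4, 9.2]
artistic = [6.2, 7.5, 8.2, 6.7, 6.0, 7.2, 8.0, 9.1]


def pmcc(x, y):
    """Product moment correlation coefficient S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pmcc(technical, artistic)
CRITICAL_VALUE = 0.7887  # quoted in the mark scheme: n = 8, one-tailed, 1% level

# One-tailed test for positive correlation:
# reject H0 only if r exceeds the critical value.
reject_h0 = r > CRITICAL_VALUE
print(round(r, 3), reject_h0)  # 0.774 False
```

Because `reject_h0` is `False`, the sketch agrees with the mark scheme's conclusion that there is insufficient evidence of positive correlation.
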
## Part (b)
| Answer | Marks | Guidance |
|---|---|---|
| e.g. $\begin{array}{c\|cccccccc} R_T & 3 & 4 & 8 & 2 & 1 & 5 & 7 & 6 \\ R_A & 2 & 5 & 7 & 3 & 1 & 4 & 6 & 8 \end{array}$ | M1; A1 | Ranks; all correct |
| $\sum d^2 = 10$ | M1 A1 | |
| $r_s = 1 - \frac{6 \times 10}{8 \times 63} = 0.881$ | M1 A1 | |

**Total: 6 marks**
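
The ranking and the formula $r_s = 1 - \frac{6\sum d^2}{n(n^2-1)}$ used in part (b) can be reproduced with a short Python sketch. It ranks the lowest mark as 1, as in the mark scheme's example, and assumes there are no tied marks (true for these data). The helper name `ranks` is a hypothetical choice for illustration.

```python
# Marks copied from the question's table (gymnasts A to H).
technical = [8.5, 8.6, 9.5, 7.5, 6.8, 9.1, 9.4, 9.2]
artistic = [6.2, 7.5, 8.2, 6.7, 6.0, 7.2, 8.0, 9.1]


def ranks(values):
    """Rank 1 for the lowest mark; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


rank_t = ranks(technical)
rank_a = ranks(artistic)

# Sum of squared rank differences, then Spearman's formula.
sum_d2 = sum((t - a) ** 2 for t, a in zip(rank_t, rank_a))
n = len(technical)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rank_t)            # [3, 4, 8, 2, 1, 5, 7, 6]
print(rank_a)            # [2, 5, 7, 3, 1, 4, 6, 8]
print(sum_d2)            # 10
print(round(r_s, 3))     # 0.881
```

Ranking the highest mark as 1 instead would reverse both rank lists and leave $\sum d^2$, and hence $r_s$, unchanged.
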
## Part (c)
| Answer | Marks | Guidance |
|---|---|---|
| $H_0: \rho = 0$, $H_1: \rho > 0$ | B1 | both |
| $\alpha = 0.01$; critical value: 0.8333 | B1 | |
| Since 0.881 is in the critical region, there is evidence of positive correlation. | A1 $\checkmark$ | |

**Total: 3 marks**
## Part (d)
| Answer | Marks |
|---|---|
| Because it makes no distributional assumptions about the data, or because the order is more important than the mark. | B1 |
| Product moment correlation assumes bivariate normality, and these scores are very unlikely to be distributed this way. | B1 |

**Total: 2 marks**
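
The point that the order matters more than the mark can be illustrated with a small Python sketch. It applies a hypothetical order-preserving rescaling to the artistic marks (squaring, chosen purely for illustration and not part of the question) and compares how each coefficient responds. The function names are assumptions, and the rank formula again assumes no tied marks.

```python
import math

# Marks copied from the question's table (gymnasts A to H).
technical = [8.5, 8.6, 9.5, 7.5, 6.8, 9.1, 9.4, 9.2]
artistic = [6.2, 7.5, 8.2, 6.7, 6.0, 7.2, 8.0, 9.1]


def pmcc(x, y):
    """Product moment correlation coefficient S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def ranks(values):
    """Rank 1 for the lowest mark; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(x, y):
    """r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1)), valid when there are no ties."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical rescaling: squaring keeps the order of positive marks
# but changes the spacing between them.
artistic_rescaled = [a ** 2 for a in artistic]

# Ranks are unchanged, so Spearman's coefficient is unchanged.
print(spearman(technical, artistic) == spearman(technical, artistic_rescaled))  # True

# The product moment coefficient depends on the spacing, so it changes.
print(pmcc(technical, artistic), pmcc(technical, artistic_rescaled))
```

Only the ordering of the gymnasts feeds into $r_s$, so any judging scale that preserves the order gives the same rank correlation, whereas the product moment coefficient depends on the actual marks awarded.
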
For one of the activities at a gymnastics competition, 8 gymnasts were awarded marks out of 10 for each of artistic performance and technical ability. The results were as follows.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline
Gymnast & $A$ & $B$ & $C$ & $D$ & $E$ & $F$ & $G$ & $H$ \\
\hline
Technical ability & 8.5 & 8.6 & 9.5 & 7.5 & 6.8 & 9.1 & 9.4 & 9.2 \\
\hline
Artistic performance & 6.2 & 7.5 & 8.2 & 6.7 & 6.0 & 7.2 & 8.0 & 9.1 \\
\hline
\end{tabular}
The value of the product moment correlation coefficient for these data is 0.774.
\begin{enumerate}[label=(\alph*)]
\item Stating your hypotheses clearly and using a 1\% level of significance, interpret this value. [5]
\item Calculate the value of the rank correlation coefficient for these data. [6]
\item Stating your hypotheses clearly and using a 1\% level of significance, interpret this coefficient. [3]
\item Explain why the rank correlation coefficient might be the better one to use with these data. [2]
\end{enumerate}
\hfill \mbox{\textit{Edexcel S3 Q7 [16]}}