| Exam Board | OCR MEI |
|---|---|
| Module | Further Statistics Minor |
| Year | 2022 |
| Session | June |
| Marks | 14 |
| Topic | Hypothesis test of Spearman’s rank correlation coefficient |
| Type | Hypothesis test for association |
| Difficulty | Standard +0.3. This is a straightforward application of Spearman's rank correlation test with standard parts: identifying why Pearson's is inappropriate (the scatter diagram is not elliptical, so the distribution is probably not bivariate Normal), calculating ranks and $r_s$, performing a hypothesis test against critical values, and explaining sampling concepts. All parts are routine textbook exercises requiring no novel insight, though the multi-part structure and the calculation of Spearman's coefficient add some work compared to the most basic questions. |
| Spec | 2.01a Population and sample: terminology; 5.08e Spearman rank correlation; 5.08f Hypothesis test: Spearman rank; 5.08g Compare: Pearson vs Spearman |

Question 5:

| Question | Answer | Marks | AOs | Guidance |
|---|---|---|---|---|
| 5(a) | Because the scatter diagram does not appear to be elliptical (but more of a funnel shape) so the distribution is probably not bivariate Normal. | E1 | 3.5a | For not elliptical |
| | | E1 | 2.4 | For full answer (dependent on first mark). “data is not bivariate Normal” is E0; “Normal bivariate” is E0 |
| | | [2] | | |
| 5(b) | Rank A: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 | M1 | 1.1 | For ranking Age. Ranks may be reversed |
| | Rank P: 8, 12, 6, 10, 11, 7, 5, 9, 4, 3, 1, 2 | M1 | 1.1 | For ranking Protein consistent with ranking for age |
| | Spearman’s rank coefficient $= -0.78(32)$ $\left(= -\frac{112}{143}\right)$ | A1 | 1.1 | BC |
| | | [3] | | |
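
The value quoted for part (b) can be checked by hand or with a short script. The following is a minimal sketch, assuming the standard no-ties formula $r_s = 1 - \frac{6\sum d^2}{n(n^2-1)}$ (the mark scheme itself only says "BC", by calculator). The function name `spearman_from_ranks` is hypothetical, and the ranks are copied from the mark scheme.

```python
from fractions import Fraction

# Ranks copied from the mark scheme for part (b).
rank_age = list(range(1, 13))
rank_protein = [8, 12, 6, 10, 11, 7, 5, 9, 4, 3, 1, 2]


def spearman_from_ranks(rank_x, rank_y):
    """Return r_s = 1 - 6*sum(d^2) / (n(n^2 - 1)) as an exact fraction.

    Assumption: there are no tied ranks, which holds for this data set.
    """
    n = len(rank_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - Fraction(6 * sum_d_squared, n * (n * n - 1))


r_s = spearman_from_ranks(rank_age, rank_protein)
print(r_s, round(float(r_s), 4))  # -112/143 -0.7832
```

Here $\sum d^2 = 510$ and $n = 12$, so $r_s = 1 - \frac{3060}{1716} = -\frac{112}{143}$, which agrees with the mark scheme. Using `Fraction` keeps the result exact so that it can be compared with the fraction given there.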

| Question | Answer | Marks | AOs | Guidance |
|---|---|---|---|---|
| 5(c) | $H_0$: There is no association between age and protein (level) in the population | B1 | 3.3 | Need to see context and population in at least one of the hypotheses |
| | $H_1$: There is some association between age and protein (level) in the population | B1 | 1.2 | |
| | Critical value is $(\pm)0.5874$ | B1 | 3.4 | $n = 12$, 2-tailed 5% |
| | $\lvert -0.7832 \rvert > 0.5874$ (so reject $H_0$) | M1 | 1.1 | For comparison of their $r_s$ and sensible critical value provided $\lvert r_s \rvert < 1$ |
| | There is sufficient evidence to suggest that there is association between age and protein level (in the population) | A1FT | 2.2b | FT their $r_s$ and sensible critical value. Hypotheses need to have been stated the right way round. Conclusion must not be too assertive and refer to context |
| | | [5] | | |
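
The comparison step in part (c) amounts to a single inequality. The sketch below illustrates it, assuming the two-tailed critical value 0.5874 for $n = 12$ at the 5% level quoted in the mark scheme. The function name `reject_no_association` is hypothetical and is not part of any statistics library.

```python
def reject_no_association(r_s, critical_value):
    """Two-tailed decision rule: reject H0 when |r_s| exceeds the critical value."""
    if not -1 <= r_s <= 1:
        # The mark scheme only credits comparisons where |r_s| does not exceed 1.
        raise ValueError("r_s must lie between -1 and 1")
    return abs(r_s) > critical_value


r_s = -112 / 143          # value from part (b)
critical_value = 0.5874   # quoted in the mark scheme: n = 12, 2-tailed 5%

reject_h0 = reject_no_association(r_s, critical_value)
print(reject_h0)  # True
```

Because the test is two-tailed, only the magnitude of $r_s$ is compared with the critical value. The sign indicates the direction of the association but does not affect the decision.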

| Question | Answer | Marks | AOs | Guidance |
|---|---|---|---|---|
| 5(d) | (Because a random sample) enables (proper) inference about the population to be undertaken | B2 | 2.4, 2.4 | B2 for correct explanation, as shown. SC B1 for partially correct explanation, eg a random sample is less likely to be biased |
| | | [2] | | |
| 5(e) | Because as the sample size increases, the random variation in the sample tends to decrease. | E1 | 2.2b | Allow E1 for ‘the influence of outliers is reduced’ or for ‘gives a more reliable result’ oe if there is no further explanation |
| | The sample Spearman’s rank correlation coefficient tends to get closer to the population correlation coefficient. | E1 | 2.2b | |
| | | [2] | | |
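
The claim in part (e) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population below is hypothetical (a linear relationship with Normal noise, chosen purely for illustration and not taken from the question's data), and the helper names `ranks`, `spearman` and `spread_of_r_s` are made up for this example. It draws many samples of size 6 and of size 12 and compares how much the sample coefficient varies.

```python
import random
import statistics


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman's coefficient via the no-ties formula."""
    n = len(xs)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


def spread_of_r_s(sample_size, repetitions, rng):
    """Standard deviation of r_s over repeated samples from a hypothetical population."""
    estimates = []
    for _ in range(repetitions):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        # Hypothetical negative association with added noise (an assumption).
        ys = [-0.8 * x + 0.6 * rng.gauss(0, 1) for x in xs]
        estimates.append(spearman(xs, ys))
    return statistics.stdev(estimates)


rng = random.Random(2022)  # fixed seed so the run is repeatable
spread_6 = spread_of_r_s(6, 2000, rng)
spread_12 = spread_of_r_s(12, 2000, rng)
print(spread_6, spread_12)
```

Under these assumptions the spread for samples of size 12 should come out smaller than for samples of size 6, which is the sense in which the larger sample gives a coefficient closer to the population value.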
5 A medical researcher is investigating whether there is any relationship between the age of a person and the level of a particular protein in the person's blood. She measures the levels of the protein (measured in suitable units) in a random sample of 12 hospital patients of various ages (in years). The spreadsheet shows the values obtained, together with a scatter diagram which illustrates the data.\\
\includegraphics[max width=\textwidth, alt={}, center]{e8624e9b-5143-49d2-9683-cc3a1082694e-5_736_1470_1087_246}
\begin{enumerate}[label=(\alph*)]
\item The researcher decides that a test based on Pearson's product moment correlation coefficient may not be valid. Explain why she comes to this conclusion.
\item Calculate the value of Spearman's rank correlation coefficient.
\item Carry out a test based on this coefficient at the $5 \%$ significance level to investigate whether there is any association between age and protein level.
\item Explain why the researcher chose a sample that was random.
\item The researcher had originally intended to use a sample size of 6 rather than the 12 that she actually used.
Explain what advantage there is in using the larger sample size.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI Further Statistics Minor 2022 Q5 [14]}}