OCR MEI Further Statistics A AS 2018 June — Question 3 12 marks

Exam Board: OCR MEI
Module: Further Statistics A AS
Year: 2018
Session: June
Marks: 12
Topic: Bivariate data
Type: Calculate r from raw bivariate data
Difficulty: Standard +0.3. This is a straightforward application of Spearman's rank correlation coefficient with a standard hypothesis test. It requires several steps (ranking the data, calculating \(r_s\), carrying out the test, interpreting the result), but each follows a routine procedure taught in Further Statistics. The conceptual demand is low: recognizing from the scatter diagram that the data do not look bivariate Normal, then applying a standard non-parametric test. It is slightly easier than average because it is a textbook application with no novel problem-solving required.
Spec: 5.08e Spearman rank correlation; 5.08f Hypothesis test: Spearman rank

3 Samples of water are taken from 10 randomly chosen wells in an area of a country. A researcher is investigating whether there is any relationship between the levels of dissolved oxygen, \(x\), and the amounts of radium, \(y\), in the water from the wells. Both quantities are measured in suitable units. The table and the scatter diagram in Fig. 3 show the values of \(x\) and \(y\) for the ten wells.
\(x\) | 45.9 | 48.3 | 52.2 | 64.6 | 66.6 | 67.6 | 69.3 | 75.0 | 77.4 | 82.8
\(y\) | 25.4 | 23.9 | 26.6 | 18.8 | 18.9 | 19.0 | 16.8 | 16.3 | 17.8 | 17.2
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e3ac0ba0-9692-4018-894e-2b04b07eaf32-3_865_786_657_635} \captionsetup{labelformat=empty} \caption{Fig. 3}
\end{figure}
  (i) Explain why it may not be appropriate to carry out a hypothesis test based on the product moment correlation coefficient.
  (ii) Calculate Spearman's rank correlation coefficient for these data.
  (iii) Using this value of Spearman's rank correlation coefficient, carry out a hypothesis test at the 1\% significance level to investigate whether there is any association between \(x\) and \(y\).
  (iv) Explain the meaning of the term 'significance level' in the context of the test carried out in part (iii).

Question 3:
3(i) | Because the scatter diagram does not suggest a bivariate Normal distribution, since it does not appear to be roughly elliptical (it seems to possibly have two ‘islands’) | E1, E1 [2] | 3.5a, 3.5b | For not from a bivariate Normal distribution; for not elliptical | Do not accept ‘data is not bivariate Normal’; do not accept ‘Normal bivariate’
3(ii) | Rank x: 1 2 3 4 5 6 7 8 9 10; Rank y: 9 8 10 5 6 7 2 1 4 3; \(r_s = -0.818\) \((= -9/11)\) | M1, A1, A1 [3] | 1.1a, 1.1, 1.1 | For using ranks; for correct ranks for y | Accept both reversed; accept -0.82, -0.8182 or better
3(iii) | \(H_0\): There is no association between level of dissolved oxygen and amount of radium. \(H_1\): There is association between level of dissolved oxygen and amount of radium. For n = 10, 1% critical value = 0.7939. \(|r_s| = 0.818 > 0.7939\), so significant / reject \(H_0\). The evidence suggests that there is some association between level of dissolved oxygen and amount of radium. | B1, B1, B1, M1, A1 [5] | 3.3, 1.2, 1.1, 1.1, 2.2b | B1 for \(H_0\); B1 for \(H_1\) and population soi. Hypotheses must be in context (allow hypotheses in terms of x & y). NB \(H_0\), \(H_1\) NOT in terms of ρ. M1 for sensible comparison with 0.7939, leading to a conclusion, provided \(|r_s| < 1\). For non-assertive correct conclusion in context and in terms of \(H_1\). FT their \(r_s\). | Hypotheses as shown in answer column should be understood to imply population. No further marks from here if wrong critical value used. See additional notes.
3(iv) | The significance level is the probability of rejecting the null hypothesis when in fact it is true. E.g. if there is no association between x and y, only about 1 sample in 100 would lead to the conclusion that there is an association between x and y. | E1, E1 [2] | 2.4, 1.2 | For relating this to the context.
Rank x | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Rank y | 9 | 8 | 10 | 5 | 6 | 7 | 2 | 1 | 4 | 3
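
The ranks and the value of \(r_s\) in the mark scheme can be checked numerically. The following is a minimal Python sketch, not part of the mark scheme: the helper names `ranks` and `spearman` are illustrative, and it assumes there are no tied values (true for these data), so that the formula \(r_s = 1 - \frac{6\sum d^2}{n(n^2-1)}\) applies. The critical value 0.7939 is taken as quoted in the mark scheme, not computed.

```python
from fractions import Fraction

# Data from the question's table.
x = [45.9, 48.3, 52.2, 64.6, 66.6, 67.6, 69.3, 75.0, 77.4, 82.8]
y = [25.4, 23.9, 26.6, 18.8, 18.9, 19.0, 16.8, 16.3, 17.8, 17.2]


def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's r_s as 1 - 6*sum(d^2) / (n(n^2 - 1)); valid only without ties."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - Fraction(6 * d_squared, n * (n * n - 1))


r_s = spearman(x, y)

# 1% critical value for n = 10, as quoted in the mark scheme (not derived here).
CRITICAL_VALUE = Fraction(7939, 10000)

# The test is for any association, so the magnitude of r_s is compared.
reject_h0 = abs(r_s) > CRITICAL_VALUE

print(ranks(y))          # ranks of y, smallest = 1
print(r_s, float(r_s))   # exact fraction and decimal value
print(reject_h0)
```

Here \(\sum d^2 = 300\), so \(r_s = 1 - 1800/990 = -9/11 \approx -0.818\). Since \(|r_s| > 0.7939\), the sketch agrees with the mark scheme's decision to reject \(H_0\) at the 1% level.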
3 Samples of water are taken from 10 randomly chosen wells in an area of a country. A researcher is investigating whether there is any relationship between the levels of dissolved oxygen, $x$, and the amounts of radium, $y$, in the water from the wells. Both quantities are measured in suitable units. The table and the scatter diagram in Fig. 3 show the values of $x$ and $y$ for the ten wells.

\begin{center}
\begin{tabular}{ | l | l | l | l | l | l | l | l | l | l | l | }
\hline
$x$ & 45.9 & 48.3 & 52.2 & 64.6 & 66.6 & 67.6 & 69.3 & 75.0 & 77.4 & 82.8 \\
\hline
$y$ & 25.4 & 23.9 & 26.6 & 18.8 & 18.9 & 19.0 & 16.8 & 16.3 & 17.8 & 17.2 \\
\hline
\end{tabular}
\end{center}

\begin{figure}[h]
\begin{center}
  \includegraphics[alt={},max width=\textwidth]{e3ac0ba0-9692-4018-894e-2b04b07eaf32-3_865_786_657_635}
\captionsetup{labelformat=empty}
\caption{Fig. 3}
\end{center}
\end{figure}

(i) Explain why it may not be appropriate to carry out a hypothesis test based on the product moment correlation coefficient.\\
(ii) Calculate Spearman's rank correlation coefficient for these data.\\
(iii) Using this value of Spearman's rank correlation coefficient, carry out a hypothesis test at the 1\% significance level to investigate whether there is any association between $x$ and $y$.\\
(iv) Explain the meaning of the term 'significance level' in the context of the test carried out in part (iii).

\hfill \mbox{\textit{OCR MEI Further Statistics A AS 2018 Q3 [12]}}