OCR MEI Further Statistics Minor 2019 June — Question 5 16 marks

Exam Board: OCR MEI
Module: Further Statistics Minor
Year: 2019
Session: June
Marks: 16
Topic: Hypothesis test for Pearson's product moment correlation coefficient
Type: One-tailed test from summary statistics, with a regression prediction
Difficulty: Standard +0.3. This is a straightforward application of standard correlation and regression techniques from A-level Further Statistics. Parts (a)-(d) involve routine calculation of Pearson's correlation coefficient and a one-tailed hypothesis test with given summary statistics, plus standard interpretation questions. Part (e) asks for a simple prediction using the given regression equation. All steps are textbook procedures requiring no novel insight, though the question is slightly above average difficulty because it is Further Maths content and requires careful execution of several standard techniques.
Spec: 2.02c Scatter diagrams and regression lines; 2.02d Informal interpretation of correlation; 5.08a Pearson correlation: calculate pmcc; 5.08c Pearson: measure of straight-line fit; 5.08d Hypothesis test: Pearson correlation

5 A student wants to know if there is a positive correlation between the amounts of two pollutants, sulphur dioxide and PM10 particulates, on different days in the area of London in which he lives; these amounts, measured in suitable units, are denoted by \(s\) and \(p\) respectively.
He uses a government website to obtain data for a random sample of 15 days on which the amounts of these pollutants were measured simultaneously. Fig. 5.1 is a scatter diagram showing the data. Summary statistics for these 15 values of \(s\) and \(p\) are as follows. \(\sum s = 155.4 \quad \sum p = 518.9 \quad \sum s ^ { 2 } = 2322.7 \quad \sum p ^ { 2 } = 21270.5 \quad \sum s p = 6009.1\) \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{4a4d5816-5b53-49a1-b72f-f8bcf3b4e8bc-4_935_1134_683_260} \captionsetup{labelformat=empty} \caption{Fig. 5.1}
\end{figure}
  (a) Explain why the student might come to the conclusion that a test based on Pearson's product moment correlation coefficient may be valid.
  (b) Find the value of Pearson's product moment correlation coefficient.
  (c) Carry out a test at the \(5 \%\) significance level to investigate whether there is positive correlation between the amounts of sulphur dioxide and PM10 particulates.
  (d) Explain why the student made sure that the sample chosen was a random sample.

The student also wishes to model the relationship between the amounts of nitrogen dioxide \(n\) and PM10 particulates \(p\).
He takes a random sample of 54 values of the two variables, both measured at the same times. Fig. 5.2 is a scatter diagram which shows the data, together with the regression line of \(n\) on \(p\), the equation of the regression line and the value of \(r ^ { 2 }\). \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{4a4d5816-5b53-49a1-b72f-f8bcf3b4e8bc-5_824_1230_495_258} \captionsetup{labelformat=empty} \caption{Fig. 5.2}
\end{figure}
  (e) Predict the value of \(n\) for \(p = 150\).
  (f) Discuss the reliability of your prediction in part (e).

Question 5:
5(a) [2 marks]
Answer: The scatter diagram appears to be roughly elliptical, so the distribution may be bivariate Normal.
E1 (AO 3.5a), E1 (AO 2.4)
Guidance: For either elliptical or bi-Norm.
Notes: “Normal bivariate” is E0. “the data is bivariate Normal” is E0.

5(b) [4 marks]
Answer:
\(S_{sp} = 6009.1 - \frac{1}{15} \times 155.4 \times 518.9\) (= 633.296) oe
\(S_{ss} = 2322.7 - \frac{1}{15} \times 155.4^{2}\) (= 712.756) oe
\(S_{pp} = 21270.5 - \frac{1}{15} \times 518.9^{2}\) (= 3320.019...) oe
\(r = \dfrac{S_{sp}}{\sqrt{S_{ss} S_{pp}}} = \dfrac{633.3}{\sqrt{712.8 \times 3320.0}} = 0.41\)
M1 (AO 1.1a): For \(S_{sp}\)
M1 (AO 1.1): For either \(S_{ss}\) or \(S_{pp}\)
M1 (AO 1.1): For general form including square root
A1 (AO 1.1): Allow full credit for correct answer even if no working
Notes: (0.4116859…) If given to more than 2 sf, the answer must be rounded correctly.

5(c) [5 marks]
Answer:
\(H_0: \rho = 0\)
\(H_1: \rho > 0\) (one-tailed test)
where \(\rho\) is the (population) correlation coefficient between \(s\) and \(p\)
(Critical value =) 0.4409
\(0.41 < 0.4409\)
Insufficient evidence to reject \(H_0\)
There is insufficient evidence at the 5% level to suggest that there is positive correlation between sulphur dioxide and PM10 levels.
B1 (AO 3.3): For both hypotheses
B1 (AO 2.5): For defining \(\rho\)
B1 (AO 3.4): For critical value
M1 (AO 1.1): Their TS compared correctly to their CV, and conclusion
A1 (AO 2.2b): For non-assertive conclusion in context
Notes: Hypotheses in words: \(H_0\): no correlation in the population; \(H_1\): positive correlation in the population. \(\pm 0.4409\) is B0. “Accept \(H_0\)” is M0. For A1, TS and CV must be correct.

5(d) [2 marks]
Answer (any two of the following statements, B1 each):
A random sample enables proper inference about the population to be undertaken.
A random sample is unbiased.
For the hypothesis test to be valid it is necessary to assume that the sample is random.
B1 (AO 2.4), B1 (AO 2.4)
Notes: “a random sample reduces bias” is B0. “data is not biased” is B0.

5(e) [1 mark]
Answer: Prediction of \(n\) for \(p = 150\) is awrt 38 (37.96)
B1 (AO 1.1)

5(f) [2 marks]
Answer:
Although it is interpolation, \(r^{2}\) is only 0.3056 oe
The points do not lie very close to the line
Not (very) reliable
E1 (AO 3.5a), E1 (AO 3.5a)
Guidance: For either, but must conclude about reliability. Mark the last statement about reliability.
Notes: Weak correlation, \(r = 0.5528\)
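
The working for parts (b) and (c) can be checked numerically. The following is a minimal Python sketch and is not part of the official mark scheme: the variable names are illustrative, and the critical value 0.4409 is copied from the mark scheme above rather than computed.

```python
from math import sqrt

# Summary statistics quoted in the question (sample of n = 15 days).
n = 15
sum_s, sum_p = 155.4, 518.9
sum_ss, sum_pp, sum_sp = 2322.7, 21270.5, 6009.1

# Corrected sums of squares and products, e.g. S_sp = sum(sp) - sum(s)*sum(p)/n.
S_sp = sum_sp - sum_s * sum_p / n
S_ss = sum_ss - sum_s ** 2 / n
S_pp = sum_pp - sum_p ** 2 / n

# Pearson's product moment correlation coefficient.
r = S_sp / sqrt(S_ss * S_pp)

# One-tailed 5% critical value for n = 15, as quoted in the mark scheme
# (assumed here, not derived).
critical_value = 0.4409

# H0: rho = 0 is rejected in favour of H1: rho > 0 only if r exceeds the critical value.
reject_h0 = r > critical_value

print(f"S_sp = {S_sp:.3f}, S_ss = {S_ss:.3f}, S_pp = {S_pp:.3f}")
print(f"r = {r:.4f}, reject H0: {reject_h0}")
```

Because the test statistic sits just below the critical value, the conclusion is sensitive to rounding, which is why the mark scheme asks for correct rounding if more than 2 sf are given.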
5 A student wants to know if there is a positive correlation between the amounts of two pollutants, sulphur dioxide and PM10 particulates, on different days in the area of London in which he lives; these amounts, measured in suitable units, are denoted by $s$ and $p$ respectively.\\
He uses a government website to obtain data for a random sample of 15 days on which the amounts of these pollutants were measured simultaneously. Fig. 5.1 is a scatter diagram showing the data. Summary statistics for these 15 values of $s$ and $p$ are as follows.\\
$\sum s = 155.4 \quad \sum p = 518.9 \quad \sum s ^ { 2 } = 2322.7 \quad \sum p ^ { 2 } = 21270.5 \quad \sum s p = 6009.1$

\begin{figure}[h]
\begin{center}
  \includegraphics[alt={},max width=\textwidth]{4a4d5816-5b53-49a1-b72f-f8bcf3b4e8bc-4_935_1134_683_260}
\captionsetup{labelformat=empty}
\caption{Fig. 5.1}
\end{center}
\end{figure}
\begin{enumerate}[label=(\alph*)]
\item Explain why the student might come to the conclusion that a test based on Pearson's product moment correlation coefficient may be valid.
\item Find the value of Pearson's product moment correlation coefficient.
\item Carry out a test at the $5 \%$ significance level to investigate whether there is positive correlation between the amounts of sulphur dioxide and PM10 particulates.
\item Explain why the student made sure that the sample chosen was a random sample.

The student also wishes to model the relationship between the amounts of nitrogen dioxide $n$ and PM10 particulates $p$.\\
He takes a random sample of 54 values of the two variables, both measured at the same times. Fig. 5.2 is a scatter diagram which shows the data, together with the regression line of $n$ on $p$, the equation of the regression line and the value of $r ^ { 2 }$.

\begin{figure}[h]
\begin{center}
  \includegraphics[alt={},max width=\textwidth]{4a4d5816-5b53-49a1-b72f-f8bcf3b4e8bc-5_824_1230_495_258}
\captionsetup{labelformat=empty}
\caption{Fig. 5.2}
\end{center}
\end{figure}
\item Predict the value of $n$ for $p = 150$.
\item Discuss the reliability of your prediction in part (e).
\end{enumerate}
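
For part (f), a short worked step may help; it is offered as a sketch, not as part of the official scheme. The mark scheme quotes $r^{2} = 0.3056$ and $r = 0.5528$, which agree provided the regression line has positive gradient (an assumption here, since Fig. 5.2 is not reproduced in the text):
\[
r = \sqrt{r^{2}} = \sqrt{0.3056} \approx 0.5528 .
\]
On this reading, only about $31\%$ of the variation in $n$ is accounted for by the linear relationship with $p$, so the prediction at $p = 150$ is not very reliable even though it is an interpolation.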

\hfill \mbox{\textit{OCR MEI Further Statistics Minor 2019 Q5 [16]}}