OCR MEI Further Statistics A AS 2024 June — Question 4 10 marks

Exam Board: OCR MEI
Module: Further Statistics A AS
Year: 2024
Session: June
Marks: 10
Topic: Linear regression
Type: Find unknown values from regression
Difficulty: Standard +0.3. This is a straightforward linear regression question requiring standard calculations (finding \(a\) and \(b\) from summary statistics, computing residuals, and making predictions). The only slightly non-routine element is part (d), which asks why the model breaks down, but this needs only simple reasoning about the concentration becoming negative. All techniques are standard A-level further statistics with no novel problem-solving required.
Spec: 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context

4 A chemist is conducting an experiment in which the concentration of a certain chemical, A, is supposed to be recorded at the start of the experiment and then every 30 seconds after the start. The time after the start is denoted by \(t \mathrm{~s}\) and the concentration by \(z \mathrm{~mg~cm^{-3}}\). The collected data are shown in the table below. Note that the concentration at \(t = 90\) was not recorded.
Time, \(t\) | 0 | 30 | 60 | 120 | 150
Concentration of A, \(z\) | 40.0 | 31.3 | 27.5 | 12.8 | 11.4
The chemist wishes to plot the data on a graph.
  (a) Explain why \(t\) should be plotted on the horizontal axis.

You are given that the summary statistics for the data are as follows.
\(n = 5 \quad \sum t = 360 \quad \sum z = 123.0 \quad \sum t^{2} = 41400 \quad \sum z^{2} = 3629.74 \quad \sum tz = 5835\)

The regression line of \(z\) on \(t\) is given by \(z = a + bt\) and is used to model the concentration of chemical A for \(t \geqslant 0\).
  (b) (i) Use the summary statistics to determine the value of \(a\) and the value of \(b\).
      (ii) Find the value of the residual at each of the following values of \(t\).
        • \(t = 60\)
        • \(t = 120\)
  (c) (i) Use the equation of the regression line to estimate the value of the concentration at 90 seconds.
      (ii) With reference to your answers to part (b)(ii), comment on the reliability of your answer to part (c)(i).

Further experiments indicate that the model is reasonably reliable for times greater than 150 seconds up to about 200 seconds.
  (d) Show that the model cannot be valid beyond a time of about 200 seconds.

Question 4:
Part | Answer | Marks | AO | Guidance
4(a) | \(t\) should go on the horizontal axis as it is the independent/control/non-random variable. | B1 [1] | 1.2 |
4(b)(i) | \(S_{tt} = 41400 - \frac{360^{2}}{5}\ (= 15480)\) or \(S_{tz} = 5835 - \frac{360 \times 123}{5}\ (= -3021)\) | M1 | 1.1 | Or \(41400 - 5 \times 72^{2}\) or \(5835 - 5 \times 72 \times 24.6\). For calculation for \(S_{tt}\) or \(S_{tz}\) seen.
4(b)(i) | \(b = \frac{S_{tz}}{S_{tt}} = \frac{-3021}{15480} = -0.1951550\ldots\) = awrt \(-0.195\) | A1 | 1.1 | \(-1007/5160\). awrt \(-0.195\) with no working SCB1.
4(b)(i) | \(a = 24.6 - \text{“}{-0.195}\text{”} \times 72\) oe | M1 | 1.1 | For using \(a = \bar{z} - b\bar{t}\) oe.
4(b)(i) | \(a\) = awrt \(38.7\) | A1 [4] | 1.1 | \(1662/43\). Condone \(a = 38.6\). \(a\) = awrt \(38.7\) with no working SCB1. For reference, \(S_{zz} = 3629.74 - \frac{123^{2}}{5} = 603.94\).
4(b)(ii) | \(t = 60\): awrt \(0.56\) | B1FT | 1.1 | FT their \(a\) and \(b\). Answer to ≥ 2 s.f.
4(b)(ii) | \(t = 120\): awrt \(-2.4\) | B1FT [2] | 1.1 | FT their \(a\) and \(b\). Answer to ≥ 2 s.f. SCB1 if both values are correct in magnitude but both signs are incorrect.
4(c)(i) | \(z(90) \approx 38.65\ldots - 0.1951\ldots \times 90 = 21.08\ldots\) So estimate of concentration is \(21 \mathrm{~mg~cm^{-3}}\). | B1FT [1] | 3.4 | FT their \(a\) and \(b\). Answer to 1 s.f., 2 s.f. or 3 s.f. only. Answers of 4 s.f. or more are over-specified.
4(c)(ii) | (This is interpolation and) the residuals (for \(t = 60\) and \(120\)) are (reasonably) small (in relation to the values) so the estimate is likely to be reliable. | B1 [1] | 3.5a | Condone over-assertive answers (eg “so the estimate is reliable”). Accept a different sensible conclusion relating to their residuals, eg “Even though this is interpolation and surrounding residuals are small the other residuals may be large so we can’t tell.”
4(d) | \(t = 200\) gives \(z = -0.38\) so for \(t > 200\) the model predicts that the concentration is negative, which is impossible. | B1 [1] | 3.5b | For sensible comment supported with a numerical example. Stating that it is (further) extrapolation is, in itself, insufficient for B1. B0 if their gradient is positive.
4 A chemist is conducting an experiment in which the concentration of a certain chemical, A, is supposed to be recorded at the start of the experiment and then every 30 seconds after the start. The time after the start is denoted by $t \mathrm{~s}$ and the concentration by $z \mathrm{~mg~cm^{-3}}$. The collected data are shown in the table below. Note that the concentration at $t = 90$ was not recorded.

\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | }
\hline
Time, $t$ & 0 & 30 & 60 & 120 & 150 \\
\hline
Concentration of A, $z$ & 40.0 & 31.3 & 27.5 & 12.8 & 11.4 \\
\hline
\end{tabular}
\end{center}

The chemist wishes to plot the data on a graph.
\begin{enumerate}[label=(\alph*)]
\item Explain why $t$ should be plotted on the horizontal axis.

You are given that the summary statistics for the data are as follows.\\
$n = 5 \quad \sum t = 360 \quad \sum z = 123.0 \quad \sum t^{2} = 41400 \quad \sum z^{2} = 3629.74 \quad \sum tz = 5835$

The regression line of $z$ on $t$ is given by $z = a + bt$ and is used to model the concentration of chemical A for $t \geqslant 0$.
\item \begin{enumerate}[label=(\roman*)]
\item Use the summary statistics to determine the value of $a$ and the value of $b$.
\item Find the value of the residual at each of the following values of $t$.

\begin{itemize}
  \item $t = 60$
  \item $t = 120$
\end{itemize}
\end{enumerate}
\item \begin{enumerate}[label=(\roman*)]
\item Use the equation of the regression line to estimate the value of the concentration at 90 seconds.
\item With reference to your answers to part (b)(ii), comment on the reliability of your answer to part (c)(i).
\end{enumerate}

Further experiments indicate that the model is reasonably reliable for times greater than 150 seconds up to about 200 seconds.
\item Show that the model cannot be valid beyond a time of about 200 seconds.
\end{enumerate}
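
As a check on the arithmetic, the calculations for parts (b) to (d) can be sketched as follows. This is one possible working, not the official solution. The symbols $\bar{t}$, $\bar{z}$, $S_{tt}$, $S_{tz}$ and $\hat{z}$ are the usual regression notation and are assumptions here, since the question text does not define them. A residual is taken as observed value minus fitted value.

\[
\bar{t} = \frac{360}{5} = 72, \qquad \bar{z} = \frac{123.0}{5} = 24.6
\]
\[
S_{tt} = 41400 - \frac{360^{2}}{5} = 15480, \qquad S_{tz} = 5835 - \frac{360 \times 123.0}{5} = -3021
\]
\[
b = \frac{S_{tz}}{S_{tt}} = \frac{-3021}{15480} \approx -0.195, \qquad a = \bar{z} - b\bar{t} \approx 24.6 + 0.19516 \times 72 \approx 38.7
\]
\[
\hat{z}(60) \approx 26.94 \ \Rightarrow\ 27.5 - 26.94 \approx 0.56, \qquad \hat{z}(120) \approx 15.23 \ \Rightarrow\ 12.8 - 15.23 \approx -2.43
\]
\[
\hat{z}(90) \approx 38.651 - 0.19516 \times 90 \approx 21.1
\]
\[
\hat{z}(200) \approx 38.651 - 0.19516 \times 200 \approx -0.38 < 0, \qquad \hat{z}(t) = 0 \ \mathrm{at} \ t = -\frac{a}{b} \approx 198
\]

The fitted line crosses zero at roughly $t = 198$, so any time much beyond 200 seconds gives a negative predicted concentration, which is the contradiction part (d) asks for.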

\hfill \mbox{\textit{OCR MEI Further Statistics A AS 2024 Q4 [10]}}