| Exam Board | OCR MEI |
|---|---|
| Module | Further Statistics A AS |
| Year | 2020 |
| Session | November |
| Marks | 8 |
| Topic | Linear regression |
| Type | Interpret features of scatter diagram |
| Difficulty | Moderate -0.3 This is a straightforward interpretation question requiring routine application of regression concepts: substituting into a given equation, commenting on interpolation vs extrapolation, interpreting R², and calculating the reverse regression line using standard formulas. All techniques are standard textbook exercises with no novel problem-solving required, though it does test understanding across multiple regression concepts. |
| Spec | 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context |
Question 5:

| Part | Answer | Marks | AOs | Guidance |
|---|---|---|---|---|
| 5 (a) | $x = 150$, $y = 9.17$; $x = 250$, $y = 6.89$ | B1 B1 [2] | 1.1, 1.1 | |
| 5 (b) | The estimate for 150 is likely to be more reliable since this involves interpolation (whereas the estimate for 250 involves extrapolation) | E1 [1] | 3.5b | |
| 5 (c) | Because $r^2 = 0.7303$ and because the points lie fairly close to a straight line, the fit is fairly good. | E1 E1 [2] | 2.2a, 2.4 | For comment relating to $r^2$ or $r$. For comment relating to points. |
| 5 (d) | It would be, since the variables are random on random | E1 [1] | 2.4 | Allow an answer that suggests that the calcium level may depend on the hormone level so it might not be appropriate |
| 5 (e) | $a = 449.8$, $b = -32.06$, so the regression equation is $x = -32.06y + 449.8$ | B1 B1 [2] | 1.1, 1.1 | For either value (3sf or better). Allow B0B1 if only given to 2sf. |
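
The mark scheme for part (e) gives only the final coefficients. As a sketch of the standard working, the regression line of $x$ on $y$ can be written as below. The symbols $S_{xy}$, $S_{yy}$, $\bar{x}$ and $\bar{y}$ are notation assumed here; their values would come from the 14 data pairs in Fig. 5, which are not reproduced in this text.

\[
x = a + by, \qquad b = \frac{S_{xy}}{S_{yy}}, \qquad a = \bar{x} - b\,\bar{y}.
\]

A rough consistency check uses the fact that the product of the two regression gradients equals $r^2$. The part (a) estimates imply a $y$-on-$x$ gradient of about $(6.89 - 9.17)/(250 - 150) = -0.0228$, so

\[
(-0.0228) \times (-32.06) \approx 0.731,
\]

which agrees with the quoted $r^2 = 0.7303$ to within the rounding of the part (a) values.
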
5 A doctor is investigating the relationship between the levels in the blood of a particular hormone and of calcium in healthy adults. The levels of the hormone and of calcium, each measured in suitable units, are denoted by $x$ and $y$ respectively.
The doctor selects a random sample of 14 adults and measures the hormone and calcium levels in each of them. The spreadsheet in Fig. 5 shows the values obtained, together with a scatter diagram which illustrates the data. The equation of the regression line of $y$ on $x$ is shown on the scatter diagram, together with the value of the square of the product moment correlation coefficient.
\begin{figure}[h]
\begin{center}
\includegraphics[alt={},max width=\textwidth]{ba3fcd3c-6834-4116-be0e-d5b27aed0a7e-5_801_1644_646_255}
\captionsetup{labelformat=empty}
\caption{Fig. 5}
\end{center}
\end{figure}
\begin{enumerate}[label=(\alph*)]
\item Use the equation of the regression line to estimate the mean calcium level of people with the following hormone levels.
\begin{itemize}
\item 150
\item 250
\end{itemize}
\item Explain which of your two estimates is likely to be more reliable.
\item Comment on the goodness of fit of the regression line.
\item Explain whether it would be appropriate to plot the scatter diagram the other way around with calcium level on the horizontal axis and hormone level on the vertical axis.
\item Calculate the equation of a regression line which would be suitable for estimating the mean hormone level of people with a known calcium level.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI Further Statistics A AS 2020 Q5 [8]}}