| Exam Board | OCR MEI |
|---|---|
| Module | Further Statistics Minor |
| Year | 2022 |
| Session | June |
| Marks | 13 |
| Topic | Linear regression |
| Type | Identify response/explanatory variables |
| Difficulty | Moderate (-0.8). A straightforward linear regression question requiring standard calculations (finding the regression line equation, making predictions) and basic interpretation. All parts follow textbook procedures with no novel problem-solving required. The conceptual parts (a), (d), (e) and (f) test understanding at a basic level. Easier than the average A-level question because it is a routine application of formulas. |
| Spec | 2.02c Scatter diagrams and regression lines; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
Question 2:

| Part | Answer | Marks | AO | Guidance |
|---|---|---|---|---|
| 2 (a) | Both (variables are random) | B1 [1] | 1.1 | |
| 2 (b) | $S_{dh} = 866.63 - \frac{1}{15} \times 84.9 \times 124.7 \ (= 160.828)$ | M1 | 1.1a | For $S_{dh}$ or $S_{dd}$. Could be embedded. |
| | $S_{dd} = 624.55 - \frac{1}{15} \times 84.9^{2} \ (= 144.016)$ | | | |
| | $a = \frac{160.828}{144.016} \ (= 1.1167\ldots)$ | A1 | 1.1 | For $a$. May have $a$ and $b$ notation reversed. |
| | $b = \frac{124.7}{15} - \text{their } 1.1167\ldots \times \frac{84.9}{15} \ (= 1.9926\ldots)$ | M1 | 3.3 | For method for $b$. |
| | $h = 1.12d + 1.99$ | A1 [4] | 1.1 | Allow final mark only if equation explicitly stated with values to 2 dp. $y = 1.12x + 1.99$ scores M1 A0. |
| 2 (c) | Prediction for $d = 7.5$ is $h = 10.4$ | B1FT | 1.1 | Allow only 1 mark if either prediction given to more than 1 dp. |
| | Prediction for $d = 20.0$ is $h = 24.3$ | B1FT [2] | 1.1 | Allow 24.4 from equation to 2 dp. Use of $d$ on $h$ regression line scores B0. |
| 2 (d) | The prediction for $d = 7.5$ is likely to be reliable since the points lie close to the regression line, and this is interpolation. | B1, E1 | 3.5a, 3.5a | B1 for interpolation. E1 for noting strong linear correlation. |
| | The prediction for $d = 20.0$ is less likely to be reliable since this is extrapolation, well beyond any data values. | B1 [3] | 3.5a | B1 for extrapolation. Use of ‘accurate’ for ‘reliable’ scores B0. |
| 2 (e) | e.g. The predicted height for a tree of diameter 60 cm is 69(.0) metres. | E1 | 1.1 | 69(.2) m from equation to 2 dp ($d > 42.9$ cm gives $h > 50$ m). FT their regression line. |
| | The regression line is no longer valid (for mature trees). | E1 [2] | 2.4 | Relationship non-linear for mature trees. E1 for “The regression line will predict heights (cm) greater than diameters (m)”. Use of ‘accurate’ for ‘valid’ scores B0. |
| 2 (f) | Coordinates are (5.66, 8.31) | B1 [1] | 1.1 | Accept $\left( \frac{283}{50}, \frac{1247}{150} \right)$ |
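
The answer to part (f) rests on a standard property of least squares lines that the mark scheme does not spell out: each regression line passes through the mean point $(\bar{d}, \bar{h})$. A brief worked sketch follows, using only the summary data from the question. The bar notation and the symbols $a'$, $b'$ for the $d$ on $h$ line are introduced here for illustration and are not part of the original mark scheme; the block assumes the `amsmath` package.

```latex
\begin{align*}
\bar{d} &= \frac{\sum d}{n} = \frac{84.9}{15} = 5.66 \\
\bar{h} &= \frac{\sum h}{n} = \frac{124.7}{15} \approx 8.31 \\
\intertext{For $h$ on $d$ the intercept is $b = \bar{h} - a\bar{d}$, so at $d = \bar{d}$:}
h &= a\bar{d} + (\bar{h} - a\bar{d}) = \bar{h} \\
\intertext{For $d$ on $h$, written $d = a'h + b'$ with $b' = \bar{d} - a'\bar{h}$, at $h = \bar{h}$:}
d &= a'\bar{h} + (\bar{d} - a'\bar{h}) = \bar{d}
\end{align*}
```

Both lines therefore pass through $(5.66, 8.31)$. Provided the two lines are distinct (they coincide only when the correlation is perfect), this is their single point of intersection.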
2 A forester is investigating the relationship between the diameter and the height of young beech trees. She selects a random sample of 15 young beech trees in a forest and records their diameters, $d \mathrm {~cm}$, and their heights, $h \mathrm {~m}$. The data are illustrated in the scatter diagram.\\
\includegraphics[max width=\textwidth, alt={}, center]{e8624e9b-5143-49d2-9683-cc3a1082694e-3_649_1116_386_230}
\begin{enumerate}[label=(\alph*)]
\item State whether either or both of the variables $d$ and $h$ are random variables.

Summary data for the diameters and heights are as follows.
$$n = 15 \quad \sum d = 84.9 \quad \sum h = 124.7 \quad \sum d^{2} = 624.55 \quad \sum h^{2} = 1230.57 \quad \sum dh = 866.63$$
\item Find the equation of the regression line of $h$ on $d$. Give your answer in the form $h = a d + b$, giving the values of $a$ and $b$ correct to $\mathbf{2}$ decimal places.
\item Use the regression line to predict the heights of beech trees with the following diameters.
\begin{itemize}
\item 7.5 cm
\item 20.0 cm
\end{itemize}
\item Comment on the reliability of your predictions.
\item There are many mature beech trees with diameter of 60 cm or greater. However, there are no beech trees with a height of more than 50 m. Comment on this in relation to your regression line.
\item State the coordinates of the point at which the regression line of $d$ on $h$ meets the line which you calculated in part (b).
\end{enumerate}
\hfill \mbox{\textit{OCR MEI Further Statistics Minor 2022 Q2 [13]}}
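
The arithmetic in parts (b), (c), (e) and (f) can be checked numerically from the summary data alone. The following is a minimal Python sketch, not part of the original paper or mark scheme; the variable names and the helper `predict_height` are hypothetical and chosen only for illustration.

```python
# Summary statistics quoted in the question (n = 15 young beech trees).
n = 15
sum_d, sum_h = 84.9, 124.7        # sum of diameters (cm), sum of heights (m)
sum_dd, sum_dh = 624.55, 866.63   # sum of d^2, sum of d*h

# Corrected sum of products and corrected sum of squares.
S_dh = sum_dh - sum_d * sum_h / n
S_dd = sum_dd - sum_d ** 2 / n

# Least squares regression line of h on d, in the form h = a*d + b.
a = S_dh / S_dd
b = sum_h / n - a * sum_d / n


def predict_height(d_cm):
    """Predicted height in metres for a diameter in cm (hypothetical helper)."""
    return a * d_cm + b


# Mean point (d-bar, h-bar), through which both regression lines pass.
mean_point = (sum_d / n, sum_h / n)

if __name__ == "__main__":
    print(f"h = {a:.2f}d + {b:.2f}")
    for d in (7.5, 20.0, 60.0):
        print(f"d = {d} cm -> predicted h = {predict_height(d):.1f} m")
    print(f"mean point = ({mean_point[0]:.2f}, {mean_point[1]:.2f})")
```

The sketch keeps $a$ and $b$ unrounded when predicting, which is why it gives 24.3 m for $d = 20.0$ rather than the 24.4 m that the mark scheme also allows from the equation rounded to 2 dp.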