| Exam Board | OCR |
|---|---|
| Module | Further Statistics AS |
| Year | 2022 |
| Session | June |
| Marks | 8 |
| Topic | Linear regression |
| Type | Calculate PMCC from raw data |
| Difficulty | Moderate (-0.3). This is a straightforward Further Statistics question requiring a standard regression-line calculation with provided summary statistics. Although it is Further Maths content, the computational steps are routine (using formulas with given sums), and parts (a), (d) and (e) test basic conceptual understanding rather than problem-solving. Slightly easier than average A-level difficulty overall. |
| Spec | 5.09a Dependent/independent variables; 5.09c Calculate regression line |
# Question 1:
## Part (a)(i):
The values of $d$ do not depend on any other variable in the experiment and they may be (or are) selected by the experimenter | **B1 [1]** | Not just "equal increments". Need to use the nature of $d$, not its values
## Part (a)(ii):
$v$ is a dependent, response variable | **B1 [1]** | Both needed. Not "uncontrolled"
## Part (b):
$v = 0.329 + 2.71d$ | **B2 [2]** | One error e.g. wrong letters: B1; In $[0.328, 0.329]$ and $[2.71, 2.72]$: B1. SR all 9 points: $v = d + 0.7$ B1. SR: if B0, give M1 for correct substitution into $b$
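
A worked sketch of the calculation behind this answer. The eight-point totals are obtained by subtracting the contribution of $(0.5, 0.4)$ from the sums given in the question; the symbols $S_{dd}$, $S_{dv}$ and $a$ are notation assumed here rather than taken from the mark scheme.

$$
\begin{aligned}
n &= 8, \quad \sum d = 2.7 - 0.5 = 2.2, \quad \sum v = 9.0 - 0.4 = 8.6, \\
\sum d^2 &= 0.96 - 0.5^2 = 0.71, \quad \sum dv = 2.85 - 0.5 \times 0.4 = 2.65, \\
S_{dd} &= 0.71 - \frac{2.2^2}{8} = 0.105, \quad S_{dv} = 2.65 - \frac{2.2 \times 8.6}{8} = 0.285, \\
b &= \frac{S_{dv}}{S_{dd}} = \frac{0.285}{0.105} = \frac{19}{7} \approx 2.714, \\
a &= \bar{v} - b\,\bar{d} = 1.075 - \frac{19}{7} \times 0.275 = \frac{23}{70} \approx 0.3286.
\end{aligned}
$$

Repeating the same steps with all nine points gives $S_{dd} = S_{dv} = 0.15$, hence $b = 1$ and $a = 0.7$, which is the special-case line $v = d + 0.7$ mentioned in the guidance.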
## Part (c):
$1.69\ (1.6857)$ | **B1 [1]** | In range $[1.68, 1.69]$. No FT
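
The substitution can be written out as follows, keeping the unrounded coefficients $a = \tfrac{23}{70}$ and $b = \tfrac{19}{7}$ of the eight-point line (values worked out here from the question's data, not quoted in the mark scheme):

$$
v = \frac{23}{70} + \frac{19}{7} \times 0.5 = \frac{59}{35} \approx 1.6857
$$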
## Part (d):
Quite a big difference (between 0.4 and *their* 1.69) so statistician is likely to be right/it may well be an anomaly | **B1ft [1]** | Comment on statistician's statement, not over-certain, ft on their 1.69. *Ignore* any comment about extrapolation (irrelevant here) or outliers. B0 if clearly wrong comparison used
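
The comparison in parts (b) to (d) can also be checked numerically. The following is a minimal Python sketch, not part of the mark scheme: the helper name `least_squares_line` and the variable names are hypothetical, and the data are copied from the question's table.

```python
# Data from the question's table: depth d (m) and flow speed v (m/s).
d = [0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5]
v = [0.8, 0.5, 0.7, 1.2, 1.1, 1.3, 1.6, 1.4, 0.4]


def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x (hypothetical helper)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # Corrected sums of squares and products, S_xx and S_xy.
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    b = s_xy / s_xx
    a = sum_y / n - b * sum_x / n
    return a, b


# Part (b): fit the line to the first eight points, leaving out (0.5, 0.4).
a, b = least_squares_line(d[:-1], v[:-1])

# Part (c): predicted speed at d = 0.5 from that line.
predicted = a + b * 0.5

# Part (d): gap between the prediction and the observed value 0.4.
difference = predicted - v[-1]

print(f"v = {a:.3f} + {b:.2f}d")
print(f"predicted v at d = 0.5: {predicted:.4f}")
print(f"difference from observed 0.4: {difference:.2f}")
```

The size of `difference` relative to the spread of the other speeds is what supports a cautious comment that the point may be an anomaly.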
## Part (e):
Draw regression line and 2 or more verticals | **B1** |
State that equation minimises sum of squares of vertical distances ("residuals") | **B1 [2]** | Needs "minimises", "sum", and "squares" or "residuals". Allow "minimise $\Sigma d^2$" if clear from diagram. Allow "average" or "total" instead of "sum". Distances shown not vertical: B0. Not just "minimise sum of squares"
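
The statement required for the second mark can be expressed symbolically. In this sketch the residual notation $e_i$ and the index $i$ are assumptions introduced for illustration; the mark scheme itself only asks for the wording. For data points $(d_i, v_i)$ and a candidate line $v = a + bd$, the vertical distance from each point to the line is

$$
e_i = v_i - (a + b\,d_i),
$$

and the least squares regression line is the choice of $a$ and $b$ that minimises

$$
\sum_{i=1}^{n} e_i^{2} = \sum_{i=1}^{n} \bigl(v_i - a - b\,d_i\bigr)^{2}.
$$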
---
1 A geography student chose a certain point in a stream and took measurements of the speed of flow, $v \mathrm{~m\,s^{-1}}$, of water at various depths, $d \mathrm{~m}$, below the surface at that point. The results are shown in the table.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | c | }
\hline
$d$ & 0.1 & 0.15 & 0.2 & 0.25 & 0.3 & 0.35 & 0.4 & 0.45 & 0.5 \\
\hline
$v$ & 0.8 & 0.5 & 0.7 & 1.2 & 1.1 & 1.3 & 1.6 & 1.4 & 0.4 \\
\hline
\end{tabular}
\end{center}
$n = 9 \quad \sum d = 2.7 \quad \sum v = 9.0 \quad \sum d^{2} = 0.96 \quad \sum v^{2} = 10.4 \quad \sum dv = 2.85$
\begin{enumerate}[label=(\alph*)]
\item \begin{enumerate}[label=(\roman*)]
\item Explain why $d$ is an example of an independent, controlled variable.
\item Use two relevant terms to describe the variable $v$ in a similar way.
\end{enumerate}

A statistician believes that the point $(0.5, 0.4)$ may be an anomaly.
\item Calculate the equation of the least squares regression line of $v$ on $d$ for all the points in the table apart from $(0.5, 0.4)$.
\item Use the equation of the line found in part (b) to estimate the value of $v$ when $d = 0.5$.
\item Use your answer to part (c) to comment on the statistician's belief.
\item Use the diagram in the Printed Answer Booklet (which does not illustrate the data in this question) to explain what is meant by "least squares regression line".
\end{enumerate}
\hfill \mbox{\textit{OCR Further Statistics AS 2022 Q1 [8]}}