| Exam Board | OCR |
|---|---|
| Module | Further Statistics AS |
| Year | 2018 |
| Session | June |
| Marks | 8 |
| Topic | Linear regression |
| Type | Identify response/explanatory variables |
| Difficulty | Moderate -0.8. Part (i) is straightforward conceptual understanding of independent/explanatory variables. Part (ii) is a standard regression calculation with given data. Part (iii) tests understanding of how linear transformations affect regression equations (routine). Part (iv) requires explaining a definition with a diagram. All parts are direct application of core regression concepts with no problem-solving or novel insight required, making this easier than average. |
| Spec | 5.09a Dependent/independent variables; 5.09c Calculate regression line |
## Question 7:
### Part (i)
| Answer | Marks | Guidance |
|---|---|---|
| Neither | B1 [1] | OE. Not "neither is independent of the other" |
### Part (ii)
| Answer | Marks | Guidance |
|---|---|---|
| $c = 2.848 - 0.1567m$ BC | B1 B1 B1 [3] | B1: Correct $a$, awrt 2.85. B1: Correct $b$, awrt 0.157. B1: Letters correct from correct method. If both wrongly rounded e.g. $c = 2.84 - 0.156m$, give B2. SC: $m$ on $c$: $m = 15.65 - 4.832c$: B2; $y = 15.65 - 4.832x$: B1; $c = 15.65 - 4.832m$: B1. If B0B0, give B1 for correct letters from valid working |
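The coefficients above can be checked numerically. The following is a minimal sketch, not part of the mark scheme; the helper name `regression_line` is hypothetical, and it assumes the standard formulas $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$ with $m$ as the $x$-variable, because the line required is $c$ on $m$.

```python
# Data from the question's table (c in mg per litre, m in pounds).
c = [1.94, 1.78, 1.62, 1.51, 1.52, 1.4]
m = [6.5, 7.2, 7.4, 7.6, 8.3, 9.7]


def regression_line(x, y):
    """Return (a, b) for the least squares line y = a + b*x.

    Hypothetical helper for illustration only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b


# c on m: m plays the role of x, c the role of y.
a, b = regression_line(m, c)
print(f"a = {a:.3f}, b = {b:.4f}")
```

Swapping the arguments to `regression_line(c, m)` gives the line of $m$ on $c$ instead, which is the special case (SC) the guidance refers to.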
### Part (iii)
| Answer | Marks | Guidance |
|---|---|---|
| $a$ unchanged, $b$ multiplied by 2.2 (allow "$a$ unchanged, $b$ increases", etc) | B1 [1] | oe e.g. $c = 2.848 - 0.345m$; $m = 7.114 - 2.196c$. SC: $m$ on $c$ in (ii): Both divided by 2.2 B1. AO 2.2a |
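The effect of the unit change can be illustrated by refitting the line with the masses converted to kilograms. This is a sketch under the question's approximation $1\text{ kg} \approx 2.2$ pounds; the helper `fit` and the variable names are hypothetical.

```python
# Data from the question's table (c in mg per litre, m in pounds).
c = [1.94, 1.78, 1.62, 1.51, 1.52, 1.4]
m_lb = [6.5, 7.2, 7.4, 7.6, 8.3, 9.7]

# Convert masses to kilograms using the question's approximation.
m_kg = [x / 2.2 for x in m_lb]


def fit(x, y):
    """Return (a, b) for the least squares line y = a + b*x (illustrative helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx
    return mean_y - b * mean_x, b


a_lb, b_lb = fit(m_lb, c)
a_kg, b_kg = fit(m_kg, c)

print(f"intercept change: {a_kg - a_lb:.6f}")  # essentially zero
print(f"gradient ratio:   {b_kg / b_lb:.6f}")  # essentially 2.2
```

Dividing every $m$ value by 2.2 divides $S_{mm}$ by $2.2^2$ and $S_{mc}$ by 2.2, so the gradient is multiplied by 2.2 while the intercept is unaffected.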
### Part (iv)
| Answer | Marks | Guidance |
|---|---|---|
| Draw approximate line of best fit. Draw at least one vertical from line to point. Say that "Best fit" line minimises the sum of squares of these distances | M1 M1 A1 [3] | M1: AO 1.1. M1: AO 2.4. A1: Needs M2 and "minimises" and "sums of squares" oe. SC: Horizontal(s): full marks (indept of (ii)) |
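The "minimises the sum of squares" idea can be illustrated numerically. The sketch below is illustrative only: it measures each distance in the $c$ direction (because the line is $c$ on $m$), uses the rounded coefficients from part (ii), and the function name `sum_sq` and the nudge sizes are arbitrary choices.

```python
# Data from the question's table (c in mg per litre, m in pounds).
c = [1.94, 1.78, 1.62, 1.51, 1.52, 1.4]
m = [6.5, 7.2, 7.4, 7.6, 8.3, 9.7]


def sum_sq(a, b):
    """Sum of squared distances, in the c direction, from each point to c = a + b*m."""
    return sum((ci - (a + b * mi)) ** 2 for mi, ci in zip(m, c))


# Least squares coefficients computed from the data (exact formulas).
n = len(m)
mean_m, mean_c = sum(m) / n, sum(c) / n
b = sum((mi - mean_m) * (ci - mean_c) for mi, ci in zip(m, c)) / sum(
    (mi - mean_m) ** 2 for mi in m
)
a = mean_c - b * mean_m

best = sum_sq(a, b)
print(f"least squares line: sum of squares = {best:.5f}")

# Any other line gives a larger total; a few arbitrary nudges as examples.
for da, db in [(0.05, 0.0), (-0.05, 0.0), (0.0, 0.01), (0.0, -0.01)]:
    print(f"a{da:+.2f}, b{db:+.2f}: sum of squares = {sum_sq(a + da, b + db):.5f}")
```

Each nudged line produces a strictly larger total than the fitted line, which is what "least squares" means.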
---
7 An environmentalist measures the mean concentration, $c$ milligrams per litre, of a particular chemical in a group of rivers, and the mean mass, $m$ pounds, of fish of a certain species found in those rivers. The results are given in the table.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | }
\hline
$c$ & 1.94 & 1.78 & 1.62 & 1.51 & 1.52 & 1.4 \\
\hline
$m$ & 6.5 & 7.2 & 7.4 & 7.6 & 8.3 & 9.7 \\
\hline
\end{tabular}
\end{center}
(i) State which, if either, of $m$ and $c$ is an independent variable.\\
(ii) Calculate the equation of the least squares regression line of $c$ on $m$.\\
(iii) State what effect, if any, there would be on your answer to part (ii) if the masses of the fish had been recorded in kilograms rather than pounds. ($1 \mathrm{~kg} \approx 2.2$ pounds.)\\
(iv) The data is illustrated in the scatter diagram. Explain what is meant by 'least squares', illustrating your answer using the copy of this diagram in the Printed Answer Booklet.\\
\begin{tikzpicture}[>=Stealth]
% Define scale: x-axis from 1 to 2.1, y-axis from 6 to 10
\def\xscale{8}
\def\yscale{1.5}
% Draw axes
\draw[->] (0,0) -- (8.8,0) node[right] {$c$};
\draw[->] (0,0) -- (0,6.3);
% x-axis: ranges from 1 to 2, with gridlines
% Map: c=1 -> x=0, c=2 -> x=8, so x = (c-1)*8
% y-axis: m=6 -> y=0, m=10 -> y=6, so y = (m-6)*1.5
% x-axis tick marks and labels
\foreach \c in {1, 1.2, 1.4, 1.6, 1.8, 2} {
\pgfmathsetmacro{\xpos}{(\c - 1)*8}
\draw (\xpos, -0.1) -- (\xpos, 0.1);
\node[below] at (\xpos, -0.1) {$\c$};
}
% y-axis tick marks and labels
\foreach \m in {6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10} {
\pgfmathsetmacro{\ypos}{(\m - 6)*1.5}
\draw (-0.1, \ypos) -- (0.1, \ypos);
\pgfmathsetmacro{\mval}{\m}
\node[left] at (-0.1, \ypos) {$\pgfmathprintnumber[fixed,precision=1]{\mval}$};
}
% y-axis label
\node[left] at (-0.1, 6.3) {$m$};
% Grid lines (light gray)
\foreach \c in {1, 1.2, 1.4, 1.6, 1.8, 2} {
\pgfmathsetmacro{\xpos}{(\c - 1)*8}
\draw[gray!30] (\xpos, 0) -- (\xpos, 6);
}
\foreach \m in {6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10} {
\pgfmathsetmacro{\ypos}{(\m - 6)*1.5}
\draw[gray!30] (0, \ypos) -- (8, \ypos);
}
% Data points: (c, m)
% (1.94, 6.5) -> (7.52, 0.75)
\fill (7.52, 0.75) circle (3pt);
% (1.78, 7.2) -> (6.24, 1.8)
\fill (6.24, 1.8) circle (3pt);
% (1.62, 7.4) -> (4.96, 2.1)
\fill (4.96, 2.1) circle (3pt);
% (1.51, 7.6) -> (4.08, 2.4)
\fill (4.08, 2.4) circle (3pt);
% (1.52, 8.3) -> (4.16, 3.45)
\fill (4.16, 3.45) circle (3pt);
% (1.4, 9.7) -> (3.2, 5.55)
\fill (3.2, 5.55) circle (3pt);
\end{tikzpicture}
\hfill \mbox{\textit{OCR Further Statistics AS 2018 Q7 [8]}}