OCR Further Statistics 2024 June, Question 7 (8 marks)

Exam Board: OCR
Module: Further Statistics
Year: 2024
Session: June
Marks: 8
Topic: Linear regression
Type: Minimize sum of squared residuals
Difficulty: Standard +0.3. This is a structured Further Maths Statistics question that guides students through minimising the sum of squared residuals using a completed-square form. Although it involves Further Maths content, the question gives the algebraic form explicitly and asks for explanations rather than derivations. Parts (a)-(d) require an understanding of regression concepts but little calculation. Part (e) requires applying linear transformations to find a new gradient, which is routine algebra. Overall it is slightly easier than average because of the scaffolding provided.
Spec: 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context

7 The coordinates of a set of 10 points are denoted by \((x_i, y_i)\) for \(i = 1, 2, \ldots, 10\). For a particular set of values of \((x_i, y_i)\) and any constants \(a\) and \(b\) it can be shown that \(\Sigma \left( y_i - a - b x_i \right)^2 = 10 ( 11 - a - 6b )^2 + 126 \left( b - \frac{83}{42} \right)^2 + \frac{139}{14}\).
  (a) (i) Explain why \(\Sigma \left( y_i - a - b x_i \right)^2\) is minimised by taking \(b = \frac{83}{42}\) and \(a = 11 - 6b\).
      (ii) Hence explain why the equation of the regression line of \(y\) on \(x\) for these points is given by the corresponding values of \(a\) and \(b\) (so that the equation is \(y = \frac{83}{42} x - \frac{6}{7}\)).
  (b) State which of the following terms cannot apply to the variable \(X\) if the regression line of \(y\) on \(x\) can be used for estimating values of \(Y\): Dependent, Independent, Controlled, Response.
  (c) Use the regression line to estimate the value of \(y\) corresponding to \(x = 8\).
  (d) State what must be true of the value \(x = 8\) if the estimate in part (c) is to be reliable.
  (e) Variables \(u\) and \(v\) are related to \(x\) and \(y\) by the following relationships: \(u = 2 + 4x\), \(v = 8 - 2y\). Show that the gradient of the regression line of \(v\) on \(u\) is very close to \(-1\).
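
A brief worked sketch of parts (a) and (c), using only the identity stated in the question. The layout is illustrative rather than taken from the mark scheme, and the `aligned` environment assumes `amsmath` (or MathJax) is available. Both squared terms are non-negative, so the sum is smallest when each of them is zero:

\[
\begin{aligned}
b - \tfrac{83}{42} = 0 &\implies b = \tfrac{83}{42}, \\
11 - a - 6b = 0 &\implies a = 11 - 6 \times \tfrac{83}{42} = 11 - \tfrac{83}{7} = -\tfrac{6}{7}, \\
\min \Sigma \left( y_i - a - b x_i \right)^2 &= \tfrac{139}{14}, \\
x = 8 &\implies y = \tfrac{83}{42} \times 8 - \tfrac{6}{7} = \tfrac{664 - 36}{42} = \tfrac{314}{21} \approx 14.95 .
\end{aligned}
\]

The constant \(\frac{139}{14}\) does not depend on \(a\) or \(b\), so it has no effect on where the minimum occurs.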

Question 7:
Question | Answer | Marks | AO | Guidance
7(a)(i) | (Squares are \(\geq 0\), so the expression is minimised by) making both squared brackets zero | B1 [1] | 2.1 | Needs “makes both brackets zero” oe, don’t need “squares” or “minimised” here
7(a)(ii) | This choice of \(a\) and \(b\) gives the minimum (sum of) squares of residues | B1 [1] | 1.2 | OE. Needs “minimises” oe and “squares of residuals/differences/distances/errors” oe, but don’t need “sum of”
7(b) | Dependent, response | B1 [1] | 1.2 | Both, no others
7(c) | \(\frac{314}{21}\) or 15.0 (14.952) | B1 [1] | 1.1 | Exact or in range [14.9, 15.0]. Allow 15 only if 3SF seen correct in working.
7(d) | (8) must be within the range of the given data, or “must be interpolation” or “not extrapolation” | B1 [1] | 2.3 | Allow “within range of data”. Not “within range of y-values”, not “must be one of the data values”. Ignore extra comments, except that if anything definitely wrong seen: B0
7(e) | \(x = \frac{1}{4}(u - 2)\), \(y = \frac{1}{2}(8 - v)\) | M1 | 3.1a | Rearrange
 | \(\frac{1}{2}(8 - v) = -\frac{6}{7} + \frac{83}{42} \times \frac{1}{4}(u - 2)\) | M1 | 1.1 | Substitute into equation
OR | Gradient of \(v\) on \(u\) = (gradient of \(y\) on \(x\)) \(\times (-2) \div 4\) | M1A1 | | Use \(\frac{dv}{du} = \frac{dy}{dx} \times \frac{dv}{dy} \div \frac{du}{dx}\) or equivalent, e.g. \(b = S_{uv} / S_{uu} = (S_{xy} \times (-2 \times 4)) / (S_{xx} \times 4^2)\)
OR | \((x, y) = (0, -\frac{6}{7}), (\frac{36}{83}, 0)\); \((u, v) = (2, \frac{68}{7}), (\frac{310}{83}, 8)\) | M1 | | Find any 2 points on (\(y\) on \(x\)) and convert to \((u, v)\)
 | New gradient is \((\frac{68}{7} - 8) \div (2 - \frac{310}{83})\) | M1 | | Find new gradient
OR | \(v = a + bu \implies 8 - 2y = a + b(2 + 4x)\) | M1 | | Substitute and compare coefficients of \(x\)
 | Compare coefficients of \(x\): \(4b \div (-2) = \frac{83}{42}\) | A1 | | 4 and \(-2\) correctly placed
 | Gradient is \(-\frac{83}{84}\) (which is very close to \(-1\), AG) | A1 [3] | 3.2a | Obtain \(-0.988\) or better. Ignore intercept constants. Don’t need conclusion.
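
The linear-coding argument for part (e) can be sketched in general terms. This assumes the standard least squares gradient \(b = S_{xy} / S_{xx}\), which is not quoted in the question itself, together with the codings \(u = 2 + 4x\) and \(v = 8 - 2y\); `aligned` again assumes `amsmath` or MathJax:

\[
\begin{aligned}
S_{uu} &= 4^2 \, S_{xx}, \qquad S_{uv} = 4 \times (-2) \, S_{xy}, \\
\frac{S_{uv}}{S_{uu}} &= \frac{-8 \, S_{xy}}{16 \, S_{xx}} = -\frac{1}{2} \times \frac{83}{42} = -\frac{83}{84} \approx -0.988 .
\end{aligned}
\]

The additive constants 2 and 8 shift the means only, so they change the intercept but not the gradient.
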
7 The coordinates of a set of 10 points are denoted by $(x_i, y_i)$ for $i = 1, 2, \ldots, 10$. For a particular set of values of $(x_i, y_i)$ and any constants $a$ and $b$ it can be shown that\\
$\Sigma \left( y_i - a - b x_i \right)^2 = 10 ( 11 - a - 6b )^2 + 126 \left( b - \frac{83}{42} \right)^2 + \frac{139}{14}$.
\begin{enumerate}[label=(\alph*)]
\item \begin{enumerate}[label=(\roman*)]
\item Explain why $\sum \left( y_i - a - b x_i \right)^2$ is minimised by taking $b = \frac{83}{42}$ and $a = 11 - 6b$.
\item Hence explain why the equation of the regression line of $y$ on $x$ for these points is given by the corresponding values of $a$ and $b$ (so that the equation is $y = \frac{83}{42} x - \frac{6}{7}$).
\end{enumerate}\item State which of the following terms cannot apply to the variable $X$ if the regression line of $y$ on $x$ can be used for estimating values of $Y$.

Dependent Independent Controlled Response
\item Use the regression line to estimate the value of $y$ corresponding to $x = 8$.
\item State what must be true of the value $x = 8$ if the estimate in part (c) is to be reliable.
\item Variables $u$ and $v$ are related to $x$ and $y$ by the following relationships.\\
$u = 2 + 4 x \quad v = 8 - 2 y$

Show that the gradient of the regression line of $v$ on $u$ is very close to $-1$.
\end{enumerate}

\hfill \mbox{\textit{OCR Further Statistics 2024 Q7 [8]}}