Edexcel FS2 2023 June — Question 1 7 marks

Exam Board: Edexcel
Module: FS2 (Further Statistics 2)
Year: 2023
Session: June
Marks: 7
Topic: Linear regression
Type: Calculate PMCC from summary statistics
Difficulty: Easy (-1.2). This is a straightforward application of standard formulas for the PMCC, the regression line, and residuals. All summary statistics are provided, so the question requires only direct substitution into memorized formulas: \(r = S_{xy}/\sqrt{S_{xx} S_{yy}}\), \(b = S_{xy}/S_{xx}\), \(a = \bar{y} - b\bar{x}\). Part (d) requires basic interpretation of residual signs. No problem-solving or novel insight is needed; the work is purely routine calculation and standard interpretation, making it easier than average A-level questions.
Spec: 5.08a Pearson correlation: calculate pmcc; 5.09c Calculate regression line

  1. Baako is investigating the times taken by children to run a 100 m race, \(x\) seconds, and a 500 m race, \(y\) seconds. For a sample of 20 children, Baako obtains the time taken by each child to run each race.

Here are Baako's summary statistics.

$$\begin{gathered} \mathrm{S}_{xx} = 314.55 \quad \mathrm{S}_{yy} = 9026 \quad \mathrm{S}_{xy} = 1610 \\ \bar{x} = 19.65 \quad \bar{y} = 108 \end{gathered}$$

(a) Calculate the product moment correlation coefficient between the times taken to run the 100 m race and the times taken to run the 500 m race.

(b) Show that the equation of the regression line of \(y\) on \(x\) can be written as

$$y = 5.12x + 7.42$$

where the gradient and \(y\) intercept are given to 3 significant figures.

The child who completed the 100 m race in 20 seconds took 104 seconds to complete the 500 m race.

(c) Find the residual for this child.

The table below shows the signs of the residuals for the 20 children in order of finishing time for the 100 m race.

| Sign of residual | + | + | + | + | - | - | + | - | - | - | - | - | - | - | - | + | + | + | + | + |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

(d) Explain what the signs of the residuals show about the model's predictions of the 500 m race times for the children who are fastest and slowest over the 100 m race.

## Question 1:

### Part (a)
| Answer/Working | Marks | Guidance |
|---|---|---|
| $r = \dfrac{1610}{\sqrt{314.55 \times 9026}}$ | M1 | Correct expression for $r$ |
| $= 0.9555\ldots$ awrt $\mathbf{0.956}$ | A1 | awrt **0.956** |
| **(2 marks)** | | |
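
The substitution in Part (a) can be checked numerically. The following is a minimal sketch, assuming only the summary statistics quoted in the question; the function name `pmcc` is an illustrative choice and not part of the mark scheme.

```python
import math

# Summary statistics quoted in the question.
S_xx = 314.55
S_yy = 9026
S_xy = 1610


def pmcc(s_xy, s_xx, s_yy):
    """Product moment correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    return s_xy / math.sqrt(s_xx * s_yy)


r = pmcc(S_xy, S_xx, S_yy)
print(f"r = {r:.4f}")  # compare with the mark scheme's awrt 0.956
```

A value this close to 1 indicates strong positive linear correlation between the two race times.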

### Part (b)
| Answer/Working | Marks | Guidance |
|---|---|---|
| $b = \dfrac{1610}{314.55}$ | M1 | Correct expression for gradient, or correct formula stated with $b$ to more than 3 s.f. |
| $a = \bar{y} - b\bar{x} = 108 - \text{'}b\text{'} \times 19.65$ | M1 | Correct expression, or correct formula stated with $a$ to more than 3 s.f., follow through their $b$ |
| $y = 5.12x + 7.42$* | A1* | Correct linear model with $a$ = awrt **7.42** and $b$ = awrt **5.12** |
| **(3 marks)** | | |
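
The two M1 steps in Part (b) can be reproduced as follows. This is a sketch under the assumption that the unrounded gradient is carried into the intercept, as the follow-through guidance suggests; the helper name `regression_y_on_x` is hypothetical.

```python
# Summary statistics quoted in the question.
S_xx = 314.55
S_xy = 1610
x_bar = 19.65
y_bar = 108


def regression_y_on_x(s_xy, s_xx, x_mean, y_mean):
    """Return (a, b) for the least-squares line y = a + b*x."""
    b = s_xy / s_xx            # gradient
    a = y_mean - b * x_mean    # intercept, using the unrounded gradient
    return a, b


a, b = regression_y_on_x(S_xy, S_xx, x_bar, y_bar)
print(f"b = {b:.4f}, a = {a:.4f}")
print(f"y = {b:.3g}x + {a:.3g}")  # both coefficients shown to 3 s.f.
```

Because this is a "show that" part, the working should display $b$ and $a$ to more than 3 s.f. before rounding, which is what the first `print` line is for.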

### Part (c)
| Answer/Working | Marks | Guidance |
|---|---|---|
| Residual $= 104 - (5.12(20) + 7.42) =$ awrt $\mathbf{-5.8}$ | B1 | Correct use of model awrt $-5.8$ |
| **(1 mark)** | | |
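
The residual in Part (c) is the observed value minus the value predicted by the model. A short sketch, assuming the rounded coefficients from Part (b) are used as in the mark scheme (the function name `residual` is illustrative):

```python
def residual(x_obs, y_obs, a, b):
    """Observed y minus the value predicted by y = a + b*x."""
    return y_obs - (a + b * x_obs)


# Child with a 100 m time of 20 s and a 500 m time of 104 s,
# using the rounded model y = 5.12x + 7.42 from Part (b).
res = residual(20, 104, a=7.42, b=5.12)
print(f"residual = {res:.2f}")  # negative: the model overestimates this child's time
```

Using the unrounded coefficients instead gives a value that also rounds to $-5.8$, consistent with the "awrt" tolerance.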

### Part (d)
| Answer/Working | Marks | Guidance |
|---|---|---|
| Positive residuals show that the model underestimates the 500 m race times of the fastest and slowest children. | B1 | Correct contextual explanation mentioning positive residuals and underestimate o.e. |
| **(1 mark)** | | |
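
The pattern behind the Part (d) answer can be made explicit by counting the run of positive residuals at each end of the table. This is a sketch only; it assumes the table is ordered from the fastest child (first entry) to the slowest (last entry), and the helper `end_runs` is a hypothetical name.

```python
# Residual signs from the question, in order of 100 m finishing time
# (assumption: first entry = fastest child, last entry = slowest child).
signs = "++++--+--------+++++"


def end_runs(sign_string, symbol="+"):
    """Length of the unbroken run of `symbol` at the start and at the end."""
    leading = len(sign_string) - len(sign_string.lstrip(symbol))
    trailing = len(sign_string) - len(sign_string.rstrip(symbol))
    return leading, trailing


fastest_run, slowest_run = end_runs(signs)
print(fastest_run, slowest_run)  # 4 5
```

A positive residual means the observed time exceeds the predicted time, so the runs at both ends correspond to the model underestimating the 500 m times of the fastest and slowest children.
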
\begin{enumerate}
  \item Baako is investigating the times taken by children to run a 100 m race, $x$ seconds, and a 500 m race, $y$ seconds. For a sample of 20 children, Baako obtains the time taken by each child to run each race.
\end{enumerate}

Here are Baako's summary statistics.

$$\begin{gathered}
\mathrm { S } _ { x x } = 314.55 \quad \mathrm {~S} _ { y y } = 9026 \quad \mathrm {~S} _ { x y } = 1610 \\
\bar { x } = 19.65 \quad \bar { y } = 108
\end{gathered}$$

(a) Calculate the product moment correlation coefficient between the times taken to run the 100 m race and the times taken to run the 500 m race.\\
(b) Show that the equation of the regression line of $y$ on $x$ can be written as

$$y = 5.12 x + 7.42$$

where the gradient and $y$ intercept are given to 3 significant figures.

The child who completed the 100 m race in 20 seconds took 104 seconds to complete the 500 m race.\\
(c) Find the residual for this child.

The table below shows the signs of the residuals for the 20 children in order of finishing time for the 100 m race.

\begin{center}
\begin{tabular}{ | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | }
\hline
Sign of residual & + & + & + & + & - & - & + & - & - & - & - & - & - & - & - & + & + & + & + & + \\
\hline
\end{tabular}
\end{center}

(d) Explain what the signs of the residuals show about the model's predictions of the 500 m race times for the children who are fastest and slowest over the 100 m race.

\hfill \mbox{\textit{Edexcel FS2 2023 Q1 [7]}}