| Exam Board | Edexcel |
|---|---|
| Module | FS2 (Further Statistics 2) |
| Year | 2023 |
| Session | June |
| Marks | 7 |
| Topic | Linear regression |
| Type | Calculate PMCC from summary statistics |
| Difficulty | Easy (-1.2). This is a straightforward application of the standard formulas for the PMCC, the regression line, and residuals. All summary statistics are provided, so parts (a) to (c) need only direct substitution into memorized formulas ($r = S_{xy}/\sqrt{S_{xx} S_{yy}}$, $b = S_{xy}/S_{xx}$, $a = \bar{y} - b\bar{x}$). Part (d) requires a basic interpretation of the residual signs. No problem-solving or novel insight is needed; the question is routine calculation and standard interpretation, which makes it easier than the average A-level question. |
| Spec | 5.08a Pearson correlation: calculate PMCC; 5.09c Calculate regression line |
## Question 1:
### Part (a)
| Answer/Working | Marks | Guidance |
|---|---|---|
| $r = \dfrac{1610}{\sqrt{314.55 \times 9026}}$ | M1 | Correct expression for $r$ |
| $= 0.9555\ldots$ awrt $\mathbf{0.956}$ | A1 | awrt **0.956** |
| **(2 marks)** | | |
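
The substitution in part (a) can be checked numerically. The following is a minimal Python sketch, not part of the mark scheme; the variable names are illustrative, and the values are the summary statistics given in the question.

```python
import math

# Summary statistics given in the question
S_xx = 314.55
S_yy = 9026
S_xy = 1610

# Product moment correlation coefficient: r = S_xy / sqrt(S_xx * S_yy)
r = S_xy / math.sqrt(S_xx * S_yy)

print(round(r, 3))  # 0.956
```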
### Part (b)
| Answer/Working | Marks | Guidance |
|---|---|---|
| $b = \dfrac{1610}{314.55}$ | M1 | Correct expression for gradient, or correct formula stated with $b$ to more than 3 s.f. |
| $a = \bar{y} - b\bar{x} = 108 - \text{'}b\text{'} \times 19.65$ | M1 | Correct expression, or correct formula stated with $a$ to more than 3 s.f., follow through their $b$ |
| $y = 5.12x + 7.42$ \* | A1\* | Correct linear model with $a$ = awrt **7.42** and $b$ = awrt **5.12** |
| **(3 marks)** | | |
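
As a check on part (b), the gradient and intercept can be computed to full precision and then rounded to 3 significant figures. This is a sketch only; the variable names are illustrative and the inputs are the question's summary statistics.

```python
# Summary statistics given in the question
S_xx = 314.55
S_xy = 1610
x_bar = 19.65
y_bar = 108

# Least squares regression line of y on x: y = a + b x
b = S_xy / S_xx          # gradient
a = y_bar - b * x_bar    # intercept, using the unrounded gradient

# Format both coefficients to 3 significant figures
print(f"y = {b:.3g}x + {a:.3g}")  # y = 5.12x + 7.42
```

Using the unrounded value of `b` when computing `a` mirrors the mark scheme's requirement to show values to more than 3 s.f. before quoting the given answer.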
### Part (c)
| Answer/Working | Marks | Guidance |
|---|---|---|
| Residual $= 104 - (5.12(20) + 7.42) =$ awrt $\mathbf{-5.8}$ | B1 | Correct use of model awrt $-5.8$ |
| **(1 mark)** | | |
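
The residual in part (c) is the observed value minus the value predicted by the model. A minimal Python sketch of that calculation follows, using the rounded model from part (b); the variable names are illustrative.

```python
# Observed times for the child in part (c)
x_obs = 20    # 100 m time in seconds
y_obs = 104   # 500 m time in seconds

# Prediction from the model y = 5.12x + 7.42
y_pred = 5.12 * x_obs + 7.42

# Residual = observed - predicted
residual = y_obs - y_pred

print(round(residual, 2))  # -5.82
```

The negative sign means the model overestimates this child's 500 m time.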
### Part (d)
| Answer/Working | Marks | Guidance |
|---|---|---|
| The positive residuals at both ends of the table show that the model underestimates the 500 m race times of both the fastest and the slowest children. | B1 | Correct contextual explanation mentioning positive residuals and underestimate o.e. |
| **(1 mark)** | | |
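
The pattern behind part (d) can be made explicit by inspecting the signs at each end of the table. In this sketch the signs are copied from the question. Taking the first four and last five children as "fastest" and "slowest" is an illustrative assumption, chosen because those are the unbroken runs of positive signs at each end.

```python
# Signs of the residuals, ordered by 100 m finishing time (fastest first)
signs = "+ + + + - - + - - - - - - - - + + + + +".split()

# Illustrative choice of groups: the runs of '+' at each end of the table
fastest = signs[:4]
slowest = signs[-5:]

# residual = observed - predicted, so '+' means observed > predicted,
# i.e. the model underestimates the 500 m time
underestimates_fastest = all(s == "+" for s in fastest)
underestimates_slowest = all(s == "+" for s in slowest)

print(underestimates_fastest, underestimates_slowest)  # True True
```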
\begin{enumerate}
\item Baako is investigating the times taken by children to run a 100 m race, $x$ seconds, and a 500 m race, $y$ seconds. For a sample of 20 children, Baako obtains the time taken by each child to run each race.
\end{enumerate}
Here are Baako's summary statistics.
$$\begin{gathered}
\mathrm{S}_{xx} = 314.55 \quad \mathrm{S}_{yy} = 9026 \quad \mathrm{S}_{xy} = 1610 \\
\bar{x} = 19.65 \quad \bar{y} = 108
\end{gathered}$$
(a) Calculate the product moment correlation coefficient between the times taken to run the 100 m race and the times taken to run the 500 m race.\\
(b) Show that the equation of the regression line of $y$ on $x$ can be written as
$$y = 5.12 x + 7.42$$
where the gradient and $y$ intercept are given to 3 significant figures.
The child who completed the 100 m race in 20 seconds took 104 seconds to complete the 500 m race.\\
(c) Find the residual for this child.
The table below shows the signs of the residuals for the 20 children in order of finishing time for the 100 m race.
\begin{center}
\begin{tabular}{ | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | l | }
\hline
Sign of residual & + & + & + & + & - & - & + & - & - & - & - & - & - & - & - & + & + & + & + & + \\
\hline
\end{tabular}
\end{center}
(d) Explain what the signs of the residuals show about the model's predictions of the 500 m race times for the children who are fastest and slowest over the 100 m race.
\hfill \mbox{\textit{Edexcel FS2 2023 Q1 [7]}}