| Exam Board | Edexcel |
|---|---|
| Module | FS2 AS (Further Statistics 2 AS) |
| Year | 2023 |
| Session | June |
| Marks | 10 |
| Topic | Linear regression |
| Type | Calculate PMCC from summary statistics |
| Difficulty | Standard +0.3. This is a standard Further Statistics question testing routine PMCC and regression calculations from summary statistics using well-rehearsed formulas (S_vv, r = S_hv/√(S_hh·S_vv), regression equation). Part (d) tests conceptual understanding that residuals sum to zero, and part (e) requires simple algebraic manipulation. All parts follow textbook procedures with no novel problem-solving required, making it slightly easier than average even for Further Maths. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
## Question 3:
### Part 3(a):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $r = \frac{5.1376}{\sqrt{"831.132"\times 0.0487}}$ | M1 | 1.1b – complete correct method to find $r$; correct expressions for $S_{vv}$ and $r$ |
| $r = 0.80753\ldots$ awrt 0.808 | A1 | 1.1b |
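
The mark scheme quotes $S_{vv}$ as "831.132" without showing where it comes from. A minimal sketch of the calculation, assuming the standard formula $S_{vv} = \sum v^2 - (\sum v)^2 / n$ with $n = 9$; the variable names are illustrative and not part of the mark scheme:

```python
import math

# Summary statistics quoted in the question (n = 9 players).
n = 9
sum_v = 2174.9
sum_v2 = 526407.8
S_hh = 0.0487
S_hv = 5.1376

# S_vv = sum(v^2) - (sum v)^2 / n
S_vv = sum_v2 - sum_v**2 / n

# PMCC: r = S_hv / sqrt(S_hh * S_vv)
r = S_hv / math.sqrt(S_hh * S_vv)

print(f"S_vv = {S_vv:.3f}, r = {r:.3f}")
```

Rounded to three decimal places, the values should agree with the 831.132 and awrt 0.808 given in the table.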
### Part 3(b):
| Answer/Working | Mark | Guidance |
|---|---|---|
| [Positive] correlation reasonably close to 1 is consistent with a linear relationship, or strong (oe e.g. "high") positive correlation, so is consistent | B1ft | 2.4 – ft their answer to part (a). For a correct reason. If $r < 0.58$ allow "weak" correlation or correlation is close to 0 so is **not** consistent |
### Part 3(c):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $b = \frac{5.1376}{0.0487}\ [= 105.49\ldots]$ | M1 | 3.3 – correct model, correct expression for $b$ (or 105 or better) |
| $a = \frac{2174.9}{9} - \text{"105.49..."}\times\frac{17.63}{9}$ or $a = 241.655\ldots - \text{"105.49..."}\times 1.958\ldots$ | M1 | 1.1b – correct model, correct expression (ft) for $a$ (or 35 or better) |
| $v = 105.5h + 35.0$ | A1 | 1.1b – $b = 105.5$ or awrt 105 **and** $a =$ awrt 35.0 [condone $a = 35$ but not awrt 35] |
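
As a check on the two method marks, the gradient and intercept can be recomputed from the summary statistics. This is a sketch only, using $b = S_{hv}/S_{hh}$ and $a = \bar{v} - b\bar{h}$ as in the table; the variable names are assumptions made for illustration:

```python
# Summary statistics quoted in the question (n = 9 players).
n = 9
sum_h = 17.63
sum_v = 2174.9
S_hh = 0.0487
S_hv = 5.1376

# Gradient of the regression line of v on h.
b = S_hv / S_hh

# Intercept: the line passes through the mean point (h_bar, v_bar).
a = sum_v / n - b * sum_h / n

print(f"v = {a:.1f} + {b:.1f}h")
```

Using the unrounded value of $b$ when computing $a$ avoids carrying a rounding error into the intercept.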
### Part 3(d):
| Answer/Working | Mark | Guidance |
|---|---|---|
| The sum of the residuals should be zero | B1 | 2.4 – correct explanation |
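
The mark scheme states the result without justification. A short derivation, assuming the least-squares line $v = a + bh$ with $a = \bar{v} - b\bar{h}$ and writing $e_i$ (a symbol introduced here) for the residual of player $i$:

$$\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} \bigl(v_i - a - b h_i\bigr) = \sum v - na - b\sum h = n\bigl(\bar{v} - a - b\bar{h}\bigr) = 0$$

A reported sum of 1.04 therefore cannot be correct, whatever the individual data values are.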
### Part 3(e):
| Answer/Working | Mark | Guidance |
|---|---|---|
| Residual $= 2.27 - 1.04\ [= 1.23]$ | M1 | 3.1b – correct method to calculate the actual residual |
| $v = \text{"105.5"}\times 1.96 + \text{"35.0"} + \text{"1.23"}$ | M1 | 3.4 – using the model to estimate the speed. Must be $\hat{v} +$ their actual residual (using 2.27…) |
| $= 243.01$ awrt 243 | A1ft | 1.1b – for awrt 243 or ft their values of $a$ and $b$ |
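
The reasoning behind part (e) is that the true residuals must sum to zero, so the excess of 1.04 is attributed entirely to the one misread residual. A sketch of the arithmetic, assuming the rounded coefficients from part (c); the variable names are illustrative:

```python
# Rounded regression coefficients from part (c): v = a + b*h.
a, b = 35.0, 105.5

reported_sum = 1.04      # Pat's (incorrect) sum of residuals
misread_residual = 2.27  # the value Pat used for the player of height 1.96 m
h = 1.96

# The true residuals sum to zero, so the misread value overstates
# the true residual by exactly the reported sum.
true_residual = misread_residual - reported_sum

# Observed speed = fitted value + residual.
fitted_v = a + b * h
observed_v = fitted_v + true_residual

print(f"residual = {true_residual:.2f}, v = {observed_v:.2f} km/h")
```
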
---
\begin{enumerate}
\item Pat is investigating the relationship between the height of professional tennis players and the speed of their serve. Data from 9 randomly selected professional male tennis players were collected. The variables recorded were the height of each player, $h$ metres, and the maximum speed of their serve, $v \mathrm {~km} / \mathrm { h }$.
\end{enumerate}
Pat summarised these data as follows
$$\sum h = 17.63 \quad \sum v = 2174.9 \quad \sum v ^ { 2 } = 526407.8 \quad S _ { h h } = 0.0487 \quad S _ { h v } = 5.1376$$
(a) Calculate the product moment correlation coefficient between $h$ and $v$\\
(b) Explain whether the answer to part (a) is consistent with a linear model for these data.\\
(c) Find the equation of the regression line of $v$ on $h$ in the form $v = a + b h$ where $a$ and $b$ are to be given to one decimal place.
Pat calculated the sum of the residuals for the 9 tennis players as 1.04\\
(d) Without doing a calculation, explain how you know Pat has made a mistake.
Pat made one mistake in the calculation. For the tennis player of height 1.96 m Pat misread the residual as 2.27\\
(e) Find the maximum speed of serve, in km/h, for the tennis player of height 1.96 m
\hfill \mbox{\textit{Edexcel FS2 AS 2023 Q3 [10]}}