| Exam Board | Edexcel |
|---|---|
| Module | FS2 (Further Statistics 2) |
| Year | 2024 |
| Session | June |
| Marks | 9 |
| Topic | Linear regression |
| Type | Convert regression equation between coded and original |
| Difficulty | Standard +0.3. This is a straightforward application of standard regression formulas with coding. Part (a) requires converting between coded and original variables using given summary statistics, part (b) is a simple residual calculation, part (c) uses $RSS = S_{yy} - b^2 S_{xx}$, and part (d) is a direct comparison. All steps are routine textbook procedures requiring no novel insight, making the question slightly easier than average. |
| Spec | 5.09a Dependent/independent variables; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
# Question 1:
## Part (a)
| Working/Answer | Mark | Guidance |
|---|---|---|
| $b = \frac{-338.83}{4.52}\ [= -74.96...]$ | M1 | For use of a correct model, i.e. a correct expression for $b$ |
| $a = \frac{626}{10} - "b"\frac{22.47}{10}\ [= 231.04...]$ | M1 | For use of a correct model, i.e. a correct expression (ft) for $a$ |
| $t - 40 = \text{"231.04..."} + (\text{"}-74.96...\text{"})\sqrt{h}$ | dM1 | Dependent on both previous M marks; proceeding from an equation of the form $v = "a"+"b"w$ to an unsimplified model in terms of $h$ and $t$, ft their $a$ and $b$ |
| $t = 271.04... - 74.96...\sqrt{h}$ | A1 | Correct model $t = 271.04... - 74.96...\sqrt{h}$, awrt 271 and awrt 75 |
| **(4 marks)** | | |
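
As a numerical check on the working above, the following Python sketch recomputes the coded gradient and intercept from the summary statistics given in the question, then undoes the coding $v = t - 40$, $w = \sqrt{h}$. The variable names are illustrative and are not part of the mark scheme.

```python
# Summary statistics from the question (coded variables: v = t - 40, w = sqrt(h))
n = 10
sum_v = 626
sum_w = 22.47
S_ww = 4.52
S_vw = -338.83

# Gradient and intercept of the regression line of v on w
b = S_vw / S_ww
a_coded = sum_v / n - b * sum_w / n

# Undo the coding: t - 40 = a_coded + b*sqrt(h)  =>  t = (a_coded + 40) + b*sqrt(h)
a = a_coded + 40

print(f"v = {a_coded:.2f} + ({b:.2f}) w")
print(f"t = {a:.2f} + ({b:.2f}) sqrt(h)")
```

Only the intercept changes when the coding is undone, because $w = \sqrt{h}$ leaves the explanatory variable unscaled and $v = t - 40$ is a pure shift of the response.
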
## Part (b)
| Working/Answer | Mark | Guidance |
|---|---|---|
| $\text{Residual} = 47 - \left(\text{"271.04..."} - \text{"74.96..."} \times \sqrt{9}\right)$ | M1 | Uses $h=9$ in a correct expression for the residual with a model of the form $t = a + b\sqrt{h}$, or substitutes 9 into a correct expression. The subtraction must be the correct way round (observed minus fitted). |
| $= 0.8466...$ | A1 | awrt 0.85; allow answers in range awrt 0.84 to awrt 0.86 if working shown following a correct model in (a). If no working shown then look for awrt 0.847 |
| **(2 marks)** | | |
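
The residual calculation can be reproduced with a short Python sketch. It recomputes the model coefficients from the question's summary statistics so that the block stands alone; the variable names are illustrative assumptions.

```python
import math

# Model t = a + b*sqrt(h), recomputed from the summary statistics in the question
b = -338.83 / 4.52
a = 626 / 10 - b * 22.47 / 10 + 40

# Observed data point from the question: h = 9 cm, t = 47 s
h_obs, t_obs = 9, 47

t_fitted = a + b * math.sqrt(h_obs)
residual = t_obs - t_fitted  # observed value minus fitted value

print(round(residual, 2))
```

Using the unrounded coefficients matters here: substituting the rounded values 271 and 75 would give $47 - (271 - 225) = 1$, which falls outside the accepted range.
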
## Part (c)
| Working/Answer | Mark | Guidance |
|---|---|---|
| $RSS = \left[64678 - \frac{626^2}{10}\right] - \frac{(-338.83)^2}{4.52}\ \left(= 25490.4 - \frac{(-338.83)^2}{4.52}\right)$ | M1 | For a correct expression for RSS. $25490.4$ may be seen as $\frac{127452}{5}$. May also use $RSS = S_{vv}(1-r^2) = 25490.4\!\left(1 - \frac{(-338.83)^2}{4.52 \times 25490.4}\right)$ |
| $= 90.89...\ [s^2]$ | A1 | awrt 90.9 |
| **(2 marks)** | | |
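
Both routes to the RSS mentioned in the guidance can be checked numerically. The sketch below uses only the summary statistics from the question; the variable names are illustrative.

```python
# Summary statistics from the question (coded variables v and w)
n = 10
sum_v = 626
sum_v2 = 64678
S_ww = 4.52
S_vw = -338.83

# S_vv from the raw sums
S_vv = sum_v2 - sum_v**2 / n

# Main route: RSS = S_vv - (S_vw)^2 / S_ww
rss = S_vv - S_vw**2 / S_ww

# Alternative route: RSS = S_vv * (1 - r^2), with r^2 = (S_vw)^2 / (S_ww * S_vv)
r_squared = S_vw**2 / (S_ww * S_vv)
rss_alt = S_vv * (1 - r_squared)

print(f"S_vv = {S_vv:.1f}, RSS = {rss:.2f}, RSS (via r^2) = {rss_alt:.2f}")
```

The two expressions are algebraically identical, so any difference between `rss` and `rss_alt` is floating-point rounding only.
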
## Part (d)
| Working/Answer | Mark | Guidance |
|---|---|---|
| Student $A$'s model, as the sum of squares of the residuals is lower | B1 | Explaining a reason **for their conclusion that A is a more suitable model**, provided their positive RSS found in (c) is less than 980, e.g. RSS is smaller so model $A$. Condone references to the model being more accurate. The comparison with $B$ must be stated or implied; do not accept statements such as "$A$ because it has a small RSS" |
| **(1 mark)** | | |
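
The comparison can be made explicit with a minimal Python sketch, assuming the RSS for Student $A$ is recomputed as in part (c) and the RSS for Student $B$ is taken as the value 980 quoted in the question. The names `rss_A`, `rss_B` and `better` are illustrative.

```python
# RSS for Student A's model, recomputed from the summary statistics in the question
S_vv = 64678 - 626**2 / 10
rss_A = S_vv - (-338.83) ** 2 / 4.52

# RSS for Student B's model, as quoted in the question (to 3 significant figures)
rss_B = 980

# The model with the smaller residual sum of squares fits these data more closely
better = "A" if rss_A < rss_B else "B"
print(f"RSS_A = {rss_A:.1f}, RSS_B = {rss_B}: Student {better}'s model has the smaller RSS")
```

Comparing the two values directly is reasonable here because both models are fitted to the same 10 observations of $t$, and the question states that the RSS is unchanged by the coding.
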
**Total: 9 marks**
\begin{enumerate}
\item Two students are experimenting with some water in a plastic bottle. The bottle is filled with water and a hole is put in the bottom of the bottle. The students record the time, $t$ seconds, it takes for the water level to fall to each of 10 given values of the height, $h \mathrm {~cm}$, above the hole.
\end{enumerate}
Student $A$ models the data with an equation of the form $t = a + b \sqrt { h }$\\
The data is coded using $v = t - 40$ and $w = \sqrt { h }$ and the following information is obtained.
$$\sum v = 626 \quad \sum v ^ { 2 } = 64678 \quad \sum w = 22.47 \quad \mathrm {~S} _ { w w } = 4.52 \quad \mathrm {~S} _ { v w } = - 338.83$$
(a) Find the equation of the regression line of $t$ on $\sqrt { h }$ in the form $t = a + b \sqrt { h }$
The time it takes the water level to fall to a height of 9 cm above the hole is 47 seconds.\\
(b) Calculate the residual for this data point.
Give your answer to 2 decimal places.
Given that the residual sum of squares (RSS) for the model of $t$ on $\sqrt { h }$ is the same as the RSS for the model of $v$ on $w$,\\
(c) calculate the RSS for these 10 data points.
Student $B$ models the data with an equation of the form $t = c + d h$\\
The regression line of $t$ on $h$ is calculated and the residual sum of squares (RSS) is found to be 980 to 3 significant figures.\\
(d) With reference to part (c) state, giving a reason, whether Student B's model or Student A's model is the more suitable for these data.
\hfill \mbox{\textit{Edexcel FS2 2024 Q1 [9]}}