| Exam Board | Edexcel |
|---|---|
| Module | FS2 (Further Statistics 2) |
| Session | Specimen |
| Marks | 12 |
| Topic | Linear regression |
| Type | Calculate y on x from raw data table |
| Difficulty | Standard +0.3. A standard Further Statistics 2 regression question with the summary statistics ($S_{xx}$, $S_{mm}$, $S_{xm}$) provided. It requires routine application of the formulas for the regression line and the RSS, followed by residual analysis. Although it has several parts and includes a discussion of an outlier, every step follows a textbook procedure and needs no novel problem-solving. It is slightly easier than the average A-level question because the given summary statistics remove most of the computational burden. |
| Spec | 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context |
**(a)**
| Answer | Marks | Guidance |
|---|---|---|
| $b = \frac{S_{xm}}{S_{xx}} = -0.0277576$ | M1 | Realising the need to use $b = \frac{S_{xm}}{S_{xx}}$ and $a = \bar{m} - b\bar{x}$ |
| $[a = \bar{m} - b\bar{x} = 1.278 + 0.0277576 \times 8.5 = 1.5139]$ $m = 1.5139 - 0.02775\ldots \times x$ | A1 | $m = $ (awrt 1.51) – (awrt 0.0278)$x$. Award M1A1 for correct equation |
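
The arithmetic in part (a) can be checked with a short Python sketch. The variable names are illustrative and not part of the mark scheme; the means $\bar{x} = 8.5$ and $\bar{m} = 1.278$ are the values used in the working above.

```python
# Summary statistics quoted in the question
S_xx = 82.5
S_xm = -2.29

# Means used in the mark scheme working
x_bar = 8.5    # mean of x = 4, 5, ..., 13
m_bar = 1.278  # mean of the ten average weights

b = S_xm / S_xx        # gradient of the regression line of m on x
a = m_bar - b * x_bar  # intercept

print(f"b = {b:.7f}")
print(f"a = {a:.4f}")
```
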
**(b)**
| Answer | Marks | Guidance |
|---|---|---|
| $\text{RSS} = 0.12756 - \frac{(-2.29)^2}{82.5}$ | M1 | Using $S_{mm} - \frac{(S_{xm})^2}{S_{xx}}$ |
| $= 0.06399*$ | A1* | awrt 0.064 |
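
As a quick check of part (b), the following sketch evaluates $S_{mm} - \frac{(S_{xm})^2}{S_{xx}}$ directly from the given summary statistics. The name `rss` is chosen here for illustration.

```python
# Summary statistics quoted in the question
S_xx, S_mm, S_xm = 82.5, 0.12756, -2.29

# Residual sum of squares for the least squares line of m on x
rss = S_mm - S_xm**2 / S_xx

print(round(rss, 5))
print(round(rss, 3))
```
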
**(c)**
| Answer | Marks | Guidance |
|---|---|---|
| Using the model in part (a), i.e. $m = $ ("1.5139" – "0.02775"$x$), implied by a correct value | M1 | All correct. Award M1A1 for a list of correct residuals |
| | A1 | |

| $x$ | $m$ | Fitted value $\hat{m} = a + bx$ | Residual $\epsilon = m - \hat{m}$ |
|---|---|---|---|
| 4 | 1.50 | 1.4029 | +0.0971 |
| 5 | 1.20 | 1.3752 | −0.1752 |
| 6 | 1.40 | 1.3474 | +0.0526 |
| 7 | 1.40 | 1.3196 | +0.0804 |
| 8 | 1.23 | 1.2919 | −0.0619 |
| 9 | 1.30 | 1.2641 | +0.0359 |
| 10 | 1.20 | 1.2364 | −0.0364 |
| 11 | 1.15 | 1.2086 | −0.0586 |
| 12 | 1.25 | 1.1808 | +0.0692 |
| 13 | 1.15 | 1.1531 | −0.0031 |
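
The residuals in the table can be reproduced from the raw data. This is a minimal sketch that recomputes the regression line rather than reusing rounded coefficients, so the printed values may differ from the table in the fourth decimal place. All names are illustrative.

```python
# Raw data from the question
x = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
m = [1.50, 1.20, 1.40, 1.40, 1.23, 1.30, 1.20, 1.15, 1.25, 1.15]

n = len(x)
x_bar = sum(x) / n
m_bar = sum(m) / n

# Summary statistics computed from the data
S_xx = sum((xi - x_bar) ** 2 for xi in x)
S_xm = sum((xi - x_bar) * (mi - m_bar) for xi, mi in zip(x, m))

b = S_xm / S_xx
a = m_bar - b * x_bar

# Residual = observed value minus fitted value
fitted = [a + b * xi for xi in x]
residuals = [mi - fi for mi, fi in zip(m, fitted)]

for xi, mi, fi, ri in zip(x, m, fitted, residuals):
    print(f"{xi:>2}  {mi:.2f}  {fi:.4f}  {ri:+.4f}")

# The squared residuals should sum to the RSS from part (b)
rss = sum(r ** 2 for r in residuals)
print(f"RSS = {rss:.3f}")
```

The residuals of a least squares line with an intercept sum to zero, which gives a useful check on a hand calculation.
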
**(d)**
| Answer | Marks | Guidance |
|---|---|---|
| The point (5, 1.2) is an outlier | B1ft | |
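
One way to pick out the outlier from the residuals in part (c) is to look for the largest residual in magnitude. That criterion is an assumption made for this sketch; the mark scheme only states which point is the outlier.

```python
# Residuals from part (c), keyed by the number of piglets x
residuals = {
    4: 0.0971, 5: -0.1752, 6: 0.0526, 7: 0.0804, 8: -0.0619,
    9: 0.0359, 10: -0.0364, 11: -0.0586, 12: 0.0692, 13: -0.0031,
}

# Assumed criterion: the point whose residual is largest in absolute value
x_outlier = max(residuals, key=lambda k: abs(residuals[k]))

print(x_outlier, residuals[x_outlier])
```
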
**(e)(i)**
| Answer | Marks | Guidance |
|---|---|---|
| It is a valid piece of data, so it should be used **or** It does not follow the pattern shown by the other residuals, so it may contain an error that makes the result invalid and should be removed | B1 | |
**(e)(ii)**
| Answer | Marks | Guidance |
|---|---|---|
| $a = \bar{m} - b\bar{x} = 1.28667 + 0.03765 \times 8.88889 = 1.6213$ $m = 1.6213 - 0.03765x$ | M1 | |
| | A1 | |
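
The second model can be checked by dropping the point (5, 1.20) and refitting on the remaining nine points. This is a sketch under that assumption, with illustrative names.

```python
# Raw data from the question
x = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
m = [1.50, 1.20, 1.40, 1.40, 1.23, 1.30, 1.20, 1.15, 1.25, 1.15]

# Remove the outlier identified in part (d): the point with x = 5
pairs = [(xi, mi) for xi, mi in zip(x, m) if xi != 5]
xs = [p[0] for p in pairs]
ms = [p[1] for p in pairs]

n = len(xs)
x_bar = sum(xs) / n
m_bar = sum(ms) / n

S_xx = sum((xi - x_bar) ** 2 for xi in xs)
S_xm = sum((xi - x_bar) * (mi - m_bar) for xi, mi in zip(xs, ms))

b = S_xm / S_xx
a = m_bar - b * x_bar

print(f"x_bar = {x_bar:.5f}, m_bar = {m_bar:.5f}")
print(f"m = {a:.4f} + ({b:.5f}) x")
```
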
**(e)(iii)**
| Answer | Marks | Guidance |
|---|---|---|
| $m = 1.6213 - 0.03765 \times 15 = 1.056$ or awrt 1.06 | B1ft | Using their model from (e)(ii): awrt 1.06, or a correct follow-through value from their model |
**(e)(iv)**
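| Answer | Marks | Guidance |
|---|---|---|
| The model is only reliable for values of $x$ within the range of the data (4 to 13). Since $x = 15$ lies outside this range, the estimate is probably not reliable | B1 | |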
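
Parts (e)(iii) and (e)(iv) can be illustrated together: evaluate the second model at $x = 15$, then check whether that value lies inside the observed range of $x$. The coefficients are the rounded values from (e)(ii), and the variable names are illustrative.

```python
# Rounded coefficients of the model from part (e)(ii)
a, b = 1.6213, -0.03765

# Range of x in the sample (the outlier at x = 5 lies inside it)
x_min, x_max = 4, 13

x_new = 15
estimate = a + b * x_new

# An estimate is an interpolation only if x_new is within the data range
within_range = x_min <= x_new <= x_max

print(f"estimate = {estimate:.2f} kg")
print("interpolation" if within_range else "extrapolation")
```
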
---
\begin{enumerate}
\item A random sample of 10 female pigs was taken. The number of piglets, $x$, born to each female pig and their average weight at birth, $m \mathrm {~kg}$, were recorded. The results were as follows:
\end{enumerate}
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | c | c | }
\hline
Number of piglets, $\boldsymbol { x }$ & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 \\
\hline
\begin{tabular}{ l }
Average weight at \\
birth, $\boldsymbol { m } \mathbf { ~ k g }$ \\
\end{tabular} & 1.50 & 1.20 & 1.40 & 1.40 & 1.23 & 1.30 & 1.20 & 1.15 & 1.25 & 1.15 \\
\hline
\end{tabular}
\end{center}
(You may use $S_{xx} = 82.5$, $S_{mm} = 0.12756$ and $S_{xm} = -2.29$)\\
(a) Find the equation of the regression line of $m$ on $x$ in the form $m = a + b x$ as a model for these results.\\
(b) Show that the residual sum of squares (RSS) is 0.064 to 3 decimal places.\\
(c) Calculate the residual values.\\
(d) Write down the outlier.\\
(e) (i) Comment on the validity of ignoring this outlier.\\
(ii) Ignoring the outlier, produce another model.\\
(iii) Use this model to estimate the average weight at birth if $x = 15$\\
(iv) Comment, giving a reason, on the reliability of your estimate.
\hfill \mbox{\textit{Edexcel FS2 Q6 [12]}}