Edexcel FS2 Specimen — Question 6 12 marks

Exam Board: Edexcel
Module: FS2 (Further Statistics 2)
Session: Specimen
Marks: 12
Topic: Linear regression
Type: Calculate y on x from raw data table
Difficulty: Standard +0.3. This is a standard Further Statistics 2 regression question with provided summary statistics (Sxx, Smm, Sxm), requiring routine application of the formulas for the regression line, the RSS calculation, and residual analysis. Although it has multiple parts and involves a discussion of an outlier, every step follows textbook procedure and no novel problem-solving is required. Slightly easier than the average A-level question because the given summary statistics remove most of the computational burden.
Spec: 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context

  1. A random sample of 10 female pigs was taken. The number of piglets, \(x\), born to each female pig and their average weight at birth, \(m \mathrm {~kg}\), was recorded. The results were as follows:

| Number of piglets, \(x\) | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|
| Average weight at birth, \(m \mathrm{~kg}\) | 1.50 | 1.20 | 1.40 | 1.40 | 1.23 | 1.30 | 1.20 | 1.15 | 1.25 | 1.15 |

(You may use \(\mathrm{S}_{xx} = 82.5\) and \(\mathrm{S}_{mm} = 0.12756\) and \(\mathrm{S}_{xm} = -2.29\))

(a) Find the equation of the regression line of \(m\) on \(x\) in the form \(m = a + bx\) as a model for these results.

(b) Show that the residual sum of squares (RSS) is 0.064 to 3 decimal places.

(c) Calculate the residual values.

(d) Write down the outlier.

(e) (i) Comment on the validity of ignoring this outlier.

(ii) Ignoring the outlier, produce another model.

(iii) Use this model to estimate the average weight at birth if \(x = 15\)

(iv) Comment, giving a reason, on the reliability of your estimate.

**(a)**

| Answer | Marks | Guidance |
|---|---|---|
| $b = \frac{S_{xm}}{S_{xx}} = -0.0277576$ | M1 | Realising the need to use $b = \frac{S_{xm}}{S_{xx}}$ and $a = \bar{m} - b\bar{x}$ |
| $[a = \bar{m} - b\bar{x} = 1.278 + 0.0277576 \times 8.5 = 1.5139]$ $m = 1.5139 - 0.02775\ldots \times x$ | A1 | $m = (\text{awrt } 1.51) - (\text{awrt } 0.0278)\,x$. Award M1A1 for correct equation |
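
As a quick numerical check of part (a), the following minimal sketch recomputes the gradient and intercept from the quoted summary statistics and the means of the raw data. The function name `regression_from_summary` and the variable names are illustrative assumptions, not part of the mark scheme.

```python
# Raw data from the question
xs = list(range(4, 14))  # number of piglets, 4 to 13
ms = [1.50, 1.20, 1.40, 1.40, 1.23, 1.30, 1.20, 1.15, 1.25, 1.15]

# Summary statistics quoted in the question
S_XX = 82.5
S_XM = -2.29


def regression_from_summary(s_xm, s_xx, x_bar, m_bar):
    """Return (a, b) for the least squares line m = a + b x."""
    b = s_xm / s_xx          # gradient
    a = m_bar - b * x_bar    # intercept, since the line passes through the means
    return a, b


x_bar = sum(xs) / len(xs)    # 8.5
m_bar = sum(ms) / len(ms)    # 1.278
a, b = regression_from_summary(S_XM, S_XX, x_bar, m_bar)
print(f"m = {a:.4f} - {abs(b):.4f} x")  # prints: m = 1.5139 - 0.0278 x
```

The printed coefficients agree with the awrt values required for the A1 mark.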

**(b)**

| Answer | Marks | Guidance |
|---|---|---|
| $\text{RSS} = 0.12756 - \frac{(-2.29)^2}{82.5}$ | M1 | Using $S_{mm} - \frac{(S_{xm})^2}{S_{xx}}$ |
| $= 0.06399*$ | A1* | awrt 0.064 |
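
The formula in part (b) can be evaluated directly from the three summary statistics given in the question. The sketch below does only that; the helper name `rss_from_summary` is a hypothetical label chosen for illustration.

```python
def rss_from_summary(s_mm, s_xm, s_xx):
    """Residual sum of squares for the least squares line of m on x."""
    return s_mm - s_xm ** 2 / s_xx


rss = rss_from_summary(0.12756, -2.29, 82.5)
print(f"{rss:.3f}")  # prints: 0.064
```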

**(c)**

| Answer | Marks | Guidance |
|---|---|---|
| Using the model in part (a), i.e. $m = $ "1.5139" $-$ "0.02775"$x$ | M1 | Implied by a correct value |
| Residuals as listed in the table below | A1 | All correct. Award M1A1 for a list of correct residuals |

| $x$ | Observed $m$ | Fitted value $a + bx$ | Residual $\epsilon = m - (a + bx)$ |
|---|---|---|---|
| 4 | 1.50 | 1.4029 | +0.0971 |
| 5 | 1.20 | 1.3752 | −0.1752 |
| 6 | 1.40 | 1.3474 | +0.0526 |
| 7 | 1.40 | 1.3196 | +0.0804 |
| 8 | 1.23 | 1.2919 | −0.0619 |
| 9 | 1.30 | 1.2641 | +0.0359 |
| 10 | 1.20 | 1.2364 | −0.0364 |
| 11 | 1.15 | 1.2086 | −0.0586 |
| 12 | 1.25 | 1.1808 | +0.0692 |
| 13 | 1.15 | 1.1531 | −0.0031 |
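
The residual table can be reproduced from the raw data alone. The sketch below refits the line from scratch (so it does not rely on the rounded coefficients), lists each residual, and picks out the point with the largest absolute residual, which is the point named in part (d). Taking the largest absolute residual as the outlier criterion is an assumption made here for illustration; the mark scheme does not state a formal rule. The function name `fit_line` is likewise hypothetical.

```python
xs = list(range(4, 14))
ms = [1.50, 1.20, 1.40, 1.40, 1.23, 1.30, 1.20, 1.15, 1.25, 1.15]


def fit_line(xs, ms):
    """Least squares fit of m = a + b x, returning (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    m_bar = sum(ms) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xm = sum((x - x_bar) * (m - m_bar) for x, m in zip(xs, ms))
    b = s_xm / s_xx
    return m_bar - b * x_bar, b


a, b = fit_line(xs, ms)

# Residual = observed value minus fitted value
residuals = [m - (a + b * x) for x, m in zip(xs, ms)]
for x, m, e in zip(xs, ms, residuals):
    print(f"{x:2d}  {m:.2f}  {a + b * x:.4f}  {e:+.4f}")

# The squared residuals should sum to the RSS from part (b)
rss = sum(e ** 2 for e in residuals)

# Index of the point with the largest absolute residual
worst = max(range(len(xs)), key=lambda i: abs(residuals[i]))
print("largest |residual| at", (xs[worst], ms[worst]))
```

Two useful sanity checks are that the residuals sum to zero (up to rounding) and that their squares sum to approximately 0.064.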

**(d)**

| Answer | Marks | Guidance |
|---|---|---|
| The point (5, 1.2) is an outlier | B1ft | |

**(e)(i)**

| Answer | Marks | Guidance |
|---|---|---|
| It is a valid piece of data so should be used **or** It does not follow the pattern according to the residuals so may contain an error making the result invalid so should be removed | B1 | |

**(e)(ii)**

| Answer | Marks | Guidance |
|---|---|---|
| $a = \bar{m} - b\bar{x} = 1.28667 + 0.03765 \times 8.88889 = 1.6213$ | M1 | |
| $m = 1.6213 - 0.03765x$ | A1 | |
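
The mark scheme quotes the new gradient without showing where it comes from: once a point is removed, the summary statistics given in the question no longer apply and must be recomputed from the nine remaining points. The sketch below does this, assuming the point removed is (5, 1.20) as identified in part (d). Variable names are illustrative.

```python
xs = list(range(4, 14))
ms = [1.50, 1.20, 1.40, 1.40, 1.23, 1.30, 1.20, 1.15, 1.25, 1.15]

# Drop the outlier identified in part (d)
kept = [(x, m) for x, m in zip(xs, ms) if (x, m) != (5, 1.20)]
n = len(kept)                                   # 9 points remain

sum_x = sum(x for x, _ in kept)                 # 80
sum_m = sum(m for _, m in kept)                 # 11.58
sum_xx = sum(x * x for x, _ in kept)            # 780
sum_xm = sum(x * m for x, m in kept)            # 100.34

x_bar = sum_x / n
m_bar = sum_m / n
s_xx = sum_xx - sum_x ** 2 / n                  # about 68.889
s_xm = sum_xm - sum_x * sum_m / n               # about -2.5933

b = s_xm / s_xx
a = m_bar - b * x_bar
print(f"m = {a:.4f} - {abs(b):.5f} x")
```

The recomputed values, $S_{xx} \approx 68.889$ and $S_{xm} \approx -2.5933$, give $b \approx -0.03765$ and $a \approx 1.6213$, matching the model in the mark scheme.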

**(e)(iii)**

| Answer | Marks | Guidance |
|---|---|---|
| $m = 1.6213 - 0.03765 \times 15 = 1.0566$ (awrt 1.06) | B1ft | Using their model from (e)(ii): awrt 1.06, or follow through from their (e)(ii) |

**(e)(iv)**

| Answer | Marks | Guidance |
|---|---|---|
| The model is only reliable if the values are limited to those in the given range so probably not reliable | B1 | |
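
Parts (e)(iii) and (e)(iv) can be illustrated together: the estimate at $x = 15$ is a single substitution, and the reliability comment amounts to checking whether 15 lies inside the observed range of $x$ (4 to 13, which is unchanged by removing the point at $x = 5$). The sketch below uses the rounded coefficients quoted in (e)(ii); the function names `estimate` and `is_extrapolation` are hypothetical.

```python
# Coefficients of the model from (e)(ii), as quoted in the mark scheme
A, B = 1.6213, -0.03765

# Range of x values in the sample used to fit the model
X_MIN, X_MAX = 4, 13


def estimate(x):
    """Estimated average birth weight (kg) for a litter of x piglets."""
    return A + B * x


def is_extrapolation(x):
    """True when x lies outside the range of the data used to fit the model."""
    return not (X_MIN <= x <= X_MAX)


print(round(estimate(15), 2), is_extrapolation(15))  # prints: 1.06 True
```

Because $x = 15$ lies outside the observed range, the estimate is an extrapolation, which is the reason the mark scheme gives for treating it as probably unreliable.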

---
\begin{enumerate}
  \item A random sample of 10 female pigs was taken. The number of piglets, $x$, born to each female pig and their average weight at birth, $m \mathrm {~kg}$, was recorded. The results were as follows:
\end{enumerate}

\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | c | c | }
\hline
Number of piglets, $\boldsymbol { x }$ & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 \\
\hline
\begin{tabular}{ l }
Average weight at \\
birth, $\boldsymbol { m } \mathbf { ~ k g }$ \\
\end{tabular} & 1.50 & 1.20 & 1.40 & 1.40 & 1.23 & 1.30 & 1.20 & 1.15 & 1.25 & 1.15 \\
\hline
\end{tabular}
\end{center}

(You may use $\mathrm { S } _ { x x } = 82.5$ and $\mathrm { S } _ { m m } = 0.12756$ and $\mathrm { S } _ { x m } = - 2.29$ )\\
(a) Find the equation of the regression line of $m$ on $x$ in the form $m = a + b x$ as a model for these results.\\
(b) Show that the residual sum of squares (RSS) is 0.064 to 3 decimal places.\\
(c) Calculate the residual values.\\
(d) Write down the outlier.\\
(e) (i) Comment on the validity of ignoring this outlier.\\
(ii) Ignoring the outlier, produce another model.\\
(iii) Use this model to estimate the average weight at birth if $x = 15$\\
(iv) Comment, giving a reason, on the reliability of your estimate.

\hfill \mbox{\textit{Edexcel FS2  Q6 [12]}}