| Exam Board | Edexcel |
|---|---|
| Module | FS2 (Further Statistics 2) |
| Year | 2021 |
| Session | June |
| Marks | 10 |
| Topic | Linear regression |
| Type | Calculate PMCC from summary statistics |
| Difficulty | Standard +0.3. This is a straightforward Further Statistics 2 question testing standard regression calculations from summary statistics. Parts (a)-(d) involve routine procedures: interpreting the PMCC, deriving the regression equation from given statistics, unit conversion, and RSS calculation. Part (e) requires calculating one residual's contribution to the RSS. While it is Further Maths content (inherently harder), these are textbook exercises with clear methods and no novel problem-solving required. Slightly easier than average A-level difficulty overall. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression |
# Question 4:
## Part (a):
| Working | Mark | Guidance |
|---------|------|----------|
| As **elevation** increases, **temperature** decreases | B1 | Correct contextual interpretation |
## Part (b):
| Working | Mark | Guidance |
|---------|------|----------|
| $S_{xt} = -0.959\sqrt{8\,820\,655 \times 444.7} = -60\,062.38727$ | M1 | Using pmcc to find $S_{xt}$ |
| $b = \frac{-60\,062...}{8\,820\,655} = -0.006809...$ | M1 | Setting up linear model by attempting to find $b$; allow M2 for $b = r\sqrt{\frac{S_{tt}}{S_{xx}}}$ |
| $a = \frac{94.62}{20} - \text{'}b\text{'}\,\frac{28\,130}{20} = 14.308...$ | M1 | Setting up linear model by attempting to find $a$ |
| $t = 14.3 - 0.00681x$ | A1cso* | Correct model with $a =$ awrt 14.3 and $b =$ awrt $-0.00681$ |
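The part (b) working can be checked numerically. The following is a minimal Python sketch, not part of the mark scheme; the function name `regression_from_summary` and its parameter names are illustrative assumptions.

```python
import math

def regression_from_summary(r, s_xx, s_tt, sum_x, sum_t, n):
    """Return (a, b) for the least-squares line t = a + b*x from summary statistics."""
    # S_xt follows from the definition r = S_xt / sqrt(S_xx * S_tt)
    s_xt = r * math.sqrt(s_xx * s_tt)
    # Gradient of the regression line of t on x
    b = s_xt / s_xx
    # Intercept: the line passes through the point of means (x-bar, t-bar)
    a = sum_t / n - b * sum_x / n
    return a, b

# Summary statistics quoted in the question
a, b = regression_from_summary(-0.959, 8_820_655, 444.7, 28_130, 94.62, 20)
print(f"a = {a:.3g}, b = {b:.3g}")
```

To three significant figures this should agree with the printed model $t = 14.3 - 0.00681x$.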
## Part (c):
| Working | Mark | Guidance |
|---------|------|----------|
| $\left[w = \frac{x}{1000} \rightarrow\right]\; t = 14.3 - 6.81w$ | B1 | Correct model |
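The step in square brackets can be written out in full (a worked line added for clarity, not part of the mark scheme). Since $w$ is the elevation in kilometres, $x = 1000w$, and substituting into the part (b) model gives

$$t = 14.3 - 0.00681(1000w) = 14.3 - 6.81w$$

so the intercept is unchanged and the gradient is multiplied by 1000.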
## Part (d):
| Working | Mark | Guidance |
|---------|------|----------|
| $444.7(1-(-0.959)^2)$ or $444.7 - \frac{(-60\,062...)^2}{8\,820\,655} = 35.7^*$ | B1cso* | Either correct expression |
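Both accepted expressions for the RSS can be compared with a short Python sketch. This is an illustrative check only; the variable names are assumptions and do not come from the mark scheme.

```python
import math

# Summary statistics quoted in the question
r, s_xx, s_tt = -0.959, 8_820_655, 444.7

# Route 1: RSS = S_tt * (1 - r^2)
rss_from_r = s_tt * (1 - r**2)

# Route 2: RSS = S_tt - S_xt^2 / S_xx, with S_xt recovered from r
s_xt = r * math.sqrt(s_xx * s_tt)
rss_from_sxt = s_tt - s_xt**2 / s_xx

print(round(rss_from_r, 1), round(rss_from_sxt, 1))
```

The two routes are algebraically identical, so any difference between them is floating-point rounding only.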
## Part (e)(i):
| Working | Mark | Guidance |
|---------|------|----------|
| $(\text{residual})^2 = [1.4 - (14.3 - 0.00681(1100))]^2 = 29.2...$ | M1 | Using the model to evaluate the squared residual |
| $[29.2... \div 35.7 \times 100\%]$ awrt 82% | A1 | awrt 82% |
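The percentage in part (e)(i) can be reproduced with a minimal Python sketch. The helper name `residual_share` is hypothetical, and the rounded model coefficients and RSS are used as in the mark scheme working.

```python
def residual_share(x, t, a, b, rss):
    """Percentage of the RSS contributed by the single observation (x, t)."""
    # Residual = observed value minus the value predicted by t = a + b*x
    residual = t - (a + b * x)
    return 100 * residual**2 / rss

# Station at 1100 m with annual mean temperature 1.4 degrees C
share = residual_share(x=1100, t=1.4, a=14.3, b=-0.00681, rss=35.7)
print(f"{share:.0f}%")
```

Using the unrounded coefficients changes the squared residual slightly, which is why the mark scheme accepts any answer that rounds to 82%.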
## Part (e)(ii):
| Working | Mark | Guidance |
|---------|------|----------|
| As the point representing this data contributes to the majority of the RSS, the point is possibly an outlier and should be investigated | B1 | Evaluating the result to suggest the point may be an outlier |
---
\begin{enumerate}
\item A researcher is investigating the relationship between elevation, $x$ metres, and annual mean temperature, $t ^ { \circ } \mathrm { C }$.
\end{enumerate}
From a random sample of 20 weather stations in Switzerland, the following results were obtained:
$$\mathrm{S}_{xx} = 8820655 \quad \mathrm{S}_{tt} = 444.7 \quad \sum x = 28130 \quad \sum t = 94.62$$
The product moment correlation coefficient for these data is found to be $-0.959$.\\
(a) Interpret the value of this correlation coefficient.\\
(b) Show that the equation of the regression line of $t$ on $x$ can be written as
$$t = 14.3 - 0.00681 x$$
The random variable $W$ represents the elevations of the weather stations in kilometres.\\
(c) Write down the equation of the regression line of $t$ on $w$ for these 20 weather stations in the form $t = a + b w$\\
(d) Show that the residual sum of squares (RSS) for the model for $t$ and $x$ is 35.7 correct to one decimal place.
One of the weather stations in the sample had a recorded elevation of 1100 metres and an annual mean temperature of $1.4 ^ { \circ } \mathrm { C }$.\\
(e)(i) Calculate this weather station's contribution to the residual sum of squares. Give your answer as a percentage.\\
(ii) Comment on the data for this weather station in light of your answer to part (e)(i).
\hfill \mbox{\textit{Edexcel FS2 2021 Q4 [10]}}