Edexcel FS2 2021 June — Question 4 10 marks

Exam Board: Edexcel
Module: FS2 (Further Statistics 2)
Year: 2021
Session: June
Marks: 10
Topic: Linear regression
Type: Calculate PMCC from summary statistics
Difficulty: Standard +0.3. This is a straightforward Further Statistics 2 question testing standard regression calculations from summary statistics. Parts (a)-(d) involve routine procedures: interpreting the PMCC, deriving the regression equation from given statistics, unit conversion, and RSS calculation. Part (e) requires calculating one residual's contribution. While it is Further Maths content (inherently harder), these are textbook exercises with clear methods and no novel problem-solving required. Slightly easier than average A-level difficulty overall.
Spec: 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression

1. A researcher is investigating the relationship between elevation, \(x\) metres, and annual mean temperature, \(t ^ { \circ } \mathrm { C }\).

From a random sample of 20 weather stations in Switzerland, the following results were obtained

$$\mathrm { S } _ { x x } = 8820655 \quad \mathrm { S } _ { t t } = 444.7 \quad \sum x = 28130 \quad \sum t = 94.62$$

The product moment correlation coefficient for these data is found to be \(-0.959\)

(a) Interpret the value of this correlation coefficient.

(b) Show that the equation of the regression line of \(t\) on \(x\) can be written as

$$t = 14.3 - 0.00681 x$$

The random variable \(W\) represents the elevations of the weather stations in kilometres.

(c) Write down the equation of the regression line of \(t\) on \(w\) for these 20 weather stations in the form \(t = a + b w\)

(d) Show that the residual sum of squares (RSS) for the model for \(t\) and \(x\) is 35.7 correct to one decimal place.

One of the weather stations in the sample had a recorded elevation of 1100 metres and an annual mean temperature of \(1.4 ^ { \circ } \mathrm { C }\)

(e) (i) Calculate this weather station's contribution to the residual sum of squares. Give your answer as a percentage

(ii) Comment on the data for this weather station in light of your answer to part (e)(i).

# Question 4:

## Part (a):
| Working | Mark | Guidance |
|---------|------|----------|
| As **elevation** increases, **temperature** decreases | B1 | Correct contextual interpretation |

## Part (b):
| Working | Mark | Guidance |
|---------|------|----------|
| $S_{xt} = -0.959\sqrt{8\,820\,655 \times 444.7} = -60\,062.38727$ | M1 | Using pmcc to find $S_{xt}$ |
| $b = \frac{-60\,062...}{8\,820\,655} = -0.006809...$ | M1 | Setting up linear model by attempting to find $b$; allow M2 for $b = r\sqrt{\frac{S_{tt}}{S_{xx}}}$ |
| $a = \frac{94.62}{20} - \text{'}b\text{'}\,\frac{28\,130}{20} = 14.308...$ | M1 | Setting up linear model by attempting to find $a$ |
| $t = 14.3 - 0.00681x$ | A1cso* | Correct model with $a =$ awrt 14.3 and $b =$ awrt $-0.00681$ |

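The working for part (b) can be checked numerically. The following is a minimal sketch, assuming only the summary statistics quoted in the question; the variable names are illustrative and not part of the mark scheme.

```python
import math

# Summary statistics quoted in the question
n = 20
S_xx = 8_820_655
S_tt = 444.7
sum_x = 28_130
sum_t = 94.62
r = -0.959

# Since r = S_xt / sqrt(S_xx * S_tt), S_xt can be recovered from r
S_xt = r * math.sqrt(S_xx * S_tt)

# Least-squares gradient and intercept for the regression of t on x
b = S_xt / S_xx
a = sum_t / n - b * sum_x / n

print(f"S_xt = {S_xt:.2f}")
print(f"b = {b:.6f}, a = {a:.4f}")
```

Rounded to three significant figures, these values should agree with the printed model $t = 14.3 - 0.00681x$.
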
## Part (c):
| Working | Mark | Guidance |
|---------|------|----------|
| $\left[w = \frac{x}{1000} \rightarrow\right]\; t = 14.3 - 6.81w$ | B1 | Correct model |

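The conversion behind this line can be written out as a short substitution, assuming only that $w$ is the elevation $x$ expressed in kilometres, so that $x = 1000w$:

$$t = 14.3 - 0.00681x = 14.3 - 0.00681(1000w) = 14.3 - 6.81w$$

The intercept is unchanged because it is a temperature; only the gradient is rescaled, from degrees per metre to degrees per kilometre.
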
## Part (d):
| Working | Mark | Guidance |
|---------|------|----------|
| $444.7(1-(-0.959)^2)$ or $444.7 - \frac{(-60\,062...)^2}{8\,820\,655} = 35.7^*$ | B1cso* | Either correct expression |

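Both expressions accepted for part (d) can be evaluated side by side. This is a hedged sketch using the figures from the question; `rss_from_r` and `rss_from_sxt` are illustrative names chosen here.

```python
import math

# Summary statistics quoted in the question
S_xx = 8_820_655
S_tt = 444.7
r = -0.959

# First form: RSS = S_tt * (1 - r^2)
rss_from_r = S_tt * (1 - r**2)

# Second form: RSS = S_tt - S_xt^2 / S_xx, with S_xt recovered from r
S_xt = r * math.sqrt(S_xx * S_tt)
rss_from_sxt = S_tt - S_xt**2 / S_xx

print(round(rss_from_r, 1), round(rss_from_sxt, 1))
```

The two forms are algebraically identical, so any difference between them is floating-point noise.
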
## Part (e)(i):
| Working | Mark | Guidance |
|---------|------|----------|
| $(\text{residual})^2 = [1.4 - (14.3 - 0.00681(1100))]^2 = 29.2...$ | M1 | Using the model to evaluate the squared residual |
| $[29.2... \div 35.7 \times 100\%]$ awrt 82% | A1 | awrt 82% |

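The percentage in part (e)(i) can be reproduced as follows. This sketch assumes the rounded model coefficients and the rounded RSS of 35.7, as the mark scheme working does; using unrounded values shifts the result slightly but it still rounds to 82%.

```python
# Rounded regression model t = a + b*x and RSS from parts (b) and (d)
a, b = 14.3, -0.00681
rss = 35.7

# Observation for the weather station in part (e)
x_obs, t_obs = 1100, 1.4

# Residual = observed temperature minus temperature predicted by the model
residual = t_obs - (a + b * x_obs)

# Share of the RSS attributable to this single station, as a percentage
contribution = residual**2 / rss * 100

print(f"residual = {residual:.3f}, contribution = {contribution:.1f}%")
```
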
## Part (e)(ii):
| Working | Mark | Guidance |
|---------|------|----------|
| As the point representing this station contributes the majority of the RSS, it is possibly an outlier and should be investigated | B1 | Evaluating the result to suggest the point may be an outlier |

---
\begin{enumerate}
  \item A researcher is investigating the relationship between elevation, $x$ metres, and annual mean temperature, $t ^ { \circ } \mathrm { C }$.
\end{enumerate}

From a random sample of 20 weather stations in Switzerland, the following results were obtained

$$\mathrm { S } _ { x x } = 8820655 \quad \mathrm { S } _ { t t } = 444.7 \quad \sum x = 28130 \quad \sum t = 94.62$$

The product moment correlation coefficient for these data is found to be $-0.959$\\
(a) Interpret the value of this correlation coefficient.\\
(b) Show that the equation of the regression line of $t$ on $x$ can be written as

$$t = 14.3 - 0.00681 x$$

The random variable $W$ represents the elevations of the weather stations in kilometres.\\
(c) Write down the equation of the regression line of $t$ on $w$ for these 20 weather stations in the form $t = a + b w$\\
(d) Show that the residual sum of squares (RSS) for the model for $t$ and $x$ is 35.7 correct to one decimal place.

One of the weather stations in the sample had a recorded elevation of 1100 metres and an annual mean temperature of $1.4 ^ { \circ } \mathrm { C }$\\
(e) (i) Calculate this weather station's contribution to the residual sum of squares. Give your answer as a percentage\\
(ii) Comment on the data for this weather station in light of your answer to part (e)(i).

\hfill \mbox{\textit{Edexcel FS2 2021 Q4 [10]}}