Edexcel S1 2022 June — Question 2 14 marks

Exam Board: Edexcel
Module: S1 (Statistics 1)
Year: 2022
Session: June
Marks: 14
Topic: Linear regression
Type: Calculate y on x from summary statistics
Difficulty: Moderate -0.8. This is a routine S1 linear regression question requiring standard formula application (PMCC, regression line, interpretation) with all summary statistics provided. The calculations are straightforward substitutions with no conceptual challenges or problem-solving required beyond recalling formulas.
Spec: 2.02c Scatter diagrams and regression lines; 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context

Stuart is investigating the relationship between Gross Domestic Product (GDP) and the size of the population for a particular country.
He takes a random sample of 9 years and records the size of the population, \(t\) millions, and the GDP, \(g\) billion dollars, for each of these years.

The data are summarised as

$$n = 9 \quad \sum t = 7.87 \quad \sum g = 144.84 \quad \sum g ^ { 2 } = 3624.41 \quad S _ { t t } = 1.29 \quad S _ { t g } = 40.25$$

(a) Calculate the product moment correlation coefficient between \(t\) and \(g\).
(b) Give an interpretation of your product moment correlation coefficient.
(c) Find the equation of the least squares regression line of \(g\) on \(t\) in the form \(g = a + b t\).
(d) Give an interpretation of the value of \(b\) in your regression line.
(e) (i) Use the regression line from part (c) to estimate the GDP, in billions of dollars, for a population of 7 000 000.
    (ii) Comment on the reliability of your answer in part (i). Give a reason, in context, for your answer.

Using the regression line from part (c), Stuart estimates that for a population increase of \(x\) million there will be an increase of 0.1 billion dollars in GDP.

(f) Find the value of \(x\).

# Question 2:

## Part (a)
| Answer | Mark | Guidance |
|--------|------|----------|
| $S_{gg} = 3624.41 - \dfrac{144.84^2}{9} [= 1293.4516]$ | M1 | Correct method for finding $S_{gg}$ (implied by awrt 1290 to 3sf). If $S_{gg} = 3624.41$ used, M0 |
| $r = \dfrac{40.25}{\sqrt{\text{"1293.4516"} \times 1.29}}$ | M1 | Correct method for finding $r$ using their $S_{gg}$ |
| $= 0.985\ldots$ awrt 0.985 | A1 | Correct answer only scores M1M1A1 |
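
The arithmetic in part (a) can be checked numerically. The following is a minimal Python sketch that uses only the summary statistics quoted in the question; the variable names are illustrative choices and are not part of the mark scheme.

```python
import math

# Summary statistics quoted in the question
n = 9
sum_g = 144.84
sum_g2 = 3624.41
S_tt = 1.29
S_tg = 40.25

# S_gg = sum(g^2) - (sum g)^2 / n
S_gg = sum_g2 - sum_g**2 / n

# Product moment correlation coefficient: r = S_tg / sqrt(S_gg * S_tt)
r = S_tg / math.sqrt(S_gg * S_tt)

print(f"S_gg = {S_gg:.4f}")
print(f"r = {r:.3f}")
```

The printed values should agree with the mark scheme: $S_{gg} \approx 1293.45$ and $r \approx 0.985$. Using $\sum g^2 = 3624.41$ directly in place of $S_{gg}$ is the error the guidance flags as M0.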

## Part (b)
| Answer | Mark | Guidance |
|--------|------|----------|
| As population/$t$ increases, GDP/$g$ increases | B1 | Correct interpreted contextual statement including population (or $t$) and GDP (or $g$). "Strong positive correlation between population and GDP" on its own: B0 |

## Part (c)
| Answer | Mark | Guidance |
|--------|------|----------|
| $b = \dfrac{40.25}{1.29} [= 31.20155\ldots]$ | M1 | Correct method for finding $b$ |
| $a = \dfrac{144.84}{9} - \text{"31.20155..."} \times \dfrac{7.87}{9} [= -11.19068\ldots]$ | M1 | Correct method for finding $a$ using their $b$. $a = 16.0\ldots - \text{"31.20155..."} \times 0.874\ldots$ |
| $g = \ldots 31.20155\ldots t$ | A1 | Only dep on 1st M1. Awrt 31.2 in regression equation (allow any variables) |
| $g = -11.2 + 31.2t$ | A1 | Must be $g$ and $t$. Do not allow fractions |
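
The two method steps in part (c) can be reproduced as a short Python sketch. It assumes only the summary statistics from the question; `t_bar` and `g_bar` are illustrative names for the sample means $\bar{t}$ and $\bar{g}$.

```python
# Summary statistics quoted in the question
n = 9
sum_t = 7.87
sum_g = 144.84
S_tt = 1.29
S_tg = 40.25

# Gradient of the regression line of g on t: b = S_tg / S_tt
b = S_tg / S_tt

# The line passes through the point of means, so a = g_bar - b * t_bar
t_bar = sum_t / n
g_bar = sum_g / n
a = g_bar - b * t_bar

print(f"b = {b:.5f}")
print(f"a = {a:.5f}")
print(f"g = {a:.1f} + {b:.1f}t")
```

Note that $a$ is computed from the unrounded $b$, as in the mark scheme working; rounding to 3 significant figures happens only in the final equation.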

## Part (d)
| Answer | Mark | Guidance |
|--------|------|----------|
| GDP/$g$ increases by (average of) **"31.2" billion** [dollars] when population/$t$ increases by **one million** | B1 | Idea that GDP increases by "their $b$" billion dollars for every 1 million increase in population |

## Part (e)(i)
| Answer | Mark | Guidance |
|--------|------|----------|
| $\text{"}-11.2\text{"} + \text{"}31.2\text{"} \times 7$ | M1 | Correct method. Allow substitution of 7 000 000 instead of 7 |
| $= 207.2\ldots$ awrt 207 | A1 | awrt 207 (billion). isw after answer of 207 |
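
As a sketch of the substitution in part (e)(i), the snippet below converts the population to millions before using the line, since $t$ is measured in millions. The function name `predict_gdp` is hypothetical, and the default coefficients are the rounded values from part (c).

```python
def predict_gdp(t_millions, a=-11.2, b=31.2):
    """Estimate GDP (billion dollars) from population (millions) using g = a + b*t."""
    return a + b * t_millions

population = 7_000_000
t = population / 1_000_000  # t is in millions, so 7 000 000 people gives t = 7

estimate = predict_gdp(t)
print(f"Estimated GDP: {estimate:.1f} billion dollars")
```

The result rounds to 207, matching the awrt 207 required for the A1 mark.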

## Part (e)(ii)
| Answer | Mark | Guidance |
|--------|------|----------|
| Unreliable as 7 000 000 is much greater than the mean population/$\bar{t}$ for the 9 years | B1 | Must reference $t$ or $\bar{t}$ $[= 0.874]$ or population. "Extrapolation so unreliable" alone: B0. Reference to $g$ out of range: B0 |
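
Although the raw data are not given, the summary statistics are enough to confirm that $t = 7$ lies outside the sampled populations. The sketch below is not part of the mark scheme; it relies on the fact that each squared deviation $(t_i - \bar{t})^2$ is at most $S_{tt}$, because $S_{tt}$ is the sum of all nine of them. The variable names are illustrative.

```python
import math

# Summary statistics quoted in the question
n = 9
sum_t = 7.87
S_tt = 1.29

t_bar = sum_t / n

# Since (t_i - t_bar)^2 <= S_tt for every i, no sampled t can exceed this bound
max_possible_t = t_bar + math.sqrt(S_tt)

t_new = 7.0  # population of 7 000 000 expressed in millions

print(f"mean t = {t_bar:.3f} million")
print(f"upper bound on any sampled t = {max_possible_t:.2f} million")
print(f"t = 7 is extrapolation: {t_new > max_possible_t}")
```

Every sampled population must therefore be well below 7 million, which supports the contextual reason required for the B1 mark.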

## Part (f)
| Answer | Mark | Guidance |
|--------|------|----------|
| $0.1 = \text{"}31.2\text{"} \times x$ | M1 | Equating 0.1 with "their $b$" $\times x$, or substituting two $g$ values differing by 0.1 |
| $x = 0.003205\ldots$ million people awrt 0.0032 | A1 | awrt 0.0032 (million). Allow awrt 3200 (to 2sf). Do not allow fractions |
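
Part (f) uses only the gradient: an increase of $x$ in $t$ changes $g$ by $bx$, so the intercept cancels. A minimal Python sketch under that reading, with illustrative variable names:

```python
# Gradient of the regression line from part (c): b = S_tg / S_tt
b = 40.25 / 1.29

# Required increase in GDP, in billions of dollars
delta_g = 0.1

# Change in g equals b times change in t, so x = delta_g / b
x = delta_g / b

print(f"x = {x:.4f} million (about {x * 1_000_000:.0f} people)")
```

Using the rounded gradient 31.2 instead of the unrounded value gives the same answer to 2 significant figures, 0.0032.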
