| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2022 |
| Session | June |
| Marks | 14 |
| Topic | Linear regression |
| Type | Calculate y on x from summary statistics |
| Difficulty | Moderate (-0.8). This is a routine S1 linear regression question requiring standard formula application (PMCC, regression line, interpretation) with all summary statistics provided. The calculations are straightforward substitutions with no conceptual challenges or problem-solving required beyond recalling formulas. |
| Spec | 2.02c Scatter diagrams and regression lines; 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context |
# Question 2:
## Part (a)
| Answer | Mark | Guidance |
|--------|------|----------|
| $S_{gg} = 3624.41 - \dfrac{144.84^2}{9} [= 1293.4516]$ | M1 | Correct method for finding $S_{gg}$ (implied by awrt 1290 to 3sf). If $S_{gg} = 3624.41$ used, M0 |
| $r = \dfrac{40.25}{\sqrt{\text{"1293.4516"} \times 1.29}}$ | M1 | Correct method for finding $r$ using their $S_{gg}$ |
| $= 0.985\ldots$ awrt 0.985 | A1 | Correct answer only scores M1M1A1 |
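
As a quick numerical check of the working above, the following minimal sketch substitutes the summary statistics from the question into the formulas for $S_{gg}$ and $r$. The variable names (`sum_g`, `S_tg`, and so on) are illustrative choices, not part of the mark scheme.

```python
import math

# Summary statistics given in the question
n = 9
sum_g = 144.84
sum_g2 = 3624.41
S_tt = 1.29
S_tg = 40.25

# S_gg = sum(g^2) - (sum g)^2 / n
S_gg = sum_g2 - sum_g**2 / n

# Product moment correlation coefficient
r = S_tg / math.sqrt(S_gg * S_tt)

print(round(S_gg, 4), round(r, 3))  # 1293.4516 0.985
```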
## Part (b)
| Answer | Mark | Guidance |
|--------|------|----------|
| As population/$t$ increases, GDP/$g$ increases | B1 | Correct interpreted contextual statement including population (or $t$) and GDP (or $g$). "Strong positive correlation between population and GDP" on its own: B0 |
## Part (c)
| Answer | Mark | Guidance |
|--------|------|----------|
| $b = \dfrac{40.25}{1.29} [= 31.20155\ldots]$ | M1 | Correct method for finding $b$ |
| $a = \dfrac{144.84}{9} - \text{"31.20155..."} \times \dfrac{7.87}{9} [= -11.19068\ldots]$ | M1 | Correct method for finding $a$ using their $b$. $a = 16.0\ldots - \text{"31.20155..."} \times 0.874\ldots$ |
| $g = \ldots 31.20155\ldots t$ | A1 | Only dep on 1st M1. Awrt 31.2 in regression equation (allow any variables) |
| $g = -11.2 + 31.2t$ | A1 | Must be $g$ and $t$. Do not allow fractions |
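
The gradient and intercept can be checked the same way. This is a minimal sketch using only the summary statistics from the question; the names `t_bar` and `g_bar` are illustrative labels for the sample means.

```python
# Summary statistics given in the question
n = 9
sum_t = 7.87
sum_g = 144.84
S_tt = 1.29
S_tg = 40.25

# Gradient of the regression line of g on t
b = S_tg / S_tt

# Intercept: the line passes through the point of means (t_bar, g_bar)
t_bar = sum_t / n
g_bar = sum_g / n
a = g_bar - b * t_bar

print(f"g = {a:.1f} + {b:.1f}t")  # g = -11.2 + 31.2t
```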
## Part (d)
| Answer | Mark | Guidance |
|--------|------|----------|
| GDP/$g$ increases by (average of) **"31.2" billion** [dollars] when population/$t$ increases by **one million** | B1 | Idea that GDP increases by "their $b$" billion dollars for every 1 million increase in population |
## Part (e)(i)
| Answer | Mark | Guidance |
|--------|------|----------|
| $\text{"}-11.2\text{"} + \text{"}31.2\text{"} \times 7$ | M1 | Correct method. Allow substitution of 7 000 000 instead of 7 |
| $= 207.2\ldots$ awrt 207 | A1 | awrt 207 (billion). isw after answer of 207 |
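
The substitution can be sketched as below, assuming the rounded coefficients $a = -11.2$ and $b = 31.2$ from part (c). The helper `predict_gdp` is a hypothetical name introduced for illustration. Note that $t$ is measured in millions, so a population of 7 000 000 corresponds to $t = 7$.

```python
# Rounded regression coefficients from part (c)
a = -11.2
b = 31.2

def predict_gdp(t_millions):
    """Estimated GDP in billions of dollars for a population of t million."""
    return a + b * t_millions

estimate = predict_gdp(7)  # 7 000 000 people is t = 7
print(round(estimate, 1))  # 207.2
```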
## Part (e)(ii)
| Answer | Mark | Guidance |
|--------|------|----------|
| Unreliable as 7 000 000 is much greater than the mean population/$\bar{t}$ for the 9 years | B1 | Must reference $t$ or $\bar{t}$ $[= 0.874]$ or population. "Extrapolation so unreliable" alone: B0. Reference to $g$ out of range: B0 |
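
To see how far $t = 7$ lies from the sampled data, the mean population can be computed from $\sum t$. This sketch only compares against the mean, because the question does not give the range of the individual $t$ values; the name `ratio` is illustrative.

```python
# Summary statistics given in the question
n = 9
sum_t = 7.87

t_bar = sum_t / n        # mean population, in millions
t_query = 7.0            # 7 000 000 people, in millions
ratio = t_query / t_bar  # how many times larger than the mean

print(round(t_bar, 3), round(ratio, 1))  # 0.874 8.0
```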
## Part (f)
| Answer | Mark | Guidance |
|--------|------|----------|
| $0.1 = \text{"}31.2\text{"} \times x$ | M1 | Equating 0.1 with "their $b$" $\times x$, or substituting two $g$ values differing by 0.1 |
| $x = 0.003205\ldots$ million people awrt 0.0032 | A1 | awrt 0.0032 (million). Allow awrt 3200 (to 2sf). Do not allow fractions |
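
Since the regression line is linear, a change of $x$ in $t$ changes $g$ by $bx$, so $x = 0.1 / b$. A minimal sketch, assuming the rounded gradient $b = 31.2$ from part (c):

```python
# Rounded gradient from part (c): billions of dollars per million people
b = 31.2

delta_g = 0.1    # required increase in GDP, in billions of dollars
x = delta_g / b  # corresponding population increase, in millions

print(round(x, 4), round(x * 1_000_000))  # 0.0032 3205
```

The second printed value is the same increase expressed as a number of people, which is consistent with the mark scheme's allowance of awrt 3200.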
---
\begin{enumerate}
\item Stuart is investigating the relationship between Gross Domestic Product (GDP) and the size of the population for a particular country.\\
He takes a random sample of 9 years and records the size of the population, $t$ millions, and the GDP, $g$ billion dollars for each of these years.
\end{enumerate}
The data are summarised as
$$n = 9 \quad \sum t = 7.87 \quad \sum g = 144.84 \quad \sum g ^ { 2 } = 3624.41 \quad S _ { t t } = 1.29 \quad S _ { t g } = 40.25$$
(a) Calculate the product moment correlation coefficient between $t$ and $g$\\
(b) Give an interpretation of your product moment correlation coefficient.\\
(c) Find the equation of the least squares regression line of $g$ on $t$ in the form $g = a + b t$\\
(d) Give an interpretation of the value of $b$ in your regression line.\\
(e) (i) Use the regression line from part (c) to estimate the GDP, in billions of dollars, for a population of $7\,000\,000$\\
(ii) Comment on the reliability of your answer in part (i). Give a reason, in context, for your answer.
Using the regression line from part (c), Stuart estimates that for a population increase of $x$ million there will be an increase of 0.1 billion dollars in GDP.\\
(f) Find the value of $x$
\hfill \mbox{\textit{Edexcel S1 2022 Q2 [14]}}