| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2023 |
| Session | January |
| Marks | 14 |
| Topic | Linear regression |
| Type | Interpret regression line parameters |
| Difficulty | Moderate (-0.3). This is a standard S1 regression question testing routine interpretation of the gradient, extrapolation concerns, and calculation of the correlation coefficient from the formula. All parts follow textbook procedures with no novel problem-solving required, though part (d) requires careful algebraic manipulation. It is slightly easier than average because the interpretations in parts (a), (c), and (e) are straightforward. |
| Spec | 2.02c Scatter diagrams and regression lines; 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09c Calculate regression line |
# Question 6:
## Part (a):
| Answer | Mark | Guidance |
|--------|------|----------|
| An increase/change of 1°C will allow an extra 2.72 grams [of sugar] to dissolve | B1 | Must include correct interpretation of gradient in context, including grams and degrees |
**Total: 1 mark**
## Part (b):
| Answer | Mark | Guidance |
|--------|------|----------|
| $151.2 + 2.72 \times 90 = 396$ | M1 | For substitution of 90 into the regression line |
| | A1 | cao — 396 on its own scores 2 out of 2 |
**Total: 2 marks**
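
The substitution in part (b) can be checked numerically. The following is a minimal sketch, not part of the mark scheme: the coefficients are taken from the regression line $y = 151.2 + 2.72x$ in the question, and the function name `predict_sugar` is a hypothetical helper introduced only for illustration.

```python
# Coefficients of the regression line y = 151.2 + 2.72x, as given in the question.
INTERCEPT = 151.2
GRADIENT = 2.72


def predict_sugar(temp_c: float) -> float:
    """Estimated maximum weight of sugar (grams) dissolving in 100 g of water.

    `predict_sugar` is a hypothetical helper name, not part of the source.
    """
    return INTERCEPT + GRADIENT * temp_c


estimate = predict_sugar(90)
print(round(estimate, 1))  # 396.0
```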
## Part (c):
| Answer | Mark | Guidance |
|--------|------|----------|
| The temperature 90[°C] is outside of the range; so (may be) unreliable | B1 | For a comment implying 90[°C] is outside the range. Allow extrapolation if not linked to 396. Do not allow comments implying 396 is out of range or use of "it" |
| | dB1 | Dependent on 1st B1 for a correct conclusion |
**Total: 2 marks**
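
The reasoning in part (c) amounts to a range check on the temperature, not on the estimate. A minimal sketch, assuming the data range $10 \leqslant x \leqslant 80$ stated in the question; the name `is_extrapolation` is hypothetical and used only for illustration.

```python
# Temperature range covered by the data, 10 <= x <= 80, as stated in the question.
X_MIN, X_MAX = 10, 80


def is_extrapolation(temp_c: float) -> bool:
    """Return True when the temperature lies outside the range of the data.

    `is_extrapolation` is a hypothetical helper name, not part of the source.
    """
    return not (X_MIN <= temp_c <= X_MAX)


print(is_extrapolation(90))  # True: 90 is above the largest observed temperature
print(is_extrapolation(50))  # False: 50 lies within the observed range
```

The check is applied to the explanatory variable $x$, which mirrors the guidance that comments about 396 being out of range are not accepted.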
## Part (d):
| Answer | Mark | Guidance |
|--------|------|----------|
| Use of $\bar{y} = 151.2 + 2.72\bar{x}$, so $\sum x = \left(\dfrac{\frac{3119}{12} - 151.2}{2.72}\right) \times 12 = 479.63235...$ | M1 A1 | For clear use of regression line to find $\sum x$ or $\bar{x}$. $\sum x$ = awrt 480 or $\bar{x}$ = awrt 40 |
| $S_{yy} = 851093 - \dfrac{3119^2}{12} [= 40412.9166...]$ | M1 | For correct expression for $S_{yy}$, may be implied by awrt 40400 |
| $S_{xx} = 24500 - \dfrac{479.63235...^2}{12} [= 5329.4005...]$ | M1 | For correct expression for $S_{xx}$ ft their $\sum x$ or $\bar{x}$, may be implied by awrt 5330 |
| $S_{xy} = 2.72 \times 5329.4005...[= 14495.9693...]$ | M1 | For use of gradient to find $S_{xy}$ ft their $S_{xx}$, may be implied by awrt 14500, or use of $r = b\sqrt{\dfrac{S_{xx}}{S_{yy}}}$ |
| $r = \dfrac{14495.9693...}{\sqrt{5329.4005... \times 40412.9166...}}$ or $r = 2.72 \times \sqrt{\dfrac{5329.4005...}{40412.9166...}}$ | M1 | For correct expression for $r$ ft their $S_{xy}$, $S_{xx}$ and $S_{yy}$ or 2.72, '$S_{xx}$' and '$S_{yy}$'. If not correct, must be labelled before expression for $r$ |
| $= 0.988*$ | A1* | Answer given — fully correct solution must be seen |
**Total: 7 marks**
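
The chain of calculations in part (d) can be reproduced as a numerical check. This is a sketch under the assumptions used in the mark scheme: the regression line passes through $(\bar{x}, \bar{y})$, and the gradient satisfies $b = S_{xy} / S_{xx}$. The variable names are chosen for illustration only.

```python
import math

# Summary statistics and regression coefficients given in the question.
n = 12
sum_y = 3119
sum_y2 = 851093
sum_x2 = 24500
a, b = 151.2, 2.72  # intercept and gradient of y = 151.2 + 2.72x

# The regression line passes through (x_bar, y_bar), so solve for x_bar.
y_bar = sum_y / n
x_bar = (y_bar - a) / b
sum_x = n * x_bar  # approximately 479.63

# Corrected sums of squares.
s_yy = sum_y2 - sum_y**2 / n  # approximately 40412.92
s_xx = sum_x2 - sum_x**2 / n  # approximately 5329.40

# The gradient b equals S_xy / S_xx, so S_xy = b * S_xx.
s_xy = b * s_xx

# Product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 3))  # 0.988
```

Working with unrounded intermediate values, as the mark scheme does, avoids rounding error in the final figure.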
## Part (e):
| Answer | Mark | Guidance |
|--------|------|----------|
| The points lie reasonably close to a straight line / positive correlation | B1 | For either: points lie reasonably close to a straight line/linear/positive correlation, OR PMCC is close to 1 (ignore any reference to strength) |
| ...and the PMCC is close to 1, therefore supports a linear model | B1 | For both points with a correct conclusion (ignore any reference to strength) |
**Total: 2 marks**
**Question Total: 14 marks**
\begin{enumerate}
\item A research student is investigating the maximum weight, $y$ grams, of sugar that will dissolve in 100 grams of water at various temperatures, $x ^ { \circ } \mathrm { C }$, where $10 \leqslant x \leqslant 80$
\end{enumerate}
The research student calculated the regression line of $y$ on $x$ and found it to be
$$y = 151.2 + 2.72 x$$
(a) Give an interpretation of the gradient of the regression line.\\
(b) Use the regression line to estimate the maximum weight of sugar that will dissolve in 100 grams of water when the temperature is $90 ^ { \circ } \mathrm { C }$.\\
(c) Comment on the reliability of your estimate, giving a reason for your answer.
Using the regression line of $y$ on $x$ and the following summary statistics
$$\sum y = 3119 \quad \sum y ^ { 2 } = 851093 \quad \sum x ^ { 2 } = 24500 \quad n = 12$$
(d) show that the product moment correlation coefficient for these data is 0.988 to 3 decimal places.
The research student's supervisor plotted the original data on a scatter diagram, shown on page 23
With reference to both the scatter diagram and the correlation coefficient,\\
(e) discuss the suitability of a linear regression model to describe the relationship between $x$ and $y$.
\begin{center}
\includegraphics[max width=\textwidth, alt={}]{c316fa29-dedc-4890-bd82-31eb0bb819f9-23_990_1138_205_356}
\end{center}
\hfill \mbox{\textit{Edexcel S1 2023 Q6 [14]}}