| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2017 |
| Session | June |
| Marks | 14 |
| Topic | Linear regression |
| Type | Calculate from summary statistics |
| Difficulty | Moderate (-0.5). This is a standard S1 linear regression question requiring routine application of the formulas for $S_{wt}$, $S_{tt}$, the correlation coefficient, and the regression line. Although it has multiple parts and involves coding and decoding variables, each step follows directly from memorised formulas, with no problem-solving insight required. It is slightly easier than average because it is purely procedural. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08c Pearson: measure of straight-line fit; 5.08d Hypothesis test: Pearson correlation; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
| Part | Answer | Marks | Guidance |
|---|---|---|---|
| (a) | $[S_{wt}] = 784 - \frac{119 \times 42}{6} = -49$ | M1, A1 | M1 for a correct expression for $S_{wt}$ or $S_{tt}$ (may be implied by either correct answer) |
| | $[S_{tt}] = 2435 - \frac{119^2}{6} = 74.83$ or $74\frac{5}{6}$ or $\frac{449}{6}$ (accept awrt 74.8) | A1 | 1st A1 for $[S_{wt}] = -49$; 2nd A1 for $[S_{tt}] =$ awrt 74.8. SC: if both values are correct but clearly mislabelled, award M1A0A1 |
| (b) | $S_{ss} = 5 \times 10^7$ or $50\,000\,000$ (o.e.) | B1 | |
| | $S_{st} = -49\,000$ | B1ft | For multiplying their $S_{wt}$ by 1000 |
| (c) | $r = \frac{"-49"}{\sqrt{50 \times "74.83"}}$ or $\frac{"-49\,000"}{\sqrt{5 \times 10^7 \times "74.83"}}$ | M1, A1 | M1 for a correct expression using their values, provided their $S_{tt}$ and $S_{ss}$ are both $> 0$. A1 for awrt $-0.801$ (correct answer only scores M1A1; $-0.80$ with no working scores M1A0) |
| | $= -0.80105\ldots = \text{awrt } -0.801$ | | |
| (d) | $r$ is close to $-1$ or $\lvert r \rvert$ is close to $1$ or "strong" (o.e.) [negative] correlation | B1ft | For a correct comment that uses their value of $r$ as support, provided $0.5 \le \lvert r \rvert \le 1$. For $\lvert r \rvert < 0.5$ the comment must be "does not support", because of "weak" (o.e.) correlation. NB "points lie close to a straight line" is B0 unless supported by mention of their value of $r$ |
| | $\ldots$ so "yes" or does support the belief | | |
| (e) | $b = \frac{"-49"}{"74.83"} = [-0.6547\ldots]$, $a = \frac{42}{6} - b \times \frac{119}{6} = [19.9866\ldots]$ or $a = 7 - b \times 19.83$ | M1, M1 | 1st M1 for a correct expression for $b$, or awrt $-0.66$ or $-0.65$; ft their answers from (a). 2nd M1 for a correct expression for $a$, ft their value of $b$ |
| | So $w = 20.0 - 0.655t$ | A1 | For a correct equation in $w$ and $t$ only, with $a = 20$ or awrt 20.0 and $b =$ awrt $-0.655$ (no fractions) |
| (f) | $s = 20\,000 - 655t$ or $c = 20\,000$ and $d = -655$ | B1ft, B1ft | 1st B1ft for correct $c$ or their "20.0" $\times 1000$; 2nd B1ft for correct $d$ or their "$-0.655$" $\times 1000$. Values can be given in an equation in $s$ and $t$ or as $c = \ldots$, $d = \ldots$ (their $a$ and $b$ need not be to 3 sf; ft their letter for $t$) |
| (g) | Decrease in sales of [£] 655 | B1ft | For stating clearly both decrease (o.e.) and [£] 655. Ft their $d$ and allow "increase" if $d > 0$ |
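
The numerical values quoted in the mark scheme can be checked directly from the summary statistics given in the question. The following is a minimal Python sketch of that check, not part of the official scheme; the variable names and the `round_sig` helper are illustrative assumptions introduced here.

```python
from math import floor, log10, sqrt

# Summary statistics from the question (n = 6 weeks, coded w = s / 1000).
n = 6
S_ww = 50.0
sum_wt, sum_t2, sum_t, sum_w = 784.0, 2435.0, 119.0, 42.0
SCALE = 1000  # decoding factor: s = SCALE * w


def round_sig(x: float, sig: int = 3) -> float:
    """Round x to `sig` significant figures (hypothetical helper)."""
    return round(x, sig - 1 - floor(log10(abs(x))))


# (a) S_wt and S_tt from the raw sums.
S_wt = sum_wt - sum_t * sum_w / n
S_tt = sum_t2 - sum_t ** 2 / n

# (b) Decoding: S_ss scales by SCALE^2, S_st by SCALE.
S_ss = SCALE ** 2 * S_ww
S_st = SCALE * S_wt

# (c) Product moment correlation coefficient between s and t.
r = S_st / sqrt(S_ss * S_tt)

# (e) Regression line of w on t: w = a + b t.
b = S_wt / S_tt
a = sum_w / n - b * sum_t / n

# (f) Regression line of s on t: s = c + d t, coefficients to 3 s.f.
c, d = round_sig(SCALE * a), round_sig(SCALE * b)

print(f"S_wt = {S_wt}, S_tt = {S_tt:.4f}")
print(f"S_ss = {S_ss}, S_st = {S_st}")
print(f"r = {r:.5f}")
print(f"w = {a:.4f} + ({b:.4f}) t")
print(f"s = {c} + ({d}) t")
```

Computing $r$ from the decoded values $S_{ss}$ and $S_{st}$ rather than from $S_{ww}$ and $S_{wt}$ is a deliberate choice: it shows that both routes allowed in part (c) give the same coefficient.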
\begin{enumerate}
\item A clothes shop manager records the weekly sales figures, $\pounds s$, and the average weekly temperature, $t ^ { \circ } \mathrm { C }$, for 6 weeks during the summer. The sales figures were coded so that $w = \frac { s } { 1000 }$.
\end{enumerate}
The data are summarised as follows:
$$\mathrm { S } _ { w w } = 50 \quad \sum w t = 784 \quad \sum t ^ { 2 } = 2435 \quad \sum t = 119 \quad \sum w = 42$$
(a) Find $\mathrm { S } _ { w t }$ and $\mathrm { S } _ { t t }$\\
(b) Write down the value of $\mathrm { S } _ { s s }$ and the value of $\mathrm { S } _ { s t }$\\
(c) Find the product moment correlation coefficient between $s$ and $t$.\\
The manager of the clothes shop believes that a linear regression model may be appropriate to describe these data.\\
(d) State, giving a reason, whether or not your value of the correlation coefficient supports the manager's belief.\\
(e) Find the equation of the regression line of $w$ on $t$, giving your answer in the form $w = a + b t$\\
(f) Hence find the equation of the regression line of $s$ on $t$, giving your answer in the form $s = c + d t$, where $c$ and $d$ are correct to 3 significant figures.\\
(g) Using your equation in part (f), interpret the effect of a $1 ^ { \circ } \mathrm { C }$ increase in average weekly temperature on weekly sales during the summer.
\hfill \mbox{\textit{Edexcel S1 2017 Q1 [14]}}
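
Parts (b), (c) and (f) all rest on how the coding $s = 1000w$ rescales the summary statistics. As a brief sketch, using only the standard definition $\mathrm{S}_{xy} = \sum xy - \frac{\sum x \sum y}{n}$ with $n = 6$:
$$\mathrm{S}_{st} = \sum (1000w)t - \frac{\sum (1000w) \sum t}{n} = 1000\,\mathrm{S}_{wt}, \qquad \mathrm{S}_{ss} = 1000^{2}\,\mathrm{S}_{ww}$$
so the correlation coefficient is unchanged by the coding:
$$r_{st} = \frac{1000\,\mathrm{S}_{wt}}{\sqrt{1000^{2}\,\mathrm{S}_{ww}\,\mathrm{S}_{tt}}} = \frac{\mathrm{S}_{wt}}{\sqrt{\mathrm{S}_{ww}\,\mathrm{S}_{tt}}} = r_{wt}$$
Likewise, substituting $w = a + bt$ into $s = 1000w$ gives $s = 1000a + 1000b\,t$, so $c = 1000a$ and $d = 1000b$.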