| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2018 |
| Session | June |
| Marks | 13 |
| Topic | Linear regression |
| Type | Convert regression equation between coded and original |
| Difficulty | Moderate (-0.3). This is a standard S1 regression question requiring routine application of formulas (showing S_yy, finding the correlation coefficient, finding the regression equation) and a straightforward conversion between coded and original variables. The coding transformation is simple linear scaling, and all steps follow textbook procedures with no novel problem-solving required. Slightly easier than average because it is highly procedural and the summary statistics are given. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
# Question 1:
## Part (a)
| Answer | Marks | Guidance |
|---|---|---|
| $S_{yy} = 2628.25 - \frac{141.5^2}{10} = 626.025$ | M1A1cso (2 marks) | Allow complete expression with $\Sigma y = (30+3+22+15+13+8+15+13.5+3+19)$ |
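
The arithmetic behind part (a) can be checked with a minimal Python sketch. It assumes only the coded $y$ values from the question's table; the variable names are illustrative and not part of the mark scheme.

```python
# Coded values y = m - 25, copied from the question's table.
y = [30, 3, 22, 15, 13, 8, 15, 13.5, 3, 19]

n = len(y)                       # sample size, 10 cars
sum_y = sum(y)                   # sum of y
sum_y2 = sum(v * v for v in y)   # sum of y squared

# S_yy = (sum of y^2) - (sum of y)^2 / n
s_yy = sum_y2 - sum_y ** 2 / n
print(sum_y, sum_y2, round(s_yy, 3))
```

The printed sums should agree with the $\Sigma y = 141.5$ and $\Sigma y^2 = 2628.25$ used in the answer line.
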
## Part (b)
| Answer | Marks | Guidance |
|---|---|---|
| $r = \frac{74.664}{\sqrt{9.25924 \times 626.025}} = 0.98068...$ awrt **0.981** | M1, A1 (2 marks) | M1 attempt at correct formula (allow one transcription error). 0.98 or 0.980 on its own is M1A0. |
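
As a quick check of part (b), the sketch below evaluates $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$ from the summary statistics given in the question and the $S_{yy}$ value from part (a). The variable names are illustrative.

```python
import math

s_xx = 9.25924   # given in the question
s_xy = 74.664    # given in the question
s_yy = 626.025   # shown in part (a)

# Product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 3))
```
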
## Part (c)
| Answer | Marks | Guidance |
|---|---|---|
| $r$/0.981 is close to 1 or a **strong correlation** | B1 (1 mark) | Allow "near perfect" correlation, but "perfect correlation" is B0. If $\lvert r \rvert > 1$, then B0. |
## Part (d)
| Answer | Marks | Guidance |
|---|---|---|
| $b = \frac{74.664}{9.25924} = 8.063728...$ | M1 | M1 for correct expression for $b$ |
| $a = \frac{141.5}{10} - \left(\frac{74.664}{9.25924}\right)\times\left(\frac{49.04}{10}\right) = -25.39452...$ | M1 | M1 for correct expression for $a$ ft their $b$ |
| $y = -25.4 + 8.06x$ | A1 (3 marks) | Must be in part (d) in $y$ and $x$ with $a$ = awrt $-25.4$ and $b$ = awrt $8.06$. No fractions. |
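
The two coefficients in part (d) can be reproduced with a short sketch, assuming $\Sigma x = 49.04$ (the total of the coded $x$ row in the table) together with the given summary statistics. The variable names are illustrative.

```python
n = 10
sum_x = 49.04    # total of the coded x values in the table
sum_y = 141.5    # total of the coded y values in the table
s_xx = 9.25924   # given in the question
s_xy = 74.664    # given in the question

# Gradient b = S_xy / S_xx; intercept a = mean(y) - b * mean(x).
b = s_xy / s_xx
a = sum_y / n - b * (sum_x / n)

# Report both coefficients to 3 significant figures, as the question asks.
print(f"y = {a:.3g} + {b:.3g}x")
```
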
## Part (e)
| Answer | Marks | Guidance |
|---|---|---|
| $y = -25.4 + 8.06 \times 4.4\ [=10.08...]$ | M1 | M1 subst. $x=4.4$ into their equation |
| $m = (10.08...) + 25 = 35.085...$ (mpg) awrt **35.1** | depM1, A1 (3 marks) | depM1 adding 25 to their 10.08... |
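
Part (e) relies on coding the published figure and then decoding the prediction. The sketch below follows those steps using the unrounded coefficients from part (d). It also evaluates the equivalent line in the original variables, $m = (a + 25) + \frac{b}{10}p$, which follows from substituting $x = p/10$ and $y = m - 25$; that form is not written out in the mark scheme and is shown here only as a cross-check.

```python
a, b = -25.39452, 8.063728   # unrounded coefficients from part (d)

p = 44          # published miles per gallon
x = p / 10      # code: x = p / 10
y = a + b * x   # predicted coded value
m = y + 25      # decode: y = m - 25, so m = y + 25

# Same estimate from the regression line written in p and m directly.
m_direct = (a + 25) + (b / 10) * p

print(round(m, 1), round(m_direct, 1))
```
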
## Part (f)
| Answer | Marks | Guidance |
|---|---|---|
| As $44(p)/4.4(x)$ is within the range of the data set or it involves interpolation, (the actual miles per gallon should be) reliable. | M1, A1 (2 marks) | M1 needs to be clear that $p\ (36.5 < 44 < 68.9)$ or $x\ (3.65 < 4.4 < 6.89)$ is in range. A1 M1 must be explicitly seen. 'It is reliable as it is within range' is M0A0. |
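
The range comparison required for the M1 in part (f) can be made explicit with a small sketch. It assumes the coded $x$ values from the question's table; the name `is_interpolation` is illustrative.

```python
# Coded values x = p / 10, copied from the question's table.
x_values = [6.89, 3.67, 5.92, 5.04, 4.87, 3.92, 4.71, 5.14, 3.65, 5.23]

x_new = 44 / 10   # coded value for the car in part (e)

# The estimate is an interpolation if x_new lies inside the observed range.
x_min, x_max = min(x_values), max(x_values)
is_interpolation = x_min < x_new < x_max
print(x_min, x_new, x_max, is_interpolation)
```
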
---
\begin{enumerate}
\item A random sample of 10 cars of different makes and sizes is taken and the published miles per gallon, $p$, and the actual miles per gallon, $m$, are recorded. The data are coded using variables $x = \frac { p } { 10 }$ and $y = m - 25$
\end{enumerate}
The results for the coded data are summarised below.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | c | c | }
\hline
$\boldsymbol { x }$ & 6.89 & 3.67 & 5.92 & 5.04 & 4.87 & 3.92 & 4.71 & 5.14 & 3.65 & 5.23 \\
\hline
$\boldsymbol { y }$ & 30 & 3 & 22 & 15 & 13 & 8 & 15 & 13.5 & 3 & 19 \\
\hline
\end{tabular}
\end{center}
(You may use $\sum y^{2} = 2628.25 \quad \sum xy = 768.58 \quad \mathrm{S}_{xx} = 9.25924 \quad \mathrm{S}_{xy} = 74.664$)\\
(a) Show that $\mathrm { S } _ { y y } = 626.025$\\
(b) Find the product moment correlation coefficient between $x$ and $y$.\\
(c) Give a reason to support fitting a regression model of the form $y = a + b x$ to these data.\\
(d) Find the equation of the regression line of $y$ on $x$, giving your answer in the form $y = a + b x$.\\
Give the value of $a$ and the value of $b$ to 3 significant figures.\\
A car's published miles per gallon is 44\\
(e) Estimate the actual miles per gallon for this particular car.\\
(f) Comment on the reliability of your estimate in part (e). Give a reason for your answer.
\hfill \mbox{\textit{Edexcel S1 2018 Q1 [13]}}