Edexcel S1 2018 June — Question 1 13 marks

Exam Board: Edexcel
Module: S1 (Statistics 1)
Year: 2018
Session: June
Marks: 13
Topic: Linear regression
Type: Convert regression equation between coded and original
Difficulty: Moderate (-0.3). This is a standard S1 regression question requiring routine application of formulas (showing \(S_{yy}\), finding the correlation coefficient, finding the regression equation) and a straightforward conversion between coded and original variables. The coding transformation is a simple linear scaling, and all steps follow textbook procedures with no novel problem-solving required. It is slightly easier than average because it is highly procedural and the summary statistics are given.
Spec: 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context

1. A random sample of 10 cars of different makes and sizes is taken and the published miles per gallon, \(p\), and the actual miles per gallon, \(m\), are recorded. The data are coded using variables \(x = \frac{p}{10}\) and \(y = m - 25\).

The results for the coded data are summarised below.

| \(\boldsymbol{x}\) | 6.89 | 3.67 | 5.92 | 5.04 | 4.87 | 3.92 | 4.71 | 5.14 | 3.65 | 5.23 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(\boldsymbol{y}\) | 30 | 3 | 22 | 15 | 13 | 8 | 15 | 13.5 | 3 | 19 |

(You may use \(\sum y^{2} = 2628.25 \quad \sum xy = 768.58 \quad \mathrm{S}_{xx} = 9.25924 \quad \mathrm{S}_{xy} = 74.664\))

(a) Show that \(\mathrm{S}_{yy} = 626.025\)
(b) Find the product moment correlation coefficient between \(x\) and \(y\).
(c) Give a reason to support fitting a regression model of the form \(y = a + bx\) to these data.
(d) Find the equation of the regression line of \(y\) on \(x\), giving your answer in the form \(y = a + bx\). Give the value of \(a\) and the value of \(b\) to 3 significant figures.

A car's published miles per gallon is 44.

(e) Estimate the actual miles per gallon for this particular car.
(f) Comment on the reliability of your estimate in part (e). Give a reason for your answer.

# Question 1:

## Part (a)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $S_{yy} = 2628.25 - \frac{141.5^2}{10} = 626.025$ | M1A1cso (2 marks) | Allow complete expression with $\Sigma y = (30+3+22+15+13+8+15+13.5+3+19)$ |
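
The "show that" in part (a) can be checked numerically. The following is a minimal sketch, assuming the coded $y$ values are typed in from the data table; the variable names are illustrative and not part of the mark scheme.

```python
# Coded actual-mpg values y = m - 25, copied from the data table.
y = [30, 3, 22, 15, 13, 8, 15, 13.5, 3, 19]
n = len(y)

sum_y = sum(y)                     # sum of y, 141.5 in the mark scheme
sum_y2 = sum(v * v for v in y)     # sum of y^2, given as 2628.25

# S_yy = sum(y^2) - (sum y)^2 / n
s_yy = sum_y2 - sum_y ** 2 / n

print(sum_y, sum_y2, round(s_yy, 3))
```

Computing $\Sigma y$ and $\Sigma y^2$ from the raw data, rather than quoting them, also confirms the given summary value $\sum y^2 = 2628.25$.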

## Part (b)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $r = \frac{74.664}{\sqrt{9.25924 \times 626.025}} = 0.98068...$ awrt **0.981** | M1, A1 (2 marks) | M1 attempt at correct formula (allow one transcription error). 0.98 or 0.980 on its own is M1A0. |
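
As a quick check of part (b), the sketch below evaluates $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$ from the summary statistics quoted in the question. The variable names are assumptions made for illustration.

```python
import math

# Summary statistics as given in the question and shown in part (a).
s_xx = 9.25924
s_xy = 74.664
s_yy = 626.025

# Product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

print(round(r, 3))
```

Rounding to three decimal places matters here: the mark scheme accepts awrt 0.981 but not 0.98 on its own.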

## Part (c)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $r$/0.981 is close to 1 or a **strong correlation** | B1 (1 mark) | Allow "near perfect" correlation, but "perfect correlation" is B0. If $\lvert r \rvert > 1$, then B0. |

## Part (d)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $b = \frac{74.664}{9.25924} = 8.063728...$ | M1 | M1 for correct expression for $b$ |
| $a = \frac{141.5}{10} - \left(\frac{74.664}{9.25924}\right)\times\left(\frac{49.04}{10}\right) = -25.39452...$ | M1 | M1 for correct expression for $a$ ft their $b$ |
| $y = -25.4 + 8.06x$ | A1 (3 marks) | Must be in part (d) in $y$ and $x$ with $a$ = awrt $-25.4$ and $b$ = awrt $8.06$. No fractions. |
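
The regression coefficients in part (d) can be reproduced from the raw coded data, which also cross-checks the given $S_{xx}$ and $S_{xy}$. This is a sketch under the assumption that the table values are entered as below; the names `slope` and `intercept` are illustrative.

```python
# Coded data from the question: x = p / 10, y = m - 25.
x = [6.89, 3.67, 5.92, 5.04, 4.87, 3.92, 4.71, 5.14, 3.65, 5.23]
y = [30, 3, 22, 15, 13, 8, 15, 13.5, 3, 19]
n = len(x)

# S_xx = sum(x^2) - (sum x)^2 / n and S_xy = sum(xy) - (sum x)(sum y) / n
s_xx = sum(v * v for v in x) - sum(x) ** 2 / n
s_xy = sum(u * v for u, v in zip(x, y)) - sum(x) * sum(y) / n

# Least-squares line of y on x: b = S_xy / S_xx, a = mean(y) - b * mean(x)
slope = s_xy / s_xx
intercept = sum(y) / n - slope * sum(x) / n

print(f"y = {intercept:.3g} + {slope:.3g}x")
```

The `.3g` format rounds each coefficient to 3 significant figures, the accuracy the question asks for.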

## Part (e)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $y = -25.4 + 8.06 \times 4.4\ [=10.08...]$ | M1 | M1 subst. $x=4.4$ into their equation |
| $m = (10.08...) + 25 = 35.085...$ (mpg) awrt **35.1** | depM1, A1 (3 marks) | depM1 adding 25 to their 10.08... |
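
Part (e) is a code, predict, decode sequence: convert $p = 44$ to $x$, apply the regression line, then undo $y = m - 25$. The sketch below follows those steps using the unrounded coefficients quoted in part (d), and also checks the equivalent line in the original variables, $m = (a + 25) + \frac{b}{10}p$, which follows by substituting the coding into $y = a + bx$. All names are illustrative.

```python
# Unrounded coefficients from part (d) of the mark scheme.
a = -25.39452
b = 8.063728

p = 44            # published miles per gallon
x = p / 10        # code p using x = p / 10
y = a + b * x     # predicted coded value
m = y + 25        # decode using y = m - 25

# Same prediction from the line expressed in the original variables.
c = a + 25        # intercept for m on p
d = b / 10        # gradient for m on p
m_direct = c + d * p

print(round(y, 2), round(m, 1), round(m_direct, 1))
```

Both routes give the same estimate, so the conversion between coded and original variables can be done either before or after substituting.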

## Part (f)
| Answer | Marks | Guidance |
| --- | --- | --- |
| As $44(p)/4.4(x)$ is within the range of the data set or it involves interpolation, (the actual miles per gallon should be) reliable. | M1, A1 (2 marks) | M1 needs to be clear that $p\ (36.5 < 44 < 68.9)$ or $x\ (3.65 < 4.4 < 6.89)$ is in range. A1 M1 must be explicitly seen. 'It is reliable as it is within range' is M0A0. |
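
The reliability argument in part (f) rests on $x = 4.4$ lying inside the observed range of $x$. A minimal sketch of that interpolation check, with illustrative variable names, is:

```python
# Coded published-mpg values x = p / 10 from the data table.
x = [6.89, 3.67, 5.92, 5.04, 4.87, 3.92, 4.71, 5.14, 3.65, 5.23]

x_new = 44 / 10                      # coded value for p = 44
x_min, x_max = min(x), max(x)        # observed range of the sample

# The estimate is an interpolation if x_new lies within the data range.
is_interpolation = x_min < x_new < x_max

print(x_min, x_max, is_interpolation)
```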

---
\begin{enumerate}
  \item A random sample of 10 cars of different makes and sizes is taken and the published miles per gallon, $p$, and the actual miles per gallon, $m$, are recorded. The data are coded using variables $x = \frac { p } { 10 }$ and $y = m - 25$
\end{enumerate}

The results for the coded data are summarised below.

\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | c | c | }
\hline
$\boldsymbol { x }$ & 6.89 & 3.67 & 5.92 & 5.04 & 4.87 & 3.92 & 4.71 & 5.14 & 3.65 & 5.23 \\
\hline
$\boldsymbol { y }$ & 30 & 3 & 22 & 15 & 13 & 8 & 15 & 13.5 & 3 & 19 \\
\hline
\end{tabular}
\end{center}

(You may use $\sum y ^ { 2 } = 2628.25 \quad \sum x y = 768.58 \quad \mathrm {~S} _ { x x } = 9.25924 \quad \mathrm {~S} _ { x y } = 74.664$ )\\
(a) Show that $\mathrm { S } _ { y y } = 626.025$\\
(b) Find the product moment correlation coefficient between $x$ and $y$.\\
(c) Give a reason to support fitting a regression model of the form $y = a + b x$ to these data.\\
(d) Find the equation of the regression line of $y$ on $x$, giving your answer in the form $y = a + b x$.\\
Give the value of $a$ and the value of $b$ to 3 significant figures.

A car's published miles per gallon is 44\\
(e) Estimate the actual miles per gallon for this particular car.\\
(f) Comment on the reliability of your estimate in part (e). Give a reason for your answer.

\hfill \mbox{\textit{Edexcel S1 2018 Q1 [13]}}