Edexcel S1 2016 January — Question 3 15 marks

Exam Board: Edexcel
Module: S1 (Statistics 1)
Year: 2016
Session: January
Marks: 15
Topic: Linear regression
Type: Convert regression equation between coded and original
Difficulty: Moderate -0.3. This is a standard S1 regression question requiring routine calculations (Sxy, Syy, correlation coefficient) and straightforward conversion between coded and original variables using given formulae. Although it has several parts, each follows textbook procedure and needs no novel insight. The coding conversion in part (e) is mechanical substitution. Slightly easier than average because of the clear structure and the provided summary statistics.
Spec: 5.08a Pearson correlation: calculate pmcc; 5.08b Linear coding: effect on pmcc; 5.08c Pearson: measure of straight-line fit; 5.08d Hypothesis test: Pearson correlation; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression

3. A publisher collects information about the amount spent on advertising, \(\pounds x\), and the sales, \(y\) books, for some of her publications. She collects information for a random sample of 8 textbooks and codes the data using \(v = \frac{x + 50}{200}\) and \(s = \frac{y}{1000}\) to give

| \(v\) | 0.60 | 8.10 | 4.30 | 0.40 | 1.60 | 6.40 | 2.50 | 5.10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(s\) | 1.84 | 6.73 | 5.95 | 1.30 | 2.45 | 7.46 | 4.82 | 6.25 |

[You may use: \(\sum v = 29 \quad \sum s = 36.8 \quad \sum s^{2} = 209.72 \quad \sum vs = 177.311 \quad \mathrm{S}_{vv} = 55.275\)]

  (a) Find \(\mathrm{S}_{vs}\) and \(\mathrm{S}_{ss}\)
  (b) Calculate the product moment correlation coefficient for these data.

The publisher believes that a linear regression model may be appropriate to describe these data.

  (c) State, giving a reason, whether or not your answer to part (b) supports the publisher's belief.
  (d) Find the equation of the regression line of \(s\) on \(v\), giving your answer in the form \(s = a + b v\)
  (e) Hence find the equation of the regression line of \(y\) on \(x\) for the sample of textbooks, giving your answer in the form \(y = c + d x\)

The publisher calculated the regression line for a sample of novels and obtained the equation

$$y = 3100 + 1.2 x$$

She wants to increase the sales of books by spending more money on advertising.

  (f) State, giving your reasons, whether the publisher should spend more money on advertising textbooks or novels.

## Question 3:

### Part (a)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $[S_{vs}] = 177.311 - \frac{36.8 \times 29}{8} = 43.911$ = awrt $43.9$ | M1, A1 | M1 for one correct expression; for correct answer with no working award M1 and appropriate A1; condone missing labels |
| $[S_{ss}] = 209.72 - \frac{36.8^2}{8} = 40.44$ = awrt $40.4$ | A1 (3) | |
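
The sums quoted in the question can be rebuilt from the eight coded pairs, which is a useful check on part (a). The following is a minimal Python sketch, not part of the mark scheme; the variable names are illustrative, and the data values are copied from the question's table.

```python
# Coded sample values copied from the table in the question.
v = [0.60, 8.10, 4.30, 0.40, 1.60, 6.40, 2.50, 5.10]
s = [1.84, 6.73, 5.95, 1.30, 2.45, 7.46, 4.82, 6.25]
n = len(v)

# Raw sums (the question supplies these as 29, 36.8, 177.311 and 209.72).
sum_v = sum(v)
sum_s = sum(s)
sum_vs = sum(vi * si for vi, si in zip(v, s))
sum_ss = sum(si * si for si in s)

# S_vs = sum(vs) - sum(v) * sum(s) / n
S_vs = sum_vs - sum_v * sum_s / n
# S_ss = sum(s^2) - (sum(s))^2 / n
S_ss = sum_ss - sum_s ** 2 / n

print(round(S_vs, 3), round(S_ss, 2))
```

Working from the raw data rather than the quoted sums also confirms that the summary statistics given in the question are consistent with the table.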

### Part (b)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $r = \frac{43.911}{\sqrt{55.275 \times 40.44}} = 0.92875...$ = awrt $0.929$ | M1, A1 (2) | M1 for correct expression for $r$, ft their 43.911 (but not 177.311) and their 40.44 (not 209.72). A1 for awrt 0.929 (correct ans only scores 2/2; ans only of 0.93 scores M1A0) |
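
As a quick numerical check of part (b), the expression $r = S_{vs} / \sqrt{S_{vv} S_{ss}}$ can be evaluated directly. This is a sketch that takes the part (a) values and the given $S_{vv}$ as inputs; the variable names are illustrative.

```python
import math

# Values from part (a) and from the question's "You may use" list.
S_vs = 43.911
S_vv = 55.275
S_ss = 40.44

# Product moment correlation coefficient.
r = S_vs / math.sqrt(S_vv * S_ss)

print(round(r, 3))
```

Because $r$ is unchanged by linear coding with positive scale factors, the same value applies to the original variables $x$ and $y$.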

### Part (c)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $r$ is close to 1 so there is support for the publisher's belief | B1ft (1) | For saying it does support the belief or a linear model/relationship is suitable and giving suitable reason e.g. strong correlation. If $|r| < 0.5$ allow "$r$ close to 0" so "does not support". Allow "yes" because "strong corr." but "yes" & "positive corr." is B0 |

### Part (d)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $b = \frac{43.911}{55.275} = 0.7944...$ = awrt $0.79$ | M1, A1 | 1st M1 for correct expression for $b$, ft their 43.911; allow 3sf values. 1st A1 for awrt 0.79 or exact fraction from 3sf values e.g. $\frac{439}{553}$ |
| $a = \bar{s} - b\bar{v} = 4.6 - 0.7944... \times 3.625 = 1.720...$ | M1 | 2nd M1 for correct method for $a$, ft their $b$. $\bar{s} = 4.6 = \frac{36.8}{8}$ and $\bar{v} = 3.625 = \frac{29}{8}$ |
| $s = 1.72 + 0.794v$ | A1 (4) | 2nd A1 for equation for $s$ in terms of $v$ with $a =$ awrt 1.72 and $b =$ awrt 0.794 |
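
The two steps in part (d), gradient first and then intercept through the mean point $(\bar{v}, \bar{s})$, can be sketched as follows. The inputs are the summary values already quoted above; the variable names are illustrative.

```python
# Summary values from the question and from part (a).
n = 8
sum_v = 29
sum_s = 36.8
S_vs = 43.911
S_vv = 55.275

# Gradient of the regression line of s on v.
b = S_vs / S_vv

# The line passes through the mean point, so a = s_bar - b * v_bar.
v_bar = sum_v / n   # 3.625
s_bar = sum_s / n   # 4.6
a = s_bar - b * v_bar

print(f"s = {a:.2f} + {b:.3f}v")
```

Keeping $b$ unrounded until the final line avoids the small drift in $a$ that comes from substituting a 2 or 3 significant figure gradient.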

### Part (e)
| Answer | Marks | Guidance |
| --- | --- | --- |
| $\frac{y}{1000} = 1.72 + 0.794 \times \left(\frac{x+50}{200}\right)$ | M1 | For correct substitution giving equation in $y$ and $x$. Allow 1 slip e.g. $\frac{y}{100}$ |
| $y = 1920 + 3.97x$ | A1, A1ft (3) | 1st A1 for $c = 1920$ (to 3sf); 2nd A1ft for $d =$ awrt 3.97 or $5\times$(their $b$ correct to 2 sig. figs.) |
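
Expanding the substitution gives $y = 1000a + 250b + 5bx$, so $d = 5b$ and $c = 1000a + 50d$. The sketch below wraps that algebra in a small helper; `decode_line` and its parameter names are hypothetical, chosen here for illustration, with defaults matching the coding $v = \frac{x+50}{200}$, $s = \frac{y}{1000}$.

```python
def decode_line(a, b, shift=50, v_scale=200, s_scale=1000):
    """Convert s = a + b*v into y = c + d*x.

    Assumes the coding v = (x + shift) / v_scale and s = y / s_scale.
    """
    d = s_scale * b / v_scale      # gradient in original units
    c = s_scale * a + d * shift    # intercept in original units
    return c, d


# Rounded coefficients from part (d).
c, d = decode_line(1.72, 0.794)
print(f"c = {c:.1f}, d = {d:.2f}")
```

With the rounded part (d) coefficients this gives $c = 1918.5$, which is 1920 to 3 significant figures, and $d = 3.97$, matching the mark scheme.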

### Part (f)
| Answer | Marks | Guidance |
| --- | --- | --- |
| Gradient of textbooks is greater | B1ft | For suitable reason based on gradients |
| Spend more on advertising textbooks | dB1ft (2) | For recommending spend more on advertising textbooks. If gradient in (e) $< 1.2$ then comparison of grads leading to spending on novels is B1B1 |
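
The gradient comparison in part (f) can be made concrete by asking how many extra sales each model predicts for the same extra spend. This is a rough sketch: the £100 figure is a hypothetical amount chosen for illustration, and it assumes each regression line keeps holding over that range of spending.

```python
# Gradients: predicted extra books sold per extra pound of advertising.
textbook_gradient = 3.97   # d from part (e)
novel_gradient = 1.2       # from the novels line y = 3100 + 1.2x

extra_spend = 100          # hypothetical extra advertising spend, in pounds

extra_textbooks = textbook_gradient * extra_spend
extra_novels = novel_gradient * extra_spend

# The larger gradient gives the larger predicted increase in sales.
better = "textbooks" if extra_textbooks > extra_novels else "novels"
print(better)
```

The intercepts (1920 and 3100) play no part in this decision, because they describe baseline sales rather than the return on each additional pound.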

---
3. A publisher collects information about the amount spent on advertising, $\pounds x$, and the sales, $y$ books, for some of her publications. She collects information for a random sample of 8 textbooks and codes the data using $v = \frac { x + 50 } { 200 }$ and $s = \frac { y } { 1000 }$ to give

\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | }
\hline
$v$ & 0.60 & 8.10 & 4.30 & 0.40 & 1.60 & 6.40 & 2.50 & 5.10 \\
\hline
$s$ & 1.84 & 6.73 & 5.95 & 1.30 & 2.45 & 7.46 & 4.82 & 6.25 \\
\hline
\end{tabular}
\end{center}

[You may use: $\sum v = 29 \quad \sum s = 36.8 \quad \sum s ^ { 2 } = 209.72 \quad \sum v s = 177.311 \quad \mathrm {~S} _ { v v } = 55.275$ ]
\begin{enumerate}[label=(\alph*)]
\item Find $\mathrm { S } _ { v s }$ and $\mathrm { S } _ { s s }$
\item Calculate the product moment correlation coefficient for these data.

The publisher believes that a linear regression model may be appropriate to describe these data.
\item State, giving a reason, whether or not your answer to part (b) supports the publisher's belief.
\item Find the equation of the regression line of $s$ on $v$, giving your answer in the form $s = a + b v$
\item Hence find the equation of the regression line of $y$ on $x$ for the sample of textbooks, giving your answer in the form $y = c + d x$

The publisher calculated the regression line for a sample of novels and obtained the equation

$$y = 3100 + 1.2 x$$

She wants to increase the sales of books by spending more money on advertising.
\item State, giving your reasons, whether the publisher should spend more money on advertising textbooks or novels.
\end{enumerate}

\hfill \mbox{\textit{Edexcel S1 2016 Q3 [15]}}