| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2014 |
| Session | June |
| Marks | 13 |
| Topic | Linear regression |
| Type | Calculate y on x from raw data table |
| Difficulty | Easy (-1.2). A straightforward S1 linear regression question in which all summary statistics are provided. Students substitute into the standard formulas $r = S_{vm}/\sqrt{S_{vv} S_{mm}}$ and $b = S_{vm}/S_{vv}$ and carry out routine calculations. No problem-solving or conceptual insight is required: it is pure formula application, with interpretation prompts that are standard textbook exercises. |
| Spec | 2.02c Scatter diagrams and regression lines; 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context |
# Question 3:
| Answer/Working | Marks | Guidance |
|---|---|---|
| **(a)** $r = \frac{31512.5}{\sqrt{42587.5 \times 25187.5}} = 0.962$ awrt **0.962** | M1 A1 **(2)** | M1 for correct expression for $r$. Ans only of 0.96 or awrt 0.96 is M1A0. Ans only of 0.962 or awrt 0.962 is M1A1. Do not allow fractions for A1 |
| **(b)** $r$ is close to 1 or a **strong correlation** | B1 **(1)** | B1 for comment implying strong correlation (e.g. big/high/clear etc). B0 if $|r|>1$. "Just positive" is B0. "relationship" or "skew" not "correlation" is B0 |
| **(c)** $b = \frac{31512.5}{42587.5} = 0.739947\ldots = 0.740$ (3 dp) **0.740** (only) | M1 A1cao **(2)** | M1 for correct expression for $b$. A1 for 0.740 only in (c) or $b = 0.740$ seen elsewhere (M1A0 for $\frac{2521}{3407}$ or awrt 0.74 here) |
| **(d)** $a = 1326.25 - (0.7399\ldots \times 2423.75)$ $[= -467.2$ or awrt $-467]$ So $m = -467 + 0.74v$ | M1 A1 **(2)** | M1 for $1326.25 -$ ('their $b$' $\times 2423.75$). Condone fractions or awrt 1330 for $\bar{m}$ and awrt 2420 for $\bar{v}$. A1 for correct equation in $m$ and $v$ with $a =$ awrt $-467$ and $b =$ awrt 0.74. Condone $\frac{2521}{3407}$ for $b$ and $\frac{-1591740}{3407}$ for $a$. Equation in $y$ and $x$ is A0 |
| **(e)** $b$ is the money (spent) per visitor | B1 B1ft **(2)** | 1st B1 for correct definition of the rate in words. Must state or imply "money per visitor". 2nd B1ft for correct numerical rate (ft their $b$). e.g. "each visitor spends £740" is B1B1 |
| **(f)** $m = -467 + 0.74 \times 2500 = 1383$ (£ million) awrt **1380** | M1 A1 **(2)** | M1 sub. $v = 2500$ into their equation. Simply substituting 2,500,000 is M0 (unless adjusted eqn). A1 awrt 1380 units (£ and million not required) |
| **(g)** As 2500 is within the range of the data set or it involves interpolation. The value of money spent is reliable. | B1 dB1 **(2)** | 1st B1 for 2500 or 2,500,000 or visitors or $v$ is in range. "it" is B0 unless $v$ clearly implied. 2nd dB1 for stating it is reliable. Dependent on previous B mark being awarded |
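
The arithmetic in parts (a), (c), (d) and (f) can be checked with a short script. This is only a sketch built from the summary statistics quoted in the question; the variable names and the helper `estimate_m` are illustrative and do not come from the mark scheme.

```python
import math

# Summary statistics quoted in the question (v in 1000s, m in £ millions)
S_vv, S_vm, S_mm = 42587.5, 31512.5, 25187.5
sum_v, sum_m, n = 19390, 10610, 8

# (a) product moment correlation coefficient
r = S_vm / math.sqrt(S_vv * S_mm)

# (c) gradient of the regression line of m on v
b = S_vm / S_vv

# (d) intercept, using the means of v and m
v_bar, m_bar = sum_v / n, sum_m / n
a = m_bar - b * v_bar


def estimate_m(v, intercept=a, gradient=b):
    """Hypothetical helper: estimate m (£ millions) for v (1000s of visitors)."""
    return intercept + gradient * v


# (f) 2 500 000 visitors corresponds to v = 2500, because v is in 1000s
m_2500 = estimate_m(2500)

print(f"r = {r:.3f}, b = {b:.3f}, a = {a:.1f}, m(2500) = {m_2500:.0f}")
```

The script keeps full precision for $a$ and $b$, so its estimate for part (f) differs slightly from the 1383 obtained in the mark scheme with the rounded coefficients; both are within "awrt 1380".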
---
3. The table shows data on the number of visitors to the UK in a month, $v$ (1000s), and the amount of money they spent, $m$ (\pounds\ millions), for each of 8 months.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | }
\hline
\begin{tabular}{ c }
Number of visitors \\
$v$ (1000s) \\
\end{tabular} & 2450 & 2480 & 2540 & 2420 & 2350 & 2290 & 2400 & 2460 \\
\hline
\begin{tabular}{ c }
Amount of money spent \\
$m$ (\pounds\ millions) \\
\end{tabular} & 1370 & 1350 & 1400 & 1330 & 1270 & 1210 & 1330 & 1350 \\
\hline
\end{tabular}
\end{center}
You may use\\
$S_{vv} = 42587.5 \quad S_{vm} = 31512.5 \quad S_{mm} = 25187.5 \quad \sum v = 19390 \quad \sum m = 10610$
\begin{enumerate}[label=(\alph*)]
\item Find the product moment correlation coefficient between $m$ and $v$.
\item Give a reason to support fitting a regression model of the form $m = a + b v$ to these data.
\item Find the value of $b$ correct to 3 decimal places.
\item Find the equation of the regression line of $m$ on $v$.
\item Interpret your value of $b$.
\item Use your answer to part (d) to estimate the amount of money spent when the number of visitors to the UK in a month is 2\,500\,000.
\item Comment on the reliability of your estimate in part (f). Give a reason for your answer.
\end{enumerate}
\hfill \mbox{\textit{Edexcel S1 2014 Q3 [13]}}
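
The summary statistics given in the question can be reproduced from the data table, which is a useful check before they are used. The sketch below assumes the standard definition $S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$; the function name `s_xy` is illustrative only.

```python
# Data from the table: v in 1000s of visitors, m in £ millions
v = [2450, 2480, 2540, 2420, 2350, 2290, 2400, 2460]
m = [1370, 1350, 1400, 1330, 1270, 1210, 1330, 1350]


def s_xy(x, y):
    """Return S_xy = sum(x*y) - sum(x)*sum(y)/n for two equal-length lists."""
    n = len(x)
    return sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n


S_vv = s_xy(v, v)
S_vm = s_xy(v, m)
S_mm = s_xy(m, m)

# Part (g): v = 2500 is reliable to use only if it lies inside the observed range
in_range = min(v) <= 2500 <= max(v)

print(sum(v), sum(m), S_vv, S_vm, S_mm, in_range)
```

The last line of the calculation mirrors the argument in part (g): the estimate is an interpolation because 2500 lies between the smallest and largest observed values of $v$.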