| Exam Board | OCR |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2012 |
| Session | June |
| Marks | 9 |
| Topic | Linear regression |
| Type | Interpret correlation strength/direction |
| Difficulty | Moderate (-0.8). A straightforward S1 question testing standard correlation and regression calculations by formula application, plus a basic interpretation of correlation versus causation. The calculations are routine (finding r, the regression line, and a prediction), and the final part tests the well-rehearsed idea that correlation does not imply causation, so the question is easier than average A-level content. |
| Spec | 2.02e Correlation does not imply causation; 5.08a Pearson correlation: calculate pmcc; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression |

| Year | 2007 | 2008 | 2009 | 2010 | 2011 |
|---|---|---|---|---|---|
| \(x\) | 250 | 270 | 264 | 290 | 292 |
| \(y\) | 4.2 | 3.7 | 3.2 | 3.5 | 3.0 |

## Question 1:
### Part (i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $\Sigma x = 1366$, $\Sigma y = 17.6$, $\Sigma x^2 = 374460$, $\Sigma y^2 = 62.82$, $\Sigma xy = 4784.8$ (any three correct) | B1 | May be implied by 2 $S$'s. OR using $S_{xx} = \Sigma(x-\bar{x})^2$ etc. $\bar{x} = \frac{1366}{5}$ or 273.2, $\bar{y} = \frac{17.6}{5}$ or 3.52 |
| $S_{xx} = 374460 - \frac{1366^2}{5}$ or 1268.8 | | |
| $S_{yy} = 62.82 - \frac{17.6^2}{5}$ or 0.868 | | |
| $S_{xy} = 4784.8 - \frac{1366 \times 17.6}{5}$ or $-23.52$ | M1 | Correct sub in any correct $S$ formula, ft $\Sigma s$, $\bar{x}$, $\bar{y}$ |
| $r = \frac{-23.52}{\sqrt{1268.8 \times 0.868}}$ or $\frac{-23.52}{33.186...}$ | M1 | Correct sub into 3 $S$s and $r$, ft $\Sigma s$, $\bar{x}$, $\bar{y}$ |
| $= -0.709$ (3 sfs) | A1 | cao. If no working seen: $-0.71$: SC3; $-0.7$: SC1 |
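
The Part (i) working can be checked numerically. The following is a minimal Python sketch, not part of the mark scheme: it recomputes the five sums, then $S_{xx}$, $S_{yy}$, $S_{xy}$ and $r$ from the data in the question. The variable names (`s_xx`, `s_yy`, `s_xy`) are illustrative choices.

```python
import math

# Data from the question: tourists (thousands) and weekly sales (GBP thousands).
x = [250, 270, 264, 290, 292]
y = [4.2, 3.7, 3.2, 3.5, 3.0]
n = len(x)

# The five summary sums quoted in the first row of the mark scheme.
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# S quantities in the "sum of squares minus correction" form used above.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"Sxx = {s_xx:.1f}, Syy = {s_yy:.3f}, Sxy = {s_xy:.2f}")
print(f"r = {r:.3f}")
```

Rounding is left to the final print so that the intermediate values keep full precision, matching the mark scheme's requirement of $-0.709$ to 3 significant figures.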
### Part (ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $b = \frac{-23.52}{1268.8}$ or $-\frac{147}{7930}$ or $-0.0185$ (3 sfs) | M1 | ft their $S_{xy}$ & $S_{xx}$ & $\Sigma s$ from (i). Use of $x$ on $y$ line: $b' = \frac{-23.52}{0.868}$ (or $-27.1$) M0 |
| $y - \frac{17.6}{5} = -0.0185(x - \frac{1366}{5})$ | M1 | or $a = \frac{17.6}{5} - (-0.0185) \times \frac{1366}{5}$; if $a$ incorrect must see method for M1 |
| $\Rightarrow y = -0.019x + 8.6$ or better, ie 2 sfs enough | A1 | cao; must be "$y = \ldots$"; coeffs that round to $-0.019$ & $8.6$ to 2 sfs |
| $(y = -0.019 \times 280 + 8.6 \quad (= 3.39 \text{ to } 3.41))$ | | ft their $y \times 1000$, dep M1M1, sub 280 (not 280000) |
| Est sales = £3390 to £3410 or 3.39 thousand to 3.41 thousand | A1ft | Allow "k" for thousand. No working, ans in range: M1M1A0A1. 3277 or 3278: A0 |
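
The Part (ii) working can be checked the same way. This is a sketch under the assumption that the regression line of $y$ on $x$ is the appropriate one, as the mark scheme indicates; the helper name `predict_sales` is hypothetical. Note that 280000 tourists corresponds to $x = 280$, since $x$ is measured in thousands.

```python
# Data from the question: tourists (thousands) and weekly sales (GBP thousands).
x = [250, 270, 264, 290, 292]
y = [4.2, 3.7, 3.2, 3.5, 3.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# S quantities in the equivalent "deviation from the mean" form.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Regression line of y on x: y = a + b x.
b = s_xy / s_xx
a = y_bar - b * x_bar


def predict_sales(tourists_thousands):
    """Return estimated weekly sales in GBP thousands (hypothetical helper)."""
    return a + b * tourists_thousands


estimate = predict_sales(280)  # 280000 tourists means x = 280, not 280000

print(f"y = {b:.4f}x + {a:.2f}")
print(f"Estimated weekly sales (pounds): {estimate * 1000:.0f}")
```

The sketch substitutes the unrounded coefficients. Substituting the 2 significant figure values instead gives $-0.019 \times 280 + 8.6 = 3.28$, which falls outside the accepted range of 3.39 to 3.41.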
### Part (iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| There may be other factors oe / Correlation does not imply causation oe | B1 | Must state or clearly imply: EITHER corr'n does not imply causation OR there could be another factor involved. Ignore all else. NOT: Tourists & sales not nec'y linked; Sales are not entirely dep on tourists; Could be a coincidence; $-0.8$ is not strong corr'n; Only shows good neg corr'n; Sample is small; Neg corr'n not nec'y imply neg relationship |
---
1 For each of the last five years the number of tourists, $x$ thousands, visiting Sackton, and the average weekly sales, $\pounds y$ thousands, in Sackton Stores were noted. The table shows the results.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | }
\hline
Year & 2007 & 2008 & 2009 & 2010 & 2011 \\
\hline
$x$ & 250 & 270 & 264 & 290 & 292 \\
\hline
$y$ & 4.2 & 3.7 & 3.2 & 3.5 & 3.0 \\
\hline
\end{tabular}
\end{center}
(i) Calculate the product moment correlation coefficient $r$ between $x$ and $y$.\\
(ii) It is required to estimate the average weekly sales at Sackton Stores in a year when the number of tourists is 280000. Calculate the equation of an appropriate regression line, and use it to find this estimate.\\
(iii) Over a longer period the value of $r$ is $-0.8$. The mayor says, "This shows that having more tourists causes sales at Sackton Stores to decrease." Give a reason why this statement is not correct.
\hfill \mbox{\textit{OCR S1 2012 Q1 [9]}}