| Exam Board | OCR |
|---|---|
| Module | Further Statistics AS |
| Year | 2023 |
| Session | June |
| Marks | 8 |
| Topic | Linear regression |
| Type | Calculate regression line then predict |
| Difficulty | Standard +0.3. A standard linear regression question requiring routine application of the formulas for variance, the regression line, and prediction. Part (d) adds mild interpretation, using the standard deviation to judge whether the data range includes a given value, and part (e) requires a comment on correlation strength; both are straightforward for Further Maths students. The calculations are mechanical, with no novel problem-solving required. |
| Spec | 5.08a Pearson correlation: calculate PMCC; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line |
## Question 3:
### Part (a)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $64282/32 - (1340/32)^2 = 255(.297)$ | B1[1] | Awrt 255. Allow $263.53$ from $n/(n-1)$. Don't give ISW for $\sqrt{255}$ |
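
The variance in part (a) can be checked from the summary statistics alone. The following is a minimal sketch, with illustrative variable names that are not part of the mark scheme; it evaluates both the divisor-$n$ form used in the answer and the divisor-$(n-1)$ form that the guidance allows.

```python
# Summary statistics quoted in the question
n = 32
sum_x = 1340
sum_x2 = 64282

mean_x = sum_x / n                   # 41.875, the mean age quoted in the question
var_n = sum_x2 / n - mean_x ** 2     # divisor n: mean of squares minus square of mean
var_n_minus_1 = var_n * n / (n - 1)  # divisor n - 1: rescale by n / (n - 1)

print(f"mean = {mean_x}")
print(f"variance (divisor n)     = {var_n:.3f}")
print(f"variance (divisor n - 1) = {var_n_minus_1:.3f}")
```

Working from $\sum x^2/n - \bar{x}^2$ avoids needing the raw data, which the question does not supply.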
### Part (b)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $y = 8.02 + 0.265(2)x\quad \left[\frac{131039}{16339} + \frac{4333}{16339}x\right]$ | B2[2] | Coefficients exact or correct to 3 sf, allow 8.03, letters correct. One error: B1 |
### Part (c)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $8.02 + 0.2652 \times 48 = 20.7(49)$, i.e. $\pounds 20\,700$ (3 sf) [$\pounds 20\,749$] | B1[1] | Awrt 20700 (not 20.7) or in range [20740, 20750]. Ignore absence of £. NB: can be obtained from calculator even if **(b)** is wrong; B1 for this |
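
As a cross-check on parts (b) and (c), here is a minimal sketch that rebuilds the regression line from the summary statistics using the standard least-squares formulas $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$. The variable names are illustrative assumptions, and exact fractions are used only so that the coefficients can be compared with the bracketed exact values in the mark scheme.

```python
from fractions import Fraction

# Summary statistics quoted in the question (y is in thousands of pounds)
n = 32
sum_x, sum_y = 1340, 612
sum_x2, sum_xy = 64282, 27794

# Gradient b = Sxy / Sxx, with numerator and denominator both scaled by n
b = Fraction(n * sum_xy - sum_x * sum_y, n * sum_x2 - sum_x ** 2)
# Intercept a = y_bar - b * x_bar
a = Fraction(sum_y, n) - b * Fraction(sum_x, n)

# Part (c): predicted claim at age 48, converted from thousands to pounds
claim_pounds = float(a + 48 * b) * 1000

print(f"y = {float(a):.3f} + {float(b):.4f} x   (exactly {a} + {b} x)")
print(f"estimated claim at age 48: about {claim_pounds:.0f} pounds")
```

Because $y$ is recorded in thousands of pounds, the raw prediction is about 20.7 and must be multiplied by 1000, which is why the guidance rejects 20.7 as a final answer.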
### Part (d)
| Answer | Marks | Guidance |
|--------|-------|----------|
| SD is $\sqrt{255} \approx 16$ and 48 is only about 6 away from $\bar{x}$, so extremely likely that range includes 48 | B1 B1[2] | Relevant calculation, e.g. $1340/32 \pm 2\sqrt{255}$, or difference is $0.383\sigma$. SD or variance mentioned and nuanced conclusion e.g. "very likely that Tom is wrong" or more extreme, but not "Tom is wrong". SC: Only variance mentioned: max (B0)B1 |
#### Part (d) – further examples:
| Response | Marks |
|----------|-------|
| (A) The standard deviation is $\approx 16$, so Tom is likely to be right | B0 |
| (B) Variance is large so very likely that Tom is wrong *(SC – but not "variance is very large so results inaccurate")* | B0B1 |
| (C) Less than 2 SD above mean, so Tom is incorrect *(B1, but not nuanced so B0)* | B1B0 |
| (D) Variance is large so results vary a lot, so likely to be data above 48, so unlikely that Tom's claim is correct | B0B1 |
| (E) Less than one standard deviation away from mean [consistent with **(a)**], so Tom is very unlikely to be right *(minimum for B1B1)* | B1B1 |
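
The "relevant calculation" for part (d) can be sketched as follows. This is only an illustration under the assumption that the divisor-$n$ variance from part (a) is used; the variable names are hypothetical. It expresses Tom's age as a number of standard deviations from the mean and also evaluates the $\bar{x} \pm 2\sigma$ interval mentioned in the guidance.

```python
import math

# Summary statistics quoted in the question
n = 32
sum_x = 1340
sum_x2 = 64282
tom_age = 48

mean_x = sum_x / n
sd_x = math.sqrt(sum_x2 / n - mean_x ** 2)   # square root of the part (a) variance

# How many standard deviations Tom's age lies above the mean
z = (tom_age - mean_x) / sd_x

# The interval mean +/- 2 SD suggested in the guidance
low, high = mean_x - 2 * sd_x, mean_x + 2 * sd_x

print(f"SD = {sd_x:.2f}, distance from mean = {z:.3f} SD")
print(f"mean +/- 2 SD spans roughly {low:.1f} to {high:.1f}")
```

A distance well under one standard deviation supports the nuanced conclusion that Tom is very unlikely to be right, without proving that the data includes someone aged exactly 48.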
### Part (e)
| Answer | Marks | Guidance |
|--------|-------|----------|
| (48 almost certainly within range but) correlation only moderate so not very reliable | M1 A1[2] | Comment on size of PMCC, allow comparison with CV. Nuanced conclusion, but *not* from "significant evidence of correlation". OE (a significance test asks "is there evidence that $\rho > 0$?", but here the issue is "how close is $\rho$ to $\pm 1$?", so a significance test is irrelevant) |
#### Part (e) – further examples:
| Response | Marks |
|----------|-------|
| (F) PMCC shows quite strong correlation and probably within range, so reliable | M1A0 |
| (G) PMCC shows quite strong correlation so fairly reliable | M1A1 |
| (H) Not very reliable as PMCC is low and might be extrapolating | M1A1 |
| (I) Not very reliable as PMCC is low | M1A1 |
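
The PMCC quoted in part (e) can be reproduced from the summary statistics with $r = S_{xy}/\sqrt{S_{xx}S_{yy}}$. The sketch below uses illustrative variable names that do not come from the mark scheme. It also prints $r^2$, which is offered only as one common way of gauging how much of the variation in $y$ the line accounts for; the mark scheme itself asks only for a comment on the size of $r$.

```python
import math

# Summary statistics quoted in the question
n = 32
sum_x, sum_y = 1340, 612
sum_x2, sum_y2, sum_xy = 64282, 13418, 27794

# Corrected sums of squares and products
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Product-moment correlation coefficient
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}")
```

The value of $r$ agrees with the 0.579 given in the question, which is some way from $\pm 1$ and so is consistent with describing the estimate as not very reliable.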
---
3 An insurance company collected data concerning the age, $x$ years, of policy holders and the average size of claim, $\pounds y$ thousand. The data is summarised as follows.\\
$n = 32 \quad \sum x = 1340 \quad \sum y = 612 \quad \sum x ^ { 2 } = 64282 \quad \sum y ^ { 2 } = 13418 \quad \sum x y = 27794$
\begin{enumerate}[label=(\alph*)]
\item Find the variance of $x$.
\item Find the equation of the regression line of $y$ on $x$.
\item Hence estimate the expected size of claim from a policy holder of age 48.
Tom is aged 48. He claims that the range of the data probably does not include people of his age because the mean age for the data is 41.875, and 48 is not close to this.
\item Use your answer to part (a) to determine how likely it is that Tom's claim is correct.
\item Comment on the reliability of your estimate in part (c). You should refer to the value of the product-moment correlation coefficient for the data, which is 0.579 correct to 3 significant figures.
\end{enumerate}
\hfill \mbox{\textit{OCR Further Statistics AS 2023 Q3 [8]}}