| Exam Board | OCR |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2005 |
| Session | January |
| Marks | 15 |
| Topic | Linear regression |
| Type | Calculate regression line then predict |
| Difficulty | Standard +0.3. A standard S1 regression question requiring routine application of formulas (calculating the gradient and intercept, making predictions, working with residuals). Part (i) uses the given summary statistics with standard formulas; parts (ii) to (v) involve straightforward arithmetic and recall of residual properties. Although the question has several parts and calculations, each step follows textbook procedure and needs no problem-solving insight or novel approach. |
| Spec | 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression |
# Question 9:
## Part (i)
| Answer | Mark | Guidance |
|--------|------|----------|
| $\frac{264 - \frac{90 \times 15}{5}}{1720 - \frac{90^2}{5}}$ or $\frac{264 - 5 \times 18 \times 3}{1720 - 5 \times 18^2}$ | M1 | Formula correctly used |
| $= -0.06$ AG | A1 | $-0.06$ correctly obtained |
| $y - \frac{15}{5} = -0.06\left(x - \frac{90}{5}\right)$ | M1 | or $a = \frac{15}{5} - (-0.06) \times \frac{90}{5}$ |
| $y = 4.08 - 0.06x$ | A1 | 4 marks, complete equation correct |
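
The part (i) working can be checked numerically. The following is a minimal sketch that uses only the summary statistics quoted in the question; the variable names (`s_xy`, `s_xx`, `a`, `b`) are illustrative choices, not notation from the mark scheme.

```python
# Summary statistics quoted in the question (n = 5 observations).
n = 5
sum_x, sum_y = 90.0, 15.0
sum_x2, sum_xy = 1720.0, 264.0

# Corrected sum of products and corrected sum of squares.
s_xy = sum_xy - sum_x * sum_y / n   # 264 - 270 = -6
s_xx = sum_x2 - sum_x ** 2 / n      # 1720 - 1620 = 100

b = s_xy / s_xx                     # gradient of the regression line of y on x
a = sum_y / n - b * sum_x / n       # intercept: the line passes through (x-bar, y-bar)

print(f"b = {b:.2f}, a = {a:.2f}")  # b = -0.06, a = 4.08
```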
## Part (ii)
| Answer | Mark | Guidance |
|--------|------|----------|
| Substitute $x = 20.5\ (y = 2.85)$ | M1 | Allow $20\ (y = 2.88)$ or $20.49$ |
| Substitute $x = 19.5\ (y = 2.91)$ | M1 | |
| $2.91 - 2.85 = 0.06$ | A1 | 3 marks, answer $0.06$ or $-0.06$, c.w.d |
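
The substitution in part (ii) can be sketched in the same way. This assumes the line $y = 4.08 - 0.06x$ from part (i) and treats "20 to the nearest whole number" as the interval from 19.5 to 20.5; the helper `predict` is a hypothetical name introduced only for this illustration.

```python
# Regression line from part (i): y = 4.08 - 0.06x.
a, b = 4.08, -0.06


def predict(x):
    """Return the estimated y on the regression line for a given x."""
    return a + b * x


# x = 20 to the nearest whole number lies between 19.5 and 20.5.
y_at_lower_x = predict(19.5)   # largest estimate, because the gradient is negative
y_at_upper_x = predict(20.5)   # smallest estimate (limiting value)
difference = y_at_lower_x - y_at_upper_x

print(round(y_at_lower_x, 2), round(y_at_upper_x, 2), round(difference, 2))
```

Because the line is linear, the difference is simply $|b|$ multiplied by the width of the interval, which is 1.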
## Part (iii)
| Answer | Mark | Guidance |
|--------|------|----------|
| $-0.6$ | B1 | $-0.6$ correct |
| $0.5$ | B1 | 2 marks, $0.5$ correct |
## Part (iv)
| Answer | Mark | Guidance |
|--------|------|----------|
| $1.5$ | B1 | |
| Calculated equation minimises this quantity | B1 | 2 marks, not "Low value for $\Sigma e^2$ means points near line" |
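
As a quick arithmetic check of the first mark in part (iv), the sketch below squares and sums the five residuals. It assumes the three values given in the question together with the part (iii) answers $-0.6$ and $0.5$; the list name `residuals` is illustrative.

```python
# e1, e2, e3 are given in the question; e4 and e5 are the part (iii) answers.
residuals = [0.6, -0.7, 0.2, -0.6, 0.5]

# Sum of squared residuals: the quantity the least squares line minimises.
sum_sq = sum(e ** 2 for e in residuals)

print(round(sum_sq, 2))  # 1.5
```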
## Part (v)
| Answer | Mark | Guidance |
|--------|------|----------|
| $\bar{e} = \Sigma e_i / 5$ | M1 | $\Sigma e_i / 5$ used |
| $= 0$ | A1 | Answer $0$, cwd, cao |
| $\Sigma e_i^2 / 5 - (\text{her } \bar{e})^2$ | M1 | $\Sigma e_i^2 / 5$ |
| $= 0.3$ | A1 | 4 marks, $0.3$ only; must see $-0^2$ or $-0$ in the variance working. With no working: $\bar{e} = 0$ earns M1A1; Var $= 0.3$ earns M1A0 |
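
The mean and variance in part (v) can be verified with a short sketch. It assumes the same five residuals as above and uses the divisor $n = 5$, matching the $\Sigma e_i^2 / 5 - \bar{e}^2$ form in the mark scheme; the variable names are illustrative.

```python
# e1, e2, e3 are given in the question; e4 and e5 are the part (iii) answers.
residuals = [0.6, -0.7, 0.2, -0.6, 0.5]
n = len(residuals)

# Mean of the residuals (zero for a least squares line, up to rounding error).
mean_e = sum(residuals) / n

# Variance with divisor n: mean of the squares minus the square of the mean.
var_e = sum(e ** 2 for e in residuals) / n - mean_e ** 2

print(round(mean_e, 2), round(var_e, 2))
```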
9 Five observations of bivariate data produce the following results, denoted as $(x_i, y_i)$ for $i = 1, 2, 3, 4, 5$.
$$\begin{aligned}
& (13, 2.7) \\
& \left[ \Sigma x = 90,\ \Sigma y = 15.0,\ \Sigma x^2 = 1720,\ \Sigma y^2 = 46.86,\ \Sigma xy = 264.0. \right]
\end{aligned}$$
(i) Show that the regression line of $y$ on $x$ has gradient $-0.06$, and find its equation in the form $y = a + bx$.

(ii) The regression line is used to estimate the value of $y$ corresponding to $x = 20$, but the value $x = 20$ is accurate only to the nearest whole number. Calculate the difference between the largest and the smallest values that the estimated value of $y$ could take.

The numbers $e_1, e_2, e_3, e_4, e_5$ are defined by
$$e_i = a + b x_i - y_i \quad \text{for } i = 1, 2, 3, 4, 5$$

(iii) The values of $e_1$, $e_2$ and $e_3$ are $0.6$, $-0.7$ and $0.2$ respectively. Calculate the values of $e_4$ and $e_5$.

(iv) Calculate the value of $e_1^2 + e_2^2 + e_3^2 + e_4^2 + e_5^2$ and explain the relevance of this quantity to the regression line found in part (i).

(v) Find the mean and the variance of $e_1, e_2, e_3, e_4, e_5$.

*OCR S1 2005 Q9 [15]*