| Exam Board | OCR MEI |
|---|---|
| Module | S2 (Statistics 2) |
| Year | 2011 |
| Session | June |
| Marks | 18 |
| Topic | Linear regression |
| Type | Draw scatter diagram from data |
| Difficulty | Easy (-1.2). This is a standard, routine S2 linear regression question requiring only textbook procedures: plotting points, calculating the regression equation from the formulas, computing a residual, and making predictions. Every step is algorithmic, with no problem-solving or novel insight required, so it is easier than the average A-level question. |
| Spec | 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
# Question 1:
## Part 1(i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Axes drawn correctly | G1 | Condone axes drawn either way; axes should show some indication of scale |
| Values of $x$ plotted | G1 | If axes scaled and only one point incorrectly plotted, allow max G2/3 |
| Values of $y$ plotted | G1 | |
| **Total: 3** | | |
## Part 1(ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $\bar{x} = 60$, $\bar{y} = 4.26$ | B1 | For $\bar{x}$ and $\bar{y}$ used appropriately (SOI); allow $\bar{y} = 4.3$ |
| $b = \frac{S_{xy}}{S_{xx}} = \frac{1803 - 300 \times 21.3/5}{27000 - 300^2/5} = \frac{525}{9000} = 0.0583$ | M1 | Attempt should be correct – e.g. evidence of either of the two suggested methods |
| OR $b = \frac{1803/5 - 60 \times 4.26}{27000/5 - 60^2} = \frac{105}{1800} = 0.0583$ | A1 | Allow 0.058; condone $0.058\dot{3}$ and $\frac{7}{120}$ |
| $y - \bar{y} = b(x - \bar{x})$ $\Rightarrow y - 4.26 = 0.0583(x - 60)$ $\Rightarrow y = 0.0583x + 0.76$ | M1 | Dependent on first M1; values must be substituted; condone use of their $b$ for FT provided $b > 0$ |
| Complete equation | A1 FT | Final equation must be simplified; $b = 0.058$ leads to $y = 0.058x + 0.78$ |
| **Total: 5** | | |
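
The arithmetic in this part can be checked with a short script. This is only a sketch of the first method ($S_{xy}/S_{xx}$); the function name `fit_line` and the variable names are illustrative and not part of the mark scheme.

```python
# Data from the question: nitrogen applied (kg/hectare) and maize yield (tonnes/hectare).
xs = [0, 30, 60, 90, 120]
ys = [0.5, 2.5, 4.7, 6.2, 7.4]


def fit_line(xs, ys):
    """Return (a, b) for the least squares regression line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # S_xy and S_xx as written in the mark scheme.
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    b = s_xy / s_xx
    # The line passes through (x-bar, y-bar), which fixes the intercept.
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = fit_line(xs, ys)
print(f"y = {b:.4f}x + {a:.2f}")
```

With the data above this should reproduce the mark scheme line, $y = 0.0583x + 0.76$.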
## Part 1(iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Regression line plotted on graph | G1 | Line must pass through their $(\bar{x}, \bar{y})$ and $y$-intercept |
| | G1 | |
| The fit is good | E1 | E0 for notably inaccurate graphs/lines |
| **Total: 3** | | |
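
The mark scheme judges the fit by eye from the graph. As a rough numerical companion, which is an addition here and not something the mark scheme asks for, the residuals at all five data points can be listed. The sketch assumes the line from Part 1(ii) with the unrounded slope $7/120$.

```python
# Data from the question and the regression line from Part 1(ii).
xs = [0, 30, 60, 90, 120]
ys = [0.5, 2.5, 4.7, 6.2, 7.4]
a, b = 0.76, 7 / 120  # intercept and slope (b = 0.0583 to 3 s.f.)

# Residual = observed y minus the value predicted by the line.
residuals = [round(y - (a + b * x), 2) for x, y in zip(xs, ys)]
largest = max(abs(r) for r in residuals)

print(residuals)
print(f"largest |residual| = {largest:.2f} tonnes/hectare")
```

Every residual is under half a tonne per hectare against yields that span roughly 7 tonnes per hectare, which is consistent with the comment that the fit is good.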
## Part 1(iv)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $x = 30 \Rightarrow$ predicted $y = 0.0583 \times 30 + 0.76 = 2.509$ | B1 | Using their equation |
| Residual $= 2.5 - 2.509 = -0.009$ | M1 | Subtraction can be 'either way' but sign of residual must be correct |
| | A1 FT | FT sensible equations only; no FT for $y = 0.071x$ leading to $+0.37$; [$c = 0.78$ leads to a residual of $-0.02$] |
| **Total: 3** | | |
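
The follow-through guidance depends on how the coefficients were rounded. The sketch below evaluates the residual at $(30, 2.5)$ for the three coefficient pairs mentioned in this mark scheme; the helper name `residual` is illustrative.

```python
def residual(x, y, a, b):
    """Observed y minus the value predicted by the line y = a + b*x."""
    return y - (a + b * x)


# Coefficients as rounded in the mark scheme: y = 0.0583x + 0.76.
r_scheme = residual(30, 2.5, a=0.76, b=0.0583)
# Follow-through case from the guidance: y = 0.058x + 0.78.
r_alt = residual(30, 2.5, a=0.78, b=0.058)
# Unrounded slope 7/120 with intercept 0.76.
r_exact = residual(30, 2.5, a=0.76, b=7 / 120)

for label, r in [("scheme", r_scheme), ("c = 0.78", r_alt), ("exact", r_exact)]:
    print(f"{label}: {r:.3f}")
```

All three residuals are small and negative, so the observed yield sits just below the line whichever rounding is used.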
## Part 1(v)
| Answer | Marks | Guidance |
|--------|-------|----------|
| (A) For $x = 45$: $y = 0.0583 \times 45 + 0.76 = 3.4$ | M1 | For at least one prediction attempted; prediction from their equation |
| (B) For $x = 150$: $y = 0.0583 \times 150 + 0.76 = 9.5$ | A1 FT | Both answers; FT their equation provided $b > 0$ |
| **Total: 2** | | |
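
Both predictions come from substituting into the same line. The sketch below also flags whether each $x$ lies inside the observed range of 0 to 120 kg/hectare, which matters for the comment in the next part. The helper name `predict` and the range check are illustrative additions.

```python
# Regression line from Part 1(ii), coefficients as rounded in the mark scheme.
A, B = 0.76, 0.0583
X_MIN, X_MAX = 0, 120  # range of nitrogen applications in the data


def predict(x):
    """Estimated yield (tonnes/hectare) for x kg/hectare of nitrogen."""
    return A + B * x


for x in (45, 150):
    kind = "interpolation" if X_MIN <= x <= X_MAX else "extrapolation"
    print(f"x = {x}: y = {predict(x):.1f} ({kind})")
```

The value at $x = 45$ is an interpolation, while $x = 150$ lies beyond the largest application in the data and is therefore an extrapolation.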
## Part 1(vi)
| Answer | Marks | Guidance |
|--------|-------|----------|
| This is **well** below the predicted value… | E1 | Value (8.7) is significantly below what is expected (9.5); simply saying 'below' is not sufficient |
| …suggesting that the model breaks down for larger values of $x$ | E1 | Suitable comment on extrapolation; allow other sensible comments e.g. 'data might be better modelled by a curve', 'there may be other factors affecting yield' |
| **Total: 2** | | |
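
One way to see why a curve might model the data better, offered as an illustration and not as part of the mark scheme, is to compare the yield gained per extra 30 kg/hectare with the constant gain that the regression line assumes. The sketch appends the plot from this part, $(150, 8.7)$, to the original data.

```python
# Original data plus the extra plot from part (vi) as the last point.
xs = [0, 30, 60, 90, 120, 150]
ys = [0.5, 2.5, 4.7, 6.2, 7.4, 8.7]

b = 0.0583            # slope of the regression line from Part 1(ii)
line_gain = b * 30    # gain per 30 kg/hectare implied by the line

# Observed gain in yield between consecutive nitrogen levels.
gains = [round(y1 - y0, 1) for y0, y1 in zip(ys, ys[1:])]
# Gap between the predicted (9.5) and observed (8.7) yield at x = 150.
shortfall = round(9.5 - 8.7, 1)

print(f"line assumes {line_gain:.2f} per step; observed gains: {gains}")
print(f"shortfall at x = 150: {shortfall} tonnes/hectare")
```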
---
1 An experiment is performed to determine the response of maize to nitrogen fertilizer. Data for the amount of nitrogen fertilizer applied, $x$ kg/hectare, and the average yield of maize, $y$ tonnes/hectare, in 5 experimental plots are given in the table below.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | }
\hline
$x$ & 0 & 30 & 60 & 90 & 120 \\
\hline
$y$ & 0.5 & 2.5 & 4.7 & 6.2 & 7.4 \\
\hline
\end{tabular}
\end{center}
\begin{enumerate}[label=(\roman*)]
\item Draw a scatter diagram to illustrate these data.
\item Calculate the equation of the regression line of $y$ on $x$.
\item Draw your regression line on your scatter diagram and comment briefly on its fit.
\item Calculate the value of the residual for the data point where $x = 30$ and $y = 2.5$.
\item Use the equation of the regression line to calculate estimates of average yield with nitrogen fertilizer applications of\\
(A) 45 kg/hectare,\\
(B) 150 kg/hectare.
\item In a plot where 150 kg/hectare of nitrogen fertilizer is applied, the average yield of maize is 8.7 tonnes/hectare. Comment on this result.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI S2 2011 Q1 [18]}}