OCR MEI S2 2011 June — Question 1 18 marks

Exam Board: OCR MEI
Module: S2 (Statistics 2)
Year: 2011
Session: June
Marks: 18
Topic: Linear regression
Type: Draw scatter diagram from data
Difficulty: Easy (-1.2). This is a standard, routine S2 linear regression question requiring only textbook procedures: plotting points, calculating the regression equation from the standard formulas, computing a residual, and making predictions. All steps are algorithmic, with no problem-solving or novel insight required, so it is easier than the average A-level question.
Spec: 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context

1 An experiment is performed to determine the response of maize to nitrogen fertilizer. Data for the amount of nitrogen fertilizer applied, \(x \mathrm {~kg} / \mathrm { hectare }\), and the average yield of maize, \(y\) tonnes/hectare, in 5 experimental plots are given in the table below.

| \(x\) | 0 | 30 | 60 | 90 | 120 |
|-------|---|----|----|----|-----|
| \(y\) | 0.5 | 2.5 | 4.7 | 6.2 | 7.4 |

  1. Draw a scatter diagram to illustrate these data.
  2. Calculate the equation of the regression line of \(y\) on \(x\).
  3. Draw your regression line on your scatter diagram and comment briefly on its fit.
  4. Calculate the value of the residual for the data point where \(x = 30\) and \(y = 2.5\).
  5. Use the equation of the regression line to calculate estimates of average yield with nitrogen fertilizer applications of
    (A) \(45 \mathrm {~kg} / \mathrm { hectare }\),
    (B) \(150 \mathrm {~kg} / \mathrm { hectare }\).
  6. In a plot where \(150 \mathrm {~kg} / \mathrm { hectare }\) of nitrogen fertilizer is applied, the average yield of maize is 8.7 tonnes/hectare. Comment on this result.

# Question 1:

## Part 1(i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Axes drawn correctly | G1 | Condone axes drawn either way; axes should show some indication of scale |
| Values of $x$ plotted | G1 | If axes scaled and only one point incorrectly plotted, allow max G2/3 |
| Values of $y$ plotted | G1 | |
| **Total: 3** | | |

## Part 1(ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $\bar{x} = 60$, $\bar{y} = 4.26$ | B1 | For $\bar{x}$ and $\bar{y}$ used appropriately (SOI); allow $\bar{y} = 4.3$ |
| $b = \frac{S_{xy}}{S_{xx}} = \frac{1803 - 300 \times 21.3/5}{27000 - 300^2/5} = \frac{525}{9000} = 0.0583$ | M1 | Attempt should be correct – e.g. evidence of either of the two suggested methods |
| OR $b = \frac{1803/5 - 60 \times 4.26}{27000/5 - 60^2} = \frac{105}{1800} = 0.0583$ | A1 | Allow 0.058; condone $0.058\dot{3}$ and $\frac{7}{120}$ |
| $y - \bar{y} = b(x - \bar{x})$ $\Rightarrow y - 4.26 = 0.0583(x - 60)$ $\Rightarrow y = 0.0583x + 0.76$ | M1 | Dependent on first M1; values must be substituted; condone use of their $b$ for FT provided $b > 0$ |
| Complete equation | A1 FT | Final equation must be simplified; $b = 0.058$ leads to $y = 0.058x + 0.78$ |
| **Total: 5** | | |
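
The arithmetic in Part 1(ii) can be checked with a short script. This is a minimal sketch using only the five data points from the question; the variable names (`s_xy`, `s_xx`, `b_alt`, and so on) are illustrative and not part of the mark scheme. It evaluates both of the suggested methods for $b$ and then finds the intercept from the fact that the line passes through $(\bar{x}, \bar{y})$.

```python
# Data from the question: nitrogen applied (kg/hectare) and average yield (tonnes/hectare).
x = [0, 30, 60, 90, 120]
y = [0.5, 2.5, 4.7, 6.2, 7.4]
n = len(x)

sum_x = sum(x)                                  # 300
sum_y = sum(y)                                  # 21.3, up to float rounding
sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # 1803, up to float rounding
sum_xx = sum(xi ** 2 for xi in x)               # 27000

# Method 1: b = S_xy / S_xx computed from the raw sums.
s_xy = sum_xy - sum_x * sum_y / n               # about 525
s_xx = sum_xx - sum_x ** 2 / n                  # 9000
b = s_xy / s_xx

# Method 2: the same ratio with numerator and denominator divided by n.
x_bar, y_bar = sum_x / n, sum_y / n
b_alt = (sum_xy / n - x_bar * y_bar) / (sum_xx / n - x_bar ** 2)

# The regression line passes through (x_bar, y_bar), which fixes the intercept.
a = y_bar - b * x_bar

print(f"b = {b:.4f} (method 2 gives {b_alt:.4f})")
print(f"y = {b:.4f}x + {a:.2f}")
```

Both methods give the same slope because the second is the first with numerator and denominator each divided by $n = 5$.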

## Part 1(iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Regression line plotted on graph | G1 | Line must pass through their $(\bar{x}, \bar{y})$ and $y$-intercept |
| | G1 | |
| The fit is good | E1 | E0 for notably inaccurate graphs/lines |
| **Total: 3** | | |
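
The mark scheme only asks for a brief comment on fit, but one informal way to back up "the fit is good" is to list the residual at every data point. The sketch below assumes the unrounded slope $\frac{7}{120}$ and intercept $0.76$ from Part 1(ii); it is an illustration, not a required part of the answer.

```python
x = [0, 30, 60, 90, 120]
y = [0.5, 2.5, 4.7, 6.2, 7.4]
b, a = 7 / 120, 0.76  # slope and intercept from Part 1(ii), slope left unrounded

# Residual = observed y minus the y value on the regression line.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

for xi, ri in zip(x, residuals):
    print(f"x = {xi:3d}: residual = {ri:+.2f}")

largest = max(abs(r) for r in residuals)
print(f"largest |residual| = {largest:.2f}")
```

The residuals sum to zero (up to rounding), as they must for a least squares line, and the largest is about 0.44 tonnes/hectare, which is small against yields running from 0.5 to 7.4.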

## Part 1(iv)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $x = 30 \Rightarrow$ predicted $y = 0.0583 \times 30 + 0.76 = 2.509$ | B1 | Using their equation |
| Residual $= 2.5 - 2.509 = -0.009$ | M1 | Subtraction can be 'either way' but sign of residual must be correct |
| | A1 FT | FT sensible equations only; no FT for $y = 0.071x$ leading to $+0.37$; [$c = 0.78$ leads to a residual of $-0.02$] |
| **Total: 3** | | |
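
The two residual values quoted above, $-0.009$ for the main equation and $-0.02$ for the follow-through equation with $b = 0.058$, can be reproduced as follows. The helper name `residual` is hypothetical and chosen only for this sketch.

```python
def residual(x_obs, y_obs, slope, intercept):
    """Observed y minus the y predicted by y = slope * x + intercept."""
    return y_obs - (slope * x_obs + intercept)

# Equation as quoted in the mark scheme: y = 0.0583x + 0.76
r_main = residual(30, 2.5, 0.0583, 0.76)

# Follow-through version with b rounded to 0.058: y = 0.058x + 0.78
r_ft = residual(30, 2.5, 0.058, 0.78)

print(f"main equation:           {r_main:.3f}")
print(f"follow-through equation: {r_ft:.2f}")
```

Taking observed minus predicted gives the negative sign the mark scheme requires: the point $(30, 2.5)$ lies just below the line.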

## Part 1(v)
| Answer | Marks | Guidance |
|--------|-------|----------|
| (A) For $x = 45$: $y = 0.0583 \times 45 + 0.76 = 3.4$ | M1 | For at least one prediction attempted; prediction from their equation |
| (B) For $x = 150$: $y = 0.0583 \times 150 + 0.76 = 9.5$ | A1 FT | Both answers; FT their equation provided $b > 0$ |
| **Total: 2** | | |
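
The two estimates can be checked by substituting into $y = 0.0583x + 0.76$. The sketch below also flags whether each $x$ value lies inside the range of the data (0 to 120 kg/hectare); that flag is an addition for illustration, and the names `predict`, `X_MIN` and `X_MAX` are hypothetical.

```python
X_MIN, X_MAX = 0, 120  # range of fertilizer levels used in the five plots

def predict(x_new, slope=0.0583, intercept=0.76):
    """Return the estimated yield and whether x_new lies inside the data range."""
    y_hat = slope * x_new + intercept
    inside = X_MIN <= x_new <= X_MAX
    return y_hat, inside

for x_new in (45, 150):
    y_hat, inside = predict(x_new)
    kind = "interpolation" if inside else "extrapolation"
    print(f"x = {x_new}: y = {y_hat:.1f} tonnes/hectare ({kind})")
```

The estimate at $x = 45$ is an interpolation, while $x = 150$ lies outside the data range, so that estimate is an extrapolation.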

## Part 1(vi)
| Answer | Marks | Guidance |
|--------|-------|----------|
| This is **well** below the predicted value… | E1 | Value (8.7) is significantly below what is expected (9.5); simply saying 'below' is not sufficient |
| …suggesting that the model breaks down for larger values of $x$ | E1 | Suitable comment on extrapolation; allow other sensible comments e.g. 'data might be better modelled by a curve', 'there may be other factors affecting yield' |
| **Total: 2** | | |
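
To put "well below" in context, the gap between the predicted and observed yield at $x = 150$ can be compared with the largest residual among the original five points. This comparison is an informal yardstick assumed for this sketch, not a method from the mark scheme and not a formal test.

```python
x = [0, 30, 60, 90, 120]
y = [0.5, 2.5, 4.7, 6.2, 7.4]
slope, intercept = 0.0583, 0.76  # regression line from Part 1(ii)

# Largest miss among the original five points, used as an informal yardstick.
largest_in_sample = max(abs(yi - (slope * xi + intercept)) for xi, yi in zip(x, y))

# New plot: 150 kg/hectare applied, 8.7 tonnes/hectare observed.
x_new, observed = 150, 8.7
shortfall = (slope * x_new + intercept) - observed

print(f"predicted - observed at x = 150: {shortfall:.3f}")
print(f"largest in-sample |residual|:    {largest_in_sample:.3f}")
```

The shortfall at $x = 150$ is nearly twice the largest residual within the data, which is consistent with the suggestion that the linear model breaks down for larger values of $x$.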

---
1 An experiment is performed to determine the response of maize to nitrogen fertilizer. Data for the amount of nitrogen fertilizer applied, $x \mathrm {~kg} / \mathrm { hectare }$, and the average yield of maize, $y$ tonnes/hectare, in 5 experimental plots are given in the table below.

\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | }
\hline
$x$ & 0 & 30 & 60 & 90 & 120 \\
\hline
$y$ & 0.5 & 2.5 & 4.7 & 6.2 & 7.4 \\
\hline
\end{tabular}
\end{center}
\begin{enumerate}[label=(\roman*)]
\item Draw a scatter diagram to illustrate these data.
\item Calculate the equation of the regression line of $y$ on $x$.
\item Draw your regression line on your scatter diagram and comment briefly on its fit.
\item Calculate the value of the residual for the data point where $x = 30$ and $y = 2.5$.
\item Use the equation of the regression line to calculate estimates of average yield with nitrogen fertilizer applications of\\
(A) $45 \mathrm {~kg} / \mathrm { hectare }$,\\
(B) $150 \mathrm {~kg} / \mathrm { hectare }$.
\item In a plot where $150 \mathrm {~kg} / \mathrm { hectare }$ of nitrogen fertilizer is applied, the average yield of maize is 8.7 tonnes/hectare. Comment on this result.
\end{enumerate}

\hfill \mbox{\textit{OCR MEI S2 2011 Q1 [18]}}