OCR MEI S2 2015 June, Question 1 (17 marks)

Exam Board: OCR MEI
Module: S2 (Statistics 2)
Year: 2015
Session: June
Marks: 17
Topic: Linear regression
Type: Calculate y on x from raw data table
Difficulty: Moderate (-0.5). This is a straightforward linear regression question requiring standard calculations (means, Sxx, Sxy, regression equation) from a small data table, plus a basic residual calculation and interpretation. The only slight variation is part (v), which uses a regression line through the origin, but the formula for it is given explicitly. Below-average difficulty, as it is purely procedural with no conceptual challenges or problem-solving required.
Spec: 2.02c Scatter diagrams and regression lines; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context


# Question 1:

## Part (i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Both axes labeled (allow $t$ and $y$) with indication of scale | G1* | Allow axes interchanged; condone $x$ for $t$ |
| Values of time plotted; BOD if $(0,0)$ not clearly visible | G1dep* | (evenly spaced) |
| Values of average growth plotted; BOD if $(0,0)$ not clearly visible | G1dep* | Visually correct; SC1 for points having correct distribution and G0* awarded; line through origin rewarded in part (v) |

## Part (ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $\bar{t} = 1.5,\ \bar{y} = 32$ | B1 | For $\bar{t}$ and $\bar{y}$ seen or implied by final answer; seen either in calculating $b$ or forming equation |
| $b = \dfrac{S_{yt}}{S_{tt}} = \dfrac{490-(224\times10.5/7)}{22.75-10.5^2/7} = \dfrac{154}{7} = 22$ OR $b = \dfrac{490/7-(32\times1.5)}{22.75/7-1.5^2} = \dfrac{22}{1} = 22$ | M1 | For attempt at gradient $b$; correct structure needed; FT their $\bar{t}$ and $\bar{y}$ for M1 |
| For $22$ | A1 | CAO |
| $y - \bar{y} = b(t - \bar{t})$ | M1 | For equation of line; with their $b > 0$, $\bar{t}$ and $\bar{y}$ |
| $\Rightarrow y - 32 = 22(t-1.5)$ $\Rightarrow y = 22t - 1$ | A1 | CAO; A0 for $y = 22x - 1$ |
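
The arithmetic in Part (ii) can be checked with a short script. This is a minimal sketch, not part of the mark scheme: the function name `regression_y_on_t` and the use of exact fractions are illustrative assumptions, and the data are copied from the table in the question.

```python
from fractions import Fraction

# Data from the question: time t (days) and average growth y (mm).
t = [Fraction(k, 2) for k in range(7)]            # 0, 0.5, 1, ..., 3
y = [Fraction(v) for v in (0, 7, 21, 33, 45, 56, 62)]


def regression_y_on_t(t, y):
    """Return (gradient, intercept) of the least squares line of y on t.

    Hypothetical helper: uses S_yt / S_tt for the gradient and the fact
    that the line passes through the point of means.
    """
    n = len(t)
    sum_t, sum_y = sum(t), sum(y)
    s_yt = sum(ti * yi for ti, yi in zip(t, y)) - sum_t * sum_y / n
    s_tt = sum(ti * ti for ti in t) - sum_t ** 2 / n
    gradient = s_yt / s_tt
    intercept = sum_y / n - gradient * sum_t / n
    return gradient, intercept


gradient, intercept = regression_y_on_t(t, y)
print(f"y = {gradient}t + ({intercept})")
```

Exact fractions are used so that the gradient and intercept come out as the whole numbers 22 and -1 without floating-point rounding.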

## Part (iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $t=2 \Rightarrow$ predicted average growth $= (22\times2)-1 = 43$ | B1 | For prediction; FT their equation |
| Residual $= 45 - 43 = 2$ | M1, A1 | For subtraction (either way); FT; $45 -$ their prediction |
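
The residual calculation can be sketched as follows, taking the line $y = 22t - 1$ from Part (ii) as given. The function names `predict` and `residual` are illustrative, and the convention used is observed minus predicted, matching the working $45 - 43$.

```python
# Data from the question: (time in days, average growth in mm).
data = [(0, 0), (0.5, 7), (1, 21), (1.5, 33), (2, 45), (2.5, 56), (3, 62)]


def predict(t):
    """Predicted average growth (mm) from the line y = 22t - 1."""
    return 22 * t - 1


def residual(t, observed):
    """Residual = observed value minus predicted value."""
    return observed - predict(t)


print(residual(2, 45))  # 45 - 43 = 2

# For a least squares line with an intercept, the residuals sum to zero,
# which gives a quick check on the fitted equation.
all_residuals = [residual(t, y) for t, y in data]
print(sum(all_residuals))
```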

## Part (iv)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $(22\times5)-1 = 109$ | B1 | Estimate calculated using equation; FT their equation |
| Likely to be **unreliable as extrapolation** (oe) | B1 | |
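
The reliability comment rests on $t = 5$ lying outside the observed range of times, 0 to 3 days. A minimal sketch of that check, again assuming the line $y = 22t - 1$; the helper names `estimate_growth` and `is_extrapolation` are hypothetical:

```python
# Observed times (days) from the table in the question.
times = [0, 0.5, 1, 1.5, 2, 2.5, 3]


def estimate_growth(t):
    """Estimated average growth (mm) from the line y = 22t - 1."""
    return 22 * t - 1


def is_extrapolation(t, observed_times):
    """True if t lies outside the range of the observed times."""
    return t < min(observed_times) or t > max(observed_times)


print(estimate_growth(5), is_extrapolation(5, times))
```

The check only flags that the estimate is an extrapolation; it says nothing about how quickly the linear model breaks down beyond 3 days.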

## Part (v)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $a = \dfrac{490}{22.75} = 21.538\ldots = 21.5$ (3 s.f.) | M1, A1 | |
| Equation is $y = 21.5t$ | A1 | Allow $y = 21.54t$ CAO; A0 if axes not scaled or $a \neq 21.5$ to 3 sf; allow $y = (280/13)t$ |
| Line correctly plotted on diagram | A1 | Through $(0,0)$ and between $(3,64)$ and $(3,65)$ |
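
The gradient of the line through the origin, and the height at which it should cross $t = 3$ on the diagram, can be verified with the sketch below. The variable names are illustrative and the data are copied from the question.

```python
from fractions import Fraction

# Data from the question: time t (days) and average growth y (mm).
t = [Fraction(k, 2) for k in range(7)]    # 0, 0.5, 1, ..., 3
y = [0, 7, 21, 33, 45, 56, 62]

sum_yt = sum(ti * yi for ti, yi in zip(t, y))   # sum of y*t
sum_tt = sum(ti * ti for ti in t)               # sum of t squared
a = sum_yt / sum_tt                             # gradient of y = a*t

print(a, round(float(a), 1))    # exact gradient and its value to 1 d.p.
print(float(a * 3))             # height of the line at t = 3
```

The exact value $280/13$ is the form the mark scheme also accepts, and the height at $t = 3$ lies between 64 and 65, as the plotting guidance requires.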

---
1 A random sample of wheat seedlings is planted and their growth is measured. The table shows their average growth, $y \mathrm {~mm}$, at half-day intervals.

\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | }
\hline
Time $t$ days & 0 & 0.5 & 1 & 1.5 & 2 & 2.5 & 3 \\
\hline
Average growth $y \mathrm {~mm}$ & 0 & 7 & 21 & 33 & 45 & 56 & 62 \\
\hline
\end{tabular}
\end{center}

(i) Draw a scatter diagram to illustrate these data.\\
(ii) Calculate the equation of the regression line of $y$ on $t$.\\
(iii) Calculate the value of the residual for the data point at which $t = 2$.\\
(iv) Use the equation of the regression line to calculate an estimate of the average growth after 5 days for wheat seedlings. Comment on the reliability of this estimate.

It is suggested that it would be better to replace the regression line by a line which passes through the origin. You are given that the equation of such a line is $y = a t$, where $a = \frac { \sum y t } { \sum t ^ { 2 } }$.\\
(v) Find the equation of this line and plot the line on your scatter diagram.
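
The question states the formula for $a$ without proof. Assuming the line through the origin is fitted by the same least squares criterion as the regression line (the question does not say this explicitly), the formula follows by minimising the sum of squared residuals over $a$:

\[
S(a) = \sum_i (y_i - a t_i)^2, \qquad
\frac{\mathrm{d}S}{\mathrm{d}a} = -2 \sum_i t_i (y_i - a t_i) = 0
\;\Longrightarrow\;
a = \frac{\sum y_i t_i}{\sum t_i^2}.
\]

With $\sum y t = 490$ and $\sum t^2 = 22.75$ for these data, this gives $a = 490 / 22.75 \approx 21.5$.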

\hfill \mbox{\textit{OCR MEI S2 2015 Q1 [17]}}