Edexcel S1 2012 June — Question 3 15 marks

Exam Board: Edexcel
Module: S1 (Statistics 1)
Year: 2012
Session: June
Marks: 15
Topic: Linear regression
Type: Calculate y on x from raw data table
Difficulty: Moderate (-0.5). This is a straightforward linear regression question with standard bookwork calculations (\(S_{pt}\), \(S_{pp}\), regression line equation) using given summary statistics. It requires multiple steps and careful arithmetic, but all techniques are routine S1 procedures with no conceptual challenges or novel problem-solving. It is slightly easier than average because of the scaffolded structure and the provision of key sums.
Spec: 2.02c Scatter diagrams and regression lines; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context

3. A scientist is researching whether or not birds of prey exposed to pollutants lay eggs with thinner shells. He collects a random sample of egg shells from each of 6 different nests and tests for pollutant level, \(p\), and measures the thinning of the shell, \(t\). The results are shown in the table below.

| \(p\) | 3 | 8 | 30 | 25 | 15 | 12 |
|---|---|---|---|---|---|---|
| \(t\) | 1 | 3 | 9 | 10 | 5 | 6 |

[You may use \(\sum p^{2} = 1967\) and \(\sum pt = 694\)]

  (a) Draw a scatter diagram on the axes on page 7 to represent these data.
  (b) Explain why a linear regression model may be appropriate to describe the relationship between \(p\) and \(t\).
  (c) Calculate the value of \(S_{pt}\) and the value of \(S_{pp}\).
  (d) Find the equation of the regression line of \(t\) on \(p\), giving your answer in the form \(t = a + bp\).
  (e) Plot the point \((\bar{p}, \bar{t})\) and draw the regression line on your scatter diagram.

The scientist reviews similar studies and finds that pollutant levels above 16 are likely to result in the death of a chick soon after hatching.

  (f) Estimate the minimum thinning of the shell that is likely to result in the death of a chick.

# Question 3:

## Part (a):
| Answer/Working | Marks | Guidance |
|---|---|---|
| Points plotted correctly | B2 | B2 for all 6 data points plotted correctly; B1 for any 5 correct; points not wholly outside circles |
| Line $t = 0.318p + 0.741$ drawn | B1, B1 | Use overlay |

## Part (b):
| Answer/Working | Marks | Guidance |
|---|---|---|
| Points (appear to) lie close to a (straight) line, or "strong/high correlation" | B1 | |
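
One way to back up the visual judgement in part (b) numerically is the product moment correlation coefficient. The question does not ask for it and the paper does not supply $S_{tt}$, so the sketch below computes everything from the data table and is an illustration only. The variable names are chosen here, not taken from the mark scheme.

```python
import math

# Data from the question's table: pollutant level p and shell thinning t.
p = [3, 8, 30, 25, 15, 12]
t = [1, 3, 9, 10, 5, 6]
n = len(p)

# Corrected sums of squares and products.
S_pt = sum(x * y for x, y in zip(p, t)) - sum(p) * sum(t) / n
S_pp = sum(x * x for x in p) - sum(p) ** 2 / n
S_tt = sum(y * y for y in t) - sum(t) ** 2 / n  # not given in the paper

# Product moment correlation coefficient.
r = S_pt / math.sqrt(S_pp * S_tt)
print(round(r, 3))
```

A value of $r$ close to 1 is consistent with the "strong/high correlation" wording that the mark scheme accepts.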

## Part (c):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $\sum p = 93$ and $\sum t = 34$ | M1 | May be seen in table; allow $80 < \sum p < 100$ and $30 < \sum t < 40$ |
| $S_{pt} = 694 - \frac{\text{"93"} \times \text{"34"}}{6} = 167$ or $S_{pp} = 1967 - \frac{\text{"93"}^2}{6} = 525.5$ | M1 | 2nd M1 for one correct expression for $S_{pt}$ or $S_{pp}$, f.t. their sums |
| $S_{pt} = 167$; $S_{pp} =$ awrt $526$ | A1; A1 | 1st A1 for $S_{pt}$, 2nd for $S_{pp}$ |
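
The arithmetic in part (c) can be checked with a few lines of Python. This is a minimal sketch that recomputes the sums from the data table, not a required method. The variable names are illustrative.

```python
# Data from the question's table: pollutant level p and shell thinning t.
p = [3, 8, 30, 25, 15, 12]
t = [1, 3, 9, 10, 5, 6]
n = len(p)

sum_p = sum(p)                              # 93
sum_t = sum(t)                              # 34
sum_pp = sum(x * x for x in p)              # 1967, as given in the question
sum_pt = sum(x * y for x, y in zip(p, t))   # 694, as given in the question

# S_pt = sum(pt) - sum(p) * sum(t) / n and S_pp = sum(p^2) - (sum(p))^2 / n
S_pt = sum_pt - sum_p * sum_t / n
S_pp = sum_pp - sum_p ** 2 / n

print(S_pt, S_pp)  # 167.0 525.5
```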

## Part (d):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $b = \frac{S_{pt}}{S_{pp}} = \frac{\text{"167"}}{\text{"525.5"}} = 0.31779...$ | B1ft | For correct expression for gradient, f.t. their 167 and 525.5 from (c); check answer if expression not seen |
| $a = \frac{\text{"34"}}{6} - \text{"0.31779..."} \times \frac{\text{"93"}}{6} = 0.74088...$ awrt $0.74$ | M1, A1 | M1 for correct use of $a = \bar{t} - b\bar{p}$ f.t. their values; condone 5.6 for $\bar{t}$ |
| $t = 0.741 + 0.318p$ (accept $a = \frac{2336}{3153}$ and $b = \frac{334}{1051}$) | A1 | For a correct equation for $t$ in terms of $p$ with $a$ and $b$ awrt 3sf |
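
The exact fractions accepted in part (d) can be reproduced with Python's `fractions` module. The sketch below assumes the data table from the question and uses illustrative variable names. It applies $b = S_{pt}/S_{pp}$ and $a = \bar{t} - b\bar{p}$ as in the table above.

```python
from fractions import Fraction

# Data from the question's table: pollutant level p and shell thinning t.
p = [3, 8, 30, 25, 15, 12]
t = [1, 3, 9, 10, 5, 6]
n = len(p)

# Exact corrected sums, kept as fractions to avoid rounding.
S_pt = Fraction(sum(x * y for x, y in zip(p, t))) - Fraction(sum(p) * sum(t), n)
S_pp = Fraction(sum(x * x for x in p)) - Fraction(sum(p) ** 2, n)

# Gradient and intercept of the regression line of t on p.
b = S_pt / S_pp
a = Fraction(sum(t), n) - b * Fraction(sum(p), n)

print(a, b)                                    # 2336/3153 334/1051
print(round(float(a), 3), round(float(b), 3))  # 0.741 0.318
```

Keeping $b$ unrounded when computing $a$ matters here: the mark scheme's working carries $0.31779...$ forward rather than $0.318$.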

## Part (e):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $(\bar{p}, \bar{t}) = (15.5, 5.7)$ plotted on graph (not wholly outside the circle) | B1 | |
| Correct line plotted as per overlay. For $p=5$: $2 < t < 3$ and for $p=30$: $10 < t < 11$; line must stretch roughly as far as the points and go through the $(\bar{p}, \bar{t})$ circle | B1 | |
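
Before drawing the line in part (e), it can help to confirm that it passes through the mean point and falls inside the overlay ranges quoted above. The sketch below is a check under the assumption that the exact coefficients from part (d) are used. The helper `line` is a hypothetical name introduced here.

```python
from fractions import Fraction

# Exact coefficients accepted by the mark scheme in part (d).
a = Fraction(2336, 3153)
b = Fraction(334, 1051)

def line(p):
    """Predicted thinning t at pollutant level p on the line t = a + b p."""
    return a + b * p

# Mean point, from sum p = 93 and sum t = 34 over 6 nests.
p_bar = Fraction(93, 6)
t_bar = Fraction(34, 6)

# A least squares regression line always passes through the mean point.
print(line(p_bar) == t_bar)  # True

# Values at p = 5 and p = 30, to compare with the overlay ranges.
print(float(line(5)), float(line(30)))
```

The two printed values should lie in $2 < t < 3$ and $10 < t < 11$ respectively, which gives two convenient points for ruling the line.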

## Part (f):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $t = \text{"0.741"} + \text{"0.318"} \times 16$ | M1 | For clear use of their line (equation or on graph) and $p=16$ to estimate $t$ |
| $= 5.825...$ awrt $5.8$ | A1 | For awrt 5.8, even if line not fully correct; accept "$t > 5.8$" (oe); answer only 2/2 |
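
The estimate in part (f) comes from substituting the threshold $p = 16$ into the regression line. The sketch below compares the exact coefficients with the 3 s.f. rounded ones, assuming the values from part (d). The variable names are illustrative.

```python
from fractions import Fraction

# Exact coefficients accepted by the mark scheme in part (d).
a = Fraction(2336, 3153)
b = Fraction(334, 1051)

threshold = 16  # pollutant level above which chick death is likely

# Estimate using the exact coefficients.
t_exact = float(a + b * threshold)

# Estimate using the coefficients rounded to 3 significant figures.
t_rounded = 0.741 + 0.318 * threshold

# Both estimates agree to 1 decimal place.
print(round(t_exact, 1), round(t_rounded, 1))  # 5.8 5.8
```

Because the gradient is positive, pollutant levels above 16 correspond to thinning above about 5.8, which is why the mark scheme also accepts "$t > 5.8$".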

---
3. A scientist is researching whether or not birds of prey exposed to pollutants lay eggs with thinner shells. He collects a random sample of egg shells from each of 6 different nests and tests for pollutant level, $p$, and measures the thinning of the shell, $t$. The results are shown in the table below.

\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | }
\hline
$p$ & 3 & 8 & 30 & 25 & 15 & 12 \\
\hline
$t$ & 1 & 3 & 9 & 10 & 5 & 6 \\
\hline
\end{tabular}
\end{center}

[You may use $\sum p ^ { 2 } = 1967$ and $\sum p t = 694$ ]
\begin{enumerate}[label=(\alph*)]
\item Draw a scatter diagram on the axes on page 7 to represent these data.
\item Explain why a linear regression model may be appropriate to describe the relationship between $p$ and $t$.
\item Calculate the value of $S _ { p t }$ and the value of $S _ { p p }$.
\item Find the equation of the regression line of $t$ on $p$, giving your answer in the form $t = a + b p$.
\item Plot the point ( $\bar { p } , \bar { t }$ ) and draw the regression line on your scatter diagram.

The scientist reviews similar studies and finds that pollutant levels above 16 are likely to result in the death of a chick soon after hatching.
\item Estimate the minimum thinning of the shell that is likely to result in the death of a chick.

\includegraphics[max width=\textwidth, alt={}, center]{0593544d-392d-465b-b922-c9cb1435abb5-05_1257_1568_301_173}
\end{enumerate}

\hfill \mbox{\textit{Edexcel S1 2012 Q3 [15]}}