| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2012 |
| Session | June |
| Marks | 15 |
| Topic | Linear regression |
| Type | Calculate y on x from raw data table |
| Difficulty | Moderate -0.5. This is a straightforward linear regression question with standard bookwork calculations ($S_{pt}$, $S_{pp}$, regression line equation) using given summary statistics. While it requires multiple steps and careful arithmetic, all techniques are routine S1 procedures with no conceptual challenges or novel problem-solving required. Slightly easier than average due to the scaffolded structure and provision of key sums. |
| Spec | 2.02c Scatter diagrams and regression lines; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
# Question 3:
## Part (a):
| Answer/Working | Marks | Guidance |
|---|---|---|
| Points plotted correctly | B2 | B2 for all 6 data points plotted correctly; B1 for any 5 correct; points not wholly outside circles |
| Line $t = 0.318p + 0.741$ drawn | B1, B1 | Use overlay |
## Part (b):
| Answer/Working | Marks | Guidance |
|---|---|---|
| Points (appear to) lie close to a (straight) line, or "strong/high correlation" | B1 | |
## Part (c):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $\sum p = 93$ and $\sum t = 34$ | M1 | May be seen in table; allow $80 < \sum p < 100$ and $30 < \sum t < 40$ |
| $S_{pt} = 694 - \frac{\text{"93"} \times \text{"34"}}{6} = 167$ or $S_{pp} = 1967 - \frac{\text{"93"}^2}{6} = 525.5$ | M1 | 2nd M1 for one correct expression for $S_{pt}$ or $S_{pp}$, f.t. their sums |
| $S_{pt} = 167$; $S_{pp} =$ awrt $526$ | A1; A1 | 1st A1 for $S_{pt}$, 2nd for $S_{pp}$ |
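
The arithmetic in part (c) can be checked with a short Python sketch. It uses only the data from the question's table; the variable names are illustrative and not part of the mark scheme.

```python
# Data from the question's table.
p = [3, 8, 30, 25, 15, 12]   # pollutant level
t = [1, 3, 9, 10, 5, 6]      # thinning of the shell
n = len(p)

# Raw sums (the question supplies sum of p^2 = 1967 and sum of pt = 694).
sum_p = sum(p)
sum_t = sum(t)
sum_pp = sum(x * x for x in p)
sum_pt = sum(x * y for x, y in zip(p, t))

# Standard S1 summary statistics.
S_pt = sum_pt - sum_p * sum_t / n
S_pp = sum_pp - sum_p ** 2 / n

print(sum_p, sum_t, sum_pp, sum_pt)  # 93 34 1967 694
print(S_pt, S_pp)                    # 167.0 525.5
```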
## Part (d):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $b = \frac{S_{pt}}{S_{pp}} = \frac{\text{"167"}}{\text{"525.5"}} = 0.31779...$ | B1ft | For correct expression for gradient, f.t. their 167 and 525.5 from (c); check answer if expression not seen |
| $a = \frac{\text{"34"}}{6} - \text{"0.31779..."} \times \frac{\text{"93"}}{6} = 0.74088...$ awrt $0.74$ | M1, A1 | M1 for correct use of $a = \bar{t} - b\bar{p}$ f.t. their values; condone 5.6 for $\bar{t}$ |
| $t = 0.741 + 0.318p$ (accept $a = \frac{2336}{3153}$ and $b = \frac{334}{1051}$) | A1 | For a correct equation for $t$ in terms of $p$ with $a$ and $b$ awrt 3sf |
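
As a cross-check on part (d), the sketch below recomputes $a$ and $b$ with exact fractions, so the accepted forms $\frac{2336}{3153}$ and $\frac{334}{1051}$ can be confirmed directly. Using Python's `fractions` module is a choice made here for illustration; candidates would work in decimals.

```python
from fractions import Fraction

n = 6
sum_p, sum_t = 93, 34
S_pt = Fraction(167)
S_pp = Fraction(1051, 2)  # 525.5 written exactly

# Gradient and intercept of the regression line of t on p.
b = S_pt / S_pp
a = Fraction(sum_t, n) - b * Fraction(sum_p, n)

print(b, a)                                      # 334/1051 2336/3153
print(f"t = {float(a):.3f} + {float(b):.3f}p")   # t = 0.741 + 0.318p
```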
## Part (e):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $(\bar{p}, \bar{t}) = (15.5, 5.7)$ plotted on graph (not wholly outside the circle) | B1 | |
| Correct line plotted as per overlay. For $p=5$: $2 < t < 3$ and for $p=30$: $10 < t < 11$; line must stretch roughly as far as the points and go through the $(\bar{p}, \bar{t})$ circle | B1 | |
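
The tolerances quoted for the drawn line in part (e) can be checked against the fitted equation. This is a minimal sketch using the rounded coefficients from part (d); the helper name `line` is hypothetical.

```python
# Rounded coefficients from part (d).
a, b = 0.741, 0.318

def line(p):
    """Predicted thinning t for pollutant level p."""
    return a + b * p

# Mean point, which the regression line must pass through.
p_bar, t_bar = 93 / 6, 34 / 6
print(p_bar, round(t_bar, 1))   # 15.5 5.7

# Values the overlay tolerances refer to.
for x in (5, 30):
    print(x, round(line(x), 2))
```

Both values fall inside the stated bands ($2 < t < 3$ at $p = 5$ and $10 < t < 11$ at $p = 30$), and `line(p_bar)` agrees with $\bar{t}$ to within the rounding of the coefficients.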
## Part (f):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $t = \text{"0.741"} + \text{"0.318"} \times 16$ | M1 | For clear use of their line (equation or on graph) and $p=16$ to estimate $t$ |
| $= 5.825...$ awrt $5.8$ | A1 | For awrt 5.8, even if line not fully correct; accept "$t > 5.8$" (oe); answer only 2/2 |
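
The estimate in part (f) follows from substituting $p = 16$ into the regression line. The sketch below uses unrounded coefficients computed from the summary statistics in part (c); variable names are illustrative.

```python
# Summary statistics from part (c).
n = 6
sum_p, sum_t = 93, 34
S_pt, S_pp = 167, 525.5

# Unrounded regression coefficients.
b = S_pt / S_pp
a = sum_t / n - b * sum_p / n

# Estimated thinning at the critical pollutant level p = 16.
t_est = a + b * 16
print(round(t_est, 1))   # 5.8
```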
---
3. A scientist is researching whether or not birds of prey exposed to pollutants lay eggs with thinner shells. He collects a random sample of egg shells from each of 6 different nests and tests for pollutant level, $p$, and measures the thinning of the shell, $t$. The results are shown in the table below.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | }
\hline
$p$ & 3 & 8 & 30 & 25 & 15 & 12 \\
\hline
$t$ & 1 & 3 & 9 & 10 & 5 & 6 \\
\hline
\end{tabular}
\end{center}
[You may use $\sum p^{2} = 1967$ and $\sum pt = 694$.]
\begin{enumerate}[label=(\alph*)]
\item Draw a scatter diagram on the axes on page 7 to represent these data.
\item Explain why a linear regression model may be appropriate to describe the relationship between $p$ and $t$.
\item Calculate the value of $S_{pt}$ and the value of $S_{pp}$.
\item Find the equation of the regression line of $t$ on $p$, giving your answer in the form $t = a + b p$.
\item Plot the point $(\bar{p}, \bar{t})$ and draw the regression line on your scatter diagram.
The scientist reviews similar studies and finds that pollutant levels above 16 are likely to result in the death of a chick soon after hatching.
\item Estimate the minimum thinning of the shell that is likely to result in the death of a chick.
\includegraphics[max width=\textwidth, alt={}, center]{0593544d-392d-465b-b922-c9cb1435abb5-05_1257_1568_301_173}
\end{enumerate}
\hfill \mbox{\textit{Edexcel S1 2012 Q3 [15]}}