| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2014 |
| Session | January |
| Marks | 11 |
| Topic | Linear regression |
| Type | Calculate y on x from raw data table |
| Difficulty | Moderate -0.8. This is a straightforward application of the standard linear regression formulas with all summary statistics provided (Stp, Stt, means). Students substitute into b = Stp/Stt and a = p̄ - b·t̄, then use the equation for prediction. The scatter diagram and comments require minimal interpretation. This is easier than average because it is purely procedural, with no problem-solving or conceptual challenge. |
| Spec | 2.02c Scatter diagrams and regression lines; 2.02d Informal interpretation of correlation |
# Question 3:
## Part (a)
| Answer | Marks | Guidance |
|--------|-------|----------|
| At least 7 points plotted correctly | B1 | Within (not on) circles on overlay |
| All 8 points plotted correctly | B1 | Within (not on) circles on overlay |
## Part (b)
| Answer | Marks | Guidance |
|--------|-------|----------|
| "Negative correlation" or "as $t$ increases, $p$ decreases" or "Points close to a straight line" or "linear correlation" | B1 | "negative relationship" or "skew" scores B0; "linear correlation" acceptable |
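
The mark scheme only asks for a verbal comment here, but the strength of the negative linear relationship can be checked numerically. The sketch below is not part of the mark scheme: it assumes the summary statistics quoted in the question ($S_{tp}$, $S_{tt}$, $S_{pp}$) and computes the product moment correlation coefficient $r = S_{tp}/\sqrt{S_{tt} S_{pp}}$. The variable names are illustrative.

```python
import math

# Summary statistics quoted in the question
S_tp = -6080
S_tt = 254
S_pp = 169200

# Product moment correlation coefficient
r = S_tp / math.sqrt(S_tt * S_pp)

print(f"r = {r:.4f}")
```

A value of $r$ close to $-1$ is consistent with the accepted answers "negative correlation" and "points close to a straight line".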
## Part (c)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $b = \frac{S_{tp}}{S_{tt}} = \frac{-6080}{254} (= -23.937)$ | M1 | Correct expression for gradient $b$; allow fractions e.g. $-\frac{3040}{127}$ |
| $a = \bar{p} - b\bar{t} = 470 + 23.937 \times 19.5 = 936.7717$ | M1, A1 | Follow through their $b$; allow sign slip on $b$ only if correct formula for $a$ seen; $a =$ awrt 937 |
| $p = 936.7717 - 23.937t$ awrt $p = 937 - 23.9t$ | A1 | Correct equation in $p$ and $t$ (not $x,y$) with $a=$ awrt 937 and $b=$ awrt $-23.9$; no fractions |
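
As a cross-check on the working above, the following minimal sketch recomputes the summary statistics from the raw data table in the question and then forms the gradient and intercept. It uses only plain Python; the variable names are illustrative and not taken from the mark scheme.

```python
# Raw data from the question
t = [10, 13, 17, 18, 22, 24, 25, 27]          # years since passing the test
p = [720, 650, 430, 490, 500, 390, 280, 300]  # insurance price in pounds

n = len(t)
t_bar = sum(t) / n
p_bar = sum(p) / n

# S_tt = sum(t^2) - (sum t)^2 / n,  S_tp = sum(tp) - (sum t)(sum p) / n
S_tt = sum(x * x for x in t) - sum(t) ** 2 / n
S_tp = sum(x * y for x, y in zip(t, p)) - sum(t) * sum(p) / n

b = S_tp / S_tt        # gradient of the regression line of p on t
a = p_bar - b * t_bar  # intercept

print(f"S_tt = {S_tt}, S_tp = {S_tp}")
print(f"p = {a:.4f} + ({b:.4f}) t")
```

The computed values should agree with the quoted $S_{tt} = 254$ and $S_{tp} = -6080$, and with $a$ awrt 937 and $b$ awrt $-23.9$.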
## Part (d)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $p = 936.7717 - 23.937 \times 20 = 458.0315$ awrt £458 | M1, A1 | M1 for substituting $t=20$; NB use of 3sf for $a$ and $b$ gives awrt £459 but scores A0 |
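
The note about 3sf rounding can be illustrated with a short sketch. It assumes the summary statistics quoted in the question and compares the estimate at $t = 20$ using unrounded coefficients with the estimate using $a = 937$ and $b = -23.9$. The variable names are illustrative.

```python
# Summary statistics quoted in the question
S_tp, S_tt = -6080, 254
t_bar, p_bar = 19.5, 470

b = S_tp / S_tt        # unrounded gradient
a = p_bar - b * t_bar  # unrounded intercept

full_precision = a + b * 20      # estimate with unrounded coefficients
rounded_3sf = 937 + (-23.9) * 20 # estimate with a and b rounded to 3sf

print(f"unrounded coefficients: {full_precision:.4f}")
print(f"3sf coefficients:       {rounded_3sf:.4f}")
```

The two estimates differ by about £1, which is why the mark scheme withholds the accuracy mark when prematurely rounded coefficients are used.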
## Part (e)
| Answer | Marks | Guidance |
|--------|-------|----------|
| "Extrapolation" or "39 (or 'it') is outside the range of data (or table)" | B1 | Stating 39 is an "outlier" is B0 |
| Not a good decision or the prediction would be unreliable | dB1 | Dependent on suitable reason; stating or implying it is not a sensible decision |
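
To see why the prediction is unreliable, the sketch below checks whether $t = 39$ lies within the observed range of $t$ and evaluates the fitted line there anyway. This is an illustration only, not part of the mark scheme. It assumes the data table and summary statistics from the question, and the variable names are illustrative.

```python
# Observed t values from the question
t = [10, 13, 17, 18, 22, 24, 25, 27]

# Regression coefficients from the quoted summary statistics
S_tp, S_tt = -6080, 254
t_bar, p_bar = 19.5, 470
b = S_tp / S_tt
a = p_bar - b * t_bar

t_jack = 39
t_min, t_max = min(t), max(t)

# A value outside [t_min, t_max] means the line is being extrapolated
is_extrapolation = not (t_min <= t_jack <= t_max)
prediction = a + b * t_jack

print(f"observed range of t: {t_min} to {t_max}")
print(f"extrapolating: {is_extrapolation}")
print(f"line evaluated at t = {t_jack}: {prediction:.2f}")
```

The line gives a price of only a few pounds at $t = 39$, an implausible value that shows how the linear model can break down outside the range of the data.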
---
3. Jean works for an insurance company. She randomly selects 8 people and records the price of their car insurance, $\pounds p$, and the time, $t$ years, since they passed their driving test. The data is shown in the table below.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | }
\hline
$t$ & 10 & 13 & 17 & 18 & 22 & 24 & 25 & 27 \\
\hline
$p$ & 720 & 650 & 430 & 490 & 500 & 390 & 280 & 300 \\
\hline
\end{tabular}
\end{center}
$$\text{(You may use } \bar{t} = 19.5,\ \bar{p} = 470,\ S_{tp} = -6080,\ S_{tt} = 254,\ S_{pp} = 169200\text{)}$$
\begin{enumerate}[label=(\alph*)]
\item On the graph below draw a scatter diagram for these data.
\item Comment on the relationship between $p$ and $t$.
\item Find the equation of the regression line of $p$ on $t$.
\item Use your regression equation to estimate the price of car insurance for someone who passed their driving test 20 years ago.
Jack passed his test 39 years ago and decides to use Jean's data to predict the price of his car insurance.
\item Comment on Jack's decision. Give a reason for your answer.\\
\includegraphics[max width=\textwidth, alt={}, center]{a839a89a-17f0-473b-ac10-bcec3dbe97f7-06_951_1365_1603_294}
\end{enumerate}
\hfill \mbox{\textit{Edexcel S1 2014 Q3 [11]}}