| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2001 |
| Session | January |
| Marks | 18 |
| Topic | Linear regression |
| Type | Calculate y on x from raw data table |
| Difficulty | Moderate (-0.8). This is a standard S1 linear regression question that follows a routine template: calculate Sxx, Sxy and Syy from summary statistics, find the correlation coefficient from the standard formula, fit the regression line, and interpret it. Every step is algorithmic and needs no problem-solving. The only mild challenge is part (g), which requires an understanding of extrapolation, a common textbook concept. The question is significantly easier than the average A-level maths question. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
## Question 6:
**Part (a)**
| Answer | Marks | Guidance |
|--------|-------|----------|
| $S_{xx} = 65.68 - \dfrac{25^2}{10} = 3.18$ | B1 | |
| $S_{xy} = 130.64 - \dfrac{25\times50.0}{10} = 5.64$ | B1 | |
| $S_{yy} = 260.48 - \dfrac{50^2}{10} = 10.48$ | B1 (3) | |
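
As a cross-check on the arithmetic, the three sums can be reproduced from the summary statistics given in the question ($n = 10$, $\Sigma x = 25.0$, $\Sigma x^2 = 65.68$, $\Sigma y = 50.0$, $\Sigma y^2 = 260.48$, $\Sigma xy = 130.64$). The following is a minimal Python sketch; the function name `summary_sums` is illustrative and not part of the mark scheme.

```python
def summary_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Return (Sxx, Sxy, Syy) computed from summary statistics."""
    s_xx = sum_x2 - sum_x ** 2 / n      # Sxx = sum(x^2) - (sum x)^2 / n
    s_xy = sum_xy - sum_x * sum_y / n   # Sxy = sum(xy) - (sum x)(sum y) / n
    s_yy = sum_y2 - sum_y ** 2 / n      # Syy = sum(y^2) - (sum y)^2 / n
    return s_xx, s_xy, s_yy


# Values taken from the question's data summary.
s_xx, s_xy, s_yy = summary_sums(10, 25.0, 65.68, 50.0, 260.48, 130.64)
print(round(s_xx, 2), round(s_xy, 2), round(s_yy, 2))  # 3.18 5.64 10.48
```
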
**Part (b)**
| Answer | Marks | Guidance |
|--------|-------|----------|
| $\text{pmcc} = \dfrac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \dfrac{5.64}{\sqrt{3.18\times10.48}} = 0.977$ | M1 A1 A1 (3) | |
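
The correlation coefficient can be checked the same way. This short sketch assumes $S_{xx} = 3.18$, $S_{xy} = 5.64$ and $S_{yy} = 10.48$, the last following from $\Sigma y^2 = 260.48$ in the question; the function name `pmcc` is illustrative.

```python
import math


def pmcc(s_xx, s_xy, s_yy):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    return s_xy / math.sqrt(s_xx * s_yy)


r = pmcc(3.18, 5.64, 10.48)
print(round(r, 3))  # 0.977
```
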
**Part (c)**
| Answer | Marks | Guidance |
|--------|-------|----------|
| Positive correlation, close to but not perfect correlation | B1 (1) | |
**Part (d)**
| Answer | Marks | Guidance |
|--------|-------|----------|
| $b = \dfrac{S_{xy}}{S_{xx}} = \dfrac{5.64}{3.18} = 1.773$ | M1 A1 | |
| $a = \bar{y} - b\bar{x} = \left(\dfrac{50}{10}\right) - 1.773\times\left(\dfrac{25}{10}\right) = 0.566$ | M1 A1 (4) | |
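
The coefficients can be sketched in Python as a check. The function name `regression_coefficients` is illustrative. Note that the intercept $a = 0.566$ is obtained by carrying the unrounded slope ($5.64/3.18 \approx 1.7736$) through the calculation rather than a rounded value.

```python
def regression_coefficients(n, sum_x, sum_y, s_xx, s_xy):
    """Return (a, b) for the least squares line y = a + b x."""
    b = s_xy / s_xx                   # slope
    a = sum_y / n - b * (sum_x / n)   # intercept: y-bar minus b times x-bar
    return a, b


a, b = regression_coefficients(10, 25.0, 50.0, 3.18, 5.64)
print(round(a, 3), round(b, 4))  # 0.566 1.7736
```
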
**Part (e)**
| Answer | Marks | Guidance |
|--------|-------|----------|
| $a = 0.566 \Rightarrow$ the cost of reconditioning an incinerator immediately after it has been reconditioned (i.e. with no usage) is £566 | B1 (1) | |
**Part (f)**
| Answer | Marks | Guidance |
|--------|-------|----------|
| (i) $y = 0.566 + 1.77\times2.4 = 4.814$, i.e. £4814 | M1 A1 (2) | NB: if $2400$ is used instead of $2.4$, award M0 |
| (ii) Increase is $1.77\times1.5 = 2.655$, i.e. an increase of £2655 (or $0.566+1.77\times3.9 - 4.814$) | M1 A1 (2) | |
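
The unit conversion is the step most likely to go wrong here, since the model takes $x$ in thousands of hours and returns $y$ in units of £1000. The sketch below makes both conversions explicit. It assumes the rounded values $a = 0.566$ and $b = 1.77$, which reproduce the figures £4814 and £2655; the function names are illustrative.

```python
def predicted_cost_pounds(hours, a=0.566, b=1.77):
    """Predicted reconditioning cost in pounds for an operating time in hours."""
    x = hours / 1000    # the model takes x in thousands of hours
    y = a + b * x       # y is in units of £1000
    return y * 1000


def cost_increase_pounds(extra_hours, b=1.77):
    """Change in predicted cost, in pounds, for extra operating hours."""
    return b * (extra_hours / 1000) * 1000


print(round(predicted_cost_pounds(2400)))  # 4814
print(round(cost_increase_pounds(1500)))   # 2655
```

The increase in part (ii) depends only on the slope $b$, so the intercept cancels when the two predictions are subtracted.
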
**Part (g)**
| Answer | Marks | Guidance |
|--------|-------|----------|
| 4500 hours is well out of the range of $x$ values ($x \leq 3.0$) and thus there is no evidence that the model will apply | B1s, B1ft (2) | |
6. A local authority is investigating the cost of reconditioning its incinerators. Data from 10 randomly chosen incinerators were collected. The variables monitored were the operating time $x$ (in thousands of hours) since last reconditioning and the reconditioning cost $y$ (in $\pounds 1000$). None of the incinerators had been used for more than 3000 hours since last reconditioning.
The data are summarised below.
$$\Sigma x = 25.0, \quad \Sigma x^{2} = 65.68, \quad \Sigma y = 50.0, \quad \Sigma y^{2} = 260.48, \quad \Sigma xy = 130.64.$$
\begin{enumerate}[label=(\alph*)]
\item Find $S_{xx}$, $S_{xy}$, $S_{yy}$.
\item Calculate the product moment correlation coefficient between $x$ and $y$.
\item Explain why this value might support the fitting of a linear regression model of the form $y = a + b x$.
\item Find the values of $a$ and $b$.
\item Give an interpretation of $a$.
\item Estimate
\begin{enumerate}[label=(\roman*)]
\item the reconditioning cost for an operating time of 2400 hours,
\item the financial effect of an increase of 1500 hours in operating time.
\end{enumerate}
\item Suggest why the authority might be cautious about making a prediction of the reconditioning cost of an incinerator which had been operating for 4500 hours since its last reconditioning.
\end{enumerate}
\hfill \mbox{\textit{Edexcel S1 2001 Q6 [18]}}