| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2009 |
| Session | January |
| Marks | 11 |
| Topic | Linear regression |
| Type | Calculate y on x from raw data table |
| Difficulty | Moderate -0.8. This is a standard S1 linear regression question requiring straightforward application of formulas (Sxx, Sxy, regression line) with all summations provided. The calculations are routine, interpretation is basic, and the reliability comment in part (e) is a common textbook exercise. Easier than average A-level due to minimal problem-solving and direct formula application. |
| Spec | 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |

| \(x\) (hours) | 1.0 | 3.5 | 4.0 | 1.5 | 1.3 | 0.5 | 1.8 | 2.5 | 2.3 | 3.0 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) (marks) | 5 | 30 | 27 | 10 | -3 | -5 | 7 | 15 | -10 | 20 |

## Question 1:
**(a)**

| Answer | Marks | Guidance |
|---|---|---|
| $S_{xx} = 57.22 - \frac{(21.4)^2}{10} = 11.424$ | M1, A1 | M1 for correct expression; A1 for AWRT 11.4 |
| $S_{xy} = 313.7 - \frac{21.4 \times 96}{10} = 108.26$ | A1 (3) | A1 for AWRT 108. Both correct scores M1A1A1 |

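
The two values in part (a) can be cross-checked against the raw data table. The following is a minimal sketch rather than part of the mark scheme; the variable names (`sum_x`, `s_xx`, and so on) are illustrative assumptions, and the data are copied from the question's table.

```python
# Raw data from the question's table (x in hours, y in marks).
x = [1.0, 3.5, 4.0, 1.5, 1.3, 0.5, 1.8, 2.5, 2.3, 3.0]
y = [5, 30, 27, 10, -3, -5, 7, 15, -10, 20]
n = len(x)

# Recompute the summations that the question supplies.
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# S_xx = sum(x^2) - (sum x)^2 / n and S_xy = sum(xy) - (sum x)(sum y) / n
s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

print(round(s_xx, 3), round(s_xy, 2))  # 11.424 108.26
```
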
**(b)**

| Answer | Marks | Guidance |
|---|---|---|
| $b = \frac{S_{xy}}{S_{xx}} = 9.4765...$ | M1 A1 | 1st M1 for using values in correct formula; 1st A1 for AWRT 9.5 |
| $a = \bar{y} - b\bar{x} = 9.6 - 2.14b = (-10.679...)$ | M1 | 2nd M1 for correct method for $a$ (minus sign required) |
| $y = -10.7 + 9.48x$ | A1 (4) | 2nd A1 for the full equation with $a$ and $b$ both AWRT 3 sf |

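
As a hedged check of part (b), the gradient and intercept can be recomputed from the summary statistics. This sketch assumes the values $S_{xx} = 11.424$ and $S_{xy} = 108.26$ from part (a); the variable names are illustrative only.

```python
n = 10
sum_x, sum_y = 21.4, 96        # summations given in the question
s_xx, s_xy = 11.424, 108.26    # values from part (a)

# Gradient b = S_xy / S_xx; intercept a = y_bar - b * x_bar.
b = s_xy / s_xx
x_bar = sum_x / n
y_bar = sum_y / n
a = y_bar - b * x_bar

# Report both coefficients to 3 significant figures.
print(f"y = {a:.3g} + {b:.3g}x")  # y = -10.7 + 9.48x
```
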
**(c)**

| Answer | Marks | Guidance |
|---|---|---|
| Every (extra) hour spent using the programme produces about 9.5 marks improvement | B1ft (1) | Must mention value of $b$, must mention "marks" and "hour(s)". "…9.5 times per hour…" scores B0 |

**(d)**

| Answer | Marks | Guidance |
|---|---|---|
| $y = -10.7 + 9.48 \times 3.3 = 20.6$ (AWRT 21) | M1, A1 (2) | M1 for sub $x = 3.3$ into regression equation; A1 for AWRT 21 |

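
The AWRT 21 tolerance in part (d) accommodates both the rounded and the unrounded coefficients. A small sketch, with `predict` as a hypothetical helper that is not part of the mark scheme, compares the two:

```python
def predict(x_hours, a, b):
    """Evaluate the regression line y = a + b * x at x_hours."""
    return a + b * x_hours

# Coefficients rounded to 3 significant figures, as in the mark scheme.
rounded_line = predict(3.3, -10.7, 9.48)

# Unrounded coefficients rebuilt from S_xx = 11.424 and S_xy = 108.26.
b_full = 108.26 / 11.424
a_full = 9.6 - 2.14 * b_full
full_precision = predict(3.3, a_full, b_full)

# Both versions round to 20.6, which is within AWRT 21.
print(round(rounded_line, 1), round(full_precision, 1))  # 20.6 20.6
```
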
**(e)**

| Answer | Marks | Guidance |
|---|---|---|
| Model may not be valid since [8h is] outside the range [0.5–4] | B1 (1) | B1 for statement that it may not be valid because outside the range. Do not need to mention values 8h or 0.5–4 |

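
The range argument in part (e) can be illustrated with a short sketch. The extrapolated figure below is shown only for context and is not required by the mark scheme: the line does return a value above 60 at $x = 8$, but that value lies outside the observed data and so cannot be relied on. Variable names are illustrative assumptions.

```python
# Observed times from the question's table, in hours.
x = [1.0, 3.5, 4.0, 1.5, 1.3, 0.5, 1.8, 2.5, 2.3, 3.0]
x_min, x_max = min(x), max(x)

lee_hours = 8
inside_range = x_min <= lee_hours <= x_max

# Value the fitted line would give if it were extended to 8 hours.
extrapolated = -10.7 + 9.48 * lee_hours

print(f"observed range: {x_min} to {x_max} hours")
print(f"8 hours inside range: {inside_range}")
print(f"extrapolated value (unreliable): {extrapolated:.2f}")
```
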
---
\begin{enumerate}
\item A teacher is monitoring the progress of students using a computer based revision course. The improvement in performance, $y$ marks, is recorded for each student along with the time, $x$ hours, that the student spent using the revision course. The results for a random sample of 10 students are recorded below.
\end{enumerate}
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | c | c | }
\hline
\begin{tabular}{ c }
$x$ \\
hours \\
\end{tabular} & 1.0 & 3.5 & 4.0 & 1.5 & 1.3 & 0.5 & 1.8 & 2.5 & 2.3 & 3.0 \\
\hline
\begin{tabular}{ c }
$y$ \\
marks \\
\end{tabular} & 5 & 30 & 27 & 10 & $-3$ & $-5$ & 7 & 15 & $-10$ & 20 \\
\hline
\end{tabular}
\end{center}
$$\text{[You may use } \sum x = 21.4, \quad \sum y = 96, \quad \sum x^2 = 57.22, \quad \sum xy = 313.7\text{]}$$
(a) Calculate $S_{xx}$ and $S_{xy}$.\\
(b) Find the equation of the least squares regression line of $y$ on $x$ in the form $y = a + bx$.\\
(c) Give an interpretation of the gradient of your regression line.
Rosemary spends 3.3 hours using the revision course.\\
(d) Predict her improvement in marks.
Lee spends 8 hours using the revision course claiming that this should give him an improvement in performance of over 60 marks.\\
(e) Comment on Lee's claim.\\
\hfill \mbox{\textit{Edexcel S1 2009 Q1 [11]}}