| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Marks | 15 |
| Topic | Linear regression |
| Type | Calculate regression line then predict |
| Difficulty | Standard +0.3. This is a standard S1 regression question requiring routine calculations: summing data, applying regression formulas, making predictions, and finding correlation from two regression lines. All techniques are textbook exercises with no novel problem-solving required. The multi-part structure and computational work make it slightly easier than average for A-level, but the straightforward application of standard formulas keeps it close to typical difficulty. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08d Hypothesis test: Pearson correlation; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
| Answer | Marks | Guidance |
|---|---|---|
| (a) \(\sum x = 495, \sum y = 431\) | B1 B1 | |
| (b) \(S_{xx} = 1850.5, S_{xy} = 1656.5\) giving \(y - 43.1 = 0.895(x - 49.5)\) or \(y = 0.895x - 1.21\) | B1 B1 M1 A1 A1 | |
| (c) (i) 57, (ii) 3; estimate (ii) is less reliable because \(x = 5\) lies outside the range of the given values (extrapolation) | M1 A1 A1 B1 | |
| (d) \(r = \sqrt{0.895 \times 0.921} = 0.908\). Quite good positive correlation | M1 M1 A1 B1 | Total: 15 marks |
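
The mark-scheme values can be checked numerically. The following is a minimal sketch, not part of the original mark scheme: it copies the data from the question's table and applies the standard S1 formulas \(S_{xx} = \sum x^2 - (\sum x)^2/n\), \(S_{xy} = \sum xy - \sum x \sum y/n\), \(b = S_{xy}/S_{xx}\) and \(a = \bar{y} - b\bar{x}\). All variable names and the `predict` helper are illustrative choices.

```python
# Marks out of 75, copied from the question's table.
x = [54, 33, 42, 71, 60, 27, 39, 46, 59, 64]  # Module 1
y = [50, 22, 44, 58, 42, 19, 35, 46, 55, 60]  # Module 2
n = len(x)

# (a) Column totals.
sum_x, sum_y = sum(x), sum(y)

# (b) Summary statistics and the regression line of y on x.
sum_x2 = sum(v * v for v in x)
sum_xy = sum(u * v for u, v in zip(x, y))
s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
b = s_xy / s_xx                    # gradient of y on x
a = sum_y / n - b * sum_x / n      # intercept of y on x


def predict(x_value):
    """Estimated Module 2 mark for a given Module 1 mark (hypothetical helper)."""
    return a + b * x_value


# (d) Gradient of x on y, then r from the product of the two gradients.
sum_y2 = sum(v * v for v in y)
s_yy = sum_y2 - sum_y ** 2 / n
b_x_on_y = s_xy / s_yy
# Both gradients are positive here, so the positive square root is taken.
r = (b * b_x_on_y) ** 0.5

print(sum_x, sum_y)
print(round(b, 3), round(a, 2))
print(round(predict(65)), round(predict(5)))
print(round(r, 3))
```

The intercept is computed from the unrounded gradient. Using the rounded value 0.895 gives roughly \(-1.20\) instead of \(-1.21\).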
The marks out of 75 obtained by a group of ten students in their first and second Statistics modules were as follows:
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline
Student & $A$ & $B$ & $C$ & $D$ & $E$ & $F$ & $G$ & $H$ & $I$ & $J$ \\
\hline
Module 1 $(x)$ & $54$ & $33$ & $42$ & $71$ & $60$ & $27$ & $39$ & $46$ & $59$ & $64$ \\
\hline
Module 2 $(y)$ & $50$ & $22$ & $44$ & $58$ & $42$ & $19$ & $35$ & $46$ & $55$ & $60$ \\
\hline
\end{tabular}
\begin{enumerate}[label=(\alph*)]
\item Find $\sum x$ and $\sum y$. [2 marks]
\end{enumerate}
Given that $\sum x^2 = 26353$ and $\sum xy = 22991$,
\begin{enumerate}[label=(\alph*)]
\setcounter{enumi}{1}
\item obtain the equation of the regression line of $y$ on $x$. [5 marks]
\item Estimate the Module 2 result of a student whose mark in Module 1 was (i) 65, (ii) 5. Explain why one of these estimates is less reliable than the other. [4 marks]
\end{enumerate}
The equation of the regression line of $x$ on $y$ is $x = 0.921y + 9.81$.
\begin{enumerate}[label=(\alph*)]
\setcounter{enumi}{3}
\item Deduce the product moment correlation coefficient between $x$ and $y$, and briefly interpret its value. [4 marks]
\end{enumerate}
\hfill \mbox{\textit{Edexcel S1 Q6 [15]}}
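
The working below is a sketch of one route to the answers for parts (b) and (d), using the standard S1 formulas with $n = 10$, $\bar{x} = 49.5$ and $\bar{y} = 43.1$. The symbols $b$, $a$ and $b'$ (gradient and intercept of $y$ on $x$, and gradient of $x$ on $y$) are labels introduced here for illustration and do not appear in the question.
\[
S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 26353 - \frac{495^2}{10} = 1850.5,
\qquad
S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 22991 - \frac{495 \times 431}{10} = 1656.5
\]
\[
b = \frac{S_{xy}}{S_{xx}} = \frac{1656.5}{1850.5} \approx 0.895,
\qquad
a = \bar{y} - b\bar{x} \approx 43.1 - 0.8952 \times 49.5 \approx -1.21
\]
For part (d), the two regression gradients multiply to give $r^2$:
\[
b \, b' = \frac{S_{xy}}{S_{xx}} \times \frac{S_{xy}}{S_{yy}} = \frac{S_{xy}^2}{S_{xx} S_{yy}} = r^2,
\qquad
r = \sqrt{0.895 \times 0.921} \approx 0.908
\]
The positive root is taken because both gradients are positive.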