| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2002 |
| Session | November |
| Marks | 12 |
| Topic | Linear regression |
| Type | Convert regression equation between coded and original |
| Difficulty | Standard +0.3. This is a straightforward S1 linear regression question requiring standard formula application for regression coefficients, back-transformation of coded variables (routine algebraic manipulation), and recognition that correlation is invariant under linear coding. All techniques are textbook exercises with no novel problem-solving required, making it slightly easier than average. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08b Linear coding: effect on pmcc; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression |
## (a)
| Answer | Marks |
|---|---|
| $S_{ss} = 108.07875$; $S_{st} = 129.1675$ | B1; B1 |
| $q = \frac{S_{st}}{S_{ss}} = \frac{129.1675}{108.07875} = 1.1951239\ldots$ | M1, A1 |
| $p = \frac{65.0}{8} - (1.1951239\ldots) \times \frac{48.5}{8} = 0.879561\ldots$ | M1, A1 |
| $t = 0.879561\ldots + 1.1951239\ldots s$ | A1 ft |
| (7 marks) | |
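The figures in part (a) come from the usual S1 identities applied to the coded summations with $n = 8$. As a worked sketch (the labels $S_{ss}$ and $S_{st}$ for the corrected sum of squares and sum of products are notation assumed here):

\[
\begin{aligned}
S_{ss} &= \Sigma s^2 - \frac{(\Sigma s)^2}{n} = 402.11 - \frac{48.5^2}{8} = 108.07875 \\
S_{st} &= \Sigma st - \frac{\Sigma s \, \Sigma t}{n} = 523.23 - \frac{48.5 \times 65.0}{8} = 129.1675 \\
q &= \frac{S_{st}}{S_{ss}} = 1.1951239\ldots \\
p &= \bar{t} - q\bar{s} = 8.125 - 1.1951239\ldots \times 6.0625 = 0.879561\ldots
\end{aligned}
\]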
## (b)
| Answer | Marks |
|---|---|
| $y - 20 = 0.879561\ldots + 1.1951239\ldots(x - 6)$ | M1, A1 ft |
| $y = 13.709 + 1.195x$ | A1 |
| (3 marks) | |
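Part (b) amounts to substituting $s = x - 6$ and $t = y - 20$ into the coded line and collecting terms. A sketch of the arithmetic, with $p$ and $q$ denoting the unrounded intercept and gradient from part (a):

\[
\begin{aligned}
y - 20 &= p + q(x - 6) \\
y &= (20 + p - 6q) + qx \\
a &= 20 + 0.879561\ldots - 6 \times 1.1951239\ldots = 13.7088\ldots \approx 13.709 \\
b &= q \approx 1.195
\end{aligned}
\]

The gradient is unchanged because the coding only shifts each variable; only the intercept moves.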
## (c)
| Answer | Marks |
|---|---|
| $0.943$; the pmcc is an index (no units) and is not affected by linear transformations of either/both variables | B1; B1 |
| (2 marks) | |
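The justification in part (c) can be made explicit. Subtracting a constant from every observation shifts the mean by the same constant, so deviations from the mean are unchanged. In the sketch below, $S_{tt} = 701.80 - \frac{65.0^2}{8} = 173.675$ is computed from the given summations as a numerical check and is not quoted in the mark scheme:

\[
\begin{aligned}
s - \bar{s} &= (x - 6) - (\bar{x} - 6) = x - \bar{x}, \qquad t - \bar{t} = (y - 20) - (\bar{y} - 20) = y - \bar{y} \\
\Rightarrow \quad S_{ss} &= S_{xx}, \qquad S_{tt} = S_{yy}, \qquad S_{st} = S_{xy} \\
r_{xy} &= \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{S_{st}}{\sqrt{S_{ss} S_{tt}}} = \frac{129.1675}{\sqrt{108.07875 \times 173.675}} \approx 0.943
\end{aligned}
\]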
**Total: 12 marks**
---
An agricultural researcher collected data, in appropriate units, on the annual rainfall $x$ and the annual yield of wheat $y$ at 8 randomly selected places.
The data were coded using $s = x - 6$ and $t = y - 20$ and the following summations were obtained.
$\Sigma s = 48.5$, $\Sigma t = 65.0$, $\Sigma s^2 = 402.11$, $\Sigma t^2 = 701.80$, $\Sigma st = 523.23$
\begin{enumerate}[label=(\alph*)]
\item Find the equation of the regression line of $t$ on $s$ in the form $t = p + qs$. [7]
\item Find the equation of the regression line of $y$ on $x$ in the form $y = a + bx$, giving $a$ and $b$ to 3 decimal places. [3]
\end{enumerate}
The value of the product moment correlation coefficient between $s$ and $t$ is 0.943, to 3 decimal places.
\begin{enumerate}[label=(\alph*)]
\setcounter{enumi}{2}
\item Write down the value of the product moment correlation coefficient between $x$ and $y$. Give a justification for your answer. [2]
\end{enumerate}
\hfill \mbox{\textit{Edexcel S1 2002 Q5 [12]}}