| Exam Board | AQA |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2013 |
| Session | June |
| Marks | 17 |
| Topic | Bivariate data |
| Type | Compare correlation coefficients |
| Difficulty | Standard +0.3. A straightforward multi-part question on correlation and regression requiring standard formula application. Parts (a) and (c)(i) use the PMCC formula $r = S_{xy}/\sqrt{S_{xx} S_{yy}}$; part (c)(iii) uses the standard regression formulas $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$; part (c)(ii) is simple substitution. The interpretation and commentary parts require basic understanding but no novel insight. Slightly easier than average because it is purely procedural, with all necessary values provided. |
| Spec | 5.08a Pearson correlation: calculate PMCC; 5.08b Linear coding: effect on PMCC; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line |
# Question 4:
## Part (a)(i) - girth and weight PMCC
| Answer | Marks | Guidance |
|---|---|---|
| $r_{gy} = \frac{S_{gy}}{\sqrt{S_{gg} \cdot S_{yy}}} = \frac{24.15}{\sqrt{0.1196 \times 5880}}$ | M1 | Correct formula with correct values substituted |
| $= \frac{24.15}{\sqrt{703.248}} = \frac{24.15}{26.5188\ldots}$ | | |
| $= 0.911$ (3 s.f.) | A1 | |
## Part (a)(ii) - length and weight PMCC
| Answer | Marks | Guidance |
|---|---|---|
| $r_{ly} = \frac{S_{ly}}{\sqrt{S_{ll} \cdot S_{yy}}} = \frac{10.25}{\sqrt{0.0436 \times 5880}}$ | M1 | |
| $= \frac{10.25}{\sqrt{256.368}} = \frac{10.25}{16.011\ldots}$ | | |
| $= 0.640$ (3 s.f.) | A1 | |
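
As a quick numerical check of parts (a)(i) and (a)(ii), both coefficients can be computed directly from the summary statistics given in the question. This is only an illustrative sketch; the helper name `pmcc` is an assumption and does not come from the mark scheme.

```python
import math

def pmcc(s_xy: float, s_xx: float, s_yy: float) -> float:
    """Product moment correlation coefficient from summary statistics."""
    return s_xy / math.sqrt(s_xx * s_yy)

S_yy = 5880                        # sum of squares for weight y
r_gy = pmcc(24.15, 0.1196, S_yy)   # girth and weight
r_ly = pmcc(10.25, 0.0436, S_yy)   # length and weight

print(f"r_gy = {r_gy:.3f}")  # r_gy = 0.911
print(f"r_ly = {r_ly:.3f}")  # r_ly = 0.640
```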
## Part (b) - Interpretation
| Answer | Marks | Guidance |
|---|---|---|
| There is a strong positive correlation between girth and weight: pigs with larger girths tend to weigh more | B1 | In context |
| There is a moderate positive correlation between length and weight: longer pigs tend to weigh more, but the relationship is weaker than for girth | B1 | In context, reference to relative strengths |
| Both correlations are positive | B1 | |
## Part (c)(i) - Third PMCC
| Answer | Marks | Guidance |
|---|---|---|
| $r_{xy} = \frac{S_{xy}}{\sqrt{S_{xx} \cdot S_{yy}}} = \frac{5662.97}{\sqrt{5656.15 \times 5880}}$ | M1 | |
| $= \frac{5662.97}{\sqrt{33258162}} = \frac{5662.97}{5766.99\ldots}$ | | |
| $= 0.982$ (3 s.f.) | A1 | |
| $x$ is most strongly correlated with $y$, as $r_{xy} = 0.982$ is the value closest to 1 | A1 | Must state $x$ with reason |
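
The comparison in part (c)(i) can be sketched by computing all three coefficients side by side and picking the largest in magnitude. The dictionary layout and variable names below are illustrative assumptions; only the summary statistics come from the question.

```python
import math

S_yy = 5880
# (S_vy, S_vv) for each candidate predictor v of the weight y
summary = {
    "g": (24.15, 0.1196),
    "l": (10.25, 0.0436),
    "x": (5662.97, 5656.15),
}

# PMCC of each variable with y
r = {name: s_vy / math.sqrt(s_vv * S_yy) for name, (s_vy, s_vv) in summary.items()}

# The strongest correlation is the one with the largest |r|
strongest = max(r, key=lambda name: abs(r[name]))

for name, value in r.items():
    print(f"r_{name}y = {value:.3f}")
print("Most strongly correlated with y:", strongest)
```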
## Part (c)(ii) - Estimate weight
| Answer | Marks | Guidance |
|---|---|---|
| $x = 69.3 \times 1.25^2 \times 1.15$ | M1 | Substituting $g = 1.25$, $l = 1.15$ |
| $= 69.3 \times 1.5625 \times 1.15 = 124.52\ldots \approx 124.5$ kg | A1 | |
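
The substitution in part (c)(ii) can be checked with a one-line function. The function name `estimated_weight` is a hypothetical label for the proposed formula $x = 69.3 g^2 l$.

```python
def estimated_weight(girth_m: float, length_m: float) -> float:
    """Proposed estimate of a pig's weight in kg: x = 69.3 * g^2 * l."""
    return 69.3 * girth_m ** 2 * length_m

x_est = estimated_weight(1.25, 1.15)
print(f"{x_est:.1f} kg")  # 124.5 kg
```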
## Part (c)(iii) - Regression line
| Answer | Marks | Guidance |
|---|---|---|
| $b = \frac{S_{xy}}{S_{xx}} = \frac{5662.97}{5656.15}$ | M1 | |
| $= 1.00121$ (6 s.f.) | A1 | |
| $a = \bar{y} - b\bar{x} = 116.0 - 1.00121 \times 115.4$ | M1 | |
| $= 116.0 - 115.5396\ldots = 0.460$ | | |
| $y = 0.460 + 1.001x$ | A1 | Accept values rounding to these |
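
A short sketch of the regression calculation in part (c)(iii), using only the summary values from the question. Note that carrying the unrounded gradient through gives $a \approx 0.461$ rather than $0.460$; both round to $0.46$, which is consistent with the "values rounding to these" guidance.

```python
S_xx, S_xy = 5656.15, 5662.97
x_bar, y_bar = 115.4, 116.0

b = S_xy / S_xx          # gradient of the regression line of y on x
a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)

print(f"y = {a:.2f} + {b:.3f}x")  # y = 0.46 + 1.001x
```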
## Part (c)(iv) - Comment on accuracy
| Answer | Marks | Guidance |
|---|---|---|
| The PMCC of 0.982 is very close to 1, indicating a very strong positive linear relationship between $x$ and $y$ | B1 | |
| The value of $b \approx 1.001$ is very close to 1, so $y$ increases by about 1 kg for each 1 kg increase in $x$ | B1 | |
| The value of $a \approx 0.46$ is close to 0, so the regression line passes almost through the origin | B1 | |
| Together these suggest $y \approx x$, so the estimate of 124.5 kg is likely to be quite accurate (reliable) | B1 | Must give conclusion in context |
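
One way to illustrate the comment in part (c)(iv) is to feed the formula estimate into the regression line and see how far the predicted actual weight moves. This is an illustrative sketch rather than part of the mark scheme; it shows only that the fitted line is close to $y = x$, while the scatter about the line is what the PMCC describes.

```python
S_xx, S_xy = 5656.15, 5662.97
x_bar, y_bar = 115.4, 116.0

b = S_xy / S_xx
a = y_bar - b * x_bar

x_est = 69.3 * 1.25 ** 2 * 1.15   # formula estimate from part (c)(ii)
y_pred = a + b * x_est            # regression prediction of the actual weight

print(f"formula: {x_est:.1f} kg, regression: {y_pred:.1f} kg, "
      f"gap: {y_pred - x_est:.2f} kg")
# formula: 124.5 kg, regression: 125.1 kg, gap: 0.61 kg
```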
---
4 The girth, $g$ metres, the length, $l$ metres, and the weight, $y$ kilograms, of each of a sample of 20 pigs were measured.
The data collected is summarised as follows.
$$S_{gg} = 0.1196 \quad S_{ll} = 0.0436 \quad S_{yy} = 5880 \quad S_{gy} = 24.15 \quad S_{ly} = 10.25$$
\begin{enumerate}[label=(\alph*)]
\item Calculate the value of the product moment correlation coefficient between:
\begin{enumerate}[label=(\roman*)]
\item girth and weight;
\item length and weight.
\end{enumerate}
\item Interpret, in context, each of the values that you obtained in part (a).
\item Weighing pigs requires expensive equipment, whereas measuring their girths and lengths simply requires a tape measure. With this in mind, the following formula is proposed to make an estimate of a pig's weight, $x$ kilograms, from its girth and length.
$$x = 69.3 \times g^{2} \times l$$
Applying this formula to the relevant data on the 20 pigs resulted in
$$S_{xx} = 5656.15 \quad S_{xy} = 5662.97$$
\begin{enumerate}[label=(\roman*)]
\item By calculating a third value of the product moment correlation coefficient, state which of $g$, $l$ or $x$ is the most strongly correlated with $y$, the weight.
\item Estimate the weight of a pig that has a girth of 1.25 metres and a length of 1.15 metres.
\item Given the additional information that $\bar{x} = 115.4$ and $\bar{y} = 116.0$, calculate the equation of the least squares regression line of $y$ on $x$, in the form $y = a + bx$.
\item Comment on the likely accuracy of the estimated weight found in part (c)(ii). Your answer should make reference to the value of the product moment correlation coefficient found in part (c)(i) and to the values of $b$ and $a$ found in part (c)(iii).\\
(4 marks)
\end{enumerate}
\end{enumerate}
\hfill \mbox{\textit{AQA S1 2013 Q4 [17]}}