| Exam Board | OCR MEI |
|---|---|
| Module | S2 (Statistics 2) |
| Year | 2006 |
| Session | June |
| Marks | 18 |
| Topic | Hypothesis test of Pearson’s product-moment correlation coefficient |
| Type | Calculate PMCC from summary statistics |
| Difficulty | Standard +0.3. This is a standard S2 hypothesis testing question requiring routine application of the PMCC formula from summary statistics and comparison with critical values. Part (i) is a straightforward calculation, part (ii) is a textbook hypothesis test procedure, and parts (iii) and (iv) test understanding of significance levels and experimental design but require only recall of standard concepts. Slightly easier than average because it is methodical rather than requiring problem-solving insight. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08d Hypothesis test: Pearson correlation |
# Question 3:
## Part (i)
| Answer | Marks |
|---|---|
| $S_{xx} = 2237725 - \frac{4715^2}{10} = 14602.5$ | M1 |
| $S_{yy} = 17455825 - \frac{13175^2}{10} = 97762.5$ | A1 |
| $S_{xy} = 6235575 - \frac{4715 \times 13175}{10} = 23562.5$ | A1 |
| $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{23562.5}{\sqrt{14602.5 \times 97762.5}}$ | M1 |
| $r = 0.624$ (3 s.f.) | A1 |
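
As a cross-check on the arithmetic in part (i), the following minimal Python sketch recomputes $S_{xx}$, $S_{yy}$, $S_{xy}$ and $r$ directly from the summary statistics given in the question. The function name `pmcc_from_sums` is hypothetical and introduced only for illustration.

```python
from math import sqrt


def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return (Sxx, Syy, Sxy, r) computed from summary statistics.

    Hypothetical helper for illustration; not part of the mark scheme.
    """
    s_xx = sum_x2 - sum_x ** 2 / n      # Sxx = sum(x^2) - (sum x)^2 / n
    s_yy = sum_y2 - sum_y ** 2 / n      # Syy = sum(y^2) - (sum y)^2 / n
    s_xy = sum_xy - sum_x * sum_y / n   # Sxy = sum(xy) - (sum x)(sum y) / n
    r = s_xy / sqrt(s_xx * s_yy)        # product moment correlation coefficient
    return s_xx, s_yy, s_xy, r


# Summary statistics from the question (n = 10 plums).
s_xx, s_yy, s_xy, r = pmcc_from_sums(
    n=10, sum_x=4715, sum_y=13175,
    sum_x2=2237725, sum_y2=17455825, sum_xy=6235575,
)
print(s_xx, s_yy, s_xy, round(r, 4))  # 14602.5 97762.5 23562.5 0.6236
```

Working from the sums rather than the raw data mirrors what the question provides; the individual plum measurements are not available.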
## Part (ii)
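| Answer | Marks |
|---|---|
| $H_0: \rho = 0$; $H_1: \rho \neq 0$, where $\rho$ is the population correlation coefficient between length and circumference | B1 |
| Critical value at the 5% level (two-tailed), $n=10$: $0.6319$ | B1 |
| $r = 0.624 < 0.6319$, so do not reject $H_0$ | M1 |
| Insufficient evidence at the 5% level of correlation between length and circumference of plums in the crop | A1 A1 A1 |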
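
The tabulated critical value can be related to the $t$ distribution: assuming the population is bivariate normal, under $H_0$ the statistic $t = r\sqrt{n-2}/\sqrt{1-r^2}$ follows a $t$ distribution with $n-2$ degrees of freedom, so a critical $t$ value converts to a critical $r$ value. The sketch below illustrates this under that assumption. The function names are hypothetical, and the value $2.306$ (two-tailed 5% point of $t_8$) is taken from standard tables rather than from the source.

```python
from math import sqrt


def critical_r(t_crit, n):
    """Convert a critical t value (n - 2 degrees of freedom) to a critical r.

    Inverts t = r * sqrt(n - 2) / sqrt(1 - r^2).
    """
    return t_crit / sqrt(t_crit ** 2 + n - 2)


def is_significant(r, n, t_crit):
    """Two-tailed test: reject H0 (rho = 0) when |r| exceeds the critical r."""
    return abs(r) > critical_r(t_crit, n)


# Assumption: 2.306 is the two-tailed 5% critical value of t with 8 degrees
# of freedom, read from standard statistical tables.
T_CRIT_5PC_DF8 = 2.306

r_crit = critical_r(T_CRIT_5PC_DF8, n=10)
print(round(r_crit, 4))                              # 0.6319
print(is_significant(0.6236, 10, T_CRIT_5PC_DF8))    # False
```

In an examination the critical value is simply read from the PMCC table; the conversion is shown only to indicate where the tabulated figure comes from.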
## Part (iii)(A)
| Answer | Marks |
|---|---|
| There is a 5% probability of rejecting $H_0$ when it is in fact true | B1 B1 |
## Part (iii)(B)
| Answer | Marks |
|---|---|
| Advantage: less likely to incorrectly reject $H_0$ / fewer Type I errors | B1 |
| Disadvantage: less likely to detect a genuine correlation / more Type II errors | B1 |
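
The trade-off in part (iii) can be illustrated by simulation. The sketch below is not part of the mark scheme: it assumes bivariate normal data, samples of size 10, and an arbitrary illustrative population correlation of $0.6$ for the "genuine correlation" case. The critical values $0.6319$ (5%) and $0.7646$ (1%) are the two-tailed values for $n = 10$ from standard PMCC tables, and all function names are hypothetical.

```python
import random
from math import sqrt


def sample_r(n, rho, rng):
    """Draw n bivariate normal pairs with correlation rho; return the sample PMCC."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    # y = rho * x + sqrt(1 - rho^2) * z has correlation rho with x.
    ys = [rho * x + sqrt(1 - rho ** 2) * rng.gauss(0, 1) for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - mx) ** 2 for x in xs)
    s_yy = sum((y - my) ** 2 for y in ys)
    s_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return s_xy / sqrt(s_xx * s_yy)


def rejection_rate(crit, rho, n=10, trials=20000, seed=0):
    """Estimate the proportion of samples for which |r| exceeds crit."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = sum(abs(sample_r(n, rho, rng)) > crit for _ in range(trials))
    return rejections / trials


# Under H0 (rho = 0): proportion of Type I errors at each level.
type1_5pc = rejection_rate(0.6319, rho=0.0)
type1_1pc = rejection_rate(0.7646, rho=0.0)

# Under an assumed genuine correlation (rho = 0.6): proportion detected.
power_5pc = rejection_rate(0.6319, rho=0.6)
power_1pc = rejection_rate(0.7646, rho=0.6)

print(type1_5pc, type1_1pc)  # close to 0.05 and 0.01 respectively
print(power_5pc, power_1pc)  # the 1% test detects the correlation less often
```

The first pair of figures reflects the advantage of the 1% level (fewer Type I errors); the second pair reflects the disadvantage (a genuine correlation is detected less often, so there are more Type II errors).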
## Part (iv)
| Answer | Marks |
|---|---|
| Wrong to ignore first result; both results should be considered | B1 |
| Could combine samples or take larger sample | B1 B1 |
---
3 A student is investigating the relationship between the length $x \mathrm {~mm}$ and circumference $y \mathrm {~mm}$ of plums from a large crop. The student measures the dimensions of a random sample of 10 plums from this crop. Summary statistics for these dimensions are as follows.
$$\begin{aligned}
& \sum x = 4715 \quad \sum y = 13175 \quad \sum x ^ { 2 } = 2237725 \\
& \sum y ^ { 2 } = 17455825 \quad \sum x y = 6235575 \quad n = 10
\end{aligned}$$
\begin{enumerate}[label=(\roman*)]
\item Calculate the sample product moment correlation coefficient.
\item Carry out a hypothesis test at the $5 \%$ significance level to determine whether there is any correlation between length and circumference of plums from this crop. State your hypotheses clearly, defining any symbols which you use.
\item (A) Explain the meaning of a $5 \%$ significance level.\\
(B) State one advantage and one disadvantage of using a $1 \%$ significance level rather than a $5 \%$ significance level in a hypothesis test.

The student decides to take another random sample of 10 plums. Using the same hypotheses as in part (ii), the correlation coefficient for this second sample is significant at the $5 \%$ level. The student decides to ignore the first result and concludes that there is correlation between the length and circumference of plums in the crop.
\item Comment on the student's decision to ignore the first result. Suggest a better way in which the student could proceed.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI S2 2006 Q3 [18]}}