| Exam Board | OCR MEI |
|---|---|
| Module | S2 (Statistics 2) |
| Year | 2011 |
| Session | January |
| Marks | 17 |
| Topic | Hypothesis test of Pearson’s product-moment correlation coefficient |
| Type | Calculate PMCC from summary statistics |
| Difficulty | Standard +0.3. This is a straightforward application of the PMCC formula to given summary statistics, followed by a standard hypothesis test. The calculations are routine and the test follows a textbook template; part (iv) requires some interpretation, but the context guides it. Slightly easier than average because little problem-solving is required. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08d Hypothesis test: Pearson correlation |
# Question 1 (part ii):
| Answer/Working | Marks | Guidance |
|---|---|---|
| $H_0: \rho = 0$; $H_1: \rho \neq 0$ (two-tailed test) | B1 | Condone hypotheses written in words/context. Allow $H_0$: no correlation between $x$ & $y$. If only words used and 'association' mentioned, do not award first B1 and last B1 |
| where $\rho$ is the population correlation coefficient | B1 | |
| For $n = 14$, 5% critical value $= -0.5324$ | B1 (+or-) | One-tailed test $cv = (-) 0.4575$ |
| Since $-0.276 > -0.5324$ the result is not significant. Thus we do not have sufficient evidence to reject $H_0$ | M1, A1 | Comparison between candidate's $r$ from part (i) and appropriate $cv$ (signs must match). If result not stated but conclusion correct award SC1 to replace final A1 B1 |
| There is not sufficient evidence at the 5% level to suggest correlation between birth rate and death rate | B1 ft | |
**Total: 6 marks**
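
The comparison in the table above can be checked numerically. The sketch below is illustrative only: it takes $r = -0.276$ and $n = 14$ from the mark scheme, and it assumes the two-tailed 5% point of Student's $t$ with $12$ degrees of freedom is $2.1788$ (a standard table value, not given in the source). It uses the standard relationship $r_{crit} = t_{crit}/\sqrt{t_{crit}^2 + n - 2}$, which should reproduce the tabulated critical value of about $0.5324$.

```python
from math import sqrt

n = 14
r = -0.276        # sample PMCC quoted in the mark scheme
t_crit = 2.1788   # assumption: two-tailed 5% point of t with n - 2 = 12 df (standard tables)

# Convert the t critical value into the equivalent critical value for |r|.
r_crit = t_crit / sqrt(t_crit ** 2 + (n - 2))

# Two-tailed test: reject H0 only if |r| exceeds the critical value.
significant = abs(r) > r_crit

print(f"critical value for |r|: {r_crit:.4f}")
print("reject H0" if significant else "do not reject H0")
```

In an exam the critical value is read directly from the tables; the conversion from $t$ is included only to show where the tabulated figure comes from.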
# Question 1 (part iii):
| Answer/Working | Marks | Guidance |
|---|---|---|
| The underlying population must have a bivariate Normal distribution | B1 | Not just 'bivariate' and 'Normal' separately |
| Since the scatter diagram has a roughly elliptical shape | E1 | |
**Total: 2 marks**
# Question 1 (part iv):
| Answer/Working | Marks | Guidance |
|---|---|---|
| Because this data point is a long way from the other data | E1 | Indication that the point is (possibly) an outlier |
| and it is below and to the right of the other data | E1 | Allow in terms of $x$ and $y$ |
| It does bring the validity of the test into question | E1 | Allow 'no' only with suitable explanation e.g. sample still too small |
| since this extra data point is so far from other points and so there is less evidence of ellipticity | E1 | |
**Total: 4 marks**
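
The effect of the extra observation can be checked by updating the summary statistics from the question with the point $(14.4, 7.8)$ and recomputing $r$. This is a minimal sketch; the function name `pmcc_from_sums` and the variable names are hypothetical and do not come from the source.

```python
from math import sqrt

def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Sample PMCC r = Sxy / sqrt(Sxx * Syy), computed from summary statistics."""
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    return s_xy / sqrt(s_xx * s_yy)

# Summary statistics for the original 14 countries (from the question).
n, sx, sy, sxx, syy, sxy = 14, 139.8, 140.4, 1411.66, 1417.88, 1398.56
r_before = pmcc_from_sums(n, sx, sy, sxx, syy, sxy)

# Add the extra country (x, y) = (14.4, 7.8) to every sum.
x_new, y_new = 14.4, 7.8
r_after = pmcc_from_sums(
    n + 1,
    sx + x_new,
    sy + y_new,
    sxx + x_new ** 2,
    syy + y_new ** 2,
    sxy + x_new * y_new,
)

# The original sample means show how far the new point lies from the rest.
print(f"means of original data: x = {sx / n:.2f}, y = {sy / n:.2f}")
print(f"r before = {r_before:.4f}, r after = {r_after:.4f}")
```

The new point has $x$ well above and $y$ below the original sample means, so it contributes a large negative term to $S_{xy}$, which is consistent with the explanation in the table above.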
---
1 The scatter diagram below shows the birth rates $x$, and death rates $y$, measured in standard units, in a random sample of 14 countries in a particular year. Summary statistics for the data are as follows.
$$\Sigma x = 139.8 \quad \Sigma y = 140.4 \quad \Sigma x ^ { 2 } = 1411.66 \quad \Sigma y ^ { 2 } = 1417.88 \quad \Sigma x y = 1398.56 \quad n = 14$$
[Scatter diagram of the birth rates $x$ and death rates $y$ for the 14 countries]\\
(i) Calculate the sample product moment correlation coefficient.\\
(ii) Carry out a hypothesis test at the $5 \%$ significance level to determine whether there is any correlation between birth rates and death rates.\\
(iii) State the distributional assumption which is necessary for this test to be valid. Explain briefly in the light of the scatter diagram why it appears that the assumption may be valid.\\
(iv) The values of $x$ and $y$ for another country in that year are 14.4 and 7.8 respectively. If these values are included, the value of the sample product moment correlation coefficient is $-0.5694$. Explain why this one observation causes such a large change to the value of the sample product moment correlation coefficient. Discuss whether this brings the validity of the test into question.
\hfill \mbox{\textit{OCR MEI S2 2011 Q1 [17]}}
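
For part (i), the calculation from the summary statistics can be sketched as follows, using $S_{xy} = \Sigma xy - \frac{\Sigma x \Sigma y}{n}$, $S_{xx} = \Sigma x^2 - \frac{(\Sigma x)^2}{n}$, $S_{yy} = \Sigma y^2 - \frac{(\Sigma y)^2}{n}$ and $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$. The function name `pmcc_from_sums` is hypothetical; the numbers are those given in the question.

```python
from math import sqrt

def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Sample PMCC r = Sxy / sqrt(Sxx * Syy), computed from summary statistics."""
    s_xy = sum_xy - sum_x * sum_y / n   # corrected sum of products
    s_xx = sum_x2 - sum_x ** 2 / n      # corrected sum of squares for x
    s_yy = sum_y2 - sum_y ** 2 / n      # corrected sum of squares for y
    return s_xy / sqrt(s_xx * s_yy)

r = pmcc_from_sums(
    n=14,
    sum_x=139.8,
    sum_y=140.4,
    sum_x2=1411.66,
    sum_y2=1417.88,
    sum_xy=1398.56,
)
print(f"r = {r:.3f}")
```

The result should agree with the value $-0.276$ that the mark scheme uses in part (ii).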