OCR MEI S2 2011 January — Question 1 17 marks

Exam Board: OCR MEI
Module: S2 (Statistics 2)
Year: 2011
Session: January
Marks: 17
Topic: Hypothesis test of Pearson’s product-moment correlation coefficient
Type: Calculate PMCC from summary statistics
Difficulty: Standard +0.3. This is a straightforward application of the PMCC formula using given summary statistics, followed by a standard hypothesis test procedure. The calculations are routine, the test follows a textbook template, and while part (iv) requires some interpretation, it is guided by the context. Slightly easier than average because minimal problem-solving is required.
Spec: 5.08a Pearson correlation: calculate pmcc; 5.08d Hypothesis test: Pearson correlation

1 The scatter diagram below shows the birth rates \(x\), and death rates \(y\), measured in standard units, in a random sample of 14 countries in a particular year. Summary statistics for the data are as follows.

$$\Sigma x = 139.8 \quad \Sigma y = 140.4 \quad \Sigma x^{2} = 1411.66 \quad \Sigma y^{2} = 1417.88 \quad \Sigma xy = 1398.56 \quad n = 14$$

\includegraphics[max width=\textwidth, alt={}, center]{cd1a8f39-dd3c-44c9-90b0-6a919361d593-2_643_1047_488_550}

  1. Calculate the sample product moment correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to determine whether there is any correlation between birth rates and death rates.
  3. State the distributional assumption which is necessary for this test to be valid. Explain briefly in the light of the scatter diagram why it appears that the assumption may be valid.
  4. The values of \(x\) and \(y\) for another country in that year are 14.4 and 7.8 respectively. If these values are included, the value of the sample product moment correlation coefficient is \(-0.5694\). Explain why this one observation causes such a large change to the value of the sample product moment correlation coefficient. Discuss whether this brings the validity of the test into question.
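
The calculation asked for in the first part can be checked numerically. The following is a minimal Python sketch, not part of the original paper or mark scheme; the function name `pmcc_from_sums` and its parameter names are illustrative assumptions. It uses the standard identities \(S_{xy} = \Sigma xy - \Sigma x \Sigma y / n\), \(S_{xx} = \Sigma x^2 - (\Sigma x)^2 / n\), \(S_{yy} = \Sigma y^2 - (\Sigma y)^2 / n\) and \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\).

```python
from math import sqrt


def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Sample product moment correlation coefficient from summary statistics.

    Hypothetical helper for illustration; the name is not from the paper.
    """
    s_xy = sum_xy - sum_x * sum_y / n   # corrected sum of products
    s_xx = sum_x2 - sum_x ** 2 / n      # corrected sum of squares for x
    s_yy = sum_y2 - sum_y ** 2 / n      # corrected sum of squares for y
    return s_xy / sqrt(s_xx * s_yy)


# Summary statistics quoted in the question
r = pmcc_from_sums(14, 139.8, 140.4, 1411.66, 1417.88, 1398.56)
print(round(r, 3))  # -0.276
```

The result agrees with the value \(-0.276\) that the mark scheme compares against the critical value.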

# Question 1 (part ii):

| Answer/Working | Marks | Guidance |
|---|---|---|
| $H_0: \rho = 0$; $H_1: \rho \neq 0$ (two-tailed test) | B1 | Condone hypotheses written in words/context. Allow $H_0$: no correlation between $x$ & $y$. If only words used and 'association' mentioned, do not award first B1 and last B1 |
| where $\rho$ is the population correlation coefficient | B1 | |
| For $n = 14$, 5% critical value $= -0.5324$ | B1 | Allow $+$ or $-$. One-tailed test $cv = (-)0.4575$ |
| Since $-0.276 > -0.5324$ the result is not significant. Thus we do not have sufficient evidence to reject $H_0$ | M1, A1 | Comparison between candidate's $r$ from part (i) and appropriate $cv$ (signs must match). If result not stated but conclusion correct award SC1 to replace final A1 B1 |
| There is not sufficient evidence at the 5% level to suggest correlation between birth rate and death rate | B1 ft | |

**Total: 6 marks**
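
The decision rule in the table can be sketched in Python as follows. This is an illustrative check rather than part of the mark scheme: the helper name `is_significant` is an assumption, the critical value 0.5324 is taken from the table above (it is not computed here), and $r$ is recomputed from the summary statistics in the question so that the block is self-contained.

```python
from math import sqrt

# Recompute r from the summary statistics given in the question
n = 14
s_xy = 1398.56 - 139.8 * 140.4 / n
s_xx = 1411.66 - 139.8 ** 2 / n
s_yy = 1417.88 - 140.4 ** 2 / n
r = s_xy / sqrt(s_xx * s_yy)

# Two-tailed 5% critical value for n = 14, as quoted in the mark scheme
CRITICAL_VALUE = 0.5324


def is_significant(r, critical_value):
    """Two-tailed test: reject H0 (rho = 0) when |r| exceeds the critical value."""
    return abs(r) > critical_value


significant = is_significant(r, CRITICAL_VALUE)
print(round(r, 3), significant)  # -0.276 False
```

Comparing $|r|$ with the positive critical value is equivalent to the signed comparison $-0.276 > -0.5324$ used in the table, so $H_0$ is not rejected.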

# Question 1 (part iii):

| Answer/Working | Marks | Guidance |
|---|---|---|
| The underlying population must have a bivariate Normal distribution | B1 | Not just 'bivariate' and 'Normal' separately |
| Since the scatter diagram has a roughly elliptical shape | E1 | |

**Total: 2 marks**

# Question 1 (part iv):

| Answer/Working | Marks | Guidance |
|---|---|---|
| Because this data point is a long way from the other data | E1 | Indication that the point is (possibly) an outlier |
| and it is below and to the right of the other data | E1 | Allow in terms of $x$ and $y$ |
| It does bring the validity of the test into question | E1 | Allow 'no' only with suitable explanation e.g. sample still too small |
| since this extra data point is so far from other points and so there is less evidence of ellipticity | E1 | |

**Total: 4 marks**
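
The size of the change can be reproduced from the summary statistics alone. The sketch below is an illustration under stated assumptions, not part of the mark scheme: the helper `pmcc_from_sums` and the variable names are hypothetical. It adds the extra country $(14.4, 7.8)$ to each sum and recomputes the coefficient.

```python
from math import sqrt


def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Sample PMCC from summary statistics (hypothetical helper)."""
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    return s_xy / sqrt(s_xx * s_yy)


# Original sample of 14 countries (values from the question)
n, sum_x, sum_y = 14, 139.8, 140.4
sum_x2, sum_y2, sum_xy = 1411.66, 1417.88, 1398.56

# Extra country given in the final part of the question
x_new, y_new = 14.4, 7.8

r_old = pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
r_new = pmcc_from_sums(
    n + 1,
    sum_x + x_new,
    sum_y + y_new,
    sum_x2 + x_new ** 2,
    sum_y2 + y_new ** 2,
    sum_xy + x_new * y_new,
)

print(round(r_old, 3))  # -0.276
print(round(r_new, 4))  # -0.5694
```

The original means are roughly $\bar{x} \approx 9.99$ and $\bar{y} \approx 10.03$, so the new point lies well to the right of and below the existing data. That is why a single observation pulls the coefficient so strongly in the negative direction.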

---
\hfill \mbox{\textit{OCR MEI S2 2011 Q1 [17]}}