OCR MEI S2 2012 January — Question 1 17 marks

Exam Board: OCR MEI
Module: S2 (Statistics 2)
Year: 2012
Session: January
Marks: 17
Topic: Hypothesis test of Spearman's rank correlation coefficient
Type: Hypothesis test for association
Difficulty: Standard +0.3. This is a straightforward application of Spearman's rank correlation coefficient with clear data, a standard ranking procedure, and a table lookup at the 5% level. The scatter diagram and the comparison with the PMCC are routine bookwork. It is slightly above average only because ranking 9 data points is somewhat tedious; it requires no problem-solving insight.
Spec: 5.08e Spearman rank correlation; 5.08f Hypothesis test: Spearman rank

1 Nine long-distance runners are starting an exercise programme to improve their strength. During the first session, each of them has to do a 100 metre run and to do as many push-ups as possible in one minute. The times taken for the run, together with the number of push-ups each runner achieves, are shown in the table.
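
| Runner | A | B | C | D | E | F | G | H | I |
|---|---|---|---|---|---|---|---|---|---|
| 100 metre time (seconds) | 13.2 | 11.6 | 10.9 | 12.3 | 14.7 | 13.1 | 11.7 | 13.6 | 12.4 |
| Push-ups achieved | 32 | 42 | 22 | 36 | 41 | 27 | 37 | 38 | 33 |
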
  1. Draw a scatter diagram to illustrate the data.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a hypothesis test at the \(5 \%\) significance level to examine whether there is any association between time taken for the run and number of push-ups achieved.
  4. Under what circumstances is it appropriate to carry out a hypothesis test based on the product moment correlation coefficient? State, with a reason, which test is more appropriate for these data.

## (i)

**Answer:** Graph with axes suitably labelled and some indication of a linear scale provided. All nine points plotted correctly.

**Marks:** G1 | G2 (G1, G0)

**Guidance:**
- G1 for axes suitably labelled with some indication of linear scale provided.
- G2 for points plotted correctly. G1 if 8 points plotted correctly. G0 if two or more points are incorrectly plotted or omitted.
- Special Case SC1 for points visibly correct on axes where no indication of scale has been provided.
- Allow axes reversed.

---

## (ii)

**Answer:** 
$$\sum d^2 = 94$$

$$r_s = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6 \times 94}{9 \times 80} = 1 - 0.783 = 0.217$$
(to 3 s.f.) [allow 0.22 to 2 s.f.]

**Marks:** M1 | M1 | A1 | M1 | A1

**Guidance:** 
- M1 For ranking (allow all ranks reversed for either or both categories)
- NB No ranking or re-allocation of pairs scores 0/5
- M1 For $d^2$
- A1 For $\sum d^2$
- M1 for method for $r_s$, used
- A1 f.t. for $|r_s| < 1$. Allow 13/60, or $r_s = 1 - 1.217 = -0.217$ with reversed ranks.

Ranking the push-up counts from highest (1st) to lowest (9th), as in the table below, gives $\sum d^2 = 146$, which leads to $r_s = 1 - \frac{6 \times 146}{9 \times 80} = 1 - 1.217 = -0.217$. Allow both A marks.

| | A | B | C | D | E | F | G | H | I |
|---|---|---|---|---|---|---|---|---|---|
| 100m | 7 | 2 | 1 | 4 | 9 | 6 | 3 | 8 | 5 |
| Push up | 7 | 1 | 9 | 5 | 2 | 8 | 4 | 3 | 6 |
| d | 0 | 1 | -8 | -1 | 7 | -2 | -1 | 5 | -1 |
| $d^2$ | 0 | 1 | 64 | 1 | 49 | 4 | 1 | 25 | 1 |
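
The two rankings discussed above can be checked with a short script. This is a minimal sketch rather than part of the mark scheme: the helper names `rank` and `spearman` are illustrative, and it assumes there are no tied values, which holds for these data.

```python
# Data transcribed from the table in the question.
times = {"A": 13.2, "B": 11.6, "C": 10.9, "D": 12.3, "E": 14.7,
         "F": 13.1, "G": 11.7, "H": 13.6, "I": 12.4}
push_ups = {"A": 32, "B": 42, "C": 22, "D": 36, "E": 41,
            "F": 27, "G": 37, "H": 38, "I": 33}


def rank(values, reverse=False):
    """Rank untied values 1..n; reverse=True gives rank 1 to the largest."""
    ordered = sorted(values, key=values.get, reverse=reverse)
    return {name: position for position, name in enumerate(ordered, start=1)}


def spearman(rank_x, rank_y):
    """Return (sum of d^2, r_s) with r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((rank_x[k] - rank_y[k]) ** 2 for k in rank_x)
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


time_ranks = rank(times)  # fastest time gets rank 1

# Both variables ranked from lowest to highest.
same_direction = spearman(time_ranks, rank(push_ups))

# Push-ups ranked from highest (1st) to lowest (9th), as in the table above.
opposite_direction = spearman(time_ranks, rank(push_ups, reverse=True))

print(same_direction)
print(opposite_direction)
```

The first ranking gives $\sum d^2 = 94$ and the second $\sum d^2 = 146$, so the two values of $r_s$ differ only in sign.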

---

## (iii)

**Answer:**
$H_0$: no association between 100 m time and number of push-ups achieved in the population of long-distance runners

$H_1$: some association between 100 m time and number of push-ups achieved in the population of long-distance runners

The two-tailed critical value at the 5% level (for $n = 9$) is 0.7000.

Since $0.217 < 0.7000$, there is insufficient evidence to reject $H_0$; that is, there is insufficient evidence of an association between 100 m time and number of push-ups achieved.

**Marks:** B1 | B1 | (SC1) | B1 | B1 | M1 | A1

**Guidance:**
- B1 for $H_0$ in context (not x & y)
- B1 for $H_1$ in context (not x & y)
- SC1 for both correct but no context provided
- B1 for population, seen or implied (SOI). NB $H_0$ and $H_1$ should not be given in terms of $\rho$. Do not condone the use of the word 'correlation' in place of 'association'. 'Population' should be mentioned to award B1, unless clear, unambiguous alternative wording is used.
- B1 for ±0.7000 (or ±0.6000 only if $H_1$ indicates a 1-tailed test is intended)
- M1 for sensible comparison with c.v. leading to a conclusion seen, provided $|r_s| < 1$
- NOTE The comparison can be in the form of a diagram as long as it is clear and unambiguous.

Sensible comparison: e.g. $-0.217 > -0.7000$ is 'sensible' whereas $-0.217 < 0.7000$ is 'not sensible'. Allow $-0.7000 < 0.217 < 0.7000$.

A reversed inequality sign, e.g. $0.217 > 0.7000$, gets max M1 A0.

Also, if the c.v. comes from the p.m.c.c. table (0.6664 for a 2-tailed test and 0.5822 for a 1-tailed test), award max M1 A0.
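
The comparison step can be sketched in a few lines. The function name `two_tailed_decision` is hypothetical, and the critical value is simply the 0.7000 quoted above, hard-coded rather than looked up from a table.

```python
def two_tailed_decision(r_s, critical_value):
    """Reject H0 only if |r_s| exceeds the tabulated critical value."""
    if not -1 <= r_s <= 1:
        raise ValueError("r_s must lie between -1 and 1")
    return "reject H0" if abs(r_s) > critical_value else "do not reject H0"


# Two-tailed 5% critical value for n = 9, as quoted in the mark scheme.
CRITICAL_VALUE = 0.7000

decision = two_tailed_decision(13 / 60, CRITICAL_VALUE)
decision_reversed_ranks = two_tailed_decision(-13 / 60, CRITICAL_VALUE)

print(decision)
print(decision_reversed_ranks)
```

Comparing $|r_s|$ with the critical value means that reversed ranks, which only change the sign of $r_s$, lead to the same conclusion.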

---

## (iv)

**Answer:** It is appropriate to carry out a hypothesis test based on the product moment correlation coefficient when the underlying population has a **bivariate Normal distribution**.

The scatter diagram does not appear to be roughly elliptical, so the test based on Spearman's coefficient is more appropriate.

**Marks:** E1 | E1 | E1dep

**Guidance:**
- E1 Do not accept 'both Normally distributed'
- E1 Allow reasonable alternatives e.g. in this case, one variable is discrete so pmcc invalid.
- E1dep: this E1 is dependent on the previous E1
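
For comparison only, the product moment correlation coefficient for the same data could be computed as in the sketch below. This is an illustration, not something the mark scheme asks for: the helper name `pmcc` is hypothetical, and a test based on this value would only be valid under the bivariate Normal assumption stated above.

```python
from math import sqrt

# Data transcribed from the table in the question, in runner order A to I.
times = [13.2, 11.6, 10.9, 12.3, 14.7, 13.1, 11.7, 13.6, 12.4]
push_ups = [32, 42, 22, 36, 41, 27, 37, 38, 33]


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


r = pmcc(times, push_ups)
print(round(r, 3))
```

If the PMCC test were appropriate, $|r|$ would be compared with the two-tailed value 0.6664 from the p.m.c.c. table rather than with 0.7000.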

---
1 Nine long-distance runners are starting an exercise programme to improve their strength. During the first session, each of them has to do a 100 metre run and to do as many push-ups as possible in one minute. The times taken for the run, together with the number of push-ups each runner achieves, are shown in the table.

\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | c | c | }
\hline
Runner & A & B & C & D & E & F & G & H & I \\
\hline
100 metre time (seconds) & 13.2 & 11.6 & 10.9 & 12.3 & 14.7 & 13.1 & 11.7 & 13.6 & 12.4 \\
\hline
Push-ups achieved & 32 & 42 & 22 & 36 & 41 & 27 & 37 & 38 & 33 \\
\hline
\end{tabular}
\end{center}

(i) Draw a scatter diagram to illustrate the data.\\
(ii) Calculate the value of Spearman's rank correlation coefficient.\\
(iii) Carry out a hypothesis test at the $5 \%$ significance level to examine whether there is any association between time taken for the run and number of push-ups achieved.\\
(iv) Under what circumstances is it appropriate to carry out a hypothesis test based on the product moment correlation coefficient? State, with a reason, which test is more appropriate for these data.

\hfill \mbox{\textit{OCR MEI S2 2012 Q1 [17]}}