OCR MEI S2 2007 June — Question 4 18 marks

Exam Board: OCR MEI
Module: S2 (Statistics 2)
Year: 2007
Session: June
Marks: 18
Topic: Chi-squared test of independence
Type: Interpret association after test
Difficulty: Standard +0.3. This is a standard chi-squared test of independence with straightforward calculations: computing expected frequencies from row and column totals, calculating each cell's contribution, and comparing the test statistic with the critical value. Part (ii) requires basic interpretation of the data, and part (iii) is a routine approximation to the binomial (Poisson, with the normal accepted as an alternative). All techniques are textbook applications with no novel insight required, making the question slightly easier than average.
Spec: 2.04e Normal distribution: as model N(mu, sigma^2); 2.04f Find normal probabilities: Z transformation; 2.05a Hypothesis testing language: null, alternative, p-value, significance; 2.05c Significance levels: one-tail and two-tail; 5.06a Chi-squared: contingency tables

4 The sexes and ages of a random sample of 300 runners taking part in marathons are classified as follows.

| Observed: age group | Male | Female | Row totals |
|---|---|---|---|
| Under 40 | 70 | 54 | 124 |
| 40–49 | 76 | 36 | 112 |
| 50 and over | 52 | 12 | 64 |
| Column totals | 198 | 102 | 300 |

  1. Carry out a test at the \(5 \%\) significance level to examine whether there is any association between age group and sex. State carefully your null and alternative hypotheses. Your working should include a table showing the contributions of each cell to the test statistic.
  2. Does your analysis support the suggestion that women are less likely than men to enter marathons as they get older? Justify your answer.

  For marathons in general, on average \(3 \%\) of runners are 'Female, 50 and over'. The random variable \(X\) represents the number of 'Female, 50 and over' runners in a random sample of size 300.

  3. Use a suitable approximating distribution to find \(\mathrm { P } ( X \geqslant 12 )\).

# Question 4:

## Part (i)

| Answer/Working | Marks | Guidance |
|---|---|---|
| $H_0$: no association between age group and sex; $H_1$: some association between age group and sex | B1 | In context |
| Expected values: Under 40: 81.84, 42.16; 40–49: 73.92, 38.08; 50+: 42.24, 21.76 | M1 A1 | To 2dp |
| Valid attempt at $\frac{(O-E)^2}{E}$ | M1 | |
| Contributions: 1.713, 3.325, 0.059, 0.114, 2.255, 4.378 summed | M1dep | For summation |
| $X^2 = 11.84$ | A1 | CAO |
| 2 degrees of freedom; critical value at 5% $= 5.991$ | B1 B1 | B1 for 2 d.o.f.; B1 CAO for cv |
| Result is significant; some association between age group and sex | B1 E1 | B1 dep on their cv & $X^2$; E1 conclusion in context |
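
The calculation in Part (i) can be checked with a short sketch, given here as an illustration rather than as part of the mark scheme. It uses only the observed counts from the question. The variable names are illustrative, and the critical value and p-value rely on the standard closed form for a chi-squared distribution with 2 degrees of freedom, $P(\chi^2_2 > x) = e^{-x/2}$, in place of tables.

```python
import math

# Observed counts from the question: rows are age groups, columns are (male, female).
observed = [[70, 54], [76, 36], [52, 12]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency under H0 (no association): row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Contribution of each cell to the test statistic: (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
x_squared = sum(sum(row) for row in contributions)

# Degrees of freedom for an r x c table: (r - 1)(c - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# For 2 degrees of freedom only, the upper tail is exp(-x / 2),
# so the 5% critical value is -2 ln(0.05).
critical_value = -2 * math.log(0.05)
p_value = math.exp(-x_squared / 2)
reject_h0 = x_squared > critical_value

print(f"X^2 = {x_squared:.2f}, dof = {dof}, cv = {critical_value:.3f}, reject H0: {reject_h0}")
```

The closed form applies only because the table has 2 degrees of freedom; for other table sizes the critical value would come from tables or a statistics library.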

## Part (ii)

| Answer/Working | Marks | Guidance |
|---|---|---|
| More females in under 40 group and fewer in 50+ than expected if no association | E1 | |
| Reverse is true for males; data support the suggestion | E1 E1dep | E1dep on at least one previous E1 |
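
The direction of the association described in Part (ii) can be read from the signs of $O - E$ for the female column. The sketch below is illustrative only: it recomputes those residuals from the observed table and also reports the proportion of women in each age group, which is an extra summary not required by the mark scheme.

```python
# Observed counts from the question, keyed by age group: (male, female).
observed = {
    "Under 40": (70, 54),
    "40-49": (76, 36),
    "50 and over": (52, 12),
}

total_male = sum(male for male, _ in observed.values())
total_female = sum(female for _, female in observed.values())
grand_total = total_male + total_female

female_residuals = {}
female_share = {}
for group, (male, female) in observed.items():
    row_total = male + female
    expected_female = row_total * total_female / grand_total
    # Positive residual: more women observed than expected under no association.
    female_residuals[group] = female - expected_female
    # Proportion of runners in this age group who are women.
    female_share[group] = female / row_total

for group in observed:
    print(f"{group}: O - E = {female_residuals[group]:+.2f}, "
          f"female share = {female_share[group]:.3f}")
```

Because each row has only two cells, the male residuals are the same values with the signs reversed.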

## Part (iii)

| Answer/Working | Marks | Guidance |
|---|---|---|
| Binomial$(300, 0.03)$; $n=300$, $p=0.03$ | B1 | CAO |
| Use Poisson approximation with $\lambda = np = 9$ | B1 | For Poisson; B1dep for Poisson(9) |
| $P(X \geq 12) = 1 - P(X \leq 11) = 1 - 0.8030 = 0.197$ | M1 A1 | M1 for tables to find $1 - P(X \leq 11)$ |
| OR: Normal approx. $N(9, 8.73)$; $P(X > 11.5) = P\!\left(Z > \frac{11.5-9}{\sqrt{8.73}}\right) = P(Z > 0.846) = 1 - 0.8012 = 0.199$ | B1 M1 A1 | B1 for Normal; B1dep for parameters; M1 for correct tail (cc not required for M1) |
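
Both approximations in Part (iii) can be reproduced without tables. The sketch below is a check under stated assumptions, not the expected exam method: the helper names `poisson_cdf` and `standard_normal_cdf` are illustrative, and the exact binomial tail is included only for comparison.

```python
import math

n, p = 300, 0.03
lam = n * p  # Poisson mean, lambda = np = 9


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


def standard_normal_cdf(z):
    """Phi(z), written in terms of the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Poisson approximation: P(X >= 12) = 1 - P(X <= 11).
poisson_tail = 1 - poisson_cdf(11, lam)

# Normal approximation N(np, np(1 - p)) with continuity correction: P(X > 11.5).
mean = n * p
variance = n * p * (1 - p)
z = (11.5 - mean) / math.sqrt(variance)
normal_tail = 1 - standard_normal_cdf(z)

# Exact binomial tail, for comparison only.
binomial_tail = 1 - sum(
    math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(12)
)

print(f"Poisson: {poisson_tail:.3f}, normal: {normal_tail:.3f}, exact: {binomial_tail:.3f}")
```

The Poisson and normal results should agree with the mark scheme values of 0.197 and 0.199 to three decimal places.
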
4 The sexes and ages of a random sample of 300 runners taking part in marathons are classified as follows.

\begin{center}
\begin{tabular}{ | c | l | c | c | c | }
\hline
\multicolumn{2}{|c|}{Observed} & \multicolumn{2}{c|}{Sex} & \multirow{2}{*}{Row totals} \\
\cline { 3 - 4 }
 & & Male & Female & \\
\hline
\multirow{3}{*}{\begin{tabular}{ c }
Age \\
group \\
\end{tabular}} & Under 40 & 70 & 54 & 124 \\
\cline { 2 - 5 }
 & $40 - 49$ & 76 & 36 & 112 \\
\cline { 2 - 5 }
 & 50 and over & 52 & 12 & 64 \\
\hline
\multicolumn{2}{|c|}{Column totals} & 198 & 102 & 300 \\
\hline
\end{tabular}
\end{center}

(i) Carry out a test at the $5 \%$ significance level to examine whether there is any association between age group and sex. State carefully your null and alternative hypotheses. Your working should include a table showing the contributions of each cell to the test statistic.\\
(ii) Does your analysis support the suggestion that women are less likely than men to enter marathons as they get older? Justify your answer.

For marathons in general, on average $3 \%$ of runners are 'Female, 50 and over'. The random variable $X$ represents the number of 'Female, 50 and over' runners in a random sample of size 300.\\
(iii) Use a suitable approximating distribution to find $\mathrm { P } ( X \geqslant 12 )$.

\hfill \mbox{\textit{OCR MEI S2 2007 Q4 [18]}}