Question 3 - A-Level Maths

OCR MEI S4 2016 June: Question 3 (24 marks)

Exam Board	OCR MEI
Module	S4 (Statistics 4)
Year	2016
Session	June
Marks	24
Topic	Wilcoxon tests
Type	Two-sample t-test
Difficulty	Standard +0.3. This is a straightforward application of a two-sample t-test with standard bookwork components. Part (i) requires a routine calculation from given summary statistics, part (ii) tests recall of the assumptions and knowledge of the Wilcoxon rank-sum test, and part (iii) requires an explanation of the paired design. All of this is standard S4 material with no novel problem-solving required. Slightly easier than average due to computational simplicity and predictable structure.
Spec	5.05c Hypothesis test: normal distribution for population mean; 5.07a Non-parametric tests: when to use; 5.07d Paired vs two-sample: selection

3 A large department in a university wished to compare the standards of literacy and numeracy of its students. A random sample of 24 students was taken and sub-divided, randomly, into two groups of 12. The students in one group took a literacy assessment (scores denoted by $x$); the students in the other group took a numeracy assessment (scores denoted by $y$). The two assessments were designed to give the same distributions of scores when taken by random samples from the general population. The scores obtained by the students on the two assessments are shown in the table.

$x$	23	42	43	46	48	48	50	54	58	59	62	65
$y$	44	36	63	55	53	58	63	80	61	57	83	54

$$\sum x = 598 \quad \sum x ^ { 2 } = 31196 \quad \sum y = 707 \quad \sum y ^ { 2 } = 43543$$
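
As a quick check, the summary statistics above can be recomputed from the tabulated scores. The following Python sketch is not part of the original question; the variable names are illustrative.

```python
# Scores copied from the table in the question.
x = [23, 42, 43, 46, 48, 48, 50, 54, 58, 59, 62, 65]  # literacy
y = [44, 36, 63, 55, 53, 58, 63, 80, 61, 57, 83, 54]  # numeracy

# Sums and sums of squares for each group.
sum_x, sum_x2 = sum(x), sum(v * v for v in x)
sum_y, sum_y2 = sum(y), sum(v * v for v in y)

print(sum_x, sum_x2)  # 598 31196
print(sum_y, sum_y2)  # 707 43543
```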

(i) Carry out an appropriate $t$ test, at the $5\%$ level of significance, to compare the standards of literacy and numeracy.

(ii) State the distributional assumptions required for the $t$ test to be valid. Name the test that you would use if the assumptions required for the $t$ test are thought not to hold. State the hypotheses for this new test. Explain, in general terms, which of the two tests is more powerful, and why.

A statistician at the university looked at the data and commented that a paired sample design would have been better.

(iii) Explain how a paired sample design would be applied in this context, and how the data would be analysed. Explain also why it would be better than the design used.

## Part (i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $H_0: \mu_1=\mu_2$ | B1 | Zero if sample means used |
| $H_1: \mu_1\neq\mu_2$ where $\mu_1$ and $\mu_2$ are the means in the underlying population | B1 | B1 if not clearly population means |
| $\bar{x}=\frac{598}{12}=49.8333$, $\bar{y}=\frac{707}{12}=58.9167$ | B1 | |
| $\sum(x-\bar{x})^2=31196-\frac{598^2}{12}=1395.66667$; $[s_x^2=126.87..., s_x=11.264...]$ | M1 | Accept alternative forms if correctly used later |
| $\sum(y-\bar{y})^2=43543-\frac{707^2}{12}=1888.91667$; $[s_y^2=171.719..., s_y=13.104...]$ | A1 | |
| Pooled variance estimate $=\frac{(1395.666...+1888.916...)}{(11+11)}=149.299$ | M1A1 | $\frac{11s_x^2+11s_y^2}{22}$; correct construction, their $s$, $\bar{x}$, $\bar{y}$ |
| Test statistic: $\frac{58.9167-49.8333}{\sqrt{149.299}\sqrt{\frac{1}{12}+\frac{1}{12}}}=1.8209$ | M1A1 | |
| 5% two-tailed critical value for $t_{22}$ is 2.0739 | B1 | 2.0772 by interpolation from tables |
| Hence no reason to reject $H_0$, no reason to suppose that standards of literacy and numeracy are different in the underlying population, on average | M1A1 | no reason to reject $H_0$; context |
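
The arithmetic in the table above can be reproduced from the summary statistics alone. This is a minimal Python sketch, not part of the mark scheme: the variable names are illustrative, and the critical value 2.0739 is copied from the mark scheme rather than computed.

```python
import math

# Summary statistics given in the question.
n_x = n_y = 12
sum_x, sum_x2 = 598, 31196
sum_y, sum_y2 = 707, 43543

mean_x = sum_x / n_x
mean_y = sum_y / n_y

# Corrected sums of squares: sum of (value - mean)^2 for each group.
sxx = sum_x2 - sum_x**2 / n_x
syy = sum_y2 - sum_y**2 / n_y

# Pooled variance estimate on n_x + n_y - 2 = 22 degrees of freedom.
pooled_var = (sxx + syy) / (n_x + n_y - 2)

# Two-sample t statistic for H0: mu_1 = mu_2.
t_stat = (mean_y - mean_x) / math.sqrt(pooled_var * (1 / n_x + 1 / n_y))

T_CRIT = 2.0739  # 5% two-tailed value for t_22, taken from the mark scheme
reject_h0 = abs(t_stat) > T_CRIT

print(round(pooled_var, 3), round(t_stat, 4), reject_h0)
```

Because the statistic falls below the critical value, `reject_h0` is `False`, which matches the conclusion in the last row of the table.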

## Part (ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Scores in the underlying population distributed Normally | B1 | |
| With common variance | B1 | |
| Wilcoxon rank sum test (or Mann-Whitney 2 sample test) | B1 | |
| $H_0$: literacy scores and numeracy scores have the same distribution | B1 | Accept hypotheses stated as same median ($H_0$) and different medians ($H_1$) |
| $H_1$: literacy scores and numeracy scores have the same distribution but for a shift in location | B1 | |
| The $t$ test will be more powerful because | B1 | |
| it uses the magnitudes of the data rather than just their ranks | B1 | |
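
The question only asks for the name and hypotheses of the alternative test, but the contrast between magnitudes and ranks is easier to see with the rank sums in hand. The sketch below is for illustration only, assuming tied scores share the average of their ranks; the helper `midranks` is a hypothetical name, and no critical value is applied because the source does not give one.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


# Scores copied from the table in the question.
x = [23, 42, 43, 46, 48, 48, 50, 54, 58, 59, 62, 65]  # literacy
y = [44, 36, 63, 55, 53, 58, 63, 80, 61, 57, 83, 54]  # numeracy

ranks = midranks(x + y)          # rank all 24 scores together
w_x = sum(ranks[:len(x)])        # rank sum for the literacy group
w_y = sum(ranks[len(x):])        # rank sum for the numeracy group

print(w_x, w_y)  # 123.0 177.0
```

The two rank sums add up to $24 \times 25 / 2 = 300$, as they must. Only the ordering of the scores enters the calculation, so the sizes of the gaps between scores are discarded; this is the information the $t$ test retains.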

## Part (iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| In a paired sample design, all the students in the sample would do both assessments | B1 | This part is entirely descriptive; marks should be awarded accordingly |
| The order in which the students do the assessments should be randomised and/or blocked for balance | B1 | |
| The data used in the test would be the differences in their scores | B1 | |
| A single sample $t$ test (or Wilcoxon if Normality cannot be assumed) would be used | B1 | |
| This would be better than the two sample design used because the variation between students would be factored out | B1 | |
| The design would therefore be more sensitive to differences between literacy and numeracy | B1 | |
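
The analysis described above can be written out as a short worked formula. The notation here ($d_i$, $\bar{d}$, $s_d$, $\rho$, $\sigma_x$, $\sigma_y$) is introduced for illustration and does not appear in the mark scheme. If each of $n$ students takes both assessments, with scores $x_i$ and $y_i$, the test is a one-sample $t$ test on the differences:

$$d_i = x_i - y_i, \qquad H_0: \mu_d = 0, \qquad t = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t_{n-1} \text{ under } H_0$$

The gain in sensitivity comes from the variance of a difference. If $\rho$ denotes the correlation between a student's two scores, then

$$\operatorname{Var}(d_i) = \sigma_x^2 + \sigma_y^2 - 2\rho\,\sigma_x\sigma_y$$

whereas the difference between the scores of two different, independently chosen students has variance $\sigma_x^2 + \sigma_y^2$. When students who score well on one assessment tend to score well on the other ($\rho > 0$), the paired differences vary less, so a given difference in means yields a larger test statistic.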
