Edexcel S3 2022 January — Question 2 8 marks

Exam Board: Edexcel
Module: S3 (Statistics 3)
Year: 2022
Session: January
Marks: 8
Topic: Linear combinations of normal random variables
Type: Two-sample t-test (unknown variances)
Difficulty: Standard +0.3. This is a straightforward two-sample hypothesis test with large samples (n = 240 each) and given summary statistics. Students apply a standard procedure: state the hypotheses, calculate the test statistic from the difference of the sample means, compare it with the critical value, and conclude. Part (b) requires a brief explanation of how the CLT justifies treating the sample means as normally distributed. The large sample sizes keep the calculation simple and make the CLT application obvious. The question is slightly above average difficulty because it is a full hypothesis test rather than pure recall, but it is well within standard S3 material and requires no novel insight.
Spec: 5.04a Linear combinations: E(aX+bY), Var(aX+bY); 5.05a Sample mean distribution: central limit theorem; 5.05c Hypothesis test: normal distribution for population mean

  1. Secondary schools in a region conduct ability testing at the start of Year 7 and the start of Year 8. Each year a regional education officer randomly selects 240 Year 7 students and 240 Year 8 students from across the region. The results for last year are summarised in the table below.

| | Mean score | Variance of scores |
|---|---|---|
| Year 7 | 101 | 38 |
| Year 8 | 103 | 42 |

The regional education officer claims that there is no difference between the mean scores of these two year groups.

(a) Test the regional education officer's claim at the \(1\%\) significance level. You should state your hypotheses, test statistic and critical value clearly.

(b) Explain the significance of the Central Limit Theorem in part (a).

# Question 2:

## Part (a)

| Answer/Working | Mark | Guidance |
|---|---|---|
| $H_0: \mu_{\text{year7}} = \mu_{\text{year8}}$, $H_1: \mu_{\text{year7}} \neq \mu_{\text{year8}}$ | B1 | Both hypotheses correct; allow equivalent rearrangements; must be in terms of $\mu$; if using e.g. $\mu_A = \mu_B$, A and B must be clearly identified with year groups |
| $SE = \sqrt{\dfrac{38}{240} + \dfrac{42}{240}}$ | M1 | for use of SE with 38 and 42 (may be implied by $SE =$ awrt 0.577) |
| $z = \dfrac{103 - 101}{SE}$ | M1 | for a correct standardisation expression using 103, 101 (either order) and $SE =$ awrt 0.577; or ft their stated SE; or if not stated only allow $\sqrt{\dfrac{38^2}{240}+\dfrac{42^2}{240}}$ or $\sqrt{\dfrac{38}{240}}+\sqrt{\dfrac{42}{240}}$ |
| $= (\pm)3.464\ldots\ \left(2\sqrt{3}\right)$ awrt $(\pm)3.46$ | A1 | awrt 3.46 or awrt $-3.46$; allow $p$ value of awrt 0.000266 |
| $Z_{\text{critical}} = 2.5758$ | B1 | $|CV| = 2.5758$ or better (seen) |
| In CR/Significant/Reject $H_0$ | M1 | a correct statement linking their test statistic and their CV; need not be contextual but do not allow contradicting non-contextual comments |
| There is sufficient evidence to suggest that the regional education officer's claim is not correct / There is a difference between the mean scores of the two year groups. | A1 | a correct contextual statement (dependent on 2nd M1) consistent with their test statistic and CV, which must reject $H_0$; must mention the officer or mean scores; do not allow a ft conclusion here |
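
The arithmetic in the scheme above can be checked numerically. The following is a minimal sketch, not part of the mark scheme: the function name `two_sample_z` and its argument names are hypothetical, and it assumes Python 3.8+ for `statistics.NormalDist`. It computes the standard error, the test statistic, the two-tailed critical value and the one-tail probability from the summary statistics in the question.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, var1, n1, mean2, var2, n2, alpha):
    """Large-sample two-tailed z-test for a difference of means.

    Hypothetical helper for illustration; the sample variances are used
    in place of the unknown population variances.
    """
    std_normal = NormalDist()  # N(0, 1)
    se = sqrt(var1 / n1 + var2 / n2)             # standard error of the difference
    z = (mean2 - mean1) / se                     # test statistic
    crit = std_normal.inv_cdf(1 - alpha / 2)     # two-tailed critical value
    tail = 1 - std_normal.cdf(abs(z))            # one-tail probability beyond |z|
    return se, z, crit, tail, abs(z) > crit


# Year 7: mean 101, variance 38; Year 8: mean 103, variance 42; n = 240 each
se, z, crit, tail, reject = two_sample_z(101, 38, 240, 103, 42, 240, alpha=0.01)
print(f"SE = {se:.4f}, z = {z:.4f}, CV = {crit:.4f}, tail = {tail:.6f}, reject H0: {reject}")
```

The one-tail probability is the quantity the scheme accepts as a $p$ value (awrt 0.000266); it would be compared with $0.005$, half of the $1\%$ significance level, because the test is two-tailed.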

## Part (b)

| Answer/Working | Mark | Guidance |
|---|---|---|
| The CLT allows us to treat the sample means (oe) as normally distributed | B1 | a correct explanation which must mention sample means oe (population means are normally distributed is B0); ignore extraneous non-contradictory comments |
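
The role of the CLT can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of the actual test scores: the exponential population (mean 1, standard deviation 1) is chosen only because it is clearly non-normal, and the function name `simulate_sample_means`, the seed and the number of replicates are hypothetical. It draws many samples of size 240 and checks that the sample means behave like a normal distribution with standard deviation $\sigma/\sqrt{n}$.

```python
import random
from math import sqrt
from statistics import pstdev


def simulate_sample_means(n=240, replicates=2000, seed=0):
    """Return `replicates` sample means, each from n exponential(1) draws.

    The exponential population is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(replicates)]


n = 240
means = simulate_sample_means(n=n)

centre = sum(means) / len(means)        # should be close to the population mean, 1
spread = pstdev(means)                  # should be close to sigma / sqrt(n)
theoretical_se = 1.0 / sqrt(n)          # sigma = 1 for exponential(1)

# Proportion of sample means within 1.96 standard errors of the population mean;
# for an approximately normal distribution this should be near 0.95.
within = sum(abs(m - 1.0) < 1.96 * theoretical_se for m in means) / len(means)

print(f"mean of sample means = {centre:.4f}")
print(f"sd of sample means   = {spread:.4f} (theory {theoretical_se:.4f})")
print(f"within 1.96 SE       = {within:.3f}")
```

Although the individual draws are strongly skewed, the sample means cluster symmetrically around the population mean, which is why part (a) can use a normal test statistic without assuming that the scores themselves are normally distributed.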
\begin{enumerate}
  \item Secondary schools in a region conduct ability testing at the start of Year 7 and the start of Year 8. Each year a regional education officer randomly selects 240 Year 7 students and 240 Year 8 students from across the region. The results for last year are summarised in the table below.
\end{enumerate}

\begin{center}
\begin{tabular}{ | c | c | c | }
\cline { 2 - 3 }
\multicolumn{1}{c|}{} & Mean score & Variance of scores \\
\hline
Year 7 & 101 & 38 \\
\hline
Year 8 & 103 & 42 \\
\hline
\end{tabular}
\end{center}

The regional education officer claims that there is no difference between the mean scores of these two year groups.\\
(a) Test the regional education officer's claim at the $1 \%$ significance level. You should state your hypotheses, test statistic and critical value clearly.\\
(b) Explain the significance of the Central Limit Theorem in part (a).

\hfill \mbox{\textit{Edexcel S3 2022 Q2 [8]}}