Edexcel S3 2017 June — Question 5 11 marks

Exam Board: Edexcel
Module: S3 (Statistics 3)
Year: 2017
Session: June
Marks: 11
Topic: T-tests (unknown variance)
Type: Two-sample z-test, large samples
Difficulty: Standard +0.3. This is a straightforward two-sample z-test for large samples, with all summary statistics provided. Parts (a) and (b) test basic sampling knowledge (proportional allocation of a stratified sample), while part (c) requires the standard hypothesis test procedure with given formulae. The setup is clear, the calculations are routine, and no conceptual subtleties arise, so the question is slightly easier than the A-level average.
Spec: 2.01c Sampling techniques: simple random, opportunity, etc.; 2.05a Hypothesis testing language: null, alternative, p-value, significance; 5.05c Hypothesis test: normal distribution for population mean

5. A dance studio has 800 dancers of which

452 are beginners
251 are intermediates
97 are professionals

(a) Explain in detail how a stratified sample of size 50 could be taken.
(b) State an advantage of stratified sampling rather than simple random sampling in this situation.

Independent random samples of 80 beginners and 60 intermediates are chosen. Each of these dancers is given an assessment score, \(x\), based on the quality of their dancing. The results are summarised in the table below.

| | \(\bar{x}\) | \(s^{2}\) | \(n\) |
|---|---|---|---|
| Beginners | 31.7 | 57.3 | 80 |
| Intermediates | 36.9 | 38.1 | 60 |

The studio manager believes that the mean score of intermediates is more than 3 points greater than the mean score of beginners.

(c) Stating your hypotheses clearly and using a \(5 \%\) level of significance, test whether or not these data support the studio manager's belief.

# Question 5:

## Part (a):

| Answer/Working | Mark | Guidance |
|---|---|---|
| Label beginners $1-452$, intermediates $1-251$, professionals $1-97$ | M1 | For a suitable numbered/labelled list for each ability level |
| Use random numbers to select... | M1 | For use of random numbers/sample to select from each stratum |
| Simple random sample of **28 beginners**, **16 intermediates** and **6 professionals** | A1 | Dependent on either 1st or 2nd M1 mark |
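
The allocation of 28, 16 and 6 comes from scaling each stratum by 50/800 and rounding. The following is a minimal Python sketch of that procedure, not part of the mark scheme: the function name `stratified_sample`, the dictionary keys and the `seed` parameter are illustrative assumptions, and simple rounding is assumed to be an acceptable allocation rule (here it happens to sum to exactly 50, which is not guaranteed in general).

```python
import random


def stratified_sample(strata_sizes, sample_size, seed=None):
    """Draw a proportionally allocated stratified sample.

    strata_sizes maps a stratum name to its population size. Members of each
    stratum are assumed to be labelled 1..size, as in the mark scheme.
    """
    rng = random.Random(seed)  # seed is only here to make the sketch repeatable
    total = sum(strata_sizes.values())
    sample = {}
    for name, size in strata_sizes.items():
        # Proportional allocation: sample_size * size / total, rounded.
        k = round(sample_size * size / total)
        # Simple random sample (without replacement) of k labels from 1..size.
        sample[name] = sorted(rng.sample(range(1, size + 1), k))
    return sample


strata = {"beginners": 452, "intermediates": 251, "professionals": 97}
chosen = stratified_sample(strata, 50, seed=1)
print({name: len(labels) for name, labels in chosen.items()})
# {'beginners': 28, 'intermediates': 16, 'professionals': 6}
```

The exact shares are 28.25, 15.6875 and 6.0625, which is why rounding gives the sample sizes required for the A1 mark.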

## Part (b):

| Answer/Working | Mark | Guidance |
|---|---|---|
| Any one of: Enables estimation of statistics/sampling errors for each stratum; Reduces variability; More representative of population/reflects population structure | B1 | |

## Part (c):

| Answer/Working | Mark | Guidance |
|---|---|---|
| $H_0: \mu_I - \mu_B = 3$, $H_1: \mu_I - \mu_B > 3$ | B1; B1 | If $\mu_1, \mu_2$ used must be clear which refers to intermediates/beginners |
| $\text{s.e.} = \sqrt{\dfrac{38.1}{60} + \dfrac{57.3}{80}} = 1.162432794...$ | M1 | May be implied by s.e. = awrt 1.16. Condone minor slips e.g. $\sqrt{\dfrac{38.1}{80}+\dfrac{57.3}{60}}$ |
| $z = \dfrac{36.9 - 31.7 - 3}{1.1624...} = 1.89258...$; awrt 1.89 | dM1; A1 | Dependent on 1st M1 |
| One tailed c.v. $Z = 1.6449$ or CR: $Z \geq 1.6449$ or p-value = awrt 0.029 $< 0.05$ | B1 | $1.64 \leq |C.V.| \leq 1.65$ compatible with test statistic, or correct probability comparison |
| Mean score of intermediates is more than 3 greater than mean score of beginners / manager's belief is correct | A1 | Dep. on all M1 and B1 marks; contextualised rejection of $H_0$ |
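
The arithmetic in part (c) can be checked numerically. The sketch below uses only the Python standard library and assumes, as the mark scheme does, that the samples are large enough for the sample variances to stand in for the population variances. The function name `two_sample_z` and its parameter names are illustrative, not taken from the source.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_1, var_1, n_1, mean_2, var_2, n_2, delta_0):
    """One-tailed large-sample z-test of H1: mu_1 - mu_2 > delta_0."""
    se = sqrt(var_1 / n_1 + var_2 / n_2)      # standard error of the difference
    z = (mean_1 - mean_2 - delta_0) / se      # test statistic under H0
    p_value = 1 - NormalDist().cdf(z)         # upper-tail probability
    return se, z, p_value


# Sample 1 = intermediates, sample 2 = beginners, hypothesised difference 3.
se, z, p_value = two_sample_z(36.9, 38.1, 60, 31.7, 57.3, 80, 3)
critical = NormalDist().inv_cdf(0.95)         # one-tailed 5% critical value

print(f"s.e. = {se:.4f}, z = {z:.4f}, p = {p_value:.4f}, c.v. = {critical:.4f}")
print("Reject H0" if z > critical else "Do not reject H0")
```

With the table values this gives a standard error of about 1.162 and a test statistic of about 1.89, which exceeds 1.6449, so the sketch agrees with the mark scheme's decision to reject $H_0$.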

**Alternative method (2nd M1, 1st A1, 3rd B1):** Let $D = \bar{x}_I - \bar{x}_B$

| Answer/Working | Mark | Guidance |
|---|---|---|
| $1.6449 = \dfrac{D-3}{1.1624...}$, so $D = 4.912...$ | dM1; A1 | $\dfrac{D-3}{\text{their } 1.1624...} = 1.6449/1.645/1.64/1.65$; $D =$ awrt 4.91 and $D_\text{obs} = 5.2$ |
| $D_\text{obs} = 36.9 - 31.7 = 5.2$; in $[1.64, 1.65]$ | B1 | Critical value of $-1.6449$ |
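
The alternative method compares the observed difference with the critical difference instead of comparing $z$ with the critical value. A minimal sketch of that comparison, assuming the same large-sample normal approximation, is given here. The variable names `d_crit` and `d_obs` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Standard error of the difference in sample means (intermediates - beginners).
se = sqrt(38.1 / 60 + 57.3 / 80)

# One-tailed 5% critical value of the standard normal distribution.
z_crit = NormalDist().inv_cdf(0.95)

# Solve z_crit = (D - 3) / se for the critical difference D.
d_crit = 3 + z_crit * se

# Observed difference in sample means.
d_obs = 36.9 - 31.7

print(f"critical D = {d_crit:.3f}, observed D = {d_obs:.1f}")
print("Reject H0" if d_obs > d_crit else "Do not reject H0")
```

Because the observed difference of 5.2 is greater than the critical difference of about 4.91, this route reaches the same conclusion as the test-statistic route.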

---
5. A dance studio has 800 dancers of which

\begin{displayquote}
452 are beginners\\
251 are intermediates\\
97 are professionals
\end{displayquote}

\begin{enumerate}[label=(\alph*)]
\item Explain in detail how a stratified sample of size 50 could be taken.
\item State an advantage of stratified sampling rather than simple random sampling in this situation.

Independent random samples of 80 beginners and 60 intermediates are chosen. Each of these dancers is given an assessment score, $x$, based on the quality of their dancing. The results are summarised in the table below.

\begin{center}
\begin{tabular}{ | c | c | c | c | }
\hline
 & $\bar { x }$ & $s ^ { 2 }$ & $n$ \\
\hline
Beginners & 31.7 & 57.3 & 80 \\
\hline
Intermediates & 36.9 & 38.1 & 60 \\
\hline
\end{tabular}
\end{center}

The studio manager believes that the mean score of intermediates is more than 3 points greater than the mean score of beginners.
\item Stating your hypotheses clearly and using a $5 \%$ level of significance, test whether or not these data support the studio manager's belief.
\end{enumerate}

\hfill \mbox{\textit{Edexcel S3 2017 Q5 [11]}}