Edexcel S3: Question 7 (17 marks)

Exam Board: Edexcel
Module: S3 (Statistics 3)
Marks: 17
Topic: Chi-squared goodness of fit
Type: Chi-squared goodness of fit: Normal
Difficulty: Standard +0.3. This is a standard chi-squared goodness-of-fit test with given parameters. Part (a) requires routine normal distribution calculations using tables or a calculator, part (b) is a textbook hypothesis test application, and part (c) asks for conceptual understanding of how estimating parameters affects the degrees of freedom. All techniques are core S3 content with no novel problem-solving required, making it slightly easier than average.
Spec: 5.06b Fit prescribed distribution: chi-squared test; 5.06c Fit other distributions: discrete and continuous

A shoe manufacturer sees a report from another country stating that the length of adult male feet is normally distributed with a mean of 22.4 cm and a standard deviation of 2.8 cm. The manufacturer wishes to see if this model is appropriate for his customers and collects data on the length, correct to the nearest cm, of the right foot of a random sample of 200 males giving the following results:

| Length (cm) | $\leq 18$ | $19 - 21$ | $22 - 24$ | $25 - 27$ | $\geq 28$ |
|---|---|---|---|---|---|
| No. of Men | 24 | 48 | 69 | 41 | 18 |

The expected frequencies for the $\leq 18$ and $19 - 21$ groups are calculated as 16.46 and 58.44 respectively, correct to 2 decimal places.

(a) Calculate expected frequencies for the other three classes. [7]

(b) Stating your hypotheses clearly, test at the 10% level of significance whether or not this data can be modelled by the distribution $N(22.4, 2.8^2)$. [7]

The manufacturer wishes to refine the model by not assuming a mean and standard deviation.

(c) Explain briefly how the manufacturer should proceed. [3]

Answer | Marks | Guidance

**Part (a)** | M1, M1 A1, A1, M1, A1, A1 | let $X =$ length of adult male feet; $P(21.5 < X < 24.5) = P\left(\frac{21.5-22.4}{2.8} < Z < \frac{24.5-22.4}{2.8}\right)$; $= P(-0.32 < Z < 0.75) = 0.7734 - (1 - 0.6255) = 0.3989$; exp. freq. = $0.3989 \times 200 = 79.78$; $P(24.5 < X < 27.5) = P\left(0.75 < Z < \frac{27.5-22.4}{2.8}\right)$; $= P(0.75 < Z < 1.82) = 0.9656 - 0.7734 = 0.1922$; exp. freq. = $0.1922 \times 200 = 38.44$; exp. freq. for $X > 27.5$ = 200 − total of others = 6.88
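
The part (a) calculation can be checked numerically. The following is a minimal sketch, not part of the mark scheme: the function names and the `z_decimals` option are illustrative assumptions, and the class boundaries 18.5, 21.5, 24.5 and 27.5 come from the lengths being recorded to the nearest cm. Rounding each z-value to 2 decimal places imitates reading from printed tables, so those results should agree with the mark scheme to within table rounding; the unrounded version differs slightly.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def expected_frequencies(mean, sd, boundaries, n, z_decimals=None):
    """Expected class frequencies under N(mean, sd^2).

    `boundaries` are the interior class boundaries; the first and last
    classes are treated as open-ended. If `z_decimals` is given, each
    z-value is rounded first to imitate table look-up (an assumption
    made here for comparison with the mark scheme).
    """
    cdf_values = []
    for b in boundaries:
        z = (b - mean) / sd
        if z_decimals is not None:
            z = round(z, z_decimals)
        cdf_values.append(normal_cdf(z))
    cumulative = [0.0] + cdf_values + [1.0]
    # Class probability = difference of consecutive CDF values, scaled by n.
    return [n * (hi - lo) for lo, hi in zip(cumulative, cumulative[1:])]


boundaries = [18.5, 21.5, 24.5, 27.5]
table_style = expected_frequencies(22.4, 2.8, boundaries, 200, z_decimals=2)
exact = expected_frequencies(22.4, 2.8, boundaries, 200)

print("z rounded to 2 d.p.:", [round(e, 2) for e in table_style])
print("unrounded z:        ", [round(e, 2) for e in exact])
```

The last class is obtained as the remaining probability, which mirrors the mark scheme's "200 − total of others" step.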

**Part (b)** | B1 | $H_0: N(22.4, 2.8^2)$ is a suitable model; $H_1: N(22.4, 2.8^2)$ is not a suitable model

| | O | E | (O – E) | $\frac{(O-E)^2}{E}$ |
|---|---|---|---|---|
| | 24 | 16.46 | 7.54 | 3.4539 |
| | 48 | 58.44 | −10.44 | 1.8651 |
| | 69 | 79.78 | −10.78 | 1.4566 |
| | 41 | 38.44 | 2.56 | 0.1705 |
| | 18 | 6.88 | 11.12 | 17.9730 |

| M1 A2, M1 A1, A1 | $\therefore \sum \frac{(O-E)^2}{E} = 24.919$; $\nu = 5 - 1 = 4, \chi^2_{\text{crit}}(10\%) = 7.779$; $24.919 > 7.779$ $\therefore$ reject $H_0$; $N(22.4, 2.8^2)$ is not a suitable model
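
The test statistic in part (b) can be reproduced with a short script. This is a sketch under stated assumptions: the variable names are illustrative, and the critical value 7.779 is copied from the mark scheme rather than computed.

```python
observed = [24, 48, 69, 41, 18]
expected = [16.46, 58.44, 79.78, 38.44, 6.88]

# Contribution of each class to the chi-squared statistic: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

# No parameters were estimated from the data, so only the total constrains the cells.
degrees_of_freedom = len(observed) - 1

# Critical value for 4 degrees of freedom at the 10% level, taken from the mark scheme.
critical_value = 7.779
reject_h0 = statistic > critical_value

print([round(c, 4) for c in contributions])
print(round(statistic, 3), degrees_of_freedom, reject_h0)
```

The final class contributes most of the statistic, which is why the model is rejected so clearly.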

**Part (c)** | B3, (17) | use data to estimate mean and std. dev.; combine any cells with exp. freqs. < 5 and repeat calculation; $\nu =$ no. of cells after combining − 3, as two parameters have been estimated from the data
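
The bookkeeping in part (c) can be sketched as follows. Both helper functions are hypothetical illustrations, not part of the mark scheme. The pooling rule shown only merges small cells inward from each tail, which is a simplifying assumption, and the step of estimating the mean and standard deviation from the grouped data is left out.

```python
def pool_small_cells(observed, expected, minimum=5.0):
    """Merge tail cells inward until each tail's expected frequency >= minimum.

    Simplification: only the first and last cells are checked, which is
    where small expected frequencies typically occur for a normal model.
    """
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[0] < minimum:
        obs[1] += obs[0]
        exp[1] += exp[0]
        del obs[0], exp[0]
    while len(exp) > 1 and exp[-1] < minimum:
        obs[-2] += obs[-1]
        exp[-2] += exp[-1]
        del obs[-1], exp[-1]
    return obs, exp


def degrees_of_freedom(n_cells, n_estimated_parameters):
    """Cells, minus 1 for the fixed total, minus 1 per estimated parameter."""
    return n_cells - 1 - n_estimated_parameters


# With the five classes of this question and no pooling needed,
# estimating both the mean and the standard deviation gives nu = 5 - 3.
nu_estimated = degrees_of_freedom(5, 2)
# With both parameters prescribed, as in part (b), nu = 5 - 1.
nu_prescribed = degrees_of_freedom(5, 0)
print(nu_estimated, nu_prescribed)
```

Re-estimating the parameters changes the expected frequencies, so the pooling check has to be repeated on the new values before the degrees of freedom are counted.
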
\hfill \mbox{\textit{Edexcel S3  Q7 [17]}}