| Exam Board | OCR |
|---|---|
| Module | H240/02 (Pure Mathematics and Statistics) |
| Year | 2022 |
| Session | June |
| Marks | 14 |
| Topic | Normal Distribution |
| Type | Validity of normal model |
| Difficulty | Standard +0.3. A straightforward application of the normal distribution with histogram interpretation. Part (a) requires a basic frequency-density reading; part (b) asks for brief justifications of the model and its parameters; part (c) is a standard normal probability calculation; part (d) requires estimating the mean and standard deviation from grouped data (routine A-level Statistics) before a second probability calculation; part (e) involves comparing the two models. All are standard techniques with no novel problem-solving required. Slightly easier than average because of the scaffolded structure and routine methods. |
| Spec | 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation; 2.04e Normal distribution: as model N(mu, sigma^2); 2.04f Find normal probabilities: Z transformation |
## Question 9:
**Part (a)**
Area of 20-30 block $\div$ total area $= \frac{110}{750}$ or $\frac{22}{150}$ or $\frac{4.4}{30} = 0.147$ (3 sf) | **M1, A1** [2] | attempted using any units; correct calculation and answer 0.147. Not any method starting with 0.147
**Part (b)(i)**
Roughly bell-shaped | **B1** [1] | or Roughly symmetrical AND peaks in middle AND tails off at each end. All 3 must be seen. Not "Shape is like normal curve"
**Part (b)(ii)**
Roughly symmetrical about $x = 40$, or area to left of $40 \approx$ area to right of 40, or peak is at 40 | **B1** [2] | or calculate mean and obtain $\frac{5915}{150}$ or 39.4. Allow 40 has highest frequency. $70 - 40 \approx 3\sigma$, hence $\sigma \approx 10$
or (Area within $40\pm10$)/total e.g. $510/750$ or $102/150 = 0.68$ or $\approx \frac{2}{3}$ | **B1** | Must see correct fraction and $\approx \frac{2}{3}$, or 68% or 0.68 [2]
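The 68% argument for $\sigma \approx 10$ can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the count 102 is the mark scheme's own figure for plants within $40 \pm 10$, and the variable names are illustrative only.

```python
from statistics import NormalDist

# Proportion of the sample within 40 +/- 10 cm, using the mark scheme's 102/150.
sample_within = 102 / 150

# Proportion of any normal distribution within one standard deviation of its mean.
std_normal = NormalDist()
theory_within = std_normal.cdf(1) - std_normal.cdf(-1)

print(round(sample_within, 2), round(theory_within, 4))
```

The two proportions agree closely, which is what supports taking 10 as the standard deviation and hence 100 as the variance.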
**Part (c)**
$0.136$ (3 sf) | **B1** [1] | **BC**
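"BC" means the value is expected directly from a calculator. As a hedged illustration of the same calculation, the sketch below evaluates $\mathrm{P}(20 < X < 30)$ under Sam's model with `statistics.NormalDist`; the names `sam` and `p_sam` are illustrative only.

```python
from statistics import NormalDist

# Sam's model N(40, 100): mean 40, variance 100, so standard deviation 10.
sam = NormalDist(mu=40, sigma=10)

# P(20 < X < 30) is the difference of the cumulative probabilities at 30 and 20.
p_sam = sam.cdf(30) - sam.cdf(20)

print(round(p_sam, 3))
```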
**Part (d)**
$m = 39.4$ or $\frac{5915}{150}$ or $\frac{1183}{30}$ | **B1** | Allow $39.1 \leq m \leq 39.7$. **BC**
$s = 10.3$ (3 sf) or $s^2 = 106$ (3 sf) | **B1** | Allow $105.5 \leq s^2 \leq 108.5$ or $10.27 \leq s \leq 10.42$
$0.150$ or $0.151$ or $0.152$ (3 sf) | **B2, B1** [4] | cao; or B1 for 0.145 to 0.158. NB no retrospective marks if 0.151 seen in table for (e)(i)
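The route from grouped data to Nina's probability can be sketched as follows. The histogram itself is not reproduced here, so the class boundaries and frequencies below are an assumption: they are reconstructed from the histogram row of the table in part (e)(i) multiplied by 150, with end classes taken as 10–20 and 60–70, and they reproduce the mark scheme's $\sum fx = 5915$. The standard deviation uses divisor $n$, which is also an assumption; the mark scheme's tolerance appears to admit either divisor.

```python
from math import sqrt
from statistics import NormalDist

# ASSUMPTION: (lower, upper, frequency) reconstructed from the part (e)(i) table,
# not read from the original histogram.
classes = [
    (10, 20, 4), (20, 30, 22), (30, 35, 23), (35, 40, 28),
    (40, 45, 29), (45, 50, 22), (50, 60, 20), (60, 70, 2),
]

# Estimate the mean and standard deviation from class midpoints.
n = sum(f for _, _, f in classes)
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

m = sum_fx / n
s = sqrt(sum_fx2 / n - m ** 2)  # ASSUMPTION: divisor n rather than n - 1

# Nina's model N(m, s^2).
nina = NormalDist(mu=m, sigma=s)
p_nina = nina.cdf(30) - nina.cdf(20)

print(n, sum_fx, round(m, 1), round(s, 1), round(p_nina, 3))
```

Under these assumptions the estimates fall inside the ranges the mark scheme allows for $m$, $s$ and the final probability.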
**Part (e)(i)**
| $x$ | $<20$ | 20–30 | 30–35 | 35–40 | 40–45 | 45–50 | 50–60 | $>60$ |
|---|---|---|---|---|---|---|---|---|
| Histogram | 0.027 | 0.147 | 0.153 | 0.187 | 0.193 | 0.147 | 0.133 | 0.013 |
| $N(40,100)$ | 0.023 | 0.136 | 0.150 | 0.191 | 0.191 | 0.150 | 0.136 | 0.023 |
| $N(m,s^2)$ | 0.030 | 0.151 | 0.153 | 0.189 | 0.183 | 0.142 | 0.130 | 0.023 |
**B1** middle row correct $\pm0.001$ | **B1** bottom row correct $\pm0.003$ [2] | No FT
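Both model rows can be generated with one helper. This is a sketch only: `table_row` is a hypothetical helper name, and Nina's parameters are taken as the rounded mark-scheme values $m = 39.4$ and $s = 10.3$, so her row may differ from the printed one in the third decimal place (the mark scheme allows $\pm 0.003$).

```python
from statistics import NormalDist

# Interior class boundaries from the table; the end columns are open-ended tails.
cuts = [20, 30, 35, 40, 45, 50, 60]

def table_row(model):
    """Return the model probability for each column of the table, to 3 dp."""
    cdf = [0.0] + [model.cdf(c) for c in cuts] + [1.0]
    return [round(hi - lo, 3) for lo, hi in zip(cdf, cdf[1:])]

sam_row = table_row(NormalDist(mu=40, sigma=10))
# ASSUMPTION: rounded parameter values for Nina's model.
nina_row = table_row(NormalDist(mu=39.4, sigma=10.3))

print(sam_row)
print(nina_row)
```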
**Part (e)(ii)**
Nina's model better fit for lower values of $X$; Nina's model better fit for any ranges $< 40$; Nina's model less good fit for 40–45 (or $>60$) | **B1** | Allow "more accurate" or "less accurate"
Sam's model better fit for higher values; Sam's model better fit for any ranges $> 40$; Sam's model less good fit for 20–30 (or $>60$) | **B1** [2] | BUT SC: "Both less good fit for $>60$" alone: B1 only. NOT "Both are fairly good fit" B0B0
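One way to make the comparison concrete is to tabulate the absolute difference between each model and the histogram, column by column. The sketch below uses only the three rows printed in the part (e)(i) table; the helper `abs_errors` and the "closer" label are illustrative, not part of the mark scheme.

```python
labels = ["<20", "20-30", "30-35", "35-40", "40-45", "45-50", "50-60", ">60"]
hist = [0.027, 0.147, 0.153, 0.187, 0.193, 0.147, 0.133, 0.013]
sam = [0.023, 0.136, 0.150, 0.191, 0.191, 0.150, 0.136, 0.023]
nina = [0.030, 0.151, 0.153, 0.189, 0.183, 0.142, 0.130, 0.023]

def abs_errors(model):
    """Absolute difference from the histogram row, rounded to 3 dp."""
    return [round(abs(h - p), 3) for h, p in zip(hist, model)]

sam_err = abs_errors(sam)
nina_err = abs_errors(nina)

for label, es, en in zip(labels, sam_err, nina_err):
    closer = "Sam" if es < en else "Nina" if en < es else "tie"
    print(f"{label:>6}  Sam {es:.3f}  Nina {en:.3f}  closer: {closer}")
```

On these rounded values Nina's model is closer in every column below 40 and Sam's is closer for 40–45 and 45–50, while both overestimate the $>60$ column by the same amount, consistent with the comments credited above.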
---
9 The heights, in centimetres, of a random sample of 150 plants of a certain variety were measured. The results are summarised in the histogram.\\
\includegraphics[max width=\textwidth, alt={}, center]{cb83836f-753f-4b3a-99e8-a18aff0f49ff-08_842_1651_495_207}
One of the 150 plants is chosen at random, and its height, $X \mathrm {~cm}$, is noted.
\begin{enumerate}[label=(\alph*)]
\item Show that $\mathrm { P } ( 20 < X < 30 ) = 0.147$, correct to 3 significant figures.
Sam suggests that the distribution of $X$ can be well modelled by the distribution $\mathrm { N } ( 40,100 )$.
\item \begin{enumerate}[label=(\roman*)]
\item Give a brief justification for the use of the normal distribution in this context.
\item Give a brief justification for the choice of the parameter values 40 and 100.
\end{enumerate}\item Use Sam's model to find $\mathrm { P } ( 20 < X < 30 )$.
Nina suggests a different model. She uses the midpoints of the classes to calculate estimates, $m$ and $s$, for the mean and standard deviation respectively, in centimetres, of the 150 heights. She then uses the distribution $\mathrm { N } \left( m , s ^ { 2 } \right)$ as her model.
\item Use Nina's model to find $\mathrm { P } ( 20 < X < 30 )$.
\item \begin{enumerate}[label=(\roman*)]
\item Complete the table in the Printed Answer Booklet to show the probabilities obtained from Sam's model and Nina's model.
\item By considering the different ranges of values of $X$ given in the table, discuss how well the two models fit the original distribution.
\end{enumerate}\end{enumerate}
\hfill \mbox{\textit{OCR H240/02 2022 Q9 [14]}}