OCR H240/02 2022 June: Question 9 (14 marks)

Exam Board: OCR
Module: H240/02 (Pure Mathematics and Statistics)
Year: 2022
Session: June
Marks: 14
Topic: Normal Distribution
Type: Validity of normal model
Difficulty: Standard +0.3. This is a straightforward application of the normal distribution with histogram interpretation. Part (a) requires reading frequency densities from a histogram, part (b) asks for brief justifications of the model and its parameters, parts (c) and (d) are standard normal probability calculations (part (d) first needs the mean and standard deviation from grouped data, which is routine A-level Statistics), and part (e) compares the two models. All of these are standard techniques with no novel problem-solving required. Slightly easier than average due to scaffolded structure and routine methods.
Spec: 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation; 2.04e Normal distribution: as model N(mu, sigma^2); 2.04f Find normal probabilities: Z transformation

9 The heights, in centimetres, of a random sample of 150 plants of a certain variety were measured. The results are summarised in the histogram.

\includegraphics[max width=\textwidth, alt={}, center]{cb83836f-753f-4b3a-99e8-a18aff0f49ff-08_842_1651_495_207}

One of the 150 plants is chosen at random, and its height, \(X \mathrm{~cm}\), is noted.

(a) Show that \(\mathrm{P}(20 < X < 30) = 0.147\), correct to 3 significant figures.

Sam suggests that the distribution of \(X\) can be well modelled by the distribution \(\mathrm{N}(40, 100)\).

(b)(i) Give a brief justification for the use of the normal distribution in this context.

(b)(ii) Give a brief justification for the choice of the parameter values 40 and 100.

(c) Use Sam's model to find \(\mathrm{P}(20 < X < 30)\).

Nina suggests a different model. She uses the midpoints of the classes to calculate estimates, \(m\) and \(s\), for the mean and standard deviation respectively, in centimetres, of the 150 heights. She then uses the distribution \(\mathrm{N}\left(m, s^{2}\right)\) as her model.

(d) Use Nina's model to find \(\mathrm{P}(20 < X < 30)\).

(e)(i) Complete the table in the Printed Answer Booklet to show the probabilities obtained from Sam's model and Nina's model.

(e)(ii) By considering the different ranges of values of \(X\) given in the table, discuss how well the two models fit the original distribution.

## Question 9:

**Part (a)**

Area of 20-30 block $\div$ total area $= \frac{110}{750}$ or $\frac{22}{150}$ or $\frac{4.4}{30} = 0.147$ (3 sf) | **M1, A1** [2] | attempted using any units; correct calculation and answer 0.147. Not any method starting with 0.147

**Part (b)(i)**

Roughly bell-shaped | **B1** [1] | or Roughly symmetrical AND peaks in middle AND tails off at each end. All 3 must be seen. Not "Shape is like normal curve"

**Part (b)(ii)**

Roughly symmetrical about $x = 40$, or area to left of $40 \approx$ area to right of 40, or peak is at 40 | **B1** | or calculate mean and obtain $\frac{5915}{150}$ or 39.4. Allow "40 has highest frequency"

$70 - 40 \approx 3\sigma$, hence $\sigma \approx 10$; or (Area within $40\pm10$)/total e.g. $510/750$ or $102/150 = 0.68$ or $\approx \frac{2}{3}$ | **B1** [2] | Must see correct fraction and $\approx \frac{2}{3}$, or 68% or 0.68
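
As a quick check of the area argument, the histogram proportion within 10 cm of 40 can be set against the probability that a normal variable lies within one standard deviation of its mean. The figure 0.683 is the standard value of $\Phi(1) - \Phi(-1)$ and is not quoted in the mark scheme:

$$\frac{102}{150} = 0.68 \approx \mathrm{P}(\mu - \sigma < X < \mu + \sigma) \approx 0.683 \quad\Rightarrow\quad \sigma \approx 10,\ \sigma^2 \approx 100$$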

**Part (c)**

$0.136$ (3 sf) | **B1** [1] | **BC**
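
The mark scheme expects this value by calculator (BC). The same calculation can be sketched in Python using only the standard library. The helper `normal_cdf` is a hypothetical name introduced here for illustration, built on `math.erf`; it is not part of the mark scheme.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Return P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


# Sam's model N(40, 100): the variance is 100, so the standard deviation is 10.
mu, sigma = 40.0, 10.0
p_sam = normal_cdf(30, mu, sigma) - normal_cdf(20, mu, sigma)
print(round(p_sam, 3))  # prints 0.136
```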

**Part (d)**

$m = 39.4$ or $\frac{5915}{150}$ or $\frac{1183}{30}$ | **B1** | Allow $39.1 \leq m \leq 39.7$. **BC**

$s = 10.3$ (3 sf) or $s^2 = 106$ (3 sf) | **B1** | Allow $105.5 \leq s^2 \leq 108.5$ or $10.27 \leq s \leq 10.42$

$0.150$ or $0.151$ or $0.152$ (3 sf) | **B2, B1** [4] | cao; or B1 for 0.145 to 0.158. NB no retrospective marks if 0.151 seen in table for (e)(i)
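
A minimal sketch of Nina's calculation follows, under stated assumptions. The class frequencies are not given in the text; they are recovered here by multiplying the histogram proportions in the part (e)(i) table by 150, and the end classes are assumed to be 10–20 and 60–70. Both assumptions reproduce the mark scheme total $\frac{5915}{150}$ for the mean. The helper `normal_cdf` is a hypothetical name, not part of the mark scheme.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Return P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


# (lower bound, upper bound, frequency). ASSUMED values: frequencies are the
# table proportions times 150; end classes 10-20 and 60-70 are also assumed.
classes = [(10, 20, 4), (20, 30, 22), (30, 35, 23), (35, 40, 28),
           (40, 45, 29), (45, 50, 22), (50, 60, 20), (60, 70, 2)]

n = sum(f for _, _, f in classes)
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

m = sum_fx / n                   # estimated mean from class midpoints
s = sqrt(sum_fx2 / n - m ** 2)   # standard deviation with divisor n

p_nina = normal_cdf(30, m, s) - normal_cdf(20, m, s)
print(f"m = {m:.1f}, s = {s:.2f}, P(20 < X < 30) = {p_nina:.3f}")
```

The sketch uses divisor $n$ for the standard deviation. Using $n - 1$ instead gives a slightly larger value that still lies inside the range the mark scheme allows for $s$.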

**Part (e)(i)**

| $x$ | $<20$ | 20–30 | 30–35 | 35–40 | 40–45 | 45–50 | 50–60 | $>60$ |
|---|---|---|---|---|---|---|---|---|
| Histogram | 0.027 | 0.147 | 0.153 | 0.187 | 0.193 | 0.147 | 0.133 | 0.013 |
| $N(40,100)$ | 0.023 | 0.136 | 0.150 | 0.191 | 0.191 | 0.150 | 0.136 | 0.023 |
| $N(m,s^2)$ | 0.030 | 0.151 | 0.153 | 0.189 | 0.183 | 0.142 | 0.130 | 0.023 |

**B1** middle row correct $\pm0.001$ | **B1** bottom row correct $\pm0.003$ [2] | No FT
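
The two model rows of the table can be regenerated with a short script. This is a sketch only: `normal_cdf` and `model_row` are hypothetical helper names, and Nina's parameters are taken as $m = \frac{5915}{150}$ and the rounded value $s = 10.3$ quoted in part (d), so her row may differ from the published one in the third decimal place (the mark scheme allows $\pm0.003$).

```python
from math import erf, inf, sqrt


def normal_cdf(x, mu, sigma):
    """Return P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


# Class boundaries of the table; the open-ended classes run to -inf and +inf.
BOUNDS = [-inf, 20, 30, 35, 40, 45, 50, 60, inf]


def model_row(mu, sigma):
    """Probability of each class in the table under N(mu, sigma^2)."""
    return [normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
            for lo, hi in zip(BOUNDS, BOUNDS[1:])]


sam_row = model_row(40, 10)             # Sam's model N(40, 100)
nina_row = model_row(5915 / 150, 10.3)  # m and rounded s from part (d)

for name, row in (("Sam", sam_row), ("Nina", nina_row)):
    print(name, " ".join(f"{p:.3f}" for p in row))
```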

**Part (e)(ii)**

Nina's model better fit for lower values of $X$; Nina's model better fit for any ranges $< 40$; Nina's model less good fit for 40–45 (or $>60$) | **B1** | Allow "more accurate" or "less accurate"

Sam's model better fit for higher values; Sam's model better fit for any ranges $> 40$; Sam's model less good fit for 20–30 (or $>60$) | **B1** [2] | BUT SC: "Both less good fit for $>60$" alone: B1 only. NOT "Both are fairly good fit" B0B0
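
The comparison behind these marks can be made explicit by taking the absolute difference between each model and the histogram, class by class. The sketch below uses only the values printed in the part (e)(i) table; the function name `closer_model` is hypothetical.

```python
classes = ["<20", "20-30", "30-35", "35-40", "40-45", "45-50", "50-60", ">60"]
histogram = [0.027, 0.147, 0.153, 0.187, 0.193, 0.147, 0.133, 0.013]
sam = [0.023, 0.136, 0.150, 0.191, 0.191, 0.150, 0.136, 0.023]
nina = [0.030, 0.151, 0.153, 0.189, 0.183, 0.142, 0.130, 0.023]


def closer_model(h, s, n):
    """Name the model whose probability is nearer the histogram value."""
    # Round to 3 d.p. so floating-point noise does not break genuine ties.
    err_sam = round(abs(s - h), 3)
    err_nina = round(abs(n - h), 3)
    if err_sam < err_nina:
        return "Sam"
    if err_nina < err_sam:
        return "Nina"
    return "tie"


verdicts = [closer_model(h, s, n) for h, s, n in zip(histogram, sam, nina)]
for label, verdict in zip(classes, verdicts):
    print(f"{label:>6}: {verdict}")
```

On the tabulated values, Nina's model is closer for every class below 40, Sam's is closer for 40–45 and 45–50, and the two tie for 50–60 and for $>60$, where both overestimate by 0.010.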

---
9 The heights, in centimetres, of a random sample of 150 plants of a certain variety were measured. The results are summarised in the histogram.\\
\includegraphics[max width=\textwidth, alt={}, center]{cb83836f-753f-4b3a-99e8-a18aff0f49ff-08_842_1651_495_207}

One of the 150 plants is chosen at random, and its height, $X \mathrm {~cm}$, is noted.
\begin{enumerate}[label=(\alph*)]
\item Show that $\mathrm { P } ( 20 < X < 30 ) = 0.147$, correct to 3 significant figures.

Sam suggests that the distribution of $X$ can be well modelled by the distribution $\mathrm { N } ( 40,100 )$.
\item \begin{enumerate}[label=(\roman*)]
\item Give a brief justification for the use of the normal distribution in this context.
\item Give a brief justification for the choice of the parameter values 40 and 100.
\end{enumerate}
\item Use Sam's model to find $\mathrm { P } ( 20 < X < 30 )$.

Nina suggests a different model. She uses the midpoints of the classes to calculate estimates, $m$ and $s$, for the mean and standard deviation respectively, in centimetres, of the 150 heights. She then uses the distribution $\mathrm { N } \left( m , s ^ { 2 } \right)$ as her model.
\item Use Nina's model to find $\mathrm { P } ( 20 < X < 30 )$.
\item \begin{enumerate}[label=(\roman*)]
\item Complete the table in the Printed Answer Booklet to show the probabilities obtained from Sam's model and Nina's model.
\item By considering the different ranges of values of $X$ given in the table, discuss how well the two models fit the original distribution.
\end{enumerate}
\end{enumerate}

\hfill \mbox{\textit{OCR H240/02 2022 Q9 [14]}}