| Exam Board | Edexcel |
|---|---|
| Module | Paper 3 |
| Year | 2019 |
| Session | June |
| Marks | 11 |
| Topic | Normal Distribution |
| Type | Outliers and box plots |
| Difficulty | Moderate (-0.3). This is a multi-part question covering standard A-level statistics content: completing a box plot with outliers (routine calculation using the 1.5×IQR rule), calculating standard deviation from summary statistics (direct formula application), and finding percentiles from a normal distribution (calculator work). Part (e) requires knowledge of the large data set but is straightforward recall. All parts are textbook-standard with no novel problem-solving required, making it slightly easier than average. |
| Spec | 2.01a Population and sample: terminology; 2.02a Interpret single variable data: tables and diagrams; 2.02g Calculate mean and standard deviation; 2.04e Normal distribution: as model N(mu, sigma^2); 2.04f Find normal probabilities: Z transformation |
## Question 2:
### Part (a):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $IQR = 26.6 - 19.4 = 7.2$ | B1 | Correct IQR calculation (implied by 10.8 or 8.6 or 37.4) |
| $19.4 - 1.5 \times 7.2 = 8.6$ or $26.6 + 1.5 \times 7.2 = 37.4$ | M1 | Complete method for either outlier limit |
| Plotting upper whisker to 32.5 **and** lower whisker to 8.6 or 9.1 | A1 | Both whiskers plotted correctly (allow ½ square tolerance) |
| Plotting 7.6 and 8.1 as the only two outliers | A1 | Only two outliers plotted, disconnected from whisker |
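
The outlier check in part (a) can be reproduced with a short script. This is only a sketch: the quartiles 19.4 and 26.6 are taken from the mark scheme above, the extreme values from the question, and the variable names are illustrative assumptions.

```python
# Quartiles as used in the mark scheme; extreme values as given in the question.
q1, q3 = 19.4, 26.6
lowest_values = [7.6, 8.1, 9.1]
highest_value = 32.5

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr  # values below this are outliers
upper_fence = q3 + 1.5 * iqr  # values above this are outliers

candidates = lowest_values + [highest_value]
outliers = [x for x in candidates if x < lower_fence or x > upper_fence]

# Lowest data value that is not an outlier (one accepted end for the lower whisker).
lowest_non_outlier = min(x for x in lowest_values if x >= lower_fence)

print(round(iqr, 1), round(lower_fence, 1), round(upper_fence, 1))
print(outliers, lowest_non_outlier)
```

The mark scheme accepts the lower whisker drawn either to the fence (8.6) or to the lowest non-outlying value (9.1); the script reports both quantities so either convention can be checked.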
### Part (b):
| Answer/Working | Mark | Guidance |
|---|---|---|
| October (coldest temperatures between May and October in Beijing) | B1 | — |
### Part (c):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $\sigma = \sqrt{\frac{4952.906}{184}}$ or $\sigma = \sqrt{\frac{S_{xx}}{n}} = 5.188...$ $[=5.19^*]$ | B1cso* | Correct expression with square root **or** correct formula; $\sum x^2 =$ awrt 98720 and $\sigma = \sqrt{\frac{98715.9...}{184} - \left(\frac{4153.6}{184}\right)^2}$ |
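
As a check on part (c), the two routes in the mark scheme can be evaluated numerically. The sketch below uses only the summary statistics given in the question; the variable names are assumptions for illustration.

```python
import math

# Summary statistics from the question.
n = 184
sum_x = 4153.6
s_xx = 4952.906

# Route 1: sigma = sqrt(S_xx / n).
sigma = math.sqrt(s_xx / n)

# Route 2: recover sum of x^2 from S_xx = sum(x^2) - (sum x)^2 / n,
# then use sigma = sqrt(sum(x^2)/n - mean^2).
sum_x_sq = s_xx + sum_x**2 / n
mean = sum_x / n
sigma_alt = math.sqrt(sum_x_sq / n - mean**2)

print(round(sigma, 2))  # 5.19
print(round(sum_x_sq, 1), round(sigma_alt, 2))
```

Both routes agree because they are algebraically the same expression; the second simply matches the alternative form credited in the guidance column.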
### Part (d):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $z = (\pm)\ 1.28(16)$ | B1 | Identifying z-value for 10th or 90th percentile (allow awrt $\pm 1.28$); or identifying $[P_{90}=]29.251...$ or $[P_{10}=]15.948...$ |
| $2 \times 1.2816 \times 5.19$ | M1 | For $2 \times z \times 5.19$ where $1 < z < 2$; or $P_{90} - P_{10}$ where $25 < P_{90} < 35$ and $10 < P_{10} < 20$ |
| $= $ awrt **13.3** | A1 | — |
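
The interpercentile range in part (d) can be verified with Python's standard-library `statistics.NormalDist` (available from Python 3.8). This is a sketch of the calculator work under Simon's model, not a required method; the variable names are illustrative.

```python
from statistics import NormalDist

# Simon's model: T ~ N(22.6, 5.19^2).
model = NormalDist(mu=22.6, sigma=5.19)

# 10th and 90th percentiles from the inverse cumulative distribution function.
p10 = model.inv_cdf(0.10)
p90 = model.inv_cdf(0.90)
interpercentile_range = p90 - p10

# Equivalent route via the standard normal z-value: 2 * z * sigma.
z90 = NormalDist().inv_cdf(0.90)
via_z = 2 * z90 * 5.19

print(round(p10, 3), round(p90, 3))
print(round(interpercentile_range, 1))  # 13.3
```

Because the normal distribution is symmetric about its mean, the range depends only on the standard deviation, which is why the mark scheme's $2 \times z \times 5.19$ gives the same value.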
### Part (e):
| Answer/Working | Mark | Guidance |
|---|---|---|
| Daily mean wind speed/Beaufort — qualitative data | B1 | Data is non-numeric; do not allow wind direction/wind gust |
| Rainfall — not symmetric/lots of days with 0 rainfall | B1 | Not symmetric/skewed/not bell shaped/lots of 0s/many days with no rain/mean≠mode or median |
---
2.
\begin{figure}[h]
\begin{center}
\includegraphics[alt={},max width=\textwidth]{d1eaaae7-c1dc-4aee-ab54-59f35519a7a4-06_321_1822_294_127}
\captionsetup{labelformat=empty}
\caption{Figure 1}
\end{center}
\end{figure}
The partially completed box plot in Figure 1 shows the distribution of daily mean air temperatures using the data from the large data set for Beijing in 2015.

An outlier is defined as a value\\
more than $1.5 \times$ IQR below $Q _ { 1 }$ or\\
more than $1.5 \times$ IQR above $Q _ { 3 }$\\
The three lowest air temperatures in the data set are $7.6 ^ { \circ } \mathrm { C } , 8.1 ^ { \circ } \mathrm { C }$ and $9.1 ^ { \circ } \mathrm { C }$\\
The highest air temperature in the data set is $32.5 ^ { \circ } \mathrm { C }$
\begin{enumerate}[label=(\alph*)]
\item Complete the box plot in Figure 1 showing clearly any outliers.
\item Using your knowledge of the large data set, suggest from which month the two outliers are likely to have come.
Using the data from the large data set, Simon produced the following summary statistics for the daily mean air temperature, $x ^ { \circ } \mathrm { C }$, for Beijing in 2015
$$n = 184 \quad \sum x = 4153.6 \quad S _ { x x } = 4952.906$$
\item Show that, to 3 significant figures, the standard deviation is $5.19 ^ { \circ } \mathrm { C }$.

Simon decides to model the air temperatures with the random variable
$$T \sim \mathrm { N } \left( 22.6,\, 5.19 ^ { 2 } \right)$$
\item Using Simon's model, calculate the 10th to 90th interpercentile range.
Simon wants to model another variable from the large data set for Beijing using a normal distribution.
\item State two variables from the large data set for Beijing that are not suitable to be modelled by a normal distribution. Give a reason for each answer.\\
\includegraphics[max width=\textwidth, alt={}, center]{d1eaaae7-c1dc-4aee-ab54-59f35519a7a4-09_473_1813_2161_127}\\
(Total for Question 2 is 11 marks)
\end{enumerate}
\hfill \mbox{\textit{Edexcel Paper 3 2019 Q2 [11]}}