Edexcel Paper 3 2019 June — Question 2 11 marks

Exam Board: Edexcel
Module: Paper 3
Year: 2019
Session: June
Marks: 11
Topic: Normal Distribution
Type: Outliers and box plots
Difficulty: Moderate -0.3. This multi-part question covers standard A-level statistics content: completing a box plot with outliers (a routine calculation using the 1.5 × IQR rule), showing a standard deviation from summary statistics (direct formula application), and finding percentiles of a normal distribution (calculator work). Part (e) requires knowledge of the large data set but is straightforward recall. All parts are textbook-standard with no novel problem-solving required, making the question slightly easier than average.
Spec: 2.01a Population and sample: terminology; 2.02a Interpret single variable data: tables and diagrams; 2.02g Calculate mean and standard deviation; 2.04e Normal distribution: as model N(mu, sigma^2); 2.04f Find normal probabilities: Z transformation


## Question 2:

### Part (a):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $IQR = 26.6 - 19.4 = 7.2$ | B1 | Correct IQR calculation (implied by 10.8 or 8.6 or 37.4) |
| $19.4 - 1.5 \times 7.2 = 8.6$ or $26.6 + 1.5 \times 7.2 = 37.4$ | M1 | Complete method for either outlier limit |
| Plotting upper whisker to 32.5 **and** lower whisker to 8.6 or 9.1 | A1 | Both whiskers plotted correctly (allow ½ square tolerance) |
| Plotting 7.6 and 8.1 as the only two outliers | A1 | Only two outliers plotted, disconnected from whisker |

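The outlier limits in part (a) can be checked with a few lines of Python. This is only a sketch: the quartiles 19.4 and 26.6 are the values used in the mark scheme above, and the function name `outlier_fences` is a hypothetical helper introduced here for illustration.

```python
# Quartiles as used in the mark scheme for part (a).
q1, q3 = 19.4, 26.6


def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) limits; values strictly outside them are outliers.

    Hypothetical helper, not part of the mark scheme.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = outlier_fences(q1, q3)

# Extreme values quoted in the question.
lowest_three = [7.6, 8.1, 9.1]
highest = 32.5

# Keep only the values that fall outside the limits.
outliers = [t for t in lowest_three + [highest] if t < lower or t > upper]

print(round(lower, 1), round(upper, 1), outliers)
```

Under these assumptions the limits round to 8.6 and 37.4, so 7.6 and 8.1 are flagged while 9.1 and 32.5 are not, which agrees with the whisker positions accepted in the table.
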
### Part (b):
| Answer/Working | Mark | Guidance |
|---|---|---|
| October (the coldest of the months May to October in Beijing) | B1 | — |

### Part (c):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $\sigma = \sqrt{\frac{4952.906}{184}}$ or $\sigma = \sqrt{\frac{S_{xx}}{n}} = 5.188...$ $[=5.19^*]$ | B1cso* | Correct expression with square root **or** correct formula; $\sum x^2 =$ awrt 98720 and $\sigma = \sqrt{\frac{98715.9...}{184} - \left(\frac{4153.6}{184}\right)^2}$ |

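Both routes to the standard deviation in part (c) can be reproduced numerically. The sketch below uses only the summary statistics given in the question; the variable names are chosen here for illustration.

```python
import math

# Summary statistics from the question.
n = 184
sum_x = 4153.6
s_xx = 4952.906

# Route 1: sigma = sqrt(S_xx / n).
sigma = math.sqrt(s_xx / n)

# Route 2: recover sum of x^2 from S_xx = sum(x^2) - (sum x)^2 / n,
# then use sigma = sqrt(sum(x^2)/n - mean^2).
mean = sum_x / n
sum_x2 = s_xx + sum_x**2 / n
sigma_alt = math.sqrt(sum_x2 / n - mean**2)

print(round(mean, 1), round(sum_x2, 1), round(sigma, 3), round(sigma_alt, 3))
```

Both routes give the same value, which is 5.19 to 3 significant figures, and the mean rounds to the 22.6 that Simon uses in his model.
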
### Part (d):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $z = (\pm)\ 1.28(16)$ | B1 | Identifying z-value for 10th or 90th percentile (allow awrt $\pm 1.28$); or identifying $[P_{90}=]29.251...$ or $[P_{10}=]15.948...$ |
| $2 \times 1.2816 \times 5.19$ | M1 | For $2 \times z \times 5.19$ where $1 < z < 2$; or $P_{90} - P_{10}$ where $25 < P_{90} < 35$ and $10 < P_{10} < 20$ |
| $= $ awrt **13.3** | A1 | — |

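The interpercentile range in part (d) can be checked with the standard library's `statistics.NormalDist`, whose `inv_cdf` method returns the value below which a given proportion of the distribution lies. This is a sketch of a calculator-style check, not the method required by the mark scheme.

```python
from statistics import NormalDist

# Simon's model: T ~ N(22.6, 5.19^2).
model = NormalDist(mu=22.6, sigma=5.19)

# 10th and 90th percentiles of the model.
p10 = model.inv_cdf(0.10)
p90 = model.inv_cdf(0.90)
ipr = p90 - p10

# Standard normal z-value for the 90th percentile, for comparison with 1.2816.
z90 = NormalDist().inv_cdf(0.90)

print(round(p10, 3), round(p90, 3), round(ipr, 1), round(z90, 4))
```

Because the normal distribution is symmetric, `ipr` equals `2 * z90 * 5.19`, which is the expression awarded the method mark above.
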
### Part (e):
| Answer/Working | Mark | Guidance |
|---|---|---|
| Daily mean wind speed/Beaufort — qualitative data | B1 | Data is non-numeric; do not allow wind direction/wind gust |
| Rainfall — not symmetric/lots of days with 0 rainfall | B1 | Not symmetric/skewed/not bell shaped/lots of 0s/many days with no rain/mean≠mode or median |

---
2.

\begin{figure}[h]
\begin{center}
  \includegraphics[alt={},max width=\textwidth]{d1eaaae7-c1dc-4aee-ab54-59f35519a7a4-06_321_1822_294_127}
\captionsetup{labelformat=empty}
\caption{Figure 1}
\end{center}
\end{figure}

The partially completed box plot in Figure 1 shows the distribution of daily mean air temperatures using the data from the large data set for Beijing in 2015

An outlier is defined as a value\\
more than $1.5 \times$ IQR below $Q _ { 1 }$ or\\
more than $1.5 \times$ IQR above $Q _ { 3 }$\\
The three lowest air temperatures in the data set are $7.6 ^ { \circ } \mathrm { C } , 8.1 ^ { \circ } \mathrm { C }$ and $9.1 ^ { \circ } \mathrm { C }$\\
The highest air temperature in the data set is $32.5 ^ { \circ } \mathrm { C }$
\begin{enumerate}[label=(\alph*)]
\item Complete the box plot in Figure 1 showing clearly any outliers.
\item Using your knowledge of the large data set, suggest from which month the two outliers are likely to have come.

Using the data from the large data set, Simon produced the following summary statistics for the daily mean air temperature, $x ^ { \circ } \mathrm { C }$, for Beijing in 2015

$$n = 184 \quad \sum x = 4153.6 \quad S_{xx} = 4952.906$$
\item Show that, to 3 significant figures, the standard deviation is $5.19 ^ { \circ } \mathrm { C }$

Simon decides to model the air temperatures with the random variable

$$T \sim \mathrm{N}\left(22.6, 5.19^{2}\right)$$
\item Using Simon's model, calculate the 10th to 90th interpercentile range.

Simon wants to model another variable from the large data set for Beijing using a normal distribution.
\item State two variables from the large data set for Beijing that are not suitable to be modelled by a normal distribution. Give a reason for each answer.

\includegraphics[max width=\textwidth, alt={}, center]{d1eaaae7-c1dc-4aee-ab54-59f35519a7a4-09_473_1813_2161_127}\\
(Total for Question 2 is 11 marks)
\end{enumerate}

\hfill \mbox{\textit{Edexcel Paper 3 2019 Q2 [11]}}