| Exam Board | Edexcel |
|---|---|
| Module | AS Paper 2 |
| Year | 2020 |
| Session | June |
| Marks | 5 |
| Topic | Bivariate data |
| Type | Analyze large data set correlations |
| Difficulty | Moderate (-0.8). This is a straightforward large data set question requiring basic statistical literacy: understanding rounding (part a), applying a standard outlier definition with given quartiles (part b), interpreting correlation from a scatter diagram (part c), and identifying a variable from context (part d). All parts involve routine application of AS-level statistical concepts, with no complex calculations or novel problem-solving required. |
| Spec | 2.01a Population and sample: terminology; 2.02d Informal interpretation of correlation; 2.02h Recognize outliers |
## Question 2:
| Part | Answer/Working | Mark | Guidance |
|---|---|---|---|
| **(a)** | 0 to 500 m | B1 | For realising it is the maximum distance with correct units. Allow 0 to 50dm or <500m or <50dm |
| **(b)** | $1100+1600+1.5\times1600\ [=5100]$ | M1 | Attempt to find $Q_3$ and the upper limit |
| | $5300 > 5100$ therefore outlier | A1 | 5100; if a value for the point is stated it must be above 5100 otherwise A0. Statement comparing and conclusion it is an outlier or above $Q_3+1.5\text{IQR}$ |
| **(c)** | As the humidity increases the mean visibility decreases | B1 | Suitable interpretation of negative correlation mentioning both humidity and visibility |
| **(d)** | (Hours of) sunshine | B1 | Correct deduction that unlabelled variable is hours of sunshine. Must be quantitative variable. Not cloud cover (values bigger than 8), not wind speed (not integers), not daily mean temperature (values near zero unlikely in June) |
| | | (5) | **5 marks** |
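
The arithmetic behind parts (a) and (b) can be checked with a short sketch. It assumes that the 1100 and 1600 in the mark scheme working are $Q_1$ and the interquartile range read from Jerry's statistics (the mark scheme only says "attempt to find $Q_3$ and the upper limit"), and that the recorded visibility is in decametres, so that 1 unit is 10 m, which is consistent with the "0 to 500 m" answer. The function names `upper_fence` and `rounding_interval` are illustrative only.

```python
def upper_fence(q1, iqr, k=1.5):
    """Return Q3 + k * IQR, using Q3 = Q1 + IQR."""
    q3 = q1 + iqr
    return q3 + k * iqr


def rounding_interval(recorded, step, lower_bound=0):
    """Return (lo, hi): true values in [lo, hi) round to `recorded`.

    The lower end is clipped at `lower_bound` because a distance
    cannot be negative.
    """
    lo = max(lower_bound, recorded - step / 2)
    hi = recorded + step / 2
    return lo, hi


# Part (b): assumed Q1 = 1100 and IQR = 1600 (inferred from the mark scheme).
fence = upper_fence(q1=1100, iqr=1600)
is_outlier = 5300 > fence  # 5300 is the circled point (10/06/1987)

# Part (a): recorded value 0, rounded to the nearest 100 units.
lo_units, hi_units = rounding_interval(recorded=0, step=100)
METRES_PER_UNIT = 10  # assumption: visibility is recorded in decametres
lo_m, hi_m = lo_units * METRES_PER_UNIT, hi_units * METRES_PER_UNIT

print(f"upper fence = {fence}, outlier: {is_outlier}")
print(f"recorded 0 corresponds to {lo_m} m up to {hi_m} m")
```

Under these assumptions the upper limit is 5100, so 5300 lies above it, and a recorded value of 0 corresponds to distances from 0 up to 500 m, matching the mark scheme.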
---
\begin{enumerate}
\item Jerry is studying visibility for Camborne using the large data set June 1987.
\end{enumerate}
The table below contains two extracts from the large data set.\\
It shows the daily maximum relative humidity and the daily mean visibility.
\begin{center}
\begin{tabular}{ | c | c | c | }
\hline
Date & \begin{tabular}{ c }
Daily Maximum \\
Relative Humidity \\
\end{tabular} & Daily Mean Visibility \\
\hline
Units & $\%$ & \\
\hline
$10 / 06 / 1987$ & 90 & 5300 \\
\hline
$28 / 06 / 1987$ & 100 & 0 \\
\hline
\end{tabular}
\end{center}
(The units for Daily Mean Visibility are deliberately omitted.)\\
Given that daily mean visibility is given to the nearest 100,\\
(a) write down the range of distances in metres that corresponds to the recorded value 0 for the daily mean visibility.
Jerry drew the following scatter diagram, Figure 2, and calculated some statistics using the June 1987 data for Camborne from the large data set.
\begin{figure}[h]
\begin{center}
\includegraphics[alt={},max width=\textwidth]{d62e5a00-cd23-417f-b244-8b3e24da4aa2-04_823_1764_1281_137}
\captionsetup{labelformat=empty}
\caption{Figure 2}
\end{center}
\end{figure}
Jerry defines an outlier as a value that is more than 1.5 times the interquartile range above $Q _ { 3 }$ or more than 1.5 times the interquartile range below $Q _ { 1 }$.\\
(b) Show that the point circled on the scatter diagram is an outlier for visibility.\\
(c) Interpret the correlation between the daily mean visibility and the daily maximum relative humidity.
Jerry drew the following scatter diagram, Figure 3, using the June 1987 data for Camborne from the large data set, but forgot to label the $x$-axis.
\begin{figure}[h]
\begin{center}
\includegraphics[alt={},max width=\textwidth]{d62e5a00-cd23-417f-b244-8b3e24da4aa2-05_730_1056_342_386}
\captionsetup{labelformat=empty}
\caption{Figure 3}
\end{center}
\end{figure}
(d) Using your knowledge of the large data set, suggest which variable the $x$-axis on this scatter diagram represents.
\hfill \mbox{\textit{Edexcel AS Paper 2 2020 Q2 [5]}}