| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Marks | 21 |
| Topic | Data representation |
| Type | Use linear interpolation for median or quartiles |
| Difficulty | Standard +0.3. This is a standard S1 grouped data question requiring linear interpolation for quartiles and percentiles, calculation of outlier boundaries, and drawing box plots. While multi-part with 21 marks total, each component uses routine techniques taught in S1: cumulative frequency interpolation, applying the given outlier formula, and comparing distributions. The unusual outlier definition is provided explicitly. No novel problem-solving or insight is required, just systematic application of standard methods. |
| Spec | 2.02a Interpret single variable data: tables and diagrams; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation; 2.02h Recognize outliers |

| Weight (g) | 0 - 10 | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 | 50 - 60 | 60 - 80 |
|---|---|---|---|---|---|---|---|
| No. of items | 2 | 11 | 18 | 12 | 9 | 6 | 2 |

| Answer | Marks | Guidance |
|---|---|---|
| (a) (i) Median = 30th value \(\approx 20 + \frac{17}{18}(10) = 29.4\) | M1 A1 | |
| (ii) \(Q_1 \approx 20 + \frac{2}{18}(10) = 21.1\), \(Q_3 \approx 40 + \frac{2}{9}(10) = 42.2\) | A1 A1 | |
| so IQR \(= 42.2 - 21.1 = 21.1\) | A1 M1 A1 | |
| (iii) \(20 + \frac{6.8}{18}(10) = 23.8\) | M1 A1 | |
| (b) Outliers lie outside the range from \(-10.55\) to \(73.9\), so 79 is an outlier | B1 | |
| Box plot drawn | B4 | |
| (c) Positive skew | B1 | |
| (d) Median and IQR, as they are not affected by the outlier | B1 B1 | |
| (e) Box plot; Second set slightly higher overall, with wider spread | B4 B1 B1 | Total: 21 marks |
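
The interpolation estimates in part (a) can be checked numerically. The following is a minimal sketch, not part of the mark scheme: the helper name `interpolate` is made up for illustration, and it assumes the S1 convention of locating the median at the \(n/2\)th value, the quartiles at the \(n/4\)th and \(3n/4\)th values, and the thirty-third percentile at the \(0.33n\)th value.

```python
# Grouped data from the question: class boundaries (g) and frequencies.
boundaries = [0, 10, 20, 30, 40, 50, 60, 80]
frequencies = [2, 11, 18, 12, 9, 6, 2]


def interpolate(position, boundaries, frequencies):
    """Estimate the value at a cumulative position by linear interpolation."""
    cumulative = 0
    for lower, upper, freq in zip(boundaries, boundaries[1:], frequencies):
        if freq > 0 and position <= cumulative + freq:
            # Fraction of the way through this class, scaled by the class width.
            return lower + (position - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("position exceeds total frequency")


n = sum(frequencies)  # 60 items
median = interpolate(n / 2, boundaries, frequencies)
q1 = interpolate(n / 4, boundaries, frequencies)
q3 = interpolate(3 * n / 4, boundaries, frequencies)
p33 = interpolate(0.33 * n, boundaries, frequencies)
iqr = q3 - q1

# Rounded to 1 d.p. for comparison with the mark scheme.
print(round(median, 1), round(q1, 1), round(q3, 1), round(iqr, 1), round(p33, 1))
```

To one decimal place these agree with the values in the table above: 29.4, 21.1, 42.2, 21.1 and 23.8.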
The following table gives the weights, in grams, of 60 items delivered to a company in a day.
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline
Weight (g) & 0 - 10 & 10 - 20 & 20 - 30 & 30 - 40 & 40 - 50 & 50 - 60 & 60 - 80 \\
\hline
No. of items & 2 & 11 & 18 & 12 & 9 & 6 & 2 \\
\hline
\end{tabular}
\begin{enumerate}[label=(\alph*)]
\item Use interpolation to calculate estimated values of
\begin{enumerate}[label=(\roman*)]
\item the median weight,
\item the interquartile range,
\item the thirty-third percentile.
\end{enumerate}
[7 marks]
\end{enumerate}
Outliers are defined to be outside the range from $2.5Q_1 - 1.5Q_3$ to $2.5Q_3 - 1.5Q_1$.
\begin{enumerate}[label=(\alph*)]
\setcounter{enumi}{1}
\item Given that the lightest item weighed 3 g and the two heaviest weighed 65 g and 79 g, draw on graph paper an accurate box-and-whisker plot of the data. Indicate any outliers clearly. [5 marks]
\item Describe the skewness of the distribution. [1 mark]
\end{enumerate}
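
As a rough check on part (b), the outlier boundaries can be computed from the quartile estimates. This sketch takes the boundaries as $Q_1 - 1.5\,\mathrm{IQR}$ and $Q_3 + 1.5\,\mathrm{IQR}$, which is the form that reproduces the mark scheme's values of $-10.55$ and $73.9$. It also assumes the whiskers end at the most extreme values that are not outliers; other conventions exist.

```python
# Unrounded quartile estimates from the interpolation in part (a).
q1 = 20 + (15 - 13) / 18 * 10
q3 = 40 + (45 - 43) / 9 * 10
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr  # same as 2.5*Q1 - 1.5*Q3
upper_fence = q3 + 1.5 * iqr  # same as 2.5*Q3 - 1.5*Q1

# Extreme values stated in part (b).
extremes = [3, 65, 79]
outliers = [x for x in extremes if x < lower_fence or x > upper_fence]

# Assumed convention: whiskers end at the most extreme non-outlying values.
whisker_low = min(x for x in extremes if x >= lower_fence)
whisker_high = max(x for x in extremes if x <= upper_fence)

print(round(lower_fence, 2), round(upper_fence, 2), outliers, whisker_low, whisker_high)
```

Working with unrounded quartiles gives a lower boundary of about $-10.56$ rather than $-10.55$. The difference comes only from rounding $Q_1$ and $Q_3$ first and does not change which values are outliers.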
The mean weight was 32.0 g and the standard deviation of the weights was 14.9 g.
\begin{enumerate}[label=(\alph*)]
\setcounter{enumi}{3}
\item State, with a reason, whether you would choose to summarise the data by using the mean and standard deviation or the median and interquartile range. [2 marks]
\end{enumerate}
On another day, items were delivered whose weights ranged from 14 g to 58 g; the median was 32 g, the lower quartile was 24 g and the interquartile range was 26 g.
\begin{enumerate}[label=(\alph*)]
\setcounter{enumi}{4}
\item Draw a further box plot for these data on the same diagram. Briefly compare the two sets of data using your plots. [6 marks]
\end{enumerate}
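
For part (e), the second box plot needs an upper quartile, which follows from the stated lower quartile and interquartile range. The sketch below assembles the five-number summary and makes the two comparisons a short answer would typically draw on. The first-day figures are the rounded estimates from part (a), and the variable names are illustrative only.

```python
# Second day's summary as stated in the question.
minimum, maximum = 14, 58
q1, median, iqr = 24, 32, 26
q3 = q1 + iqr  # upper quartile follows from Q1 and the IQR

five_number = (minimum, q1, median, q3, maximum)

# First day's estimates from part (a), rounded to 1 d.p.
first_median, first_iqr = 29.4, 21.1

higher_median = median > first_median  # location comparison
wider_spread = iqr > first_iqr         # spread comparison

print(five_number, higher_median, wider_spread)
```

The upper quartile comes out as 50 g, and both comparisons hold: the second day has the higher median and the larger interquartile range.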
\hfill \mbox{\textit{Edexcel S1 Q7 [21]}}