CAIE S1 2019 June — Question 6 10 marks

Exam Board: CAIE
Module: S1 (Statistics 1)
Year: 2019
Session: June
Marks: 10
Topic: Data representation
Type: State advantages of diagram types
Difficulty: Easy (-1.8). This question tests basic recall of advantages/disadvantages of box plots, identification of appropriate measures of central tendency with an obvious outlier (768), and construction of a box plot from ordered data. All parts are routine textbook exercises requiring minimal problem-solving, so the question is significantly easier than average A-level questions.
Spec: 2.02a Interpret single variable data: tables and diagrams; 2.02f Measures of average and spread; 2.02h Recognize outliers

6
  1. Give one advantage and one disadvantage of using a box-and-whisker plot to represent a set of data.
  2. The times in minutes taken to run a marathon were recorded for a group of 13 marathon runners and were found to be as follows. $$\begin{array} { l l l l l l l l l l l l l } 180 & 275 & 235 & 242 & 311 & 194 & 246 & 229 & 238 & 768 & 332 & 227 & 228 \end{array}$$ State which of the mean, mode or median is most suitable as a measure of central tendency for these times. Explain why the other measures are less suitable.
  3. Another group of 33 people ran the same marathon and their times in minutes were as follows.
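    $$\begin{array} { l l l l l l l l l l l } 190 & 203 & 215 & 246 & 249 & 253 & 255 & 254 & 258 & 260 & 261 \\ 263 & 267 & 269 & 274 & 276 & 280 & 288 & 283 & 287 & 294 & 300 \\ 307 & 318 & 327 & 331 & 336 & 345 & 351 & 353 & 360 & 368 & 375 \end{array}$$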
    1. On the grid below, draw a box-and-whisker plot to illustrate the times for these 33 people. \includegraphics[max width=\textwidth, alt={}, center]{f4d040a2-6a04-49ce-98ac-8ba5c515f905-09_611_1202_1270_555}
    2. Find the interquartile range of these times.

## Question 6(i):

| Answer | Marks | Guidance |
|---|---|---|
| **Advantage:** Comment referring to spread or range or shape | B1 | Comments referring to quartiles, IQR, range, median, shape, skewness, data distribution, spread score B1. Any comments with reference to mean or standard deviation or any other 'disadvantage' will score B0. Comments referring to '5-value plot', comparison with another data set, overview or ease of drawing/plotting/reading require an appropriate advantage statement. |
| **Disadvantage:** Comment referring to limited data information provided | B1 | Comments referring to no individual data, no information about the number of values, unable to calculate mean, standard deviation, variance and mode score B1. Any comments with reference to median, shape or any other 'advantage' will score B0. Comments referring to 'size of data set' or 'average' require an appropriate disadvantage statement. Comments referring to outliers are ignored in all cases unless supported by an appropriate advantage/disadvantage statement. If comments not clearly identified, assume first comment is the advantage. |

**Total: 2 marks**

---

## Question 6(ii):

| Answer | Marks | Guidance |
|---|---|---|
| Not mean as data skewed by one large value | B1 | Comment which identifies 768 (or 'a very large number') as the problem. Condone the use of 'outlier'. |
| Not mode as frequencies all the same | B1 | Comment which indicates that no mode exists (e.g. all the data is different, there is no repeated number, all the values are different). |
| Median | B1 | Median identified as choice, dependent upon statements for mean and mode being given, even if incorrect or very general. |

**SC: Mean is identified as most suitable:**

| Answer | Marks | Guidance |
|---|---|---|
| Not mode as frequencies all the same | SCB1 | Comment which indicates that no mode exists. |
| Not median as not all values used | SCB1 | Comment which indicates limitation of median, e.g. median is not in middle of range. |

**Total: 3 marks**
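
The reasoning in this part can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the variable names (`times`, `avg`, `mid`, `has_mode`) are illustrative and not part of the question or the mark scheme.

```python
from statistics import mean, median, multimode

# Marathon times in minutes for the 13 runners in part (ii)
times = [180, 275, 235, 242, 311, 194, 246, 229, 238, 768, 332, 227, 228]

avg = mean(times)         # pulled upward by the single extreme value 768
mid = median(times)       # 7th of the 13 ordered values
modes = multimode(times)  # lists every value when all are equally frequent

# If every distinct value is returned as a "mode", no value is repeated,
# so the data has no meaningful mode.
has_mode = len(modes) < len(set(times))

print(f"mean = {avg}, median = {mid}, meaningful mode: {has_mode}")
```

Here the mean comes out as 285, which is larger than 10 of the 13 recorded times, while the median is 238 and no value repeats. This is consistent with the mark scheme's choice of the median.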

---

## Question 6(iii)(a):

| Answer | Marks | Guidance |
|---|---|---|
| $LQ = 256$ or $256.5$, $\text{Med} = 280$, $UQ = 329$, $\text{Min} = 190$, $\text{Max} = 375$ | B1 | Median, UQ and LQ values seen, may not be identified or identified correctly. (Not read from box plot unless value stated.) |
| Box plot drawn with median and quartiles plotted | B1 FT | Median and quartiles plotted in box on graph, linear scale. |
| Correct end points, whiskers from ends of box | B1 | Correct end points, whiskers from ends of box but not through box, not at top or bottom of box. |
| Uniform scale from 190 to 375, axis labelled 'time' and 'minutes' | B1 | Uniform scale from 190 to 375 (at least 3 identified points on a linear scale are needed) and labelled 'time' and 'minutes' (can be in title). **No time axis or time axis with no scale attempt**: Max B1 B0 B0 B0. |

**Total: 4 marks**
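
The five values needed for the box plot can be reproduced with a short script. This is a sketch under one stated assumption: the quartiles are taken as the medians of the lower and upper halves of the ordered data, excluding the overall median when the count is odd. Other quartile conventions exist, and the mark scheme also accepts $LQ = 256$. The function name `five_number_summary` is hypothetical.

```python
from statistics import median

# Marathon times in minutes for the 33 runners in part (iii), as printed
times = [190, 203, 215, 246, 249, 253, 255, 254, 258, 260, 261,
         263, 267, 269, 274, 276, 280, 288, 283, 287, 294, 300,
         307, 318, 327, 331, 336, 345, 351, 353, 360, 368, 375]


def five_number_summary(values):
    """Return (min, LQ, median, UQ, max).

    Assumption: quartiles are the medians of the lower and upper halves,
    leaving out the middle value when the number of values is odd.
    """
    ordered = sorted(values)  # the printed list is not fully in order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


minimum, lq, med, uq, maximum = five_number_summary(times)
print(minimum, lq, med, uq, maximum)
print("IQR =", uq - lq)
```

Under this convention the script gives a lower quartile of 256.5, a median of 280 and an upper quartile of 329, so the interquartile range is 72.5, matching one of the accepted answers.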

---

## Question 6(iii)(b):

| Answer | Marks | Guidance |
|---|---|---|
| $IQR = \text{their } 329 - \text{their } 256 = 73$ or $72.5$ | B1 FT | Must follow through only from *their* stated values (condone if correct quartiles stated here), not reading from graph. |

**Total: 1 mark**

---
6 (i) Give one advantage and one disadvantage of using a box-and-whisker plot to represent a set of data.\\

(ii) The times in minutes taken to run a marathon were recorded for a group of 13 marathon runners and were found to be as follows.

$$\begin{array} { l l l l l l l l l l l l l } 
180 & 275 & 235 & 242 & 311 & 194 & 246 & 229 & 238 & 768 & 332 & 227 & 228
\end{array}$$

State which of the mean, mode or median is most suitable as a measure of central tendency for these times. Explain why the other measures are less suitable.\\

(iii) Another group of 33 people ran the same marathon and their times in minutes were as follows.

\begin{center}
\begin{tabular}{ l l l l l l l l l l l }
190 & 203 & 215 & 246 & 249 & 253 & 255 & 254 & 258 & 260 & 261 \\
263 & 267 & 269 & 274 & 276 & 280 & 288 & 283 & 287 & 294 & 300 \\
307 & 318 & 327 & 331 & 336 & 345 & 351 & 353 & 360 & 368 & 375 \\
\end{tabular}
\end{center}
\begin{enumerate}[label=(\alph*)]
\item On the grid below, draw a box-and-whisker plot to illustrate the times for these 33 people.\\

\includegraphics[max width=\textwidth, alt={}, center]{f4d040a2-6a04-49ce-98ac-8ba5c515f905-09_611_1202_1270_555}
\item Find the interquartile range of these times.
\end{enumerate}

\hfill \mbox{\textit{CAIE S1 2019 Q6 [10]}}