Outliers from box plot or summary statistics

Questions where the five-number summary or quartiles are directly given or read from a box plot, and the student applies the 1.5×IQR rule to determine outliers.

9 questions
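The 1.5×IQR rule that every question in this group relies on can be sketched in a few lines. This is a minimal illustration, not a mark-scheme method: the function names `outlier_fences` and `find_outliers` and the sample figures are hypothetical, and it assumes an outlier is a value strictly beyond a fence.

```python
def outlier_fences(lower_quartile, upper_quartile, k=1.5):
    """Return (lower_fence, upper_fence) for the k x IQR rule."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - k * iqr, upper_quartile + k * iqr


def find_outliers(values, lower_quartile, upper_quartile, k=1.5):
    """Return the values lying strictly outside the fences."""
    low, high = outlier_fences(lower_quartile, upper_quartile, k)
    return [v for v in values if v < low or v > high]


# Made-up figures for illustration only (not taken from any question here).
fences = outlier_fences(20, 30)
flagged = find_outliers([3, 12, 41, 50], 20, 30)
print(fences, flagged)
```

Some questions word the rule differently (for example, measuring from the median rather than the quartiles), so the definition given in each question takes priority over this sketch.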

OCR MEI S1 Q2
2 The box and whisker plot below summarises the weights in grams of the 20 chocolates in a box.
\includegraphics[max width=\textwidth, alt={}, center]{452a52c9-b1fa-4b98-a85d-a34ba0f84a9d-1_290_1186_1099_452}
  1. Find the interquartile range of the data and hence determine whether there are any outliers at either end of the distribution.
Ben buys a box of these chocolates each weekend. The chocolates all look the same on the outside, but 7 of them have orange centres, 6 have cherry centres, 4 have coffee centres and 3 have lemon centres. One weekend, each of Ben's 3 children eats one of the chocolates, chosen at random.
  2. Calculate the probabilities of the following events.
    \(A\): all 3 chocolates have orange centres
    \(B\): all 3 chocolates have the same centres
  3. Find \(\mathrm { P } ( A \mid B )\) and \(\mathrm { P } ( B \mid A )\).
The following weekend, Ben buys an identical box of chocolates and again each of his 3 children eats one of the chocolates, chosen at random.
  4. Find the probability that, on both weekends, the 3 chocolates that they eat all have orange centres.
  5. Ben likes all of the chocolates except those with cherry centres. On another weekend he is the first of his family to eat some of the chocolates. Find the probability that he has to select more than 2 chocolates before he finds one that he likes.
OCR MEI S1 Q1
1 A business analyst collects data about the distribution of hourly wages, in \(\pounds\), of shop-floor workers at a factory. These data are illustrated in the box and whisker plot.
\includegraphics[max width=\textwidth, alt={}, center]{56f1bd5c-4b45-4e36-a324-e7e0edbb5bdd-1_206_1420_505_397}
  1. Name the type of skewness of the distribution.
  2. Find the interquartile range and hence show that there are no outliers at the lower end of the distribution, but there is at least one outlier at the upper end.
  3. Suggest possible reasons why this may be the case.
OCR MEI S1 2010 June Q1
1 A business analyst collects data about the distribution of hourly wages, in \(\pounds\), of shop-floor workers at a factory. These data are illustrated in the box and whisker plot.
\includegraphics[max width=\textwidth, alt={}, center]{091d6f43-ad01-4849-9f3c-3e58349aa169-2_204_1422_484_363}
  1. Name the type of skewness of the distribution.
  2. Find the interquartile range and hence show that there are no outliers at the lower end of the distribution, but there is at least one outlier at the upper end.
  3. Suggest possible reasons why this may be the case.
OCR MEI S1 2015 June Q8
8 The box and whisker plot below summarises the weights in grams of the 20 chocolates in a box.
\includegraphics[max width=\textwidth, alt={}, center]{6015ae6c-bf76-4a0c-af0f-5c53f9c5ed2a-4_287_1177_319_427}
  1. Find the interquartile range of the data and hence determine whether there are any outliers at either end of the distribution.
Ben buys a box of these chocolates each weekend. The chocolates all look the same on the outside, but 7 of them have orange centres, 6 have cherry centres, 4 have coffee centres and 3 have lemon centres. One weekend, each of Ben's 3 children eats one of the chocolates, chosen at random.
  2. Calculate the probabilities of the following events.
    \(A\): all 3 chocolates have orange centres
    \(B\): all 3 chocolates have the same centres
  3. Find \(\mathrm { P } ( A \mid B )\) and \(\mathrm { P } ( B \mid A )\).
The following weekend, Ben buys an identical box of chocolates and again each of his 3 children eats one of the chocolates, chosen at random.
  4. Find the probability that, on both weekends, the 3 chocolates that they eat all have orange centres.
  5. Ben likes all of the chocolates except those with cherry centres. On another weekend he is the first of his family to eat some of the chocolates. Find the probability that he has to select more than 2 chocolates before he finds one that he likes.
OCR MEI Paper 2 2024 June Q14
14 The pre-release material contains medical data for 103 women and 97 men.
The boxplot represents the weights in kg of 101 of the women from the pre-release material.
\includegraphics[max width=\textwidth, alt={}, center]{8e48bbd3-2166-49e7-8906-833261f331ca-09_421_1232_735_244}
  1. Use your knowledge of the pre-release material to give a reason why the weights of all 103 women were not included in the diagram.
  2. Determine the range of values in which any outliers lie.
  3. Use your knowledge of the pre-release material to explain whether these outliers should be removed from any further analysis of the data.
  4. The median weight of men in the sample was found to be 79.9 kg. Explain what may be inferred by comparing the median weight of men with the median weight of women.
Further analysis of the weights of both men and women is carried out. The table shows some of the results.
    \begin{center} \begin{tabular}{|l|l|l|} \hline
     & mean & standard deviation \\ \hline
    men & 82.69 kg & 19.98 kg \\ \hline
    women & 72.5 kg & 19.95 kg \\ \hline
    \end{tabular} \end{center}
  5. Use the information in the table to make two inferences about the distribution of the weights of men compared with the distribution of the weights of women.
Edexcel S1 2020 June Q4
4. A group of students took some tests. A teacher is analysing the average mark for each student. Each student obtained a different average mark. For these average marks, the lower quartile is 24, the median is 30 and the interquartile range (IQR) is 10.
The three lowest average marks are 8, 10 and 15.5 and the three highest average marks are 45, 52.5 and 56.
The teacher defines an outlier to be a value that is either
more than \(1.5 \times\) IQR below the lower quartile or
more than \(1.5 \times\) IQR above the upper quartile.
  1. Determine any outliers in these data.
  2. On the grid below draw a box plot for these data, indicating clearly any outliers.
    \includegraphics[max width=\textwidth, alt={}, center]{81d5e460-9559-4d25-aa08-6440559aec83-12_350_1223_1128_370}
  3. Use the quartiles to describe the skewness of these data. Give a reason for your answer.
Two more students also took the tests. Their average marks, which were both less than 45, are added to the data and the box plot redrawn. The median and the upper quartile are the same but the lower quartile is now 26.
  4. Redraw the box plot on the grid below.
    (3)
    \includegraphics[max width=\textwidth, alt={}, center]{81d5e460-9559-4d25-aa08-6440559aec83-12_350_1221_2106_367}
  5. Give ranges of values within which each of these students' average marks must lie. Turn over for spare grids if you need to redraw your answer for part (b) or part (d).
    \begin{figure}[h]
    \captionsetup{labelformat=empty} \caption{Copy of grid for part (b)} \includegraphics[alt={},max width=\textwidth]{81d5e460-9559-4d25-aa08-6440559aec83-15_356_1226_1726_367}
    \end{figure} \begin{figure}[h]
    \captionsetup{labelformat=empty} \caption{Copy of grid for part (d)} \includegraphics[alt={},max width=\textwidth]{81d5e460-9559-4d25-aa08-6440559aec83-15_353_1226_2240_367}
    \end{figure}
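As a hedged check on part 1 of the Edexcel S1 2020 June question above, the teacher's definition can be applied to the six extreme marks quoted in the question. The variable names are illustrative, and the whisker convention (whiskers end at the most extreme marks that are not outliers) is an assumption; some mark schemes also accept whiskers drawn to the fences.

```python
# Summary values quoted in the question.
lower_quartile = 24
iqr = 10
upper_quartile = lower_quartile + iqr

# Fences from the teacher's definition of an outlier.
lower_fence = lower_quartile - 1.5 * iqr
upper_fence = upper_quartile + 1.5 * iqr

# Only the three lowest and three highest marks can fall outside the fences.
extremes = [8, 10, 15.5, 45, 52.5, 56]
outliers = [m for m in extremes if m < lower_fence or m > upper_fence]

# Assumed convention: whiskers end at the most extreme non-outlying marks.
inside = [m for m in extremes if lower_fence <= m <= upper_fence]
whisker_ends = (min(inside), max(inside))

print(lower_fence, upper_fence, outliers, whisker_ends)
```

Under these assumptions the fences are 9 and 49, so 8, 52.5 and 56 are flagged while 10 and 45 are not.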
Edexcel S1 2011 January Q3
3. Over a long period of time a small company recorded the amount it received in sales per month. The results are summarised below.
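\begin{center} \begin{tabular}{|l|l|} \hline
\multicolumn{2}{|c|}{Amount received in sales (£1000s)} \\ \hline
Two lowest values & 3, 4 \\ \hline
Lower quartile & 7 \\ \hline
Median & 12 \\ \hline
Upper quartile & 14 \\ \hline
Two highest values & 20, 25 \\ \hline
\end{tabular} \end{center}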
An outlier is an observation that falls
either \(1.5 \times\) interquartile range above the upper quartile or \(1.5 \times\) interquartile range below the lower quartile.
  1. On the graph paper below, draw a box plot to represent these data, indicating clearly any outliers.
    (5)
    \includegraphics[max width=\textwidth, alt={}, center]{c78ec7b6-dd06-4de1-94c2-052a5577dd10-05_933_1226_1283_367}
  2. State the skewness of the distribution of the amount of sales received. Justify your answer.
  3. The company claims that for \(75 \%\) of the months, the amount received per month is greater than \(\pounds 10000\). Comment on this claim, giving a reason for your answer.
    (2)
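For the Edexcel S1 2011 January question above, a short sketch can test the four quoted extreme values against the fences and compare the quartile gaps that part 2 asks about. The variable names are illustrative, and judging skewness by comparing the gaps either side of the median is one common approach rather than the only acceptable justification.

```python
# Summary values quoted in the question (amounts in thousands of pounds).
lower_quartile, median, upper_quartile = 7, 12, 14
iqr = upper_quartile - lower_quartile

lower_fence = lower_quartile - 1.5 * iqr
upper_fence = upper_quartile + 1.5 * iqr

# The two lowest and two highest values are the only outlier candidates.
extremes = [3, 4, 20, 25]
outliers = [x for x in extremes if x < lower_fence or x > upper_fence]

# Compare the quartile gaps either side of the median.
lower_gap = median - lower_quartile
upper_gap = upper_quartile - median
if lower_gap > upper_gap:
    skew = "negative"
elif lower_gap < upper_gap:
    skew = "positive"
else:
    skew = "symmetric"

print(lower_fence, upper_fence, outliers, skew)
```

The lower quartile of 7 also bears on part 3: about 75% of months exceed 7 (thousand pounds), which is the figure to set against the company's claim.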
OCR H240/02 2023 June Q8
8 The stem-and-leaf diagram shows the heights, in centimetres, of 15 plants.
\begin{center} \begin{tabular}{r|l}
0 & 2 \\
1 & 0 \\
2 & 4 \\
3 & 0 2 4 9 \\
4 & 1 2 4 7 9 \\
5 & 3 7 \\
6 & 2 \\
\end{tabular} \end{center}
Key: 2 | 5 means 25 cm.
  1. Draw a box-and-whisker plot to illustrate the data.
A statistician intends to analyse the data, but wants to ignore any outliers before doing so.
  2. Discuss briefly whether there are any heights in the diagram which the statistician should ignore.
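For the OCR H240/02 question above, the quartiles have to be found from the raw data before any outlier test can be applied. The sketch below is one possible working, with two assumptions: the 15 heights are as read from the stem-and-leaf diagram, and the quartiles sit at the 4th, 8th and 12th ordered values (the (n+1)/4 convention, which `statistics.quantiles` reproduces with `method="exclusive"`). Other quartile conventions or outlier criteria could lead to a different discussion.

```python
import statistics

# Heights in cm, as read from the stem-and-leaf diagram (assumed reading).
heights = [2, 10, 24, 30, 32, 34, 39, 41, 42, 44, 47, 49, 53, 57, 62]

# Exclusive method: positions (n + 1) * i / 4, i.e. the 4th, 8th, 12th values.
q1, q2, q3 = statistics.quantiles(heights, n=4, method="exclusive")
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [h for h in heights if h < lower_fence or h > upper_fence]

print(q1, q2, q3, lower_fence, upper_fence, outliers)
```

With these assumptions the smallest height, 2 cm, lies just inside the lower fence, which is relevant to the discussion in part 2.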
AQA Paper 3 2023 June Q15
15
  1. A random sample of eight cars was selected from the Large Data Set. The masses of these cars, in kilograms, were as follows.
    \(\begin{array} { l l l l l l l l } 950 & 989 & 1247 & 1415 & 1506 & 1680 & 1833 & 2040 \end{array}\)
    It is given that, for the population of cars in the Large Data Set:
    $$\begin{aligned} \text { lower quartile } & = 1167 \\ \text { median } & = 1393 \\ \text { upper quartile } & = 1570 \end{aligned}$$
    1. It was decided to remove any of the masses which fall outside the following interval.
      $$\text { median } - 1.5 \times \text { interquartile range } \leq \text { mass } \leq \text { median } + 1.5 \times \text { interquartile range }$$
      Show that only one of the eight masses in the sample should be removed.
  2. (ii) Write down the statistical name for the mass that should be removed in part (a)(i).
  3. The table shows the probability distribution of the number of previous owners, \(N\), for a sample of cars taken from the Large Data Set.
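    \begin{center} \begin{tabular}{|c|c|c|c|c|c|c|c|} \hline
    \(\boldsymbol { n }\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 or more \\ \hline
    \(\mathbf { P } ( \boldsymbol { N } = \boldsymbol { n } )\) & 0.14 & 0.37 & \(0.9 k\) & 0.25 & \(0.4 k\) & \(1.7 k\) & 0 \\ \hline
    \end{tabular} \end{center}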
    Find the value of \(\mathrm { P } ( 1 \leq N < 5 )\)
  4. An expert team is investigating whether there have been any changes in \(\mathrm { CO } _ { 2 }\) emissions from all cars taken from the Large Data Set.
    The team decided to collect a quota sample of 200 cars to reflect the different years and the different makes of cars in the Large Data Set.
    Using your knowledge of the Large Data Set, explain how the team can collect this sample.
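Part (a)(i) of the AQA question above uses an interval centred on the median rather than fences measured from the quartiles, so the arithmetic differs slightly from the usual rule. A minimal sketch of that check, using only the figures quoted in the question (the variable names are illustrative):

```python
# Sample masses in kg and population summary values quoted in the question.
masses = [950, 989, 1247, 1415, 1506, 1680, 1833, 2040]
lower_quartile, median, upper_quartile = 1167, 1393, 1570

iqr = upper_quartile - lower_quartile

# The question's interval is centred on the median, not the quartiles.
lower_limit = median - 1.5 * iqr
upper_limit = median + 1.5 * iqr

removed = [m for m in masses if not (lower_limit <= m <= upper_limit)]

print(iqr, lower_limit, upper_limit, removed)
```

Only the 2040 kg mass falls outside the interval from 788.5 to 1997.5, which agrees with the statement the question asks candidates to show.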
    \includegraphics[max width=\textwidth, alt={}]{6fba7e53-de46-460b-9bef-f1a6962f2e7d-29_2488_1716_219_153}
    \begin{center} \begin{tabular}{|l|} \hline \begin{tabular}{l}