2.02i Select/critique data presentation

68 questions

CAIE S1 2024 June Q4
8 marks Easy -1.3
4 The times taken, in seconds, by 15 members of each of two swimming clubs, the Penguins and the Dolphins, to swim 50 metres are shown in the following table.
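\begin{tabular}{|l|*{15}{c|}}
\hline
Penguins & 35 & 39 & 42 & 44 & 45 & 45 & 48 & 50 & 56 & 58 & 59 & 61 & 66 & 68 & 72 \\
\hline
Dolphins & 36 & 41 & 43 & 48 & 49 & 49 & 50 & 51 & 54 & 56 & 56 & 60 & 61 & 64 & 71 \\
\hline
\end{tabular}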
  1. Draw a back-to-back stem-and-leaf diagram to represent this information, with Penguins on the left-hand side. \includegraphics[max width=\textwidth, alt={}, center]{9b21cc0f-b043-4251-8aa9-cb1e5c2fb5d0-09_2720_33_141_20} The diagram shows a box-and-whisker plot representing the times for the Penguins.
  2. On the same diagram, draw a box-and-whisker plot to represent the times for the Dolphins. \includegraphics[max width=\textwidth, alt={}, center]{9b21cc0f-b043-4251-8aa9-cb1e5c2fb5d0-09_719_1219_424_424}
  3. Hence state one difference between the distributions of the times for the Penguins and the Dolphins.
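The box-and-whisker plot in part 2 needs the five-number summary of the Dolphins' times. The sketch below is an illustrative check, not part of the original question. It assumes the quartiles are taken at the \((n+1)/4\)-th and \(3(n+1)/4\)-th ordered values, which is what Python's `statistics.quantiles` does with its default method; other quartile conventions may give slightly different values. The function name `five_number_summary` is made up for this example.

```python
import statistics

# Times in seconds, copied from the table in the question.
penguins = [35, 39, 42, 44, 45, 45, 48, 50, 56, 58, 59, 61, 66, 68, 72]
dolphins = [36, 41, 43, 48, 49, 49, 50, 51, 54, 56, 56, 60, 61, 64, 71]


def five_number_summary(times):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(times)
    # Default method is "exclusive": quartile i sits at position i*(n+1)/4.
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, q2, q3, ordered[-1]


penguin_summary = five_number_summary(penguins)
dolphin_summary = five_number_summary(dolphins)
print("Penguins:", penguin_summary)
print("Dolphins:", dolphin_summary)
```

Comparing the two summaries (medians, and ranges or interquartile ranges) gives the kind of evidence part 3 asks for.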
CAIE S1 2003 June Q1
5 marks Easy -1.8
1
  1. \begin{figure}[h]
    \captionsetup{labelformat=empty} \caption{Sales of Superclene Toothpaste} \includegraphics[alt={},max width=\textwidth]{df20f053-8d67-428d-bb19-9447049deed5-2_725_1073_347_497}
    \end{figure} The diagram represents the sales of Superclene toothpaste over the last few years. Give a reason why it is misleading.
  2. The following data represent the daily ticket sales at a small theatre during three weeks. $$52,73,34,85,62,79,89,50,45,83,84,91,85,84,87,44,86,41,35,73,86 \text {. }$$
    1. Construct a stem-and-leaf diagram to illustrate the data.
    2. Use your diagram to find the median of the data.
CAIE S1 2012 June Q1
4 marks Easy -1.8
1 Ashfaq and Kuljit have done a school statistics project on the prices of a particular model of headphones for MP3 players. Ashfaq collected prices from 21 shops. Kuljit used the internet to collect prices from 163 websites.
  1. Name a suitable statistical diagram for Ashfaq to represent his data, together with a reason for choosing this particular diagram.
  2. Name a suitable statistical diagram for Kuljit to represent her data, together with a reason for choosing this particular diagram.
CAIE S1 2019 June Q4
6 marks Moderate -0.8
4 The Mathematics and English A-level marks of 1400 pupils all taking the same examinations are shown in the cumulative frequency graphs below. Both examinations are marked out of 100. \includegraphics[max width=\textwidth, alt={}, center]{be6c6525-a20c-42d0-8fef-1cd254baaa76-06_1682_1246_404_445} Use suitable data from these graphs to compare the central tendency and spread of the marks in Mathematics and English.
CAIE S1 2005 November Q1
4 marks Easy -1.8
1 A study of the ages of car drivers in a certain country produced the results shown in the table. \begin{table}[h]
\captionsetup{labelformat=empty} \caption{Percentage of drivers in each age group}
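\begin{tabular}{|l|c|c|c|}
\hline
 & Young & Middle-aged & Elderly \\
\hline
Males & 40 & 35 & 25 \\
\hline
Females & 20 & 70 & 10 \\
\hline
\end{tabular}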
\end{table} Illustrate these results diagrammatically.
CAIE S1 2006 November Q3
6 marks Easy -1.3
3 In a survey, people were asked how long they took to travel to and from work, on average. The median time was 3 hours 36 minutes, the upper quartile was 4 hours 42 minutes and the interquartile range was 3 hours 48 minutes. The longest time taken was 5 hours 12 minutes and the shortest time was 30 minutes.
  1. Find the lower quartile.
  2. Represent the information by a box-and-whisker plot, using a scale of 2 cm to represent 60 minutes.
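Part 1 rests on the definition of the interquartile range as the upper quartile minus the lower quartile. As a brief worked step, with the times converted to minutes (4 h 42 min = 282 min, 3 h 48 min = 228 min):

$$Q_1 = Q_3 - \mathrm{IQR} = 282 - 228 = 54 \text{ minutes}.$$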
CAIE S1 2010 November Q4
7 marks Easy -1.8
4 The weights in kilograms of 11 bags of sugar and 7 bags of flour are as follows.
Sugar: \(\begin{array} { l l l l l l l l l l l } 1.961 & 1.983 & 2.008 & 2.014 & 1.968 & 1.994 & 2.011 & 2.017 & 1.977 & 1.984 & 1.989 \end{array}\)
Flour: \(\begin{array} { l l l l l l l } 1.945 & 1.962 & 1.949 & 1.977 & 1.964 & 1.941 & 1.953 \end{array}\)
  1. Represent this information on a back-to-back stem-and-leaf diagram with sugar on the left-hand side.
  2. Find the median and interquartile range of the weights of the bags of sugar.
CAIE S1 2013 November Q4
7 marks Moderate -0.8
4 The following are the house prices in thousands of dollars, arranged in ascending order, for 51 houses from a certain area.
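\begin{tabular}{*{17}{c}}
253 & 270 & 310 & 354 & 386 & 428 & 433 & 468 & 472 & 477 & 485 & 520 & 520 & 524 & 526 & 531 & 535 \\
536 & 538 & 541 & 543 & 546 & 548 & 549 & 551 & 554 & 572 & 583 & 590 & 605 & 614 & 638 & 649 & 652 \\
666 & 670 & 682 & 684 & 690 & 710 & 725 & 726 & 731 & 734 & 745 & 760 & 800 & 854 & 863 & 957 & 986 \\
\end{tabular}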
  1. Draw a box-and-whisker plot to represent the data. An expensive house is defined as a house which has a price that is more than 1.5 times the interquartile range above the upper quartile.
  2. For the above data, give the prices of the expensive houses.
  3. Give one disadvantage of using a box-and-whisker plot rather than a stem-and-leaf diagram to represent this set of data.
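The "expensive house" rule in this question is an upper fence of the form \(Q_3 + 1.5 \times \mathrm{IQR}\). The following sketch shows how that rule could be applied to the 51 prices. It is a check under an assumed quartile convention (the default "exclusive" method of `statistics.quantiles`, which for 51 values picks the 13th, 26th and 39th ordered values); a mark scheme may define the quartiles differently.

```python
import statistics

# House prices in thousands of dollars, in ascending order, from the question.
prices = [
    253, 270, 310, 354, 386, 428, 433, 468, 472, 477, 485, 520, 520, 524, 526, 531, 535,
    536, 538, 541, 543, 546, 548, 549, 551, 554, 572, 583, 590, 605, 614, 638, 649, 652,
    666, 670, 682, 684, 690, 710, 725, 726, 731, 734, 745, 760, 800, 854, 863, 957, 986,
]

# Quartiles under the assumed (n + 1)-based convention.
q1, median, q3 = statistics.quantiles(prices, n=4)
iqr = q3 - q1

# A house is "expensive" if its price exceeds Q3 + 1.5 * IQR.
upper_fence = q3 + 1.5 * iqr
expensive = [p for p in prices if p > upper_fence]

print("Quartiles:", q1, median, q3)
print("Upper fence:", upper_fence)
print("Expensive houses:", expensive)
```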
OCR S1 2005 January Q2
6 marks Easy -1.8
2 The back-to-back stem-and-leaf diagram below shows the number of hours of television watched per week by each of 15 boys and 15 girls. $$\begin{aligned} & \text { Boys Girls } \\ & \left. \begin{array} { r r r r r r r r | r r r r r r r r r r r r r } & 677664 & 4 & 3 & 0 & 0 & 5 & 5 & 6 & 677888 \end{array} \right\} \end{aligned}$$ Key: 4 | 2 | 2 means a boy who watched 24 hours and a girl who watched 22 hours of television per week.
  1. Find the median and the quartiles of the results for the boys.
  2. Give a reason why the median might be preferred to the mean in using an average to compare the two data sets.
  3. State one advantage, and one disadvantage, of using stem-and-leaf diagrams rather than box-and-whisker plots to represent the data.
OCR S1 Specimen Q6
11 marks Moderate -0.5
6 \includegraphics[max width=\textwidth, alt={}, center]{2fb25fc5-0445-44fa-a23e-647d14b1a376-3_803_1180_1018_413} The diagram shows the cumulative frequency graphs for the marks scored by the candidates in an examination. The 2000 candidates each took two papers; the upper curve shows the distribution of marks on paper 1 and the lower curve shows the distribution on paper 2. The maximum mark on each paper was 100.
  1. Use the diagram to estimate the median mark for each of paper 1 and paper 2.
  2. State with a reason which of the two papers you think was the easier one.
  3. To achieve grade A on paper 1 candidates had to score 66 marks out of 100. What mark on paper 2 gives equal proportions of candidates achieving grade A on the two papers? What is this proportion?
  4. The candidates' marks for the two papers could also be illustrated by means of a pair of box-and-whisker plots. Give two brief comments comparing the usefulness of cumulative frequency graphs and box-and-whisker plots for representing the data.
OCR MEI S1 2007 June Q5
6 marks Easy -1.8
5 A GCSE geography student is investigating a claim that global warming is causing summers in Britain to have more rainfall. He collects rainfall data from a local weather station for 2001 and 2006. The vertical line chart shows the number of days per week on which some rainfall was recorded during the 22 weeks of summer 2001. \includegraphics[max width=\textwidth, alt={}, center]{5e4f3310-b96e-43db-9b6d-61da3270db06-4_720_1557_443_296} Number of days per week with rain recorded in summer 2001
  1. Show that the median of the data is 4, and find the interquartile range.
  2. For summer 2006 the median is 3 and the interquartile range is also 3. The student concludes that the data demonstrate that global warming is causing summer rainfall to decrease rather than increase. Is this a valid conclusion from the data? Give two brief reasons to justify your answer.
OCR MEI S1 2008 June Q7
20 marks Moderate -0.8
7 The histogram shows the age distribution of people living in Inner London in 2001. \includegraphics[max width=\textwidth, alt={}, center]{be764df3-ff20-415d-9c5c-10edabf350de-5_814_1383_349_379} Data sourced from the 2001 Census, \href{http://www.statistics.gov.uk}{www.statistics.gov.uk}
  1. State the type of skewness shown by the distribution.
  2. Use the histogram to estimate the number of people aged under 25.
  3. The table below shows the cumulative frequency distribution.
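    \begin{tabular}{|l|c|c|c|c|c|c|}
    \hline
    Age & 20 & 30 & 40 & 50 & 65 & 100 \\
    \hline
    Cumulative frequency (thousands) & 660 & 1240 & 1810 & \(a\) & 2490 & 2770 \\
    \hline
    \end{tabular}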
    (A) Use the histogram to find the value of \(a\).
    (B) Use the table to calculate an estimate of the median age of these people. The ages of people living in Outer London in 2001 are summarised below.
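    \begin{tabular}{|l|c|c|c|c|c|c|}
    \hline
    Age ( \(x\) years) & \(0 \leqslant x < 20\) & \(20 \leqslant x < 30\) & \(30 \leqslant x < 40\) & \(40 \leqslant x < 50\) & \(50 \leqslant x < 65\) & \(65 \leqslant x < 100\) \\
    \hline
    Frequency (thousands) & 1120 & 650 & 770 & 590 & 680 & 610 \\
    \hline
    \end{tabular}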
  4. Illustrate these data by means of a histogram.
  5. Make two brief comments on the differences between the age distributions of the populations of Inner London and Outer London.
  6. The data given in the table for Outer London are used to calculate the following estimates. Mean 38.5, median 35.7, midrange 50, standard deviation 23.7, interquartile range 34.4.
    The final group in the table assumes that the maximum age of any resident is 100 years. These estimates are to be recalculated, based on a maximum age of 105, rather than 100. For each of the five estimates, state whether it would increase, decrease or be unchanged.
OCR MEI S1 Q3
6 marks Easy -1.8
3 A GCSE geography student is investigating a claim that global warming is causing summers in Britain to have more rainfall. He collects rainfall data from a local weather station for 2001 and 2006. The vertical line chart shows the number of days per week on which some rainfall was recorded during the 22 weeks of summer 2001. \includegraphics[max width=\textwidth, alt={}, center]{c7cb0f6b-7b6b-4c52-8287-7efc6bd70247-3_804_1557_547_337}
  1. Show that the median of the data is 4, and find the interquartile range.
  2. For summer 2006 the median is 3 and the interquartile range is also 3. The student concludes that the data demonstrate that global warming is causing summer rainfall to decrease rather than increase. Is this a valid conclusion from the data? Give two brief reasons to justify your answer.
OCR MEI S1 Q5
18 marks Standard +0.3
5 Yasmin has 5 coins. One of these coins is biased with \(\mathrm { P } ( \text{heads} ) = 0.6\). The other 4 coins are fair. She tosses all 5 coins once and records the number of heads, \(X\).
  1. Show that \(\mathrm { P } ( X = 0 ) = 0.025\).
  2. Show that \(\mathrm { P } ( X = 1 ) = 0.1375\). The table shows the probability distribution of \(X\).
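    \begin{tabular}{|l|c|c|c|c|c|c|}
    \hline
    \(r\) & 0 & 1 & 2 & 3 & 4 & 5 \\
    \hline
    \(\mathrm { P } ( X = r )\) & 0.025 & 0.1375 & 0.3 & 0.325 & 0.175 & 0.0375 \\
    \hline
    \end{tabular}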
  3. Draw a vertical line chart to illustrate the probability distribution.
  4. Comment on the skewness of the distribution.
  5. Find \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).
  6. Yasmin tosses the 5 coins three times. Find the probability that the total number of heads is 3.
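The tabulated distribution can be reproduced by conditioning on the biased coin: \(X\) is the number of heads from the 4 fair coins plus 0 or 1 from the biased coin. The sketch below is one way to check the table and the values needed for part 5; it uses exact fractions, and the variable names are chosen for this example only.

```python
from fractions import Fraction
from math import comb

p_biased = Fraction(3, 5)  # P(heads) = 0.6 for the biased coin

# P(k heads from the 4 fair coins), k = 0..4, from the binomial B(4, 1/2).
fair = [Fraction(comb(4, k), 2**4) for k in range(5)]

pmf = []
for r in range(6):
    # Biased coin shows tails: all r heads come from the fair coins.
    biased_tail = (1 - p_biased) * fair[r] if r <= 4 else 0
    # Biased coin shows heads: r - 1 heads come from the fair coins.
    biased_head = p_biased * fair[r - 1] if r >= 1 else 0
    pmf.append(biased_tail + biased_head)

mean = sum(r * p for r, p in enumerate(pmf))
variance = sum(r * r * p for r, p in enumerate(pmf)) - mean**2

print([float(p) for p in pmf])
print("E(X) =", float(mean), " Var(X) =", float(variance))
```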
Edexcel S1 2004 January Q5
18 marks Moderate -0.3
5. The values of daily sales, to the nearest \(\pounds\), taken at a newsagents last year are summarised in the table below.
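\begin{tabular}{|c|c|}
\hline
Sales & Number of days \\
\hline
\(1 - 200\) & 166 \\
\hline
\(201 - 400\) & 100 \\
\hline
\(401 - 700\) & 59 \\
\hline
\(701 - 1000\) & 30 \\
\hline
\(1001 - 1500\) & 5 \\
\hline
\end{tabular}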
  1. Draw a histogram to represent these data.
  2. Use interpolation to estimate the median and inter-quartile range of daily sales.
  3. Estimate the mean and the standard deviation of these data. The newsagent wants to compare last year's sales with other years.
  4. State whether the newsagent should use the median and the inter-quartile range or the mean and the standard deviation to compare daily sales. Give a reason for your answer.
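A histogram for unequal class widths plots frequency density (frequency divided by class width) rather than frequency. The sketch below computes the densities for the sales table. It assumes that, because sales are recorded to the nearest pound, each class runs from 0.5 below its lower limit to 0.5 above its upper limit; that boundary choice is an assumption of this example.

```python
# (lower limit, upper limit, number of days) as printed in the table.
classes = [
    (1, 200, 166),
    (201, 400, 100),
    (401, 700, 59),
    (701, 1000, 30),
    (1001, 1500, 5),
]

densities = []
for low, high, freq in classes:
    # Values are rounded to the nearest pound, so widen each limit by 0.5.
    lower_boundary = low - 0.5
    upper_boundary = high + 0.5
    width = upper_boundary - lower_boundary
    densities.append(freq / width)

for (low, high, _), density in zip(classes, densities):
    print(f"{low}-{high}: frequency density = {density:.4f} days per pound")
```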
OCR MEI S1 2012 January Q1
7 marks Easy -1.8
1 The mean daily maximum temperatures at a research station over a 12-month period, measured to the nearest degree Celsius, are given below.
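\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline
Jan & Feb & Mar & Apr & May & Jun & Jul & Aug & Sep & Oct & Nov & Dec \\
\hline
8 & 15 & 25 & 29 & 31 & 31 & 34 & 36 & 34 & 26 & 15 & 8 \\
\hline
\end{tabular}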
  1. Construct a sorted stem and leaf diagram to represent these data, taking stem values of \(0,10 , \ldots\).
  2. Write down the median of these data.
  3. The mean of these data is 24.3. Would the mean or the median be a better measure of central tendency of the data? Briefly explain your answer.
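A sorted stem-and-leaf diagram groups each value by its tens digit (the stem) and lists the units digits (the leaves) in order. The sketch below builds one for the twelve temperatures and computes the median alongside the mean; the printed layout and key are one possible presentation, not a prescribed format.

```python
from collections import defaultdict
import statistics

# Mean daily maximum temperatures (degrees Celsius), January to December.
temps = [8, 15, 25, 29, 31, 31, 34, 36, 34, 26, 15, 8]

# Stem = tens digit, leaf = units digit; sorting first keeps leaves in order.
stems = defaultdict(list)
for t in sorted(temps):
    stems[t // 10].append(t % 10)

for stem in range(0, 4):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
print("Key: 2 | 5 represents 25 degrees Celsius")

median = statistics.median(temps)
mean = statistics.mean(temps)
print("Median:", median, " Mean:", round(mean, 1))
```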
OCR MEI S1 2012 January Q7
19 marks Moderate -0.3
7 The birth weights of 200 lambs from crossbred sheep are illustrated by the cumulative frequency diagram below. \includegraphics[max width=\textwidth, alt={}, center]{4b259fe3-73ef-419f-85ad-1a3b1e6ea56e-4_917_1146_367_447}
  1. Estimate the percentage of lambs with birth weight over 6 kg.
  2. Estimate the median and interquartile range of the data.
  3. Use your answers to part (ii) to show that there are very few, if any, outliers. Comment briefly on whether any outliers should be disregarded in analysing these data. The box and whisker plot shows the birth weights of 100 lambs from Welsh Mountain sheep. \includegraphics[max width=\textwidth, alt={}, center]{4b259fe3-73ef-419f-85ad-1a3b1e6ea56e-4_328_1616_1749_260}
  4. Use appropriate measures to compare briefly the central tendencies and variations of the weights of the two types of lamb.
  5. The weight of the largest Welsh Mountain lamb was originally recorded as 6.5 kg, but then corrected. If this error had not been corrected, how would this have affected your answers to part (iv)? Briefly explain your answer.
  6. One lamb of each type is selected at random. Estimate the probability that the birth weight of both lambs is at least 3.9 kg .
OCR MEI S1 2013 January Q6
18 marks Standard +0.3
6 The heights \(x \mathrm {~cm}\) of 100 boys in Year 7 at a school are summarised in the table below.
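\begin{tabular}{|l|c|c|c|c|c|}
\hline
Height & \(125 \leqslant x \leqslant 140\) & \(140 < x \leqslant 145\) & \(145 < x \leqslant 150\) & \(150 < x \leqslant 160\) & \(160 < x \leqslant 170\) \\
\hline
Frequency & 25 & 29 & 24 & 18 & 4 \\
\hline
\end{tabular}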
  1. Estimate the number of boys who have heights of at least 155 cm.
  2. Calculate an estimate of the median height of the 100 boys.
  3. Draw a histogram to illustrate the data. The histogram below shows the heights of 100 girls in Year 7 at the same school. \includegraphics[max width=\textwidth, alt={}, center]{76283206-687f-45d6-9204-952d60843cf1-3_865_1349_1297_349}
  4. How many more girls than boys had heights exceeding 160 cm?
  5. Calculate an estimate of the mean height of the 100 girls.
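Estimating a median from grouped data is usually done by linear interpolation within the class that contains the middle position. The sketch below applies that idea to the boys' table. It assumes the median sits at cumulative position \(n/2\) and that heights are spread evenly within each class; the helper `grouped_quantile` is a name invented for this example.

```python
# (lower boundary, upper boundary, frequency) for the boys' heights in cm.
classes = [
    (125, 140, 25),
    (140, 145, 29),
    (145, 150, 24),
    (150, 160, 18),
    (160, 170, 4),
]


def grouped_quantile(classes, position):
    """Estimate the value at a cumulative position by linear interpolation."""
    cumulative = 0
    for low, high, freq in classes:
        if cumulative + freq >= position:
            # Fraction of the way through this class, scaled by its width.
            return low + (position - cumulative) / freq * (high - low)
        cumulative += freq
    raise ValueError("position exceeds total frequency")


total = sum(freq for _, _, freq in classes)
median_estimate = grouped_quantile(classes, total / 2)
print(f"Estimated median height: {median_estimate:.1f} cm")
```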
OCR MEI S1 2009 June Q1
5 marks Easy -1.8
1 In a traffic survey, the number of people in each car passing the survey point is recorded. The results are given in the following frequency table.
\begin{tabular}{|l|c|c|c|c|}
\hline
Number of people & 1 & 2 & 3 & 4 \\
\hline
Frequency & 50 & 31 & 16 & 5 \\
\hline
\end{tabular}
  1. Write down the median and mode of these data.
  2. Draw a vertical line diagram for these data.
  3. State the type of skewness of the distribution.
OCR MEI S1 2016 June Q6
18 marks Moderate -0.8
6 An online store has a total of 930 different types of women's running shoe on sale. The prices in pounds of the types of women's running shoe are summarised in the table below.
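\begin{tabular}{|l|c|c|c|c|c|}
\hline
Price \(( \pounds x )\) & \(10 \leqslant x \leqslant 40\) & \(40 < x \leqslant 50\) & \(50 < x \leqslant 60\) & \(60 < x \leqslant 80\) & \(80 < x \leqslant 200\) \\
\hline
Frequency & 147 & 109 & 182 & 317 & 175 \\
\hline
\end{tabular}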
  1. Calculate estimates of the mean and standard deviation of the shoe prices.
  2. Calculate an estimate of the percentage of types of shoe that cost at least \(\pounds 100\).
  3. Draw a histogram to illustrate the data. The corresponding histogram below shows the prices in pounds of the 990 types of men's running shoe on sale at the same online store. \includegraphics[max width=\textwidth, alt={}, center]{aff0c5b2-011b-49a0-bf05-6d905f890eba-4_643_1192_340_440}
  4. State the type of skewness shown by the histogram for men's running shoes.
  5. Martin is investigating the percentage of types of shoe on sale at the store that cost more than \(\pounds 100\). He believes that this percentage is greater for men's shoes than for women's shoes. Estimate the percentage for men's shoes and comment on whether you can be certain which percentage is higher.
  6. You are given that the mean and standard deviation of the prices of men's running shoes are \(\pounds 68.83\) and \(\pounds 42.93\) respectively. Compare the central tendency and variation of the prices of men's and women's running shoes at the store.
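Estimates of the mean and standard deviation from a grouped table are normally based on class midpoints. The sketch below shows that calculation for the women's shoe prices. It uses the divisor \(n - 1\) for the standard deviation, which is an assumption about the convention in use; replacing it with \(n\) gives the population form.

```python
from math import sqrt

# (lower boundary, upper boundary, frequency) for women's shoe prices in pounds.
classes = [
    (10, 40, 147),
    (40, 50, 109),
    (50, 60, 182),
    (60, 80, 317),
    (80, 200, 175),
]

n = sum(f for _, _, f in classes)
# Each class is represented by its midpoint (lo + hi) / 2.
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

mean = sum_fx / n
s_xx = sum_fx2 - n * mean**2
sd = sqrt(s_xx / (n - 1))  # assumed divisor n - 1

print(f"n = {n}, mean = {mean:.2f}, standard deviation = {sd:.2f}")
```

These estimates can then be set against the figures given for men's shoes in part 6.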
OCR H240/02 2021 November Q13
9 marks Moderate -0.8
13 The four pie charts illustrate the numbers of employees using different methods of travel in four Local Authorities in 2011. \includegraphics[max width=\textwidth, alt={}, center]{7298e7b9-ad52-480c-bc2b-8289aeab9ebb-10_1131_1077_347_242}
\begin{tabular}{lll}
\multirow[t]{4}{*}{Key:} & \multirow{4}{*}{\includegraphics[max width=\textwidth, alt={}]{7298e7b9-ad52-480c-bc2b-8289aeab9ebb-10_105_142_1578_465} } & Public transport \\
 & & Private motorised transport \\
 & & Bicycle \\
 & & All other methods of travel \\
\end{tabular}
  1. State, with reasons, which of the four Local Authorities is most likely to be a rural area with many hills.
  2. Explain why pie charts are more suitable for answering part (a) than bar charts showing the same data.
  3. Two of the Local Authorities represent urban areas.
    1. State with a reason which two Local Authorities are likely to be urban.
    2. One urban Local Authority introduced a Park-and-Ride service in 2006. Users of this service drive to the edge of the urban area and then use buses to take them into the centre of the area. A student claims that a comparison of the corresponding pie charts for 2001 (not shown) and 2011 would enable them to identify which Local Authority this was. State with a reason whether you agree with the student.
OCR H240/02 Q9
4 marks Moderate -0.8
9 The diagram below shows some "Cycle to work" data taken from the 2001 and 2011 UK censuses. The diagram shows the percentages, by age group, of male and female workers in England and Wales, excluding London, who cycled to work in 2001 and 2011. \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-10_951_1635_559_207} The following questions refer to the workers represented by the graphs in the diagram.
  1. A researcher is going to take a sample of men and a sample of women and ask them whether or not they cycle to work. Why would it be more important to stratify the sample of men? A research project followed a randomly chosen large sample of the group of male workers who were aged 30-34 in 2001.
  2. Does the diagram suggest that the proportion of this group who cycled to work has increased or decreased from 2001 to 2011?
    Justify your answer.
  3. Write down one assumption that you have to make about these workers in order to draw this conclusion.
OCR H240/02 Q13
5 marks Moderate -0.5
13 The table and the four scatter diagrams below show data taken from the 2011 UK census for four regions. On the scatter diagrams the names have been replaced by letters.
The table shows, for each region, the mean and standard deviation of the proportion of workers in each Local Authority who travel to work by driving a car or van and the proportion of workers in each Local Authority who travel to work as a passenger in a car or van.
Each scatter diagram shows, for each of the Local Authorities in a particular region, the proportion of workers who travel to work by driving a car or van and the proportion of workers who travel to work as a passenger in a car or van.
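\begin{tabular}{|l|c|c|c|c|}
\hline
 & \multicolumn{2}{c|}{Driving a car or van} & \multicolumn{2}{c|}{Passenger in a car or van} \\
\hline
 & Mean & Standard deviation & Mean & Standard deviation \\
\hline
London & 0.257 & 0.133 & 0.017 & 0.008 \\
\hline
South East & 0.578 & 0.064 & 0.045 & 0.010 \\
\hline
South West & 0.580 & 0.084 & 0.049 & 0.007 \\
\hline
Wales & 0.644 & 0.045 & 0.068 & 0.015 \\
\hline
\end{tabular}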
Region A \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-14_634_1116_1308_299} Region B \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-14_636_1109_2049_301} Region C \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-15_737_1183_237_240} Region D \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-15_723_1169_1046_246}
  1. Using the values given in the table, match each region to its corresponding scatter diagram, explaining your reasoning.
  2. Steven claims that the outlier in the scatter diagram for Region C consists of a group of small islands. Explain whether or not the data given above support his claim.
  3. One of the Local Authorities in Region B consists of a single large island. Explain whether or not you would expect this Local Authority to appear as an outlier in the scatter diagram for Region B.
Edexcel AS Paper 2 2024 June Q3
7 marks Moderate -0.3
3. Customers in a shop have to queue to pay. The partially completed table below and partially completed histogram opposite give information about the time, \(x\) minutes, spent in the queue by each of 112 customers one day.
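\begin{tabular}{|c|c|}
\hline
Time in queue ( \(\boldsymbol { x }\) minutes) & Frequency \\
\hline
\(1 - 2\) & 64 \\
\hline
\(2 - 3\) & \\
\hline
\(3 - 4\) & 13 \\
\hline
\(4 - 6\) & \\
\hline
\(6 - 8\) & 3 \\
\hline
\end{tabular}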
No customer spent less than 1 minute or longer than 8 minutes in the queue.
  1. Complete the table.
  2. Complete the histogram. Ting decides to model the frequency density for these 112 customers by a curve with equation $$y = \frac { k } { x ^ { 2 } } \quad 1 \leqslant x \leqslant 8$$ where \(k\) is a constant.
  3. Find the value of \(k\)
    \includegraphics[max width=\textwidth, alt={}]{6a0b46f8-7a6a-4ed8-8c7a-9772787f155a-07_1584_1189_285_443}
    Only use this grid if you need to redraw your histogram. \includegraphics[max width=\textwidth, alt={}, center]{6a0b46f8-7a6a-4ed8-8c7a-9772787f155a-09_1582_1192_367_440} \includegraphics[max width=\textwidth, alt={}, center]{6a0b46f8-7a6a-4ed8-8c7a-9772787f155a-09_2267_51_307_36}
OCR PURE Q8
3 marks Easy -1.8
8
  1. Joseph drew a histogram to show information about one Local Authority. He used data from the "Age structure by LA 2011" tab in the large data set. The table shows an extract from the data that he used.
    \begin{tabular}{|l|c|}
    \hline
    Age group & 0 to 4 \\
    \hline
    Frequency & 2143 \\
    \hline
    \end{tabular}
    Joseph used a scale of \(1 \mathrm {~cm} = 1000\) units on the frequency density axis. Calculate the height of the histogram block for the 0 to 4 class.
  2. Magdalene wishes to draw a statistical diagram to illustrate some of the data from the "Method of travel by LA 2011" tab in the large data set. State why she cannot draw a histogram.
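The block height in part 1 comes from the frequency density, that is, frequency divided by class width. As a sketch, assuming the "0 to 4" class covers ages \(0 \leqslant \text{age} < 5\) (a width of 5 years, which is an assumption about how ages are recorded) and using the stated scale of 1 cm to 1000 units:

$$\text{frequency density} = \frac{2143}{5} = 428.6, \qquad \text{height} = \frac{428.6}{1000} \approx 0.43 \text{ cm}.$$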