2.02i Select/critique data presentation

68 questions

Edexcel S1 Specimen Q5
16 marks Easy -1.2
  1. Explain briefly the advantages and disadvantages of using the quartiles to summarise a set of data. [4]
  2. Describe the main features and uses of a box plot. [3]
The distances, in kilometres, travelled to school by the teachers in two schools, \(A\) and \(B\), in the same town were recorded. The data for School \(A\) are summarised in Diagram 1. \includegraphics{figure_1} For School \(B\), the least distance travelled was 3 km and the longest distance travelled was 55 km. The three quartiles were 17, 24 and 31 respectively. An outlier is an observation that falls either \(1.5 \times\) (interquartile range) above the upper quartile or \(1.5 \times\) (interquartile range) below the lower quartile.
  1. Draw a box plot for School \(B\). [5]
  2. Compare and contrast the two box plots. [4]
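The outlier rule quoted in the question can be checked numerically. The following is a minimal sketch, assuming the usual reading of the rule (an observation is an outlier if it lies more than 1.5 × IQR beyond a quartile); the helper name `outlier_fences` is illustrative and not part of the question.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences: q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# School B figures as stated in the question (distances in km)
minimum, q1, median, q3, maximum = 3, 17, 24, 31, 55

lower, upper = outlier_fences(q1, q3)
low_outlier = minimum < lower    # is the least distance below the lower fence?
high_outlier = maximum > upper   # is the longest distance above the upper fence?

print(f"fences: {lower} to {upper}")
print(f"minimum is an outlier: {low_outlier}")
print(f"maximum is an outlier: {high_outlier}")
```

On a box plot, a value beyond a fence is normally marked separately and the whisker is drawn to the fence or to the most extreme value inside it, depending on the convention the mark scheme accepts.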
Edexcel S1 Q7
15 marks Moderate -0.8
Jane and Tahira play together in a basketball team. The list below shows the number of points that Jane scored in each of 30 games.
39 19 28 30 18 21 23 15 34 24
29 17 43 12 24 25 41 19 26 40
45 23 21 32 37 24 18 15 24 36
  1. Construct a stem and leaf diagram for these data. [3 marks]
  2. Find the median and quartiles for these data. [4 marks]
  3. Represent these data with a boxplot. [3 marks]
Tahira played in the same 30 games and her lowest and highest points total in a game were 19 and 41 respectively. The quartiles for Tahira were 27, 31 and 35 respectively.
  1. Using the same scale draw a boxplot for Tahira's points totals. [2 marks]
  2. Compare and contrast the number of points scored per game by Jane and Tahira. [3 marks]
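As an illustration of how a stem and leaf diagram is built, here is a short sketch that groups Jane's 30 scores by their tens digit. The function name `stem_and_leaf` is hypothetical, and the printed layout is only one of several acceptable styles.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem), with sorted units digits (leaves)."""
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    return dict(rows)

# Jane's points in each of the 30 games, as listed in the question
jane = [39, 19, 28, 30, 18, 21, 23, 15, 34, 24,
        29, 17, 43, 12, 24, 25, 41, 19, 26, 40,
        45, 23, 21, 32, 37, 24, 18, 15, 24, 36]

diagram = stem_and_leaf(jane)
for stem in sorted(diagram):
    print(stem, "|", " ".join(str(leaf) for leaf in diagram[stem]))
print("Key: 1 | 2 means 12 points")
```

Because the leaves are ordered, the median and quartiles can then be read off by counting along the rows.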
Edexcel S1 Q4
14 marks Moderate -0.8
A College offers evening classes in GCSE Mathematics and English. In order to assess which age groups were reluctant to use the classes, the College collected data on the age in completed years of those currently attending each course. The results are shown in this back-to-back stem and leaf diagram. \includegraphics{figure_4} Key: \(1 | 3 | 2\) means age 31 doing Mathematics and age 32 doing English
  1. Find the median and quartiles of the age in completed years of those attending the Mathematics classes. [4 marks]
  2. On graph paper, draw a box plot representing the data for the Mathematics class. [3 marks]
The median and quartiles of the age in completed years of those attending the English classes are 25, 41 and 57 years respectively.
  1. Draw a box plot representing the data for the English class using the same scale as for the data from the Mathematics class. [3 marks]
  2. Using your box plots, compare and contrast the ages of those taking each class. [4 marks]
OCR H240/02 2020 November Q14
6 marks Easy -1.8
Table 1 shows the numbers of usual residents in the age range 0 to 4 in 15 Local Authorities (LAs) in 2001 and 2011. The table also shows the increase in the numbers in this age group, and the same increase as a percentage. \includegraphics{figure_14} Fig. 2 shows the increase in each LA in raw numbers, and Fig. 3 shows the percentage increase in each LA. \includegraphics{figure_14_2} \includegraphics{figure_14_3}
  1. The Education Committees in these LAs need to plan for the provision of schools for pupils in their districts.
    1. Explain why, in this context, the increase is more important than the actual numbers. [1]
    2. In which of the following LAs was there likely to have been the greatest need for extra teachers in the years following 2011: Bolton, Sefton, Tameside or Wigan? Give a reason for your answer. [2]
    3. State an assumption about the populations needed to make your answer in part (ii) valid. [1]
  2. In two of the 15 LAs the proportion of young families is greater than in the other 13 LAs. Suggest, using only data from Fig. 2 and Fig. 3 and/or Table 1, which two LAs these are most likely to be. [2]
AQA AS Paper 1 2019 June Q10
9 marks Moderate -0.3
On 18 March 2019 there were 12 hours of daylight in Inverness. On 16 June 2019, 90 days later, there will be 18 hours of daylight in Inverness. Jude decides to model the number of hours of daylight in Inverness, \(N\), by the formula $$N = A + B\sin t^{\circ}$$ where \(t\) is the number of days after 18 March 2019.
    1. State the value that Jude should use for \(A\). [1 mark]
    2. State the value that Jude should use for \(B\). [1 mark]
    3. Using Jude's model, calculate the number of hours of daylight in Inverness on 15 May 2019, 58 days after 18 March 2019. [1 mark]
    4. Using Jude's model, find how many days during 2019 will have at least 17.4 hours of daylight in Inverness. [4 marks]
    5. Explain why Jude's model will become inaccurate for 2020 and future years. [1 mark]
  1. Anisa decides to model the number of hours of daylight in Inverness with the formula $$N = A + B\sin \left(\frac{360}{365}t\right)^{\circ}$$ Explain why Anisa's model is better than Jude's model. [1 mark]
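The two daylight models can be compared numerically. This is a sketch only: it assumes \(A = 12\) and \(B = 6\), which is what the two stated data points give (\(N = 12\) at \(t = 0\) and \(N = 18\) at \(t = 90\)), and it assumes 18 March is day 77 of 2019, so \(t\) runs from \(-76\) on 1 January to 288 on 31 December. The function names `jude` and `anisa` are illustrative.

```python
import math

A, B = 12, 6  # assumed from N = 12 at t = 0 and N = 18 at t = 90

def jude(t):
    """Jude's model: the sine has a period of 360 days."""
    return A + B * math.sin(math.radians(t))

def anisa(t):
    """Anisa's model: the sine has a period of 365 days."""
    return A + B * math.sin(math.radians(360 / 365 * t))

# Hours of daylight on 15 May 2019 (t = 58) under Jude's model
n_58 = jude(58)

# Days in 2019 with at least 17.4 hours of daylight under Jude's model
days_17_4 = sum(1 for t in range(-76, 289) if jude(t) >= 17.4)

print(f"N(58) = {n_58:.2f} hours")
print(f"days with N >= 17.4: {days_17_4}")
# Jude's model repeats after 360 days, Anisa's after 365
print(f"Jude after 360 days: {jude(360):.2f}, Anisa after 365 days: {anisa(365):.2f}")
```

The last line shows the design difference between the two formulas: only Anisa's returns to its starting value after a full calendar year of 365 days.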
AQA AS Paper 2 2018 June Q17
2 marks Easy -1.8
The table below is an extract from the Large Data Set, showing the purchased quantities of fats and oils for the South East of England in 2014.
Description | Purchased quantity
Butter | 42
Soft margarine | 16
Olive oil | 17
Other vegetable and salad oils | 28
Kim claims that more olive oil was purchased in the South East than soft margarine. Explain why Kim may be incorrect. [2 marks]
AQA Paper 3 2019 June Q16
10 marks Moderate -0.3
  1. The graph below shows the amount of salt, in grams, purchased per person per week in England between 2001–02 and 2014, based upon the Large Data Set. \includegraphics{figure_16a} Meera and Gemma are arguing about what this graph shows. Meera believes that the amount of salt consumed by people decreased greatly during this period. Gemma says that this is not the case. Using your knowledge of the Large Data Set, give two reasons why Gemma may be correct. [2 marks]
  2. It is known that the mean amount of sugar purchased per person in England in 2014 was 78.9 grams, with a standard deviation of 25.0 grams. In 2018, a sample of 918 people had a mean of 80.4 grams of sugar purchased per person. Investigate, at the 5\% level of significance, whether the mean amount of sugar purchased per person in England has changed between 2014 and 2018. Assume that the survey data is a random sample taken from a normal distribution and that the standard deviation has remained the same. [6 marks]
  3. Another test is performed to determine whether the mean amount of fat purchased per person has changed between 2014 and 2018. At the 10\% significance level, the null hypothesis is rejected. With reference to the 10\% significance level, explain why it is not necessarily true that there has been a change. [2 marks]
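The test in the sugar question is a two-tailed test on a normal mean with known standard deviation. A minimal sketch of the arithmetic follows, using only the figures given in the question and Python's standard `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Figures stated in the question
mu0 = 78.9     # mean in 2014 (grams), value under H0
sigma = 25.0   # standard deviation, assumed unchanged
n = 918        # sample size in 2018
xbar = 80.4    # sample mean in 2018 (grams)
alpha = 0.05   # significance level, two-tailed test

# Test statistic: the sample mean has standard error sigma / sqrt(n)
z = (xbar - mu0) / (sigma / sqrt(n))

# Two-tailed p-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing `z` with the two-tailed critical value of 1.96 leads to the same decision as comparing the p-value with 0.05.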
OCR PURE Q13
7 marks Easy -2.5
The radar diagrams illustrate some population figures from the 2011 census results. \includegraphics{figure_13} Each radius represents an age group, as follows:
Radius | 1 | 2 | 3 | 4 | 5 | 6
Age group | 0-17 | 18-29 | 30-44 | 45-59 | 60-74 | 75+
The distance of each dot from the centre represents the number of people in the relevant age group.
  1. The scales on the two diagrams are different. State an advantage and a disadvantage of using different scales in order to make comparisons between the ages of people in these two Local Authorities. [2]
  2. Approximately how many people aged 45 to 59 were there in Liverpool? [1]
  3. State the main two differences between the age profiles of the two Local Authorities. [2]
  4. James makes the following claim. "Assuming that there are no significant movements of population either into or out of the two regions, the 2021 census results are likely to show an increase in the number of children in Liverpool and a decrease in the number of children in Rutland." Use the radar diagrams to give a justification for this claim. [2]
OCR MEI AS Paper 2 2018 June Q11
9 marks Easy -1.8
The pre-release material contains data concerning the death rate per thousand people and the birth rate per thousand people in all the countries of the world. The diagram in Fig. 11.1 was generated using a spreadsheet and summarises the birth rates for all the countries in Africa. \includegraphics{figure_11_1} Fig. 11.1
  1. Identify two respects in which the presentation of the data is incorrect. [2]
Fig. 11.2 shows a scatter diagram of death rate, \(y\), against birth rate, \(x\), for a sample of 55 countries, all of which are in Africa. A line of best fit has also been drawn. \includegraphics{figure_11_2} Fig. 11.2 The equation of the line of best fit is \(y = 0.15x + 4.72\).
    1. What does the diagram suggest about the relationship between death rate and birth rate? [1]
    2. The birth rate in Togo is recorded as 34.13 per thousand, but the data on death rate has been lost. Use the equation of the line of best fit to estimate the death rate in Togo. [1]
    3. Explain why it would not be sensible to use the equation of the line of best fit to estimate the death rate in a country where the birth rate is 5.5 per thousand. [1]
    4. Explain why it would not be sensible to use the equation of the line of best fit to estimate the death rate in a Caribbean country where the birth rate is known. [1]
    5. Explain why it is unlikely that the sample is random. [1]
Including Togo there were 56 items available for selection.
  1. Describe how a sample of size 14 from this data could be generated for further analysis using systematic sampling. [2]
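Systematic sampling from 56 items with a sample size of 14 means taking every 4th item after a random start. The sketch below assumes the items are simply numbered 1 to 56; the function name `systematic_sample` and the fixed seed are illustrative choices, not part of the question.

```python
import random

def systematic_sample(items, sample_size, rng=random):
    """Pick a random start within the first interval, then take every step-th item."""
    step = len(items) // sample_size   # sampling interval, 56 // 14 = 4 here
    start = rng.randrange(step)        # random index in 0 .. step - 1
    return items[start::step][:sample_size]

# Assumed labelling: countries numbered 1 to 56 in list order
countries = list(range(1, 57))
sample = systematic_sample(countries, 14, random.Random(0))
print(sample)
```

Passing a seeded `random.Random` only makes the run repeatable; in practice the start would be chosen with an unseeded random number generator.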
WJEC Unit 2 2018 June Q06
10 marks Moderate -0.8
Basel is a keen learner of languages. He finds a website on which a large number of language tutors offer their services. Basel records the cost, in dollars, of a one hour lesson from a random sample of tutors. He puts the data into a computer program which gives the following summary statistics.
Cost per 1 hour lesson
Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max.
10.0 | 16.0 | 17.2 | 19.8 | 21.0 | 40.0
  1. Showing all calculations, comment on any outliers for the cost of a one hour lesson with a language tutor. [4]
  2. Describe the skewness of the data and explain what it means in this context. [2]
Dafydd is also a keen learner of languages. He takes his own random sample of the cost, in dollars, for a one hour lesson. He produces the following box plot. \includegraphics{figure_6}
    1. What will happen to the mean if the outlier is removed?
    2. What will happen to the median if the outlier is removed? [2]
  1. Compare and contrast the distributions of the cost of one hour language lessons for Dafydd's sample and Basel's sample. [2]
WJEC Unit 2 2024 June Q5
8 marks Easy -1.2
In March 2020, the coronavirus pandemic caused major disruption to the lives of individuals across the world. A newspaper published the following graph from the gov.uk website, along with an article which included the following excerpt. "The daily number of vaccines administered continues to fall. In order to get control of the virus, we need the number of people receiving a second dose of the vaccine to keep rocketing. The fear is it will start to drop off soon, which will leave many people still unprotected." \includegraphics{figure_5}
  1. By referring to the graph, explain how the quote could be misleading. [1]
The daily numbers of second dose vaccines, in thousands, over the period April 1st 2021 to May 31st 2021 are shown in the table below.
Daily number of 2nd dose vaccines (1000s) | Midpoint \(x\) | Frequency \(f\) | Percentage
\(0 \leqslant v < 100\) | 50 | 2 | 3·3
\(100 \leqslant v < 200\) | 150 | 8 | 13·1
\(200 \leqslant v < 300\) | 250 | 10 | 16·4
\(300 \leqslant v < 400\) | 350 | 13 | 21·3
\(400 \leqslant v < 500\) | 450 | 26 | 42·6
\(500 \leqslant v < 600\) | 550 | 2 | 3·3
Total | | 61 | 100
    1. Calculate estimates of the mean and standard deviation for the daily number of second dose vaccines given over this period. You may use \(\sum x^2 f = 8272500\). [4]
    2. Comment on the skewness of these data. [1]
  1. Give a possible reason for the pattern observed in this graph. [1]
  2. State, with a reason, whether or not you think the data for April 15th to April 18th are incorrect. [1]
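The estimates of the mean and standard deviation from the grouped table use the midpoints as representative values. The sketch below assumes the divisor \(n\) (rather than \(n - 1\)) for the standard deviation; a mark scheme may accept either.

```python
from math import sqrt

# Midpoints (thousands of doses) and frequencies from the table
midpoints = [50, 150, 250, 350, 450, 550]
freqs = [2, 8, 10, 13, 26, 2]

n = sum(freqs)
sum_xf = sum(x * f for x, f in zip(midpoints, freqs))
sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))  # the question gives 8272500

mean = sum_xf / n
# Standard deviation with divisor n (an assumption, see above)
sd = sqrt(sum_x2f / n - mean ** 2)

print(f"n = {n}, mean = {mean:.1f} thousand, sd = {sd:.1f} thousand")
```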
WJEC Unit 2 Specimen Q5
12 marks Easy -1.2
Gareth has a keen interest in pop music. He recently read the following claim in a music magazine. In the pop industry most songs on the radio are not longer than three minutes.
  1. He decided to investigate this claim by recording the lengths of the top 50 singles in the UK Official Singles Chart for the week beginning 17 June 2016. (A 'single' in this context is one digital audio track.) Comment on the suitability of this sample to investigate the magazine's claim. [1]
  2. Gareth recorded the data in the table below.
    Length of singles for top 50 UK Official Chart singles, 17 June 2016
    Length (minutes) | 2.5-(3.0) | 3.0-(3.5) | 3.5-(4.0) | 4.0-(4.5) | 4.5-(5.0) | 5.0-(5.5) | 5.5-(6.0) | 6.0-(6.5) | 6.5-(7.0) | 7.0-(7.5)
    Frequency | 3 | 17 | 22 | 7 | 0 | 0 | 0 | 0 | 0 | 1
    He used these data to produce a graph of the distribution of the lengths of singles. \includegraphics{figure_2} State two corrections that Gareth needs to make to the histogram so that it accurately represents the data in the table. [2]
  3. Gareth also produced a box plot of the lengths of singles. \includegraphics{figure_3} He sees that there is one obvious outlier.
    1. What will happen to the mean if the outlier is removed?
    2. What will happen to the standard deviation if the outlier is removed? [2]
  4. Gareth decided to remove the outlier. He then produced a table of summary statistics.
    1. Use the appropriate statistics from the table to show, by calculation, that the maximum value for the length of a single is not an outlier.
      Summary statistics
      Length of single for top 50 UK Official Singles Chart (minutes)
      Length of single | N | Mean | Standard deviation | Minimum | Lower quartile | Median | Upper quartile | Maximum
       | 49 | 3.57 | 0.393 | 2.77 | 3.26 | 3.60 | 3.89 | 4.38
    2. State, with a reason, whether these statistics support the magazine's claim. [4]
  5. Gareth also calculated summary statistics for the lengths of 30 singles selected at random from his personal collection.
    Summary statistics
    Length of single for Gareth's random sample of 30 singles (minutes)
    Length of single | N | Mean | Standard deviation | Minimum | Lower quartile | Median | Upper quartile | Maximum
     | 30 | 3.13 | 0.364 | 2.58 | 2.73 | 2.92 | 3.22 | 3.95
    Compare and contrast the distribution of lengths of singles in Gareth's personal collection with the distribution in the top 50 UK Official Singles Chart. [3]
SPS SPS SM Statistics 2024 January Q4
6 marks Easy -1.2
The table shows the increases, between 2001 and 2011, in the percentages of employees travelling to work by various methods, in the Local Authorities (LAs) in the North East region of the UK. \includegraphics{figure_4} The first two digits of the Geography code give the type of each of the LAs: 06: Unitary authority 07: Non-metropolitan district 08: Metropolitan borough
  1. In what type of LA are the largest increases in percentages of people travelling by underground, metro, light rail or tram? [1]
  2. Identify two main changes in the pattern of travel to work in the North East region between 2001 and 2011. [2]
Now assume the following.
  • The data refer to residents in the given LAs who are in the age range 20 to 65 at the time of each census.
  • The number of people in the age range 20 to 65 who move into or out of each given LA, or who die, between 2001 and 2011 is negligible.
  1. Estimate the percentage of the people in the age range 20 to 65 in 2011 whose data appears in both 2001 and 2011. [2]
  2. In the light of your answer to the previous part, suggest a reason for the changes in the pattern of travel to work in the North East region between 2001 and 2011. [1]
SPS SPS SM Statistics 2025 April Q5
13 marks Easy -1.3
In a study of reaction times, 25 participants completed a test where their reaction times (in milliseconds) were recorded. The results are shown in the stem-and-leaf diagram below:
20 | 3 5 7 9
21 | 0 2 5 6 8
22 | 1 3 4 5 7 9
23 | 0 2 5 8
24 | 1 4 6 7
25 | 2 5
Key: 21 | 0 represents a reaction time of 210 milliseconds
  1. State the median reaction time. [1]
  2. Calculate the interquartile range of these reaction times. [2]
  3. Find the mean and standard deviation of these reaction times. [3]
  4. State one advantage of using a stem-and-leaf diagram to display this data rather than a frequency table. [1]
  5. One participant completed the test again and recorded a reaction time of 195 milliseconds. Add this result to the stem-and-leaf diagram and state the effect this would have on: a. the median b. the mean c. the standard deviation [4]
  6. Explain why the interquartile range might be preferred to the standard deviation as a measure of spread in this context. [2]
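The summary statistics can be checked by expanding the stem-and-leaf diagram back into raw values. This sketch uses Python's standard `statistics` module and the population standard deviation (`pstdev`), which is an assumption about the intended divisor; quartiles are left out because their value depends on the convention used.

```python
from statistics import mean, median, pstdev

# Stem-and-leaf rows from the question; key: 21 | 0 means 210 ms
rows = {
    20: [3, 5, 7, 9],
    21: [0, 2, 5, 6, 8],
    22: [1, 3, 4, 5, 7, 9],
    23: [0, 2, 5, 8],
    24: [1, 4, 6, 7],
    25: [2, 5],
}
times = [stem * 10 + leaf for stem, leaves in rows.items() for leaf in leaves]

print(len(times), median(times), mean(times), round(pstdev(times), 2))

# Effect of the extra reading of 195 ms
extended = times + [195]
print(median(extended), round(mean(extended), 2), round(pstdev(extended), 2))
```

Comparing the two printed lines shows the direction in which each statistic moves when the low value is added.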
SPS SPS SM Statistics 2024 September Q4
7 marks Easy -1.8
The radar diagrams illustrate some population figures from the 2011 census results. \includegraphics{figure_4} Each radius represents an age group, as follows:
Radius | 1 | 2 | 3 | 4 | 5 | 6
Age group | 0-17 | 18-29 | 30-44 | 45-59 | 60-74 | 75+
The distance of each dot from the centre represents the number of people in the relevant age group.
  1. The scales on the two diagrams are different. State an advantage and a disadvantage of using different scales in order to make comparisons between the ages of people in these two Local Authorities. [2]
  2. Approximately how many people aged 45 to 59 were there in Liverpool? [1]
  3. State the main two differences between the age profiles of the two Local Authorities. [2]
  4. James makes the following claim. "Assuming that there are no significant movements of population either into or out of the two regions, the 2021 census results are likely to show an increase in the number of children in Liverpool and a decrease in the number of children in Rutland." Use the radar diagrams to give a justification for this claim. [2]
SPS SPS SM 2025 October Q11
9 marks Moderate -0.8
A student dissolves 0.5 kg of salt in a bucket of water. Water leaks out of a hole in the bucket so the student lets fresh water flow in so that the bucket stays full. They assume that the salty water remaining in the bucket mixes with the fresh water that flows in, so the concentration of salt is uniform throughout the bucket. They model the mass \(M\) kg of salt remaining after \(t\) minutes by \(M = ak^t\) where \(a\) and \(k\) are constants.
  1. Show that the model for \(M\) can be rewritten in the form \(\log_{10} M = t\log_{10} k + \log_{10} a\). [1]
The student measures the concentration of salt in the bucket at certain times to estimate the mass of the salt remaining. The results are shown in the table below.
\(t\) minutes | 8 | 13 | 21 | 35 | 50
\(M\) kg | 0.4 | 0.3 | 0.2 | 0.1 | 0.05
The student uses this data and plots \(y = \log_{10} M\) against \(x = t\) using graph drawing software. The software gives \(y = -0.0214x - 0.2403\) for the equation of the line of best fit.
    1. Find the values of \(a\) and \(k\) that follow from the equation of the line. [2]
    2. Interpret the value of \(k\) in context. [1]
  1. It is known that when \(t = 0\) the mass of salt in the bucket is 0.5 kg. Comment on the accuracy when the model is used to estimate the initial mass of the salt. [1]
  2. Use the model to predict the value of \(t\) at which \(M = 0.01\) kg. [2]
  3. Rewrite the model for \(M\) in the form \(M = ae^{-ht}\) where \(h\) is a constant to be determined. [2]
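Because \(\log_{10} M = t\log_{10} k + \log_{10} a\), the gradient and intercept of the fitted line correspond to \(\log_{10} k\) and \(\log_{10} a\). The sketch below works through that conversion using the line reported by the software; the variable and function names are illustrative.

```python
import math

# Line of best fit reported by the software: y = -0.0214 x - 0.2403
gradient, intercept = -0.0214, -0.2403

k = 10 ** gradient    # gradient = log10(k)
a = 10 ** intercept   # intercept = log10(a)

def mass(t):
    """Model M = a * k**t, with t in minutes and M in kg."""
    return a * k ** t

# Time at which the model predicts M = 0.01 kg, solved from the linear form
t_target = (math.log10(0.01) - intercept) / gradient

# Equivalent exponential form M = a * exp(-h t), so h = -ln(k)
h = -math.log(k)

print(f"a = {a:.3f}, k = {k:.4f}")
print(f"M = 0.01 kg at t = {t_target:.1f} minutes")
print(f"h = {h:.4f}")
```

The value of `a` is the model's estimate of the initial mass, which can be set against the known 0.5 kg.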
OCR H240/02 2017 Specimen Q9
4 marks Easy -1.8
The diagram below shows some "Cycle to work" data taken from the 2001 and 2011 UK censuses. The diagram shows the percentages, by age group, of male and female workers in England and Wales, excluding London, who cycled to work in 2001 and 2011. \includegraphics{figure_9} The following questions refer to the workers represented by the graphs in the diagram.
  1. A researcher is going to take a sample of men and a sample of women and ask them whether or not they cycle to work. Why would it be more important to stratify the sample of men? [1]
A research project followed a randomly chosen large sample of the group of male workers who were aged 30-34 in 2001.
  1. Does the diagram suggest that the proportion of this group who cycled to work has increased or decreased from 2001 to 2011? Justify your answer. [2]
  2. Write down one assumption that you have to make about these workers in order to draw this conclusion. [1]
OCR H240/02 2017 Specimen Q13
5 marks Moderate -0.8
The table and the four scatter diagrams below show data taken from the 2011 UK census for four regions. On the scatter diagrams the names have been replaced by letters. The table shows, for each region, the mean and standard deviation of the proportion of workers in each Local Authority who travel to work by driving a car or van and the proportion of workers in each Local Authority who travel to work as a passenger in a car or van. Each scatter diagram shows, for each of the Local Authorities in a particular region, the proportion of workers who travel to work by driving a car or van and the proportion of workers who travel to work as a passenger in a car or van. \includegraphics{figure_13}
  1. Using the values given in the table, match each region to its corresponding scatter diagram, explaining your reasoning. [3]
  2. Steven claims that the outlier in the scatter diagram for Region C consists of a group of small islands. Explain whether or not the data given above support his claim. [1]
  3. One of the Local Authorities in Region B consists of a single large island. Explain whether or not you would expect this Local Authority to appear as an outlier in the scatter diagram for Region B. [1]