Misinterpretation of data in graphs

Questions asking students to critique misleading conclusions or statements drawn from graphs, where the graph itself may be correct but the interpretation or accompanying text is flawed.

8 questions

OCR H240/02 2021 November Q13
13 The four pie charts illustrate the numbers of employees using different methods of travel in four Local Authorities in 2011.
\includegraphics[max width=\textwidth, alt={}, center]{7298e7b9-ad52-480c-bc2b-8289aeab9ebb-10_1131_1077_347_242}
Key: \includegraphics[max width=\textwidth, alt={}]{7298e7b9-ad52-480c-bc2b-8289aeab9ebb-10_105_142_1578_465}
Public transport
Private motorised transport
Bicycle
All other methods of travel
  1. State, with reasons, which of the four Local Authorities is most likely to be a rural area with many hills.
  2. Explain why pie charts are more suitable for answering part (a) than bar charts showing the same data.
  3. Two of the Local Authorities represent urban areas.
    1. State with a reason which two Local Authorities are likely to be urban.
    2. One urban Local Authority introduced a Park-and-Ride service in 2006. Users of this service drive to the edge of the urban area and then use buses to take them into the centre of the area. A student claims that a comparison of the corresponding pie charts for 2001 (not shown) and 2011 would enable them to identify which Local Authority this was. State with a reason whether you agree with the student.
Edexcel AS Paper 2 2022 June Q4
  1. Jiang is studying the variable Daily Mean Pressure from the large data set.
He drew the following box and whisker plot for these data for one of the months for one location using a linear scale but
  • he failed to label all the values on the scale
  • he gave an incorrect value for the median
    \includegraphics[max width=\textwidth, alt={}, center]{08e3b0b0-2155-4b37-83e3-343c317ca10c-09_248_1264_573_402}
Daily Mean Pressure (hPa)
Using your knowledge of the large data set, suggest a suitable value for
  1. the median,
  2. the range.
    (You are not expected to have memorised values from the large data set. The question is simply looking for sensible answers.)
Edexcel AS Paper 2 2024 June Q2
  1. Keith is studying the variable Daily Mean Wind Direction, in degrees, from the large data set.
Keith summarised the data for Camborne from 1987 into 4 directions \(A , B , C\) and \(D\) representing North, South, East and West in some order.
| Direction | \(A\) | \(B\) | \(C\) | \(D\) |
|---|---|---|---|---|
| Frequency | 22 | 48 | 56 | 58 |
  1. Using your knowledge of the large data set state, giving a reason, which direction \(A\) represents.
    The entry for Hurn on 27th September 1987 was 999.
  2. State, giving a reason, what Keith should do with this value.
OCR PURE 2066 Q10
10 The table shows extracts from the "Method of travel by LA" tabs for 2001 and 2011 in the large data set.
Local authority (LA)All people in employmentUnderground, metro, light rail, tramTrainBus, minibus or coachMotorcycle, scooter or mopedDriving a car or van
LA1 20017922614369523520575122716052
LA1 201111855622486833630541122012445
LA2 20012036141901062153271256121690
LA2 20112278943231865137321038146644
LA3 20014299335482436327424105
LA3 20114901433828338019128981
LA4 2001101697656932175884645407
LA4 2011123218249513152427576354020
  1. In one of these four LAs a new tram system was opened in 2004. Suggest, with a reason taken from the data, which LA this could have been.
  2. Julian suggests that the figures for "Bus, minibus or coach" for LA1 show that some new bus routes were probably introduced in this LA between 2001 and 2011. Use data from the table to comment on this suggestion.
  3. In one of these four LAs a congestion charge on vehicles was introduced in 2003. Suggest, with a reason taken from the data, which LA this could have been.
OCR PURE Q10
10 The table shows the increases, between 2001 and 2011, in the percentages of employees travelling to work by various methods, in the Local Authorities (LAs) in the North East region of the UK.
| Geography code | Local authority | Work mainly at or from home | Underground, metro, light rail or tram | Bus, minibus or coach | Driving a car or van | Passenger in a car or van | On foot |
|---|---|---|---|---|---|---|---|
| E06000047 | County Durham | 0.74\% | 0.05\% | -1.50\% | 4.58\% | -2.99\% | -0.97\% |
| E06000005 | Darlington | 0.26\% | -0.01\% | -3.25\% | 3.06\% | -1.28\% | 0.99\% |
| E08000020 | Gateshead | -0.01\% | -0.01\% | -2.28\% | 4.62\% | -2.35\% | -0.18\% |
| E06000001 | Hartlepool | 0.03\% | -0.04\% | -1.62\% | 4.80\% | -2.38\% | -0.26\% |
| E06000002 | Middlesbrough | -0.34\% | -0.01\% | -2.32\% | 2.19\% | -1.33\% | 0.67\% |
| E08000021 | Newcastle upon Tyne | 0.10\% | -0.23\% | -0.67\% | -0.48\% | -1.51\% | 1.75\% |
| E08000022 | North Tyneside | 0.05\% | 0.54\% | -1.18\% | 3.30\% | -2.21\% | -0.60\% |
| E06000048 | Northumberland | 1.39\% | -0.08\% | -0.95\% | 3.50\% | -2.37\% | -1.44\% |
| E06000003 | Redcar and Cleveland | -0.02\% | -0.01\% | -2.09\% | 4.20\% | -2.06\% | -0.49\% |
| E08000023 | South Tyneside | -0.36\% | 2.03\% | -3.05\% | 4.50\% | -2.41\% | -0.51\% |
| E06000004 | Stockton-on-Tees | 0.14\% | 0.03\% | -2.02\% | 3.52\% | -2.01\% | -0.15\% |
| E08000024 | Sunderland | 0.17\% | 1.48\% | -3.11\% | 4.89\% | -2.21\% | -0.52\% |
Increase in percentage of employees travelling to work by various methods
The first two digits of the Geography code give the type of each of the LAs:
06: Unitary authority
07: Non-metropolitan district
08: Metropolitan borough
  1. In what type of LA are the largest increases in percentages of people travelling by underground, metro, light rail or tram?
  2. Identify two main changes in the pattern of travel to work in the North East region between 2001 and 2011.
    Now assume the following.
    • The data refer to residents in the given LAs who are in the age range 20 to 65 at the time of each census.
    • The number of people in the age range 20 to 65 who move into or out of each given LA, or who die, between 2001 and 2011 is negligible.
  3. Estimate the percentage of the people in the age range 20 to 65 in 2011 whose data appears in both 2001 and 2011.
  4. In the light of your answer to part (c), suggest a reason for the changes in the pattern of travel to work in the North East region between 2001 and 2011.
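The estimate asked for in part (c) can be sketched numerically. This is only an illustration under assumptions that are not stated in the question: ages are taken to be spread evenly across the 20 to 65 range, and the range is treated as a continuous 45-year span. Under those assumptions, anyone aged 30 to 65 in 2011 was aged 20 to 55 in 2001 and so appears in both censuses. The variable names are illustrative only.

```python
# Assumption (not from the question): ages are uniformly distributed
# over the 20-65 range, treated as a continuous 45-year span.
AGE_MIN = 20
AGE_MAX = 65
YEARS_BETWEEN_CENSUSES = 10  # 2001 to 2011

age_span = AGE_MAX - AGE_MIN                      # width of the age range
overlap_span = age_span - YEARS_BETWEEN_CENSUSES  # ages 30-65 in 2011

# Fraction of the 2011 group who were already aged 20 or over in 2001.
overlap = overlap_span / age_span

print(f"{overlap:.0%}")  # 78%
```

A different assumption about the age distribution would change the figure, so the result is best read as a rough guide to the size of the overlap.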
OCR MEI Paper 2 2023 June Q9
9 The pre-release material contains information concerning the median income of taxpayers in different areas of London. Some of the data for Camden is shown in the table below. The years quoted in this question refer to the end of the financial years used in the pre-release material. For example, the year 2004 in the table refers to the year 2003/04 in the pre-release material.
| Year | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 |
|---|---|---|---|---|---|---|---|---|
| Median income in \(\pounds\) | 21300 | 23200 | 24200 | 25900 | 26900 | \#N/A | 28400 | 29400 |
  1. Explain whether these data are a sample or a population of Camden taxpayers.
    A time series for the data is shown below.
    \begin{figure}[h]
    \captionsetup{labelformat=empty} \caption{Median income of taxpayers in Camden 2004-2011} \includegraphics[alt={},max width=\textwidth]{11788aaf-98fb-4a78-8a40-a40743b1fe15-07_624_1469_950_242}
    \end{figure}
    The LINEST function on a spreadsheet is used to formulate the following model for the data:
    \(I = 1115 Y - 2212950\), where \(I =\) median income of taxpayers in \(\pounds\) and \(Y =\) year.
  2. Use this model to find an estimate of the median income of taxpayers in Camden in 2009.
  3. Give two reasons why this estimate is likely to be close to the true value.
    The median income of taxpayers in Croydon in 2009 is also not available.
  4. Use your knowledge of the pre-release material to explain whether the model used in part (b) would give a reasonable estimate of the missing value for Croydon.
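The substitution needed for part (b) can be checked with a short sketch. It evaluates the quoted model \(I = 1115 Y - 2212950\) at \(Y = 2009\) and compares the result with the neighbouring table values for 2008 and 2010. The function name is hypothetical; the coefficients and incomes are those given in the question.

```python
def median_income_estimate(year: int) -> int:
    """Evaluate the quoted LINEST model I = 1115Y - 2212950 (I in pounds)."""
    return 1115 * year - 2212950


# Known Camden values either side of the missing 2009 entry (from the table).
income_2008 = 26900
income_2010 = 28400

estimate_2009 = median_income_estimate(2009)

print(estimate_2009)                              # 27085
print(income_2008 < estimate_2009 < income_2010)  # True
```

The estimate falls between the 2008 and 2010 values, which is consistent with 2009 being an interpolation within the range of the data.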
WJEC Unit 2 2024 June Q5
4 marks
5. In March 2020, the coronavirus pandemic caused major disruption to the lives of individuals across the world. A newspaper published the following graph from the \href{http://gov.uk}{gov.uk} website, along with an article which included the following excerpt.
"The daily number of vaccines administered continues to fall. In order to get control of the virus, we need the number of people receiving a second dose of the vaccine to keep rocketing. The fear is it will start to drop off soon, which will leave many people still unprotected." \begin{figure}[h]
\captionsetup{labelformat=empty} \caption{Number of people (1000s) who received 2nd dose vaccinations daily in the UK, by report date} \includegraphics[alt={},max width=\textwidth]{d9ef2033-bf8b-4aec-bc88-34dbc8b9c208-12_531_1525_906_260}
\end{figure}
  1. By referring to the graph, explain how the quote could be misleading.
    The daily numbers of second dose vaccines, in thousands, over the period April 1st 2021 to May 31st 2021 are shown in the table below.
    | Daily number of 2nd dose vaccines (1000s) | Midpoint \(x\) | Frequency \(f\) | Percentage |
    |---|---|---|---|
    | \(0 \leqslant v < 100\) | 50 | 2 | 3.3 |
    | \(100 \leqslant v < 200\) | 150 | 8 | 13.1 |
    | \(200 \leqslant v < 300\) | 250 | 10 | 16.4 |
    | \(300 \leqslant v < 400\) | 350 | 13 | 21.3 |
    | \(400 \leqslant v < 500\) | 450 | 26 | 42.6 |
    | \(500 \leqslant v < 600\) | 550 | 2 | 3.3 |
    | Total | | 61 | 100 |
    1. Calculate estimates of the mean and standard deviation for the daily number of second dose vaccines given over this period. You may use \(\sum x ^ { 2 } f = 8272500\). [4]
    2. Comment on the skewness of these data.
      A second graph in the article shows the number of people receiving a third dose of the vaccine. This graph has a repeated pattern of rising then falling. An extract is shown below.
      \begin{figure}[h]
      \captionsetup{labelformat=empty} \caption{Number of people (millions) who received 3rd dose vaccinations daily in the UK, by report date} \includegraphics[alt={},max width=\textwidth]{d9ef2033-bf8b-4aec-bc88-34dbc8b9c208-14_556_1115_571_466}
      \end{figure}
  2. Give a possible reason for the pattern observed in this graph.
    Another extract shows the number of people who received the third dose of the vaccine between 27th March 2022 and 25th April 2022.
    \begin{figure}[h]
    \captionsetup{labelformat=empty} \caption{Number of people (1000s) who received 3rd dose vaccinations daily in the UK, by report date} \includegraphics[alt={},max width=\textwidth]{d9ef2033-bf8b-4aec-bc88-34dbc8b9c208-15_537_1246_571_406}
    \end{figure}
  3. State, with a reason, whether or not you think the data for April 15th to April 18th are incorrect.
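The grouped-data calculation in this question (estimates of the mean and standard deviation from midpoints and frequencies) can be sketched as follows. The midpoints and frequencies are taken from the table. The use of the divisor \(n\) rather than \(n - 1\) for the variance is an assumption made here for illustration; the other divisor would give a slightly larger value.

```python
from math import sqrt

# Midpoints x (in thousands of doses) and frequencies f from the table.
midpoints = [50, 150, 250, 350, 450, 550]
frequencies = [2, 8, 10, 13, 26, 2]

n = sum(frequencies)
sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))
sum_x2f = sum(x * x * f for x, f in zip(midpoints, frequencies))

mean = sum_xf / n
# Assumption: population-style variance (divisor n), not divisor n - 1.
variance = sum_x2f / n - mean ** 2
sd = sqrt(variance)

print(n, sum_xf, sum_x2f)            # 61 21150 8272500
print(round(mean, 1), round(sd, 1))  # 346.7 124.1
```

The computed \(\sum x^2 f\) agrees with the value 8272500 quoted in the question. Both results are in thousands of doses per day.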
SPS SPS SM Statistics 2026 January Q3
3. Researchers investigated the change in the numbers of people in employment using underground, metro, light rail or tram (UMLRT) between 2001 and 2011. The data are combined for those Local Authorities (LAs) with UMLRT stations into five regions: Birmingham, Liverpool, Manchester, Sheffield and Rotherham, and Tyne and Wear.
Fig. 1 shows the total numbers of people in employment in those LAs. Fig. 2 shows the total numbers of people in employment who use UMLRT in those LAs.
\begin{figure}[h]
\captionsetup{labelformat=empty} \caption{Fig. 1} \includegraphics[alt={},max width=\textwidth]{fdff6575-679e-4d25-ad43-e9d343c1746f-08_834_1694_836_166}
\end{figure} \begin{figure}[h]
\captionsetup{labelformat=empty} \caption{Fig. 2} \includegraphics[alt={},max width=\textwidth]{fdff6575-679e-4d25-ad43-e9d343c1746f-08_833_1694_1822_166}
\end{figure}
  1. Use these charts to explain which of Birmingham and Liverpool has the larger proportion of people in employment who used UMLRT in 2011.
    One of the researchers says, "Between 2001 and 2011, the increase in the number of people in employment who use UMLRT is greatest in Tyne and Wear."
    Sam says, "But what matters more is which region has the greatest increase in the proportion of people in employment who use UMLRT."
  2. Give a reason why the planners responsible for the building of trains and the maintenance of infrastructure might disagree with Sam.
  3. Explain whether those responsible for encouraging the greater use of public transport would agree with Sam.
  4. The charts are compiled from data in the Large Data Set by using those LAs which contain UMLRT stations in each region. Explain a disadvantage of using these data.