Interpret census or real-world data

A question is this type if and only if it asks to analyze or interpret bivariate relationships in census data, population data, or other real-world datasets with multiple Local Authorities or regions.

10 questions · Easy -1.1

OCR H240/02 2018 June Q11
6 marks Moderate -0.8
11 Christa used Pearson's product-moment correlation coefficient, \(r\), to compare the use of public transport with the use of private vehicles for travel to work in the UK.
  1. Using the pre-release data set for all 348 UK Local Authorities, she considered the following four variables.
    | Variable | Symbol |
    | --- | --- |
    | Number of employees using public transport | \(x\) |
    | Number of employees using private vehicles | \(y\) |
    | Proportion of employees using public transport | \(a\) |
    | Proportion of employees using private vehicles | \(b\) |
    1. Explain, in context, why you would expect strong, positive correlation between \(x\) and \(y\).
    2. Explain, in context, what kind of correlation you would expect between \(a\) and \(b\).
    3. Christa also considered the data for the 33 London boroughs alone and she generated the following scatter diagram.
      \begin{figure}[h]
      \captionsetup{labelformat=empty} \caption{London} \includegraphics[alt={},max width=\textwidth]{65d9d34c-8c78-45fe-b9f0-dab071ae56bb-07_467_707_1366_653}
      \end{figure}
      One London Borough is represented by an outlier in the diagram.
      (a) Suggest what effect this outlier is likely to have on the value of \(r\) for the 32 London Boroughs.
      (b) Suggest what effect this outlier is likely to have on the value of \(r\) for the whole country.
    4. What can you deduce about the area of the London Borough represented by the outlier? Explain your answer.
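The contrast between counts (\(x\), \(y\)) and proportions (\(a\), \(b\)) can be illustrated with a minimal Python sketch. The five authorities below are hypothetical and are not taken from the pre-release data set; the workforce sizes, the proportions and the assumption that 15% of employees use neither mode are all invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for two paired lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical authorities (NOT real census values): number of employees
# and proportion of them using public transport.
workforce = [10_000, 200_000, 50_000, 150_000, 20_000]
a = [0.10, 0.30, 0.20, 0.50, 0.15]

# Assumption: 15% of employees use neither mode in every authority,
# so the private-vehicle proportion is 0.85 minus the public-transport one.
b = [0.85 - p for p in a]

# Counts scale with the size of the workforce.
x = [n * p for n, p in zip(workforce, a)]
y = [n * q for n, q in zip(workforce, b)]

r_counts = pearson_r(x, y)  # positive: large authorities have more of both
r_props = pearson_r(a, b)   # negative: a larger share of one leaves less for the other

print(f"r for counts x, y:      {r_counts:.3f}")
print(f"r for proportions a, b: {r_props:.3f}")
```

In this made-up example the counts are driven mainly by workforce size, while the proportions compete for a fixed share of employees, which is the kind of reasoning parts 1 and 2 ask for in context.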
OCR H240/02 Q9
4 marks Moderate -0.8
9 The diagram below shows some "Cycle to work" data taken from the 2001 and 2011 UK censuses. The diagram shows the percentages, by age group, of male and female workers in England and Wales, excluding London, who cycled to work in 2001 and 2011. \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-10_951_1635_559_207} The following questions refer to the workers represented by the graphs in the diagram.
  1. A researcher is going to take a sample of men and a sample of women and ask them whether or not they cycle to work. Why would it be more important to stratify the sample of men?
  A research project followed a randomly chosen large sample of the group of male workers who were aged 30-34 in 2001.
  2. Does the diagram suggest that the proportion of this group who cycled to work has increased or decreased from 2001 to 2011?
    Justify your answer.
  3. Write down one assumption that you have to make about these workers in order to draw this conclusion.
OCR H240/02 Q13
5 marks Moderate -0.5
13 The table and the four scatter diagrams below show data taken from the 2011 UK census for four regions. On the scatter diagrams the names have been replaced by letters.
The table shows, for each region, the mean and standard deviation of the proportion of workers in each Local Authority who travel to work by driving a car or van and the proportion of workers in each Local Authority who travel to work as a passenger in a car or van.
Each scatter diagram shows, for each of the Local Authorities in a particular region, the proportion of workers who travel to work by driving a car or van and the proportion of workers who travel to work as a passenger in a car or van.
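| Region | Driving a car or van: mean | Driving a car or van: standard deviation | Passenger in a car or van: mean | Passenger in a car or van: standard deviation |
| --- | --- | --- | --- | --- |
| London | 0.257 | 0.133 | 0.017 | 0.008 |
| South East | 0.578 | 0.064 | 0.045 | 0.010 |
| South West | 0.580 | 0.084 | 0.049 | 0.007 |
| Wales | 0.644 | 0.045 | 0.068 | 0.015 |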
Region A \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-14_634_1116_1308_299}
Region B \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-14_636_1109_2049_301}
Region C \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-15_737_1183_237_240}
Region D \includegraphics[max width=\textwidth, alt={}, center]{f2f45d6c-cfdc-455b-ab08-597b06a69f36-15_723_1169_1046_246}
  1. Using the values given in the table, match each region to its corresponding scatter diagram, explaining your reasoning.
  2. Steven claims that the outlier in the scatter diagram for Region C consists of a group of small islands. Explain whether or not the data given above support his claim.
  3. One of the Local Authorities in Region B consists of a single large island. Explain whether or not you would expect this Local Authority to appear as an outlier in the scatter diagram for Region B.
OCR PURE Q9
8 marks Easy -1.8
9 A researcher is studying changes in behaviour in travelling to work by people who live outside London, between 2001 and 2011. He chooses the 15 Local Authorities (LAs) outside London with the largest decreases in the percentage of people driving to work, and arranges these in descending order. The table shows the changes in percentages from 2001 to 2011 in various travel categories, for these Local Authorities.
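| Local Authority | Work mainly at or from home | Underground, metro, light rail, tram | Train | Bus, minibus or coach | Driving a car or van | Passenger in a car or van | Bicycle | On foot |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Brighton and Hove | 3.2 | 0.1 | 1.5 | 0.8 | -8.2 | -1.5 | 2.1 | 2.3 |
| Cambridge | 2.2 | 0.0 | 1.6 | 1.2 | -7.4 | -1.0 | 3.1 | 0.6 |
| Elmbridge | 2.9 | 0.4 | 4.1 | 0.2 | -6.6 | -0.7 | 0.3 | -0.3 |
| Oxford | 2.0 | 0.0 | 0.6 | -0.4 | -5.2 | -1.1 | 2.2 | 2.1 |
| Epsom and Ewell | 1.6 | 0.4 | 3.9 | 1.1 | -5.2 | -0.9 | 0.0 | -0.6 |
| Watford | 0.7 | 2.0 | 3.1 | 0.4 | -4.5 | -1.2 | 0.0 | -0.1 |
| Tandridge | 3.3 | 0.2 | 4.0 | -0.1 | -4.5 | -1.1 | 0.0 | -1.3 |
| Mole Valley | 3.3 | 0.1 | 1.9 | 0.3 | -4.4 | -0.7 | 0.2 | -0.3 |
| St Albans | 2.3 | 0.3 | 3.4 | -0.3 | -4.3 | -1.2 | 0.3 | -0.2 |
| Chiltern | 2.9 | 1.4 | 1.4 | 0.1 | -4.2 | -0.6 | -0.2 | -0.8 |
| Exeter | 0.7 | 0.0 | 1.0 | -0.6 | -4.2 | -1.5 | 1.7 | 3.4 |
| Woking | 2.1 | 0.1 | 3.7 | 0.0 | -4.2 | -1.3 | -0.1 | 0.0 |
| Reigate and Banstead | 1.8 | 0.1 | 3.2 | 0.6 | -4.1 | -1.0 | 0.1 | -0.2 |
| Waverley | 4.3 | 0.1 | 2.5 | -0.5 | -3.9 | -0.9 | -0.3 | -0.9 |
| Guildford | 2.7 | 0.1 | 2.4 | 0.2 | -3.6 | -1.2 | 0.0 | -0.3 |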
  1. Explain why these LAs are not necessarily the 15 LAs with the largest decreases in the percentage of people driving to work.
  2. The researcher wants to talk to those LAs outside London which have been most successful in encouraging people to change to cycling or walking to work.
    Suggest four LAs that he should talk to and why.
  3. The researcher claims that Waverley is the LA outside London which has had the largest increase in the number of people working mainly at or from home.
    Does the data support his claim? Explain your answer.
  4. Which two categories have replaced driving to work for the highest percentages of workers in these LAs? Support your answer with evidence from the table.
  5. The researcher suggested that there would be strong correlation between the decrease in the percentage driving to work and the increase in percentage working mainly at or from home. Without calculation, use data from the table to comment briefly on this suggestion.
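One possible way to shortlist authorities for part 2 is to add the "Bicycle" and "On foot" changes for each Local Authority and rank the totals. The Python sketch below does this with the values from the table. Summing the two columns is an assumption made here for illustration; the question itself does not prescribe a measure, and the variable names are hypothetical.

```python
# Changes in percentage from 2001 to 2011, copied from the table above:
# (bicycle, on foot) for each Local Authority.
active_travel = {
    "Brighton and Hove": (2.1, 2.3),
    "Cambridge": (3.1, 0.6),
    "Elmbridge": (0.3, -0.3),
    "Oxford": (2.2, 2.1),
    "Epsom and Ewell": (0.0, -0.6),
    "Watford": (0.0, -0.1),
    "Tandridge": (0.0, -1.3),
    "Mole Valley": (0.2, -0.3),
    "St Albans": (0.3, -0.2),
    "Chiltern": (-0.2, -0.8),
    "Exeter": (1.7, 3.4),
    "Woking": (-0.1, 0.0),
    "Reigate and Banstead": (0.1, -0.2),
    "Waverley": (-0.3, -0.9),
    "Guildford": (0.0, -0.3),
}

# Assumed measure: combined change in cycling and walking, rounded to 1 d.p.
combined = {
    la: round(bike + foot, 1) for la, (bike, foot) in active_travel.items()
}

# Rank from largest combined increase to smallest.
ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
top_four = [la for la, _ in ranked[:4]]

for la, change in ranked[:4]:
    print(f"{la}: {change:+.1f}")
```

A written answer would still need to quote the supporting figures, and the ranking covers only the 15 authorities listed in the table.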
OCR PURE Q11
6 marks Moderate -0.5
11 A student is investigating changes in the number of residents in Local Authorities in the South-East Region between 2001 and 2011. The scatter diagram shows the number \(x\) of residents in these Local Authorities in the age group 8 to 9 in 2001 and the number \(y\) of residents in the same Local Authorities in the age group 18 to 19 in 2011.
[diagram]
  1. Suggest a reason why the student is comparing these two age groups in 2001 and 2011.
  2. The student notices that most of the data points are close to the line \(y = x\).
    1. Explain what this suggests about the residents in these Local Authorities.
    2. The student says that correlation does not imply causation, so there is no causal link between the values of \(x\) and the values of \(y\). Explain whether or not they are correct.
  3. Some of these Local Authorities contain universities.
    1. On the diagram in the Printed Answer Booklet, circle three points that are likely to represent Local Authorities containing universities.
    2. Give a reason for your choice of points in part (c)(i).
  Assume that the proportion of residents in age group 8 to 9 in 2001 was roughly the same in each Local Authority in the South-East. The Local Authority in this region with the largest population is Medway.
  4. On the diagram in the Printed Answer Booklet, label clearly with the letter \(M\) the point that corresponds to Medway.
OCR MEI Paper 2 2019 June Q14
9 marks Moderate -0.8
14 The pre-release material includes data concerning crude death rates in different countries of the world. Fig. 14.1 shows some information concerning crude death rates in countries in Europe and in Africa. \begin{table}[h]
| | Europe | Africa |
| --- | --- | --- |
| \(n\) | 48 | 56 |
| minimum | 6.28 | 3.58 |
| lower quartile | 8.50 | 7.31 |
| median | 9.53 | 8.71 |
| upper quartile | 11.41 | 11.93 |
| maximum | 14.46 | 14.89 |
\captionsetup{labelformat=empty} \caption{Fig. 14.1}
\end{table}
  1. Use your knowledge of the large data set to suggest a reason why the statistics in Fig. 14.1 refer to only 48 of the 51 European countries.
  2. Use the information in Fig. 14.1 to show that there are no outliers in either data set.
  The crude death rate in Libya is recorded as 3.58 and the population of Libya is recorded as 6411776.
  3. Calculate an estimate of the number of deaths in Libya in a year.
  The median age in Germany is 46.5 and the crude death rate is 11.42. The median age in Cyprus is 36.1 and the crude death rate is 6.62.
  4. Explain why a country like Germany, with a higher median age than Cyprus, might also be expected to have a higher crude death rate than Cyprus.
  Fig. 14.2 shows a scatter diagram of median age against crude death rate for countries in Africa and Fig. 14.3 shows a scatter diagram of median age against crude death rate for countries in Europe.
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{95eb3bcc-6d3c-4f7e-9b27-5e046ab57ec5-10_678_1221_1975_248} \captionsetup{labelformat=empty} \caption{Fig. 14.2}
    \end{figure}
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{95eb3bcc-6d3c-4f7e-9b27-5e046ab57ec5-11_588_1248_223_228} \captionsetup{labelformat=empty} \caption{Fig. 14.3}
    \end{figure}
  The rank correlation coefficient for the data shown in Fig. 14.2 is -0.281206.
  The rank correlation coefficient for the data shown in Fig. 14.3 is 0.335215.
  5. Compare and contrast what may be inferred about the relationship between median age and crude death rate in countries in Africa and in countries in Europe.
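The arithmetic behind parts 2 and 3 can be sketched in Python as follows. Two assumptions are made that the question text does not state explicitly: outliers are taken to be values more than 1.5 × IQR beyond the nearer quartile, and the crude death rate is taken to be deaths per 1000 of population per year. The names `summary`, `fences` and `has_outliers` are hypothetical.

```python
# Summary statistics copied from Fig. 14.1.
summary = {
    "Europe": {"min": 6.28, "q1": 8.50, "q3": 11.41, "max": 14.46},
    "Africa": {"min": 3.58, "q1": 7.31, "q3": 11.93, "max": 14.89},
}


def fences(stats):
    """Lower and upper outlier limits under the assumed 1.5 x IQR rule."""
    iqr = stats["q3"] - stats["q1"]
    return stats["q1"] - 1.5 * iqr, stats["q3"] + 1.5 * iqr


def has_outliers(stats):
    """True if the minimum or maximum lies outside the fences."""
    lower, upper = fences(stats)
    return stats["min"] < lower or stats["max"] > upper


outlier_flags = {region: has_outliers(s) for region, s in summary.items()}

for region, s in summary.items():
    lower, upper = fences(s)
    print(f"{region}: fences {lower:.3f} to {upper:.3f}, "
          f"outliers present: {outlier_flags[region]}")

# Part 3, assuming the rate is deaths per 1000 population per year.
libya_rate_per_1000 = 3.58
libya_population = 6_411_776
libya_deaths = libya_rate_per_1000 / 1000 * libya_population
print(f"Estimated deaths in Libya in a year: {libya_deaths:.0f}")
```

Checking only the minimum and maximum is enough here: if neither extreme lies beyond a fence, no other value can.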
OCR MEI Paper 2 2023 June Q14
8 marks Moderate -0.8
14 The pre-release material contains information concerning the median income of taxpayers in \(\pounds\) and the percentage of all pupils at the end of KS4 achieving 5 or more GCSEs at grade A*-C, including English and Maths, for different areas of London. Some of the data for 2014/15 is shown in Fig. 14.1. \begin{table}[h]
\captionsetup{labelformat=empty} \caption{Fig. 14.1}
| Area | Median Income of Taxpayers in £ | Percentage of Pupils Achieving 5 or more A*-C, including English and Maths |
| --- | --- | --- |
| City of London | 61100 | \#N/A |
| Barking and Dagenham | 21800 | 54.0 |
| Barnet | 27100 | 70.1 |
| Bexley | 24400 | 55.0 |
| Brent | 22700 | 60.0 |
| Bromley | 28100 | 68.0 |
\end{table} A student investigated whether there is any relationship between median income of taxpayers and percentage of pupils achieving 5 or more GCSEs at grade A*-C, including English and Maths.
  1. With reference to Fig. 14.1, explain how the data should be cleaned before any analysis can take place.
  After the data was cleaned, the student used software to draw the scatter diagram shown in Fig. 14.2.
  Scatter diagram to show percentage of pupils achieving 5 A*-C grades against median income of taxpayers
    \begin{figure}[h]
    \captionsetup{labelformat=empty} \caption{Fig. 14.2} \includegraphics[alt={},max width=\textwidth]{11788aaf-98fb-4a78-8a40-a40743b1fe15-10_574_1481_1900_241}
    \end{figure}
  The student calculated that the product moment correlation coefficient for these data is 0.3743.
  2. Give two reasons why it may not be appropriate to use a linear model for the relationship between median income of taxpayers in \(\pounds\) and the percentage of all pupils at the end of KS4 achieving 5 or more GCSEs at grade A*-C.
  The student carried out some further analysis. The results are shown in Fig. 14.3.
    \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{Fig. 14.3}
    | | Median income of taxpayers in \(\pounds\) | Percentage of pupils achieving 5+ A*-C |
    | --- | --- | --- |
    | Mean | 27216 | 61.0 |
    | Standard deviation | 4177.5 | 5.32 |
    \end{table}
  The student identified three outliers in total. The student decided to remove these outliers and recalculate the product moment correlation coefficient.
  3. Explain whether the new value of the product moment correlation coefficient would be between 0.3743 and 1 or between 0 and 0.3743.
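The cleaning step and an outlier check can be sketched in Python using only the six rows shown in Fig. 14.1 and the summary statistics in Fig. 14.3. The question does not say how the student identified the outliers; the rule used here (more than 2 standard deviations from the mean) is an assumption, and the names `cleaned`, `is_outlier` and `flagged` are hypothetical. The three outliers mentioned in the question are presumably among areas not shown in the extract.

```python
# Rows shown in Fig. 14.1: (area, median income in pounds, percentage
# achieving 5 or more A*-C). None stands in for the "#N/A" entry.
fig_14_1 = [
    ("City of London", 61100, None),
    ("Barking and Dagenham", 21800, 54.0),
    ("Barnet", 27100, 70.1),
    ("Bexley", 24400, 55.0),
    ("Brent", 22700, 60.0),
    ("Bromley", 28100, 68.0),
]

# Cleaning: drop any area where one of the two values is missing.
cleaned = [row for row in fig_14_1 if row[1] is not None and row[2] is not None]

# Summary statistics copied from Fig. 14.3.
MEAN_INCOME, SD_INCOME = 27216, 4177.5
MEAN_PCT, SD_PCT = 61.0, 5.32


def is_outlier(value, mean, sd, k=2):
    """Assumed rule: more than k standard deviations away from the mean."""
    return abs(value - mean) > k * sd


# Flag an area if either of its values is an outlier under the assumed rule.
flagged = [
    area
    for area, income, pct in cleaned
    if is_outlier(income, MEAN_INCOME, SD_INCOME)
    or is_outlier(pct, MEAN_PCT, SD_PCT)
]

print("Areas kept after cleaning:", [area for area, _, _ in cleaned])
print("Flagged among the rows shown:", flagged)
```

None of the five rows that survive cleaning is flagged under this rule, which is consistent with the extract being only part of the full data set.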
OCR H240/02 2023 June Q13
10 marks Easy -1.8
The scatter diagram uses information about all the Local Authorities (LAs) in the UK, taken from the 2011 census. For each LA it shows the percentage (\(x\)) of employees who used public transport to travel to work and the percentage (\(y\)) who used motorised private transport. "Public transport" includes train, bus, minibus, coach, underground, metro and light rail. "Motorised private transport" includes car, van, motorcycle, scooter, moped, taxi and passenger in a car or van. \includegraphics{figure_13}
  1. Most of the points in the diagram lie on or near the line with equation \(x + y = k\), where \(k\) is a constant.
    1. Give a possible value for \(k\). [1]
    2. Hence give an approximate value for the percentage of employees who either worked from home or walked or cycled to work. [1]
  2. The average amount of fuel used per person per day for travelling to work in any LA is denoted by \(F\). Consider the two groups of LAs where the percentages using motorised private transport are highest and lowest.
    1. Using only the information in the diagram, suggest, with a reason, which of these two groups will have greater values of \(F\) than the other group. [1]
    A student says that it is not possible to give a reliable answer to part (b)(i) without some further information.
    2. Suggest two kinds of further information which would enable a more reliable answer to be given. [2]
  3. Points \(A\) and \(B\) in the diagram are the most extreme outliers. Use their positions on the diagram to answer the following questions about the two LAs represented by these two points.
    1. The two LAs share a certain characteristic. Describe, with a justification, this characteristic. [2]
    2. The environments in these two LAs are very different. Describe, with a justification, this difference. [2]
  4. A student says that it is difficult to extract detailed information from the scatter diagram. Explain whether you agree with this criticism. [1]
OCR PURE Q12
4 marks Easy -2.3
This question deals with information about the populations of Local Authorities (LAs) in the North of England, taken from the 2011 census. \includegraphics{figure_6} Fig. 1 and Fig. 2 both show strong correlation, but of two different kinds.
  1. For each diagram, use a single word to describe the kind of correlation shown. [1]
  2. For each diagram, suggest a reason, in context, why the correlation is of the particular kind described in part (a). [2]
Fig. 3 is the same as Fig. 2 but with the point \(A\) marked. Fig. 4 shows information about the same LAs as Fig. 2 and Fig. 3. \includegraphics{figure_7}
  3. Point \(A\) in Fig. 3 and point \(B\) in Fig. 4 represent the same LA. Explain how you can tell that this LA has a large population. [1]
OCR H240/02 2017 Specimen Q13
5 marks Moderate -0.8
The table and the four scatter diagrams below show data taken from the 2011 UK census for four regions. On the scatter diagrams the names have been replaced by letters. The table shows, for each region, the mean and standard deviation of the proportion of workers in each Local Authority who travel to work by driving a car or van and the proportion of workers in each Local Authority who travel to work as a passenger in a car or van. Each scatter diagram shows, for each of the Local Authorities in a particular region, the proportion of workers who travel to work by driving a car or van and the proportion of workers who travel to work as a passenger in a car or van. \includegraphics{figure_13}
  1. Using the values given in the table, match each region to its corresponding scatter diagram, explaining your reasoning. [3]
  2. Steven claims that the outlier in the scatter diagram for Region C consists of a group of small islands. Explain whether or not the data given above support his claim. [1]
  3. One of the Local Authorities in Region B consists of a single large island. Explain whether or not you would expect this Local Authority to appear as an outlier in the scatter diagram for Region B. [1]