Histogram from discrete rounded data

Questions in which data are recorded to the nearest unit (e.g. classes 10-19, 20-29 for lengths to the nearest cm), so the class limits must be converted to continuous boundaries (9.5-19.5, 19.5-29.5) before frequency densities are calculated.
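A minimal Python sketch of this conversion, offered as an illustration rather than a mark-scheme method: each recorded limit is widened by half a unit to give the class boundaries, and the frequency is then divided by the class width. The function name `frequency_densities` and the `unit` parameter are assumptions made for this example; the sample figures are the mass classes from the CAIE S1 2018 June Q1 table in this set.

```python
def frequency_densities(classes, frequencies, unit=1.0):
    """Return (lower boundary, upper boundary, width, frequency density) per class.

    `classes` holds (lower, upper) limits as recorded to the nearest `unit`.
    """
    half = unit / 2
    rows = []
    for (lower, upper), freq in zip(classes, frequencies):
        lb, ub = lower - half, upper + half  # continuous class boundaries
        width = ub - lb
        rows.append((lb, ub, width, freq / width))
    return rows


# Masses recorded to the nearest kilogram (CAIE S1 2018 June Q1).
classes = [(10, 14), (15, 19), (20, 24), (25, 34), (35, 59)]
frequencies = [6, 12, 14, 10, 8]

for lb, ub, width, fd in frequency_densities(classes, frequencies):
    print(f"{lb}-{ub}: width {width}, frequency density {fd:.2f}")
```

Note that the class 10-14 has width 5, not 4, once the boundaries 9.5 and 14.5 are used; using the recorded limits directly is the usual source of error in these questions.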

6 questions

CAIE S1 2024 November Q4
4 On a certain day, the heights of 150 sunflower plants grown by children at a local school are measured, correct to the nearest cm. These heights are summarised in the following table.
| Height (cm) | \(10 - 19\) | \(20 - 29\) | \(30 - 39\) | \(40 - 44\) | \(45 - 49\) | \(50 - 54\) | \(55 - 59\) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency | 10 | 18 | 32 | 42 | 28 | 14 | 6 |
  1. Draw a cumulative frequency graph to illustrate the data.
    \includegraphics[max width=\textwidth, alt={}, center]{915661eb-2544-4293-af72-608fedb43d70-06_1600_1301_760_383}
  2. Use your graph to estimate the 30th percentile of the heights of the sunflower plants.
  3. Calculate estimates for the mean and the standard deviation of the heights of the 150 sunflower plants.
CAIE S1 2018 June Q1
1 The masses in kilograms of 50 children having a medical check-up were recorded correct to the nearest kilogram. The results are shown in the table.
| Mass (kg) | \(10 - 14\) | \(15 - 19\) | \(20 - 24\) | \(25 - 34\) | \(35 - 59\) |
| --- | --- | --- | --- | --- | --- |
| Frequency | 6 | 12 | 14 | 10 | 8 |
  1. Find which class interval contains the lower quartile.
  2. On the grid, draw a histogram to illustrate the data in the table.
    \includegraphics[max width=\textwidth, alt={}, center]{dd75fa20-fead-48d6-aff4-c5e733769f9f-02_1397_1397_1187_415}
Edexcel S1 2014 January Q2
2. A rugby club coach uses club records to take a random sample of 15 players from 1990 and an independent random sample of 15 players from 2010. The body weight of each player was recorded to the nearest kg and the results from 2010 are summarised in the table below.
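| Body weight (kg) | 75-79 | 80-84 | 85-89 | 90-94 | 95-99 | 100-104 | 105-109 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of Players (2010) | 1 | 2 | 2 | 4 | 3 | 2 | 1 |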
  1. Find the estimated values in kg of the summary statistics \(a\), \(b\) and \(c\) in the table below.
    | | Estimate in 1990 | Estimate in 2010 |
    | --- | --- | --- |
    | Mean | 83.0 | \(a\) |
    | Median | 82.0 | \(b\) |
    | Variance | 44.0 | \(c\) |

    Give your answers to 3 significant figures.
    The rugby coach claims that players’ body weight increased between 1990 and 2010.
  2. Using the table in part (a), comment on the rugby coach's claim.
Edexcel S1 2007 January Q4
4. Summarised below are the distances, to the nearest mile, travelled to work by a random sample of 120 commuters.
| Distance (to the nearest mile) | Number of commuters |
| --- | --- |
| 0-9 | 10 |
| 10-19 | 19 |
| 20-29 | 43 |
| 30-39 | 25 |
| 40-49 | 8 |
| 50-59 | 6 |
| 60-69 | 5 |
| 70-79 | 3 |
| 80-89 | 1 |
For this distribution,
  1. describe its shape,
  2. use linear interpolation to estimate its median.
    The mid-point of each class was represented by \(x\) and its corresponding frequency by \(f\) giving $$\Sigma f x = 3550 \text { and } \Sigma f x ^ { 2 } = 138020$$
  3. Estimate the mean and the standard deviation of this distribution.
    One coefficient of skewness is given by $$\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } } .$$
  4. Evaluate this coefficient for this distribution.
  5. State whether or not the value of your coefficient is consistent with your description in part (a). Justify your answer.
  6. State, with a reason, whether you should use the mean or the median to represent the data in this distribution.
  7. State the circumstance under which it would not matter whether you used the mean or the median to represent a set of data.
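The calculations asked for in this question can be sketched in Python as follows. This is an illustrative outline, not the official mark scheme: it assumes the \(n/2\) convention for the median of grouped data and class boundaries half a mile either side of the recorded limits, and the helper name `interpolated_median` is invented for this example. The sums \(\Sigma f x\) and \(\Sigma f x^2\) are the values quoted in the question.

```python
from math import sqrt

# Class limits (nearest mile) and frequencies from the question's table.
classes = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49),
           (50, 59), (60, 69), (70, 79), (80, 89)]
frequencies = [10, 19, 43, 25, 8, 6, 5, 3, 1]

# Summary sums quoted in the question.
sum_fx, sum_fx2 = 3550, 138020


def interpolated_median(classes, frequencies, unit=1.0):
    """Estimate the median by linear interpolation within the median class.

    Assumes the n/2 convention and boundaries half a unit either side
    of the recorded class limits.
    """
    target = sum(frequencies) / 2
    cumulative = 0
    for (lower, upper), freq in zip(classes, frequencies):
        if cumulative + freq >= target:
            lb, ub = lower - unit / 2, upper + unit / 2
            return lb + (target - cumulative) / freq * (ub - lb)
        cumulative += freq
    raise ValueError("no data")


n = sum(frequencies)
mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean ** 2)   # sqrt of (mean of squares - square of mean)
median = interpolated_median(classes, frequencies)
skew = 3 * (mean - median) / sd      # coefficient defined in the question

print(f"median {median:.1f}, mean {mean:.1f}, sd {sd:.1f}, skewness {skew:.2f}")
```

The median class is 20-29, so the interpolation runs from 19.5 to 29.5; the boundary of the first class does not affect the result here.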
Edexcel S1 2006 June Q2
2. Sunita and Shelley talk to one another once a week on the telephone. Over many weeks they recorded, to the nearest minute, the number of minutes spent in conversation on each occasion. The following table summarises their results.
| Time (to the nearest minute) | Number of Conversations |
| --- | --- |
| \(5 - 9\) | 2 |
| \(10 - 14\) | 9 |
| \(15 - 19\) | 20 |
| \(20 - 24\) | 13 |
| \(25 - 29\) | 8 |
| \(30 - 34\) | 3 |
Two of the conversations were chosen at random.
  1. Find the probability that both of them were longer than 24.5 minutes.
    The mid-point of each class was represented by \(x\) and its corresponding frequency by \(f\), giving \(\Sigma f x = 1060\).
  2. Calculate an estimate of the mean time spent on their conversations.
    During the following 25 weeks they monitored their weekly conversations and found that at the end of the 80 weeks their overall mean length of conversation was 21 minutes.
  3. Find the mean time spent in conversation during these 25 weeks.
  4. Comment on these two mean values.
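As a hedged outline of the arithmetic behind the combined mean (a working sketch, not the official mark scheme): the table covers 55 conversations with \(\Sigma f x = 1060\), and an overall mean of 21 minutes over 80 weeks fixes the combined total, so the total for the extra 25 weeks follows by subtraction.

$$\text{total over 80 weeks} = 80 \times 21 = 1680, \qquad \text{mean over the 25 weeks} = \frac{1680 - 1060}{25} = \frac{620}{25} = 24.8 \text{ minutes}$$

For comparison, the estimated mean for the first 55 weeks is \(1060 / 55 \approx 19.3\) minutes.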
Edexcel S1 2011 June Q5
5. A class of students had a sudoku competition. The time taken for each student to complete the sudoku was recorded to the nearest minute and the results are summarised in the table below.
| Time | Mid-point, \(x\) | Frequency, f |
| --- | --- | --- |
| 2-8 | 5 | 2 |
| 9-12 | | 7 |
| 13-15 | 14 | 5 |
| 16-18 | 17 | 8 |
| 19-22 | 20.5 | 4 |
| 23-30 | 26.5 | 4 |
$$\text { (You may use } \sum \mathrm { f } x ^ { 2 } = 8603.75 \text { ) }$$
  1. Write down the mid-point for the 9-12 interval.
  2. Use linear interpolation to estimate the median time taken by the students.
  3. Estimate the mean and standard deviation of the times taken by the students.
    The teacher suggested that a normal distribution could be used to model the times taken by the students to complete the sudoku.
  4. Give a reason to support the use of a normal distribution in this case.
    On another occasion the teacher calculated the quartiles for the times taken by the students to complete a different sudoku and found $$Q _ { 1 } = 8.5 \quad Q _ { 2 } = 13.0 \quad Q _ { 3 } = 21.0$$
  5. Describe, giving a reason, the skewness of the times on this occasion.
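A short Python sketch of the grouped mean and standard deviation for the sudoku table, given as an illustration under stated assumptions rather than a model answer. Mid-points are taken as the average of the recorded class limits, which equals the average of the boundaries because the half-minute adjustments cancel. The variable names are chosen for this example.

```python
from math import sqrt

# Class limits (nearest minute) and frequencies from the sudoku table.
classes = [(2, 8), (9, 12), (13, 15), (16, 18), (19, 22), (23, 30)]
frequencies = [2, 7, 5, 8, 4, 4]

# Mid-point of each class; the same whether limits or boundaries are used.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean ** 2)

print(f"n = {n}, sum fx = {sum_fx}, sum fx^2 = {sum_fx2}")
print(f"mean {mean:.2f}, sd {sd:.2f}")
```

The computed \(\Sigma f x^2\) can be checked against the value 8603.75 quoted in the question, which confirms the mid-points are being formed as intended.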