2.02a Interpret single variable data: tables and diagrams

209 questions

Edexcel S1 2024 June Q3
14 marks Moderate -0.8
  1. The lengths, \(x \mathrm {~mm}\), of 50 pebbles are summarised in the table below.

| Length | Frequency |
|---|---|
| \(20 \leqslant x < 30\) | 2 |
| \(30 \leqslant x < 32\) | 16 |
| \(32 \leqslant x < 36\) | 20 |
| \(36 \leqslant x < 40\) | 8 |
| \(40 \leqslant x < 45\) | 3 |
| \(45 \leqslant x < 50\) | 1 |

A histogram is drawn to represent these data.
The bar representing the class \(32 \leqslant x < 36\) is 2.5 cm wide and 7.5 cm tall.
  1. Calculate the width and the height of the bar representing the class \(30 \leqslant x < 32\)
  2. Using linear interpolation, estimate the median of \(x\).
The weight, \(w\) grams, of each of the 50 pebbles is coded using \(10 y = w - 20\). These coded data are summarised by $$\sum y = 104 \quad \sum y ^ { 2 } = 233.54$$
  3. Show that the mean of \(w\) is 40.8
  4. Calculate the standard deviation of \(w\).
The weight of a pebble recorded as 40.8 grams is added to the sample.
  5. Without carrying out any further calculations, state, giving a reason, what effect this would have on the value of
    1. the mean of \(w\)
    2. the standard deviation of \(w\)
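The histogram scaling and the interpolated median in this question can be sketched in Python as below. This is a minimal sketch, not a mark scheme: the function and variable names are illustrative assumptions, and bar height is assumed to be proportional to frequency density, which is the usual histogram convention.

```python
# Sketch only: names are illustrative, not part of the question.
# Each class is (lower bound, upper bound, frequency), taken from the pebble table.
classes = [(20, 30, 2), (30, 32, 16), (32, 36, 20), (36, 40, 8), (40, 45, 3), (45, 50, 1)]

def bar_dimensions(ref_class, ref_width_cm, ref_height_cm, target_class):
    """Scale a bar from a reference bar, assuming height is proportional to frequency density."""
    lo, hi, f = ref_class
    cm_per_unit = ref_width_cm / (hi - lo)            # cm of width per mm of class width
    cm_per_density = ref_height_cm / (f / (hi - lo))  # cm of height per unit of frequency density
    t_lo, t_hi, t_f = target_class
    width = cm_per_unit * (t_hi - t_lo)
    height = cm_per_density * (t_f / (t_hi - t_lo))
    return width, height

def interpolated_quantile(classes, fraction):
    """Estimate a quantile by linear interpolation inside the class that contains it."""
    n = sum(f for _, _, f in classes)
    target = fraction * n
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= target:
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("fraction must be between 0 and 1")

# Reference bar: class 32-36 drawn 2.5 cm wide and 7.5 cm tall; target: class 30-32.
width, height = bar_dimensions(classes[2], 2.5, 7.5, classes[1])
median = interpolated_quantile(classes, 0.5)
print(width, height, median)
```

The same `interpolated_quantile` helper would give quartile estimates with `fraction` set to 0.25 or 0.75.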
Edexcel S1 2018 October Q3
13 marks Moderate -0.8
3. The parking times, \(t\) hours, for cars in a car park are summarised below.

| Time (t hours) | Frequency (f) | Time midpoint (m) |
|---|---|---|
| \(0 \leqslant t < 1\) | 10 | 0.5 |
| \(1 \leqslant t < 2\) | 18 | 1.5 |
| \(2 \leqslant t < 4\) | 15 | 3 |
| \(4 \leqslant t < 6\) | 12 | 5 |
| \(6 \leqslant t < 12\) | 5 | 9 |

$$\text { (You may use } \sum \mathrm { fm } = 182 \text { and } \sum \mathrm { fm } ^ { 2 } = 883 \text { ) }$$
A histogram is drawn to represent these data.
The bar representing the time \(1 \leqslant t < 2\) has a width of 1.5 cm and a height of 6 cm.
  1. Calculate the width and the height of the bar representing the time \(4 \leqslant t < 6\)
  2. Use linear interpolation to estimate the median parking time for the cars in the car park.
  3. Estimate the mean and the standard deviation of the parking time for the cars in the car park.
  4. Describe, giving a reason, the skewness of the data.
One of these cars is selected at random.
  5. Estimate the probability that this car is parked for more than 75 minutes.
Edexcel S1 2022 October Q1
11 marks Moderate -0.8
  1. The stem lengths of a sample of 120 tulips are recorded in the grouped frequency table below.

| Stem length (cm) | Frequency |
|---|---|
| \(40 \leqslant x < 42\) | 12 |
| \(42 \leqslant x < 45\) | 18 |
| \(45 \leqslant x < 50\) | 23 |
| \(50 \leqslant x < 55\) | 35 |
| \(55 \leqslant x < 58\) | 24 |
| \(58 \leqslant x < 60\) | 8 |

A histogram is drawn to represent these data.
The area of the bar representing the \(40 \leqslant x < 42\) class is \(16.5 \mathrm {~cm} ^ { 2 }\)
  1. Calculate the exact area of the bar representing the \(42 \leqslant x < 45\) class.
The height of the tallest bar in the histogram is 10 cm.
  2. Find the exact height of the second tallest bar.
\(Q _ { 1 }\) for these data is 45 cm.
  3. Use linear interpolation to find an estimate for
    1. \(Q _ { 2 }\)
    2. the interquartile range.
One measure of skewness is given by $$\frac { Q _ { 3 } - 2 Q _ { 2 } + Q _ { 1 } } { Q _ { 3 } - Q _ { 1 } }$$
  4. By calculating this measure, describe the skewness of these data.
Edexcel S1 2018 Specimen Q2
3 marks Moderate -0.8
  1. The time taken to complete a puzzle, in minutes, is recorded for each person in a club. The times are summarised in a grouped frequency distribution and represented by a histogram.
One of the class intervals has a frequency of 20 and is shown by a bar of width 1.5 cm and height 12 cm on the histogram. The total area under the histogram is \(94.5 \mathrm {~cm} ^ { 2 }\) Find the number of people in the club.
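The idea behind this question is that area on a histogram is proportional to frequency, so one known bar fixes the area that represents one person. A minimal Python sketch follows; the function name `total_frequency` is a hypothetical helper, not something from the question.

```python
# Sketch: in a histogram, bar area is proportional to frequency.
def total_frequency(bar_width_cm, bar_height_cm, bar_frequency, total_area_cm2):
    """Return the total frequency implied by one known bar and the total area."""
    area_per_person = bar_width_cm * bar_height_cm / bar_frequency
    return total_area_cm2 / area_per_person

# Known bar: frequency 20, width 1.5 cm, height 12 cm; total area 94.5 cm^2.
people = total_frequency(1.5, 12, 20, 94.5)
print(round(people))
```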
Edexcel S1 2003 January Q4
16 marks Easy -1.2
4. A restaurant owner is concerned about the amount of time customers have to wait before being served. He collects data on the waiting times, to the nearest minute, of 20 customers. These data are listed below.
15,14,16,15,17,16,15,14,15,16,
17,16,15,14,16,17,15,25,18,16
  1. Find the median and inter-quartile range of the waiting times.
An outlier is an observation that falls either \(1.5 \times\) (inter-quartile range) above the upper quartile or \(1.5 \times\) (inter-quartile range) below the lower quartile.
  2. Draw a boxplot to represent these data, clearly indicating any outliers.
  3. Find the mean of these data.
  4. Comment on the skewness of these data. Justify your answer.
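The quartile and outlier checks for the 20 waiting times can be sketched in Python as below. The quartile rule used here is an assumption: it is one common textbook convention (average two values when the position is a whole number, otherwise round up), and other conventions can give slightly different quartiles. The helper name `quartile` is illustrative.

```python
import statistics

# Waiting times, in minutes, as listed in the question.
times = [15, 14, 16, 15, 17, 16, 15, 14, 15, 16,
         17, 16, 15, 14, 16, 17, 15, 25, 18, 16]

def quartile(sorted_values, fraction):
    """Assumed convention: if fraction * n is a whole number, average that
    value and the next one; otherwise round up to the next position."""
    n = len(sorted_values)
    position = fraction * n
    if position == int(position):
        k = int(position)
        return (sorted_values[k - 1] + sorted_values[k]) / 2
    return sorted_values[int(position)]  # rounded-up position as a 0-based index

data = sorted(times)
q1, q2, q3 = (quartile(data, f) for f in (0.25, 0.5, 0.75))
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [t for t in data if t < lower_fence or t > upper_fence]
mean = statistics.mean(times)
print(q1, q2, q3, iqr, outliers, mean)
```

Comparing `mean` with `q2` gives one way to support a comment on skewness.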
Edexcel S1 2005 January Q2
14 marks Easy -1.8
2. The number of caravans on Seaview caravan site on each night in August last year is summarised in the following stem and leaf diagram.

| Caravans | 1 \| 0 means 10 | Totals |
|---|---|---|
| 1 | 0 | (2) |
| 2 | 1 8 | (4) |
| 3 | 0 3 4 7 | (8) |
| 4 | 1 5 8 8 | (9) |
| 5 | 2 6 7 | (5) |
| 6 | 2 | (3) |

  1. Find the three quartiles of these data.
During the same month, the least number of caravans on Northcliffe caravan site was 31. The maximum number of caravans on this site on any night that month was 72. The three quartiles for this site were 38, 45 and 52 respectively.
  2. On graph paper and using the same scale, draw box plots to represent the data for both caravan sites. You may assume that there are no outliers.
  3. Compare and contrast these two box plots.
  4. Give an interpretation to the upper quartiles of these two distributions.
Edexcel S1 2006 January Q1
14 marks Easy -1.3
  1. Over a period of time, the number of people \(x\) leaving a hotel each morning was recorded. These data are summarised in the stem and leaf diagram below.
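
| Number leaving | 3 \| 2 means 32 | Totals |
|---|---|---|
| 2 | 7 9 9 | (3) |
| 3 | 2 2 3 5 6 | (5) |
| 4 | 0 1 4 8 9 | (5) |
| 5 | 2 3 3 6 6 6 | (7) |
| 6 | 0 1 4 5 | (4) |
| 7 | 2 3 | (2) |
| 8 | 1 | (1) |
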
For these data,
  1. write down the mode,
  2. find the values of the three quartiles.
Given that \(\Sigma x = 1335\) and \(\Sigma x ^ { 2 } = 71801\), find
  3. the mean and the standard deviation of these data.
One measure of skewness is found using $$\frac { \text { mean } - \text { mode } } { \text { standard deviation } }$$
  4. Evaluate this measure to show that these data are negatively skewed.
  5. Give two other reasons why these data are negatively skewed.
Edexcel S1 2007 January Q4
14 marks Moderate -0.8
  1. Summarised below are the distances, to the nearest mile, travelled to work by a random sample of 120 commuters.

| Distance (to the nearest mile) | Number of commuters |
|---|---|
| 0-9 | 10 |
| 10-19 | 19 |
| 20-29 | 43 |
| 30-39 | 25 |
| 40-49 | 8 |
| 50-59 | 6 |
| 60-69 | 5 |
| 70-79 | 3 |
| 80-89 | 1 |

For this distribution,
  1. describe its shape,
  2. use linear interpolation to estimate its median.
The mid-point of each class was represented by \(x\) and its corresponding frequency by \(f\) giving $$\Sigma f x = 3550 \text { and } \Sigma f x ^ { 2 } = 138020$$
  3. Estimate the mean and the standard deviation of this distribution.
One coefficient of skewness is given by $$\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  4. Evaluate this coefficient for this distribution.
  5. State whether or not the value of your coefficient is consistent with your description in part (a). Justify your answer.
  6. State, with a reason, whether you should use the mean or the median to represent the data in this distribution.
  7. State the circumstance under which it would not matter whether you used the mean or the median to represent a set of data.
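The summary-statistics chain in this question (mean and standard deviation from \(\Sigma f x\) and \(\Sigma f x ^ { 2 }\), then the median, then the skewness coefficient) can be sketched in Python as below. This is a sketch under one stated assumption: because distances are recorded to the nearest mile, the 20-29 class is taken to have boundaries 19.5 and 29.5. Variable names are illustrative.

```python
from math import sqrt

# Totals given in the question for the 120 commuters.
n = 120
sum_fx, sum_fx2 = 3550, 138020

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2
sd = sqrt(variance)

# The median (60th value) lies in the 20-29 class.
# Assumption: class boundaries are 19.5 and 29.5, so the class width is 10.
lower, width = 19.5, 10
cum_before, class_freq = 10 + 19, 43
median = lower + (n / 2 - cum_before) / class_freq * width

# Coefficient of skewness as defined in the question.
skew = 3 * (mean - median) / sd
print(round(mean, 1), round(sd, 1), round(median, 1), round(skew, 2))
```

A positive value of `skew` corresponds to the mean lying above the median.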
Edexcel S1 2007 January Q5
7 marks Easy -1.2
  1. A teacher recorded, to the nearest hour, the time spent watching television during a particular week by each child in a random sample. The times were summarised in a grouped frequency table and represented by a histogram.
One of the classes in the grouped frequency distribution was 20-29 and its associated frequency was 9. On the histogram the height of the rectangle representing that class was 3.6 cm and the width was 2 cm.
  1. Give a reason to support the use of a histogram to represent these data.
  2. Write down the underlying feature associated with each of the bars in a histogram.
  3. Show that on this histogram each child was represented by \(0.8 \mathrm {~cm} ^ { 2 }\).
The total area under the histogram was \(24 \mathrm {~cm} ^ { 2 }\).
  4. Find the total number of children in the group.
Edexcel S1 2008 June Q2
14 marks Moderate -0.8
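2. The ages in years of the residents of two hotels are shown in the back-to-back stem and leaf diagram below.

| | Abbey Hotel | | Balmoral Hotel | |
|---|---|---|---|---|
| (1) | 2 | 0 | | |
| (4) | 9 7 5 1 | 1 | | |
| (4) | 9 8 3 1 | 2 | 6 | (1) |
| (11) | 9 9 9 9 7 6 6 5 3 3 2 | 3 | 4 4 7 | (3) |
| (6) | 9 8 7 7 5 0 | 4 | 0 0 5 5 6 9 | (6) |
| (1) | 8 | 5 | 0 0 0 0 1 3 6 6 7 | (9) |
| | | 6 | 2 3 3 4 5 7 | (6) |
| | | 7 | 0 1 5 | (3) |

Key: 8 | 5 | 0 means 58 years in Abbey Hotel and 50 years in Balmoral Hotel
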
For the Balmoral Hotel,
  1. write down the mode of the age of the residents,
  2. find the values of the lower quartile, the median and the upper quartile.
    1. Find the mean, \(\bar { x }\), of the age of the residents.
    2. Given that \(\sum x ^ { 2 } = 81213\) find the standard deviation of the age of the residents.
One measure of skewness is found using $$\frac { \text { mean } - \text { mode } } { \text { standard deviation } }$$
  3. Evaluate this measure for the Balmoral Hotel.
For the Abbey Hotel, the mode is 39, the mean is 33.2, the standard deviation is 12.7 and the measure of skewness is -0.454
  4. Compare the two age distributions of the residents of each hotel.
Edexcel S1 2013 June Q2
11 marks Easy -1.3
  1. The marks of a group of female students in a statistics test are summarised in Figure 1
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{6faf2dd2-a114-40b7-88ae-4a75dbfb4706-04_629_1102_342_429} \captionsetup{labelformat=empty} \caption{Figure 1}
\end{figure}
  1. Write down the mark which is exceeded by \(75 \%\) of the female students.
The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below.

| Mark | (2 \| 6 means 26) | Totals |
|---|---|---|
| 1 | 4 | (1) |
| 2 | 6 | (1) |
| 3 | 4 4 7 | (3) |
| 4 | 0 6 6 7 7 8 | (6) |
| 5 | 0 0 1 1 1 3 6 7 7 | (9) |
| 6 | 2 2 3 3 3 8 | (6) |
| 7 | 0 0 8 | (3) |
| 8 | 5 | (1) |
| 9 | 0 | (1) |

  2. Find the median and interquartile range of the marks of the male students.
An outlier is a mark that is either more than \(1.5 \times\) interquartile range above the upper quartile or more than \(1.5 \times\) interquartile range below the lower quartile.
  3. In the space provided on Figure 1 draw a box plot to represent the marks of the male students, indicating clearly any outliers.
  4. Compare and contrast the marks of the male and the female students.
Edexcel S1 2013 June Q4
14 marks Moderate -0.8
4. The following table summarises the times, \(t\) minutes to the nearest minute, recorded for a group of students to complete an exam.

| Time (minutes) \(t\) | \(11 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 35\) | \(36 - 45\) | \(46 - 60\) |
|---|---|---|---|---|---|---|
| Number of students f | 62 | 88 | 16 | 13 | 11 | 10 |

$$\text { [You may use } \sum \mathrm { f } t ^ { 2 } = 134281.25 \text { ] }$$
  1. Estimate the mean and standard deviation of these data.
  2. Use linear interpolation to estimate the value of the median.
  3. Show that the estimated value of the lower quartile is 18.6 to 3 significant figures.
  4. Estimate the interquartile range of this distribution.
  5. Give a reason why the mean and standard deviation are not the most appropriate summary statistics to use with these data.
The person timing the exam made an error and each student actually took 5 minutes less than the times recorded above. The table below summarises the actual times.

| Time (minutes) \(t\) | \(6 - 15\) | \(16 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 40\) | \(41 - 55\) |
|---|---|---|---|---|---|---|
| Number of students f | 62 | 88 | 16 | 13 | 11 | 10 |

  6. Without further calculations, explain the effect this would have on each of the estimates found in parts (a), (b), (c) and (d).
Edexcel S1 2014 June Q5
12 marks Moderate -0.8
  1. The table shows the time, to the nearest minute, spent waiting for a taxi by each of 80 people one Sunday afternoon.

| Waiting time (in minutes) | Frequency |
|---|---|
| \(2 - 4\) | 15 |
| \(5 - 6\) | 9 |
| 7 | 6 |
| 8 | 24 |
| \(9 - 10\) | 14 |
| \(11 - 15\) | 12 |

  1. Write down the upper class boundary for the \(2 - 4\) minute interval.
A histogram is drawn to represent these data. The height of the tallest bar is 6 cm.
  2. Calculate the height of the second tallest bar.
  3. Estimate the number of people with a waiting time between 3.5 minutes and 7 minutes.
  4. Use linear interpolation to estimate the median, the lower quartile and the upper quartile of the waiting times.
  5. Describe the skewness of these data, giving a reason for your answer.
Edexcel S1 2014 June Q1
9 marks Moderate -0.8
  1. A random sample of 35 homeowners was taken from each of the villages Greenslax and Penville and their ages were recorded. The results are summarised in the back-to-back stem and leaf diagram below.
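
| Totals | Greenslax | | Penville | Totals |
|---|---|---|---|---|
| (2) | 8 7 | 2 | 5 5 6 7 8 8 9 | (7) |
| (3) | 9 8 7 | 3 | 1 1 1 2 3 4 4 5 6 9 | (11) |
| (4) | 4 4 4 0 | 4 | 0 1 2 4 7 | (5) |
| (5) | 6 6 5 2 2 | 5 | 0 0 5 5 5 | (5) |
| (7) | 8 6 5 4 2 1 1 | 6 | 2 5 6 6 | (4) |
| (8) | 8 6 6 4 3 1 1 | 7 | 0 5 | (2) |
| (5) | 9 8 4 3 2 | 8 | | (0) |
| (1) | 4 | 9 | 9 | (1) |
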
Key: 7 | 3 | 1 means 37 years for Greenslax and 31 years for Penville
Some of the quartiles for these two distributions are given in the table below.

| | Greenslax | Penville |
|---|---|---|
| Lower quartile, \(Q _ { 1 }\) | \(a\) | 31 |
| Median, \(Q _ { 2 }\) | 64 | 39 |
| Upper quartile, \(Q _ { 3 }\) | \(b\) | 55 |

  1. Find the value of \(a\) and the value of \(b\).
An outlier is a value that falls either $$\begin{aligned} & \text { more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { above } Q _ { 3 } \\ & \text { or more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { below } Q _ { 1 } \end{aligned}$$
  2. On the graph paper opposite draw a box plot to represent the data from Penville. Show clearly any outliers.
  3. State the skewness of each distribution. Justify your answers.
\includegraphics[max width=\textwidth, alt={}, center]{8270bcae-494c-4248-8229-a72e9e84eab0-03_930_1237_1800_367}
Edexcel S1 2014 June Q6
11 marks Moderate -0.3
6. The times, in seconds, spent in a queue at a supermarket by 85 randomly selected customers, are summarised in the table below.

| Time (seconds) | Number of customers, \(f\) |
|---|---|
| 0-30 | 2 |
| 30-60 | 10 |
| 60-70 | 17 |
| 70-80 | 25 |
| 80-100 | 25 |
| 100-150 | 6 |

A histogram was drawn to represent these data. The \(30 - 60\) group was represented by a bar of width 1.5 cm and height 1 cm .
  1. Find the width and the height of the \(70 - 80\) group.
  2. Use linear interpolation to estimate the median of this distribution.
Given that \(x\) denotes the midpoint of each group in the table and $$\sum f x = 6460 \quad \sum f x ^ { 2 } = 529400$$
  3. calculate an estimate for
    1. the mean,
    2. the standard deviation,
      for the above data.
One measure of skewness is given by $$\text { coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  4. Evaluate this coefficient and comment on the skewness of these data.
Edexcel S1 2015 June Q1
14 marks Easy -1.2
  1. Each of 60 students was asked to draw a \(20 ^ { \circ }\) angle without using a protractor. The size of each angle drawn was measured. The results are summarised in the box plot below. \includegraphics[max width=\textwidth, alt={}, center]{9626e3ce-35d6-41b5-a0bd-1185f38b9e36-02_371_1040_340_461}
    1. Find the range for these data.
    2. Find the interquartile range for these data.
    The students were then asked to draw a \(70 ^ { \circ }\) angle.
    The results are summarised in the table below.

| Angle, \(a\), (degrees) | Number of students |
|---|---|
| \(55 \leqslant a < 60\) | 6 |
| \(60 \leqslant a < 65\) | 15 |
| \(65 \leqslant a < 70\) | 13 |
| \(70 \leqslant a < 75\) | 11 |
| \(75 \leqslant a < 80\) | 8 |
| \(80 \leqslant a < 85\) | 7 |

  2. Use linear interpolation to estimate the size of the median angle drawn. Give your answer to 1 decimal place.
  3. Show that the lower quartile is \(63 ^ { \circ }\)
For these data, the upper quartile is \(75 ^ { \circ }\), the minimum is \(55 ^ { \circ }\) and the maximum is \(84 ^ { \circ }\).
An outlier is an observation that falls either more than \(1.5 \times\) (interquartile range) above the upper quartile or more than \(1.5 \times\) (interquartile range) below the lower quartile.
    1. Show that there are no outliers for these data.
    2. Draw a box plot for these data on the grid on page 3.
  4. State which angle the students were more accurate at drawing. Give reasons for your answer.
    \includegraphics[max width=\textwidth, alt={}, center]{9626e3ce-35d6-41b5-a0bd-1185f38b9e36-03_378_1059_2067_447}
Edexcel S1 2004 November Q1
14 marks Moderate -0.8
  1. As part of their job, taxi drivers record the number of miles they travel each day. A random sample of the mileages recorded by taxi drivers Keith and Asif are summarised in the back-to-back stem and leaf diagram below.
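
| Totals | Keith | | Asif | Totals |
|---|---|---|---|---|
| (9) | 8 7 4 3 2 1 1 0 | 18 | 4 4 5 7 | (4) |
| (11) | 9 8 6 5 4 3 3 1 1 | 19 | 5 7 8 9 9 | (5) |
| (6) | 8 7 4 2 2 0 | 20 | 0 2 2 4 4 8 | (6) |
| (6) | 9 4 3 1 0 0 | 21 | 2 3 5 6 6 7 9 | (7) |
| (4) | 6 4 1 1 | 22 | 1 1 2 4 5 5 8 | (7) |
| (2) | 2 0 | 23 | 1 1 3 4 6 6 7 8 | (8) |
| (2) | 7 1 | 24 | 2 4 8 9 | (4) |
| (1) | 9 | 25 | 4 | (1) |
| (2) | 9 3 | 26 | | (0) |

Key: 0 | 18 | 4 means 180 for Keith and 184 for Asif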
The quartiles for these two distributions are summarised in the table below.

| | Keith | Asif |
|---|---|---|
| Lower quartile | 191 | \(a\) |
| Median | \(b\) | 218 |
| Upper quartile | 221 | \(c\) |

  1. Find the values of \(a , b\) and \(c\).
Outliers are values that lie outside the limits $$Q _ { 1 } - 1.5 \left( Q _ { 3 } - Q _ { 1 } \right) \text { and } Q _ { 3 } + 1.5 \left( Q _ { 3 } - Q _ { 1 } \right) .$$
  2. On graph paper, and showing your scale clearly, draw a box plot to represent Keith's data.
  3. Comment on the skewness of the two distributions.
Edexcel S1 2004 November Q7
6 marks Easy -1.8
7. A college organised a 'fun run'. The times, to the nearest minute, of a random sample of 100 students who took part are summarised in the table below.

| Time | Number of students |
|---|---|
| \(40 - 44\) | 10 |
| \(45 - 47\) | 15 |
| 48 | 23 |
| \(49 - 51\) | 21 |
| \(52 - 55\) | 16 |
| \(56 - 60\) | 15 |

  1. Give a reason to support the use of a histogram to represent these data.
  2. Write down the upper class boundary and the lower class boundary of the class 40-44.
  3. On graph paper, draw a histogram to represent these data.
Edexcel S1 Q3
13 marks Moderate -0.3
3. The frequency distribution for the lengths of 108 fish in an aquarium is given by the following table. The lengths of the fish ranged from 5 cm to 90 cm .

| Length \(( \mathrm { cm } )\) | \(5 - 10\) | \(10 - 20\) | \(20 - 25\) | \(25 - 30\) | \(30 - 40\) | \(40 - 60\) | \(60 - 90\) |
|---|---|---|---|---|---|---|---|
| Frequency | 8 | 16 | 20 | 18 | 20 | 14 | 12 |

  1. Calculate estimates of the three quartiles of the distribution.
  2. On graph paper, draw a box and whisker plot of the data.
  3. Hence describe the skewness of the distribution.
  4. If the data were represented by a histogram, what would be the ratio of the heights of the shortest and highest bars?
Edexcel S1 Q5
14 marks Easy -1.2
5. In a survey unemployed people were asked how many months it had been, to the nearest month, since they were last employed on a full-time basis. The data collected is summarised in this stem and leaf diagram.
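
| Number of months | (2 \| 1 means 21 months) | Totals |
|---|---|---|
| 0 | 1 1 2 2 4 4 4 6 7 7 9 | (11) |
| 1 | 0 2 3 5 5 6 8 9 | ( ) |
| 2 | 1 5 6 8 | ( ) |
| 3 | 0 7 9 | ( ) |
| 4 | 5 | ( ) |
| 5 | 2 7 | (2) |
| 6 | 3 | (1) |
| 7 | 0 | (1) |
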
  1. Write down the values needed to complete the totals column on the stem and leaf diagram.
  2. State the mode of these data.
  3. Find the median and quartiles of these data.
Given that any values outside of the limits \(\mathrm { Q } _ { 1 } - 1.5 \left( \mathrm { Q } _ { 3 } - \mathrm { Q } _ { 1 } \right)\) and \(\mathrm { Q } _ { 3 } + 1.5 \left( \mathrm { Q } _ { 3 } - \mathrm { Q } _ { 1 } \right)\) are to be regarded as outliers,
  4. determine if there are any outliers in these data,
  5. draw a box plot representing these data on graph paper,
  6. describe the skewness of these data and suggest a reason for it.
Edexcel S2 Q1
5 marks Easy -1.3
  1. (a) Explain the difference between a discrete and a continuous variable.
A random number generator on a calculator generates numbers, \(X\), to 3 decimal places, in the range 0 to 1, e.g. 0.386. The variable \(X\) may be modelled by a continuous uniform distribution, having the probability density function \(\mathrm { f } ( x )\), where $$\begin{array} { l l } \mathrm { f } ( x ) = 1 & 0 < x < 1 \\ \mathrm { f } ( x ) = 0 & \text { otherwise } \end{array}$$
(b) Explain why this model is not totally accurate.
(c) Sketch the cumulative distribution function of \(X\).
OCR MEI S1 Q3
8 marks Easy -1.3
3 Answer part (i) of this question on the insert provided. A taxi driver operates from a taxi rank at a main railway station in London. During one particular week he makes 120 journeys, the lengths of which are summarised in the table.

| Length (\(x\) miles) | \(0 < x \leqslant 1\) | \(1 < x \leqslant 2\) | \(2 < x \leqslant 3\) | \(3 < x \leqslant 4\) | \(4 < x \leqslant 6\) | \(6 < x \leqslant 10\) |
|---|---|---|---|---|---|---|
| Number of journeys | 38 | 30 | 21 | 14 | 9 | 8 |

  1. On the insert, draw a cumulative frequency diagram to illustrate the data.
  2. Use your graph to estimate the median length of journey and the quartiles. Hence find the interquartile range.
  3. State the type of skewness of the distribution of the data.
OCR MEI S1 Q4
9 marks Easy -1.8
4 Answer part (i) of this question on the insert provided. A taxi driver operates from a taxi rank at a main railway station in London. During one particular week he makes 120 journeys, the lengths of which are summarised in the table.

| Length (\(x\) miles) | \(0 < x \leqslant 1\) | \(1 < x \leqslant 2\) | \(2 < x \leqslant 3\) | \(3 < x \leqslant 4\) | \(4 < x \leqslant 6\) | \(6 < x \leqslant 10\) |
|---|---|---|---|---|---|---|
| Number of journeys | 38 | 30 | 21 | 14 | 9 | 8 |

  1. On the insert, draw a cumulative frequency diagram to illustrate the data.
  2. Use your graph to estimate the median length of journey and the quartiles. Hence find the interquartile range.
  3. State the type of skewness of the distribution of the data.
OCR H240/02 2018 September Q9
12 marks Moderate -0.3
9 The finance department of a retail firm recorded the daily income each day for 300 days. The results are summarised in the histogram. \includegraphics[max width=\textwidth, alt={}, center]{85de9a39-f8be-40ee-b0c8-e2e632be93d8-6_689_1575_488_246}
  1. Find the number of days on which the daily income was between \(\pounds 4000\) and \(\pounds 6000\).
  2. Calculate an estimate of the number of days on which the daily income was between \(\pounds 2700\) and \(\pounds 3600\).
  3. Use the midpoints of the classes to show that an estimate of the mean daily income is \(\pounds 3275\).
An estimate of the standard deviation of the daily income is \(\pounds 1060\). The finance department uses the distribution \(\mathrm { N } \left( 3275,1060 ^ { 2 } \right)\) to model the daily income, in pounds.
  4. Calculate the number of days on which, according to this model, the daily income would be between \(\pounds 4000\) and \(\pounds 6000\).
  5. It is given that approximately \(95 \%\) of values of the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) lie within the range \(\mu \pm 2 \sigma\). Without further calculation, use this fact to comment briefly on whether the proposed model is a good fit to the data illustrated in the histogram.
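The model-based count asked for in this question can be sketched with Python's standard-library `statistics.NormalDist`. This is a minimal sketch of the calculation under the stated model \(\mathrm { N } \left( 3275,1060 ^ { 2 } \right)\), with illustrative variable names; it does not use the histogram, whose values are not reproduced here.

```python
from statistics import NormalDist

# Model from the question: daily income ~ N(3275, 1060^2), recorded over 300 days.
model = NormalDist(mu=3275, sigma=1060)
days = 300

# Probability that income lies between 4000 and 6000 pounds under the model.
p = model.cdf(6000) - model.cdf(4000)
expected_days = days * p
print(round(expected_days, 1))
```

The same `model.cdf` call, evaluated at \(\mu \pm 2 \sigma\), would show the roughly 95% coverage that the last part of the question refers to.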
Edexcel S1 Q5
Moderate -0.3
5. The following grouped frequency distribution summarises the number of minutes, to the nearest minute, that a random sample of 200 motorists were delayed by roadworks on a stretch of motorway.

| Delay (mins) | Number of motorists |
|---|---|
| \(4 - 6\) | 15 |
| \(7 - 8\) | 28 |
| 9 | 49 |
| 10 | 53 |
| \(11 - 12\) | 30 |
| \(13 - 15\) | 15 |
| \(16 - 20\) | 10 |

  1. Using graph paper represent these data by a histogram.
  2. Give a reason to justify the use of a histogram to represent these data.
  3. Use interpolation to estimate the median of this distribution.
  4. Calculate an estimate of the mean and an estimate of the standard deviation of these data.
One coefficient of skewness is given by $$\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } } .$$
  5. Evaluate this coefficient for the above data.
  6. Explain why the normal distribution may not be suitable to model the number of minutes that motorists are delayed by these roadworks.