2.02f Measures of average and spread


CAIE S1 2010 June Q3
6 marks Moderate -0.8
\includegraphics{figure_3} The birth weights of random samples of 900 babies born in country \(A\) and 900 babies born in country \(B\) are illustrated in the cumulative frequency graphs. Use suitable data from these graphs to compare the central tendency and spread of the birth weights of the two sets of babies. [6]
CAIE S1 2014 November Q6
9 marks Easy -1.2
On a certain day in spring, the heights of 200 daffodils are measured, correct to the nearest centimetre. The frequency distribution is given below.
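\begin{tabular}{|l|c|c|c|c|c|}
\hline
Height (cm) & \(4 - 10\) & \(11 - 15\) & \(16 - 20\) & \(21 - 25\) & \(26 - 30\) \\
\hline
Frequency & 22 & 32 & 78 & 40 & 28 \\
\hline
\end{tabular}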
  1. Draw a cumulative frequency graph to illustrate the data. [4]
  2. 28\% of these daffodils are of height \(h\) cm or more. Estimate \(h\). [2]
  3. You are given that the estimate of the mean height of these daffodils, calculated from the table, is 18.39 cm. Calculate an estimate of the standard deviation of the heights of these daffodils. [3]
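As an illustration of the grouped-data calculation in the last part, here is a minimal Python sketch. It assumes class midpoints of 7, 13, 18, 23 and 28 (heights recorded to the nearest centimetre, so that 4 - 10 covers 3.5 to 10.5) and the divide-by-\(n\) form of the standard deviation; the variable names are illustrative only.

```python
from math import sqrt

# Assumed class midpoints for 4-10, 11-15, 16-20, 21-25, 26-30
# (heights are to the nearest cm, so 4-10 runs from 3.5 to 10.5).
midpoints = [7, 13, 18, 23, 28]
freqs = [22, 32, 78, 40, 28]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

mean = sum_fx / n                    # agrees with the 18.39 cm quoted in the question
sd = sqrt(sum_fx2 / n - mean ** 2)   # divide-by-n form of the standard deviation

print(round(mean, 2), round(sd, 2))
```

With these midpoints the estimate comes out close to 6.01 cm.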
Edexcel S1 2023 June Q1
8 marks Moderate -0.8
The histogram shows the distances, in km, that 274 people travel to work. \includegraphics{figure_1} Given that 60 of these people travel between 10 km and 20 km to work, estimate
  1. the number of people who travel between 22 km and 45 km to work, [3]
  2. the median distance travelled to work by these 274 people, [2]
  3. the mean distance travelled to work by these 274 people. [3]
Edexcel S1 2002 January Q6
17 marks Easy -1.2
Hospital records show the number of babies born in a year. The number of babies delivered by 15 male doctors is summarised by the stem and leaf diagram below.
\begin{tabular}{r|l r}
Babies & (\(4|5\) means 45) & Totals \\
0 & & (0) \\
1 & 9 & (1) \\
2 & 1 6 7 7 & (4) \\
3 & 2 2 3 4 8 & (5) \\
4 & 5 & (1) \\
5 & 1 & (1) \\
6 & 0 & (1) \\
7 & & (0) \\
8 & 6 7 & (2) \\
\end{tabular}
  1. Find the median and inter-quartile range of these data. [3]
  2. Given that there are no outliers, draw a box plot on graph paper to represent these data. Start your scale at the origin. [4]
  3. Calculate the mean and standard deviation of these data. [5]
The records also contain the number of babies delivered by 10 female doctors. 34 \quad 30 \quad 20 \quad 15 \quad 6 \quad 32 \quad 26 \quad 19 \quad 11 \quad 4 The quartiles are 11, 19.5 and 30.
  4. Using the same scale as in part (b) and on the same graph paper draw a box plot for the data for the 10 female doctors. [3]
  5. Compare and contrast the box plots for the data for male and female doctors. [2]
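The summary statistics for the doctors' data can be checked with a short Python sketch. The `quartile` helper is a hypothetical function using one common textbook convention (round a fractional position up; average two neighbours when the position is whole); other conventions can give slightly different quartiles. The standard deviation uses the divide-by-\(n\) form.

```python
from math import sqrt

# Values read from the stem and leaf diagram (male) and the list (female).
male = sorted([19, 21, 26, 27, 27, 32, 32, 33, 34, 38, 45, 51, 60, 86, 87])
female = sorted([34, 30, 20, 15, 6, 32, 26, 19, 11, 4])

def quartile(data, k):
    """k-th quartile of sorted data (assumed convention, see lead-in)."""
    q, r = divmod(k * len(data), 4)
    if r == 0:
        return (data[q - 1] + data[q]) / 2   # whole position: average two neighbours
    return data[q]                           # fractional position: round up

male_q = [quartile(male, k) for k in (1, 2, 3)]
female_q = [quartile(female, k) for k in (1, 2, 3)]
male_iqr = male_q[2] - male_q[0]

mean = sum(male) / len(male)
sd = sqrt(sum(x * x for x in male) / len(male) - mean ** 2)

print(male_q, male_iqr, round(mean, 1), round(sd, 2))
print(female_q)
```

Under this convention the female quartiles reproduce the values 11, 19.5 and 30 quoted in the question, which is some reassurance that the convention matches the one intended.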
Edexcel S1 2010 January Q2
9 marks Easy -1.2
The 19 employees of a company take an aptitude test. The scores out of 40 are illustrated in the stem and leaf diagram below. \(2|6\) means a score of 26 \begin{align} 0 & | 7 & (1) \\
1 & | 8\ 8 & (2) \\
2 & | 4\ 4\ 6\ 8 & (4) \\
3 & | 2\ 3\ 3\ 3\ 4\ 5\ 9 & (7) \\
4 & | 0\ 0\ 0\ 0\ 0 & (5) \end{align} Find
  1. the median score, [1]
  2. the interquartile range. [3]
The company director decides that any employees whose scores are so low that they are outliers will undergo retraining. An outlier is an observation whose value is less than the lower quartile minus 1.0 times the interquartile range.
  3. Explain why there is only one employee who will undergo retraining. [2]
  4. On the graph paper on page 5, draw a box plot to illustrate the employees' scores. [3]
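The outlier rule in this question can be sketched in Python as follows. The `quartile` helper is a hypothetical function using one common convention (round a fractional position up; average two neighbours when the position is whole), and the fence uses the multiplier 1.0 stated in the question.

```python
# Scores rebuilt from the stem and leaf diagram (2|6 means 26).
stems = {0: [7], 1: [8, 8], 2: [4, 4, 6, 8],
         3: [2, 3, 3, 3, 4, 5, 9], 4: [0, 0, 0, 0, 0]}
scores = sorted(10 * stem + leaf for stem, leaves in stems.items() for leaf in leaves)

def quartile(data, k):
    """k-th quartile of sorted data (assumed convention, see lead-in)."""
    q, r = divmod(k * len(data), 4)
    if r == 0:
        return (data[q - 1] + data[q]) / 2
    return data[q]

q1, q2, q3 = (quartile(scores, k) for k in (1, 2, 3))
iqr = q3 - q1
lower_fence = q1 - 1.0 * iqr                 # rule given in the question
low_outliers = [x for x in scores if x < lower_fence]

print(q1, q2, q3, iqr, lower_fence, low_outliers)
```

Only the score of 7 falls below the fence, which is consistent with the statement that a single employee is retrained.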
Edexcel S1 2010 January Q3
11 marks Moderate -0.8
The birth weights, in kg, of 1500 babies are summarised in the table below.
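\begin{tabular}{|c|c|c|}
\hline
Weight (kg) & Midpoint, \(x\) kg & Frequency, \(f\) \\
\hline
\(0.0 - 1.0\) & \(0.50\) & \(1\) \\
\(1.0 - 2.0\) & \(1.50\) & \(6\) \\
\(2.0 - 2.5\) & \(2.25\) & \(60\) \\
\(2.5 - 3.0\) & & \(280\) \\
\(3.0 - 3.5\) & \(3.25\) & \(820\) \\
\(3.5 - 4.0\) & \(3.75\) & \(320\) \\
\(4.0 - 5.0\) & \(4.50\) & \(10\) \\
\(5.0 - 6.0\) & & \(3\) \\
\hline
\end{tabular}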
[You may use \(\sum fx = 4841\) and \(\sum fx^2 = 15889.5\)]
  1. Write down the missing midpoints in the table above. [2]
  2. Calculate an estimate of the mean birth weight. [2]
  3. Calculate an estimate of the standard deviation of the birth weight. [3]
  4. Use interpolation to estimate the median birth weight. [2]
  5. Describe the skewness of the distribution. Give a reason for your answer. [2]
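A possible way to check the numerical parts is sketched below in Python. It fills in the two blank midpoints as 2.75 and 5.5 (an assumption that can be confirmed against the given \(\sum fx = 4841\)), uses the divide-by-\(n\) standard deviation, and interpolates for the median at the \(n/2\)-th value.

```python
from math import sqrt

# (lower bound, upper bound, frequency) for each class in the table.
classes = [(0.0, 1.0, 1), (1.0, 2.0, 6), (2.0, 2.5, 60), (2.5, 3.0, 280),
           (3.0, 3.5, 820), (3.5, 4.0, 320), (4.0, 5.0, 10), (5.0, 6.0, 3)]

n = sum(f for _, _, f in classes)
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)          # midpoints are (lo + hi) / 2
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean ** 2)

# Linear interpolation for the median: locate the class containing the n/2-th value.
target, cumulative = n / 2, 0
for lo, hi, f in classes:
    if cumulative + f >= target:
        median = lo + (target - cumulative) / f * (hi - lo)
        break
    cumulative += f

print(round(mean, 3), round(sd, 3), round(median, 3))
```

The sums reproduce the values quoted in the question, and the mean comes out slightly below the median, which is one possible reason to cite for negative skew.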
Edexcel S1 2011 June Q5
11 marks Moderate -0.8
A class of students had a sudoku competition. The time taken for each student to complete the sudoku was recorded to the nearest minute and the results are summarised in the table below.
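\begin{tabular}{|c|c|c|}
\hline
Time & Mid-point, \(x\) & Frequency, \(f\) \\
\hline
2 - 8 & 5 & 2 \\
9 - 12 & & 7 \\
13 - 15 & 14 & 5 \\
16 - 18 & 17 & 8 \\
19 - 22 & 20.5 & 4 \\
23 - 30 & 26.5 & 4 \\
\hline
\end{tabular}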
(You may use \(\sum fx^2 = 8603.75\))
  1. Write down the mid-point for the 9 - 12 interval. [1]
  2. Use linear interpolation to estimate the median time taken by the students. [2]
  3. Estimate the mean and standard deviation of the times taken by the students. [5]
The teacher suggested that a normal distribution could be used to model the times taken by the students to complete the sudoku.
  4. Give a reason to support the use of a normal distribution in this case. [1]
On another occasion the teacher calculated the quartiles for the times taken by the students to complete a different sudoku and found \(Q_1 = 8.5 \quad Q_2 = 13.0 \quad Q_3 = 21.0\)
  5. Describe, giving a reason, the skewness of the times on this occasion. [2]
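The calculations in this question can be sketched in Python under some stated assumptions: times are to the nearest minute, so the class 9 - 12 is taken to run from 8.5 to 12.5 (midpoint 10.5); the median is placed at the \(n/2\)-th value; and the standard deviation uses the divide-by-\(n\) form.

```python
from math import sqrt

# (lower boundary, upper boundary, frequency), with boundaries at the half minute.
classes = [(1.5, 8.5, 2), (8.5, 12.5, 7), (12.5, 15.5, 5),
           (15.5, 18.5, 8), (18.5, 22.5, 4), (22.5, 30.5, 4)]

n = sum(f for _, _, f in classes)
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean ** 2)

# Median by linear interpolation within the class holding the n/2-th value.
target, cumulative = n / 2, 0
for lo, hi, f in classes:
    if cumulative + f >= target:
        median = lo + (target - cumulative) / f * (hi - lo)
        break
    cumulative += f

# Quartile comparison for the second sudoku.
q1, q2, q3 = 8.5, 13.0, 21.0
positive_skew = (q3 - q2) > (q2 - q1)

print(round(mean, 2), round(sd, 2), round(median, 2), positive_skew)
```

The computed \(\sum fx^2\) agrees with the 8603.75 given in the question, and for the second sudoku the upper quartile lies further from the median than the lower quartile does, which points to positive skew.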
Edexcel S1 2002 November Q7
18 marks Moderate -0.8
The following stem and leaf diagram shows the aptitude scores \(x\) obtained by all the applicants for a particular job.
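\begin{tabular}{r|l r}
Aptitude score & \(3|1\) means 31 & \\
3 & 1 2 9 & (3) \\
4 & 2 4 6 8 9 & (5) \\
5 & 1 3 3 5 6 7 9 & (7) \\
6 & 0 1 3 3 3 5 6 8 8 9 & (10) \\
7 & 1 2 2 2 4 5 5 5 6 8 8 8 8 9 & (14) \\
8 & 0 1 2 3 5 8 8 9 & (8) \\
9 & 0 1 2 & (3) \\
\end{tabular}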
  1. Write down the modal aptitude score. [1]
  2. Find the three quartiles for these data. [3]
Outliers can be defined to be outside the limits \(Q_1 - 1.0(Q_3 - Q_1)\) and \(Q_3 + 1.0(Q_3 - Q_1)\).
  3. On graph paper, draw a box plot to represent these data. [7]
For these data, \(\Sigma x = 3363\) and \(\Sigma x^2 = 238305\).
  4. Calculate, to 2 decimal places, the mean and the standard deviation for these data. [3]
  5. Use two different methods to show that these data are negatively skewed. [4]
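A short Python sketch can confirm the given sums and illustrate two possible skewness checks. The `quartile` helper is a hypothetical function using one common convention (round a fractional position up; average two neighbours when the position is whole), and the standard deviation uses the divide-by-\(n\) form.

```python
from math import sqrt

# Scores rebuilt from the stem and leaf diagram (3|1 means 31).
stems = {3: [1, 2, 9], 4: [2, 4, 6, 8, 9], 5: [1, 3, 3, 5, 6, 7, 9],
         6: [0, 1, 3, 3, 3, 5, 6, 8, 8, 9],
         7: [1, 2, 2, 2, 4, 5, 5, 5, 6, 8, 8, 8, 8, 9],
         8: [0, 1, 2, 3, 5, 8, 8, 9], 9: [0, 1, 2]}
scores = sorted(10 * stem + leaf for stem, leaves in stems.items() for leaf in leaves)

def quartile(data, k):
    """k-th quartile of sorted data (assumed convention, see lead-in)."""
    q, r = divmod(k * len(data), 4)
    if r == 0:
        return (data[q - 1] + data[q]) / 2
    return data[q]

n = len(scores)
sum_x = sum(scores)
sum_x2 = sum(x * x for x in scores)
mean = sum_x / n
sd = sqrt(sum_x2 / n - mean ** 2)
q1, q2, q3 = (quartile(scores, k) for k in (1, 2, 3))

# Two possible indications of negative skew.
mean_below_median = mean < q2
lower_gap_larger = (q2 - q1) > (q3 - q2)

print(round(mean, 2), round(sd, 2), q1, q2, q3)
```

Both comparisons hold for these data: the mean is below the median, and the lower quartile is further from the median than the upper quartile is.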
Edexcel S1 Specimen Q5
16 marks Easy -1.2
  1. Explain briefly the advantages and disadvantages of using the quartiles to summarise a set of data. [4]
  2. Describe the main features and uses of a box plot. [3]
The distances, in kilometres, travelled to school by the teachers in two schools, \(A\) and \(B\), in the same town were recorded. The data for School \(A\) are summarised in Diagram 1. \includegraphics{figure_1} For School \(B\), the least distance travelled was 3 km and the longest distance travelled was 55 km. The three quartiles were 17, 24 and 31 respectively. An outlier is an observation that falls either \(1.5 \times\) (interquartile range) above the upper quartile or \(1.5 \times\) (interquartile range) below the lower quartile.
  3. Draw a box plot for School \(B\). [5]
  4. Compare and contrast the two box plots. [4]
Edexcel S2 Specimen Q4
11 marks Standard +0.3
A company director monitored the number of errors on each page of typing done by her new secretary and obtained the following results:
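\begin{tabular}{|l|c|c|c|c|c|c|}
\hline
No. of errors & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
No. of pages & 37 & 65 & 60 & 49 & 27 & 12 \\
\hline
\end{tabular}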
  1. Show that the mean number of errors per page in this sample of pages is 2. [2]
  2. Find the variance of the number of errors per page in this sample. [2]
  3. Explain how your answers to parts (a) and (b) might support the director's belief that the number of errors per page could be modelled by a Poisson distribution. [1]
Some time later the director notices that a 4-page report which the secretary has just typed contains only 3 errors. The director wishes to test whether or not this represents evidence that the number of errors per page made by the secretary is now less than 2.
  4. Assuming a Poisson distribution and stating your hypotheses clearly, carry out this test. Use a 5\% level of significance. [6]
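The arithmetic behind this question can be sketched in Python. The variance below uses the divide-by-\(n\) form (dividing by \(n - 1\) gives a slightly larger value), and the test takes \(H_0\): rate 2 per page against \(H_1\): rate less than 2, so that a 4-page report has Poisson mean 8 under \(H_0\).

```python
from math import exp, factorial

errors = [0, 1, 2, 3, 4, 5]
pages = [37, 65, 60, 49, 27, 12]

n = sum(pages)
mean = sum(x * f for x, f in zip(errors, pages)) / n
variance = sum(f * x * x for x, f in zip(errors, pages)) / n - mean ** 2

# One-tailed test: probability of 3 or fewer errors in 4 pages when the rate is 2 per page.
lam = 4 * 2
p_value = sum(exp(-lam) * lam ** k / factorial(k) for k in range(4))
reject_h0 = p_value < 0.05

print(mean, round(variance, 3), round(p_value, 4), reject_h0)
```

The mean and variance are close to each other, which is the usual informal argument for a Poisson model, and the tail probability falls just below 5\%.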
Edexcel S2 Specimen Q7
20 marks Standard +0.3
The continuous random variable \(X\) has probability density function f(\(x\)) given by $$\text{f}(x) = \begin{cases} \frac{1}{20}x^3, & 1 \leq x \leq 3 \\ 0, & \text{otherwise} \end{cases}$$
  1. Sketch f(\(x\)) for all values of \(x\). [3]
  2. Calculate E(\(X\)). [3]
  3. Show that the standard deviation of \(X\) is 0.459 to 3 decimal places. [3]
  4. Show that for \(1 \leq x \leq 3\), P(\(X \leq x\)) is given by \(\frac{1}{80}(x^4 - 1)\) and specify fully the cumulative distribution function of \(X\). [5]
  5. Find the interquartile range for the random variable \(X\). [4]
Some statisticians use the following formula to estimate the interquartile range: $$\text{interquartile range} = \frac{4}{3} \times \text{standard deviation}.$$
  1. Use this formula to estimate the interquartile range in this case, and comment. [2]
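The numerical answers can be checked with a brief Python sketch. It relies on the antiderivative of \(x^k \cdot \frac{1}{20}x^3\), namely \(x^{k+4} / (20(k+4))\), and on inverting the cumulative distribution function \(\frac{1}{80}(x^4 - 1)\) stated in the question; the function names are illustrative.

```python
from math import sqrt

def moment(k):
    """E(X^k): integral of x^k * x^3 / 20 over [1, 3]."""
    return (3 ** (k + 4) - 1) / (20 * (k + 4))

total_probability = moment(0)          # should equal 1 for a valid density
mean = moment(1)
sd = sqrt(moment(2) - mean ** 2)

def quantile(p):
    """Solve (x^4 - 1) / 80 = p for x in [1, 3]."""
    return (80 * p + 1) ** 0.25

iqr = quantile(0.75) - quantile(0.25)
iqr_estimate = 4 / 3 * sd              # rule of thumb quoted in the question

print(round(mean, 2), round(sd, 3), round(iqr, 3), round(iqr_estimate, 3))
```

The rule of thumb gives a somewhat smaller value than the exact interquartile range here; one possible comment is that the rule is usually associated with roughly symmetric, bell-shaped distributions, which this density is not.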
Edexcel S3 2015 June Q3
11 marks Standard +0.3
The number of accidents on a particular stretch of motorway was recorded each day for 200 consecutive days. The results are summarised in the following table.
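\begin{tabular}{|l|c|c|c|c|c|c|}
\hline
Number of accidents & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
Frequency & 47 & 57 & 46 & 35 & 9 & 6 \\
\hline
\end{tabular}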
  1. Show that the mean number of accidents per day for these data is 1.6 [1]
A motorway supervisor believes that the number of accidents per day on this stretch of motorway can be modelled by a Poisson distribution. She uses the mean found in part (a) to calculate the expected frequencies for this model. Her results are given in the following table.
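\begin{tabular}{|l|c|c|c|c|c|c|}
\hline
Number of accidents & 0 & 1 & 2 & 3 & 4 & 5 or more \\
\hline
Frequency & 40.38 & 64.61 & \(r\) & 27.57 & 11.03 & \(s\) \\
\hline
\end{tabular}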
  2. Find the value of \(r\) and the value of \(s\), giving your answers to 2 decimal places. [3]
  3. Stating your hypotheses clearly, use a 10\% level of significance to test the motorway supervisor's belief. Show your working clearly. [7]
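A minimal Python sketch of the goodness-of-fit calculation follows. It assumes the last two cells are pooled so that every expected frequency is at least 5, and that the degrees of freedom are reduced by one for the total and by one for the estimated mean; the critical value 6.251 is the 10\% point of \(\chi^2\) with 3 degrees of freedom taken from standard tables.

```python
from math import exp, factorial

observed = [47, 57, 46, 35, 9, 6]          # days with 0, 1, 2, 3, 4, 5 accidents
days = sum(observed)
mean = sum(k * f for k, f in enumerate(observed)) / days

def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

expected = [days * poisson_pmf(k, mean) for k in range(5)]
expected.append(days - sum(expected))      # "5 or more" takes the remainder

r, s = expected[2], expected[5]

# Pool the last two cells (assumption: expected frequencies should be at least 5).
obs_pooled = observed[:4] + [observed[4] + observed[5]]
exp_pooled = expected[:4] + [expected[4] + expected[5]]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))

dof = len(obs_pooled) - 1 - 1              # total fixed, mean estimated from the data
critical_value = 6.251                     # 10% point of chi-squared with 3 d.o.f.
reject_h0 = chi_sq > critical_value

print(round(r, 2), round(s, 2), round(chi_sq, 2), dof, reject_h0)
```

Note that \(s\) computed from unrounded expected frequencies is about 4.74, whereas subtracting the rounded table entries from 200 gives 4.72; which of these is intended depends on the method expected by the mark scheme.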
Edexcel S1 Q5
14 marks Moderate -0.8
The stem-and-leaf diagram shows the values taken by two variables \(A\) and \(B\).
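\begin{tabular}{r|c|l}
\(A\) & & \(B\) \\
\hline
8, 7, 4, 1, 0 & 1 & 1, 1, 2, 5, 6, 8, 9 \\
9, 8, 7, 6, 6, 5, 2 & 2 & 0, 3, 4, 6, 7, 7, 9 \\
9, 7, 6, 4, 2, 1, 0 & 3 & 1, 4, 5, 5, 8 \\
8, 6, 3, 2, 2 & 4 & 0, 2, 6, 6, 9, 9 \\
6, 4, 0 & 5 & 2, 3, 5, 7 \\
5, 3, 1 & 6 & 0, 1 \\
\end{tabular}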
Key: 3 | 1 | 2 means \(A = 13\), \(B = 12\)
  1. For each set of data, calculate estimates of the median and the quartiles. [6 marks]
  2. Calculate the 42nd percentile for \(A\). [2 marks]
  3. On graph paper, indicating your scale clearly, construct box and whisker plots for both sets of data. [4 marks]
  4. Describe the skewness of the distribution of \(A\) and of \(B\). [2 marks]
Edexcel S1 Q5
16 marks Moderate -0.8
In a survey of natural habitats, the numbers of trees in sixty equal areas of land were recorded, as follows:
171292340321153422318
154510521413294369301547
356241319269312718620
22183051493550258102631
332940373844243442381123
  1. Construct a stem-and-leaf diagram to illustrate this data, using the groupings 5 - 9, 10 - 14, 15 - 19, 20 - 24, etc. [8 marks]
  2. Find the three quartiles for the distribution. [4 marks]
  3. On graph paper construct a box plot for the data, showing your scale and clearly indicating any outliers. [4 marks]
Edexcel S1 Q3
10 marks Moderate -0.8
The variable \(X\) represents the marks out of 150 scored by a group of students in an examination. The following ten values of \(X\) were obtained: 60, 66, 76, 80, 94, 106, 110, 116, 124, 140.
  1. Write down the median, \(M\), of the ten marks. [1 mark]
  2. Using the coding \(y = \frac{x - M}{2}\), and showing all your working clearly, find the mean and the standard deviation of the marks. [6 marks]
  3. Find E\((3X - 5)\). [3 marks]
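The coding method can be illustrated with a small Python sketch. It assumes the divide-by-\(n\) standard deviation and, for the last part, treats the sample mean as the value of E\((X)\).

```python
from math import sqrt

marks = [60, 66, 76, 80, 94, 106, 110, 116, 124, 140]
n = len(marks)

M = (marks[4] + marks[5]) / 2          # median of ten ordered values
y = [(x - M) / 2 for x in marks]       # coded values y = (x - M) / 2

mean_y = sum(y) / n
sd_y = sqrt(sum(v * v for v in y) / n - mean_y ** 2)

# Undo the coding: x = M + 2y, so the mean shifts and scales, the s.d. only scales.
mean_x = M + 2 * mean_y
sd_x = 2 * sd_y

expectation = 3 * mean_x - 5           # E(3X - 5) = 3E(X) - 5

print(M, mean_x, round(sd_x, 2), expectation)
```

The design point is that subtracting \(M\) leaves the spread unchanged, while dividing by 2 halves it, so only the factor 2 is reversed for the standard deviation.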
Edexcel S1 Q7
17 marks Moderate -0.8
The back-to-back stem and leaf diagram shows the journey times, to the nearest minute, of the commuter services into a big city provided by the trains of two operating companies.
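\begin{tabular}{c r|c|l c}
 & Company \(A\) & & Company \(B\) & \\
\hline
(3) & \(4\ 3\ 1\) & 2 & \(0\ 5\ 6\ 6\ 8\ 9\) & (6) \\
(4) & \(9\ 8\ 6\ 5\) & 3 & \(1\ 3\ 4\ 7\ 9\) & (5) \\
(4) & \(8\ 8\ 6\ 2\) & 4 & \(0\ 1\ 3\ 5\ 8\) & ( ) \\
(6) & \(9\ 7\ 5\ 3\ 2\ 1\) & 5 & \(2\ 6\ 8\ 9\ 9\) & ( ) \\
(3) & \(6\ 5\ 3\) & 6 & \(3\ 4\ 7\ 7\) & ( ) \\
(3) & \(3\ 2\ 2\) & 7 & \(0\ 1\ 5\) & ( ) \\
\end{tabular}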
Key: \(4|3|6\) means 34 minutes for Company \(A\) and 36 minutes for Company \(B\).
  1. Write down the numbers needed to complete the diagram. [1 mark]
  2. Find the median and the quartiles for each company. [6 marks]
  3. On graph paper, draw box plots for the two companies. Show your scale. [6 marks]
  4. Use your plots to compare the two sets of data briefly. [2 marks]
  5. Describe the skewness of each company's distribution of times. [2 marks]
Edexcel S1 Q6
15 marks Moderate -0.3
1000 houses were sold in a small town in a one-year period. The selling prices were as given in the following table:
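\begin{tabular}{|l|c|l|c|}
\hline
Selling Price & Number of Houses & Selling Price & Number of Houses \\
\hline
Up to £50 000 & 60 & Up to £150 000 & 642 \\
Up to £75 000 & 227 & Up to £200 000 & 805 \\
Up to £100 000 & 305 & Up to £500 000 & 849 \\
Up to £125 000 & 414 & Up to £750 000 & 1000 \\
\hline
\end{tabular}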
  1. Name (do not draw) a suitable type of graph for illustrating this data. [1 mark]
  2. Use interpolation to find estimates of the median and the quartiles. [6 marks]
  3. Estimate the 37th percentile. [2 marks]
Given further that the lowest price was £42 000 and the range of the prices was £690 000,
  4. draw a box plot to represent the data. Show your scale clearly. [4]
In another town the median price was £149 000, and the interquartile range was £90 000.
  5. Briefly compare the prices in the two towns using this information. [2]
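Linear interpolation on the cumulative table can be sketched in Python as below. The `interpolate` function is a hypothetical helper; it places the median at the 500th value, the quartiles at the 250th and 750th, and the 37th percentile at the 370th, which is one common convention for large grouped data sets.

```python
# (upper price, cumulative number of houses) taken from the table.
points = [(50_000, 60), (75_000, 227), (100_000, 305), (125_000, 414),
          (150_000, 642), (200_000, 805), (500_000, 849), (750_000, 1000)]

def interpolate(target):
    """Price at a given cumulative count, by straight-line interpolation."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target lies outside the tabulated range")

q1 = interpolate(250)
median = interpolate(500)
q3 = interpolate(750)
p37 = interpolate(370)

print(round(q1), round(median), round(q3), round(p37))
```

Counts below 60 would need the lowest price as an extra starting point, which the question only supplies in a later part, so the sketch raises an error in that case rather than guessing.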
Edexcel S1 Q5
16 marks Moderate -0.8
The following data were collected by counting the number of cars that passed the gates of a college in 60 successive 5 minute intervals.
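\begin{tabular}{*{12}{r}}
12 & 20 & 19 & 31 & 32 & 35 & 37 & 29 & 26 & 27 & 20 & 17 \\
15 & 9 & 8 & 11 & 13 & 17 & 17 & 21 & 25 & 27 & 28 & 25 \\
30 & 32 & 37 & 40 & 45 & 45 & 44 & 47 & 42 & 41 & 36 & 38 \\
35 & 34 & 30 & 30 & 27 & 26 & 29 & 24 & 23 & 21 & 21 & 18 \\
16 & 16 & 19 & 22 & 26 & 28 & 23 & 17 & 15 & 10 & 12 & 13 \\
\end{tabular}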
  1. Make a stem and leaf diagram for this data, using the groups \(5-9\), \(10-14\), \(\ldots\), \(45-49\). Show the total in each group and give a key to the diagram. [7 marks]
  2. Find the three quartiles for this data. [4 marks]
  3. On graph paper, draw a box plot for the data. [4 marks]
  4. Describe the skewness of the distribution. [1 mark]
Edexcel S1 Q7
21 marks Standard +0.3
The following table gives the weights, in grams, of 60 items delivered to a company in a day.
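\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
Weight (g) & 0 - 10 & 10 - 20 & 20 - 30 & 30 - 40 & 40 - 50 & 50 - 60 & 60 - 80 \\
\hline
No. of items & 2 & 11 & 18 & 12 & 9 & 6 & 2 \\
\hline
\end{tabular}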
  1. Use interpolation to calculate estimated values of
    1. the median weight,
    2. the interquartile range,
    3. the thirty-third percentile.
    [7 marks]
Outliers are defined to be outside the range from \(2.5Q_1 - 1.5Q_3\) to \(2.5Q_3 - 1.5Q_1\).
  2. Given that the lightest item weighed 3 g and the two heaviest weighed 65 g and 79 g, draw on graph paper an accurate box-and-whisker plot of the data. Indicate any outliers clearly. [5]
  3. Describe the skewness of the distribution. [1 mark]
The mean weight was 32.0 g and the standard deviation of the weights was 14.9 g.
  4. State, with a reason, whether you would choose to summarise the data by using the mean and standard deviation or the median and interquartile range. [2 marks]
On another day, items were delivered whose weights ranged from 14 g to 58 g; the median was 32 g, the lower quartile was 24 g and the interquartile range was 26 g.
  5. Draw a further box plot for these data on the same diagram. Briefly compare the two sets of data using your plots. [6 marks]
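The interpolation and outlier check for the first day's data can be sketched in Python. The sketch assumes quantiles are placed at the \(np\)-th value and that the outlier limits are the usual fences \(Q_1 - 1.5(Q_3 - Q_1)\) and \(Q_3 + 1.5(Q_3 - Q_1)\); the `interpolate` helper is hypothetical.

```python
# (class boundary, cumulative number of items) built from the table.
points = [(0, 0), (10, 2), (20, 13), (30, 31), (40, 43), (50, 52), (60, 58), (80, 60)]
n = points[-1][1]

def interpolate(target):
    """Weight at a given cumulative count, by straight-line interpolation."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target lies outside the tabulated range")

q1 = interpolate(n / 4)
median = interpolate(n / 2)
q3 = interpolate(3 * n / 4)
iqr = q3 - q1
p33 = interpolate(0.33 * n)

# Assumed fences: 1.5 interquartile ranges beyond each quartile.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [w for w in (3, 65, 79) if w < lower_fence or w > upper_fence]

print(round(median, 2), round(iqr, 2), round(p33, 2), outliers)
```

Under these assumptions only the 79 g item lies beyond the upper fence, so the upper whisker would stop at 65 g.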
OCR S1 2010 January Q2
13 marks Moderate -0.8
40 people were asked to guess the length of a certain road. Each person gave their guess, \(l\) km, correct to the nearest kilometre. The results are summarised below.
\begin{tabular}{|l|c|c|c|c|}
\hline
\(l\) & 10-12 & 13-15 & 16-20 & 21-30 \\
\hline
Frequency & 1 & 13 & 20 & 6 \\
\hline
\end{tabular}
    1. Use appropriate formulae to calculate estimates of the mean and standard deviation of \(l\). [6]
    2. Explain why your answers are only estimates. [1]
  1. A histogram is to be drawn to illustrate the data. Calculate the frequency density of the block for the 16-20 class. [2]
  2. Explain which class contains the median value of \(l\). [2]
  3. Later, the person whose guess was between 10 km and 12 km changed his guess to between 13 km and 15 km. Without calculation state whether the following will increase, decrease or remain the same:
    1. the mean of \(l\), [1]
    2. the standard deviation of \(l\). [1]
OCR S1 2013 January Q6
7 marks Moderate -0.8
The masses, \(x\) grams, of 800 apples are summarised in the histogram. \includegraphics{figure_6}
  1. On the frequency density axis, 1 cm represents \(a\) units. Find the value of \(a\). [3]
  2. Find an estimate of the median mass of the apples. [4]
OCR S1 2009 June Q5
5 marks Moderate -0.8
The diameters of 100 pebbles were measured. The measurements rounded to the nearest millimetre, \(x\), are summarised in the table.
\begin{tabular}{|l|c|c|c|c|}
\hline
\(x\) & \(10 \leqslant x \leqslant 19\) & \(20 \leqslant x \leqslant 24\) & \(25 \leqslant x \leqslant 29\) & \(30 \leqslant x \leqslant 49\) \\
\hline
Number of stones & 25 & 22 & 29 & 24 \\
\hline
\end{tabular}
These data are to be presented on a statistical diagram.
  1. For a histogram, find the frequency density of the \(10 \leqslant x \leqslant 19\) class. [2]
  2. For a cumulative frequency graph, state the coordinates of the first two points that should be plotted. [2]
  3. Why is it not possible to draw an exact box-and-whisker plot to illustrate the data? [1]
OCR S1 2009 June Q6
11 marks Moderate -0.8
Last year Eleanor played 11 rounds of golf. Her scores were as follows: 79, 71, 80, 67, 67, 74, 66, 65, 71, 66, 64.
  1. Calculate the mean of these scores and show that the standard deviation is 5.31, correct to 3 significant figures. [4]
  2. Find the median and interquartile range of the scores. [4]
This year, Eleanor also played 11 rounds of golf. The standard deviation of her scores was 4.23, correct to 3 significant figures, and the interquartile range was the same as last year.
  3. Give a possible reason why the standard deviation of her scores was lower than last year although her interquartile range was unchanged. [1]
In golf, smaller scores mean a better standard of play than larger scores. Ken suggests that since the standard deviation was smaller this year, Eleanor's overall standard has improved.
  4. Explain why Ken is wrong. [1]
  5. State what the smaller standard deviation does show about Eleanor's play. [1]
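The figures for last year's scores can be verified with a short Python sketch using the standard library. `pstdev` is the divide-by-\(n\) standard deviation, and the `quartile` helper is a hypothetical function using one common convention (round a fractional position up; average two neighbours when the position is whole).

```python
from statistics import mean, median, pstdev

scores = sorted([79, 71, 80, 67, 67, 74, 66, 65, 71, 66, 64])

def quartile(data, k):
    """k-th quartile of sorted data (assumed convention, see lead-in)."""
    q, r = divmod(k * len(data), 4)
    if r == 0:
        return (data[q - 1] + data[q]) / 2
    return data[q]

average = mean(scores)
sd = pstdev(scores)
mid = median(scores)
iqr = quartile(scores, 3) - quartile(scores, 1)

print(average, round(sd, 2), mid, iqr)
```

The standard deviation measures consistency rather than level, so a smaller value says nothing by itself about whether the scores are lower.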
OCR S1 2010 June Q1
9 marks Easy -1.2
The marks of some students in a French examination were summarised in a grouped frequency distribution and a cumulative frequency diagram was drawn, as shown below. \includegraphics{figure_1}
  1. Estimate how many students took the examination. [1]
  2. How can you tell that no student scored more than 55 marks? [1]
  3. Find the greatest possible range of the marks. [1]
  4. The minimum mark for Grade C was 27. The number of students who gained exactly Grade C was the same as the number of students who gained a grade lower than C. Estimate the maximum mark for Grade C. [3]
  5. In a German examination the marks of the same students had an interquartile range of 16 marks. What does this result indicate about the performance of the students in the German examination as compared with the French examination? [3]
OCR S1 2013 June Q1
7 marks Easy -1.8
The lengths, in centimetres, of 18 snakes are given below. 24 62 20 65 27 67 69 32 40 53 55 47 33 45 55 56 49 58
  1. Draw an ordered stem-and-leaf diagram for the data. [3]
  2. Find the mean and median of the lengths of the snakes. [2]
  3. It was found that one of the lengths had been measured incorrectly. After this length was corrected, the median increased by 1 cm. Give two possibilities for the incorrect length and give a corrected value in each case. [2]
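As a rough check on the mean and median, and on how a single corrected value can shift the median, here is a small Python sketch. The two corrections tried (49 changed to 51, and 53 changed to 55) are illustrative candidates only, not the only possibilities; `median_after_correction` is a hypothetical helper.

```python
from statistics import mean, median

lengths = [24, 62, 20, 65, 27, 67, 69, 32, 40, 53, 55, 47, 33, 45, 55, 56, 49, 58]

average = mean(lengths)
mid = median(lengths)          # average of the 9th and 10th ordered values

def median_after_correction(data, wrong, right):
    """Median after replacing one occurrence of `wrong` with `right`."""
    corrected = list(data)
    corrected[corrected.index(wrong)] = right
    return median(corrected)

candidate_a = median_after_correction(lengths, 49, 51)
candidate_b = median_after_correction(lengths, 53, 55)

print(round(average, 2), mid, candidate_a, candidate_b)
```

Both candidate corrections raise the median by exactly 1 cm, because each changes one of the two middle values (or replaces it with the next value up) so that the middle pair sums to 2 more than before.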