2.02f Measures of average and spread

447 questions

AQA Paper 3 2024 June Q14
5 marks Moderate -0.8
The annual cost of energy in 2021 for each of the 350 households in Village A can be modelled by a random variable \(X\). It is given that $$\sum x = 945\,000 \quad \sum x^2 = 2\,607\,500\,000$$
  1. Calculate the mean of \(X\). [1 mark]
  2. Calculate the standard deviation of \(X\). [2 marks]
  3. For households in Village B the annual cost of energy in 2021 has mean £3100 and standard deviation £325. Compare the annual cost of energy in 2021 for households in Village A and Village B. [2 marks]
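The arithmetic for the mean and standard deviation of \(X\) can be sketched from the two given sums. This is a minimal sketch, assuming the population form of the variance (divisor \(n\)), which is the usual convention when sums for every household are given; the variable names are illustrative only.

```python
import math

n = 350                    # households in Village A
sum_x = 945_000            # given: sum of x
sum_x2 = 2_607_500_000     # given: sum of x squared

mean = sum_x / n
# Population variance: mean of the squares minus the square of the mean.
variance = sum_x2 / n - mean ** 2
sd = math.sqrt(variance)

print(f"mean = {mean:.0f}, standard deviation = {sd:.0f}")
```

On these figures Village A has the lower mean (£2700 against £3100) and the larger standard deviation (£400 against £325), which is the basis for the comparison with Village B.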
OCR PURE Q7
5 marks Standard +0.3
  1. Two real numbers are denoted by \(a\) and \(b\).
    1. Write down expressions for the following.
    2. Prove that the mean of the squares of \(a\) and \(b\) is greater than or equal to the square of their mean. [3]
  2. You are given that the result in part (a)(ii) is true for any two or more real numbers. Explain what this result shows about the variance of a set of data. [1]
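One way to organise the proof that the mean of the squares is at least the square of the mean, offered as a sketch rather than a mark-scheme answer, is to show that the difference between the two quantities is a perfect square:

```latex
\[
\frac{a^2+b^2}{2}-\left(\frac{a+b}{2}\right)^2
= \frac{2a^2+2b^2-\left(a^2+2ab+b^2\right)}{4}
= \frac{(a-b)^2}{4} \ge 0 .
\]
```

Since variance can be written as the mean of the squares minus the square of the mean, the general result says that a variance is never negative.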
OCR PURE Q10
9 marks Easy -1.2
The masses of a random sample of 120 boulders in a certain area were recorded. The results are summarized in the histogram. \includegraphics{figure_5}
  1. Calculate the number of boulders with masses between 60 and 65 kg. [2]
    1. Use midpoints to find estimates of the mean and standard deviation of the masses of the boulders in the sample. [3]
    2. Explain why your answers are only estimates. [1]
  2. Use your answers to part (b)(i) to determine an estimate of the number of outliers, if any, in the distribution. [2]
  3. Give one advantage of using a histogram rather than a pie chart in this context. [1]
OCR MEI AS Paper 2 2018 June Q7
8 marks Moderate -0.8
Rose and Emma each wear a device that records the number of steps they take in a day. All the results for a 7-day period are given in Fig. 7.
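
| Day | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Rose | 10014 | 11262 | 10149 | 9361 | 9708 | 9921 | 10369 |
| Emma | 9204 | 9913 | 8741 | 10015 | 10261 | 7391 | 10856 |

Fig. 7
The 7-day mean is the mean number of steps taken in the last 7 days. The 7-day mean for Rose is 10112.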
  1. Calculate the 7-day mean for Emma. [1]
At the end of day 8 a new 7-day mean is calculated by including the number of steps taken on day 8 and omitting the number of steps taken on day 1. On day 8 Rose takes 10259 steps.
  1. Determine the number of steps Emma must take on day 8 so that her 7-day mean at the end of day 8 is the same as for Rose. [4]
In fact, over a long period of time, the mean of the number of steps per day that Emma takes is 10341 and the standard deviation is 948.
  1. Determine whether the number of steps Emma needs to take on day 8 so that her 7-day mean is the same as that for Rose in part (ii) is unusually high. [3]
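The three parts of this question chain together, which a short sketch makes explicit. The step counts are as read from Fig. 7, and the sketch first confirms that they reproduce Rose's stated 7-day mean of 10112. Measuring "unusually high" in standard deviations above the long-run mean is an assumption about the intended method, and the variable names are illustrative.

```python
# Daily step counts read from Fig. 7 (days 1 to 7).
rose = [10014, 11262, 10149, 9361, 9708, 9921, 10369]
emma = [9204, 9913, 8741, 10015, 10261, 7391, 10856]
assert sum(rose) / 7 == 10112      # matches the mean stated in the question

# Part (i): Emma's 7-day mean at the end of day 7.
emma_mean = sum(emma) / 7

# Part (ii): at the end of day 8 the window covers days 2 to 8.
rose_day8 = 10259
rose_total_new = sum(rose[1:]) + rose_day8
# Emma's days 2 to 7 plus her day-8 count must reach the same total.
emma_day8 = rose_total_new - sum(emma[1:])

# Part (iii): distance of that count from her long-run mean, in standard deviations.
long_run_mean, long_run_sd = 10341, 948
z = (emma_day8 - long_run_mean) / long_run_sd

print(emma_mean, rose_total_new / 7, emma_day8, round(z, 2))
```

A value more than about 2 or 3 standard deviations above the mean would usually be described as unusually high.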
OCR MEI Paper 2 2022 June Q11
10 marks Standard +0.3
A die in the form of a dodecahedron has its faces numbered from 1 to 12. The die is biased so that the probability that a score of 12 is achieved is different from any other score. The probability distribution of \(X\), the score on the die, is given in the table in terms of \(p\) and \(k\), where \(0 < p < 1\) and \(k\) is a positive integer.
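
| \(x\) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P\((X = x)\) | \(p\) | \(p\) | \(p\) | \(p\) | \(p\) | \(p\) | \(p\) | \(p\) | \(p\) | \(p\) | \(p\) | \(kp\) |
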
Sam rolls the die 30 times, Leo rolls the die 60 times and Nina rolls the die 120 times. They each plot their scores on bar line graphs.
  1. Explain whose graph is most likely to give the best representation of the theoretical probability distribution for the score on the die. [1]
  2. Find \(p\) in terms of \(k\). [2]
  3. Determine, in terms of \(k\), the expected number of times Nina rolls a 12. [3]
  4. Given that Nina rolls a 12 on 32 occasions, calculate an estimate of the value of \(k\). [2]
Nina rolls the die a further 30 times.
  1. Use your answer to part (d) to calculate an estimate for the probability that she obtains a 12 exactly 8 times in these 30 rolls. [2]
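The algebra in parts (b) to (e) can be checked with exact fractions. This is a sketch under the assumption that Nina's further 30 rolls are independent, so a binomial model applies; the function names `p_of` and `expected_twelves` are hypothetical helpers.

```python
from fractions import Fraction
from math import comb

def p_of(k):
    # Eleven faces have probability p and one has kp, so 11p + kp = 1.
    return 1 / (11 + Fraction(k))

def expected_twelves(k, rolls=120):
    # Expected count = rolls * P(X = 12) = rolls * k * p.
    return rolls * Fraction(k) * p_of(k)

# Part (d): solve 120k / (11 + k) = 32, i.e. (120 - 32) k = 32 * 11.
k_est = Fraction(32 * 11, 120 - 32)
assert expected_twelves(k_est) == 32

# Part (e): binomial probability of exactly 8 twelves in 30 rolls.
p12 = k_est * p_of(k_est)
prob_8 = comb(30, 8) * p12 ** 8 * (1 - p12) ** 22

print(k_est, p12, float(prob_8))
```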
OCR MEI Paper 2 Specimen Q8
3 marks Standard +0.8
Alison selects 10 of her male friends. For each one she measures the distance between his eyes. The distances, measured in mm, are as follows: 51 57 58 59 61 64 64 65 67 68 The mean of these data is 61.4. The sample standard deviation is 5.232, correct to 3 decimal places. One of the friends decides he does not want his measurement to be used. Alison replaces his measurement with the measurement from another male friend. This increases the mean to 62.0 and reduces the standard deviation. Give a possible value for the measurement which has been removed and find the measurement which has replaced it. [3]
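The reasoning can be sketched as a small search: the change in the mean fixes the difference between the two measurements, and the standard deviation condition narrows down which value could have been removed. This is an illustrative sketch, not a mark-scheme method; it uses the sample standard deviation (divisor \(n - 1\)) to match the value quoted in the question.

```python
from statistics import stdev

data = [51, 57, 58, 59, 61, 64, 64, 65, 67, 68]
old_sd = stdev(data)          # sample standard deviation (divisor n - 1)

# The mean rises from 61.4 to 62.0 over 10 values, so the total rises by 6:
# the replacement must be 6 mm more than the value removed.
shift = round((62.0 - 61.4) * len(data))

candidates = []
for removed in sorted(set(data)):
    new_data = list(data)
    new_data.remove(removed)            # drop one copy of the removed value
    new_data.append(removed + shift)    # add the replacement measurement
    if stdev(new_data) < old_sd:        # keep only swaps that reduce the spread
        candidates.append((removed, removed + shift))

print(round(old_sd, 3), candidates)
```

Each pair in `candidates` is a (removed, replacement) combination consistent with both conditions.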
OCR MEI Paper 2 Specimen Q15
15 marks Standard +0.3
A quality control department checks the lifetimes of batteries produced by a company. The lifetimes, \(x\) minutes, for a random sample of 80 'Superstrength' batteries are shown in the table below.

| Lifetime | \(160 \leq x < 165\) | \(165 \leq x < 168\) | \(168 \leq x < 170\) | \(170 \leq x < 172\) | \(172 \leq x < 175\) | \(175 \leq x < 180\) |
|---|---|---|---|---|---|---|
| Frequency | 5 | 14 | 20 | 21 | 16 | 4 |

  1. Estimate the proportion of these batteries which have a lifetime of at least 174.0 minutes. [2]
  2. Use the data in the table to estimate
    [3]
The data in the table on the previous page are represented in the following histogram, Fig 15. \includegraphics{figure_15} A quality control manager models the data by a Normal distribution with the mean and standard deviation you calculated in part (b).
  1. Comment briefly on whether the histogram supports this choice of model. [2]
    1. Use this model to estimate the probability that a randomly selected battery will have a lifetime of more than 174.0 minutes.
    2. Compare your answer with your answer to part (a). [3]
The company also manufactures 'Ultrapower' batteries, which are stated to have a mean lifetime of 210 minutes.
  1. A random sample of 8 Ultrapower batteries is selected. The mean lifetime of these batteries is 207.3 minutes. Carry out a hypothesis test at the 5% level to investigate whether the mean lifetime is as high as stated. You should use the following hypotheses \(\text{H}_0 : \mu = 210\), \(\text{H}_1 : \mu < 210\), where \(\mu\) represents the population mean for Ultrapower batteries. You should assume that the population is Normally distributed with standard deviation 3.4. [5]
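The hypothesis test on the Ultrapower batteries is a one-tailed test on a Normal mean with known standard deviation. A minimal sketch of the calculation, using only values stated in the question (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu0 = 210            # mean lifetime under H0
sigma = 3.4          # population standard deviation (given)
n = 8
sample_mean = 207.3

# Under H0 the sample mean is Normal with standard error sigma / sqrt(n).
standard_error = sigma / sqrt(n)
z = (sample_mean - mu0) / standard_error

# Lower-tail p-value, because H1 is mu < 210.
p_value = NormalDist().cdf(z)

reject_h0 = p_value < 0.05
print(round(z, 3), round(p_value, 4), reject_h0)
```

A p-value below 0.05 means the sample gives sufficient evidence at the 5% level that the mean lifetime is lower than stated.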
OCR MEI Paper 2 Specimen Q16
20 marks Easy -1.8
Fig. 16.1, Fig. 16.2 and Fig. 16.3 show some data about life expectancy, including some from the pre-release data set. \includegraphics{figure_16_1} \includegraphics{figure_16_2} \includegraphics{figure_16_3}
  1. Comment on the shapes of the distributions of life expectancy at birth in 2014 and 1974. [2]
    1. The minimum value shown in the box plot is negative. What does a negative value indicate? [1]
    2. What feature of Fig 16.3 suggests that a Normal distribution would not be an appropriate model for increase in life expectancy from one year to another year? [1]
    3. Software has been used to obtain the values in the table in Fig. 16.3. Decide whether the level of accuracy is appropriate. Justify your answer. [1]
    4. John claims that for half the people in the world their life expectancy has improved by 10 years or more. Explain why Fig. 16.3 does not provide conclusive evidence for John's claim. [1]
  2. Decide whether the maximum increase in life expectancy from 1974 to 2014 is an outlier. Justify your answer. [3]
Here is some further information from the pre-release data set.

| Country | Life expectancy at birth in 2014 |
|---|---|
| Ethiopia | 60.8 |
| Sweden | 81.9 |

    1. Estimate the change in life expectancy at birth for Ethiopia between 1974 and 2014.
    2. Estimate the change in life expectancy at birth for Sweden between 1974 and 2014.
    3. Give one possible reason why the answers to parts (i) and (ii) are so different. [4]
Fig. 16.4 shows the relationship between life expectancy at birth in 2014 and 1974. \includegraphics{figure_16_4} A spreadsheet gives the following linear model for all the data in Fig 16.4. (Life expectancy at birth 2014) = 30.98 + 0.67 × (Life expectancy at birth 1974) The life expectancy at birth in 1974 for the region that now constitutes the country of South Sudan was 37.4 years. The value for this country in 2014 is not available.
    1. Use the linear model to estimate the life expectancy at birth in 2014 for South Sudan. [2]
    2. Give two reasons why your answer to part (i) is not likely to be an accurate estimate for the life expectancy at birth in 2014 for South Sudan. You should refer to both information from Fig 16.4 and your knowledge of the large data set. [2]
  1. In how many of the countries represented in Fig. 16.4 did life expectancy drop between 1974 and 2014? Justify your answer. [3]
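The spreadsheet's linear model for Fig. 16.4 can be evaluated directly for the South Sudan estimate. A minimal sketch; `predict_2014` is a hypothetical helper name.

```python
def predict_2014(life_exp_1974):
    # Linear model quoted from the spreadsheet in the question.
    return 30.98 + 0.67 * life_exp_1974

south_sudan_2014 = predict_2014(37.4)
print(round(south_sudan_2014, 1))
```

Under the same convention, a country where life expectancy dropped is one whose 2014 value is below its 1974 value, that is, a point below the line \(y = x\) in Fig. 16.4.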
WJEC Unit 2 2018 June Q06
10 marks Moderate -0.8
Basel is a keen learner of languages. He finds a website on which a large number of language tutors offer their services. Basel records the cost, in dollars, of a one hour lesson from a random sample of tutors. He puts the data into a computer program which gives the following summary statistics.
Cost per 1 hour lesson

| Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|
| 10.0 | 16.0 | 17.2 | 19.8 | 21.0 | 40.0 |

  1. Showing all calculations, comment on any outliers for the cost of a one hour lesson with a language tutor. [4]
  2. Describe the skewness of the data and explain what it means in this context. [2]
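The outlier check on Basel's summary statistics can be sketched as follows, assuming the common convention that an outlier lies more than 1.5 × IQR beyond a quartile (the question does not state which rule to use).

```python
q1, median, mean, q3 = 16.0, 17.2, 19.8, 21.0
minimum, maximum = 10.0, 40.0

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr     # values below this would be outliers
upper_fence = q3 + 1.5 * iqr     # values above this would be outliers

has_low_outlier = minimum < lower_fence
has_high_outlier = maximum > upper_fence

# A mean above the median is one common indicator of positive skew.
mean_above_median = mean > median

print(lower_fence, upper_fence, has_low_outlier, has_high_outlier, mean_above_median)
```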
Dafydd is also a keen learner of languages. He takes his own random sample of the cost, in dollars, for a one hour lesson. He produces the following box plot. \includegraphics{figure_6}
    1. What will happen to the mean if the outlier is removed?
    2. What will happen to the median if the outlier is removed? [2]
  1. Compare and contrast the distributions of the cost of one hour language lessons for Dafydd's sample and Basel's sample. [2]
WJEC Unit 2 2024 June Q5
8 marks Easy -1.2
In March 2020, the coronavirus pandemic caused major disruption to the lives of individuals across the world. A newspaper published the following graph from the gov.uk website, along with an article which included the following excerpt. "The daily number of vaccines administered continues to fall. In order to get control of the virus, we need the number of people receiving a second dose of the vaccine to keep rocketing. The fear is it will start to drop off soon, which will leave many people still unprotected." \includegraphics{figure_5}
  1. By referring to the graph, explain how the quote could be misleading. [1]
The daily numbers of second dose vaccines, in thousands, over the period April 1st 2021 to May 31st 2021 are shown in the table below.

| Daily number of 2nd dose vaccines (1000s) | Midpoint \(x\) | Frequency \(f\) | Percentage |
|---|---|---|---|
| \(0 \leqslant v < 100\) | 50 | 2 | 3·3 |
| \(100 \leqslant v < 200\) | 150 | 8 | 13·1 |
| \(200 \leqslant v < 300\) | 250 | 10 | 16·4 |
| \(300 \leqslant v < 400\) | 350 | 13 | 21·3 |
| \(400 \leqslant v < 500\) | 450 | 26 | 42·6 |
| \(500 \leqslant v < 600\) | 550 | 2 | 3·3 |
| Total | | 61 | 100 |

    1. Calculate estimates of the mean and standard deviation for the daily number of second dose vaccines given over this period. You may use \(\sum x^2 f = 8272500\). [4]
    2. Comment on the skewness of these data. [1]
  1. Give a possible reason for the pattern observed in this graph. [1]
  2. State, with a reason, whether or not you think the data for April 15th to April 18th are incorrect. [1]
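The estimates of the mean and standard deviation from the vaccine table can be sketched as below. The sketch checks the midpoints and frequencies against the given \(\sum x^2 f = 8272500\), then reports the standard deviation with both divisors, since the question does not say whether \(n\) or \(n - 1\) is intended. Results are in thousands of vaccines.

```python
from math import sqrt

midpoints = [50, 150, 250, 350, 450, 550]
freqs = [2, 8, 10, 13, 26, 2]

n = sum(freqs)
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
assert sum_fx2 == 8_272_500          # agrees with the value given in the question

mean = sum_fx / n
sd_n = sqrt(sum_fx2 / n - mean ** 2)                  # divisor n
sd_n1 = sqrt((sum_fx2 - n * mean ** 2) / (n - 1))     # divisor n - 1

print(round(mean, 1), round(sd_n, 1), round(sd_n1, 1))
```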
WJEC Unit 2 Specimen Q5
12 marks Easy -1.2
Gareth has a keen interest in pop music. He recently read the following claim in a music magazine. In the pop industry most songs on the radio are not longer than three minutes.
  1. He decided to investigate this claim by recording the lengths of the top 50 singles in the UK Official Singles Chart for the week beginning 17 June 2016. (A 'single' in this context is one digital audio track.) Comment on the suitability of this sample to investigate the magazine's claim. [1]
  2. Gareth recorded the data in the table below.
    Length of singles for top 50 UK Official Chart singles, 17 June 2016
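
    | 2.5-(3.0) | 3.0-(3.5) | 3.5-(4.0) | 4.0-(4.5) | 4.5-(5.0) | 5.0-(5.5) | 5.5-(6.0) | 6.0-(6.5) | 6.5-(7.0) | 7.0-(7.5) |
    |---|---|---|---|---|---|---|---|---|---|
    | 3 | 17 | 22 | 7 | 0 | 0 | 0 | 0 | 0 | 1 |
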
    He used these data to produce a graph of the distribution of the lengths of singles. \includegraphics{figure_2} State two corrections that Gareth needs to make to the histogram so that it accurately represents the data in the table. [2]
  3. Gareth also produced a box plot of the lengths of singles. \includegraphics{figure_3} He sees that there is one obvious outlier.
    1. What will happen to the mean if the outlier is removed?
    2. What will happen to the standard deviation if the outlier is removed? [2]
  4. Gareth decided to remove the outlier. He then produced a table of summary statistics.
    1. Use the appropriate statistics from the table to show, by calculation, that the maximum value for the length of a single is not an outlier.
      Summary statistics
      Length of single for top 50 UK Official Singles Chart (minutes)

      | Length of single | N | Mean | Standard deviation | Minimum | Lower quartile | Median | Upper quartile | Maximum |
      |---|---|---|---|---|---|---|---|---|
      | | 49 | 3.57 | 0.393 | 2.77 | 3.26 | 3.60 | 3.89 | 4.38 |

    2. State, with a reason, whether these statistics support the magazine's claim. [4]
  5. Gareth also calculated summary statistics for the lengths of 30 singles selected at random from his personal collection.
    Summary statistics
    Length of single for Gareth's random sample of 30 singles (minutes)

    | Length of single | N | Mean | Standard deviation | Minimum | Lower quartile | Median | Upper quartile | Maximum |
    |---|---|---|---|---|---|---|---|---|
    | | 30 | 3.13 | 0.364 | 2.58 | 2.73 | 2.92 | 3.22 | 3.95 |

    Compare and contrast the distribution of lengths of singles in Gareth's personal collection with the distribution in the top 50 UK Official Singles Chart. [3]
SPS SPS SM 2021 February Q4
10 marks Easy -1.3
Each member of a group of 27 people was timed when completing a puzzle. The time taken, \(x\) minutes, for each member of the group was recorded. These times are summarised in the following box and whisker plot. \includegraphics{figure_4}
  1. Find the range of the times. [1]
  2. Find the interquartile range of the times. [1]
  3. For these 27 people, \(\sum x = 607.5\) and \(\sum x^2 = 17623.25\). Calculate the mean time taken to complete the puzzle. [1]
  4. Calculate the standard deviation of the times taken to complete the puzzle. [2]
  5. Taruni defines an outlier as a value more than 3 standard deviations above the mean. State how many outliers Taruni would say there are in these data, giving a reason for your answer. [1]
  6. Adam and Beth also completed the puzzle in \(a\) minutes and \(b\) minutes respectively, where \(a > b\). When their times are included with the data of the other 27 people
    Suggest a possible value for \(a\) and a possible value for \(b\), explaining how your values satisfy the above conditions. [3]
  7. Without carrying out any further calculations, explain why the standard deviation of all 29 times will be lower than your answer to part (d). [1]
SPS SPS FM Statistics 2025 April Q2
13 marks Moderate -0.8
In a study of reaction times, 25 participants completed a test where their reaction times (in milliseconds) were recorded. The results are shown in the stem-and-leaf diagram below:
20 | 3 5 7 9
21 | 0 2 5 6 8
22 | 1 3 4 5 7 9
23 | 0 2 5 8
24 | 1 4 6 7
25 | 2 5
Key: 21 | 0 represents a reaction time of 210 milliseconds
  1. State the median reaction time. [1]
  2. Calculate the interquartile range of these reaction times. [2]
  3. Find the mean and standard deviation of these reaction times. [3]
  4. State one advantage of using a stem-and-leaf diagram to display this data rather than a frequency table. [1]
  5. One participant completed the test again and recorded a reaction time of 195 milliseconds. Add this result to the stem-and-leaf diagram and state the effect this would have on:
    1. the median
    2. the mean
    3. the standard deviation
    [4]
  6. Explain why the interquartile range might be preferred to the standard deviation as a measure of spread in this context [2]
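The stem-and-leaf diagram can be expanded back into raw data to check the median, mean and standard deviation, and to see the effect of adding the 195 ms result. This is a sketch assuming the population standard deviation (`pstdev`); the direction of each change is the same with the sample form. Quartiles are left out because conventions for computing them differ.

```python
from statistics import mean, median, pstdev

# Stem-and-leaf rows: stem (tens of milliseconds) -> leaves (units).
stem_leaf = {
    20: [3, 5, 7, 9],
    21: [0, 2, 5, 6, 8],
    22: [1, 3, 4, 5, 7, 9],
    23: [0, 2, 5, 8],
    24: [1, 4, 6, 7],
    25: [2, 5],
}
times = [10 * stem + leaf for stem, leaves in stem_leaf.items() for leaf in leaves]

before = (median(times), mean(times), pstdev(times))

# Add the repeated test result of 195 ms and recompute.
after_times = times + [195]
after = (median(after_times), mean(after_times), pstdev(after_times))

print(before)
print(after)
```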
SPS SPS SM Statistics 2025 April Q5
13 marks Easy -1.3
In a study of reaction times, 25 participants completed a test where their reaction times (in milliseconds) were recorded. The results are shown in the stem-and-leaf diagram below:
20 | 3 5 7 9
21 | 0 2 5 6 8
22 | 1 3 4 5 7 9
23 | 0 2 5 8
24 | 1 4 6 7
25 | 2 5
Key: 21 | 0 represents a reaction time of 210 milliseconds
  1. State the median reaction time. [1]
  2. Calculate the interquartile range of these reaction times. [2]
  3. Find the mean and standard deviation of these reaction times. [3]
  4. State one advantage of using a stem-and-leaf diagram to display this data rather than a frequency table. [1]
  5. One participant completed the test again and recorded a reaction time of 195 milliseconds. Add this result to the stem-and-leaf diagram and state the effect this would have on: a. the median b. the mean c. the standard deviation [4]
  6. Explain why the interquartile range might be preferred to the standard deviation as a measure of spread in this context [2]
OCR H240/02 2018 December Q10
6 marks Moderate -0.8
Using the 2001 UK census results and some software, Javid intended to calculate the mean number of people who travelled to work by underground, metro, light rail or tram (UMLT) for all 348 Local Authorities. However, Javid noticed that for one LA the entry in the UMLT column is a dash, rather than a 0. See the extract below.
Data extract for one LA in 2001
Work mainly at or from homeUMLTTrainBus, minibus or coach
29544
Javid felt that it was not clear how this LA was to be treated so he decided to omit it from his calculation.
  1. Explain how the omission of this LA affects Javid's calculation of the mean. [1]
The value of the mean that Javid obtained was 2046.3.
  1. Calculate the value of the mean when this LA is not removed. [2]
Javid finds that the corresponding mean for all Local Authorities for 2011 is 2860.8. In order to compare the means for the two years, Javid also finds the total number of employees in each of these years. His results are given below.

| Year | 2001 | 2011 |
|---|---|---|
| Total number of employees | 23 627 753 | 26 526 336 |

  1. Show that a higher proportion of employees used the metro to travel to work in 2011 than in 2001. [2]
  2. Suggest a reason for this increase. [1]
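The effect of the omitted Local Authority and the comparison of proportions can be sketched as below. Two assumptions are made and flagged in the comments: the dash is treated as 0 travellers, and the 2011 mean is taken over all 348 Local Authorities.

```python
n_all = 348                 # Local Authorities
mean_2001_excl = 2046.3     # Javid's mean over the 347 LAs he kept

# Total UMLT travellers in 2001, recovered from the mean.
total_2001 = mean_2001_excl * (n_all - 1)
# Assumption: the dash stands for 0, so the total is unchanged but n is 348.
mean_2001_incl = total_2001 / n_all

mean_2011 = 2860.8
# Assumption: the 2011 mean covers all 348 LAs.
total_2011 = mean_2011 * n_all

# Proportion of all employees travelling by UMLT in each year.
prop_2001 = total_2001 / 23_627_753
prop_2011 = total_2011 / 26_526_336

print(round(mean_2001_incl, 1), round(prop_2001, 4), round(prop_2011, 4))
```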
Pre-U Pre-U 9794/1 2010 June Q13
10 marks Moderate -0.3
A survey was conducted into the annual salary offered for 19 different jobs in 2008. The results were as follows, in thousands of pounds.
15 16 18 19 21 36 36 38 41 41
43 47 51 55 56 60 62 64 110
It was decided to undertake a further study to see if self-esteem was correlated with level of annual salary. A random sample of 11 employees was taken and self-esteem was rated on a scale of 1 to 10 with the highest self-esteem being 10. The results were as follows.
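
| Salary in £10 000's | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Self-esteem | 4 | 3 | 5 | 1 | 7 | 7 | 8 | 5 | 10 | 7 | 9 |
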
Pre-U Pre-U 9794/1 2011 June Q14
9 marks Standard +0.3
  1. The table below relates the values of two variables \(x\) and \(y\).

    | \(x\) | 1 | \(A\) | \(A + 3\) | 10 |
    |---|---|---|---|---|
    | \(y\) | 2 | \(A - 1\) | \(A\) | 5 |

    \(A\) is a positive integer and \(\sum xy = 92\).
    1. Calculate the value of \(A\). [3]
    2. Explain how you can tell that the product-moment correlation coefficient is 1. [1]
  2. A music society has 300 members. 240 like Puccini, 100 like Wagner and 50 like neither.
    1. Calculate the probability that a member chosen at random likes Puccini but not Wagner. [3]
    2. Calculate the probability that a member chosen at random likes Puccini given that this member likes Wagner. [2]
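For the first part of this question, the value of \(A\) and the reason the correlation is perfect can be checked with a short search. This is an illustrative sketch: the search bound of 50 is arbitrary, and `sum_xy` is a hypothetical helper.

```python
from fractions import Fraction

def table(a):
    # The four (x, y) pairs from the table, for a given value of A.
    xs = [1, a, a + 3, 10]
    ys = [2, a - 1, a, 5]
    return xs, ys

def sum_xy(a):
    xs, ys = table(a)
    return sum(x * y for x, y in zip(xs, ys))

# Positive integers A for which the sum of xy equals 92.
solutions = [a for a in range(1, 50) if sum_xy(a) == 92]

xs, ys = table(solutions[0])
# Gradient between each pair of consecutive points.
slopes = {Fraction(ys[i + 1] - ys[i], xs[i + 1] - xs[i]) for i in range(3)}

print(solutions, xs, ys, slopes)
```

A single positive gradient shared by every consecutive pair means the points lie on one straight line with positive slope, which is why the product-moment correlation coefficient is 1.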
Pre-U Pre-U 9794/3 2013 November Q5
8 marks Moderate -0.8
The table summarises 43 birth weights as recorded for babies born in a particular hospital during one week.

| Birth weight (w kg) | \(2.0 \leqslant w < 2.5\) | \(2.5 \leqslant w < 3.0\) | \(3.0 \leqslant w < 3.5\) | \(3.5 \leqslant w < 4.0\) | \(4.0 \leqslant w < 4.5\) |
|---|---|---|---|---|---|
| Frequency | 1 | 6 | 9 | 17 | 10 |

  1. State the type of skewness of the data. [1]
  2. Given that the lower quartile is 3.21 kg and the upper quartile is 3.96 kg, determine whether there are any babies whose birth weights might be regarded as outliers. [4]
  3. The mean birth weight was found to be 3.58 kg. However, it was discovered subsequently that the table includes the birth weight, 2.52 kg, of one baby that has been recorded twice. Find the mean birth weight after this error has been removed. [3]
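The outlier boundaries and the corrected mean can be sketched as below, assuming the 1.5 × IQR convention for outliers (the question does not name a rule). Variable names are illustrative.

```python
# Part 2: outlier boundaries from the given quartiles.
q1, q3 = 3.21, 3.96
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Part 3: remove the weight that was recorded twice and recompute the mean.
n = 43
reported_mean = 3.58
duplicate = 2.52

total = n * reported_mean                      # total weight as recorded
corrected_mean = (total - duplicate) / (n - 1)

print(round(lower_fence, 3), round(upper_fence, 3), round(corrected_mean, 2))
```

Only a weight in the lowest class, below the lower boundary, could be an outlier, since the highest class ends at 4.5 kg.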
Pre-U Pre-U 9794/3 2014 June Q1
5 marks Easy -1.3
The masses, in kilograms, of 100 chickens on sale in a large supermarket were recorded as follows.

| Mass (\(x\) kg) | \(1.6 \leqslant x < 1.8\) | \(1.8 \leqslant x < 2.0\) | \(2.0 \leqslant x < 2.2\) | \(2.2 \leqslant x < 2.4\) | \(2.4 \leqslant x < 2.6\) |
|---|---|---|---|---|---|
| Number of chickens | 16 | 27 | 28 | 18 | 11 |

Calculate estimates of the mean and standard deviation of the masses of these chickens. [5]
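Estimates from a grouped table treat every value in a class as if it sat at the class midpoint. A minimal sketch of that calculation, assuming the population form of the variance (divisor \(n\)):

```python
from math import sqrt

# Class boundaries (kg) and frequencies from the table.
classes = [(1.6, 1.8), (1.8, 2.0), (2.0, 2.2), (2.2, 2.4), (2.4, 2.6)]
freqs = [16, 27, 28, 18, 11]

midpoints = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)

mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
# Mean of the squares minus the square of the mean.
variance = sum(f * x * x for x, f in zip(midpoints, freqs)) / n - mean ** 2
sd = sqrt(variance)

print(round(mean, 3), round(sd, 3))
```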
Pre-U Pre-U 9794/3 2019 Specimen Q3
5 marks Easy -1.8
The table shows fuel economy figures in miles per gallon (mpg) for some new cars.
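
| Car | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mpg | 57 | 40 | 34 | 33 | 11 | 17 | 30 | 27 | 31 | 20 | 35 | 24 | 26 | 23 | 32 |
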
  1. Find the median and quartiles for the mpg of these 15 cars. [2]
  2. Use the values in part (a) to identify any cars for which the mpg is an outlier. [3]
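The median, quartiles and outlier check can be sketched as below. Two conventions are assumed, since the question states neither: quartiles are the medians of the lower and upper halves with the overall median excluded, and an outlier lies more than 1.5 × IQR beyond a quartile.

```python
from statistics import median

mpg = dict(zip("ABCDEFGHIJKLMNO",
               [57, 40, 34, 33, 11, 17, 30, 27, 31, 20, 35, 24, 26, 23, 32]))

values = sorted(mpg.values())
half = len(values) // 2

q2 = median(values)
q1 = median(values[:half])         # lower half, excluding the median
q3 = median(values[half + 1:])     # upper half, excluding the median

iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = {car: v for car, v in mpg.items() if v < low or v > high}

print(q1, q2, q3, outliers)
```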
Pre-U Pre-U 9794/3 2020 Specimen Q3
5 marks Easy -1.3
The table shows fuel economy figures in miles per gallon (mpg) for some new cars.
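
| Car | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mpg | 57 | 40 | 34 | 33 | 11 | 17 | 30 | 27 | 31 | 20 | 35 | 24 | 26 | 23 | 32 |
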
  1. Find the median and quartiles for the mpg of these 15 cars. [2]
  2. Use the values in part (a) to identify any cars for which the mpg is an outlier. [3]