2.02g Calculate mean and standard deviation

382 questions

Edexcel S4 Q6
12 marks Standard +0.3
Brickland and Goodbrick are two manufacturers of bricks. The lengths of the bricks produced by each manufacturer can be assumed to be normally distributed. A random sample of 20 bricks is taken from Brickland and the length, \(x\) mm, of each brick is recorded. The mean of this sample is 207.1 mm and the variance is 3.2 mm².
  1. Calculate the 98\% confidence interval for the mean length of brick from Brickland. [4]
A random sample of 10 bricks is selected from those manufactured by Goodbrick. The length of each brick, \(y\) mm, is recorded. The results are summarised as follows. \(\sum y = 2046.2\) \(\sum y^2 = 418785.4\) The variances of the length of brick for each manufacturer are assumed to be the same.
  2. Find a 90\% confidence interval for the value by which the mean length of brick made by Brickland exceeds the mean length of brick made by Goodbrick. [8]
(Total 12 marks)
OCR H240/02 2020 November Q11
9 marks Moderate -0.3
As part of a research project, the masses, \(m\) grams, of a random sample of 1000 pebbles from a certain beach were recorded. The results are summarised in the table.
Mass (g) | \(50 \leq m < 150\) | \(150 \leq m < 200\) | \(200 \leq m < 250\) | \(250 \leq m < 350\)
Frequency | 162 | 318 | 355 | 165
  1. Calculate estimates of the mean and standard deviation of these masses. [2]
The masses, \(x\) grams, of a random sample of 1000 pebbles on a different beach were also found. It was proposed that the distribution of these masses should be modelled by the random variable \(X \sim N(200, 3600)\).
  1. Use the model to find \(P(150 < X < 210)\). [1]
  2. Use the model to determine \(x_1\) such that \(P(160 < X < x_1) = 0.6\), giving your answer correct to five significant figures. [3]
It was found that the smallest and largest masses of the pebbles in this second sample were 112 g and 288 g respectively.
  1. Use these results to show that the model may not be appropriate. [1]
  2. Suggest a different value of a parameter of the model in the light of these results. [2]
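The grouped-data estimates asked for in the first part can be checked numerically. The following is a minimal sketch, assuming every pebble is treated as lying at its class midpoint and that the divisor \(n\) form of the variance is wanted; the function name `grouped_mean_sd` is illustrative and not part of the question.

```python
from math import sqrt

# Class boundaries and frequencies copied from the pebble table.
classes = [(50, 150), (150, 200), (200, 250), (250, 350)]
freqs = [162, 318, 355, 165]

def grouped_mean_sd(classes, freqs):
    """Estimate mean and SD by treating every value as its class midpoint."""
    mids = [(lo + hi) / 2 for lo, hi in classes]
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, mids))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, mids))
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2   # divisor n (assumed convention)
    return mean, sqrt(variance)

mean, sd = grouped_mean_sd(classes, freqs)
print(round(mean, 1), round(sd, 1))
```

Because the midpoints stand in for the unknown individual masses, the results are estimates rather than exact values.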
OCR H240/02 2023 June Q7
5 marks Standard +0.8
A student wishes to prove that, for all positive integers \(a\) and \(b\), \(a^2 - 4b \neq 2\).
  1. Prove that \(a^2 - 4b = 2 \Rightarrow a\) is even. [2]
  2. Hence or otherwise prove that, for all positive integers \(a\) and \(b\), \(a^2 - 4b \neq 2\). [3]
AQA AS Paper 2 2018 June Q14
1 mark Easy -1.8
Given that \(\sum x = 364\), \(\sum x^2 = 19412\), \(n = 10\), find \(\sigma\), the standard deviation of \(X\). Circle your answer. \(24.8 \quad 44.1 \quad 616.2 \quad 1941.2\) [1 mark]
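A short sketch of the summary-statistics route, using \(\sigma = \sqrt{\sum x^2 / n - \bar{x}^2}\). The helper name `sd_from_sums` is an assumption made for illustration.

```python
from math import sqrt

def sd_from_sums(n, sum_x, sum_x2):
    """Population standard deviation from n, sum of x and sum of x squared."""
    mean = sum_x / n
    return sqrt(sum_x2 / n - mean ** 2)

sigma = sd_from_sums(10, 364, 19412)
print(round(sigma, 1))
```

Two of the listed options appear as intermediate values on this route: \(\sum x^2 / n = 1941.2\) and the variance \(616.24\), so it is worth carrying the calculation through to the square root.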
AQA AS Paper 2 2020 June Q15
3 marks Moderate -0.8
A random sample of ten CO₂ emissions was selected from the Large Data Set. The emissions in grams per kilogram were: \(13 \quad 45 \quad 45 \quad 0 \quad 49 \quad 77 \quad 49 \quad 49 \quad 49 \quad 78\)
  1. Find the standard deviation of the sample. [1 mark]
  2. An environmentalist calculated the average CO₂ emissions for cars in the Large Data Set registered in 2002 and in 2016. The averages are listed below.
    Year of registration | 2002 | 2016
    Average CO₂ emission | 171.2 | 120.4
    The environmentalist claims that the average CO₂ emissions for 2002 and 2016 combined is 145.8. Determine whether this claim is correct. Fully justify your answer. [2 marks]
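The figure 145.8 is the simple average \((171.2 + 120.4)/2\), which matches the combined mean only when both years contribute the same number of cars. The sketch below illustrates this with a weighted mean; the group sizes are hypothetical and are not taken from the Large Data Set.

```python
def combined_mean(mean_a, n_a, mean_b, n_b):
    """Mean of two groups pooled together, weighting each mean by its group size."""
    return (mean_a * n_a + mean_b * n_b) / (n_a + n_b)

# Equal group sizes reproduce the simple average of the two means.
equal = combined_mean(171.2, 100, 120.4, 100)

# Hypothetical unequal sizes (NOT from the Large Data Set) shift the result.
unequal = combined_mean(171.2, 100, 120.4, 300)

print(round(equal, 1), round(unequal, 1))
```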
AQA AS Paper 2 2024 June Q16
5 marks Easy -1.8
An investigation into the hydrocarbon emissions, \(X\) g/km, from cars in the Large Data Set was carried out. The results are summarised below. $$\sum x = 128.657 \qquad \sum x^2 = 8.701 \, 707 \qquad n = 2405$$ where \(n\) is the total number of cars which had a measured hydrocarbon emission in the Large Data Set.
    1. Find the mean of \(X\) [1 mark]
    2. Find the standard deviation of \(X\) [2 marks]
    1. The Large Data Set is a sample taken from the entire UK Department for Transport Stock Vehicle Database. It is claimed that the values in part (a)(i) and part (a)(ii) obtained from the Large Data Set should be reliable estimates for the mean and standard deviation of the hydrocarbon emissions for the entire UK Department for Transport Stock Vehicle Database. State, with a reason, whether this claim is likely to be correct. [1 mark]
    2. State one type of emission where more than 80% of the data is known for cars in the entire UK Department for Transport Stock Vehicle Database. [1 mark]
AQA AS Paper 2 Specimen Q17
6 marks Moderate -0.8
The table below is an extract from the Large Data Set.
Make | Region | Engine size | Mass | CO2 | CO
VAUXHALL | South West | 1398 | 1163 | 118 | 0.463
VOLKSWAGEN | London | 999 | 1055 | 106 | 0.407
VAUXHALL | South West | 1248 | 1225 | 85 | 0.141
BMW | South West | 2979 | 1635 | 194 | 0.139
TOYOTA | South West | 1995 | 1650 | 123 | 0.274
BMW | South West | 2979 | 0 | 244 | 0.447
FORD | South West | 1596 | 0 | 165 | 0.518
TOYOTA | South West | 1299 | 1050 | 144 |
VAUXHALL | London | 1398 | 1361 | 140 | 0.695
FORD | North West | 4951 | 1799 | 299 | 0.621
    1. Calculate the standard deviation of the engine sizes in the table. [1 mark]
    2. The mean of the engine sizes is 2084. Any value more than 2 standard deviations from the mean can be identified as an outlier. Using this definition of an outlier, show that the sample of engine sizes has exactly one outlier. Fully justify your answer. [3 marks]
  1. Rajan calculates the mean of the masses of the cars in this extract and states that it is 1094 kg. Use your knowledge of the Large Data Set to suggest what error Rajan is likely to have made in his calculation. [1 mark]
  2. Rajan claims there is an error in the data recorded in the table for one of the Toyotas from the South West, because there is no value for its carbon monoxide emissions. Use your knowledge of the Large Data Set to comment on Rajan's claim. [1 mark]
AQA Paper 3 2018 June Q16
12 marks Moderate -0.3
A survey of 120 adults found that the volume, \(X\) litres per person, of carbonated drinks they consumed in a week had the following results: $$\sum x = 165.6 \quad \sum x^2 = 261.8$$
    1. Calculate the mean of \(X\). [1 mark]
    2. Calculate the standard deviation of \(X\). [2 marks]
  1. Assuming that \(X\) can be modelled by a normal distribution find
    1. P\((0.5 < X < 1.5)\) [2 marks]
    2. P\((X = 1)\) [1 mark]
  2. Determine with a reason, whether a normal distribution is suitable to model this data. [2 marks]
  3. It is known that the volume, \(Y\) litres per person, of energy drinks consumed in a week may be modelled by a normal distribution with standard deviation 0.21. Given that P\((Y > 0.75) = 0.10\), find the value of \(\mu\), correct to three significant figures. [4 marks]
AQA Paper 3 2019 June Q12
6 marks Moderate -0.3
Amelia decides to analyse the heights of members of her school rowing club. The heights of a random sample of 10 rowers are shown in the table below.
Rower | Jess | Nell | Liv | Neve | Ann | Tori | Maya | Kath | Darcy | Jen
Height (cm) | 162 | 169 | 172 | 156 | 146 | 161 | 159 | 164 | 157 | 160
  1. Any value more than 2 standard deviations from the mean may be regarded as an outlier. Verify that Ann's height is an outlier. Fully justify your answer. [4 marks]
  2. Amelia thinks she may have written down Ann's height incorrectly. If Ann's height were discarded, state with a reason what, if any, difference this would make to the mean and standard deviation. [2 marks]
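The two-standard-deviation rule can be sketched as a small check on the rowers' heights. This assumes the population form of the standard deviation (`statistics.pstdev`, divisor \(n\)); the function name `two_sd_outliers` is illustrative.

```python
from statistics import mean, pstdev

heights = {
    "Jess": 162, "Nell": 169, "Liv": 172, "Neve": 156, "Ann": 146,
    "Tori": 161, "Maya": 159, "Kath": 164, "Darcy": 157, "Jen": 160,
}

def two_sd_outliers(data, k=2):
    """Return the names whose value lies more than k standard deviations from the mean."""
    m = mean(data.values())
    s = pstdev(data.values())
    return [name for name, x in data.items() if abs(x - m) > k * s]

print(two_sd_outliers(heights))

# Effect of discarding Ann's height on the mean and standard deviation.
rest = [h for name, h in heights.items() if name != "Ann"]
print(mean(rest) > mean(heights.values()), pstdev(rest) < pstdev(heights.values()))
```

Swapping `pstdev` for `stdev` (divisor \(n - 1\)) widens the limits slightly, so it is worth stating which form is being used when justifying the answer.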
AQA Paper 3 2020 June Q11
1 mark Easy -1.8
The table below shows the temperature on Mount Everest on the first day of each month.
Month | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
Temperature (\(^\circ\)C) | \(-17\) | \(-16\) | \(-14\) | \(-9\) | \(-2\) | \(2\) | \(6\) | \(5\) | \(-3\) | \(-4\) | \(-11\) | \(-18\)
Calculate the standard deviation of these temperatures. Circle your answer. [1 mark] \(-6.75 \quad 5.82 \quad 8.24 \quad 67.85\)
AQA Paper 3 2021 June Q13
6 marks Moderate -0.8
The table below is an extract from the Large Data Set.
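Propulsion Type | Region | Engine Size | Mass | CO₂ | Particulate Emissions
2 | London | 1896 | 1533 | 154 | 0.04
2 | North West | 1896 | 1423 | 146 | 0.029
2 | North West | 1896 | 1353 | 138 | 0.025
2 | South West | 1998 | 1547 | 159 | 0.026
2 | London | 1896 | 1388 | 138 | 0.025
2 | South West | 1896 | 1214 | 130 | 0.011
2 | South West | 1896 | 1480 | 146 | 0.029
2 | South West | 1896 | 1413 | 146 | 0.024
2 | South West | 2496 | 1695 | 192 | 0.034
2 | South West | 1422 | 1251 | 122 | 0.025
2 | South West | 1995 | 2075 | 175 | 0.034
2 | London | 1896 | 1285 | 140 | 0.036
2 | North West | 1896 | 0 | 146 |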
    1. Calculate the mean and standard deviation of CO₂ emissions in the table. [2 marks]
    2. Any value more than 2 standard deviations from the mean can be identified as an outlier. Determine, using this definition of an outlier, if there are any outliers in this sample of CO₂ emissions. Fully justify your answer. [2 marks]
  1. Maria claims that the last line in the table must contain two errors. Use your knowledge of the Large Data Set to comment on Maria's claim. [2 marks]
AQA Paper 3 2022 June Q18
11 marks Moderate -0.8
In a particular year, the height of a male athlete at the Summer Olympics has a mean 1.78 metres and standard deviation 0.23 metres. The heights of 95% of male athletes are between 1.33 metres and 2.22 metres.
  1. Comment on whether a normal distribution may be suitable to model the height of a male athlete at the Summer Olympics in this particular year. [3 marks]
  2. You may assume that the height of a male athlete at the Summer Olympics may be modelled by a normal distribution with mean 1.78 metres and standard deviation 0.23 metres.
    1. Find the probability that the height of a randomly selected male athlete is 1.82 metres. [1 mark]
    2. Find the probability that the height of a randomly selected male athlete is between 1.70 metres and 1.90 metres. [1 mark]
    3. Two male athletes are chosen at random. Calculate the probability that both of their heights are between 1.70 metres and 1.90 metres. [1 mark]
  3. The summarised data for the heights, \(h\) metres, of a random sample of 40 male athletes at the Winter Olympics is given below. $$\sum h = 69.2 \quad\quad \sum (h - \bar{h})^2 = 2.81$$ Use this data to calculate estimates of the mean and standard deviation of the heights of male athletes at the Winter Olympics. [3 marks]
  4. Using your answers from part (c), compare the heights of male athletes at the Summer Olympics and male athletes at the Winter Olympics. [2 marks]
AQA Paper 3 2024 June Q14
5 marks Moderate -0.8
The annual cost of energy in 2021 for each of the 350 households in Village A can be modelled by a random variable \(X\). It is given that $$\sum x = 945\,000 \quad \sum x^2 = 2\,607\,500\,000$$
  1. Calculate the mean of \(X\). [1 mark]
  2. Calculate the standard deviation of \(X\). [2 marks]
  3. For households in Village B the annual cost of energy in 2021 has mean £3100 and standard deviation £325. Compare the annual cost of energy in 2021 for households in Village A and Village B. [2 marks]
OCR PURE Q10
9 marks Easy -1.2
The masses of a random sample of 120 boulders in a certain area were recorded. The results are summarized in the histogram. \includegraphics{figure_5}
  1. Calculate the number of boulders with masses between 60 and 65 kg. [2]
    1. Use midpoints to find estimates of the mean and standard deviation of the masses of the boulders in the sample. [3]
    2. Explain why your answers are only estimates. [1]
  2. Use your answers to part (b)(i) to determine an estimate of the number of outliers, if any, in the distribution. [2]
  3. Give one advantage of using a histogram rather than a pie chart in this context. [1]
OCR MEI AS Paper 2 2018 June Q7
8 marks Moderate -0.8
Rose and Emma each wear a device that records the number of steps they take in a day. All the results for a 7-day period are given in Fig. 7.
Day | 1 | 2 | 3 | 4 | 5 | 6 | 7
Rose | 10014 | 11262 | 10149 | 9361 | 9708 | 9921 | 10369
Emma | 9204 | 9913 | 8741 | 10015 | 10261 | 7391 | 10856
Fig. 7
The 7-day mean is the mean number of steps taken in the last 7 days. The 7-day mean for Rose is 10112.
  1. Calculate the 7-day mean for Emma. [1]
At the end of day 8 a new 7-day mean is calculated by including the number of steps taken on day 8 and omitting the number of steps taken on day 1. On day 8 Rose takes 10259 steps.
  1. Determine the number of steps Emma must take on day 8 so that her 7-day mean at the end of day 8 is the same as for Rose. [4]
In fact, over a long period of time, the mean of the number of steps per day that Emma takes is 10341 and the standard deviation is 948.
  1. Determine whether the number of steps Emma needs to take on day 8 so that her 7-day mean is the same as that for Rose in part (ii) is unusually high. [3]
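The rolling 7-day mean can be sketched as follows. The daily counts are read from Fig. 7 on the assumption that each row splits into seven values (Rose's values reproduce her stated mean of 10112), and "unusually high" is judged here by a standardised score, which is one possible approach rather than the required one.

```python
rose = [10014, 11262, 10149, 9361, 9708, 9921, 10369]
emma = [9204, 9913, 8741, 10015, 10261, 7391, 10856]

emma_mean = sum(emma) / 7                      # 7-day mean for days 1 to 7

# Rose's window after day 8: drop day 1, append the day-8 count.
rose_day8 = 10259
rose_new_total = sum(rose[1:]) + rose_day8

# Emma's day-8 count must bring her days 2 to 8 total up to Rose's.
emma_day8 = rose_new_total - sum(emma[1:])

# Standardise against Emma's long-run mean and standard deviation.
z = (emma_day8 - 10341) / 948
print(emma_mean, emma_day8, round(z, 2))
```

A standardised score well above 2 would suggest that the required day-8 count is unusually high for Emma.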
OCR MEI Paper 2 2022 June Q9
9 marks Moderate -0.3
At the beginning of the academic year, all the pupils in year 12 at a college take part in an assessment. Summary statistics for the marks obtained by the 2021 cohort are given below. \(n = 205 \quad \sum x = 23042 \quad \sum x^2 = 2591716\) Marks may only be whole numbers, but the Head of Mathematics believes that the distribution of marks may be modelled by a Normal distribution.
  1. Calculate
    [2]
  2. Use your answers to part (a) to write down a possible Normal model for the distribution of marks. [2]
One candidate in the cohort scored less than 105.
  1. Determine whether the model found in part (b) is consistent with this information. [3]
  2. Use the model to calculate an estimate of the number of candidates who scored 115 marks. [2]
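A possible Normal model can be sketched from the summary statistics with Python's `statistics.NormalDist`. The divisor \(n - 1\) for the variance and the continuity corrections (reading "scored 115" as \(114.5 < X < 115.5\) and "less than 105" as \(X < 104.5\)) are assumptions made for this illustration.

```python
from math import sqrt
from statistics import NormalDist

n, sum_x, sum_x2 = 205, 23042, 2591716

mean = sum_x / n
# Sample variance with divisor n - 1 (assumed; divisor n gives a slightly smaller value).
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
model = NormalDist(mu=mean, sigma=sqrt(variance))

# Marks are whole numbers, so "scored 115" is read as 114.5 < X < 115.5.
p_115 = model.cdf(115.5) - model.cdf(114.5)
expected_115 = n * p_115

# Expected number of candidates below 105, read as X < 104.5.
expected_below_105 = n * model.cdf(104.5)

print(round(mean, 1), round(variance, 1))
print(expected_115, expected_below_105)
```

Comparing `expected_below_105` with the one candidate actually observed gives a basis for commenting on whether the model is consistent with the data.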
OCR MEI Paper 2 Specimen Q8
3 marks Standard +0.8
Alison selects 10 of her male friends. For each one she measures the distance between his eyes. The distances, measured in mm, are as follows: 51 57 58 59 61 64 64 65 67 68 The mean of these data is 61.4. The sample standard deviation is 5.232, correct to 3 decimal places. One of the friends decides he does not want his measurement to be used. Alison replaces his measurement with the measurement from another male friend. This increases the mean to 62.0 and reduces the standard deviation. Give a possible value for the measurement which has been removed and find the measurement which has replaced it. [3]
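One way to explore this question is a brute-force search: the mean rises by 0.6 over 10 values, so the replacement must be 6 mm larger than the value removed, and each candidate can be tested for a reduced sample standard deviation. This is a sketch of a check, not a model answer.

```python
from statistics import mean, stdev

data = [51, 57, 58, 59, 61, 64, 64, 65, 67, 68]
n = len(data)

# Raising the mean from 61.4 to 62.0 adds 0.6 * 10 = 6 to the total,
# so the replacement must be 6 mm larger than the value removed.
shift = round((62.0 - mean(data)) * n)

candidates = []
for removed in sorted(set(data)):
    new_data = data.copy()
    new_data.remove(removed)            # drop one occurrence of the value
    new_data.append(removed + shift)
    if stdev(new_data) < stdev(data):   # sample SD (divisor n - 1), as in the question
        candidates.append((removed, removed + shift))

print(candidates)
```

Each pair lists a possible removed value followed by its replacement; only values below the mean can reduce the spread when moved 6 mm upwards.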
OCR MEI Paper 2 Specimen Q15
15 marks Standard +0.3
A quality control department checks the lifetimes of batteries produced by a company. The lifetimes, \(x\) minutes, for a random sample of 80 'Superstrength' batteries are shown in the table below.
Lifetime | \(160 \leq x < 165\) | \(165 \leq x < 168\) | \(168 \leq x < 170\) | \(170 \leq x < 172\) | \(172 \leq x < 175\) | \(175 \leq x < 180\)
Frequency | 5 | 14 | 20 | 21 | 16 | 4
  1. Estimate the proportion of these batteries which have a lifetime of at least 174.0 minutes. [2]
  2. Use the data in the table to estimate the mean and the standard deviation of the lifetimes. [3]
The data in the table on the previous page are represented in the following histogram, Fig 15. \includegraphics{figure_15} A quality control manager models the data by a Normal distribution with the mean and standard deviation you calculated in part (b).
  1. Comment briefly on whether the histogram supports this choice of model. [2]
    1. Use this model to estimate the probability that a randomly selected battery will have a lifetime of more than 174.0 minutes.
    2. Compare your answer with your answer to part (a). [3]
The company also manufactures 'Ultrapower' batteries, which are stated to have a mean lifetime of 210 minutes.
  1. A random sample of 8 Ultrapower batteries is selected. The mean lifetime of these batteries is 207.3 minutes. Carry out a hypothesis test at the 5% level to investigate whether the mean lifetime is as high as stated. You should use the following hypotheses \(\text{H}_0 : \mu = 210\), \(\text{H}_1 : \mu < 210\), where \(\mu\) represents the population mean for Ultrapower batteries. You should assume that the population is Normally distributed with standard deviation 3.4. [5]
OCR MEI Paper 2 Specimen Q16
20 marks Easy -1.8
Fig. 16.1, Fig. 16.2 and Fig. 16.3 show some data about life expectancy, including some from the pre-release data set. \includegraphics{figure_16_1} \includegraphics{figure_16_2} \includegraphics{figure_16_3}
  1. Comment on the shapes of the distributions of life expectancy at birth in 2014 and 1974. [2]
    1. The minimum value shown in the box plot is negative. What does a negative value indicate? [1]
    2. What feature of Fig 16.3 suggests that a Normal distribution would not be an appropriate model for increase in life expectancy from one year to another year? [1]
    3. Software has been used to obtain the values in the table in Fig. 16.3. Decide whether the level of accuracy is appropriate. Justify your answer. [1]
    4. John claims that for half the people in the world their life expectancy has improved by 10 years or more. Explain why Fig. 16.3 does not provide conclusive evidence for John's claim. [1]
  2. Decide whether the maximum increase in life expectancy from 1974 to 2014 is an outlier. Justify your answer. [3]
Here is some further information from the pre-release data set.
Country | Life expectancy at birth in 2014
Ethiopia | 60.8
Sweden | 81.9
    1. Estimate the change in life expectancy at birth for Ethiopia between 1974 and 2014.
    2. Estimate the change in life expectancy at birth for Sweden between 1974 and 2014.
    3. Give one possible reason why the answers to parts (i) and (ii) are so different. [4]
Fig. 16.4 shows the relationship between life expectancy at birth in 2014 and 1974. \includegraphics{figure_16_4} A spreadsheet gives the following linear model for all the data in Fig 16.4. (Life expectancy at birth 2014) = 30.98 + 0.67 × (Life expectancy at birth 1974) The life expectancy at birth in 1974 for the region that now constitutes the country of South Sudan was 37.4 years. The value for this country in 2014 is not available.
    1. Use the linear model to estimate the life expectancy at birth in 2014 for South Sudan. [2]
    2. Give two reasons why your answer to part (i) is not likely to be an accurate estimate for the life expectancy at birth in 2014 for South Sudan. You should refer to both information from Fig 16.4 and your knowledge of the large data set. [2]
  1. In how many of the countries represented in Fig. 16.4 did life expectancy drop between 1974 and 2014? Justify your answer. [3]
WJEC Unit 2 2018 June Q06
10 marks Moderate -0.8
Basel is a keen learner of languages. He finds a website on which a large number of language tutors offer their services. Basel records the cost, in dollars, of a one hour lesson from a random sample of tutors. He puts the data into a computer program which gives the following summary statistics.
Cost per 1 hour lesson
Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max.
10.0 | 16.0 | 17.2 | 19.8 | 21.0 | 40.0
  1. Showing all calculations, comment on any outliers for the cost of a one hour lesson with a language tutor. [4]
  2. Describe the skewness of the data and explain what it means in this context. [2]
Dafydd is also a keen learner of languages. He takes his own random sample of the cost, in dollars, for a one hour lesson. He produces the following box plot. \includegraphics{figure_6}
    1. What will happen to the mean if the outlier is removed?
    2. What will happen to the median if the outlier is removed? [2]
  1. Compare and contrast the distributions of the cost of one hour language lessons for Dafydd's sample and Basel's sample. [2]
WJEC Unit 2 2024 June Q5
8 marks Easy -1.2
In March 2020, the coronavirus pandemic caused major disruption to the lives of individuals across the world. A newspaper published the following graph from the gov.uk website, along with an article which included the following excerpt. "The daily number of vaccines administered continues to fall. In order to get control of the virus, we need the number of people receiving a second dose of the vaccine to keep rocketing. The fear is it will start to drop off soon, which will leave many people still unprotected." \includegraphics{figure_5}
  1. By referring to the graph, explain how the quote could be misleading. [1]
The daily numbers of second dose vaccines, in thousands, over the period April 1st 2021 to May 31st 2021 are shown in the table below.
Daily number of 2nd dose vaccines (1000s) | Midpoint \(x\) | Frequency \(f\) | Percentage
\(0 \leqslant v < 100\) | 50 | 2 | 3·3
\(100 \leqslant v < 200\) | 150 | 8 | 13·1
\(200 \leqslant v < 300\) | 250 | 10 | 16·4
\(300 \leqslant v < 400\) | 350 | 13 | 21·3
\(400 \leqslant v < 500\) | 450 | 26 | 42·6
\(500 \leqslant v < 600\) | 550 | 2 | 3·3
Total | | 61 | 100
    1. Calculate estimates of the mean and standard deviation for the daily number of second dose vaccines given over this period. You may use \(\sum x^2 f = 8272500\). [4]
    2. Comment on the skewness of these data. [1]
  1. Give a possible reason for the pattern observed in this graph. [1]
  2. State, with a reason, whether or not you think the data for April 15th to April 18th are incorrect. [1]
SPS SPS FM Statistics 2021 June Q1
4 marks Moderate -0.8
Employees at a company were asked how long their average commute to work was. The table below gives information about their answers.
Time taken (\(t\) minutes) | Number of people
\(0 < t \leq 10\) | \(x\)
\(10 < t \leq 20\) | 30
\(20 < t \leq 30\) | 35
\(30 < t \leq 50\) | 28
\(50 < t \leq 90\) | 12
The company estimates that the mean time for employees commuting to work is 28 minutes. Work out the value of \(x\), showing your working clearly. [4]
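The unknown frequency can be found by setting the midpoint estimate of the mean equal to 28 and rearranging. The sketch below assumes class midpoints of 5, 15, 25, 40 and 70 minutes; the variable names are illustrative.

```python
from fractions import Fraction

# (midpoint, frequency) for the classes whose frequency is known.
known = [(15, 30), (25, 35), (40, 28), (70, 12)]
unknown_mid = 5          # midpoint of 0 < t <= 10
target_mean = 28

sum_f = sum(f for _, f in known)
sum_fx = sum(m * f for m, f in known)

# (unknown_mid * x + sum_fx) / (x + sum_f) = target_mean, rearranged for x.
x = Fraction(sum_fx - target_mean * sum_f, target_mean - unknown_mid)
print(x)
```

Using `Fraction` keeps the arithmetic exact, so a non-integer result would show up immediately as a sign of an arithmetic slip.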
SPS SPS FM Statistics 2021 June Q5
7 marks Moderate -0.3
Eleven students in a class sit a Mathematics exam and their average score is 67% with a standard deviation of 12%. One student from the class is absent and sits the paper later, achieving a score of 85%.
  1. Find the mean score for the whole class and the standard deviation for the whole class. [5]
  2. Comment, with justification, on whether the score for the paper sat later should be considered as an outlier. [2]
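Adding one score to a group with known mean and standard deviation can be sketched by first recovering \(\sum x\) and \(\sum x^2\). This assumes the quoted standard deviation uses divisor \(n\); the function name `add_observation` is an assumption for illustration.

```python
from math import sqrt

def add_observation(n, mean, sd, new_value):
    """Update a mean and population SD (divisor n) when one value is added."""
    sum_x = n * mean
    sum_x2 = n * (sd ** 2 + mean ** 2)      # from sd^2 = sum_x2 / n - mean^2
    n += 1
    sum_x += new_value
    sum_x2 += new_value ** 2
    new_mean = sum_x / n
    new_sd = sqrt(sum_x2 / n - new_mean ** 2)
    return new_mean, new_sd

new_mean, new_sd = add_observation(11, 67, 12, 85)
print(round(new_mean, 1), round(new_sd, 2))
```

The same updated values can then be used to judge how many standard deviations the late score of 85% lies from the mean.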
SPS SPS SM 2021 February Q4
10 marks Easy -1.3
Each member of a group of 27 people was timed when completing a puzzle. The time taken, \(x\) minutes, for each member of the group was recorded. These times are summarised in the following box and whisker plot. \includegraphics{figure_4}
  1. Find the range of the times. [1]
  2. Find the interquartile range of the times. [1]
  3. For these 27 people, \(\sum x = 607.5\) and \(\sum x^2 = 17623.25\). Calculate the mean time taken to complete the puzzle. [1]
  4. Calculate the standard deviation of the times taken to complete the puzzle. [2]
  5. Taruni defines an outlier as a value more than 3 standard deviations above the mean. State how many outliers Taruni would say there are in these data, giving a reason for your answer. [1]
  6. Adam and Beth also completed the puzzle in \(a\) minutes and \(b\) minutes respectively, where \(a > b\). When their times are included with the data of the other 27 people
    Suggest a possible value for \(a\) and a possible value for \(b\), explaining how your values satisfy the above conditions. [3]
  7. Without carrying out any further calculations, explain why the standard deviation of all 29 times will be lower than your answer to part (d). [1]
SPS SPS SM Statistics 2024 January Q1
4 marks Easy -1.8
At the beginning of the academic year, all the pupils in year 12 at a college take part in an assessment. Summary statistics for the marks obtained by the 2021 cohort are given below. \(n = 205\) \(\sum x = 23042\) \(\sum x^2 = 2591716\) Marks may only be whole numbers, but the Head of Mathematics believes that the distribution of marks may be modelled by a Normal distribution.
  1. Calculate
    [2]
  2. Use your answers to part (a) to write down a possible Normal model for the distribution of marks. [2]