2.02g Calculate mean and standard deviation

382 questions

OCR MEI S1 Q2
8 marks Moderate -0.8
2 Dwayne is a car salesman. The numbers of cars, \(x\), sold by Dwayne each month during the year 2008 are summarised by $$n = 12 , \quad \Sigma x = 126 , \quad \Sigma x ^ { 2 } = 1582 .$$
  1. Calculate the mean and standard deviation of the monthly numbers of cars sold.
  2. Dwayne earns \(\pounds 500\) each month plus \(\pounds 100\) commission for each car sold. Show that the mean of Dwayne's monthly earnings is \(\pounds 1550\). Find the standard deviation of Dwayne's monthly earnings.
  3. Marlene is a car saleswoman and is paid in the same way as Dwayne. During 2008 her monthly earnings have mean \(\pounds 1625\) and standard deviation \(\pounds 280\). Briefly compare the monthly numbers of cars sold by Marlene and Dwayne during 2008.
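Worked sketch (not an official solution): the summary values \(n\), \(\Sigma x\) and \(\Sigma x^2\) are enough to get the mean and standard deviation, and the earnings formula \(500 + 100x\) is a linear change of variable. The helper name `mean_sd_from_sums` is hypothetical, and the divisor is an assumption: some specifications divide by \(n - 1\) and others by \(n\), so the function supports both.

```python
import math

def mean_sd_from_sums(n, sum_x, sum_x2, sample=True):
    """Return (mean, sd) from n, Σx and Σx².

    sample=True divides Sxx by n - 1; sample=False divides by n.
    Which divisor is expected is an assumption to check against the mark scheme.
    """
    mean = sum_x / n
    sxx = sum_x2 - sum_x ** 2 / n  # Sxx = Σx² - (Σx)²/n
    divisor = n - 1 if sample else n
    return mean, math.sqrt(sxx / divisor)

mean_cars, sd_cars = mean_sd_from_sums(12, 126, 1582)

# Earnings y = 500 + 100x: the mean is shifted and scaled, the sd is only scaled.
mean_pay = 500 + 100 * mean_cars
sd_pay = 100 * sd_cars

print(mean_cars, round(sd_cars, 2))
print(mean_pay, round(sd_pay, 1))
```

The same scaling idea answers part 3 in reverse: dividing Marlene's earnings figures by 100 (after removing the 500 from the mean) gives her mean and standard deviation in cars.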
OCR MEI S1 Q6
6 marks Moderate -0.8
6 In a survey, a sample of 44 fields is selected. Their areas ( \(x\) hectares) are summarised in the grouped frequency table.
| Area \(( x )\) | \(0 < x \leqslant 3\) | \(3 < x \leqslant 5\) | \(5 < x \leqslant 7\) | \(7 < x \leqslant 10\) | \(10 < x \leqslant 20\) |
| --- | --- | --- | --- | --- | --- |
| Frequency | 3 | 8 | 13 | 14 | 6 |
  1. Calculate an estimate of the sample mean and the sample standard deviation.
  2. Determine whether there could be any outliers at the upper end of the distribution.
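Worked sketch (not an official solution): for grouped data each class is represented by its midpoint, which is why the results are only estimates. The \(n - 1\) divisor and the "more than 2 standard deviations from the mean" outlier rule are assumptions about the convention this specification expects.

```python
import math

# Class boundaries and frequencies from the table (areas in hectares).
classes = [(0, 3), (3, 5), (5, 7), (7, 10), (10, 20)]
freqs = [3, 8, 13, 14, 6]

mids = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, mids))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, mids))

mean = sum_fx / n
sd = math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))  # assumed n - 1 divisor

# Assumed outlier rule: more than 2 standard deviations above the mean.
upper_limit = mean + 2 * sd
print(round(mean, 2), round(sd, 2), round(upper_limit, 2))
```

Because the top class runs up to 20 hectares, comparing `upper_limit` with 20 shows whether fields in that class could lie beyond the limit.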
OCR MEI S1 Q2
8 marks Moderate -0.8
2 The marks \(x\) scored by a sample of 56 students in an examination are summarised by $$n = 56 , \quad \Sigma x = 3026 , \quad \Sigma x ^ { 2 } = 178890 .$$
  1. Calculate the mean and standard deviation of the marks.
  2. The highest mark scored by any of the 56 students in the examination was 93. Show that this result may be considered to be an outlier.
  3. The formula \(y = 1.2 x - 10\) is used to scale the marks. Find the mean and standard deviation of the scaled marks.
OCR MEI S1 Q1
8 marks Easy -1.2
1 The stem and leaf diagram illustrates the heights in metres of 25 young oak trees.
3 | 4 6 7 8 9 9
4 | 0 2 2 3 4 6 8 9
5 | 0 1 3 5 8
6 | 2 4 5
7 | 4 6
8 | 1
Key: 4 | 2 represents 4.2
  1. State the type of skewness of the distribution.
  2. Use your calculator to find the mean and standard deviation of these data.
  3. Determine whether there are any outliers.
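Worked sketch (not an official solution): with the 25 heights read off the stem and leaf diagram, Python's standard `statistics` module plays the role of the calculator. The outlier rule used here (more than 2 standard deviations from the mean) is an assumption about the expected convention.

```python
import statistics

# Heights in metres read from the stem and leaf diagram (key: 4 | 2 = 4.2).
heights = [
    3.4, 3.6, 3.7, 3.8, 3.9, 3.9,
    4.0, 4.2, 4.2, 4.3, 4.4, 4.6, 4.8, 4.9,
    5.0, 5.1, 5.3, 5.5, 5.8,
    6.2, 6.4, 6.5,
    7.4, 7.6,
    8.1,
]

mean = statistics.mean(heights)
sd = statistics.stdev(heights)  # sample sd (divisor n - 1)

# Assumed rule: flag values further than 2 sd from the mean.
outliers = [h for h in heights if abs(h - mean) > 2 * sd]
print(round(mean, 3), round(sd, 3), outliers)
```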
OCR MEI S1 Q6
7 marks Easy -1.2
6 A supermarket chain buys a batch of 10000 scratchcard draw tickets for sale in its stores. 50 of these tickets have a \(\pounds 10\) prize, 20 of them have a \(\pounds 100\) prize, one of them has a \(\pounds 5000\) prize and all of the rest have no prize. This information is summarised in the frequency table below.
| Prize money | \(\pounds 0\) | \(\pounds 10\) | \(\pounds 100\) | \(\pounds 5000\) |
| --- | --- | --- | --- | --- |
| Frequency | 9929 | 50 | 20 | 1 |
  1. Find the mean and standard deviation of the prize money per ticket.
  2. I buy two of these tickets at random. Find the probability that I win either two \(\pounds 10\) prizes or two \(\pounds 100\) prizes.
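Worked sketch (not an official solution): the frequency table is a complete description of the batch, so the mean and standard deviation come from \(\Sigma f x\) and \(\Sigma f x^2\). Dividing by \(n\) rather than \(n - 1\) is an assumption (with \(n = 10000\) the difference is negligible). The helper `both` is a hypothetical name for the "two tickets without replacement" probability.

```python
import math
from fractions import Fraction

prizes = [0, 10, 100, 5000]
freqs = [9929, 50, 20, 1]
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, prizes)) / n
sum_fx2 = sum(f * x * x for f, x in zip(freqs, prizes))
# The batch is treated as the whole population, so this sketch divides by n;
# a mark scheme might instead expect the n - 1 divisor.
sd = math.sqrt(sum_fx2 / n - mean ** 2)

def both(count):
    """Probability that two tickets drawn without replacement both come from `count` tickets."""
    return Fraction(count, n) * Fraction(count - 1, n - 1)

p_two_same = both(50) + both(20)
print(mean, round(sd, 2), float(p_two_same))
```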
OCR MEI S1 Q5
20 marks Moderate -0.3
5 A pear grower collects a random sample of 120 pears from his orchard. The histogram below shows the lengths, in mm , of these pears. \includegraphics[max width=\textwidth, alt={}, center]{056d3e9a-088d-4c97-9546-7cecb59b8727-3_815_1628_505_304}
  1. Calculate the number of pears which are between 90 and 100 mm long.
  2. Calculate an estimate of the mean length of the pears. Explain why your answer is only an estimate.
  3. Calculate an estimate of the standard deviation.
  4. Use your answers to parts (ii) and (iii) to investigate whether there are any outliers.
  5. Name the type of skewness of the distribution.
  6. Illustrate the data using a cumulative frequency diagram.
OCR MEI S1 Q5
22 marks Moderate -0.3
5 A pear grower collects a random sample of 120 pears from his orchard. The histogram below shows the lengths, in mm , of these pears. \includegraphics[max width=\textwidth, alt={}, center]{99c502aa-2c9f-461d-9dc0-ed55e3df32a2-3_815_1628_505_304}
  1. Calculate the number of pears which are between 90 and 100 mm long.
  2. Calculate an estimate of the mean length of the pears. Explain why your answer is only an estimate.
  3. Calculate an estimate of the standard deviation.
  4. Use your answers to parts (ii) and (iii) to investigate whether there are any outliers.
  5. Name the type of skewness of the distribution.
  6. Illustrate the data using a cumulative frequency diagram.
Edexcel S1 2014 January Q2
8 marks Moderate -0.8
2. A rugby club coach uses club records to take a random sample of 15 players from 1990 and an independent random sample of 15 players from 2010. The body weight of each player was recorded to the nearest kg and the results from 2010 are summarised in the table below.
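| Body weight (kg) | 75-79 | 80-84 | 85-89 | 90-94 | 95-99 | 100-104 | 105-109 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of Players (2010) | 1 | 2 | 2 | 4 | 3 | 2 | 1 |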
  1. Find the estimated values in kg of the summary statistics \(a\), \(b\) and \(c\) in the table below.
    | | Estimate in 1990 | Estimate in 2010 |
    | --- | --- | --- |
    | Mean | 83.0 | \(a\) |
    | Median | 82.0 | \(b\) |
    | Variance | 44.0 | \(c\) |
    Give your answers to 3 significant figures. The rugby coach claims that players' body weight increased between 1990 and 2010.
  2. Using the table in part (a), comment on the rugby coach's claim.
Edexcel S1 2014 January Q8
10 marks Moderate -0.8
8. A manager records the number of hours of overtime claimed by 40 staff in a month. The histogram in Figure 1 represents the results. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{a839a89a-17f0-473b-ac10-bcec3dbe97f7-26_1107_1513_406_210} \captionsetup{labelformat=empty} \caption{Figure 1}
\end{figure}
  1. Calculate the number of staff who have claimed less than 10 hours of overtime in the month.
  2. Estimate the median number of hours of overtime claimed by these 40 staff in the month.
  3. Estimate the mean number of hours of overtime claimed by these 40 staff in the month. The manager wants to compare these data with overtime data he collected earlier to find out if the overtime claimed by staff has decreased.
  4. State, giving a reason, whether the manager should use the median or the mean to compare the overtime claimed by staff.
Edexcel S1 2015 January Q2
9 marks Moderate -0.8
2. A sports teacher recorded the number of press-ups done by his students in two minutes. He recorded this information for a Year 7 class and for a Year 11 class.
The back-to-back stem and leaf diagram shows this information.
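| Totals | Year 7 class | | Year 11 class | Totals |
| --- | --- | --- | --- | --- |
| (6) | 8 7 6 5 5 4 | 1 | | |
| (10) | 9 7 7 6 5 4 4 4 2 2 | 2 | 0 5 6 9 | (4) |
| (7) | 8 7 5 4 3 3 0 | 3 | 3 4 5 8 8 | (5) |
| (5) | 9 9 7 2 2 | 4 | 0 5 6 7 9 | (5) |
| (3) | 8 4 0 | 5 | 0 3 5 5 6 6 7 7 7 9 9 | (11) |
| | | 6 | 0 3 3 3 3 4 8 | (7) |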
Key: \(2 | 4 | 0\) means 42 press-ups for a Year 7 student and 40 press-ups for a Year 11 student
  1. Find the median number of press-ups for each class. For the Year 11 class, the lower quartile is 38 and the upper quartile is 59
  2. Find the lower quartile and the upper quartile for the Year 7 class.
  3. Use the medians and quartiles to describe the skewness of each of the two distributions.
  4. Give two reasons why the normal distribution should not be used to model the number of press-ups done by the Year 11 class.
Edexcel S1 2016 January Q6
9 marks Moderate -0.8
6. Yujie is investigating the weights of 10 young rabbits. She records the weight, \(x\) grams, of each rabbit and the results are summarised below. $$\sum x = 8360 \quad \text { and } \quad \sum ( x - \bar { x } ) ^ { 2 } = 63840$$
  1. Calculate the mean and the standard deviation of the weights of these rabbits. Given that the median weight of these rabbits is 815 grams,
  2. describe, giving a reason, the skewness of these data. Two more rabbits weighing 776 grams and 896 grams are added to make a group of 12 rabbits.
  3. State, giving a reason, how the inclusion of these two rabbits would affect the mean.
  4. By considering the change in \(\sum ( x - \bar { x } ) ^ { 2 }\), state what effect the inclusion of these two rabbits would have on the standard deviation.
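Worked sketch (not an official solution): the question gives \(\sum (x - \bar{x})^2\) directly, so the standard deviation needs no further sums. Dividing by \(n\) is an assumption about the expected convention. The second half recomputes everything with the two extra rabbits to confirm the reasoning asked for in parts 3 and 4.

```python
import math

n, sum_x, sxx = 10, 8360, 63840  # sxx = Σ(x - x̄)²

mean = sum_x / n
sd = math.sqrt(sxx / n)  # divisor n assumed

# Add the two new rabbits and recompute.
new = [776, 896]
n2 = n + len(new)
mean2 = (sum_x + sum(new)) / n2

# The new weights average to the old mean, so the mean is unchanged and
# Σ(x - x̄)² simply gains the two new squared deviations.
sxx2 = sxx + sum((w - mean) ** 2 for w in new)
sd2 = math.sqrt(sxx2 / n2)
print(mean, round(sd, 1), mean2, round(sd2, 1))
```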
Edexcel S1 2018 January Q1
12 marks Moderate -0.8
1. Two classes of students, class \(A\) and class \(B\), sat a test.
Class \(A\) has 10 students. Class \(B\) has 15 students. Each student achieved a score, \(x\), on the test and their scores are summarised in the table below.
| | \(n\) | \(\sum x\) | \(\sum x ^ { 2 }\) |
| --- | --- | --- | --- |
| Class \(A\) | 10 | 770 | 59610 |
| Class \(B\) | 15 | \(t\) | 58035 |
The mean score for Class \(A\) is 77 and the mean score for Class \(B\) is 61
  1. Find the value of \(t\)
  2. Calculate the variance of the test scores for each class. The highest score on the test was 95 and the lowest score was 45 These were each scored by students from the same class.
  3. State, with a reason, which class you believe they were from. The two classes are combined into one group of 25 students.
    1. Find the mean test score for all 25 students.
    2. Find the variance of the test scores for all 25 students. The teacher of class \(A\) later realises that he added up the test scores for his class incorrectly. Each student's test score in class \(A\) should be increased by 3
  4. Without further calculations, state, with a reason, the effect this will have on
    1. the variance of the test scores for class \(A\)
    2. the mean test score for all 25 students
    3. the variance of the test scores for all 25 students.
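Worked sketch (not an official solution): combining two groups works by adding their counts, sums and sums of squares, never by averaging the two variances. The function name `variance` is hypothetical and the divisor \(n\) is an assumption about the expected convention.

```python
def variance(n, sum_x, sum_x2):
    """Variance with divisor n: Σx²/n - (Σx/n)²."""
    return sum_x2 / n - (sum_x / n) ** 2

n_a, sum_a, sum_a2 = 10, 770, 59610
n_b, sum_b2 = 15, 58035
t = n_b * 61  # Σx for class B, from its mean of 61

var_a = variance(n_a, sum_a, sum_a2)
var_b = variance(n_b, t, sum_b2)

# Combining the classes: add the counts, the sums and the sums of squares.
n_all = n_a + n_b
mean_all = (sum_a + t) / n_all
var_all = variance(n_all, sum_a + t, sum_a2 + sum_b2)
print(t, var_a, var_b, mean_all, round(var_all, 2))
```

The larger variance for class \(B\) is the numerical basis for the kind of reasoning part 3 asks for.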
Edexcel S1 2019 January Q4
13 marks Moderate -0.8
4. A group of 100 adults recorded the amount of time, \(t\) minutes, they spent exercising each day. Their results are summarised in the table below.
| Time (\(t\) minutes) | Frequency (f) | Time midpoint (\(x\)) |
| --- | --- | --- |
| \(0 \leqslant t < 15\) | 25 | 7.5 |
| \(15 \leqslant t < 30\) | 17 | 22.5 |
| \(30 \leqslant t < 60\) | 28 | 45 |
| \(60 \leqslant t < 120\) | 24 | 90 |
| \(120 \leqslant t \leqslant 240\) | 6 | 180 |

[You may use \(\sum \mathrm { f } x ^ { 2 } = 455\,512.5\)]
A histogram is drawn to represent these data.
The bar representing the time \(0 \leqslant t < 15\) has width 0.5 cm and height 6 cm .
  1. Calculate the width and height of the bar representing a time of \(60 \leqslant t < 120\)
  2. Use linear interpolation to estimate the median time spent exercising by these adults each day.
  3. Find an estimate of the mean time spent exercising by these adults each day.
  4. Calculate an estimate for the standard deviation of these times.
  5. Describe, giving a reason, the skewness of these data. Further analysis of the above data revealed that 18 of the 25 adults in the \(0 \leqslant t < 15\) group took no exercise each day.
  6. State, giving a reason, what effect, if any, this new information would have on your answers to
    1. the estimate of the median in part (b),
    2. the estimate of the mean in part (c),
    3. the estimate of the standard deviation in part (d).
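Worked sketch (not an official solution): linear interpolation assumes the values are spread evenly within the class that contains the median. The helper `interpolate` is a hypothetical name, and dividing by \(n\) for the standard deviation is an assumption about the expected convention.

```python
import math

bounds = [0, 15, 30, 60, 120, 240]   # class boundaries (minutes)
freqs = [25, 17, 28, 24, 6]
mids = [7.5, 22.5, 45, 90, 180]
sum_fx2 = 455512.5                    # given in the question

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, mids)) / n
sd = math.sqrt(sum_fx2 / n - mean ** 2)  # divisor n assumed

def interpolate(target):
    """Linear interpolation for the value at cumulative frequency `target`."""
    cum = 0
    for lo, hi, f in zip(bounds, bounds[1:], freqs):
        if cum + f >= target:
            return lo + (target - cum) / f * (hi - lo)
        cum += f
    raise ValueError("target exceeds total frequency")

median = interpolate(n / 2)
print(round(median, 1), mean, round(sd, 1))
```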
Edexcel S1 2019 January Q6
18 marks Moderate -0.3
6. Following some school examinations, Chetna is studying the results of the 16 students in her class. The mark for paper 1, \(x\), and the mark for paper 2, \(y\), for each student are summarised in the following statistics.
$$\bar { x } = 35.75 \quad \bar { y } = 25.75 \quad \sigma _ { x } = 7.79 \quad \sigma _ { y } = 11.91 \quad \sum x y = 15837$$
  1. Comment on the differences between the marks of the students on paper 1 and paper 2 Chetna decides to examine these data in more detail and plots the marks for each of the 16 students on the scatter diagram opposite.
    1. Explain why the circled point \(( 38,0 )\) is possibly an outlier.
    2. Suggest a possible reason for this result. Chetna decides to omit the data point \(( 38,0 )\) and examine the other 15 students' marks.
  2. Find the value of \(\bar { x }\) and the value of \(\bar { y }\) for these 15 students. For these 15 students
    1. explain why \(\sum x y\) is still 15837
    2. show that \(\mathrm { S } _ { x y } = 1169.8\) For these 15 students, Chetna calculates \(\mathrm { S } _ { x x } = 965.6\) and \(\mathrm { S } _ { y y } = 1561.7\) correct to 1 decimal place.
  3. Calculate the product moment correlation coefficient for these 15 students.
  4. Calculate the equation of the line of regression of \(y\) on \(x\) for these 15 students, giving your answer in the form \(y = a + b x\) The product moment correlation coefficient between \(x\) and \(y\) for all 16 students is 0.746
  5. Explain how your calculation in part (e) supports Chetna's decision to omit the point \(( 38,0 )\) before calculating the equation of the linear regression line.
  6. Estimate the mark in the second paper for a student who scored 38 marks in the first paper.
    \includegraphics[max width=\textwidth, alt={}]{d3f4450d-60eb-49b6-be1b-d2fcfad0451f-17_1127_1146_301_406}
Edexcel S1 2021 January Q2
9 marks Easy -1.3
2. The stem and leaf diagram below shows the ages (in years) of the residents in a care home.
Age (Key: \(4 \mid 3\) is an age of 43)
4 | 3 (1)
5 | 4 (1)
6 | 2 3 5 6 8 8 8 9 9 (9)
7 | 1 1 3 4 4 6 6 6 8 8 9 (11)
8 | 0 0 2 7 8 8 9 (7)
9 | 3 7 (2)
  1. Find the median age of the residents.
  2. Find the interquartile range (IQR) of the ages of the residents. An outlier is defined as a value that is either
    more than \(1.5 \times ( \mathrm { IQR } )\) below the lower quartile or more than \(1.5 \times ( \mathrm { IQR } )\) above the upper quartile.
  3. Determine any outliers in these data. Show clearly any calculations that you use.
  4. On the grid on page 5, draw a box plot to summarise these data.
Edexcel S1 2021 January Q6
15 marks Moderate -0.3
6. A disc of radius 1 cm is rolled onto a horizontal grid of rectangles so that the disc is equally likely to land anywhere on the grid. Each rectangle is 5 cm long and 3 cm wide. There are no gaps between the rectangles and the grid is sufficiently large so that no discs roll off the grid.
If the disc lands inside a rectangle without covering any part of the edges of the rectangle then a prize is won. By considering the possible positions for the centre of the disc,
  1. show that the probability of winning a prize on any particular roll is \(\frac { 1 } { 5 }\) A group of 15 students each roll the disc onto the grid twenty times and record the number of times, \(x\), that each student wins a prize. Their results are summarised as follows $$\sum x = 61 \quad \sum x ^ { 2 } = 295$$
  2. Find the standard deviation of the number of prizes won per student. A second group of 12 students each roll the disc onto the grid twenty times and the mean number of prizes won per student is 3.5 with a standard deviation of 2
  3. Find the mean and standard deviation of the number of prizes won per student for the whole group of 27 students. The 27 students also recorded the number of times that the disc covered a corner of a rectangle and estimated the probability to be 0.2216 (to 4 decimal places).
  4. Explain how this probability could be used to find an estimate for the value of \(\pi\) and state the value of your estimate.
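Worked sketch (not an official solution): when a group is described only by its mean and standard deviation, \(\sum x\) and \(\sum x^2\) can be recovered first and then pooled with the other group's sums. Dividing by \(n\) throughout is an assumption about the expected convention.

```python
import math

# Group 1: summary sums given directly.
n1, sum1, sumsq1 = 15, 61, 295
sd1 = math.sqrt(sumsq1 / n1 - (sum1 / n1) ** 2)

# Group 2: recover Σx and Σx² from the mean and sd (divisor n assumed).
n2, mean2, sd2 = 12, 3.5, 2
sum2 = n2 * mean2
sumsq2 = n2 * (sd2 ** 2 + mean2 ** 2)

n = n1 + n2
mean = (sum1 + sum2) / n
sd = math.sqrt((sumsq1 + sumsq2) / n - mean ** 2)
print(round(sd1, 2), round(mean, 2), round(sd, 2))
```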
Edexcel S1 2023 January Q1
10 marks Moderate -0.3
The histogram shows the times taken, \(t\) minutes, by each of 100 people to swim 500 metres. \includegraphics[max width=\textwidth, alt={}, center]{c316fa29-dedc-4890-bd82-31eb0bb819f9-02_986_1070_342_424}
  1. Use the histogram to complete the frequency table for the times taken by the 100 people to swim 500 metres.
    Time taken ( \(\boldsymbol { t }\) minutes)\(5 - 10\)\(10 - 14\)\(14 - 18\)\(18 - 25\)\(25 - 40\)
    Frequency ( \(\boldsymbol { f }\) )101624
  2. Estimate the number of people who took less than 16 minutes to swim 500 metres.
  3. Find an estimate for the mean time taken to swim 500 metres. Given that \(\sum f t ^ { 2 } = 41033\)
  4. find an estimate for the standard deviation of the times taken to swim 500 metres. Given that \(Q _ { 3 } = 23\)
  5. use linear interpolation to estimate the interquartile range of the times taken to swim 500 metres.
Edexcel S1 2023 January Q5
17 marks Moderate -0.3
The lengths, \(L \mathrm {~mm}\), of housefly wings are normally distributed with \(L \sim \mathrm {~N} \left( 4.5,0.4 ^ { 2 } \right)\)
  1. Find the probability that a randomly selected housefly has a wing length of less than 3.86 mm .
  2. Find
    1. the upper quartile ( \(Q _ { 3 }\) ) of \(L\)
    2. the lower quartile ( \(Q _ { 1 }\) ) of \(L\) A value that is greater than \(Q _ { 3 } + 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right)\) or smaller than \(Q _ { 1 } - 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right)\) is defined as an outlier.
  3. Find these two outlier limits. A housefly is selected at random.
  4. Using standardisation, show that the probability that this housefly is not an outlier is 0.993 to 3 decimal places. Given that this housefly is not an outlier,
  5. showing your working, find the probability that the wing length of this housefly is greater than 5 mm .
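Worked sketch (not an official solution): Python's standard `statistics.NormalDist` can check the table-based answers. Values from printed tables may differ slightly in the last decimal place from the exact values computed here.

```python
from statistics import NormalDist

wing = NormalDist(mu=4.5, sigma=0.4)

p_short = wing.cdf(3.86)
q1, q3 = wing.inv_cdf(0.25), wing.inv_cdf(0.75)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

p_not_outlier = wing.cdf(high) - wing.cdf(low)
# Conditional probability: P(L > 5 | not an outlier)
p_long_given_ok = (wing.cdf(high) - wing.cdf(5)) / p_not_outlier
print(round(p_short, 4), round(q1, 2), round(q3, 2), round(p_not_outlier, 3))
```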
Edexcel S1 2024 January Q2
12 marks Moderate -0.3
2. The average minimum monthly temperature, \(x\) degrees Fahrenheit ( \({ } ^ { \circ } \mathrm { F }\) ), and the average maximum monthly temperature, \(y\) degrees Fahrenheit ( \({ } ^ { \circ } \mathrm { F }\) ), in Kolkata were recorded for 12 months.
Some of the summary statistics are given below. $$\sum x = 862 \quad \sum x ^ { 2 } = 62802 \quad \mathrm {~S} _ { y y } = 413.67 \quad S _ { x y } = 512.67 \quad n = 12$$
    1. Calculate the mean of the 12 values of the average minimum monthly temperature.
    2. Show that the standard deviation of the 12 values of the average minimum monthly temperature is \(8.57 ^ { \circ } \mathrm { F }\) to 3 significant figures.
  1. Calculate the product moment correlation coefficient between \(x\) and \(y\) For comparative purposes with a UK city, it was necessary to convert the temperatures from degrees Fahrenheit ( \({ } ^ { \circ } \mathrm { F }\) ) to degrees Celsius ( \({ } ^ { \circ } \mathrm { C }\) ). The formula used was $$c = \frac { 5 } { 9 } ( f - 32 )$$ where \(f\) is the temperature in \({ } ^ { \circ } \mathrm { F }\) and \(c\) is the temperature in \({ } ^ { \circ } \mathrm { C }\)
  2. Use this formula and the values from part (a) to calculate, in \({ } ^ { \circ } \mathrm { C }\), the mean and the standard deviation of the 12 values of the average minimum monthly temperature in Kolkata.
    Give your answers to 3 significant figures.
    Given that
    • \(u\) is the equivalent temperature in \({ } ^ { \circ } \mathrm { C }\) of \(x\)
    • \(v\) is the equivalent temperature in \({ } ^ { \circ } \mathrm { C }\) of \(y\)
  3. state, giving a reason, the product moment correlation coefficient between \(u\) and \(v\)
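Worked sketch (not an official solution): under the linear conversion \(c = \frac{5}{9}(f - 32)\) the mean follows the whole formula, while the standard deviation is only multiplied by \(\frac{5}{9}\), because subtracting 32 shifts every value equally. Dividing by \(n\) is an assumption about the expected convention.

```python
import math

n, sum_x, sum_x2 = 12, 862, 62802

mean_f = sum_x / n
sd_f = math.sqrt(sum_x2 / n - mean_f ** 2)  # divisor n assumed

# c = (5/9)(f - 32): the mean follows the full formula,
# the sd is only multiplied by the scale factor 5/9.
mean_c = 5 / 9 * (mean_f - 32)
sd_c = 5 / 9 * sd_f
print(round(mean_f, 1), round(sd_f, 2), round(mean_c, 1), round(sd_c, 2))
```

The same reasoning explains the last part: a positive linear rescaling of both variables leaves the product moment correlation coefficient unchanged.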
Edexcel S1 2024 January Q4
12 marks Moderate -0.8
4. A French test and a Spanish test were sat by 11 students.
The table below shows their marks.
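| Student | A | B | C | D | E | F | G | H | I | J | K |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| French mark ( \(\boldsymbol { f }\) ) | 24 | 30 | 32 | 32 | 36 | 36 | 40 | 44 | 50 | 60 | 68 |
| Spanish mark ( \(\boldsymbol { s }\) ) | 16 | 90 | 24 | 28 | 32 | 36 | 38 | 44 | 48 | 48 | 68 |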
Greg says that if these points were plotted on a scatter diagram, then the point \(( 30,90 )\) would be an outlier because 90 is an outlier for the Spanish marks. An outlier is defined as a value that is $$\text { greater than } Q _ { 3 } + 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { or smaller than } Q _ { 1 } - 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right)$$
  1. Show that 90 is an outlier for the Spanish marks. Ignoring the point (30, 90), Greg calculated the following summary statistics. $$\sum f = 422 \quad \sum s = 382 \quad S _ { f f } = 1667.6 \quad S _ { f s } = 1735.6$$
  2. Use these summary statistics to show that the equation of the least squares regression line of \(s\) on \(f\) for the remaining 10 students is $$s = - 5.72 + 1.04 f$$ where the values of the intercept and gradient are given to 3 significant figures. You must show your working.
  3. Give an interpretation of the gradient of the regression line. Two further students sat the French test but missed the Spanish test.
  4. Using the equation given in part (b), estimate
    1. a Spanish mark for the student who scored 55 marks in their French test,
    2. a Spanish mark for the student who scored 18 marks in their French test.
  5. State, giving a reason, which of the two estimates found in part (d) would be the more reliable estimate.
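Worked sketch (not an official solution): the least squares line of \(s\) on \(f\) has gradient \(b = S_{fs} / S_{ff}\) and intercept \(a = \bar{s} - b\bar{f}\). The function name `predict` is hypothetical.

```python
n = 10
sum_f, sum_s = 422, 382
s_ff, s_fs = 1667.6, 1735.6

b = s_fs / s_ff                    # gradient
a = sum_s / n - b * sum_f / n      # intercept: s̄ - b f̄

def predict(f):
    """Estimated Spanish mark for a French mark f."""
    return a + b * f

print(round(a, 2), round(b, 2), round(predict(55), 1), round(predict(18), 1))
```

A French mark of 55 lies inside the range of the data (24 to 68) while 18 lies outside it, which is the distinction part 5 is about.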
Edexcel S1 2014 June Q2
14 marks Moderate -0.8
2. The table below shows the distances (to the nearest km ) travelled to work by the 50 employees in an office.
| Distance (km) | Frequency (f) | Distance midpoint (\(x\)) |
| --- | --- | --- |
| 0-2 | 16 | 1.25 |
| 3-5 | 12 | 4 |
| 6-10 | 10 | 8 |
| 11-20 | 8 | 15.5 |
| 21-40 | 4 | 30.5 |
$$\text { [You may use } \left. \sum \mathrm { f } x = 394 , \quad \sum \mathrm { f } x ^ { 2 } = 6500 \right]$$ A histogram has been drawn to represent these data.
The bar representing the distance of \(3 - 5\) has a width of 1.5 cm and a height of 6 cm .
  1. Calculate the width and height of the bar representing the distance of 6-10
  2. Use linear interpolation to estimate the median distance travelled to work.
    1. Show that an estimate of the mean distance travelled to work is 7.88 km .
    2. Estimate the standard deviation of the distances travelled to work.
  3. Describe, giving a reason, the skewness of these data. Peng starts to work in this office as the \(51 ^ { \text {st } }\) employee.
    She travels a distance of 7.88 km to work.
  4. Without carrying out any further calculations, state, giving a reason, what effect Peng's addition to the workforce would have on your estimates of the
    1. mean,
    2. median,
    3. standard deviation
      of the distances travelled to work.
Edexcel S1 2004 January Q5
18 marks Moderate -0.3
5. The values of daily sales, to the nearest \(\pounds\), taken at a newsagents last year are summarised in the table below.
| Sales | Number of days |
| --- | --- |
| \(1 - 200\) | 166 |
| \(201 - 400\) | 100 |
| \(401 - 700\) | 59 |
| \(701 - 1000\) | 30 |
| \(1001 - 1500\) | 5 |
  1. Draw a histogram to represent these data.
  2. Use interpolation to estimate the median and inter-quartile range of daily sales.
  3. Estimate the mean and the standard deviation of these data. The newsagent wants to compare last year's sales with other years.
  4. State whether the newsagent should use the median and the inter-quartile range or the mean and the standard deviation to compare daily sales. Give a reason for your answer.
CAIE S1 2020 Specimen Q2
4 marks Easy -1.2
2 A summary of the speeds, \(x\) kilometres per hour, of 22 cars passing a certain point gave the following information: $$\Sigma ( x - 50 ) = 81.4 \text { and } \Sigma ( x - 50 ) ^ { 2 } = 671.0 .$$ Find the variance of the speeds and hence find the value of \(\Sigma x ^ { 2 }\).
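Worked sketch (not an official solution): subtracting 50 from every speed shifts the mean but leaves the variance unchanged, so the variance can be computed from the coded sums and then used to recover \(\Sigma x^2\). Dividing by \(n\) is an assumption about the expected convention.

```python
n = 22
sum_d, sum_d2 = 81.4, 671.0   # d = x - 50

mean_d = sum_d / n
variance = sum_d2 / n - mean_d ** 2   # subtracting 50 does not change the spread

# Undo the coding for the mean, then use Σx² = n(variance + mean²).
mean_x = mean_d + 50
sum_x2 = n * (variance + mean_x ** 2)
print(round(variance, 2), round(sum_x2))
```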
OCR S1 2009 January Q5
8 marks Easy -1.3
5 The stem-and-leaf diagram shows the masses, in grams, of 23 plums, measured correct to the nearest gram.
5 | 5 6 7 8 8 9
6 | 1 2 3 5 6 8 9
7 | 0 0 2 4 5 6 7 8
8 | 0
9 | 7
Key: \(6 \mid 2\) means 62
  1. Find the median and interquartile range of these masses.
  2. State one advantage of using the interquartile range rather than the standard deviation as a measure of the variation in these masses.
  3. State one advantage and one disadvantage of using a stem-and-leaf diagram rather than a box-and-whisker plot to represent data.
  4. James wished to calculate the mean and standard deviation of the given data. He first subtracted 5 from each of the digits to the left of the line in the stem-and-leaf diagram, giving the following.
    0 | 5 6 7 8 8 9
    1 | 1 2 3 5 6 8 9
    2 | 0 0 2 4 5 6 7 8
    3 | 0
    4 | 7
    The mean and standard deviation of the data in this diagram are 18.1 and 9.7 respectively, correct to 1 decimal place. Write down the mean and standard deviation of the data in the original diagram.
OCR S1 2012 January Q5
11 marks Moderate -0.8
5 At a certain resort the number of hours of sunshine, measured to the nearest hour, was recorded on each of 21 days. The results are summarised in the table.
| Hours of sunshine | 0 | \(1 - 3\) | \(4 - 6\) | \(7 - 9\) | \(10 - 15\) |
| --- | --- | --- | --- | --- | --- |
| Number of days | 0 | 6 | 9 | 4 | 2 |
The diagram shows part of a histogram to illustrate the data. The scale on the frequency density axis is 2 cm to 1 unit. \includegraphics[max width=\textwidth, alt={}, center]{56ca7462-d061-48d3-bc5f-274d925e4e34-3_944_1778_699_148}
  1. (a) Calculate the frequency density of the \(1 - 3\) class.
    (b) Fred wishes to draw the block for the 10 - 15 class on the same diagram. Calculate the height, in centimetres, of this block.
  2. A cumulative frequency graph is to be drawn. Write down the coordinates of the first two points that should be plotted. You are not asked to draw the graph.
  3. (a) Calculate estimates of the mean and standard deviation of the number of hours of sunshine.
    (b) Explain why your answers are only estimates.