2.02g Calculate mean and standard deviation

382 questions

Edexcel S1 2006 June Q2
10 marks Moderate -0.8
2. Sunita and Shelley talk to one another once a week on the telephone. Over many weeks they recorded, to the nearest minute, the number of minutes spent in conversation on each occasion. The following table summarises their results.
| Time (to the nearest minute) | Number of conversations |
|---|---|
| \(5 - 9\) | 2 |
| \(10 - 14\) | 9 |
| \(15 - 19\) | 20 |
| \(20 - 24\) | 13 |
| \(25 - 29\) | 8 |
| \(30 - 34\) | 3 |
Two of the conversations were chosen at random.
  1. Find the probability that both of them were longer than 24.5 minutes. The mid-point of each class was represented by \(x\) and its corresponding frequency by \(f\), giving \(\Sigma f x = 1060\).
  2. Calculate an estimate of the mean time spent on their conversations. During the following 25 weeks they monitored their weekly conversations and found that at the end of the 80 weeks their overall mean length of conversation was 21 minutes.
  3. Find the mean time spent in conversation during these 25 weeks.
  4. Comment on these two mean values.
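The arithmetic behind parts 2 and 3 can be sketched as below. This is an illustrative check, not a model answer: the variable names are my own, and it assumes the first recording period covers exactly the 55 conversations in the table, so that the remaining 25 weeks bring the total to 80.

```python
# Class frequencies from the table (5-9, 10-14, ..., 30-34 minutes).
frequencies = [2, 9, 20, 13, 8, 3]
sum_fx = 1060  # given in the question

n_first = sum(frequencies)       # conversations in the first period
mean_first = sum_fx / n_first    # estimated mean for the first period

# Part 3: the overall mean over 80 weeks is 21 minutes.
n_total, mean_total, n_second = 80, 21, 25
# Total minutes in the later 25 weeks = overall total - first-period total.
total_second = n_total * mean_total - sum_fx
mean_second = total_second / n_second

print(f"first period mean  = {mean_first:.2f} minutes")
print(f"later 25-week mean = {mean_second:.2f} minutes")
```

Because the first-period mean is itself an estimate from mid-points, the later mean inherits that approximation.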
Edexcel S1 2007 June Q5
17 marks Moderate -0.3
5. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{045e10d2-1766-4399-aa0a-5619dd0cce0f-10_726_1509_255_278} \captionsetup{labelformat=empty} \caption{Figure 2}
\end{figure} Figure 2 shows a histogram for the variable \(t\) which represents the time taken, in minutes, by a group of people to swim 500 m .
  1. Complete the frequency table for \(t\).
    | \(t\) | \(5 - 10\) | \(10 - 14\) | \(14 - 18\) | \(18 - 25\) | \(25 - 40\) |
    |---|---|---|---|---|---|
    | Frequency | 10 | 16 | 24 | | |
  2. Estimate the number of people who took longer than 20 minutes to swim 500 m .
  3. Find an estimate of the mean time taken.
  4. Find an estimate for the standard deviation of \(t\).
  5. Find the median and quartiles for \(t\). One measure of skewness is found using \(\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }\).
  6. Evaluate this measure and describe the skewness of these data.
Edexcel S1 2008 June Q2
14 marks Moderate -0.8
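2. The age in years of the residents of two hotels are shown in the back to back stem and leaf diagram below.

| | Abbey Hotel | | Balmoral Hotel | |
|---|---:|:---:|:---|---|
| (1) | 2 | 0 | | |
| (4) | 9 7 5 1 | 1 | | |
| (4) | 9 8 3 1 | 2 | 6 | (1) |
| (11) | 9 9 9 9 7 6 6 5 3 3 2 | 3 | 4 4 7 | (3) |
| (6) | 9 8 7 7 5 0 | 4 | 0 0 5 5 6 9 | (6) |
| (1) | 8 | 5 | 0 0 0 0 1 3 6 6 7 | (9) |
| | | 6 | 2 3 3 4 5 7 | (6) |
| | | 7 | 0 1 5 | (3) |

Key: \(8 | 5 | 0\) means 58 years in Abbey Hotel and 50 years in Balmoral Hotel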
For the Balmoral Hotel,
  1. write down the mode of the age of the residents,
  2. find the values of the lower quartile, the median and the upper quartile.
    1. Find the mean, \(\bar { x }\), of the age of the residents.
    2. Given that \(\sum x ^ { 2 } = 81213\) find the standard deviation of the age of the residents. One measure of skewness is found using $$\frac { \text { mean - mode } } { \text { standard deviation } }$$
  3. Evaluate this measure for the Balmoral Hotel. For the Abbey Hotel, the mode is 39 , the mean is 33.2 , the standard deviation is 12.7 and the measure of skewness is - 0.454
  4. Compare the two age distributions of the residents of each hotel.
Edexcel S1 2009 June Q4
13 marks Moderate -0.3
4. A researcher measured the foot lengths of a random sample of 120 ten-year-old children. The lengths are summarised in the table below.
| Foot length, \(l\), (cm) | Number of children |
|---|---|
| \(10 \leqslant l < 12\) | 5 |
| \(12 \leqslant l < 17\) | 53 |
| \(17 \leqslant l < 19\) | 29 |
| \(19 \leqslant l < 21\) | 15 |
| \(21 \leqslant l < 23\) | 11 |
| \(23 \leqslant l < 25\) | 7 |
  1. Use interpolation to estimate the median of this distribution.
  2. Calculate estimates for the mean and the standard deviation of these data. One measure of skewness is given by $$\text { Coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  3. Evaluate this coefficient and comment on the skewness of these data. Greg suggests that a normal distribution is a suitable model for the foot lengths of ten-year-old children.
  4. Using the value found in part (c), comment on Greg's suggestion, giving a reason for your answer.
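A minimal sketch of the calculations in parts 1 to 3, using the foot-length table above. The helper `interpolated_median` is a hypothetical name of my own; it assumes the median is taken at the \(n/2\)-th observation and that values are spread evenly within each class, which is the usual linear interpolation convention at this level.

```python
import math

# Class boundaries and frequencies from the foot-length table.
boundaries = [(10, 12), (12, 17), (17, 19), (19, 21), (21, 23), (23, 25)]
frequencies = [5, 53, 29, 15, 11, 7]


def interpolated_median(boundaries, frequencies):
    """Linear interpolation for the n/2-th value of grouped data."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("empty frequency table")
    target = n / 2
    cumulative = 0
    for (lower, upper), f in zip(boundaries, frequencies):
        if cumulative + f >= target:
            # Fraction of the way through the class that holds the median.
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("median class not found")


n = sum(frequencies)
midpoints = [(lo + hi) / 2 for lo, hi in boundaries]
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

median = interpolated_median(boundaries, frequencies)
mean = sum_fx / n
sd = math.sqrt(sum_fx2 / n - mean ** 2)
skewness = 3 * (mean - median) / sd

print(f"median = {median:.2f}, mean = {mean:.2f}, sd = {sd:.2f}")
print(f"coefficient of skewness = {skewness:.3f}")
```

A coefficient close to zero is consistent with a roughly symmetric distribution, which is the kind of evidence part 4 asks you to weigh against Greg's suggestion.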
Edexcel S1 2010 June Q5
14 marks Moderate -0.8
5. A teacher selects a random sample of 56 students and records, to the nearest hour, the time spent watching television in a particular week.
| Hours | \(1 - 10\) | \(11 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 40\) | \(41 - 59\) |
|---|---|---|---|---|---|---|
| Frequency | 6 | 15 | 11 | 13 | 8 | 3 |
| Mid-point | 5.5 | 15.5 | | 28 | | 50 |
  1. Find the mid-points of the 21-25 hour and 31-40 hour groups. A histogram was drawn to represent these data. The \(11 - 20\) group was represented by a bar of width 4 cm and height 6 cm .
  2. Find the width and height of the 26-30 group.
  3. Estimate the mean and standard deviation of the time spent watching television by these students.
  4. Use linear interpolation to estimate the median length of time spent watching television by these students. The teacher estimated the lower quartile and the upper quartile of the time spent watching television to be 15.8 and 29.3 respectively.
  5. State, giving a reason, the skewness of these data.
Edexcel S1 2012 June Q5
13 marks Moderate -0.8
5. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{0593544d-392d-465b-b922-c9cb1435abb5-08_1031_1239_116_354} \captionsetup{labelformat=empty} \caption{Figure 2}
\end{figure} A policeman records the speed of the traffic on a busy road with a 30 mph speed limit. He records the speeds of a sample of 450 cars. The histogram in Figure 2 represents the results.
  1. Calculate the number of cars that were exceeding the speed limit by at least 5 mph in the sample.
  2. Estimate the value of the mean speed of the cars in the sample.
  3. Estimate, to 1 decimal place, the value of the median speed of the cars in the sample.
  4. Comment on the shape of the distribution. Give a reason for your answer.
  5. State, with a reason, whether the estimate of the mean or the median is a better representation of the average speed of the traffic on the road.
Edexcel S1 2013 June Q3
13 marks Moderate -0.8
3. An agriculturalist is studying the yields, \(y \mathrm {~kg}\), from tomato plants. The data from a random sample of 70 tomato plants are summarised below.
| Yield ( \(y \mathrm {~kg}\) ) | Frequency (f) | Yield midpoint ( \(x \mathrm {~kg}\) ) |
|---|---|---|
| \(0 \leqslant y < 5\) | 16 | 2.5 |
| \(5 \leqslant y < 10\) | 24 | 7.5 |
| \(10 \leqslant y < 15\) | 14 | 12.5 |
| \(15 \leqslant y < 25\) | 12 | 20 |
| \(25 \leqslant y < 35\) | 4 | 30 |
$$\text { (You may use } \sum \mathrm { f } x = 755 \text { and } \sum \mathrm { f } x ^ { 2 } = 12037.5 \text { ) }$$ A histogram has been drawn to represent these data. The bar representing the yield \(5 \leqslant y < 10\) has a width of 1.5 cm and a height of 8 cm .
  1. Calculate the width and the height of the bar representing the yield \(15 \leqslant y < 25\)
  2. Use linear interpolation to estimate the median yield of the tomato plants.
  3. Estimate the mean and the standard deviation of the yields of the tomato plants.
  4. Describe, giving a reason, the skewness of the data.
  5. Estimate the number of tomato plants in the sample that have a yield of more than 1 standard deviation above the mean.
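As a check on part 3, the sketch below recomputes \(\sum \mathrm{f}x\) and \(\sum \mathrm{f}x^2\) from the table and then applies the summary-statistic formulas for the mean and standard deviation. The variable names are illustrative, and the divisor \(n\) (rather than \(n - 1\)) is assumed, as is usual for S1.

```python
import math

# Midpoints and frequencies from the tomato-yield table.
midpoints = [2.5, 7.5, 12.5, 20, 30]
frequencies = [16, 24, 14, 12, 4]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

mean = sum_fx / n
# Variance = (sum of f x^2) / n - mean^2, using divisor n.
variance = sum_fx2 / n - mean ** 2
sd = math.sqrt(variance)

print(f"sum fx = {sum_fx}, sum fx^2 = {sum_fx2}")
print(f"mean = {mean:.2f} kg, sd = {sd:.2f} kg")
```

The recomputed sums agree with the values quoted in the question, so either route gives the same estimates.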
Edexcel S1 2013 June Q2
11 marks Easy -1.3
  1. The marks of a group of female students in a statistics test are summarised in Figure 1
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{6faf2dd2-a114-40b7-88ae-4a75dbfb4706-04_629_1102_342_429} \captionsetup{labelformat=empty} \caption{Figure 1}
\end{figure}
  1. Write down the mark which is exceeded by \(75 \%\) of the female students. The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below.
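    | Mark | | Totals |
    |---|---|---|
    | 1 | 4 | (1) |
    | 2 | 6 | (1) |
    | 3 | 4 4 7 | (3) |
    | 4 | 0 6 6 7 7 8 | (6) |
    | 5 | 0 0 1 1 1 3 6 7 7 | (9) |
    | 6 | 2 2 3 3 3 8 | (6) |
    | 7 | 0 0 8 | (3) |
    | 8 | 5 | (1) |
    | 9 | 0 | (1) |

    Key: 2|6 means 26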
  2. Find the median and interquartile range of the marks of the male students. An outlier is a mark that is
    either more than \(1.5 \times\) interquartile range above the upper quartile or more than \(1.5 \times\) interquartile range below the lower quartile.
  3. In the space provided on Figure 1 draw a box plot to represent the marks of the male students, indicating clearly any outliers.
  4. Compare and contrast the marks of the male and the female students.
Edexcel S1 2013 June Q4
14 marks Moderate -0.8
4. The following table summarises the times, \(t\) minutes to the nearest minute, recorded for a group of students to complete an exam.
| Time (minutes) \(t\) | \(11 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 35\) | \(36 - 45\) | \(46 - 60\) |
|---|---|---|---|---|---|---|
| Number of students f | 62 | 88 | 16 | 13 | 11 | 10 |
$$\text { [You may use } \sum \mathrm { f } t ^ { 2 } = 134281.25 \text { ] }$$
  1. Estimate the mean and standard deviation of these data.
  2. Use linear interpolation to estimate the value of the median.
  3. Show that the estimated value of the lower quartile is 18.6 to 3 significant figures.
  4. Estimate the interquartile range of this distribution.
  5. Give a reason why the mean and standard deviation are not the most appropriate summary statistics to use with these data. The person timing the exam made an error and each student actually took 5 minutes less than the times recorded above. The table below summarises the actual times.
    | Time (minutes) \(t\) | \(6 - 15\) | \(16 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 40\) | \(41 - 55\) |
    |---|---|---|---|---|---|---|
    | Number of students f | 62 | 88 | 16 | 13 | 11 | 10 |
  6. Without further calculations, explain the effect this would have on each of the estimates found in parts (a), (b), (c) and (d).
Edexcel S1 2014 June Q5
12 marks Moderate -0.8
  1. The table shows the time, to the nearest minute, spent waiting for a taxi by each of 80 people one Sunday afternoon.
| Waiting time (in minutes) | Frequency |
|---|---|
| \(2 - 4\) | 15 |
| \(5 - 6\) | 9 |
| 7 | 6 |
| 8 | 24 |
| \(9 - 10\) | 14 |
| \(11 - 15\) | 12 |
  1. Write down the upper class boundary for the \(2 - 4\) minute interval. A histogram is drawn to represent these data. The height of the tallest bar is 6 cm .
  2. Calculate the height of the second tallest bar.
  3. Estimate the number of people with a waiting time between 3.5 minutes and 7 minutes.
  4. Use linear interpolation to estimate the median, the lower quartile and the upper quartile of the waiting times.
  5. Describe the skewness of these data, giving a reason for your answer.
Edexcel S1 2014 June Q1
9 marks Moderate -0.8
  1. A random sample of 35 homeowners was taken from each of the villages Greenslax and Penville and their ages were recorded. The results are summarised in the back-to-back stem and leaf diagram below.
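| Totals | Greenslax | | Penville | Totals |
|---|---:|:---:|:---|---|
| (2) | 8 7 | 2 | 5 5 6 7 8 8 9 | (7) |
| (3) | 9 8 7 | 3 | 1 1 1 2 3 4 4 5 6 9 | (11) |
| (4) | 4 4 4 0 | 4 | 0 1 2 4 7 | (5) |
| (5) | 6 6 5 2 2 | 5 | 0 0 5 5 5 | (5) |
| (7) | 8 6 5 4 2 1 1 | 6 | 2 5 6 6 | (4) |
| (8) | 8 6 6 4 3 1 1 | 7 | 0 5 | (2) |
| (5) | 9 8 4 3 2 | 8 | | (0) |
| (1) | 4 | 9 | 9 | (1) |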
Key: 7 | 3 | 1 means 37 years for Greenslax and 31 years for Penville
Some of the quartiles for these two distributions are given in the table below.
| | Greenslax | Penville |
|---|---|---|
| Lower quartile, \(Q _ { 1 }\) | \(a\) | 31 |
| Median, \(Q _ { 2 }\) | 64 | 39 |
| Upper quartile, \(Q _ { 3 }\) | \(b\) | 55 |
  1. Find the value of \(a\) and the value of \(b\). An outlier is a value that falls either $$\begin{aligned} & \text { more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { above } Q _ { 3 } \\ & \text { or more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { below } Q _ { 1 } \end{aligned}$$
  2. On the graph paper opposite draw a box plot to represent the data from Penville. Show clearly any outliers.
  3. State the skewness of each distribution. Justify your answers. \includegraphics[max width=\textwidth, alt={}, center]{8270bcae-494c-4248-8229-a72e9e84eab0-03_930_1237_1800_367}
Edexcel S1 2014 June Q2
4 marks Easy -1.2
2. The mark, \(x\), scored by each student who sat a statistics examination is coded using $$y = 1.4 x - 20$$ The coded marks have mean 60.8 and standard deviation 6.60 Find the mean and the standard deviation of \(x\). \includegraphics[max width=\textwidth, alt={}, center]{8270bcae-494c-4248-8229-a72e9e84eab0-04_99_97_2613_1784}
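Undoing a linear coding can be sketched as follows. For \(y = ax + c\) the mean transforms as \(\bar{y} = a\bar{x} + c\) and the standard deviation as \(\sigma_y = |a|\sigma_x\); the variable names below are my own.

```python
# Coding from the question: y = 1.4x - 20.
scale, shift = 1.4, -20
mean_y, sd_y = 60.8, 6.60

# Invert the coding: the shift affects the mean only,
# while the scale factor affects both mean and standard deviation.
mean_x = (mean_y - shift) / scale
sd_x = sd_y / abs(scale)

print(f"mean of x = {mean_x:.1f}, standard deviation of x = {sd_x:.2f}")
```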
Edexcel S1 2014 June Q6
11 marks Moderate -0.3
6. The times, in seconds, spent in a queue at a supermarket by 85 randomly selected customers, are summarised in the table below.
| Time (seconds) | Number of customers, \(f\) |
|---|---|
| 0-30 | 2 |
| 30-60 | 10 |
| 60-70 | 17 |
| 70-80 | 25 |
| 80-100 | 25 |
| 100-150 | 6 |
A histogram was drawn to represent these data. The \(30 - 60\) group was represented by a bar of width 1.5 cm and height 1 cm .
  1. Find the width and the height of the \(70 - 80\) group.
  2. Use linear interpolation to estimate the median of this distribution. Given that \(x\) denotes the midpoint of each group in the table and $$\sum f x = 6460 \quad \sum f x ^ { 2 } = 529400$$
  3. calculate an estimate for
    1. the mean,
    2. the standard deviation,
      for the above data. One measure of skewness is given by $$\text { coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  4. Evaluate this coefficient and comment on the skewness of these data.
Edexcel S1 2015 June Q1
14 marks Easy -1.2
Each of 60 students was asked to draw a \(20 ^ { \circ }\) angle without using a protractor. The size of each angle drawn was measured. The results are summarised in the box plot below. \includegraphics[max width=\textwidth, alt={}, center]{9626e3ce-35d6-41b5-a0bd-1185f38b9e36-02_371_1040_340_461}
  1. Find the range for these data.
  2. Find the interquartile range for these data. The students were then asked to draw a \(70 ^ { \circ }\) angle.
    The results are summarised in the table below.
    | Angle, \(\boldsymbol { a }\), (degrees) | Number of students |
    |---|---|
    | \(55 \leqslant a < 60\) | 6 |
    | \(60 \leqslant a < 65\) | 15 |
    | \(65 \leqslant a < 70\) | 13 |
    | \(70 \leqslant a < 75\) | 11 |
    | \(75 \leqslant a < 80\) | 8 |
    | \(80 \leqslant a < 85\) | 7 |
  3. Use linear interpolation to estimate the size of the median angle drawn. Give your answer to 1 decimal place.
  4. Show that the lower quartile is \(63 ^ { \circ }\) For these data, the upper quartile is \(75 ^ { \circ }\), the minimum is \(55 ^ { \circ }\) and the maximum is \(84 ^ { \circ }\) An outlier is an observation that falls either more than \(1.5 \times\) (interquartile range) above the upper quartile or more than \(1.5 \times\) (interquartile range) below the lower quartile.
    1. Show that there are no outliers for these data.
    2. Draw a box plot for these data on the grid on page 3.
  5. State which angle the students were more accurate at drawing. Give reasons for your answer.
    (3) \includegraphics[max width=\textwidth, alt={}, center]{9626e3ce-35d6-41b5-a0bd-1185f38b9e36-03_378_1059_2067_447}
Edexcel S1 2015 June Q2
8 marks Moderate -0.8
2. An estate agent recorded the price per square metre, \(p \pounds / \mathrm { m } ^ { 2 }\), for 7 two-bedroom houses. He then coded the data using the coding \(q = \frac { p - a } { b }\), where \(a\) and \(b\) are positive constants. His results are shown in the table below.
| \(p\) | 1840 | 1848 | 1830 | 1824 | 1819 | 1834 | 1850 |
|---|---|---|---|---|---|---|---|
| \(q\) | 4.0 | 4.8 | 3.0 | 2.4 | 1.9 | 3.4 | 5.0 |
  1. Find the value of \(a\) and the value of \(b\) The estate agent also recorded the distance, \(d \mathrm {~km}\), of each house from the nearest train station. The results are summarised below. $$\mathrm { S } _ { d d } = 1.02 \quad \mathrm {~S} _ { q q } = 8.22 \quad \mathrm {~S} _ { d q } = - 2.17$$
  2. Calculate the product moment correlation coefficient between \(d\) and \(q\)
  3. Write down the value of the product moment correlation coefficient between \(d\) and \(p\) The estate agent records the price and size of 2 additional two-bedroom houses, \(H\) and \(J\).
    | House | Price \(( \pounds )\) | Size \(\left( \mathrm { m } ^ { 2 } \right)\) |
    |---|---|---|
    | \(H\) | 156400 | 85 |
    | \(J\) | 172900 | 95 |
  4. Suggest which house is most likely to be closer to a train station. Justify your answer.
Edexcel S1 2016 June Q1
11 marks Moderate -0.8
  1. A biologist is studying the behaviour of bees in a hive. Once a bee has located a source of food, it returns to the hive and performs a dance to indicate to the other bees how far away the source of the food is. The dance consists of a series of wiggles. The biologist records the distance, \(d\) metres, of the food source from the hive and the average number of wiggles, \(w\), in the dance.
| Distance, \(\boldsymbol { d } \mathbf { m }\) | 30 | 50 | 80 | 100 | 150 | 400 | 500 | 650 |
|---|---|---|---|---|---|---|---|---|
| Average number of wiggles, \(\boldsymbol { w }\) | 0.725 | 1.210 | 1.775 | 2.250 | 3.518 | 6.382 | 8.185 | 9.555 |
[You may use \(\sum w = 33.6 \sum d w = 13833 \mathrm {~S} _ { d d } = 394600 \mathrm {~S} _ { w w } = 80.481\) (to 3 decimal places)]
  1. Show that \(\mathrm { S } _ { d w } = 5601\)
  2. State, giving a reason, which is the response variable.
  3. Calculate the product moment correlation coefficient for these data.
  4. Calculate the equation of the regression line of \(w\) on \(d\), giving your answer in the form \(w = a + b d\) A new source of food is located 350 m from the hive.
    1. Use your regression equation to estimate the average number of wiggles in the corresponding dance.
    2. Comment, giving a reason, on the reliability of your estimate.
Edexcel S1 2016 June Q3
10 marks Moderate -0.8
3. Before going on holiday to Seapron, Tania records the weekly rainfall ( \(x \mathrm {~mm}\) ) at Seapron for 8 weeks during the summer. Her results are summarised as $$\sum x = 86.8 \quad \sum x ^ { 2 } = 985.88$$
  1. Find the standard deviation, \(\sigma _ { x }\), for these data.
    (3) Tania also records the number of hours of sunshine ( \(y\) hours) per week at Seapron for these 8 weeks and obtains the following $$\bar { y } = 58 \quad \sigma _ { y } = 9.461 \text { (correct to } 4 \text { significant figures) } \quad \sum x y = 4900.5$$
  2. Show that \(\mathrm { S } _ { y y } = 716\) (correct to 3 significant figures)
  3. Find \(\mathrm { S } _ { x y }\)
  4. Calculate the product moment correlation coefficient, \(r\), for these data. During Tania's week-long holiday at Seapron there are 14 mm of rain and 70 hours of sunshine.
  5. State, giving a reason, what the effect of adding this information to the above data would be on the value of the product moment correlation coefficient.
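The summary-statistic steps in parts 1 to 4 can be sketched as below. This is a hedged illustration with my own variable names; it assumes divisor \(n\) for the standard deviation, so that \(\mathrm{S}_{yy} = n\sigma_y^2\).

```python
import math

n = 8
sum_x, sum_x2 = 86.8, 985.88   # rainfall summaries
mean_y, sd_y = 58, 9.461       # sunshine summaries
sum_xy = 4900.5

mean_x = sum_x / n
sd_x = math.sqrt(sum_x2 / n - mean_x ** 2)   # part 1

s_xx = sum_x2 - sum_x ** 2 / n
s_yy = n * sd_y ** 2                         # part 2
s_xy = sum_xy - n * mean_x * mean_y          # part 3
r = s_xy / math.sqrt(s_xx * s_yy)            # part 4

print(f"sd_x = {sd_x:.3f}, S_yy = {s_yy:.1f}, S_xy = {s_xy:.1f}, r = {r:.3f}")
```

The negative value of \(r\) is the context needed for part 5: the holiday week (14 mm of rain, 70 hours of sunshine) is above the mean on both variables.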
Edexcel S1 2016 June Q5
17 marks Moderate -0.8
5. A midwife records the weights, in kg , of a sample of 50 babies born at a hospital. Her results are given in the table below.
| Weight ( \(\boldsymbol { w } \mathbf { ~ k g }\) ) | Frequency (f) | Weight midpoint (x) |
|---|---|---|
| \(0 \leqslant w < 2\) | 1 | 1 |
| \(2 \leqslant w < 3\) | 8 | 2.5 |
| \(3 \leqslant w < 3.5\) | 17 | 3.25 |
| \(3.5 \leqslant w < 4\) | 17 | 3.75 |
| \(4 \leqslant w < 5\) | 7 | 4.5 |
[You may use \(\sum \mathrm { f } x ^ { 2 } = 611.375\) ] A histogram has been drawn to represent these data. The bar representing the weight \(2 \leqslant w < 3\) has a width of 1 cm and a height of 4 cm .
  1. Calculate the width and height of the bar representing a weight of \(3 \leqslant w < 3.5\)
  2. Use linear interpolation to estimate the median weight of these babies.
    1. Show that an estimate of the mean weight of these babies is 3.43 kg .
    2. Find an estimate of the standard deviation of the weights of these babies. Shyam decides to model the weights of babies born at the hospital, by the random variable \(W\), where \(W \sim \mathrm {~N} \left( 3.43,0.65 ^ { 2 } \right)\)
  3. Find \(\mathrm { P } ( W < 3 )\)
  4. With reference to your answers to (b), (c)(i) and (d) comment on Shyam's decision. A newborn baby weighing 3.43 kg is born at the hospital.
  5. Without carrying out any further calculations, state, giving a reason, what effect the addition of this newborn baby to the sample would have on your estimate of the
    1. mean,
    2. standard deviation.
Edexcel S1 2017 June Q2
14 marks Moderate -0.8
2. An estate agent is studying the cost of office space in London. He takes a random sample of 90 offices and calculates the cost, \(\pounds x\) per square foot. His results are given in the table below.
| Cost (£ \(\boldsymbol { x }\) ) | Frequency (f) | Midpoint (£y) |
|---|---|---|
| \(20 \leqslant x < 40\) | 12 | 30 |
| \(40 \leqslant x < 45\) | 13 | 42.5 |
| \(45 \leqslant x < 50\) | 25 | 47.5 |
| \(50 \leqslant x < 60\) | 32 | 55 |
| \(60 \leqslant x < 80\) | 8 | 70 |
A histogram is drawn for these data and the bar representing \(50 \leqslant x < 60\) is 2 cm wide and 8 cm high.
  1. Calculate the width and height of the bar representing \(20 \leqslant x < 40\)
  2. Use linear interpolation to estimate the median cost.
  3. Estimate the mean cost of office space for these data.
  4. Estimate the standard deviation for these data.
  5. Describe, giving a reason, the skewness. Rika suggests that the cost of office space in London can be modelled by a normal distribution with mean \(\pounds 50\) and standard deviation \(\pounds 10\)
  6. With reference to your answer to part (e), comment on Rika's suggestion.
  7. Use Rika's model to estimate the 80th percentile of the cost of office space in London.
Edexcel S1 2018 June Q2
12 marks Moderate -0.8
2. The following grouped frequency distribution summarises the number of minutes, to the nearest minute, that a random sample of 100 motorists were delayed by roadworks on a stretch of motorway one Monday.
| Delay (minutes) | Number of motorists (f) | Delay midpoint (x) |
|---|---|---|
| 3-6 | 38 | 4.5 |
| 7-8 | 25 | 7.5 |
| 9-10 | 18 | 9.5 |
| 11-15 | 12 | 13 |
| 16-20 | 7 | 18 |
(You may use \(\sum \mathrm { f } x ^ { 2 } = 8096.25\) ) A histogram has been drawn to represent these data. The bar representing a delay of (3-6) minutes has a width of 2 cm and a height of 9.5 cm .
  1. Calculate the width and the height of the bar representing a delay of (11-15) minutes.
  2. Use linear interpolation to estimate the median delay.
  3. Calculate an estimate of the mean delay.
  4. Calculate an estimate of the standard deviation of the delays. One coefficient of skewness is given by \(\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }\)
  5. Evaluate this coefficient for the above data, giving your answer to 2 significant figures. On the following Friday, the coefficient of skewness for the delays on this stretch of motorway was - 0.22
  6. State, giving a reason, how the delays on this stretch of motorway on Friday are different from the delays on Monday.
Edexcel S1 Q3
11 marks Standard +0.3
3. Data relating to the lifetimes (to the nearest hour) of a random sample of 200 light bulbs from the production line of a manufacturer were summarised in a group frequency table. The mid-point of each group in the table was represented by \(x\) and the corresponding frequency for that group by \(f\). The data were then coded using \(y = \frac { ( x - 755.0 ) } { 2.5 }\) and summarised as follows: $$\Sigma f y = - 467 , \Sigma f y ^ { 2 } = 9179 .$$
  1. Calculate estimates of the mean and the standard deviation of the lifetimes of this sample of bulbs.
    (9 marks)
    An estimate of the interquartile range for these data was 27.7 hours.
  2. Explain, giving a reason, whether you would recommend the manufacturer to use the interquartile range or the standard deviation to represent the spread of lifetimes of the bulbs from this production line.
    (2 marks)
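A minimal sketch of part 1: first find the mean and standard deviation of the coded variable \(y\), then undo the coding \(y = (x - 755.0)/2.5\). Variable names are illustrative, and divisor \(n\) is assumed for the standard deviation.

```python
import math

n = 200
sum_fy, sum_fy2 = -467, 9179
origin, scale = 755.0, 2.5   # coding: y = (x - origin) / scale

# Summary statistics of the coded values.
mean_y = sum_fy / n
sd_y = math.sqrt(sum_fy2 / n - mean_y ** 2)

# Undo the coding: x = origin + scale * y.
mean_x = origin + scale * mean_y
sd_x = scale * sd_y

print(f"mean lifetime = {mean_x:.1f} hours, sd = {sd_x:.1f} hours")
```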
Edexcel S1 2003 November Q6
16 marks Moderate -0.8
6. A travel agent sells holidays from his shop. The price, in \(\pounds\), of 15 holidays sold on a particular day are shown below.
| | | | | |
|---|---|---|---|---|
| 299 | 1050 | 2315 | 999 | 485 |
| 350 | 169 | 1015 | 650 | 830 |
| 99 | 2100 | 689 | 550 | 475 |
For these data, find
  1. the mean and the standard deviation,
  2. the median and the inter-quartile range. An outlier is an observation that falls either more than \(1.5 \times\) (inter-quartile range) above the upper quartile or more than \(1.5 \times\) (inter-quartile range) below the lower quartile.
  3. Determine if any of the prices are outliers. The travel agent also sells holidays from a website on the Internet. On the same day, he recorded the price, \(\pounds x\), of each of 20 holidays sold on the website. The cheapest holiday sold was \(\pounds 98\), the most expensive was \(\pounds 2400\) and the quartiles of these data were \(\pounds 305 , \pounds 1379\) and \(\pounds 1805\). There were no outliers.
  4. On graph paper, and using the same scale, draw box plots for the holidays sold in the shop and the holidays sold on the website.
  5. Compare and contrast sales from the shop and sales from the website.
Edexcel S1 2004 November Q1
14 marks Moderate -0.8
  1. As part of their job, taxi drivers record the number of miles they travel each day. A random sample of the mileages recorded by taxi drivers Keith and Asif are summarised in the back-to-back stem and leaf diagram below.
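| Totals | Keith | | Asif | Totals |
|---|---:|:---:|:---|---|
| (9) | 8 7 4 3 2 1 1 0 | 18 | 4 4 5 7 | (4) |
| (11) | 9 8 6 5 4 3 3 1 1 | 19 | 5 7 8 9 9 | (5) |
| (6) | 8 7 4 2 2 0 | 20 | 0 2 2 4 4 8 | (6) |
| (6) | 9 4 3 1 0 0 | 21 | 2 3 5 6 6 7 9 | (7) |
| (4) | 6 4 1 1 | 22 | 1 1 2 4 5 5 8 | (7) |
| (2) | 2 0 | 23 | 1 1 3 4 6 6 7 8 | (8) |
| (2) | 7 1 | 24 | 2 4 8 9 | (4) |
| (1) | 9 | 25 | 4 | (1) |
| (2) | 9 3 | 26 | | (0) |

Key: 0 | 18 | 4 means 180 for Keith and 184 for Asif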
The quartiles for these two distributions are summarised in the table below.
| | Keith | Asif |
|---|---|---|
| Lower quartile | 191 | \(a\) |
| Median | \(b\) | 218 |
| Upper quartile | 221 | \(c\) |
  1. Find the values of \(a , b\) and \(c\). Outliers are values that lie outside the limits $$Q _ { 1 } - 1.5 \left( Q _ { 3 } - Q _ { 1 } \right) \text { and } Q _ { 3 } + 1.5 \left( Q _ { 3 } - Q _ { 1 } \right) .$$
  2. On graph paper, and showing your scale clearly, draw a box plot to represent Keith's data.
  3. Comment on the skewness of the two distributions.
Edexcel S1 2004 November Q6
18 marks Easy -1.2
6. Students in Mr Brawn's exercise class have to do press-ups and sit-ups. The number of press-ups \(x\) and the number of sit-ups \(y\) done by a random sample of 8 students are summarised below. $$\begin{array} { l l } \Sigma x = 272 , & \Sigma x ^ { 2 } = 10164 , \quad \Sigma x y = 11222 , \\ \Sigma y = 320 , & \Sigma y ^ { 2 } = 13464 . \end{array}$$
  1. Evaluate \(S _ { x x } , S _ { y y }\) and \(S _ { x y }\).
  2. Calculate, to 3 decimal places, the product moment correlation coefficient between \(x\) and \(y\).
  3. Give an interpretation of your coefficient.
  4. Calculate the mean and the standard deviation of the number of press-ups done by these students. Mr Brawn assumes that the number of press-ups that can be done by any student can be modelled by a normal distribution with mean \(\mu\) and standard deviation \(\sigma\). Assuming that \(\mu\) and \(\sigma\) take the same values as those calculated in part (d),
  5. find the value of \(a\) such that \(\mathrm { P } ( \mu - a < X < \mu + a ) = 0.95\).
  6. Comment on Mr Brawn's assumption of normality.
Edexcel S3 2017 June Q4
11 marks Standard +0.3
4. The number of emergency plumbing calls received per day by a local council was recorded over a period of 80 days. The results are summarised in the table below.
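| Number of calls, \(\boldsymbol { x }\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Frequency | 3 | 13 | 14 | 15 | 10 | 8 | 8 | 6 | 3 |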
  1. Show that the mean number of emergency plumbing calls received per day is 3.5 A council officer suggests that a Poisson distribution can be used to model the number of emergency plumbing calls received per day. He uses the mean from the sample above and calculates the expected frequencies shown in the table below.
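    | \(\boldsymbol { x }\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 or more |
    |---|---|---|---|---|---|---|---|---|---|
    | Expected frequency | 2.42 | 8.46 | 14.80 | \(r\) | 15.10 | 10.57 | 6.17 | 3.08 | \(s\) |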
  2. Calculate the value of \(r\) and the value of \(s\), giving your answers correct to 2 decimal places.
  3. Test, at the \(5 \%\) level of significance, whether or not the Poisson distribution is a suitable model for the number of emergency plumbing calls received per day. State your hypotheses clearly.
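Parts 1 and 2 can be checked with the sketch below, which computes the sample mean and the expected Poisson frequencies \(80 \times \mathrm{P}(X = k)\). The helper `poisson_pmf` is my own; the final cell is taken as the remainder so that the expected frequencies sum to 80, which is one common convention (a mark scheme may instead subtract the rounded table values).

```python
import math

calls = list(range(9))
observed = [3, 13, 14, 15, 10, 8, 8, 6, 3]

n_days = sum(observed)
mean_calls = sum(x * f for x, f in zip(calls, observed)) / n_days


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Expected frequencies for x = 0..7, then "8 or more" as the remainder.
expected = [n_days * poisson_pmf(k, mean_calls) for k in range(8)]
expected.append(n_days - sum(expected))

r, s = expected[3], expected[8]
print(f"mean = {mean_calls}, r = {r:.2f}, s = {s:.2f}")
```

For part 3, cells with expected frequency below 5 would normally be pooled with a neighbour before computing the test statistic, and the degrees of freedom lose one for the total and one for the estimated mean.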