2.02h Recognize outliers

154 questions

Edexcel S1 2009 January Q5
16 marks Standard +0.3
5. In a shopping survey a random sample of 104 teenagers were asked how many hours, to the nearest hour, they spent shopping in the last month. The results are summarised in the table below.
| Number of hours | Mid-point | Frequency |
|---|---|---|
| 0-5 | 2.75 | 20 |
| 6-7 | 6.5 | 16 |
| 8-10 | 9 | 18 |
| 11-15 | 13 | 25 |
| 16-25 | 20.5 | 15 |
| 26-50 | 38 | 10 |
A histogram was drawn and the group ( \(8 - 10\) ) hours was represented by a rectangle that was 1.5 cm wide and 3 cm high.
  1. Calculate the width and height of the rectangle representing the group (16-25) hours.
  2. Use linear interpolation to estimate the median and interquartile range.
  3. Estimate the mean and standard deviation of the number of hours spent shopping.
  4. State, giving a reason, the skewness of these data.
  5. State, giving a reason, which average and measure of dispersion you would recommend to use to summarise these data.
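The linear interpolation asked for in this question can be sketched in a few lines of Python. This is only an illustrative sketch, not a mark-scheme method: the helper name `interpolate` is hypothetical, and the class boundaries (0 to 5.5, 5.5 to 7.5, and so on) are an assumption inferred from the "nearest hour" rounding and the tabulated mid-point of 2.75 for the first class.

```python
# Grouped data from the question: (lower boundary, upper boundary, frequency).
# The boundaries are an assumption based on rounding to the nearest hour,
# with the first class starting at 0 (consistent with its mid-point of 2.75).
classes = [
    (0.0, 5.5, 20),
    (5.5, 7.5, 16),
    (7.5, 10.5, 18),
    (10.5, 15.5, 25),
    (15.5, 25.5, 15),
    (25.5, 50.5, 10),
]


def interpolate(classes, position):
    """Estimate the value at a cumulative-frequency position by linear interpolation."""
    cumulative = 0
    for lower, upper, freq in classes:
        if cumulative + freq >= position:
            # Fraction of the way through this class, scaled by the class width.
            return lower + (position - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("position exceeds the total frequency")


n = sum(freq for _, _, freq in classes)   # 104 teenagers
q1 = interpolate(classes, n / 4)
median = interpolate(classes, n / 2)
q3 = interpolate(classes, 3 * n / 4)
iqr = q3 - q1

print(round(q1, 2), round(median, 2), round(q3, 2), round(iqr, 2))
```

The positions n/4, n/2 and 3n/4 follow the usual convention for grouped (continuous) data; a different convention would shift the estimates slightly.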
Edexcel S1 2011 January Q3
9 marks Easy -1.2
3. Over a long period of time a small company recorded the amount it received in sales per month. The results are summarised below.
| | Amount received in sales (£1000s) |
|---|---|
| Two lowest values | 3, 4 |
| Lower quartile | 7 |
| Median | 12 |
| Upper quartile | 14 |
| Two highest values | 20, 25 |
An outlier is an observation that falls either \(1.5 \times\) interquartile range above the upper quartile or \(1.5 \times\) interquartile range below the lower quartile.
  1. On the graph paper below, draw a box plot to represent these data, indicating clearly any outliers.
    (5) \includegraphics[max width=\textwidth, alt={}, center]{c78ec7b6-dd06-4de1-94c2-052a5577dd10-05_933_1226_1283_367}
  2. State the skewness of the distribution of the amount of sales received. Justify your answer.
  3. The company claims that for \(75 \%\) of the months, the amount received per month is greater than \(\pounds 10000\). Comment on this claim, giving a reason for your answer.
    (2)
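The outlier rule quoted in this question reduces to two boundary values (often called fences). The sketch below, with a hypothetical helper `outlier_fences`, applies the rule to the quartiles and extreme values given in the table; it illustrates the arithmetic only and does not draw the box plot.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) for the k x IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles from the sales table, in £1000s.
lower, upper = outlier_fences(q1=7, q3=14)

# Only the two lowest and two highest values are given, so only these are checked.
extremes = [3, 4, 20, 25]
outliers = [x for x in extremes if x < lower or x > upper]

print(lower, upper, outliers)
```

Any value flagged here would be marked separately on the box plot, with the whisker ending at the fence or at the most extreme value that is not an outlier, depending on the convention in use.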
Edexcel S1 2011 January Q5
7 marks Moderate -0.8
5. On a randomly chosen day, each of the 32 students in a class recorded the time, \(t\) minutes to the nearest minute, they spent on their homework. The data for the class is summarised in the following table.
| Time, \(t\) | Number of students |
|---|---|
| 10-19 | 2 |
| 20-29 | 4 |
| 30-39 | 8 |
| 40-49 | 11 |
| 50-69 | 5 |
| 70-79 | 2 |
  1. Use interpolation to estimate the value of the median. Given that $$\sum t = 1414 \quad \text { and } \quad \sum t ^ { 2 } = 69378$$
  2. find the mean and the standard deviation of the times spent by the students on their homework.
  3. Comment on the skewness of the distribution of the times spent by the students on their homework. Give a reason for your answer.
Edexcel S1 2012 January Q4
13 marks Easy -1.3
  1. The marks, \(x\), of 45 students randomly selected from those students who sat a mathematics examination are shown in the stem and leaf diagram below.
MarkTotals
36999\(( 3 )\)
40122234\(( 6 )\)
4566668\(( 5 )\)
50233344\(( 6 )\)
55566779\(( 6 )\)
600000013444\(( 9 )\)
65566789\(( 6 )\)
712333\(( 4 )\)
Key(3|6 means 36)
  1. Write down the modal mark of these students.
  2. Find the values of the lower quartile, the median and the upper quartile. For these students \(\sum x = 2497\) and \(\sum x ^ { 2 } = 143369\)
  3. Find the mean and the standard deviation of the marks of these students.
  4. Describe the skewness of the marks of these students, giving a reason for your answer. The mean and standard deviation of the marks of all the students who sat the examination were 55 and 10 respectively. The examiners decided that the total mark of each student should be scaled by subtracting 5 marks and then reducing the mark by a further \(10 \%\).
  5. Find the mean and standard deviation of the scaled marks of all the students.
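The scaling in the final part is a linear transformation, so the summary statistics can be transformed directly. A minimal sketch, assuming the scaling is read as \(y = 0.9(x - 5)\) (subtract 5, then reduce by 10%); the function name `scale_summary` is hypothetical.

```python
def scale_summary(mean, sd, shift, factor):
    """Mean and standard deviation of y = factor * (x - shift).

    Subtracting a constant shifts the mean but leaves the spread unchanged;
    multiplying by a constant scales both the mean and the standard deviation.
    """
    return factor * (mean - shift), abs(factor) * sd


# Mean 55 and standard deviation 10 for all students, as stated in the question.
new_mean, new_sd = scale_summary(mean=55, sd=10, shift=5, factor=0.9)
print(round(new_mean, 2), round(new_sd, 2))
```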
Edexcel S1 2005 June Q4
10 marks Easy -1.2
4. Aeroplanes fly from City \(A\) to City \(B\). Over a long period of time the number of minutes delay in take-off from City \(A\) was recorded. The minimum delay was 5 minutes and the maximum delay was 63 minutes. A quarter of all delays were at most 12 minutes, half were at most 17 minutes and \(75 \%\) were at most 28 minutes. Only one of the delays was longer than 45 minutes. An outlier is an observation that falls either \(1.5 \times\) (interquartile range) above the upper quartile or \(1.5 \times\) (interquartile range) below the lower quartile.
  1. On the graph paper opposite draw a box plot to represent these data.
  2. Comment on the distribution of delays. Justify your answer.
  3. Suggest how the distribution might be interpreted by a passenger who frequently flies from City \(A\) to City \(B\). \includegraphics[max width=\textwidth, alt={}, center]{9698650f-ef85-468d-a703-1b40df7f9d02-07_1190_1487_278_223}
Edexcel S1 2006 June Q1
15 marks Easy -1.8
  1. (a) Describe the main features and uses of a box plot.
Children from schools \(A\) and \(B\) took part in a fun run for charity. The times, to the nearest minute, taken by the children from school \(A\) are summarised in Figure 1. \begin{figure}[h]
\captionsetup{labelformat=empty} \caption{Figure 1} \includegraphics[alt={},max width=\textwidth]{c8bade79-a39a-4055-bfae-928f5338fdfc-02_398_1045_946_461}
\end{figure} (b) (i) Write down the time by which \(75 \%\) of the children in school \(A\) had completed the run.
(ii) State the name given to this value.
(c) Explain what you understand by the two crosses ( X ) on Figure 1.
For school \(B\) the least time taken by any of the children was 25 minutes and the longest time was 55 minutes. The three quartiles were 30,37 and 50 respectively.
(d) Draw a box plot to represent the data from school \(B\). \includegraphics[max width=\textwidth, alt={}, center]{c8bade79-a39a-4055-bfae-928f5338fdfc-03_798_1196_580_372}
(e) Compare and contrast these two box plots.
Edexcel S1 2007 June Q2
10 marks Moderate -0.8
2. The box plot in Figure 1 shows a summary of the weights of the luggage, in kg, for each musician in an orchestra on an overseas tour. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{045e10d2-1766-4399-aa0a-5619dd0cce0f-03_346_1452_324_228} \captionsetup{labelformat=empty} \caption{Figure 1}
\end{figure} The airline's recommended weight limit for each musician's luggage was 45 kg . Given that none of the musicians' luggage weighed exactly 45 kg ,
  1. state the proportion of the musicians whose luggage was below the recommended weight limit. A quarter of the musicians had to pay a charge for taking heavy luggage.
  2. State the smallest weight for which the charge was made.
  3. Explain what you understand by the + on the box plot in Figure 1, and suggest an instrument that the owner of this luggage might play.
  4. Describe the skewness of this distribution. Give a reason for your answer. One musician of the orchestra suggests that the weights of luggage, in kg, can be modelled by a normal distribution with quartiles as given in Figure 1.
  5. Find the standard deviation of this normal distribution.
Edexcel S1 2010 June Q5
14 marks Moderate -0.8
5. A teacher selects a random sample of 56 students and records, to the nearest hour, the time spent watching television in a particular week.
| Hours | \(1 - 10\) | \(11 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 40\) | \(41 - 59\) |
|---|---|---|---|---|---|---|
| Frequency | 6 | 15 | 11 | 13 | 8 | 3 |
| Mid-point | 5.5 | 15.5 | | 28 | | 50 |
  1. Find the mid-points of the 21-25 hour and 31-40 hour groups. A histogram was drawn to represent these data. The \(11 - 20\) group was represented by a bar of width 4 cm and height 6 cm .
  2. Find the width and height of the 26-30 group.
  3. Estimate the mean and standard deviation of the time spent watching television by these students.
  4. Use linear interpolation to estimate the median length of time spent watching television by these students. The teacher estimated the lower quartile and the upper quartile of the time spent watching television to be 15.8 and 29.3 respectively.
  5. State, giving a reason, the skewness of these data.
Edexcel S1 2010 June Q7
12 marks Standard +0.3
7. The distances travelled to work, \(D \mathrm {~km}\), by the employees at a large company are normally distributed with \(D \sim \mathrm {~N} \left( 30,8 ^ { 2 } \right)\).
  1. Find the probability that a randomly selected employee has a journey to work of more than 20 km .
  2. Find the upper quartile, \(Q _ { 3 }\), of \(D\).
  3. Write down the lower quartile, \(Q _ { 1 }\), of \(D\). An outlier is defined as any value of \(D\) such that \(D < h\) or \(D > k\) where $$h = Q _ { 1 } - 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \quad \text { and } \quad k = Q _ { 3 } + 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right)$$
  4. Find the value of \(h\) and the value of \(k\). An employee is selected at random.
  5. Find the probability that the distance travelled to work by this employee is an outlier.
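The chain of calculations in this question (quartiles of a normal distribution, then the outlier limits \(h\) and \(k\), then the probability of falling outside them) can be checked numerically with Python's standard-library `statistics.NormalDist`. This is a sketch for checking working, not the expected exam method: answers obtained from printed tables with a rounded z-value such as 0.67 will differ slightly from these values.

```python
from statistics import NormalDist

# D ~ N(30, 8^2), as given in the question.
D = NormalDist(mu=30, sigma=8)

# Probability of a journey of more than 20 km.
p_more_than_20 = 1 - D.cdf(20)

# Quartiles from the inverse cumulative distribution function.
q3 = D.inv_cdf(0.75)
q1 = D.inv_cdf(0.25)   # by symmetry, q1 = 60 - q3

# Outlier limits as defined in the question.
iqr = q3 - q1
h = q1 - 1.5 * iqr
k = q3 + 1.5 * iqr

# Probability that a randomly selected distance lies below h or above k.
p_outlier = D.cdf(h) + (1 - D.cdf(k))

print(round(p_more_than_20, 4), round(q1, 2), round(q3, 2))
print(round(h, 2), round(k, 2), round(p_outlier, 4))
```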
Edexcel S1 2012 June Q5
13 marks Moderate -0.8
5. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{0593544d-392d-465b-b922-c9cb1435abb5-08_1031_1239_116_354} \captionsetup{labelformat=empty} \caption{Figure 2}
\end{figure} A policeman records the speed of the traffic on a busy road with a 30 mph speed limit. He records the speeds of a sample of 450 cars. The histogram in Figure 2 represents the results.
  1. Calculate the number of cars that were exceeding the speed limit by at least 5 mph in the sample.
  2. Estimate the value of the mean speed of the cars in the sample.
  3. Estimate, to 1 decimal place, the value of the median speed of the cars in the sample.
  4. Comment on the shape of the distribution. Give a reason for your answer.
  5. State, with a reason, whether the estimate of the mean or the median is a better representation of the average speed of the traffic on the road.
Edexcel S1 2013 June Q3
13 marks Moderate -0.8
3. An agriculturalist is studying the yields, \(y \mathrm {~kg}\), from tomato plants. The data from a random sample of 70 tomato plants are summarised below.
| Yield (\(y \mathrm {~kg}\)) | Frequency (f) | Yield midpoint (\(x \mathrm {~kg}\)) |
|---|---|---|
| \(0 \leqslant y < 5\) | 16 | 2.5 |
| \(5 \leqslant y < 10\) | 24 | 7.5 |
| \(10 \leqslant y < 15\) | 14 | 12.5 |
| \(15 \leqslant y < 25\) | 12 | 20 |
| \(25 \leqslant y < 35\) | 4 | 30 |
$$\text { (You may use } \sum \mathrm { f } x = 755 \text { and } \sum \mathrm { f } x ^ { 2 } = 12037.5 \text { ) }$$ A histogram has been drawn to represent these data. The bar representing the yield \(5 \leqslant y < 10\) has a width of 1.5 cm and a height of 8 cm .
  1. Calculate the width and the height of the bar representing the yield \(15 \leqslant y < 25\)
  2. Use linear interpolation to estimate the median yield of the tomato plants.
  3. Estimate the mean and the standard deviation of the yields of the tomato plants.
  4. Describe, giving a reason, the skewness of the data.
  5. Estimate the number of tomato plants in the sample that have a yield of more than 1 standard deviation above the mean.
Edexcel S1 2013 June Q2
11 marks Easy -1.3
  1. The marks of a group of female students in a statistics test are summarised in Figure 1
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{6faf2dd2-a114-40b7-88ae-4a75dbfb4706-04_629_1102_342_429} \captionsetup{labelformat=empty} \caption{Figure 1}
\end{figure}
  1. Write down the mark which is exceeded by \(75 \%\) of the female students. The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below.
    Mark (2|6 means 26)         Totals
    1 | 4                       (1)
    2 | 6                       (1)
    3 | 4 4 7                   (3)
    4 | 0 6 6 7 7 8             (6)
    5 | 0 0 1 1 1 3 6 7 7       (9)
    6 | 2 2 3 3 3 8             (6)
    7 | 0 0 8                   (3)
    8 | 5                       (1)
    9 | 0                       (1)
  2. Find the median and interquartile range of the marks of the male students.
    An outlier is a mark that is either more than \(1.5 \times\) interquartile range above the upper quartile or more than \(1.5 \times\) interquartile range below the lower quartile.
  3. In the space provided on Figure 1 draw a box plot to represent the marks of the male students, indicating clearly any outliers.
  4. Compare and contrast the marks of the male and the female students.
Edexcel S1 2013 June Q4
14 marks Moderate -0.8
4. The following table summarises the times, \(t\) minutes to the nearest minute, recorded for a group of students to complete an exam.
| Time (minutes) \(t\) | \(11 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 35\) | \(36 - 45\) | \(46 - 60\) |
|---|---|---|---|---|---|---|
| Number of students f | 62 | 88 | 16 | 13 | 11 | 10 |
$$\text { [You may use } \sum \mathrm { f } t ^ { 2 } = 134281.25 \text { ] }$$
  1. Estimate the mean and standard deviation of these data.
  2. Use linear interpolation to estimate the value of the median.
  3. Show that the estimated value of the lower quartile is 18.6 to 3 significant figures.
  4. Estimate the interquartile range of this distribution.
  5. Give a reason why the mean and standard deviation are not the most appropriate summary statistics to use with these data. The person timing the exam made an error and each student actually took 5 minutes less than the times recorded above. The table below summarises the actual times.
    | Time (minutes) \(t\) | \(6 - 15\) | \(16 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 40\) | \(41 - 55\) |
    |---|---|---|---|---|---|---|
    | Number of students f | 62 | 88 | 16 | 13 | 11 | 10 |
  6. Without further calculations, explain the effect this would have on each of the estimates found in parts (a), (b), (c) and (d).
Edexcel S1 2014 June Q5
12 marks Moderate -0.8
  1. The table shows the time, to the nearest minute, spent waiting for a taxi by each of 80 people one Sunday afternoon.
| Waiting time (in minutes) | Frequency |
|---|---|
| \(2 - 4\) | 15 |
| \(5 - 6\) | 9 |
| 7 | 6 |
| 8 | 24 |
| \(9 - 10\) | 14 |
| \(11 - 15\) | 12 |
  1. Write down the upper class boundary for the \(2 - 4\) minute interval. A histogram is drawn to represent these data. The height of the tallest bar is 6 cm .
  2. Calculate the height of the second tallest bar.
  3. Estimate the number of people with a waiting time between 3.5 minutes and 7 minutes.
  4. Use linear interpolation to estimate the median, the lower quartile and the upper quartile of the waiting times.
  5. Describe the skewness of these data, giving a reason for your answer.
Edexcel S1 2014 June Q1
9 marks Moderate -0.8
  1. A random sample of 35 homeowners was taken from each of the villages Greenslax and Penville and their ages were recorded. The results are summarised in the back-to-back stem and leaf diagram below.
TotalsGreenslaxPenvilleTotals
(2)8725567889(7)
(3)98731112344569(11)
(4)4440401247(5)
(5)66522500555(5)
(7)865421162566(4)
(8)8664311705(2)
(5)984328(0)
(1)499(1)
Key: 7 | 3 | 1 means 37 years for Greenslax and 31 years for Penville
Some of the quartiles for these two distributions are given in the table below.
| | Greenslax | Penville |
|---|---|---|
| Lower quartile, \(Q _ { 1 }\) | \(a\) | 31 |
| Median, \(Q _ { 2 }\) | 64 | 39 |
| Upper quartile, \(Q _ { 3 }\) | \(b\) | 55 |
  1. Find the value of \(a\) and the value of \(b\). An outlier is a value that falls either $$\begin{aligned} & \text { more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { above } Q _ { 3 } \\ & \text { or more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { below } Q _ { 1 } \end{aligned}$$
  2. On the graph paper opposite draw a box plot to represent the data from Penville. Show clearly any outliers.
  3. State the skewness of each distribution. Justify your answers. \includegraphics[max width=\textwidth, alt={}, center]{8270bcae-494c-4248-8229-a72e9e84eab0-03_930_1237_1800_367}
Edexcel S1 2014 June Q6
11 marks Moderate -0.3
6. The times, in seconds, spent in a queue at a supermarket by 85 randomly selected customers, are summarised in the table below.
| Time (seconds) | Number of customers, \(f\) |
|---|---|
| 0-30 | 2 |
| 30-60 | 10 |
| 60-70 | 17 |
| 70-80 | 25 |
| 80-100 | 25 |
| 100-150 | 6 |
A histogram was drawn to represent these data. The \(30 - 60\) group was represented by a bar of width 1.5 cm and height 1 cm .
  1. Find the width and the height of the \(70 - 80\) group.
  2. Use linear interpolation to estimate the median of this distribution. Given that \(x\) denotes the midpoint of each group in the table and $$\sum f x = 6460 \quad \sum f x ^ { 2 } = 529400$$
  3. calculate an estimate for
    1. the mean,
    2. the standard deviation,
      for the above data. One measure of skewness is given by $$\text { coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  4. Evaluate this coefficient and comment on the skewness of these data.
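The coefficient of skewness defined in this question combines three earlier results: the mean, the standard deviation and the interpolated median. The sketch below strings them together using the summations given in the question. It assumes the median position is taken as n/2 = 42.5, which falls in the 70-80 class with 29 customers below 70; a different position convention would change the result slightly.

```python
from math import sqrt

# Totals given in the question.
n = 85
sum_fx = 6460
sum_fx2 = 529400

# Estimated mean and standard deviation from the grouped-data summations.
mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean ** 2)

# Median by linear interpolation inside the 70-80 class (frequency 25, width 10).
# Cumulative frequency below 70 is 2 + 10 + 17 = 29.
median = 70 + (n / 2 - 29) / 25 * 10

# Coefficient of skewness as defined in the question.
coefficient = 3 * (mean - median) / sd

print(round(mean, 2), round(sd, 2), round(median, 2), round(coefficient, 3))
```

A coefficient close to zero suggests very little skew, while the sign indicates the direction (positive when the mean exceeds the median).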
Edexcel S1 2015 June Q1
14 marks Easy -1.2
  1. Each of 60 students was asked to draw a \(20 ^ { \circ }\) angle without using a protractor. The size of each angle drawn was measured. The results are summarised in the box plot below. \includegraphics[max width=\textwidth, alt={}, center]{9626e3ce-35d6-41b5-a0bd-1185f38b9e36-02_371_1040_340_461}
    1. Find the range for these data.
    2. Find the interquartile range for these data.
    The students were then asked to draw a \(70 ^ { \circ }\) angle.
    The results are summarised in the table below.
    | Angle, \(a\), (degrees) | Number of students |
    |---|---|
    | \(55 \leqslant a < 60\) | 6 |
    | \(60 \leqslant a < 65\) | 15 |
    | \(65 \leqslant a < 70\) | 13 |
    | \(70 \leqslant a < 75\) | 11 |
    | \(75 \leqslant a < 80\) | 8 |
    | \(80 \leqslant a < 85\) | 7 |
  2. Use linear interpolation to estimate the size of the median angle drawn. Give your answer to 1 decimal place.
  3. Show that the lower quartile is \(63 ^ { \circ }\) For these data, the upper quartile is \(75 ^ { \circ }\), the minimum is \(55 ^ { \circ }\) and the maximum is \(84 ^ { \circ }\) An outlier is an observation that falls either more than \(1.5 \times\) (interquartile range) above the upper quartile or more than \(1.5 \times\) (interquartile range) below the lower quartile.
    1. Show that there are no outliers for these data.
    2. Draw a box plot for these data on the grid on page 3.
  4. State which angle the students were more accurate at drawing. Give reasons for your answer.
    (3) \includegraphics[max width=\textwidth, alt={}, center]{9626e3ce-35d6-41b5-a0bd-1185f38b9e36-03_378_1059_2067_447}
Edexcel S1 2017 June Q2
14 marks Moderate -0.8
2. An estate agent is studying the cost of office space in London. He takes a random sample of 90 offices and calculates the cost, \(\pounds x\) per square foot. His results are given in the table below.
| Cost (£\(x\)) | Frequency (f) | Midpoint (£\(y\)) |
|---|---|---|
| \(20 \leqslant x < 40\) | 12 | 30 |
| \(40 \leqslant x < 45\) | 13 | 42.5 |
| \(45 \leqslant x < 50\) | 25 | 47.5 |
| \(50 \leqslant x < 60\) | 32 | 55 |
| \(60 \leqslant x < 80\) | 8 | 70 |
A histogram is drawn for these data and the bar representing \(50 \leqslant x < 60\) is 2 cm wide and 8 cm high.
  1. Calculate the width and height of the bar representing \(20 \leqslant x < 40\)
  2. Use linear interpolation to estimate the median cost.
  3. Estimate the mean cost of office space for these data.
  4. Estimate the standard deviation for these data.
  5. Describe, giving a reason, the skewness. Rika suggests that the cost of office space in London can be modelled by a normal distribution with mean \(\pounds 50\) and standard deviation \(\pounds 10\)
  6. With reference to your answer to part (e), comment on Rika's suggestion.
  7. Use Rika's model to estimate the 80th percentile of the cost of office space in London.
Edexcel S1 2018 June Q2
12 marks Moderate -0.8
2. The following grouped frequency distribution summarises the number of minutes, to the nearest minute, that a random sample of 100 motorists were delayed by roadworks on a stretch of motorway one Monday.
| Delay (minutes) | Number of motorists (f) | Delay midpoint (x) |
|---|---|---|
| 3-6 | 38 | 4.5 |
| 7-8 | 25 | 7.5 |
| 9-10 | 18 | 9.5 |
| 11-15 | 12 | 13 |
| 16-20 | 7 | 18 |
(You may use \(\sum \mathrm { f } x ^ { 2 } = 8096.25\) ) A histogram has been drawn to represent these data. The bar representing a delay of (3-6) minutes has a width of 2 cm and a height of 9.5 cm .
  1. Calculate the width and the height of the bar representing a delay of (11-15) minutes.
  2. Use linear interpolation to estimate the median delay.
  3. Calculate an estimate of the mean delay.
  4. Calculate an estimate of the standard deviation of the delays. One coefficient of skewness is given by \(\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }\)
  5. Evaluate this coefficient for the above data, giving your answer to 2 significant figures. On the following Friday, the coefficient of skewness for the delays on this stretch of motorway was - 0.22
  6. State, giving a reason, how the delays on this stretch of motorway on Friday are different from the delays on Monday.
Edexcel S1 Q2
11 marks Easy -1.3
2. A botany student counted the number of daisies in each of 42 randomly chosen areas of 1 m by 1 m in a large field. The results are summarised in the following stem and leaf diagram.
Number of daisies (\(1 \mid 1\) means 11)
1 | 1 2 2 3 4 4 4       (7)
1 | 5 5 6 7 8 9 9       (7)
2 | 0 0 1 3 3 3 3 4     (8)
2 | 5 5 6 7 9 9 9       (7)
3 | 0 0 1 2 4 4         (6)
3 | 6 6 7 8 8           (5)
4 | 1 3                 (2)
  1. Write down the modal value of these data.
  2. Find the median and the quartiles of these data.
  3. On graph paper and showing your scale clearly, draw a box plot to represent these data.
  4. Comment on the skewness of this distribution. The student moved to another field and collected similar data from that field.
  5. Comment on how the student might summarise both sets of raw data before drawing box plots.
    (1 mark)
Edexcel S1 Q3
11 marks Standard +0.3
3. Data relating to the lifetimes (to the nearest hour) of a random sample of 200 light bulbs from the production line of a manufacturer were summarised in a group frequency table. The mid-point of each group in the table was represented by \(x\) and the corresponding frequency for that group by \(f\). The data were then coded using \(y = \frac { ( x - 755.0 ) } { 2.5 }\) and summarised as follows: $$\Sigma f y = - 467 , \Sigma f y ^ { 2 } = 9179 .$$
  1. Calculate estimates of the mean and the standard deviation of the lifetimes of this sample of bulbs.
    (9 marks)
    An estimate of the interquartile range for these data was 27.7 hours.
  2. Explain, giving a reason, whether you would recommend the manufacturer to use the interquartile range or the standard deviation to represent the spread of lifetimes of the bulbs from this production line.
    (2 marks)
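Undoing the coding \(y = \frac{x - 755.0}{2.5}\) is a two-step calculation: find the mean and standard deviation of the coded values, then reverse the transformation. A minimal sketch using the summations from the question (variable names are illustrative only):

```python
from math import sqrt

# Totals given in the question for the coded values y = (x - 755.0) / 2.5.
n = 200
sum_fy = -467
sum_fy2 = 9179

# Mean and standard deviation of the coded data.
mean_y = sum_fy / n
sd_y = sqrt(sum_fy2 / n - mean_y ** 2)

# Decode: x = 755.0 + 2.5 * y.
# The shift of 755.0 affects only the mean; the factor 2.5 scales both.
mean_x = 755.0 + 2.5 * mean_y
sd_x = 2.5 * sd_y

print(round(mean_x, 2), round(sd_x, 2))
```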
Edexcel S1 2003 November Q6
16 marks Moderate -0.8
6. A travel agent sells holidays from his shop. The price, in \(\pounds\), of 15 holidays sold on a particular day are shown below.
299   1050   2315   999   485
350   169   1015   650   830
99   2100   689   550   475
For these data, find
  1. the mean and the standard deviation,
  2. the median and the inter-quartile range. An outlier is an observation that falls either more than \(1.5 \times\) (inter-quartile range) above the upper quartile or more than \(1.5 \times\) (inter-quartile range) below the lower quartile.
  3. Determine if any of the prices are outliers. The travel agent also sells holidays from a website on the Internet. On the same day, he recorded the price, \(\pounds x\), of each of 20 holidays sold on the website. The cheapest holiday sold was \(\pounds 98\), the most expensive was \(\pounds 2400\) and the quartiles of these data were \(\pounds 305 , \pounds 1379\) and \(\pounds 1805\). There were no outliers.
  4. On graph paper, and using the same scale, draw box plots for the holidays sold in the shop and the holidays sold on the website.
  5. Compare and contrast sales from the shop and sales from the website.
Edexcel S1 2004 November Q1
14 marks Moderate -0.8
  1. As part of their job, taxi drivers record the number of miles they travel each day. A random sample of the mileages recorded by taxi drivers Keith and Asif are summarised in the back-to-back stem and leaf diagram below.
TotalsAsifTotals
(9)87432110184457(4)
(11)9865433111957899(5)
(6)87422020022448(6)
(6)943100212356679(7)
(4)6411221124558(7)
(2)202311346678(8)
(2)71242489(4)
(1)9254(1)
(2)9326(0)
Key: 0 | 18 | 4 means 180 for Keith and 184 for Asif
The quartiles for these two distributions are summarised in the table below.
| | Keith | Asif |
|---|---|---|
| Lower quartile | 191 | \(a\) |
| Median | \(b\) | 218 |
| Upper quartile | 221 | \(c\) |
  1. Find the values of \(a , b\) and \(c\). Outliers are values that lie outside the limits $$Q _ { 1 } - 1.5 \left( Q _ { 3 } - Q _ { 1 } \right) \text { and } Q _ { 3 } + 1.5 \left( Q _ { 3 } - Q _ { 1 } \right) .$$
  2. On graph paper, and showing your scale clearly, draw a box plot to represent Keith's data.
  3. Comment on the skewness of the two distributions.
Edexcel S3 2013 June Q4
14 marks Standard +0.3
4. Customers at a post office are timed to see how long they wait until being served at the counter. A random sample of 50 customers is chosen and their waiting times, \(x\) minutes, are summarised in Table 1. \begin{table}[h]
| Waiting time in minutes \(( x )\) | Frequency |
|---|---|
| \(0 - 3\) | 8 |
| \(3 - 5\) | 12 |
| \(5 - 6\) | 13 |
| \(6 - 8\) | 9 |
| \(8 - 12\) | 8 |
\captionsetup{labelformat=empty} \caption{Table 1}
\end{table}
  1. Show that an estimate of \(\bar { x } = 5.49\) and an estimate of \(s _ { x } ^ { 2 } = 6.88\) The post office manager believes that the customers' waiting times can be modelled by a normal distribution.
    Assuming the data is normally distributed, she calculates the expected frequencies for these data and some of these frequencies are shown in Table 2. \begin{table}[h]
    | Waiting Time | \(x < 3\) | \(3 - 5\) | \(5 - 6\) | \(6 - 8\) | \(x > 8\) |
    |---|---|---|---|---|---|
    | Expected Frequency | 8.56 | 12.73 | 7.56 | \(a\) | \(b\) |
    \captionsetup{labelformat=empty} \caption{Table 2}
    \end{table}
  2. Find the value of \(a\) and the value of \(b\).
  3. Test, at the \(5 \%\) level of significance, the manager's belief. State your hypotheses clearly.
AQA S1 2007 June Q4
12 marks Moderate -0.8
4 A library allows each member to have up to 15 books on loan at any one time. The table shows the numbers of books currently on loan to a random sample of 95 members of the library.
| Number of books on loan | 0 | 1 | 2 | 3 | 4 | \(5 - 9\) | \(10 - 14\) | 15 |
|---|---|---|---|---|---|---|---|---|
| Number of members | 4 | 13 | 24 | 17 | 15 | 11 | 5 | 6 |
  1. For these data:
    1. state values for the mode and range;
    2. determine values for the median and interquartile range;
    3. calculate estimates of the mean and standard deviation.
  2. Making reference to your answers to part (a), give a reason for preferring:
    1. the median and interquartile range to the mean and standard deviation for summarising the given data;
    2. the mean and standard deviation to the mode and range for summarising the given data.
      (1 mark)