OCR S1 — Question 9 3 marks

Exam Board: OCR
Module: S1 (Statistics 1)
Marks: 3
Topic: Binomial Distribution
Type: Finding binomial parameters from properties

9 A variable \(X\) has the distribution \(\mathrm { B } ( 11 , p )\).
  1. Given that \(p = \frac { 3 } { 4 }\), find \(\mathrm { P } ( X = 5 )\).
  2. Given that \(\mathrm { P } ( X = 0 ) = 0.05\), find \(p\).
  3. Given that \(\operatorname { Var } ( X ) = 1.76\), find the two possible values of \(p\).

1 The table shows the probability distribution for a random variable X.
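For the question on \(\mathrm { B } ( 11 , p )\) above, the three parts can be checked numerically. The following is a minimal sketch, not the mark scheme's method. It assumes the standard results \(\mathrm { P } ( X = k ) = \binom { n } { k } p ^ { k } ( 1 - p ) ^ { n - k }\) and \(\operatorname { Var } ( X ) = n p ( 1 - p )\), and all function and variable names are hypothetical.

```python
from math import comb, sqrt

N = 11  # number of trials in B(11, p)


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Part 1: P(X = 5) when p = 3/4.
prob_x_equals_5 = binomial_pmf(5, N, 3 / 4)

# Part 2: P(X = 0) = (1 - p)^11 = 0.05, so p = 1 - 0.05^(1/11).
p_from_zero = 1 - 0.05 ** (1 / N)

# Part 3: Var(X) = 11 p (1 - p) = 1.76, i.e. p^2 - p + 1.76/11 = 0.
c = 1.76 / N
disc = sqrt(1 - 4 * c)
p_low, p_high = (1 - disc) / 2, (1 + disc) / 2

print(round(prob_x_equals_5, 4), round(p_from_zero, 3), round(p_low, 3), round(p_high, 3))
```

Under these assumptions the sketch gives roughly 0.0268 for part 1, about 0.238 for part 2, and the pair 0.2 and 0.8 for part 3.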
  4. Calculate \(\mathrm { E } ( Y )\) and \(\operatorname { Var } ( Y )\). Another random variable, \(Z\), is independent of \(Y\). The probability distribution for \(Z\) is shown in the table.
    $$\begin{array} { c | c c c } z & 1 & 2 & 3 \\ \hline \mathrm { P } ( Z = z ) & 0.1 & 0.25 & 0.65 \end{array}$$
    One value of \(Y\) and one value of \(Z\) are chosen at random. Find the probability that
  5. \(Y + Z = 3\),
  6. \(Y \times Z\) is even. 7
  7. Andrew plays 10 tennis matches. In each match he either wins or loses.
    (a) State, in this context, two conditions needed for a binomial distribution to arise.
    (b) Assuming these conditions are satisfied, define a variable in this context which has a binomial distribution.
  8. The random variable \(X\) has the distribution \(\mathrm { B } ( 21 , p )\), where \(0 < p < 1\). Given that \(\mathrm { P } ( X = 10 ) = \mathrm { P } ( X = 9 )\), find the value of \(p\). 8 The stem-and-leaf diagram shows the age in completed years of the members of a sports club.
    MaleFemale
    88761 | 66677889
    765533211334578899
    98443323347
    5214018
    900
    Key: 1 | 4 | 0 represents a male aged 41 and a female aged 40.
  9. Find the median and interquartile range for the males.
  10. The median and interquartile range for the females are 27 and 15 respectively. Make two comparisons between the ages of the males and the ages of the females.
  11. The mean age of the males is 30.7 and the mean age of the females is 27.5 , each correct to 1 decimal place. Give one advantage of using the median rather than the mean to compare the ages of the males with the ages of the females. A record was kept of the number of hours, \(X\), spent by each member at the club in a year. The results were summarised by $$n = 49 , \quad \Sigma ( x - 200 ) = 245 , \quad \Sigma ( x - 200 ) ^ { 2 } = 9849 .$$
  12. Calculate the mean and standard deviation of \(X\). 9 It is thought that the pH value of sand (a measure of the sand's acidity) may affect the extent to which a particular species of plant will grow in that sand. A botanist wished to determine whether there was any correlation between the pH value of the sand on certain sand dunes, and the amount of each of two plant species growing there. She chose random sections of equal area on each of eight sand dunes and measured the pH values. She then measured the area within each section that was covered by each of the two species. The results were as follows.
    $$\begin{array} { l | c c c c c c c c }
    \text { Dune } & A & B & C & D & E & F & G & H \\ \hline
    \text { pH value, } x & 8.5 & 8.5 & 9.5 & 8.5 & 6.5 & 7.5 & 8.5 & 9.0 \\ \hline
    \text { Area, } y \mathrm {~cm} ^ { 2 } \text { , covered by species } P & 150 & 150 & 575 & 330 & 45 & 15 & 340 & 330 \\
    \text { Area, } y \mathrm {~cm} ^ { 2 } \text { , covered by species } Q & 170 & 15 & 80 & 230 & 75 & 25 & 0 & 0
    \end{array}$$
    The results for species \(P\) can be summarised by $$n = 8 , \quad \Sigma x = 66.5 , \quad \Sigma x ^ { 2 } = 558.75 , \quad \Sigma y = 1935 , \quad \Sigma y ^ { 2 } = 711275 , \quad \Sigma x y = 17082.5 .$$
  13. Give a reason why it might be appropriate to calculate the equation of the regression line of \(y\) on \(x\) rather than \(x\) on \(y\) in this situation.
  14. Calculate the equation of the regression line of \(y\) on \(x\) for species \(P\), in the form \(y = a + b x\), giving the values of \(a\) and \(b\) correct to 3 significant figures.
  15. Estimate the value of \(y\) for species \(P\) on sand where the pH value is 7.0 . The values of the product moment correlation coefficient between \(x\) and \(y\) for species \(P\) and \(Q\) are \(r _ { P } = 0.828\) and \(r _ { Q } = 0.0302\).
  16. Describe the relationship between the area covered by species \(Q\) and the pH value.
  17. State, with a reason, whether the regression line of \(y\) on \(x\) for species \(P\) will provide a reliable estimate of the value of \(y\) when the pH value is
    (a) 8 ,
    (b) 4 .
  18. Assume that the equation of the regression line of \(y\) on \(x\) for species \(Q\) is also known. State, with a reason, whether this line will provide a reliable estimate of the value of \(y\) when the pH value is 8 . 1
  19. State the value of the product moment correlation coefficient for each of the following scatter diagrams.
    (a)
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-25_448_675_443_411}
    (b)
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-25_448_656_443_1206}
  20. Calculate the value of Spearman's rank correlation coefficient for the following data.
    $$\begin{array} { c | c c c c } x & 3.8 & 4.1 & 4.5 & 5.3 \\ \hline y & 1.4 & 0.8 & 0.7 & 1.2 \end{array}$$
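The Spearman calculation for the four data pairs above can be sketched as follows. This is an illustrative check only. It assumes the usual formula \(r _ { s } = 1 - \frac { 6 \Sigma d ^ { 2 } } { n ( n ^ { 2 } - 1 ) }\) with no tied ranks, and the helper names `ranks` and `spearman` are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's coefficient via 1 - 6*sum(d^2) / (n(n^2 - 1)); no ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n**2 - 1))


x = [3.8, 4.1, 4.5, 5.3]
y = [1.4, 0.8, 0.7, 1.2]
r_s = spearman(x, y)
print(round(r_s, 3))
```

Here the rank differences are -3, 0, 2 and 1, so \(\Sigma d ^ { 2 } = 14\) and the coefficient comes out at about -0.4.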
    2 A class consists of 7 students from Ashville and 8 from Bewton. A committee of 5 students is chosen at random from the class.
  21. Find the probability that 2 students from Ashville and 3 from Bewton are chosen.
  22. In fact 2 students from Ashville and 3 from Bewton are chosen. In order to watch a video, all 5 committee members sit in a row. In how many different orders can they sit if no two students from Bewton sit next to each other? 3
  23. A random variable \(X\) has the distribution \(\mathrm { B } ( 8,0.55 )\). Find
    (a) \(\mathrm { P } ( X < 7 )\),
    (b) \(\mathrm { P } ( X = 5 )\),
    (c) \(\mathrm { P } ( 3 \leqslant X < 6 )\).
  24. A random variable \(Y\) has the distribution \(\mathrm { B } \left( 10 , \frac { 5 } { 12 } \right)\). Find
    (a) \(\mathrm { P } ( Y = 2 )\),
    (b) \(\operatorname { Var } ( Y )\). 4 At a fairground stall, on each turn a player receives prize money with the following probabilities.
    $$\begin{array} { l | c c c } \text { Prize money } & \pounds 0.00 & \pounds 0.50 & \pounds 5.00 \\ \hline \text { Probability } & \frac { 17 } { 20 } & \frac { 1 } { 10 } & \frac { 1 } { 20 } \end{array}$$
  25. Find the probability that a player who has two turns will receive a total of \(\pounds 5.50\) in prize money.
  26. The stall-holder wishes to make a profit of 20 p per turn on average. Calculate the amount the stall-holder should charge for each turn. 5
  27. A bag contains 12 red discs and 10 black discs. Two discs are removed at random, without replacement. Find the probability that both discs are red.
  28. Another bag contains 7 green discs and 8 blue discs. Three discs are removed at random, without replacement. Find the probability that exactly two of these discs are green.
  29. A third bag contains 45 discs, each of which is either yellow or brown. Two discs are removed at random, without replacement. The probability that both discs are yellow is \(\frac { 1 } { 15 }\). Find the number of yellow discs which were in the bag at first. 6
  30. The numbers of males and females in Year 12 at a school are illustrated in the pie chart. The number of males in Year 12 is 128.
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{11316ea6-3999-4003-b77d-bee8b547c1da-27_517_522_402_854}
    \captionsetup{labelformat=empty}
    \caption{Year 12}
    \end{figure}
    (a) Find the number of females in Year 12.
    (b) On a corresponding pie chart for Year 13, the angle of the sector representing males is \(150 ^ { \circ }\). Explain why this does not necessarily mean that the number of males in Year 13 is more than 128.
  31. All the Year 12 students took a General Studies examination. The results are illustrated in the box-and-whisker plots. Year 12 Females
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-27_90_661_1457_760}
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-27_109_1259_1612_612} Year 12 Males
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-27_94_789_1777_717}
    (a) One student said "The Year 12 pie chart shows that there are more females than males, but the box-and-whisker plots show that there are more males than females." Comment on this statement.
    (b) Give two comparisons between the overall performance of the females and the males in the General Studies examination.
    (c) Give one advantage and one disadvantage of using box-and-whisker plots rather than histograms to display the results.
  32. The mean mark for 102 of the male students was 51. The mean mark for the remaining 26 male students was 59 . Calculate the mean mark for all 128 male students. 7 Once each year, Paula enters a lottery for a place in an annual marathon. Each time she enters the lottery, the probability of her obtaining a place is 0.3 . Find the probability that
  33. the first time she obtains a place is on her 4th attempt,
  34. she does not obtain a place on any of her first 6 attempts,
  35. she needs fewer than 10 attempts to obtain a place,
  36. she obtains a place exactly twice in her first 5 attempts. 8 A city council attempted to reduce traffic congestion by introducing a congestion charge. The charge was set at \(\pounds 4.00\) for the first year and was then increased by \(\pounds 2.00\) each year. For each of the first eight years, the council recorded the average number of vehicles entering the city centre per day. The results are shown in the table.
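    $$\begin{array} { l | c c c c c c c c }
    \text { Charge, } \pounds x & 4 & 6 & 8 & 10 & 12 & 14 & 16 & 18 \\ \hline
    \text { Average number of vehicles per day, } y \text { million } & 2.4 & 2.5 & 2.2 & 2.3 & 2.0 & 1.8 & 1.7 & 1.5
    \end{array}$$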
    $$\left[ n = 8 , \Sigma x = 88 , \Sigma y = 16.4 , \Sigma x ^ { 2 } = 1136 , \Sigma y ^ { 2 } = 34.52 , \Sigma x y = 168.6 . \right]$$
  37. Calculate the product moment correlation coefficient for these data.
  38. Explain why \(x\) is the independent variable.
  39. Calculate the equation of the regression line of \(y\) on \(x\).
  40. (a) Use your equation to estimate the average number of vehicles which will enter the city centre per day when the congestion charge is raised to \(\pounds 20.00\).
    (b) Comment on the reliability of your estimate.
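The correlation and regression parts of the congestion-charge question work from the bracketed summary totals. The sketch below is one possible way to check them, assuming the standard definitions \(S _ { x x } = \Sigma x ^ { 2 } - ( \Sigma x ) ^ { 2 } / n\), \(S _ { y y } = \Sigma y ^ { 2 } - ( \Sigma y ) ^ { 2 } / n\) and \(S _ { x y } = \Sigma x y - \Sigma x \Sigma y / n\). The function name `summary_regression` is hypothetical.

```python
from math import sqrt


def summary_regression(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return (r, a, b) for the line y = a + b x from summary totals."""
    s_xx = sum_x2 - sum_x**2 / n
    s_yy = sum_y2 - sum_y**2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    r = s_xy / sqrt(s_xx * s_yy)   # product moment correlation coefficient
    b = s_xy / s_xx                # gradient of y on x
    a = sum_y / n - b * sum_x / n  # intercept: the line passes through the means
    return r, a, b


# Summary totals quoted in the congestion-charge question.
r, a, b = summary_regression(8, 88, 16.4, 1136, 34.52, 168.6)
estimate_at_20 = a + b * 20
print(round(r, 3), round(a, 3), round(b, 4), round(estimate_at_20, 2))
```

With these totals the sketch gives \(r\) close to -0.960, a line of roughly \(y = 2.82 - 0.0702 x\), and an estimate of about 1.42 million vehicles at a charge of \(\pounds 20.00\). The same function can be applied to the other summary-statistics questions in this collection.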
  41. The council wishes to estimate the congestion charge required to reduce the average number of vehicles entering the city per day to 1.0 million. Assuming that a reliable estimate can be made by extrapolation, state whether they should use the regression line of \(y\) on \(x\) or the regression line of \(x\) on \(y\). Give a reason for your answer. 1 Each time a certain triangular spinner is spun, it lands on one of the numbers 0,1 and 2 with probabilities as shown in the table.
    $$\begin{array} { c | c } \text { Number } & \text { Probability } \\ \hline 0 & 0.7 \\ 1 & 0.2 \\ 2 & 0.1 \end{array}$$
    The spinner is spun twice. The total of the two numbers on which it lands is denoted by \(X\).
  42. Show that \(\mathrm { P } ( X = 2 ) = 0.18\). The probability distribution of \(X\) is given in the table.
    $$\begin{array} { c | c c c c c } x & 0 & 1 & 2 & 3 & 4 \\ \hline \mathrm { P } ( X = x ) & 0.49 & 0.28 & 0.18 & 0.04 & 0.01 \end{array}$$
  43. Calculate \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\). 2 The table shows the age, \(x\) years, and the mean diameter, \(y \mathrm {~cm}\), of the trunk of each of seven randomly selected trees of a certain species.
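    $$\begin{array} { l | c c c c c c c }
    \text { Age } ( x \text { years } ) & 11 & 12 & 20 & 28 & 35 & 45 & 51 \\ \hline
    \text { Mean trunk diameter } ( y \mathrm {~cm} ) & 12.2 & 16.0 & 26.4 & 39.2 & 39.6 & 51.3 & 60.6
    \end{array}$$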
    $$\left[ n = 7 , \Sigma x = 202 , \Sigma y = 245.3 , \Sigma x ^ { 2 } = 7300 , \Sigma y ^ { 2 } = 10510.65 , \Sigma x y = 8736.9 . \right]$$
  44. (a) Use an appropriate formula to show that the gradient of the regression line of \(y\) on \(x\) is 1.13 , correct to 2 decimal places.
    (b) Find the equation of the regression line of \(y\) on \(x\).
  45. Use your equation to estimate the mean trunk diameter of a tree of this species with age
    (a) 30 years,
    (b) 100 years. It is given that the value of the product moment correlation coefficient for the data in the table is 0.988 , correct to 3 decimal places.
  46. Comment on the reliability of each of your two estimates. 3 Erika is a birdwatcher. The probability that she will see a woodpecker on any given day is \(\frac { 1 } { 8 }\). It is assumed that this probability is unaffected by whether she has seen a woodpecker on any other day.
  47. Calculate the probability that Erika first sees a woodpecker
    (a) on the third day,
    (b) after the third day.
  48. Find the expectation of the number of days up to and including the first day on which she sees a woodpecker.
  49. Calculate the probability that she sees a woodpecker on exactly 2 days in the first 15 days. 4 Three tutors each marked the coursework of five students. The marks are given in the table.
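    $$\begin{array} { l | c c c c c }
    \text { Student } & A & B & C & D & E \\ \hline
    \text { Tutor 1 } & 73 & 67 & 60 & 48 & 39 \\
    \text { Tutor 2 } & 62 & 50 & 61 & 76 & 65 \\
    \text { Tutor 3 } & 42 & 50 & 63 & 54 & 71
    \end{array}$$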
  50. Calculate Spearman's rank correlation coefficient, \(r _ { \mathrm { s } }\), between the marks for tutors 1 and 2 .
  51. The values of \(r _ { \mathrm { s } }\) for the other pairs of tutors are as follows. $$\begin{array} { c c } \text { Tutors } 1 \text { and 3: } & r _ { \mathrm { s } } = - 0.9 \\
    \text { Tutors } 2 \text { and 3: } & r _ { \mathrm { s } } = 0.3 \end{array}$$ State which two tutors differ most widely in their judgements. Give your reason.

5 The stem-and-leaf diagram shows the masses, in grams, of 23 plums, measured correct to the nearest gram.
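    $$\begin{array} { r | l }
    5 & 5 \; 6 \; 7 \; 8 \; 8 \; 9 \\
    6 & 1 \; 2 \; 3 \; 5 \; 6 \; 8 \; 9 \\
    7 & 0 \; 0 \; 2 \; 4 \; 5 \; 6 \; 7 \; 8 \\
    8 & 0 \\
    9 & 7
    \end{array}$$
    Key: \(6 \mid 2\) means 62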
  52. Find the median and interquartile range of these masses.
  53. State one advantage of using the interquartile range rather than the standard deviation as a measure of the variation in these masses.
  54. State one advantage and one disadvantage of using a stem-and-leaf diagram rather than a box-and-whisker plot to represent data.
  55. James wished to calculate the mean and standard deviation of the given data. He first subtracted 5 from each of the digits to the left of the line in the stem-and-leaf diagram, giving the following.
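    $$\begin{array} { r | l }
    0 & 5 \; 6 \; 7 \; 8 \; 8 \; 9 \\
    1 & 1 \; 2 \; 3 \; 5 \; 6 \; 8 \; 9 \\
    2 & 0 \; 0 \; 2 \; 4 \; 5 \; 6 \; 7 \; 8 \\
    3 & 0 \\
    4 & 7
    \end{array}$$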
    The mean and standard deviation of the data in this diagram are 18.1 and 9.7 respectively, correct to 1 decimal place. Write down the mean and standard deviation of the data in the original diagram. 6 A test consists of 4 algebra questions, A, B, C and D, and 4 geometry questions, G, H, I and J.
    The examiner plans to arrange all 8 questions in a random order, regardless of topic.
  56. (a) How many different arrangements are possible?
    (b) Find the probability that no two Algebra questions are next to each other and no two Geometry questions are next to each other. Later, the examiner decides that the questions should be arranged in two sections, Algebra followed by Geometry, with the questions in each section arranged in a random order.
  57. (a) How many different arrangements are possible?
    (b) Find the probability that questions A and H are next to each other.
    (c) Find the probability that questions B and J are separated by more than four other questions. 7 At a factory that makes crockery the quality control department has found that \(10 \%\) of plates have minor faults. These are classed as 'seconds'. Plates are stored in batches of 12. The number of seconds in a batch is denoted by \(X\).
  58. State an appropriate distribution with which to model \(X\). Give the value(s) of any parameter(s) and state any assumptions required for the model to be valid. Assume now that your model is valid.
  59. Find
    (a) \(\mathrm { P } ( X = 3 )\),
    (b) \(\mathrm { P } ( X \geqslant 1 )\).
  60. A random sample of 4 batches is selected. Find the probability that the number of these batches that contain at least 1 second is fewer than 3 . 8 A game uses an unbiased die with faces numbered 1 to 6 . The die is thrown once. If it shows 4 or 5 or 6 then this number is the final score. If it shows 1 or 2 or 3 then the die is thrown again and the final score is the sum of the numbers shown on the two throws.
  61. Find the probability that the final score is 4 .
  62. Given that the die is thrown only once, find the probability that the final score is 4 .
  63. Given that the die is thrown twice, find the probability that the final score is 4 .
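The three probabilities in the die game above can be checked by listing every outcome. This is a small illustrative sketch using exact fractions. It assumes a fair die and independent throws, as the question states, and the names `outcomes` and `prob` are hypothetical.

```python
from fractions import Fraction

# Each entry is (final_score, number_of_throws, probability).
outcomes = []
for first in range(1, 7):
    if first >= 4:
        # A 4, 5 or 6 on the first throw is the final score.
        outcomes.append((first, 1, Fraction(1, 6)))
    else:
        # A 1, 2 or 3 means a second throw; the score is the sum.
        for second in range(1, 7):
            outcomes.append((first + second, 2, Fraction(1, 36)))


def prob(event):
    """Total probability of the outcomes satisfying `event`."""
    return sum(p for score, throws, p in outcomes if event(score, throws))


p_score_4 = prob(lambda s, t: s == 4)
p_4_given_once = prob(lambda s, t: s == 4 and t == 1) / prob(lambda s, t: t == 1)
p_4_given_twice = prob(lambda s, t: s == 4 and t == 2) / prob(lambda s, t: t == 2)
print(p_score_4, p_4_given_once, p_4_given_twice)
```

The two conditional values use \(\mathrm { P } ( A \mid B ) = \mathrm { P } ( A \cap B ) / \mathrm { P } ( B )\), and the enumeration gives \(\frac { 1 } { 4 }\), \(\frac { 1 } { 3 }\) and \(\frac { 1 } { 6 }\) respectively.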
1 \(20 \%\) of packets of a certain kind of cereal contain a free gift. Jane buys one packet a week for 8 weeks. The number of free gifts that Jane receives is denoted by \(X\). Assuming that Jane's 8 packets can be regarded as a random sample, find
  64. \(\mathrm { P } ( X = 3 )\),
  65. \(\mathrm { P } ( X \geqslant 3 )\),
  66. \(\mathrm { E } ( X )\). 2 Two judges placed 7 dancers in rank order. Both judges placed dancers \(A\) and \(B\) in the first two places, but in opposite orders. The judges agreed about the ranks for all the other 5 dancers. Calculate the value of Spearman's rank correlation coefficient. 3 In an agricultural experiment, the relationship between the amount of water supplied, \(x\) units, and the yield, \(y\) units, was investigated. Six values of \(x\) were chosen and for each value of \(x\) the corresponding value of \(y\) was measured. The results are shown in the table.
    $$\begin{array} { c | c c c c c c } x & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline y & 3 & 6 & 8 & 8 & 11 & 10 \end{array}$$
    These results, together with the regression line of \(y\) on \(x\), are plotted on the graph.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-33_743_846_1366_648}
  67. Give a reason why the regression line of \(x\) on \(y\) is not suitable in this context.
  68. Explain the significance, for the regression line of \(y\) on \(x\), of the distances shown by the vertical dotted lines in the diagram.
  69. Calculate the value of the product moment correlation coefficient, \(r\).
  70. Comment on your value of \(r\) in relation to the diagram.

\section*{June 2009}
4 \(30 \%\) of people own a Talk-2 phone. People are selected at random, one at a time, and asked whether they own a Talk-2 phone. The number of people questioned, up to and including the first person who owns a Talk-2 phone, is denoted by \(X\). Find
  71. \(\mathrm { P } ( X = 4 )\),
  72. \(\mathrm { P } ( X > 4 )\),
  73. \(\mathrm { P } ( X < 6 )\). 5 The diameters of 100 pebbles were measured. The measurements rounded to the nearest millimetre, \(x\), are summarised in the table.
    $$\begin{array} { l | c c c c } x & 10 \leqslant x \leqslant 19 & 20 \leqslant x \leqslant 24 & 25 \leqslant x \leqslant 29 & 30 \leqslant x \leqslant 49 \\ \hline \text { Number of stones } & 25 & 22 & 29 & 24 \end{array}$$
    These data are to be presented on a statistical diagram.
  74. For a histogram, find the frequency density of the \(10 \leqslant x \leqslant 19\) class.
  75. For a cumulative frequency graph, state the coordinates of the first two points that should be plotted.
  76. Why is it not possible to draw an exact box-and-whisker plot to illustrate the data? 6 Last year Eleanor played 11 rounds of golf. Her scores were as follows: $$79 , \quad 71 , \quad 80 , \quad 67 , \quad 67 , \quad 74 , \quad 66 , \quad 65 , \quad 71 , \quad 66 , \quad 64 .$$
  77. Calculate the mean of these scores and show that the standard deviation is 5.31 , correct to 3 significant figures.
  78. Find the median and interquartile range of the scores. This year, Eleanor also played 11 rounds of golf. The standard deviation of her scores was 4.23, correct to 3 significant figures, and the interquartile range was the same as last year.
  79. Give a possible reason why the standard deviation of her scores was lower than last year although her interquartile range was unchanged. In golf, smaller scores mean a better standard of play than larger scores. Ken suggests that since the standard deviation was smaller this year, Eleanor's overall standard has improved.
  80. Explain why Ken is wrong.
  81. State what the smaller standard deviation does show about Eleanor's play. 7 Three letters are selected at random from the 8 letters of the word COMPUTER, without regard to order.
  82. Find the number of possible selections of 3 letters.
  83. Find the probability that the letter P is included in the selection. Three letters are now selected at random, one at a time, from the 8 letters of the word COMPUTER, and are placed in order in a line.
  84. Find the probability that the 3 letters form the word TOP. 8 A game at a charity event uses a bag containing 19 white counters and 1 red counter. To play the game once a player takes counters at random from the bag, one at a time, without replacement. If the red counter is taken, the player wins a prize and the game ends. If not, the game ends when 3 white counters have been taken. Niko plays the game once.
  85. (a) Copy and complete the tree diagram showing the probabilities for Niko. \section*{First counter} \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-35_301_433_1228_529}
    (b) Find the probability that Niko will win a prize.
  86. The number of counters that Niko takes is denoted by \(X\).
    (a) Find \(\mathrm { P } ( X = 3 )\).
    (b) Find \(\mathrm { E } ( X )\). 9 Repeated independent trials of a certain experiment are carried out. On each trial the probability of success is 0.12 .
  87. Find the smallest value of \(n\) such that the probability of at least one success in \(n\) trials is more than 0.95.
  88. Find the probability that the 3rd success occurs on the 7th trial.

    OCR is committed to seeking permission to reproduce all third-party content that it uses in its assessment materials. OCR has attempted to identify and contact all copyright holders whose work is used in this paper. To avoid the issue of disclosure of answer-related information to candidates, all copyright acknowledgements are reproduced in the OCR Copyright Acknowledgements Booklet. This is produced for each series of examinations, is given to all schools that receive assessment material and is freely available to download from our public website (\href{http://www.ocr.org.uk}{www.ocr.org.uk}) after the live examination series.
    If OCR has unwittingly failed to correctly acknowledge or clear any third-party content in this assessment material, OCR will be happy to correct its mistake at the earliest possible opportunity. For queries or further information please contact the Copyright Team, First Floor, 9 Hills Road, Cambridge CB2 1PB.
    OCR is part of the Cambridge Assessment Group; Cambridge Assessment is the brand name of University of Cambridge Local Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.

\section*{Jan 2010}
1 Andy makes repeated attempts to thread a needle. The number of attempts up to and including his first success is denoted by \(X\).
  89. State two conditions necessary for \(X\) to have a geometric distribution.
  90. Assuming that \(X\) has the distribution \(\operatorname { Geo } ( 0.3 )\), find
    (a) \(\mathrm { P } ( X = 5 )\),
    (b) \(\mathrm { P } ( X > 5 )\).
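For the \(\operatorname { Geo } ( 0.3 )\) parts above, a short numerical check might look like the following. It is a sketch only, assuming the standard geometric results \(\mathrm { P } ( X = k ) = ( 1 - p ) ^ { k - 1 } p\) and \(\mathrm { P } ( X > k ) = ( 1 - p ) ^ { k }\). The function names are hypothetical.

```python
def geometric_pmf(k, p):
    """P(X = k): first success on trial k, for X ~ Geo(p)."""
    return (1 - p) ** (k - 1) * p


def geometric_tail(k, p):
    """P(X > k): the first k trials all fail."""
    return (1 - p) ** k


p = 0.3
p_equals_5 = geometric_pmf(5, p)
p_greater_5 = geometric_tail(5, p)
print(round(p_equals_5, 4), round(p_greater_5, 4))
```

These evaluate to \(0.7 ^ { 4 } \times 0.3 \approx 0.0720\) and \(0.7 ^ { 5 } \approx 0.1681\). The same two helpers also cover the other first-success questions in this collection that use a fixed success probability.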
  91. Suggest a reason why one of the conditions you have given in part (i) might not be satisfied in this context.

2 40 people were asked to guess the length of a certain road. Each person gave their guess, \(l \mathrm {~km}\), correct to the nearest kilometre. The results are summarised below.
    $$\begin{array} { l | c c c c } l & 10 - 12 & 13 - 15 & 16 - 20 & 21 - 30 \\ \hline \text { Frequency } & 1 & 13 & 20 & 6 \end{array}$$
  92. (a) Use appropriate formulae to calculate estimates of the mean and standard deviation of \(l\).
    (b) Explain why your answers are only estimates.
  93. A histogram is to be drawn to illustrate the data. Calculate the frequency density of the block for the 16-20 class.
  94. Explain which class contains the median value of \(l\).
  95. Later, the person whose guess was between 10 km and 12 km changed his guess to between 13 km and 15 km . Without calculation state whether the following will increase, decrease or remain the same:
    (a) the mean of \(l\),
    (b) the standard deviation of \(l\). 3 The heights, \(h \mathrm {~m}\), and weights, \(m \mathrm {~kg}\), of five men were measured. The results are plotted on the diagram.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-37_736_956_386_593} The results are summarised as follows. $$n = 5 \quad \Sigma h = 9.02 \quad \Sigma m = 377.7 \quad \Sigma h ^ { 2 } = 16.382 \quad \Sigma m ^ { 2 } = 28558.67 \quad \Sigma h m = 681.612$$
  96. Use the summarised data to calculate the value of the product moment correlation coefficient, \(r\).
  97. Comment on your value of \(r\) in relation to the diagram.
  98. It was decided to re-calculate the value of \(r\) after converting the heights to feet and the masses to pounds. State what effect, if any, this will have on the value of \(r\).
  99. One of the men had height 1.63 m and mass 78.4 kg . The data for this man were removed and the value of \(r\) was re-calculated using the original data for the remaining four men. State in general terms what effect, if any, this will have on the value of \(r\). 4 A certain four-sided die is biased. The score, \(X\), on each throw is a random variable with probability distribution as shown in the table. Throws of the die are independent.
    $$\begin{array} { c | c c c c } x & 0 & 1 & 2 & 3 \\ \hline \mathrm { P } ( X = x ) & \frac { 1 } { 2 } & \frac { 1 } { 4 } & \frac { 1 } { 8 } & \frac { 1 } { 8 } \end{array}$$
  100. Calculate \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\). The die is thrown 10 times.
  101. Find the probability that there are not more than 4 throws on which the score is 1 .
  102. Find the probability that there are exactly 4 throws on which the score is 2 . 5 A washing-up bowl contains 6 spoons, 5 forks and 3 knives. Three of these 14 items are removed at random, without replacement. Find the probability that
  103. all three items are of different kinds,
  104. all three items are of the same kind.

6 (a) A student calculated the values of the product moment correlation coefficient, \(r\), and Spearman's rank correlation coefficient, \(r _ { s }\), for two sets of bivariate data, \(A\) and \(B\). His results are given below. $$\begin{array} { l l } A : & r = 0.9 \quad \text { and } \quad r _ { s } = 1 \\
    B : & r = 1 \quad \text { and } \quad r _ { s } = 0.9 \end{array}$$ With the aid of a diagram where appropriate, explain why the student's results for \(A\) could both be correct but his results for \(B\) cannot both be correct.
    (b) An old research paper has been partially destroyed. The surviving part of the paper contains the following incomplete information about some bivariate data from an experiment.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-38_344_1204_1114_507} Calculate the missing constant at the end of the equation of the second regression line. 7 The table shows the numbers of male and female members of a vintage car club who own either a Jaguar or a Bentley. No member owns both makes of car.
    $$\begin{array} { l | c c } & \text { Male } & \text { Female } \\ \hline \text { Jaguar } & 25 & 15 \\ \text { Bentley } & 12 & 8 \end{array}$$
    One member is chosen at random from these 60 members.
  105. Given that this member is male, find the probability that he owns a Jaguar. Now two members are chosen at random from the 60 members. They are chosen one at a time, without replacement.
  106. Given that the first one of these members is female, find the probability that both own Jaguars. 8 The five letters of the word NEVER are arranged in random order in a straight line.
  107. How many different orders of the letters are possible?
  108. In how many of the possible orders are the two Es next to each other?
  109. Find the probability that the first two letters in the order include exactly one letter E.
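The three counting parts for the word NEVER can be verified by brute force. The sketch below is illustrative only. It assumes every distinct arrangement is equally likely, which holds because each distinct arrangement corresponds to the same number of orderings of the physical letters, and its variable names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# Distinct orders of the letters; the set removes duplicates caused by the repeated E.
orders = set(permutations("NEVER"))

total_orders = len(orders)
# Orders in which the two Es are adjacent.
es_adjacent = sum(1 for o in orders if "EE" in "".join(o))
# Orders whose first two letters contain exactly one E.
one_e_in_first_two = sum(1 for o in orders if o[:2].count("E") == 1)
p_one_e = Fraction(one_e_in_first_two, total_orders)
print(total_orders, es_adjacent, p_one_e)
```

The counts agree with the direct arguments: \(\frac { 5 ! } { 2 ! } = 60\) orders in total, \(4 ! = 24\) with the two Es together, and a probability of \(\frac { 3 } { 5 }\) for exactly one E among the first two letters.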
9 \(R\) and \(S\) are independent random variables each having the distribution \(\operatorname { Geo } ( p )\).
  110. Find \(\mathrm { P } ( R = 1\) and \(S = 1 )\) in terms of \(p\).
  111. Show that \(\mathrm { P } ( R = 3\) and \(S = 3 ) = p ^ { 2 } q ^ { 4 }\), where \(q = 1 - p\).
  112. Use the formula for the sum to infinity of a geometric series to show that $$\mathrm { P } ( R = S ) = \frac { p } { 2 - p }$$ \section*{June 2010} 1 The marks of some students in a French examination were summarised in a grouped frequency distribution and a cumulative frequency diagram was drawn, as shown below.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-40_851_1413_351_367}
  113. Estimate how many students took the examination.
  114. How can you tell that no student scored more than 55 marks?
  115. Find the greatest possible range of the marks.
  116. The minimum mark for Grade C was 27 . The number of students who gained exactly Grade C was the same as the number of students who gained a grade lower than C. Estimate the maximum mark for Grade C.
  117. In a German examination the marks of the same students had an interquartile range of 16 marks. What does this result indicate about the performance of the students in the German examination as compared with the French examination? 2 Three skaters, \(A , B\) and \(C\), are placed in rank order by four judges. Judge \(P\) ranks skater \(A\) in 1st place, skater \(B\) in 2nd place and skater \(C\) in 3rd place.
  118. Without carrying out any calculation, state the value of Spearman's rank correlation coefficient for the following ranks. Give a reason for your answer.
    $$\begin{array} { l | c c c } \text { Skater } & A & B & C \\ \hline \text { Judge } P & 1 & 2 & 3 \\ \text { Judge } Q & 3 & 2 & 1 \end{array}$$
  119. Calculate the value of Spearman's rank correlation coefficient for the following ranks.
    $$\begin{array} { l | c c c } \text { Skater } & A & B & C \\ \hline \text { Judge } P & 1 & 2 & 3 \\ \text { Judge } R & 3 & 1 & 2 \end{array}$$
  120. Judge \(S\) ranks the skaters at random. Find the probability that the value of Spearman's rank correlation coefficient between the ranks of judge \(P\) and judge \(S\) is 1 . 3
  121. Some values, ( \(x , y\) ), of a bivariate distribution are plotted on a scatter diagram and a regression line is to be drawn. Explain how to decide whether the regression line of \(y\) on \(x\) or the regression line of \(x\) on \(y\) is appropriate.
  122. In an experiment the temperature, \(x ^ { \circ } \mathrm { C }\), of a rod was gradually increased from \(0 ^ { \circ } \mathrm { C }\), and the extension, \(y \mathrm {~mm}\), was measured nine times at \(50 ^ { \circ } \mathrm { C }\) intervals. The results are summarised below. $$n = 9 \quad \Sigma x = 1800 \quad \Sigma y = 14.4 \quad \Sigma x ^ { 2 } = 510000 \quad \Sigma y ^ { 2 } = 32.6416 \quad \Sigma x y = 4080$$ (a) Show that the gradient of the regression line of \(y\) on \(x\) is 0.008 and find the equation of this line.
    (b) Use your equation to estimate the temperature when the extension is 2.5 mm .
    (c) Use your equation to estimate the extension for a temperature of \(- 50 ^ { \circ } \mathrm { C }\).
    (d) Comment on the meaning and the reliability of your estimate in part (c). 4
  123. The random variable \(W\) has the distribution \(\mathrm { B } \left( 10 , \frac { 1 } { 3 } \right)\). Find
    (a) \(\mathrm { P } ( W \leqslant 2 )\),
    (b) \(\mathrm { P } ( W = 2 )\).
  124. The random variable \(X\) has the distribution \(\mathrm { B } ( 15,0.22 )\).
    (a) Find \(\mathrm { P } ( X = 4 )\).
    (b) Find \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\). 5 Each of four cards has a number printed on it as shown.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-41_95_867_1603_598} Two of the cards are chosen at random, without replacement. The random variable \(X\) denotes the sum of the numbers on these two cards.
  125. Show that \(\mathrm { P } ( X = 6 ) = \frac { 1 } { 6 }\) and \(\mathrm { P } ( X = 4 ) = \frac { 1 } { 3 }\).
  126. Write down all the possible values of \(X\) and find the probability distribution of \(X\).
  127. Find \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).

6 There are 10 numbers in a list. The first 9 numbers have mean 6 and variance 2. The 10th number is 3. Find the mean and variance of all 10 numbers.

7 The menu below shows all the dishes available at a certain restaurant.
    | Rice dishes | Main dishes | Vegetable dishes |
    | Boiled rice | Chicken | Mushrooms |
    | Fried rice | Beef | Cauliflower |
    | Pilau rice | Lamb | Spinach |
    | Keema rice | Mixed grill | Lentils |
    | | Prawn | Potatoes |
    | | Vegetarian | |
    A group of friends decide that they will share a total of 2 different rice dishes, 3 different main dishes and 4 different vegetable dishes from this menu. Given these restrictions,
  128. find the number of possible combinations of dishes that they can choose to share,
  129. assuming that all choices are equally likely, find the probability that they choose boiled rice. The friends decide to add a further restriction as follows. If they choose boiled rice, they will not choose potatoes.
  130. Find the number of possible combinations of dishes that they can now choose. 8 The proportion of people who watch West Street on television is \(30 \%\). A market researcher interviews people at random in order to contact viewers of West Street. Each day she has to contact a certain number of viewers of West Street.
  131. Near the end of one day she finds that she needs to contact just one more viewer of West Street. Find the probability that the number of further interviews required is
    (a) 4,
    (b) less than 4.
  132. Near the end of another day she finds that she needs to contact just two more viewers of West Street. Find the probability that the number of further interviews required is
    (a) 5,
    (b) more than 5.
    \section*{Jan 2011} 1 200 candidates took each of two examination papers. The diagram shows the cumulative frequency graphs for their marks.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-43_1045_1425_370_360}
  133. Estimate the median mark for each of the papers.
  134. State, with a reason, which of the two papers was the easier one.
  135. It is suggested that the marks on Paper 2 were less varied than those on Paper 1. Use interquartile ranges to comment on this suggestion.
  136. The minimum mark for grade A , the top grade, on Paper 1 was 10 marks lower than the minimum mark for grade A on Paper 2. Given that 25 candidates gained grade A in Paper 1, find the number of candidates who gained grade A in Paper 2.
  137. The mean and standard deviation of the marks on Paper 1 were 36.5 and 28.2 respectively. Later, a marking error was discovered and it was decided to add 1 mark to each of the 200 marks on Paper 1. State the mean and standard deviation of the new marks on Paper 1. 2 The random variable \(X\) has the distribution \(\operatorname { Geo } ( 0.2 )\). Find
  138. \(\mathrm { P } ( X = 3 )\),
  139. \(\mathrm { P } ( 3 \leqslant X \leqslant 5 )\),
  140. \(\mathrm { P } ( X > 4 )\). Two independent values of \(X\) are found.
  141. Find the probability that the total of these two values is 3 . 3 A firm wishes to assess whether there is a linear relationship between the annual amount spent on advertising, \(\pounds x\) thousand, and the annual profit, \(\pounds y\) thousand. A summary of the figures for 12 years is as follows. $$n = 12 \quad \Sigma x = 86.6 \quad \Sigma y = 943.8 \quad \Sigma x ^ { 2 } = 658.76 \quad \Sigma y ^ { 2 } = 83663.00 \quad \Sigma x y = 7351.12$$
  142. Calculate the product moment correlation coefficient, showing that it is greater than 0.9 .
  143. Comment briefly on this value in this context.
  144. A manager claims that this result shows that spending more money on advertising in the future will result in greater profits. Make two criticisms of this claim.
  145. Calculate the equation of the regression line of \(y\) on \(x\).
  146. Estimate the annual profit during a year when \(\pounds 7400\) was spent on advertising. 4 Jenny and Omar are each allowed two attempts at a high jump.
  147. The probability that Jenny will succeed on her first attempt is 0.6 . If she fails on her first attempt, the probability that she will succeed on her second attempt is 0.7 . Calculate the probability that Jenny will succeed.
  148. The probability that Omar will succeed on his first attempt is \(p\). If he fails on his first attempt, the probability that he will succeed on his second attempt is also \(p\). The probability that he succeeds is 0.51 . Find \(p\).
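The two high-jump parts rest on the same idea: a jumper succeeds either on the first attempt or, having failed, on the second. A minimal sketch of that arithmetic follows, offered as a check rather than a model answer; the function name `succeed_within_two` is an assumption introduced here. For Omar, the equation \(1 - ( 1 - p ) ^ { 2 } = 0.51\) is solved by taking the positive square root.

```python
from math import isclose, sqrt


def succeed_within_two(p_first, p_second_given_fail):
    """P(success on attempt 1) + P(fail on 1) * P(success on 2 | fail on 1)."""
    return p_first + (1 - p_first) * p_second_given_fail


# Jenny: 0.6 on the first attempt, 0.7 on the second if the first fails.
jenny = succeed_within_two(0.6, 0.7)

# Omar: 1 - (1 - p)^2 = 0.51, so (1 - p)^2 = 0.49 and 1 - p = 0.7.
omar_p = 1 - sqrt(1 - 0.51)

print(round(jenny, 2), round(omar_p, 2))
```

Substituting `omar_p` back into `succeed_within_two(omar_p, omar_p)` should reproduce 0.51, which is a quick way to confirm the root chosen.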
    5 \(30 \%\) of packets of Natural Crunch Crisps contain a free gift. Jan buys 5 packets each week.
  149. The number of free gifts that Jan receives in a week is denoted by \(X\). Name a suitable probability distribution with which to model \(X\), giving the value(s) of any parameter(s). State any assumption(s) necessary for the distribution to be a valid model. Assume now that your model is valid.
  150. Find
    (a) \(\mathrm { P } ( X \leqslant 2 )\),
    (b) \(\mathrm { P } ( X = 2 )\).
  151. Find the probability that, in the next 7 weeks, there are exactly 3 weeks in which Jan receives exactly 2 free gifts. 6
  152. The diagram shows 7 cards, each with a digit printed on it. The digits form a 7-digit number.
    1 3 3 3 5 5 9
    How many different 7-digit numbers can be formed using these cards?
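Because three of the cards show 3 and two show 5, arrangements that differ only by swapping identical digits give the same number, so the count is \(\frac { 7 ! } { 3 ! \, 2 ! }\). The sketch below, which is not from the paper, evaluates that formula and cross-checks it by brute-force enumeration; `distinct_arrangements` is a hypothetical helper name.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinct_arrangements(items):
    """Number of distinct orderings of a multiset: n! / (product of count!)."""
    total = factorial(len(items))
    for count in Counter(items).values():
        total //= factorial(count)
    return total


digits = "1333559"  # the seven cards shown in the question

by_formula = distinct_arrangements(digits)
# Enumerate all 7! orderings and keep only the distinct ones.
by_enumeration = len(set(permutations(digits)))

print(by_formula, by_enumeration)
```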
  153. The diagram below shows 5 white cards and 10 grey cards, each with a letter printed on it.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-45_398_801_596_632} From these cards, 3 white cards and 4 grey cards are selected at random without regard to order.
    (a) How many selections of seven cards are possible?
    (b) Find the probability that the seven cards include exactly one card showing the letter A . 7 The probability distribution of a discrete random variable, \(X\), is shown below.
    | \(x\) | 0 | 2 |
    | \(\mathrm { P } ( X = x )\) | \(a\) | \(1 - a\) |
  154. Find \(\mathrm { E } ( X )\) in terms of \(a\).
  155. Show that \(\operatorname { Var } ( X ) = 4 a ( 1 - a )\). 8 Five dogs, \(A , B , C , D\) and \(E\), took part in three races. The order in which they finished the first race was \(A B C D E\).
  156. Spearman's rank correlation coefficient between the orders for the 5 dogs in the first two races was found to be \(- 1\). Write down the order in which the dogs finished the second race.
  157. Spearman's rank correlation coefficient between the orders for the 5 dogs in the first race and the third race was found to be 0.9 .
    (a) Show that, in the usual notation (as in the List of Formulae), \(\Sigma d ^ { 2 } = 2\).
    (b) Hence or otherwise find a possible order in which the dogs could have finished the third race.
    OCR is committed to seeking permission to reproduce all third-party content that it uses in its assessment materials. OCR has attempted to identify and contact all copyright holders whose work is used in this paper. To avoid the issue of disclosure of answer-related information to candidates, all copyright acknowledgements are reproduced in the OCR Copyright Acknowledgements Booklet. This is produced for each series of examinations and is freely available to download from our public website (\href{http://www.ocr.org.uk}{www.ocr.org.uk}) after the live examination series.
    If OCR has unwittingly failed to correctly acknowledge or clear any third-party content in this assessment material, OCR will be happy to correct its mistake at the earliest possible opportunity. For queries or further information please contact the Copyright Team, First Floor, 9 Hills Road, Cambridge CB2 1GE.
    OCR is part of the Cambridge Assessment Group; Cambridge Assessment is the brand name of University of Cambridge Local Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.
    1 Five salesmen from a certain firm were selected at random for a survey. For each salesman, the annual income, \(x\) thousand pounds, and the distance driven last year, \(y\) thousand miles, were recorded. The results were summarised as follows. $$n = 5 \quad \Sigma x = 251 \quad \Sigma x ^ { 2 } = 14323 \quad \Sigma y = 65 \quad \Sigma y ^ { 2 } = 855 \quad \Sigma x y = 3247$$
  158. (a) Show that the product moment correlation coefficient, \(r\), between \(x\) and \(y\) is \(- 0.122\), correct to 3 significant figures.
    (b) State what this value of \(r\) shows about the relationship between annual income and distance driven last year for these five salesmen.
    (c) It was decided to recalculate \(r\) with the distances measured in kilometres instead of miles. State what effect, if any, this would have on the value of \(r\).
  159. Another salesman from the firm is selected at random. His annual income is known to be \(\pounds 52000\), but the distance that he drove last year is unknown. In order to estimate this distance, a regression line based on the above data is used. Comment on the reliability of such an estimate. 2 The orders in which 4 contestants, \(P , Q , R\) and \(S\), were placed in two competitions are shown in the table.
    | Position | 1st | 2nd | 3rd | 4th |
    | Competition 1 | \(Q\) | \(R\) | \(S\) | \(P\) |
    | Competition 2 | \(Q\) | \(P\) | \(R\) | \(S\) |
    Calculate Spearman's rank correlation coefficient between these two orders. 3
  160. A random variable, \(X\), has the distribution \(\mathrm { B } ( 12,0.85 )\). Find
    (a) \(\mathrm { P } ( X > 10 )\),
    (b) \(\mathrm { P } ( X = 10 )\),
    (c) \(\operatorname { Var } ( X )\).
  161. A random variable, \(Y\), has the distribution \(\mathrm { B } \left( 2 , \frac { 1 } { 4 } \right)\). Two independent values of \(Y\) are found. Find the probability that the sum of these two values is 1 . 4 The table shows information about the time, \(t\) minutes correct to the nearest minute, taken by 50 people to complete a race.
    | Time (minutes) | \(t \leqslant 27\) | \(28 \leqslant t \leqslant 30\) | \(31 \leqslant t \leqslant 35\) | \(36 \leqslant t \leqslant 45\) | \(46 \leqslant t \leqslant 60\) | \(t \geqslant 61\) |
    | Number of people | 0 | 4 | 28 | 14 | 4 | 0 |
  162. In a histogram illustrating the data, the height of the block for the \(31 \leqslant t \leqslant 35\) class is 5.6 cm . Find the height of the block for the \(28 \leqslant t \leqslant 30\) class. (There is no need to draw the histogram.)
  163. The data in the table are used to estimate the median time. State, with a reason, whether the estimated median time is more than 33 minutes, less than 33 minutes or equal to 33 minutes.
  164. Calculate estimates of the mean and standard deviation of the data.
  165. It was found that the winner's time had been incorrectly recorded and that it was actually less than 27 minutes 30 seconds. State whether each of the following will increase, decrease or remain the same:
    (a) the mean,
    (b) the standard deviation,
    (c) the median,
    (d) the interquartile range. \section*{June 2011} 5 A bag contains 4 blue discs and 6 red discs. Chloe takes a disc from the bag. If this disc is red, she takes 2 more discs. If not, she takes 1 more disc. Each disc is taken at random and no discs are replaced.
  166. Complete the probability tree diagram in your Answer Book, showing all the probabilities.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-48_629_412_543_511}
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-48_374_323_529_1398} The total number of blue discs that Chloe takes is denoted by \(X\).
  167. Show that \(\mathrm { P } ( X = 1 ) = \frac { 3 } { 5 }\). The complete probability distribution of \(X\) is given below.
    | \(x\) | 0 | 1 | 2 |
    | \(\mathrm { P } ( X = x )\) | \(\frac { 1 } { 6 }\) | \(\frac { 3 } { 5 }\) | \(\frac { 7 } { 30 }\) |
  168. Calculate \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\). 6 A group of 7 students sit in random order on a bench.
  169. (a) Find the number of orders in which they can sit.
    (b) The 7 students include Tom and Jerry. Find the probability that Tom and Jerry sit next to each other.
  170. The students consist of 3 girls and 4 boys. Find the probability that
    (a) no two boys sit next to each other,
    (b) all three girls sit next to each other. 7 The diagram shows the results of an experiment involving some bivariate data. The least squares regression line of \(y\) on \(x\) for these results is also shown.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-49_774_939_376_603}
  171. Given that the least squares regression line of \(y\) on \(x\) is used for an estimation, state which of \(x\) or \(y\) is treated as the independent variable.
  172. Use the diagram to explain what is meant by 'least squares'.
  173. State, with a reason, the value of Spearman's rank correlation coefficient for these data.
  174. What can be said about the value of the product moment correlation coefficient for these data? 8 Ann, Bill, Chris and Dipak play a game with a fair cubical die. Starting with Ann they take turns, in alphabetical order, to throw the die. This process is repeated as many times as necessary until a player throws a 6 . When this happens, the game stops and this player is the winner. Find the probability that
  175. Chris wins on his first throw,
  176. Dipak wins on his second throw,
  177. Ann gets a third throw,
  178. Bill throws the die exactly three times. 1 The probability distribution of a random variable \(X\) is shown in the table.
    | \(x\) | 1 | 2 | 3 | 4 |
    | \(\mathrm { P } ( X = x )\) | 0.1 | 0.3 | \(2 p\) | \(p\) |
  179. Find \(p\).
  180. Find \(\mathrm { E } ( X )\). 2 In an experiment, the percentage sand content, \(y\), of soil in a given region was measured at nine different depths, \(x \mathrm {~cm}\), taken at intervals of 6 cm from 0 cm to 48 cm . The results are summarised below. $$n = 9 \quad \Sigma x = 216 \quad \Sigma x ^ { 2 } = 7344 \quad \Sigma y = 512.4 \quad \Sigma y ^ { 2 } = 30595 \quad \Sigma x y = 10674$$
  181. State, with a reason, which variable is the independent variable.
  182. Calculate the product moment correlation coefficient between \(x\) and \(y\).
  183. (a) Calculate the equation of the appropriate regression line.
    (b) This regression line is used to estimate the percentage sand content at depths of 25 cm and 100 cm . Comment on the reliability of each of these estimates. You are not asked to find the estimates. 3 A random variable \(X\) has the distribution \(\mathrm { B } ( 13,0.12 )\).
  184. Find \(\mathrm { P } ( X < 2 )\). Two independent values of \(X\) are found.
  185. Find the probability that exactly one of these values is equal to 2 . 4 (a) The table gives the heights and masses of 5 people.
    | Person | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) |
    | Height (m) | 1.72 | 1.63 | 1.77 | 1.68 | 1.74 |
    | Mass (kg) | 75 | 62 | 64 | 60 | 70 |
    Calculate Spearman's rank correlation coefficient.
    (b) In an art competition the value of Spearman's rank correlation coefficient, \(r _ { s }\), calculated from two judges’ rankings was 0.75 . A late entry for the competition was received and both judges ranked this entry lower than all the others. By considering the formula for \(r _ { s }\), explain whether the new value of \(r _ { s }\) will be less than 0.75 , equal to 0.75 , or greater than 0.75 . 5 At a certain resort the number of hours of sunshine, measured to the nearest hour, was recorded on each of 21 days. The results are summarised in the table.
    | Hours of sunshine | 0 | \(1 - 3\) | \(4 - 6\) | \(7 - 9\) | \(10 - 15\) |
    | Number of days | 0 | 6 | 9 | 4 | 2 |
    The diagram shows part of a histogram to illustrate the data. The scale on the frequency density axis is 2 cm to 1 unit.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-51_963_1785_689_141}
  186. (a) Calculate the frequency density of the \(1 - 3\) class.
    (b) Fred wishes to draw the block for the \(10 - 15\) class on the same diagram. Calculate the height, in centimetres, of this block.
  187. A cumulative frequency graph is to be drawn. Write down the coordinates of the first two points that should be plotted. You are not asked to draw the graph.
  188. (a) Calculate estimates of the mean and standard deviation of the number of hours of sunshine.
    (b) Explain why your answers are only estimates. 6 The diagrams illustrate all or part of the probability distributions of the discrete random variables \(V , W , X , Y\) and \(Z\).
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-52_423_365_370_296}
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-52_421_378_370_838}
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-52_419_364_370_1400}
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-52_424_367_879_577}
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-52_421_362_879_1139}
  189. One of these variables has the distribution \(\operatorname { Geo } \left( \frac { 1 } { 2 } \right)\). State, with a reason, which variable this is.
  190. One of these variables has the distribution \(B \left( 4 , \frac { 1 } { 2 } \right)\). State, with reasons, which variable this is.
    7 \(60 \%\) of the voters at a certain polling station are women. Voters enter the polling station one at a time. The number of voters who enter, up to and including the first woman, is denoted by \(X\).
  191. State a suitable distribution that can be used as a model for \(X\), giving the value(s) of any parameter(s). State also any necessary condition(s) for this distribution to be a good model. Use the distribution stated in part (i) to find
  192. \(\mathrm { P } ( X = 4 )\),
  193. \(\mathrm { P } ( X \geqslant 4 )\). 8 On average, half the plants of a particular variety produce red flowers and the rest produce blue flowers.
  194. Ann chooses 8 plants of this variety at random. Find the probability that more than 6 plants produce red flowers.
  195. Karim chooses 22 plants of this variety at random.
    (a) Find the probability that the number of these plants that produce blue flowers is equal to the number that produce red flowers.
    (b) Hence find the probability that the number of these plants that produce blue flowers is greater than the number that produce red flowers. 9 A bag contains 9 discs numbered 1, 2, 3, 4, 5, 6, 7, 8, 9 .
  196. Andrea chooses 4 discs at random, without replacement, and places them in a row.
    (a) How many different 4-digit numbers can be made?
    (b) How many different odd 4-digit numbers can be made?
  197. Andrea's 4 discs are put back in the bag. Martin then chooses 4 discs at random, without replacement. Find the probability that
    (a) the 4 digits include at least 3 odd digits,
    (b) the 4 digits add up to 28 . 1 For each of the last five years the number of tourists, \(x\) thousands, visiting Sackton, and the average weekly sales, \(\pounds y\) thousands, in Sackton Stores were noted. The table shows the results.
    | Year | 2007 | 2008 | 2009 | 2010 | 2011 |
    | \(x\) | 250 | 270 | 264 | 290 | 292 |
    | \(y\) | 4.2 | 3.7 | 3.2 | 3.5 | 3.0 |
  198. Calculate the product moment correlation coefficient \(r\) between \(x\) and \(y\).
  199. It is required to estimate the average weekly sales at Sackton Stores in a year when the number of tourists is 280000 . Calculate the equation of an appropriate regression line, and use it to find this estimate.
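The correlation coefficient and the regression line of \(y\) on \(x\) both come from the same three quantities, \(S _ { x x }\), \(S _ { y y }\) and \(S _ { x y }\). The sketch below computes them from the five tabulated \(( x , y )\) pairs and then estimates \(y\) at \(x = 280\) (that is, 280000 tourists). It is a check under the usual S1 formulae, not the paper's mark scheme; `summary_sums` is an illustrative name.

```python
from math import sqrt

# Tourists (thousands) and average weekly sales (thousand pounds), 2007 to 2011.
xs = [250, 270, 264, 290, 292]
ys = [4.2, 3.7, 3.2, 3.5, 3.0]


def summary_sums(xs, ys):
    """Return the means together with S_xx, S_yy and S_xy."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum(x * x for x in xs) - n * mean_x**2
    s_yy = sum(y * y for y in ys) - n * mean_y**2
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y
    return mean_x, mean_y, s_xx, s_yy, s_xy


mean_x, mean_y, s_xx, s_yy, s_xy = summary_sums(xs, ys)

r = s_xy / sqrt(s_xx * s_yy)   # product moment correlation coefficient
b = s_xy / s_xx                # gradient of the regression line of y on x
a = mean_y - b * mean_x        # intercept
estimate = a + b * 280         # estimated weekly sales, in thousand pounds

print(round(r, 3), round(b, 4), round(a, 2), round(estimate, 2))
```

The line of \(y\) on \(x\) is the appropriate one here because sales are being estimated from a given number of tourists.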
  200. Over a longer period the value of \(r\) is \(- 0.8\). The mayor says, "This shows that having more tourists causes sales at Sackton Stores to decrease." Give a reason why this statement is not correct. 2 The masses, \(x \mathrm {~kg}\), of 50 bags of flour were measured and the results were summarised as follows. $$n = 50 \quad \Sigma ( x - 1.5 ) = 1.4 \quad \Sigma ( x - 1.5 ) ^ { 2 } = 0.05$$ Calculate the mean and standard deviation of the masses of these bags of flour. 3 The test marks of 14 students are displayed in a stem-and-leaf diagram, as shown below.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-54_241_264_1420_479} Key: 1 | 6 means 16 marks
  201. Find the lower quartile.
  202. Given that the median is 32 , find the values of \(w\) and \(x\).
  203. Find the possible values of the upper quartile.
  204. State one advantage of a stem-and-leaf diagram over a box-and-whisker plot.
  205. State one advantage of a box-and-whisker plot over a stem-and-leaf diagram. 4 A bag contains 5 red discs and 1 black disc. Tina takes two discs from the bag at random without replacement.
  206. The diagram shows part of a tree diagram to illustrate this situation.
    \includegraphics[max width=\textwidth, alt={}]{11316ea6-3999-4003-b77d-bee8b547c1da-55_266_499_477_550}
    (Tree diagram column headings: First disc, Second disc.)
    Complete the tree diagram in your Answer Book showing all the probabilities.
  207. Find the probability that exactly one of the two discs is red. All the discs are replaced in the bag. Tony now takes three discs from the bag at random without replacement.
  208. Given that the first disc Tony takes is red, find the probability that the third disc Tony takes is also red. 5
  209. Write down the value of Spearman's rank correlation coefficient, \(r _ { s }\), for the following sets of ranks.
    (a)
    | Judge \(A\) ranks | 1 | 2 | 3 | 4 |
    | Judge \(B\) ranks | 1 | 2 | 3 | 4 |
    (b)
    | Judge \(A\) ranks | 1 | 2 | 3 | 4 |
    | Judge \(C\) ranks | 4 | 3 | 2 | 1 |
  212. Calculate the value of \(r _ { s }\) for the following ranks.
    | Judge \(A\) ranks | 1 | 2 | 3 | 4 |
    | Judge \(D\) ranks | 2 | 4 | 1 | 3 |
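For untied ranks the coefficient is \(r _ { s } = 1 - \frac { 6 \Sigma d ^ { 2 } } { n ( n ^ { 2 } - 1 ) }\), where \(d\) is the difference between paired ranks. The sketch below applies that formula to Judge \(A\)'s ranks against those of Judges \(B\), \(C\) and \(D\); it is an illustrative check, and `spearman_from_ranks` is a name introduced here.

```python
def spearman_from_ranks(ranks_a, ranks_b):
    """Spearman's r_s for two lists of untied ranks of equal length."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))


judge_a = [1, 2, 3, 4]

r_ab = spearman_from_ranks(judge_a, [1, 2, 3, 4])  # identical rankings
r_ac = spearman_from_ranks(judge_a, [4, 3, 2, 1])  # fully reversed rankings
r_ad = spearman_from_ranks(judge_a, [2, 4, 1, 3])  # Judge D's rankings

print(r_ab, r_ac, r_ad)
```

The first two cases need no calculation in practice, since identical and fully reversed rankings give the extreme values of the coefficient.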
  213. For each of parts (i)(a), (i)(b) and (ii), describe in everyday terms the relationship between the two judges’ opinions. 6 A six-sided die is biased so that the probability of scoring 6 is 0.1 and the probabilities of scoring \(1,2,3,4\), and 5 are all equal. In a game at a fête, contestants pay \(\pounds 3\) to roll this die. If the score is 6 they receive \(\pounds 10\) back. If the score is 5 they receive \(\pounds 5\) back. Otherwise they receive no money back. Find the organiser's expected profit for 100 rolls of the die. 7
  214. 5 of the 7 letters \(\mathrm { A } , \mathrm { B } , \mathrm { C } , \mathrm { D } , \mathrm { E } , \mathrm { F } , \mathrm { G }\) are arranged in a random order in a straight line.
    (a) How many different arrangements of 5 letters are possible?
    (b) How many of these arrangements end with a vowel (A or E)?
  215. A group of 5 people is to be chosen from a list of 7 people.
    (a) How many different groups of 5 people can be chosen?
    (b) The list of 7 people includes Jill and Jo. A group of 5 people is chosen at random from the list. Given that either Jill and Jo are both chosen or neither of them is chosen, find the probability that both of them are chosen. 8
  216. The random variable \(X\) has the distribution \(\mathrm { B } ( 30,0.6 )\). Find \(\mathrm { P } ( X \geqslant 16 )\).
  217. The random variable \(Y\) has the distribution \(\mathrm { B } ( 4,0.7 )\).
    (a) Find \(\mathrm { P } ( Y = 2 )\).
    (b) Three values of \(Y\) are chosen at random. Find the probability that their total is 10 . 9
  218. A clock is designed to chime once each hour, on the hour. The clock has a fault so that each time it is supposed to chime there is a constant probability of \(\frac { 1 } { 10 }\) that it will not chime. It may be assumed that the clock never stops and that faults occur independently. The clock is started at 5 minutes past midnight on a certain day. Find the probability that the first time it does not chime is
    (a) at 0600 on that day,
    (b) before 0600 on that day.
  219. Another clock is designed to chime twice each hour: on the hour and at 30 minutes past the hour. This clock has a fault so that each time it is supposed to chime there is a constant probability of \(\frac { 1 } { 20 }\) that it will not chime. It may be assumed that the clock never stops and that faults occur independently. The clock is started at 5 minutes past midnight on a certain day.
    (a) Find the probability that the first time it does not chime is at either 0030 or 0130 on that day.
    (b) Use the formula for the sum to infinity of a geometric progression to find the probability that the first time it does not chime is at 30 minutes past some hour.
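For the second clock, started at 0005, the chimes are due at 0030, 0100, 0130 and so on, so the chimes at 30 minutes past the hour are the odd-numbered ones. The probability that the first failure is at chime \(k\) is \(\left( \frac { 19 } { 20 } \right) ^ { k - 1 } \times \frac { 1 } { 20 }\), and summing over odd \(k\) gives a geometric progression with common ratio \(\left( \frac { 19 } { 20 } \right) ^ { 2 }\). The sketch below is a check under that reading of the question, not a model answer; `first_failure_at` is a hypothetical helper.

```python
from fractions import Fraction

p_fail = Fraction(1, 20)   # probability a given chime is missed
p_ok = 1 - p_fail          # probability a given chime sounds


def first_failure_at(k):
    """P(first missed chime is the k-th scheduled chime), k = 1, 2, 3, ..."""
    return p_ok ** (k - 1) * p_fail


# (a) First miss at 0030 (chime 1) or 0130 (chime 3).
part_a = first_failure_at(1) + first_failure_at(3)

# (b) Sum to infinity over odd k: first term p_fail, common ratio p_ok^2.
part_b = p_fail / (1 - p_ok**2)

# Cross-check (b) with a long partial sum over odd-numbered chimes.
partial = sum(first_failure_at(k) for k in range(1, 400, 2))

print(part_a, part_b, float(partial))
```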