Questions S2 (1597 questions)

AQA S2 2009 June Q2
2 John works from home. The number of business letters, \(X\), that he receives on a weekday may be modelled by a Poisson distribution with mean 5.0. The number of private letters, \(Y\), that he receives on a weekday may be modelled by a Poisson distribution with mean 1.5.
  1. Find, for a given weekday:
    1. \(\mathrm { P } ( X < 4 )\);
    2. \(\mathrm { P } ( Y = 4 )\).
  2.
    1. Assuming that \(X\) and \(Y\) are independent random variables, determine the probability that, on a given weekday, John receives a total of more than 5 business and private letters.
    2. Hence calculate the probability that John receives a total of more than 5 business and private letters on at least 7 out of 8 given weekdays.
  3. The numbers of letters received by John's neighbour, Brenda, on 10 consecutive weekdays are $$\begin{array} { l l l l l l l l l l } 15 & 8 & 14 & 7 & 6 & 8 & 2 & 8 & 9 & 3 \end{array}$$
    1. Calculate the mean and the variance of these data.
    2. State, giving a reason based on your answers to part (c)(i), whether or not a Poisson distribution might provide a suitable model for the number of letters received by Brenda on a weekday.
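A minimal Python sketch for checking answers to this question numerically. It assumes only the values stated in the question; the helper names `poisson_pmf` and `poisson_cdf` are illustrative, and the variance shown is the unbiased (divide by n - 1) version, which is one possible convention.

```python
from math import comb, exp, factorial
from statistics import mean, variance


def poisson_pmf(k, lam):
    """P(N = k) for N ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def poisson_cdf(k, lam):
    """P(N <= k) for N ~ Po(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


p_x_less_4 = poisson_cdf(3, 5.0)      # P(X < 4) = P(X <= 3)
p_y_equals_4 = poisson_pmf(4, 1.5)    # P(Y = 4)

# For independent Poisson variables, X + Y ~ Po(5.0 + 1.5).
p_total_over_5 = 1 - poisson_cdf(5, 6.5)

# Number of such days out of 8 ~ B(8, p_total_over_5); sum the k = 7, 8 terms.
p_at_least_7_of_8 = sum(
    comb(8, k) * p_total_over_5**k * (1 - p_total_over_5) ** (8 - k)
    for k in (7, 8)
)

brenda = [15, 8, 14, 7, 6, 8, 2, 8, 9, 3]
brenda_mean = mean(brenda)
brenda_var = variance(brenda)         # unbiased estimate, divisor n - 1

print(p_x_less_4, p_y_equals_4, p_total_over_5, p_at_least_7_of_8)
print(brenda_mean, brenda_var)
```

Comparing `brenda_mean` with `brenda_var` gives the evidence asked for in the final part, since a Poisson model requires the mean and variance to be roughly equal.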
AQA S2 2009 June Q3
3 A sample survey, conducted to determine the attitudes of residents to a proposed reorganisation of local schools, gave the following results.
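$$\begin{array} { l | c | c } \text { Age of resident } & \text { Against reorganisation } & \text { Not against reorganisation } \\ \hline \text { 16-17 } & 9 & 2 \\ \text { 18-21 } & 17 & 10 \\ \text { 22-49 } & 115 & 90 \\ \text { 50-65 } & 41 & 34 \\ \text { Over 65 } & 3 & 4 \end{array}$$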
Use a \(\chi ^ { 2 }\) test, at the \(5 \%\) level of significance, to determine whether there is an association between the ages of residents and their attitudes to the proposed reorganisation of local schools.
AQA S2 2009 June Q4
4 The continuous random variable \(X\) has probability density function given by $$f ( x ) = \left\{ \begin{array} { c c } \frac { 1 } { 2 } & 0 \leqslant x \leqslant 1 \\ \frac { 3 - x } { 4 } & 1 \leqslant x \leqslant 3 \\ 0 & \text { otherwise } \end{array} \right.$$
  1. Sketch the graph of f.
  2. Explain why the value of \(\eta\), the median of \(X\), is 1 .
  3. Show that the value of \(\mu\), the mean of \(X\), is \(\frac { 13 } { 12 }\).
  4. Find \(\mathrm { P } ( X < 3 \mu - \eta )\).
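The stated median and mean can be checked with exact arithmetic. The sketch below is one possible approach, not a model solution: it integrates each polynomial piece of the density exactly with `fractions.Fraction`. The names `PIECES`, `integrate_poly` and `cdf` are illustrative.

```python
from fractions import Fraction as F

# Each piece: (lower limit, upper limit, polynomial coefficients [c0, c1, ...]).
PIECES = [
    (F(0), F(1), [F(1, 2)]),             # f(x) = 1/2 on [0, 1]
    (F(1), F(3), [F(3, 4), F(-1, 4)]),   # f(x) = (3 - x)/4 on [1, 3]
]


def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(c_k * x**k)."""
    def antiderivative(x):
        return sum(c * x ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)


def cdf(t):
    """P(X <= t), accumulated piece by piece."""
    total = F(0)
    for a, b, coeffs in PIECES:
        if t > a:
            total += integrate_poly(coeffs, a, min(b, t))
    return total


total_area = cdf(F(3))        # should be 1 for a valid density
median_check = cdf(F(1))      # should be 1/2 if the median is 1
# E(X): multiply each piece by x, i.e. shift the coefficients up one power.
mean = sum(integrate_poly([F(0)] + coeffs, a, b) for a, b, coeffs in PIECES)
answer = cdf(3 * mean - 1)    # P(X < 3*mu - eta) with eta = 1

print(total_area, median_check, mean, answer)
```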
AQA S2 2009 June Q5
5 Joanne has 10 identically-shaped discs, of which 1 is blue, 2 are green, 3 are yellow and 4 are red. She places the 10 discs in a bag and asks her friend David to play a game by selecting, at random and without replacement, two discs from the bag.
  1. Show that:
    1. the probability that the two discs selected are the same colour is \(\frac { 2 } { 9 }\);
    2. the probability that exactly one of the two discs selected is blue is \(\frac { 1 } { 5 }\).
  2. Using the discs, Joanne plays the game with David, under the following conditions: If the two discs selected by David are the same colour, she will pay him 135p. If exactly one of the two discs selected by David is blue, she will pay him 145p. Otherwise David will pay Joanne 45p.
    1. When a game is played, \(X\) is the amount, in pence, won by David. Construct the probability distribution for \(X\), in the form of a table.
    2. Show that \(\mathrm { E } ( X ) = 33\).
  3. Joanne modifies the game so that the amount per game, \(Y\) pence, that she wins may be modelled by $$Y = 104 - 3 X$$
    1. Determine how much Joanne would expect to win if the game is played 100 times.
    2. Calculate the standard deviation of \(Y\), giving your answer to the nearest 1 p .
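The "show that" results in this question can be confirmed by brute-force enumeration. This is a sketch under the assumption that all 45 unordered pairs of discs are equally likely; `payout` is a hypothetical helper encoding the rules of the game as stated.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt

discs = ["blue"] + ["green"] * 2 + ["yellow"] * 3 + ["red"] * 4
pairs = list(combinations(range(len(discs)), 2))   # 45 equally likely pairs


def payout(i, j):
    """Amount in pence won by David for the pair of discs (i, j)."""
    a, b = discs[i], discs[j]
    if a == b:
        return 135
    if (a == "blue") != (b == "blue"):   # exactly one of the two is blue
        return 145
    return -45


# Probability distribution of X as exact fractions.
dist = {}
for i, j in pairs:
    x = payout(i, j)
    dist[x] = dist.get(x, Fraction(0)) + Fraction(1, len(pairs))

e_x = sum(x * p for x, p in dist.items())
var_x = sum(x * x * p for x, p in dist.items()) - e_x**2

# Y = 104 - 3X, so E(Y) = 104 - 3 E(X) and sd(Y) = 3 sd(X).
e_y = 104 - 3 * e_x
sd_y = 3 * sqrt(var_x)

print(dist, e_x, 100 * e_y, round(sd_y))
```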
AQA S2 2009 June Q6
6 Bishen believes that the mean weight of boxes of black peppercorns is 45 grams. Abi, thinking that this is not the case, weighs, in grams, a random sample of 8 boxes of black peppercorns, with the following results. $$\begin{array} { l l l l l l l l } 44 & 44 & 43 & 46 & 42 & 40 & 43 & 46 \end{array}$$
  1.
    1. Construct a \(95 \%\) confidence interval for the mean weight of boxes of black peppercorns, stating any assumption that you make.
    2. Comment on Bishen's belief.
  2.
    1. Abi claims that the mean weight of boxes of black peppercorns is less than 45 grams. Test this claim at the \(5 \%\) level of significance.
    2. If Bishen's belief is true, state, with a reason, what type of error, if any, may have occurred when conclusions to the test in part (b)(i) were drawn.
      (2 marks)
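A short sketch of the arithmetic behind this question, assuming the weights are normally distributed so that a t-distribution with 7 degrees of freedom applies. The Python standard library has no t-distribution, so the two critical values are hard-coded from standard tables; that is an assumption of this sketch, and the constant names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

weights = [44, 44, 43, 46, 42, 40, 43, 46]
n = len(weights)
x_bar = mean(weights)
s = stdev(weights)                  # sample standard deviation, divisor n - 1
std_error = s / sqrt(n)

# Critical values for 7 degrees of freedom, taken from standard t tables.
T_TWO_SIDED_95 = 2.365              # upper 2.5% point
T_ONE_SIDED_5 = 1.895               # upper 5% point

half_width = T_TWO_SIDED_95 * std_error
ci = (x_bar - half_width, x_bar + half_width)

# One-tailed test of H0: mu = 45 against H1: mu < 45.
t_stat = (x_bar - 45) / std_error
reject_h0 = t_stat < -T_ONE_SIDED_5

print(x_bar, s, ci, t_stat, reject_h0)
```

The two-sided interval and the one-tailed test use different critical values, which is why they need not lead to the same conclusion about the value 45.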
OCR S2 Q1
1 In a study of urban foxes it is found that on average there are 2 foxes in every 3 acres.
  1. Use a Poisson distribution to find the probability that, at a given moment,
    (a) in a randomly chosen area of 3 acres there are at least 4 foxes,
    (b) in a randomly chosen area of 1 acre there are exactly 2 foxes.
  2. Explain briefly why a Poisson distribution might not be a suitable model.
OCR S2 Q2
2 The random variable \(W\) has the distribution \(B \left( 40 , \frac { 2 } { 7 } \right)\). Use an appropriate approximation to find \(\mathrm { P } ( W > 13 )\).
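One way to see what "an appropriate approximation" means here is to compare a normal approximation with continuity correction against the exact binomial tail. The sketch assumes the normal distribution is the intended approximation, with matching mean and variance.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 40, 2 / 7
mu = n * p                          # np
sigma = sqrt(n * p * (1 - p))       # sqrt(npq)

# P(W > 13) = P(W >= 14); with continuity correction this becomes P(N > 13.5).
approx = 1 - NormalDist(mu, sigma).cdf(13.5)

# Exact binomial tail for comparison.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(14, n + 1))

print(approx, exact)
```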
OCR S2 Q3
3 The manufacturers of a brand of chocolates claim that, on average, \(30 \%\) of their chocolates have hard centres. In a random sample of 8 chocolates from this manufacturer, 5 had hard centres. Test, at the \(5 \%\) significance level, whether there is evidence that the population proportion of chocolates with hard centres is not \(30 \%\), stating your hypotheses clearly. Show the values of any relevant probabilities.
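The relevant probability for this test can be computed directly. This sketch assumes a two-tailed test, so the upper-tail probability is compared with 2.5%; `binom_pmf` is an illustrative helper.

```python
from math import comb

n, p0, observed = 8, 0.3, 5


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Under H0: p = 0.3, probability of a result at least as extreme as 5.
p_upper_tail = sum(binom_pmf(k, n, p0) for k in range(observed, n + 1))

# Two-tailed test at the 5% level: each tail gets 2.5%.
reject_h0 = p_upper_tail < 0.025

print(p_upper_tail, reject_h0)
```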
OCR S2 Q4
4 DVD players are tested after manufacture. The probability that a randomly chosen DVD player is defective is 0.02 . The number of defective players in a random sample of size 80 is denoted by \(R\).
  1. Use an appropriate approximation to find \(\mathrm { P } ( R \geqslant 2 )\).
  2. Find the smallest value of \(r\) for which \(\mathrm { P } ( R \geqslant r ) < 0.01\).
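A sketch of both parts, assuming the intended approximation to \(\mathrm{B}(80, 0.02)\) is the Poisson distribution with mean \(80 \times 0.02 = 1.6\). The function names are illustrative.

```python
from math import exp, factorial

lam = 80 * 0.02                     # mean of the approximating Poisson


def poisson_cdf(k, lam):
    """P(R <= k) for R ~ Po(lam); returns 0 when k < 0."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def upper_tail(r, lam):
    """P(R >= r)."""
    return 1 - poisson_cdf(r - 1, lam)


p_at_least_2 = upper_tail(2, lam)

# Step r upwards until the tail probability first drops below 0.01.
smallest_r = 0
while upper_tail(smallest_r, lam) >= 0.01:
    smallest_r += 1

print(p_at_least_2, smallest_r)
```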
OCR S2 Q5
5 In an investment model the increase, \(Y \%\), in the value of an investment in one year is modelled as a continuous random variable with the distribution \(\mathrm { N } \left( \mu , \frac { 1 } { 4 } \mu ^ { 2 } \right)\). The value of \(\mu\) depends on the type of investment chosen.
  1. Find \(\mathrm { P } ( Y < 0 )\), showing that it is independent of the value of \(\mu\).
  2. Given that \(\mu = 6\), find the probability that \(Y < 9\) in each of three randomly chosen years.
  3. Explain why the calculation in part (ii) might not be valid if applied to three consecutive years.
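The claim in part (i) can be illustrated numerically: since the standard deviation is \(\frac{1}{2}\mu\), standardising 0 always gives \(z = -2\). The sketch assumes \(\mu > 0\) and independent years in part (ii); the trial values of \(\mu\) are arbitrary.

```python
from statistics import NormalDist


def p_negative(mu):
    """P(Y < 0) for Y ~ N(mu, (mu/2)^2), assuming mu > 0."""
    return NormalDist(mu, mu / 2).cdf(0)


# Same probability for any positive mu, because (0 - mu) / (mu / 2) = -2.
trial_values = [p_negative(mu) for mu in (1, 6, 20)]

# Part (ii): mu = 6, so sd = 3 and P(Y < 9) is the standard normal cdf at 1.
p_single_year = NormalDist(6, 3).cdf(9)
p_three_years = p_single_year**3    # valid only if the years are independent

print(trial_values, p_single_year, p_three_years)
```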
OCR S2 Q6
6 Alex obtained the actual waist measurements, \(w\) inches, of a random sample of 50 pairs of jeans, each of which was labelled as having a 32 -inch waist. The results are summarised by $$n = 50 , \quad \Sigma w = 1615.0 , \quad \Sigma w ^ { 2 } = 52214.50$$ Test, at the \(0.1 \%\) significance level, whether this sample provides evidence that the mean waist measurement of jeans labelled as having 32 -inch waists is in fact greater than 32 inches. State your hypotheses clearly. \section*{Jan 2006}
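A sketch of the test statistic for the jeans question, assuming that with \(n = 50\) the sample mean can be treated as approximately normal and the population variance replaced by its unbiased estimate.

```python
from math import sqrt
from statistics import NormalDist

n, sum_w, sum_w2 = 50, 1615.0, 52214.50

w_bar = sum_w / n
s2 = (sum_w2 - sum_w**2 / n) / (n - 1)      # unbiased variance estimate

# H0: mu = 32 against H1: mu > 32, one-tailed at the 0.1% level.
z = (w_bar - 32) / sqrt(s2 / n)
z_crit = NormalDist().inv_cdf(1 - 0.001)
reject_h0 = z > z_crit

print(w_bar, s2, z, z_crit, reject_h0)
```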
OCR S2 Q7
7 The random variable \(X\) has the distribution \(\mathrm { N } \left( \mu , 8 ^ { 2 } \right)\). The mean of a random sample of 12 observations of \(X\) is denoted by \(\bar { X }\). A test is carried out at the \(1 \%\) significance level of the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 80\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu < 80\). The test is summarised as follows: 'Reject \(\mathrm { H } _ { 0 }\) if \(\bar { X } < c\); otherwise do not reject \(\mathrm { H } _ { 0 }\)'.
  1. Calculate the value of \(c\).
  2. Assuming that \(\mu = 80\), state whether the conclusion of the test is correct, results in a Type I error, or results in a Type II error if:
    (a) \(\bar { X } = 74.0\),
    (b) \(\bar { X } = 75.0\).
  3. Independent repetitions of the above test, using the value of \(c\) found in part (i), suggest that in fact the probability of rejecting the null hypothesis is 0.06 . Use this information to calculate the value of \(\mu\).
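The critical value and the back-calculation in part (iii) can be sketched with `statistics.NormalDist`. The variable names are illustrative, and the inverse normal values come from the library rather than from printed tables, so the last decimal place may differ slightly from a table-based answer.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()                    # standard normal
mu0, sigma, n = 80, 8, 12
std_error = sigma / sqrt(n)

# Part (i): P(X_bar < c | mu = 80) = 0.01.
c = mu0 + Z.inv_cdf(0.01) * std_error

# Part (ii): with mu = 80, rejecting H0 is a Type I error.
outcomes = {x_bar: ("reject H0" if x_bar < c else "do not reject H0")
            for x_bar in (74.0, 75.0)}

# Part (iii): P(X_bar < c | mu) = 0.06, so (c - mu) / std_error = inv_cdf(0.06).
mu_actual = c - Z.inv_cdf(0.06) * std_error

print(c, outcomes, mu_actual)
```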
OCR S2 Q8
8 A continuous random variable \(X\) has probability density function given by $$f ( x ) = \begin{cases} k x ^ { n } & 0 \leqslant x \leqslant 1 \\ 0 & \text { otherwise } \end{cases}$$ where \(n\) and \(k\) are positive constants.
  1. Find \(k\) in terms of \(n\).
  2. Show that \(\mathrm { E } ( X ) = \frac { n + 1 } { n + 2 }\). It is given that \(n = 3\).
  3. Find the variance of \(X\).
  4. One hundred observations of \(X\) are taken, and the mean of the observations is denoted by \(\bar { X }\). Write down the approximate distribution of \(\bar { X }\), giving the values of any parameters.
  5. Write down the mean and the variance of the random variable \(Y\) with probability density function given by $$g ( y ) = \begin{cases} 4 \left( y + \frac { 4 } { 5 } \right) ^ { 3 } & - \frac { 4 } { 5 } \leqslant y \leqslant \frac { 1 } { 5 } \\ 0 & \text { otherwise } \end{cases}$$ \section*{June 2006} 1 Calculate the variance of the continuous random variable with probability density function given by $$f ( x ) = \begin{cases} \frac { 3 } { 37 } x ^ { 2 } & 3 \leqslant x \leqslant 4 \\ 0 & \text { otherwise } \end{cases}$$ 2
  6. The random variable \(R\) has the distribution \(\mathrm { B } ( 6 , p )\). A random observation of \(R\) is found to be 6. Carry out a \(5 \%\) significance test of the null hypothesis \(\mathrm { H } _ { 0 } : p = 0.45\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : p \neq 0.45\), showing all necessary details of your calculation.
  7. The random variable \(S\) has the distribution \(\mathrm { B } ( n , p )\). \(\mathrm { H } _ { 0 }\) and \(\mathrm { H } _ { 1 }\) are as in part (i). A random observation of \(S\) is found to be 1 . Use tables to find the largest value of \(n\) for which \(\mathrm { H } _ { 0 }\) is not rejected. Show the values of any relevant probabilities. 3 The continuous random variable \(T\) has mean \(\mu\) and standard deviation \(\sigma\). It is known that \(\mathrm { P } ( T < 140 ) = 0.01\) and \(\mathrm { P } ( T < 300 ) = 0.8\).
  8. Assuming that \(T\) is normally distributed, calculate the values of \(\mu\) and \(\sigma\). In fact, \(T\) represents the time, in minutes, taken by a randomly chosen runner in a public marathon, in which about \(10 \%\) of runners took longer than 400 minutes.
  9. State with a reason whether the mean of \(T\) would be higher than, equal to, or lower than the value calculated in part (i). 4
  10. Explain briefly what is meant by a random sample. Random numbers are used to select, with replacement, a sample of size \(n\) from a population numbered 000, 001, 002, ..., 799.
  11. If \(n = 6\), find the probability that exactly 4 of the selected sample have numbers less than 500 .
  12. If \(n = 60\), use a suitable approximation to calculate the probability that at least 40 of the selected sample have numbers less than 500 . 5 An airline has 300 seats available on a flight to Australia. It is known from experience that on average only \(99 \%\) of those who have booked seats actually arrive to take the flight, the remaining \(1 \%\) being called 'no-shows'. The airline therefore sells more than 300 seats. If more than 300 passengers then arrive, the flight is over-booked. Assume that the number of no-show passengers can be modelled by a binomial distribution.
  13. If the airline sells 303 seats, state a suitable distribution for the number of no-show passengers, and state a suitable approximation to this distribution, giving the values of any parameters. Using the distribution and approximation in part (i),
  14. show that the probability that the flight is over-booked is 0.4165 , correct to 4 decimal places,
  15. find the largest number of seats that can be sold for the probability that the flight is over-booked to be less than 0.2 . \section*{June 2006} 6 Customers arrive at a post office at a constant average rate of 0.4 per minute.
  16. State an assumption needed to model the number of customers arriving in a given time interval by a Poisson distribution. Assuming that the use of a Poisson distribution is justified,
  17. find the probability that more than 2 customers arrive in a randomly chosen 1 -minute interval,
  18. use a suitable approximation to calculate the probability that more than 55 customers arrive in a given two-hour interval,
  19. calculate the smallest time for which the probability that no customers arrive in that time is less than 0.02 , giving your answer to the nearest second. 7 Three independent researchers, \(A , B\) and \(C\), carry out significance tests on the power consumption of a manufacturer's domestic heaters. The power consumption, \(X\) watts, is a normally distributed random variable with mean \(\mu\) and standard deviation 60. Each researcher tests the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 4000\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu > 4000\). Researcher \(A\) uses a sample of size 50 and a significance level of \(5 \%\).
  20. Find the critical region for this test, giving your answer correct to 4 significant figures. In fact the value of \(\mu\) is 4020 .
  21. Calculate the probability that Researcher \(A\) makes a Type II error.
  22. Researcher \(B\) uses a sample bigger than 50 and a significance level of \(5 \%\). Explain whether the probability that Researcher \(B\) makes a Type II error is less than, equal to, or greater than your answer to part (ii).
  23. Researcher \(C\) uses a sample of size 50 and a significance level bigger than \(5 \%\). Explain whether the probability that Researcher \(C\) makes a Type II error is less than, equal to, or greater than your answer to part (ii).
  24. State with a reason whether it is necessary to use the Central Limit Theorem at any point in this question. 1 The random variable \(H\) has the distribution \(\mathrm { N } \left( \mu , 5 ^ { 2 } \right)\). It is given that \(\mathrm { P } ( H < 22 ) = 0.242\). Find the value of \(\mu\). 2 A school has 900 pupils. For a survey, Jan obtains a list of all the pupils, numbered 1 to 900 in alphabetical order. She then selects a sample by the following method. Two fair dice, one red and one green, are thrown, and the number in the list of the first pupil in the sample is determined by the following table.
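    $$\begin{array} { l | c c c c c c } \text { Score on green dice } & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline \text { Score on red dice: 1, 2 or 3 } & 1 & 2 & 3 & 4 & 5 & 6 \\ \text { Score on red dice: 4, 5 or 6 } & 7 & 8 & 9 & 10 & 11 & 12 \end{array}$$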
    For example, if the scores on the red and green dice are 5 and 2 respectively, then the first member of the sample is the pupil numbered 8 in the list. Starting with this first number, every 12th number on the list is then used, so that if the first pupil selected is numbered 8 , the others will be numbered \(20,32,44 , \ldots\).
  25. State the size of the sample.
  26. Explain briefly whether the following statements are true.
    (a) Each pupil in the school has an equal probability of being in the sample.
    (b) The pupils in the sample are selected independently of one another.
  27. Give a reason why the number of the first pupil in the sample should not be obtained simply by adding together the scores on the two dice. Justify your answer. 3 A fair dice is thrown 90 times. Use an appropriate approximation to find the probability that the number 1 is obtained 14 or more times. 4 A set of observations of a random variable \(W\) can be summarised as follows: $$n = 14 , \quad \Sigma w = 100.8 , \quad \Sigma w ^ { 2 } = 938.70 .$$
  28. Calculate an unbiased estimate of the variance of \(W\).
  29. The mean of 70 observations of \(W\) is denoted by \(\bar { W }\). State the approximate distribution of \(\bar { W }\), including unbiased estimate(s) of any parameter(s). \section*{Jan 2007} 5 On a particular night, the number of shooting stars seen per minute can be modelled by the distribution \(\operatorname { Po } ( 0.2 )\).
  30. Find the probability that, in a given 6 -minute period, fewer than 2 shooting stars are seen.
  31. Find the probability that, in 20 periods of 6 minutes each, the number of periods in which fewer than 2 shooting stars are seen is exactly 13 .
  32. Use a suitable approximation to find the probability that, in a given 2-hour period, fewer than 30 shooting stars are seen. 6 The continuous random variable \(X\) has the following probability density function: $$f ( x ) = \begin{cases} a + b x & 0 \leqslant x \leqslant 2 \\ 0 & \text { otherwise } \end{cases}$$ where \(a\) and \(b\) are constants.
  33. Show that \(2 a + 2 b = 1\).
  34. It is given that \(\mathrm { E } ( X ) = \frac { 11 } { 9 }\). Use this information to find a second equation connecting \(a\) and \(b\), and hence find the values of \(a\) and \(b\).
  35. Determine whether the median of \(X\) is greater than, less than, or equal to \(\mathrm { E } ( X )\). A television company believes that the proportion of households that can receive Channel C is 0.35 .
  36. In a random sample of 14 households it is found that 2 can receive Channel C. Test, at the \(2.5 \%\) significance level, whether there is evidence that the proportion of households that can receive Channel C is less than 0.35 .
  37. On another occasion the test is carried out again, with the same hypotheses and significance level as in part (i), but using a new sample, of size \(n\). It is found that no members of the sample can receive Channel C. Find the largest value of \(n\) for which the null hypothesis is not rejected. Show all relevant working. \section*{Jan 2007} 8 The quantity, \(X\) milligrams per litre, of silicon dioxide in a certain brand of mineral water is a random variable with distribution \(\mathrm { N } \left( \mu , 5.6 ^ { 2 } \right)\).
  38. A random sample of 80 observations of \(X\) has sample mean 100.7. Test, at the \(1 \%\) significance level, the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 102\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu \neq 102\).
  39. The test is redesigned so as to meet the following conditions.
    • The hypotheses are \(\mathrm { H } _ { 0 } : \mu = 102\) and \(\mathrm { H } _ { 1 } : \mu < 102\).
    • The significance level is \(1 \%\).
    • The probability of making a Type II error when \(\mu = 100\) is to be (approximately) 0.05 .
    The sample size is \(n\), and the critical region is \(\bar { X } < c\), where \(\bar { X }\) denotes the sample mean.
    (a) Show that \(n\) and \(c\) satisfy (approximately) the equation \(102 - c = \frac { 13.0256 } { \sqrt { n } }\).
    (b) Find another equation satisfied by \(n\) and \(c\).
    (c) Hence find the values of \(n\) and \(c\). \section*{June 2007} 1 A random sample of observations of a random variable \(X\) is summarised by $$n = 100 , \quad \Sigma x = 4830.0 , \quad \Sigma x ^ { 2 } = 249\,509.16 .$$
  40. Obtain unbiased estimates of the mean and variance of \(X\).
  41. The sample mean of 100 observations of \(X\) is denoted by \(\bar { X }\). Explain whether you would need any further information about the distribution of \(X\) in order to estimate \(\mathrm { P } ( \bar { X } > 60 )\). [You should not attempt to carry out the calculation.] 2 It is given that on average one car in forty is yellow. Using a suitable approximation, find the probability that, in a random sample of 130 cars, exactly 4 are yellow. 3 The proportion of adults in a large village who support a proposal to build a bypass is denoted by \(p\). A random sample of size 20 is selected from the adults in the village, and the members of the sample are asked whether or not they support the proposal.
  42. Name the probability distribution that would be used in a hypothesis test for the value of \(p\).
  43. State the properties of a random sample that explain why the distribution in part (i) is likely to be a good model.
    4 \(X\) is a continuous random variable.
  44. State two conditions needed for \(X\) to be well modelled by a normal distribution.
  45. It is given that \(X \sim \mathrm {~N} \left( 50.0,8 ^ { 2 } \right)\). The mean of 20 random observations of \(X\) is denoted by \(\bar { X }\). Find \(\mathrm { P } ( \bar { X } > 47.0 )\). 5 The number of system failures per month in a large network is a random variable with the distribution \(\operatorname { Po } ( \lambda )\). A significance test of the null hypothesis \(\mathrm { H } _ { 0 } : \lambda = 2.5\) is carried out by counting \(R\), the number of system failures in a period of 6 months. The result of the test is that \(\mathrm { H } _ { 0 }\) is rejected if \(R > 23\) but is not rejected if \(R \leqslant 23\).
  46. State the alternative hypothesis.
  47. Find the significance level of the test.
  48. Given that \(\mathrm { P } ( R > 23 ) < 0.1\), use tables to find the largest possible actual value of \(\lambda\). You should show the values of any relevant probabilities. 6 In a rearrangement code, the letters of a message are rearranged so that the frequency with which any particular letter appears is the same as in the original message. In ordinary German the letter \(e\) appears \(19 \%\) of the time. A certain encoded message of 20 letters contains one letter \(e\).
  49. Using an exact binomial distribution, test at the \(10 \%\) significance level whether there is evidence that the proportion of the letter \(e\) in the language from which this message is a sample is less than in German, i.e., less than \(19 \%\).
  50. Give a reason why a binomial distribution might not be an appropriate model in this context. 7 Two continuous random variables \(S\) and \(T\) have probability density functions as follows. $$\begin{array} { l l } S : & f ( x ) = \begin{cases} \frac { 1 } { 2 } & - 1 \leqslant x \leqslant 1 \\ 0 & \text { otherwise } \end{cases} \\ T : & g ( x ) = \begin{cases} \frac { 3 } { 2 } x ^ { 2 } & - 1 \leqslant x \leqslant 1 \\ 0 & \text { otherwise } \end{cases} \end{array}$$
  51. Sketch on the same axes the graphs of \(y = \mathrm { f } ( x )\) and \(y = \mathrm { g } ( x )\). [You should not use graph paper or attempt to plot points exactly.]
  52. Explain in everyday terms the difference between the two random variables.
  53. Find the value of \(t\) such that \(\mathrm { P } ( T > t ) = 0.2\). 8 A random variable \(Y\) is normally distributed with mean \(\mu\) and variance 12.25. Two statisticians carry out significance tests of the hypotheses \(\mathrm { H } _ { 0 } : \mu = 63.0 , \mathrm { H } _ { 1 } : \mu > 63.0\).
  54. Statistician \(A\) uses the mean \(\bar { Y }\) of a sample of size 23, and the critical region for his test is \(\bar { Y } > 64.20\). Find the significance level for \(A\) 's test.
  55. Statistician \(B\) uses the mean of a sample of size 50 and a significance level of \(5 \%\).
    (a) Find the critical region for \(B\) 's test.
    (b) Given that \(\mu = 65.0\), find the probability that \(B\) 's test results in a Type II error.
  56. Given that, when \(\mu = 65.0\), the probability that \(A\) 's test results in a Type II error is 0.1365 , state with a reason which test is better.
OCR S2 Q9
27 marks
9
  1. The random variable \(G\) has the distribution \(\mathrm { B } ( n , 0.75 )\). Find the set of values of \(n\) for which the distribution of \(G\) can be well approximated by a normal distribution.
  2. The random variable \(H\) has the distribution \(\mathrm { B } ( n , p )\). It is given that, using a normal approximation, \(\mathrm { P } ( H \geqslant 71 ) = 0.0401\) and \(\mathrm { P } ( H \leqslant 46 ) = 0.0122\).
    1. Find the mean and standard deviation of the approximating normal distribution.
    2. Hence find the values of \(n\) and \(p\). 1 The random variable \(T\) is normally distributed with mean \(\mu\) and standard deviation \(\sigma\). It is given that \(\mathrm { P } ( T > 80 ) = 0.05\) and \(\mathrm { P } ( T > 50 ) = 0.75\). Find the values of \(\mu\) and \(\sigma\). 2 A village has a population of 600 people. A sample of 12 people is obtained as follows. A list of all 600 people is obtained and a three-digit number, between 001 and 600 inclusive, is allocated to each name in alphabetical order. Twelve three-digit random numbers, between 001 and 600 inclusive, are obtained and the people whose names correspond to those numbers are chosen.
    3. Find the probability that all 12 of the numbers chosen are 500 or less.
    4. When the selection has been made, it is found that all of the numbers chosen are 500 or less. One of the people in the village says, "The sampling method must have been biased." Comment on this statement. 3 The random variable \(G\) has the distribution \(\operatorname { Po } ( \lambda )\). A test is carried out of the null hypothesis \(\mathrm { H } _ { 0 } : \lambda = 4.5\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : \lambda \neq 4.5\), based on a single observation of \(G\). The critical region for the test is \(G \leqslant 1\) and \(G \geqslant 9\).
    5. Find the significance level of the test.
    6. Given that \(\lambda = 5.5\), calculate the probability that the test results in a Type II error. 4 The random variable \(Y\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). The results of 40 independent observations of \(Y\) are summarised by $$\Sigma y = 3296.0 , \quad \Sigma y ^ { 2 } = 286800.40$$
    7. Calculate unbiased estimates of \(\mu\) and \(\sigma ^ { 2 }\).
    8. Use your answers to part (i) to estimate the probability that a single random observation of \(Y\) will be less than 60.0.
    9. Explain whether it is necessary to know that \(Y\) is normally distributed in answering part (i) of this question. 5 Over a long period the number of visitors per week to a stately home was known to have the distribution \(\mathrm { N } \left( 500,100 ^ { 2 } \right)\). After higher car parking charges were introduced, a sample of four randomly chosen weeks gave a mean number of visitors per week of 435 . You should assume that the number of visitors per week is still normally distributed with variance \(100 ^ { 2 }\).
    10. Test, at the \(10 \%\) significance level, whether there is evidence that the mean number of visitors per week has fallen.
    11. Explain why it is necessary to assume that the distribution of the number of visitors per week (after the introduction of higher charges) is normal in order to carry out the test. \section*{Jan 2008} 6 The number of house sales per week handled by an estate agent is modelled by the distribution \(\operatorname { Po } ( 3 )\).
    12. Find the probability that, in one randomly chosen week, the number of sales handled is
  3. greater than 4 ,
  4. exactly 4 .
    (ii) Use a suitable approximation to the Poisson distribution to find the probability that, in a year consisting of 50 working weeks, the estate agent handles more than 165 house sales.
    (iii) One of the conditions needed for the use of a Poisson model to be valid is that house sales are independent of one another.
  5. Explain, in non-technical language, what you understand by this condition.
  6. State another condition that is needed. 7 A continuous random variable \(X _ { 1 }\) has probability density function given by $$f ( x ) = \begin{cases} k x & 0 \leqslant x \leqslant 2 \\ 0 & \text { otherwise } \end{cases}$$ where \(k\) is a constant.
    1. Show that \(k = \frac { 1 } { 2 }\).
    2. Sketch the graph of \(y = \mathrm { f } ( x )\).
    3. Find \(\mathrm { E } \left( X _ { 1 } \right)\) and \(\operatorname { Var } \left( X _ { 1 } \right)\).
    4. Sketch the graph of \(y = \mathrm { f } ( x - 1 )\).
    5. The continuous random variable \(X _ { 2 }\) has probability density function \(\mathrm { f } ( x - 1 )\) for all \(x\). Write down the values of \(\mathrm { E } \left( X _ { 2 } \right)\) and \(\operatorname { Var } \left( X _ { 2 } \right)\). 8 Consultations are taking place as to whether a site currently in use as a car park should be developed as a shopping mall. An agency acting on behalf of a firm of developers claims that at least \(65 \%\) of the local population are in favour of the development. In a survey of a random sample of 12 members of the local population, 6 are in favour of the development.
    6. Carry out a test, at the \(10 \%\) significance level, to determine whether the result of the survey is consistent with the claim of the agency.
    7. A local residents' group claims that no more than \(35 \%\) of the local population are in favour of the development. Without further calculations, state with a reason what can be said about the claim of the local residents' group.
    8. A test is carried out, at the \(15 \%\) significance level, of the agency's claim. The test is based on a random sample of size \(2 n\), and exactly \(n\) of the sample are in favour of the development. Find the smallest possible value of \(n\) for which the outcome of the test is to reject the agency's claim.
      [4] \section*{June 2008} 1 The head teacher of a school asks for volunteers from among the pupils to take part in a survey on political interests.
    9. Explain why a sample consisting of all the volunteers is unlikely to give a true picture of the political interests of all pupils in the school.
    10. Describe a better method of obtaining the sample. 2 The annual salaries of employees in a company have mean \(\pounds 30000\) and standard deviation \(\pounds 12000\).
    11. Assuming a normal distribution, calculate the probability that the salary of one randomly chosen employee lies between \(\pounds 20000\) and \(\pounds 24000\).
    12. The salary structure of the company is such that a small number of employees earn much higher salaries than the others. Explain what this suggests about the use of a normal distribution to model the data. 3 In a factory the time, \(T\) minutes, taken by an employee to make a single item is a normally distributed random variable with mean 28.0. A new ventilation system is installed, after which the times taken to produce a random sample of 40 items are measured. The sample mean is 26.44 minutes and it is given that \(\frac { \Sigma t ^ { 2 } } { 40 } - 26.44 ^ { 2 } = 37.05\). Test, at the \(10 \%\) significance level, whether there is evidence of a change in the mean time taken to make an item. 4 The random variable \(U\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\), where the value of \(\sigma\) is known. A test is carried out of the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 50\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu > 50\). The test is carried out at the \(1 \%\) significance level and is based on a random sample of size 10 .
    13. The test is carried out once. The value of the sample mean is 53.0 . The outcome of the test is that \(\mathrm { H } _ { 0 }\) is not rejected. Show that \(\sigma > 4.08\), correct to 3 significant figures.
    14. The test is carried out repeatedly. In each test the actual value of \(\mu\) is 50. Find the probability that the first test to result in a Type I error is the fifth to be carried out. Give your answer correct to 2 significant figures.
    15. A continuous random variable \(X\) has probability density function given by $$f ( x ) = \begin{cases} \frac { 3 } { 4 } \left( 1 - x ^ { 2 } \right) & - 1 \leqslant x \leqslant 1
      \\ 0 & \text { otherwise } \end{cases}$$ The graph of \(y = \mathrm { f } ( x )\) is shown in the diagram.
      \includegraphics[max width=\textwidth, alt={}, center]{b415594b-e304-4d92-934b-0c567beacbd4-14_545_734_575_744} Calculate the value of \(\operatorname { Var } ( X )\).
    16. A continuous random variable \(W\) has probability density function given by $$g ( x ) = \begin{cases} k \left( 9 - x ^ { 2 } \right) & - 3 \leqslant x \leqslant 3
      \\ 0 & \text { otherwise } \end{cases}$$ where \(k\) is a constant.
  7. Sketch the graph of \(y = \mathrm { g } ( x )\).
  8. By comparing the graphs of \(y = \mathrm { f } ( x )\) and \(y = \mathrm { g } ( x )\), explain how you can tell without calculation that \(9 k < \frac { 3 } { 4 }\).
  9. State with a reason, but without calculation, whether the standard deviation of \(W\) is greater than, equal to, or less than that of \(X\).
  10. On average I receive 19 e-mails per ( 8 -hour) working day. Assuming that a Poisson distribution is a valid model, find the probability that in one randomly chosen hour I receive either 3 or 4 e-mails.
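As a hedged illustration of the rate scaling behind the e-mail question above, the following minimal sketch assumes the hourly count is Poisson with mean 19/8 and adds the probabilities of exactly 3 and exactly 4 arrivals. The helper name `poisson_pmf` is illustrative only and is not part of the original question.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Assumption: e-mails arrive at a constant average rate over the 8-hour day,
# so the mean for a single hour is 19 / 8.
hourly_mean = 19 / 8

# "Either 3 or 4" means the two separate probabilities are added.
p_three_or_four = poisson_pmf(3, hourly_mean) + poisson_pmf(4, hourly_mean)
print(round(p_three_or_four, 4))
```

The same helper can be reused for any of the Poisson questions in this list by changing the mean to match the interval in question.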
    1. State the conditions needed to use a Poisson distribution as an approximation to a binomial distribution.
    2. 108 people each throw a pair of fair six-sided dice. Use a Poisson approximation to find the probability that at least 4 people obtain a double six.
    7 Wendy analyses the number of 'dropped catches' in international cricket matches. She finds that the mean number of dropped catches per day is 2. In a recent 5-day match she found that there was a total of \(c\) dropped catches. She tests, at the \(5 \%\) significance level, whether the mean number of dropped catches per day has increased.
    3. State conditions needed for the number of dropped catches per day to be well modelled by a Poisson distribution. Assume now that these conditions hold.
    4. Find the probability that the test results in a Type I error.
    5. Given that \(c = 14\), carry out the test. 8 A company sponsors a series of concerts. Surveys show that on average \(40 \%\) of audience members know the name of the sponsor. As this figure is thought to be disappointingly low, the publicity material is redesigned.
    6. After the publicity material has been redesigned, a random sample of 12 audience members is obtained, and it is found that 9 members of this sample know the name of the sponsor. Test, at the \(5 \%\) significance level, whether there is evidence that the proportion of audience members who know the name of the sponsor has increased.
    7. A more detailed \(5 \%\) hypothesis test is carried out, based on a random sample of size 400. This test produces significant evidence that the proportion of audience members knowing the name of the sponsor has increased. Using an appropriate approximation, calculate the smallest possible number of audience members in the sample of 400 who know the name of the sponsor.
    1 A newspaper article consists of 800 words. For each word, the probability that it is misprinted is 0.005, independently of all other words. Use a suitable approximation to find the probability that the total number of misprinted words in the article is no more than 6. Give a reason to justify your approximation.
    2 The continuous random variable \(Y\) has the distribution \(\mathrm { N } \left( 23.0 , 5.0 ^ { 2 } \right)\). The mean of \(n\) observations of \(Y\) is denoted by \(\bar { Y }\). It is given that \(\mathrm { P } ( \bar { Y } > 23.625 ) = 0.0228\). Find the value of \(n\).
    3 The number of incidents of radio interference per hour experienced by a certain listener is modelled by a random variable with distribution \(\operatorname { Po } ( 0.42 )\).
    (i) Find the probability that the number of incidents of interference in one randomly chosen hour is
    (a) 0,
    (b) exactly 1.
    (ii) Find the probability that the number of incidents in a randomly chosen 5-hour period is greater than 3 .
    (iii) One hundred hours of listening are monitored and the numbers of 1 -hour periods in which 0,1 , \(2 , \ldots\) incidents of interference are experienced are noted. A bar chart is drawn to represent the results. Without any further calculations, sketch the shape that you would expect for the bar chart. (There is no need to use an exact numerical scale on the frequency axis.) 4 A television company believes that the proportion of adults who watched a certain programme is 0.14 . Out of a random sample of 22 adults, it is found that 2 watched the programme.
    (i) Carry out a significance test, at the \(10 \%\) level, to determine, on the basis of this sample, whether the television company is overestimating the proportion of adults who watched the programme.
    [8]
    (ii) The sample was selected randomly. State what properties of this method of sampling are needed to justify the use of the distribution used in your test. 5 The continuous random variables \(S\) and \(T\) have probability density functions as follows. $$\begin{array} { l l } S : & \mathrm { f } ( x ) = \begin{cases} \frac { 1 } { 4 } & - 2 \leqslant x \leqslant 2
    \\ 0 & \text { otherwise } \end{cases} \\
    T : & \mathrm { g } ( x ) = \begin{cases} \frac { 5 } { 64 } x ^ { 4 } & - 2 \leqslant x \leqslant 2 \\
    0 & \text { otherwise } \end{cases} \end{array}$$ (i) Sketch, on the same axes, the graphs of f and g.
    (ii) Describe in everyday terms the difference between the distributions of the random variables \(S\) and \(T\). (Answers that comment only on the shapes of the graphs will receive no credit.)
    (iii) Calculate the variance of \(T\).
    Jan 2009
    6 The weight of a plastic box manufactured by a company is \(W\) grams, where \(W \sim \mathrm {~N} ( \mu , 20.25 )\). A significance test of the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 50.0\), against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu \neq 50.0\), is carried out at the \(5 \%\) significance level, based on a sample of size \(n\).
    (i) Given that \(n = 81\),
    (a) find the critical region for the test, in terms of the sample mean \(\bar { W }\),
    (b) find the probability that the test results in a Type II error when \(\mu = 50.2\).
    (ii) State how the probability of this Type II error would change if \(n\) were greater than 81 . 7 A motorist records the time taken, \(T\) minutes, to drive a particular stretch of road on each of 64 occasions. Her results are summarised by $$\Sigma t = 876.8 , \quad \Sigma t ^ { 2 } = 12657.28$$ (i) Test, at the \(5 \%\) significance level, whether the mean time for the motorist to drive the stretch of road is greater than 13.1 minutes.
    (ii) Explain whether it is necessary to use the Central Limit Theorem in your test. 8 A sales office employs 21 representatives. Each day, for each representative, the probability that he or she achieves a sale is 0.7 , independently of other representatives. The total number of representatives who achieve a sale on any one day is denoted by \(K\).
    (i) Using a suitable approximation (which should be justified), find \(\mathrm { P } ( K \geqslant 16 )\).
    (ii) Using a suitable approximation (which should be justified), find the probability that the mean of 36 observations of \(K\) is less than or equal to 14.0.
    June 2009
    1 The random variable \(H\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). It is given that \(\mathrm { P } ( H < 105.0 ) = 0.2420\) and \(\mathrm { P } ( H > 110.0 ) = 0.6915\). Find the values of \(\mu\) and \(\sigma\), giving your answers to a suitable degree of accuracy.
    2 The random variable \(D\) has the distribution \(\operatorname { Po } ( 20 )\). Using an appropriate approximation, which should be justified, calculate \(\mathrm { P } ( D \geqslant 25 )\).
    3 An electronics company is developing a new sound system. The company claims that \(60 \%\) of potential buyers think that the system would be good value for money. In a random sample of 12 potential buyers, 4 thought that it would be good value for money. Test, at the \(5 \%\) significance level, whether the proportion claimed by the company is too high.
    4 A survey is to be carried out to draw conclusions about the proportion \(p\) of residents of a town who support the building of a new supermarket. It is proposed to carry out the survey by interviewing a large number of people in the high street of the town, which attracts a large number of tourists.
    (i) Give two different reasons why this proposed method is inappropriate.
    (ii) Suggest a good method of carrying out the survey.
    (iii) State two statistical properties of your survey method that would enable reliable conclusions about \(p\) to be drawn. 5 In a large region of derelict land, bricks are found scattered in the earth.
    (i) State two conditions needed for the number of bricks per cubic metre to be modelled by a Poisson distribution. Assume now that the number of bricks in 1 cubic metre of earth can be modelled by the distribution Po(3).
    (ii) Find the probability that the number of bricks in 4 cubic metres of earth is between 8 and 14 inclusive.
    (iii) Find the size of the largest volume of earth for which the probability that no bricks are found is at least 0.4. 6 The continuous random variable \(R\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). The results of 100 observations of \(R\) are summarised by $$\Sigma r = 3360.0 , \quad \Sigma r ^ { 2 } = 115782.84$$ (i) Calculate an unbiased estimate of \(\mu\) and an unbiased estimate of \(\sigma ^ { 2 }\).
    (ii) The mean of 9 observations of \(R\) is denoted by \(\bar { R }\). Calculate an estimate of \(\mathrm { P } ( \bar { R } > 32.0 )\).
    (iii) Explain whether you need to use the Central Limit Theorem in your answer to part (ii).
    7 The continuous random variable \(X\) has probability density function given by $$f ( x ) = \begin{cases} \frac { 2 } { 9 } x ( 3 - x ) & 0 \leqslant x \leqslant 3
    \\ 0 & \text { otherwise } \end{cases}$$ (i) Find the variance of \(X\).
    (ii) Show that the probability that a single observation of \(X\) lies between 0.0 and 0.5 is \(\frac { 2 } { 27 }\).
    (iii) 108 observations of \(X\) are obtained. Using a suitable approximation, find the probability that at least 10 of the observations lie between 0.0 and 0.5 .
    (iv) The mean of 108 observations of \(X\) is denoted by \(\bar { X }\). Write down the approximate distribution of \(\bar { X }\), giving the value(s) of any parameter(s). 8 In a large company the time taken for an employee to carry out a certain task is a normally distributed random variable with mean 78.0 s and unknown variance. A new training scheme is introduced and after its introduction the times taken by a random sample of 120 employees are recorded. The mean time for the sample is 76.4 s and an unbiased estimate of the population variance is \(68.9 \mathrm {~s} ^ { 2 }\).
    (i) Test, at the \(1 \%\) significance level, whether the mean time taken for the task has changed.
    (ii) It is required to redesign the test so that the probability of making a Type I error is less than 0.01 when the sample mean is 77.0 s . Calculate an estimate of the smallest sample size needed, and explain why your answer is only an estimate. 1 The values of 5 independent observations from a population can be summarised by $$\Sigma x = 75.8 , \quad \Sigma x ^ { 2 } = 1154.58 .$$ Find unbiased estimates of the population mean and variance. 2 A college has 400 students. A journalist wants to carry out a survey about food preferences and she obtains a sample of 30 pupils from the college by the following method.
    • Obtain a list of all the students.
    • Number the students, with numbers running sequentially from 0 to 399.
    • Select 30 random integers in the range 000 to 999 inclusive. If a random integer is in the range 0 to 399 , then the student with that number is selected. If the number is greater than 399 , then 400 is subtracted from the number (if necessary more than once) until an answer in the range 0 to 399 is selected, and the student with that number is selected.
      1. Explain why this method is unsatisfactory.
      2. Explain how it could be improved.
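One way to examine the sampling method described above is to count how many of the 1000 equally likely three-digit integers lead to each student number. The short sketch below is an illustration only, not part of the original question; it assumes that repeatedly subtracting 400 is equivalent to taking the remainder on division by 400, and the variable name `chances` is chosen here for convenience.

```python
from collections import Counter

# Each random integer 000..999 is reduced by subtracting 400 until it lies in
# 0..399, which gives the same result as the remainder modulo 400.
chances = Counter(n % 400 for n in range(1000))

# Number of random integers that select students 0, 199, 200 and 399.
print(chances[0], chances[199], chances[200], chances[399])  # prints: 3 3 2 2
```

Under these assumptions the students numbered 0 to 199 can each be reached from three random integers, while those numbered 200 to 399 can be reached from only two, so the selection is not equally likely for every student.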
    3 In a large town, \(35 \%\) of the inhabitants have access to television channel \(C\). A random sample of 60 inhabitants is obtained. Use a suitable approximation to find the probability that 18 or fewer inhabitants in the sample have access to channel \(C\).
    4 80 randomly chosen people are asked to estimate a time interval of 60 seconds without using a watch or clock. The mean of the 80 estimates is 58.9 seconds. Previous evidence shows that the population standard deviation of such estimates is 5.0 seconds. Test, at the \(5 \%\) significance level, whether there is evidence that people tend to underestimate the time interval.
    5 The number of customers arriving at a store between 8.50 am and 9 am on Saturday mornings is a random variable which can be modelled by the distribution \(\operatorname { Po } ( 11.0 )\). Following a series of price cuts, on one particular Saturday morning 19 customers arrive between 8.50 am and 9 am. The store's management claims, first, that the mean number of customers has increased, and second, that this is due to the price cuts.
    (i) Test the first part of the claim, at the \(5 \%\) significance level.
    (ii) Comment on the second part of the claim. 6 The continuous random variable \(X\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\).
    (i) Each of the three following sets of probabilities is impossible. Give a reason in each case why the probabilities cannot both be correct. (You should not attempt to find \(\mu\) or \(\sigma\).)
    (a) \(\mathrm { P } ( X > 50 ) = 0.7\) and \(\mathrm { P } ( X < 50 ) = 0.2\)
    (b) \(\mathrm { P } ( X > 50 ) = 0.7\) and \(\mathrm { P } ( X > 70 ) = 0.8\)
    (c) \(\mathrm { P } ( X > 50 ) = 0.3\) and \(\mathrm { P } ( X < 70 ) = 0.3\)
    (ii) Given that \(\mathrm { P } ( X > 50 ) = 0.7\) and \(\mathrm { P } ( X < 70 ) = 0.7\), find the values of \(\mu\) and \(\sigma\). 7 The continuous random variable \(T\) is equally likely to take any value from 5.0 to 11.0 inclusive.
    (i) Sketch the graph of the probability density function of \(T\).
    (ii) Write down the value of \(\mathrm { E } ( T )\) and find by integration the value of \(\operatorname { Var } ( T )\).
    (iii) A random sample of 48 observations of \(T\) is obtained. Find the approximate probability that the mean of the sample is greater than 8.3, and explain why the answer is an approximation. 8 The random variable \(R\) has the distribution \(\mathrm { B } ( 10 , p )\). The null hypothesis \(\mathrm { H } _ { 0 } : p = 0.7\) is to be tested against the alternative hypothesis \(\mathrm { H } _ { 1 } : p < 0.7\), at a significance level of \(5 \%\).
    (i) Find the critical region for the test and the probability of making a Type I error.
    (ii) Given that \(p = 0.4\), find the probability that the test results in a Type II error.
    (iii) Given that \(p\) is equally likely to take the values 0.4 and 0.7 , find the probability that the test results in a Type II error. 9 Buttercups in a meadow are distributed independently of one another and at a constant average incidence of 3 buttercups per square metre.
    (i) Find the probability that in 1 square metre there are more than 7 buttercups.
    (ii) Find the probability that in 4 square metres there are either 13 or 14 buttercups.
    (iii) Use a suitable approximation to find the probability that there are no more than 69 buttercups in 20 square metres.
    (iv) (a) Without using an approximation, find an expression for the probability that in \(m\) square metres there are at least 2 buttercups.
    (b) It is given that the probability that there are at least 2 buttercups in \(m\) square metres is 0.9. Using your answer to part (a), show numerically that \(m\) lies between 1.29 and 1.3.
\section*{June 2010}
    1 (i) The number of inhabitants of a village who are selected for jury service in the course of a 10-year period is a random variable with the distribution \(\operatorname { Po } ( 4.2 )\).
    (a) Find the probability that in the course of a 10-year period, at least 7 inhabitants are selected for jury service.
    (b) Find the probability that in 1 year, exactly 2 inhabitants are selected for jury service.
    (ii) Explain why the number of inhabitants of the village who contract influenza in 1 year can probably not be well modelled by a Poisson distribution. 2 A university has a large number of students, of whom \(35 \%\) are studying science subjects. A sample of 10 students is obtained by listing all the students, giving each a serial number and selecting by using random numbers.
    (i) Find the probability that fewer than 3 of the sample are studying science subjects.
    (ii) It is required that, in selecting the sample, the same student is not selected twice. Explain whether this requirement invalidates your calculation in part (i). 3 Tennis balls are dropped from a standard height, and the height of bounce, \(H \mathrm {~cm}\), is measured. \(H\) is a random variable with the distribution \(\mathrm { N } \left( 40 , \sigma ^ { 2 } \right)\). It is given that \(\mathrm { P } ( H < 32 ) = 0.2\).
    (i) Find the value of \(\sigma\).
    (ii) 90 tennis balls are selected at random. Use an appropriate approximation to find the probability that more than 19 have \(H < 32\). 4 The proportion of commuters in a town who travel to work by train is 0.4 . Following the opening of a new station car park, a random sample of 16 commuters is obtained, and 11 of these travel to work by train. Test at the \(1 \%\) significance level whether there is evidence of an increase in the proportion of commuters in this town who travel to work by train. 5 The time \(T\) seconds needed for a computer to be ready to use, from the moment it is switched on, is a normally distributed random variable with standard deviation 5 seconds. The specification of the computer says that the population mean time should be not more than 30 seconds.
    (i) A test is carried out, at the \(5 \%\) significance level, of whether the specification is being met, using the mean \(\bar { t }\) of a random sample of 10 times.
    (a) Find the critical region for the test, in terms of \(\bar { t }\).
    (b) Given that the population mean time is in fact 35 seconds, find the probability that the test results in a Type II error.
    (ii) Because of system degradation and memory load, the population mean time \(\mu\) seconds increases with the number of months of use, \(m\). A formula for \(\mu\) in terms of \(m\) is \(\mu = 20 + 0.6 m\). Use this formula to find the value of \(m\) for which the probability that the test results in rejection of the null hypothesis is 0.5 . 6
    (i) The random variable \(D\) has the distribution \(\operatorname { Po } ( 24 )\). Use a suitable approximation to find \(\mathrm { P } ( D > 30 )\).
    (ii) An experiment consists of 200 trials. For each trial, the probability that the result is a success is 0.98, independent of all other trials. The total number of successes is denoted by \(E\).
    (a) Explain why the distribution of \(E\) cannot be well approximated by a Poisson distribution.
    (b) By considering the number of failures, use an appropriate Poisson approximation to find \(\mathrm { P } ( E \leqslant 194 )\).
    7 A machine is designed to make paper with mean thickness 56.80 micrometres. The thicknesses, \(x\) micrometres, of a random sample of 300 sheets are summarised by $$n = 300 , \quad \Sigma x = 17085.0 , \quad \Sigma x ^ { 2 } = 973847.0 .$$ Test, at the \(10 \%\) significance level, whether the machine is producing paper of the designed thickness.
      [11]
    8 The continuous random variable \(X\) has probability density function given by $$\mathrm { f } ( x ) = \begin{cases} k x ^ { - a } & x \geqslant 1
      \\ 0 & \text { otherwise } \end{cases}$$ where \(k\) and \(a\) are constants and \(a\) is greater than 1.
    3. Show that \(k = a - 1\).
    4. Find the variance of \(X\) in the case \(a = 4\).
    5. It is given that \(\mathrm { P } ( X < 2 ) = 0.9\). Find the value of \(a\), correct to 3 significant figures.
    1 A random sample of nine observations of a random variable is obtained. The results are summarised as $$\Sigma x = 468 , \quad \Sigma x ^ { 2 } = 24820 .$$ Calculate unbiased estimates of the population mean and variance.
    2 The random variable \(H\) has the distribution \(\mathrm { N } \left( \mu , 5 ^ { 2 } \right)\). The mean of a sample of \(n\) observations of \(H\) is denoted by \(\bar { H }\). It is given that \(\mathrm { P } ( \bar { H } > 53.28 ) = 0.0250\) and \(\mathrm { P } ( \bar { H } < 51.65 ) = 0.0968\), both correct to 4 decimal places. Find the values of \(\mu\) and \(n\).
    3 The probability that a randomly chosen PPhone has a faulty casing is 0.0228. A random sample of 200 PPhones is obtained. Use a suitable approximation to find the probability that the number of PPhones in the sample with a faulty casing is 2 or fewer. Justify your approximation.
    4 The continuous random variable \(X\) has mean \(\mu\) and standard deviation 45. A significance test is to be carried out of the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 230\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu \neq 230\), at the \(1 \%\) significance level. A random sample of size 50 is obtained, and the sample mean is found to be 213.4.
    6. Carry out the test.
    7. Explain whether it is necessary to use the Central Limit Theorem in your test.
    5 A temporary job is advertised annually. The number of applicants for the job is a random variable which is known from many years' experience to have a distribution \(\operatorname { Po } ( 12 )\). In 2010 there were 19 applicants for the job. Test, at the \(10 \%\) significance level, whether there is evidence of an increase in the mean number of applicants for the job.
    6 The number of randomly occurring events in a given time interval is denoted by \(R\). In order that \(R\) is well modelled by a Poisson distribution, it is necessary that events occur independently.
    8. Let \(R\) represent the number of customers dining at a restaurant on a randomly chosen weekday lunchtime. Explain what the condition 'events occur independently' means in this context, and give a reason why it would probably not hold in this context. Let \(D\) represent the number of tables booked at the restaurant on a randomly chosen day. Assume that \(D\) can be well modelled by distribution \(\operatorname { Po } ( 7 )\).
    9. Find \(\mathrm { P } ( D < 5 )\).
    10. Use a suitable approximation to find the probability that, in five randomly chosen days, the total number of tables booked is greater than 40 . 7 Two continuous random variables \(S\) and \(T\) have probability density functions \(\mathrm { f } _ { S }\) and \(\mathrm { f } _ { T }\) given respectively by $$\begin{aligned} & \mathrm { f } _ { S } ( x ) = \begin{cases} \frac { a } { x ^ { 2 } } & 1 \leqslant x \leqslant 3
      \\ 0 & \text { otherwise } \end{cases} \\
      & \mathrm { f } _ { T } ( x ) = \begin{cases} b & 1 \leqslant x \leqslant 3 \\
      0 & \text { otherwise } \end{cases} \end{aligned}$$ where \(a\) and \(b\) are constants.
    11. Sketch on the same axes the graphs of \(y = \mathrm { f } _ { S } ( x )\) and \(y = \mathrm { f } _ { T } ( x )\).
    12. Find the value of \(a\).
    13. Find \(\mathrm { E } ( S )\).
    14. A student gave the following description of the distribution of \(T\) : "The probability that \(T\) occurs is constant". Give an improved description, in everyday terms. 8 A company has 3600 employees, of whom \(22.5 \%\) live more than 30 miles from their workplace. A random sample of 40 employees is obtained.
    15. Use a suitable approximation, which should be justified, to find the probability that more than 5 of the employees in the sample live more than 30 miles from their workplace.
    16. Describe how to use random numbers to select a sample of 40 from a population of 3600 employees. 9 A pharmaceutical company is developing a new drug to treat a certain disease. The company will continue to develop the drug if the proportion \(p\) of those who have the disease and show a substantial improvement after treatment is greater than 0.7 . The company carries out a test, at the \(5 \%\) significance level, on a random sample of 14 patients who suffer from the disease.
    17. Find the critical region for the test.
    18. Given that 12 of the 14 patients in the sample show a substantial improvement, carry out the test.
    19. Find the probability that the test results in a Type II error if in fact \(p = 0.8\).
    1 In Fisher Avenue there are 263 houses, numbered 1 to 263. Explain how to obtain a random sample of 20 of these houses.
    2 The random variable \(Y\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). It is given that $$\mathrm { P } ( Y < 48.0 ) = \mathrm { P } ( Y > 57.0 ) = 0.0668 .$$ Find the value \(y _ { 0 }\) such that \(\mathrm { P } \left( Y > y _ { 0 } \right) = 0.05\).
    3 The random variable \(X\) has the distribution \(\mathrm { N } \left( \mu , 5 ^ { 2 } \right)\). A hypothesis test is carried out of \(\mathrm { H } _ { 0 } : \mu = 20.0\) against \(\mathrm { H } _ { 1 } : \mu < 20.0\), at the \(1 \%\) level of significance, based on the mean of a sample of size 16. Given that in fact \(\mu = 15.0\), find the probability that the test results in a Type II error.
    4 A continuous random variable \(X\) has probability density function $$f ( x ) = \begin{cases} \frac { 3 } { 16 } ( x - 2 ) ^ { 2 } & 0 \leqslant x \leqslant 4
      \\ 0 & \text { otherwise } \end{cases}$$
    20. Sketch the graph of \(y = \mathrm { f } ( x )\).
    21. Calculate the variance of \(X\).
    22. A student writes " \(X\) is more likely to occur when \(x\) takes values further away from 2 ". Explain whether you agree with this statement. 5 A travel company finds from its records that \(40 \%\) of its customers book with travel agents. The company redesigns its website, and then carries out a survey of 10 randomly chosen customers. The result of the survey is that 1 of these customers booked with a travel agent.
    23. Test at the \(5 \%\) significance level whether the percentage of customers who book with travel agents has decreased.
    24. The managing director says that "Our redesigned website has resulted in a decrease in the percentage of our customers who book with travel agents." Comment on this statement. 6 Records show that before the year 1990 the maximum daily temperature \(T ^ { \circ } \mathrm { C }\) at a seaside resort in August can be modelled by a distribution with mean 24.3. The maximum temperatures of a random sample of 50 August days since 1990 can be summarised by $$n = 50 , \quad \Sigma t = 1314.0 , \quad \Sigma t ^ { 2 } = 36602.17 .$$
    25. Test, at the \(1 \%\) significance level, whether there is evidence of a change in the mean maximum daily temperature in August since 1990.
    26. Give a reason why it is possible to use the Central Limit Theorem in your test. 7 The number of customer complaints received by a company per day is denoted by \(X\). Assume that \(X\) has the distribution \(\operatorname { Po } ( 2.2 )\).
    27. In a week of 5 working days, the probability there are at least \(n\) customer complaints is 0.146 correct to 3 significant figures. Use tables to find the value of \(n\).
    28. Use a suitable approximation to find the probability that in a period of 20 working days there are fewer than 38 customer complaints. A week of 5 working days in which at least \(n\) customer complaints are received, where \(n\) is the value found in part (i), is called a 'bad' week.
    29. Use a suitable approximation to find the probability that, in 40 randomly chosen weeks, more than 7 are bad. 8
  25. A group of students is discussing the conditions that are needed if a Poisson distribution is to be a good model for the number of telephone calls received by a fire brigade on a working day.
    1. Alice says "Events must be independent". Explain why this condition may not hold in this context.
    2. State a different condition that is needed. Explain whether it is likely to hold in this context.
  26. The random variables \(R , S\) and \(T\) have independent Poisson distributions with means \(\lambda , \mu\) and \(\lambda + \mu\) respectively.
    1. In the case \(\lambda = 2.74\), find \(\mathrm { P } ( R > 2 )\).
    2. In the case \(\lambda = 2\) and \(\mu = 3\), find \(\mathrm { P } ( R = 0\) and \(S = 1 ) + \mathrm { P } ( R = 1\) and \(S = 0 )\). Give your answer correct to 4 decimal places.
    3. In the general case, show algebraically that $$\mathrm { P } ( R = 0 \text { and } S = 1 ) + \mathrm { P } ( R = 1 \text { and } S = 0 ) = \mathrm { P } ( T = 1 ) .$$
    1 A random sample of 50 observations of the random variable \(X\) is summarised by $$n = 50 , \quad \Sigma x = 182.5 , \quad \Sigma x ^ { 2 } = 739.625 .$$ Calculate unbiased estimates of the expectation and variance of \(X\).
    2 The random variable \(Y\) has the distribution \(\mathrm { B } ( 140 , 0.03 )\). Use a suitable approximation to find \(\mathrm { P } ( Y = 5 )\). Justify your approximation.
    3 The random variable \(G\) has a normal distribution. It is known that $$\mathrm { P } ( G < 56.2 ) = \mathrm { P } ( G > 63.8 ) = 0.1 .$$ Find \(\mathrm { P } ( G > 65 )\).
    4 The discrete random variable \(H\) takes values 1, 2, 3 and 4. It is given that \(\mathrm { E } ( H ) = 2.5\) and \(\operatorname { Var } ( H ) = 1.25\). The mean of a random sample of 50 observations of \(H\) is denoted by \(\bar { H }\). Use a suitable approximation to find \(\mathrm { P } ( \bar { H } < 2.6 )\).
    5 (i) Six prizes are allocated, using random numbers, to a group of 12 girls and 8 boys. Calculate the probability that exactly 4 of the prizes are allocated to girls if
    (a) the same child may win more than one prize,
    (b) no child may win more than one prize.
    (ii) Sixty prizes are allocated, using random numbers, to a group of 1200 girls and 800 boys. Use a suitable approximation to calculate the probability that at least 30 of the prizes are allocated to girls. Does it affect your calculation whether or not the same child may win more than one prize? Justify your answer.
    6 The number of fruit pips in 1 cubic centimetre of raspberry jam has the distribution \(\operatorname { Po } ( \lambda )\). Under a traditional jam-making process it is known that \(\lambda = 6.3\). A new process is introduced and a random sample of 1 cubic centimetre of jam produced by the new process is found to contain 2 pips. Test, at the \(5 \%\) significance level, whether this is evidence that under the new process the average number of pips has been reduced.
    7 (i) The continuous random variable \(X\) has the probability density function $$f ( x ) = \begin{cases} \frac { 1 } { 2 \sqrt { x } } & 1 \leqslant x \leqslant 4 \\
    0 & \text { otherwise } \end{cases}$$ Find
    (a) \(\mathrm { E } ( X )\),
    (b) the median of \(X\).
    (ii) The continuous random variable \(Y\) has the probability density function $$g ( y ) = \left\{ \begin{array} { l r } \frac { 1.5 } { y ^ { 2.5 } } & y \geqslant 1
    0 & \text { otherwise. } \end{array} \right.$$ Given that \(\mathrm { E } ( Y ) = 3\), show that \(\operatorname { Var } ( Y )\) is not finite. 8 In a certain fluid, bacteria are distributed randomly and occur at a constant average rate of 2.5 in every 10 ml of the fluid.
    (i) State a further condition needed for the number of bacteria in a fixed volume of the fluid to be well modelled by a Poisson distribution, explaining what your answer means. Assume now that a Poisson model is appropriate.
    (ii) Find the probability that in 10 ml there are at least 5 bacteria.
    (iii) Find the probability that in 3.7 ml there are exactly 2 bacteria.
    (iv) Use a suitable approximation to find the probability that in 1000 ml there are fewer than 240 bacteria, justifying your approximation.
    9 It is desired to test whether the average amount of sleep obtained by school pupils in Year 11 is 8 hours, based on a random sample of size 64. The population standard deviation is 0.87 hours and the sample mean is denoted by \(\bar { H }\). The critical values for the test are \(\bar { H } = 7.72\) and \(\bar { H } = 8.28\).
    (i) State appropriate hypotheses for the test, explaining the meaning of any symbol you use.
    (ii) Calculate the significance level of the test.
    (iii) Explain what is meant by a Type I error in this context.
    (iv) Given that in fact the average amount of sleep obtained by all pupils in Year 11 is 7.9 hours, find the probability that the test results in a Type II error.
    1 In one day's production, a machine produces 1000 CDs. Explain how to take a random sample of 15 CDs chosen from one day's production.
    2 (i) For the continuous random variable \(V\), it is known that \(\mathrm { E } ( V ) = 72.0\). The mean of a random sample of 40 observations of \(V\) is denoted by \(\bar { V }\). Given that \(\mathrm { P } ( \bar { V } < 71.2 ) = 0.35\), estimate the value of \(\operatorname { Var } ( V )\).
    (ii) Explain why you need to use the Central Limit Theorem in part (i), and why its use is justified.
    3 It is known that on average one person in three prefers the colour of a certain object to be blue. In a psychological test, 12 randomly chosen people were seated in a room with blue walls, and asked to state independently which colour they preferred for the object. Seven of the 12 people said that they preferred blue. Carry out a significance test, at the \(5 \%\) level, of whether the statement "on average one person in three prefers the colour of the object to be blue" is true for people who are seated in a room with blue walls.
    4 In a rock, small crystal formations occur at a constant average rate of 3.2 per cubic metre.
    (i) State a further assumption needed to model the number of crystal formations in a fixed volume of rock by a Poisson distribution. In the remainder of the question, you should assume that a Poisson model is appropriate.
    (ii) Calculate the probability that in one cubic metre of rock there are exactly 5 crystal formations.
    (iii) Calculate the probability that in 0.74 cubic metres of rock there are at least 3 crystal formations.
    (iv) Use a suitable approximation to calculate the probability that in 10 cubic metres of rock there are at least 36 crystal formations.
    5 The acidity \(A\) (measured in pH) of soil of a particular type has a normal distribution. The pH values of a random sample of 80 soil samples from a certain region can be summarised as $$\Sigma a = 496 , \quad \Sigma a ^ { 2 } = 3126 .$$ Test, at the \(10 \%\) significance level, whether in this region the mean pH of soil is 6.1.
    6 At a tourist car park, a survey is made of the regions from which cars come.
    (i) It is given that \(40 \%\) of cars come from the London region. Use a suitable approximation to find the probability that, in a random sample of 32 cars, more than 17 come from the London region. Justify your approximation.
    (ii) It is given that \(1 \%\) of cars come from France. Use a suitable approximation to find the probability that, in a random sample of 90 cars, exactly 3 come from France.
    \section*{June 2012}
    7 The continuous random variable \(X\) has probability density function $$f ( x ) = \begin{cases} k x ^ { 2 } & 0 \leqslant x \leqslant a \\ 0 & \text { otherwise } \end{cases}$$ where \(a\) and \(k\) are constants.
    (i) Sketch the graph of \(y = \mathrm { f } ( x )\) and explain in non-technical language what this tells you about \(X\).
    (ii) Given that \(\mathrm { E } ( X ) = 4.5\), find
      (a) the value of \(a\),
      (b) \(\operatorname { Var } ( X )\).
    8 The random variable \(X\) has the distribution \(\mathrm { N } \left( \mu , 8 ^ { 2 } \right)\). A test is carried out, at the \(5 \%\) significance level, of \(\mathrm { H } _ { 0 } : \mu = 30\) against \(\mathrm { H } _ { 1 } : \mu > 30\), based on a random sample of size 18.
    (i) Find the critical region for the test.
    (ii) If \(\mu = 30\) and the outcome of the test is that \(\mathrm { H } _ { 0 }\) is rejected, state the type of error that is made.
    On a particular day this test is carried out independently a total of 20 times, and for 4 of these tests the outcome is that \(\mathrm { H } _ { 0 }\) is rejected. It is known that the value of \(\mu\) remains the same throughout these 20 tests.
    (iii) Find the probability that \(\mathrm { H } _ { 0 }\) is rejected at least 4 times if \(\mu = 30\). Hence state whether you think that \(\mu = 30\), giving a reason.
    (iv) Given that the probability of making an error of the type different from that stated in part (ii) is 0.4, calculate the actual value of \(\mu\), giving your answer correct to 4 significant figures.
    1 A random variable has the distribution \(\mathrm { B } ( n , p )\). It is required to test \(\mathrm { H } _ { 0 } : p = \frac { 2 } { 3 }\) against \(\mathrm { H } _ { 1 } : p < \frac { 2 } { 3 }\) at a significance level as close to \(1 \%\) as possible, using a sample of size \(n = 8, 9\) or 10. Use tables to find which value of \(n\) gives such a test, stating the critical region for the test and the corresponding significance level.
    2 A random variable \(C\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). A random sample of 10 observations of \(C\) is obtained, and the results are summarised as $$n = 10 , \Sigma c = 380 , \Sigma c ^ { 2 } = 14602 .$$
    (i) Calculate unbiased estimates of \(\mu\) and \(\sigma ^ { 2 }\).
    (ii) Hence calculate an estimate of the probability that \(C > 40\).
    3 A factory produces 9000 music DVDs each day. A random sample of 100 such DVDs is obtained.
    (i) Explain how to obtain this sample using random numbers.
    (ii) Given that \(24 \%\) of the DVDs produced by the factory are classical, use a suitable approximation to find the probability that, in the sample of 100 DVDs, fewer than 20 are classical.
    4 A continuous random variable \(X\) has probability density function $$f ( x ) = \left\{ \begin{array} { c l } k x & 0 \leqslant x \leqslant a \\ 0 & \text { otherwise } \end{array} \right.$$ where \(k\) and \(a\) are constants.
    (i) State what the letter \(x\) represents.
    (ii) Find \(k\) in terms of \(a\).
    (iii) Find \(\operatorname { Var } ( X )\) in terms of \(a\).
    5 In a mine, a deposit of the substance pitchblende emits radioactive particles. The number of particles emitted has a Poisson distribution with mean 70 particles per second. The warning level is reached if the total number of particles emitted in one minute is more than 4350.
    (i) A one-minute period is chosen at random. Use a suitable approximation to show that the probability that the warning level is reached during this period is 0.01, correct to 2 decimal places. You should calculate the answer correct to 4 decimal places.
    (ii) Use a suitable approximation to find the probability that in 30 one-minute periods the warning level is reached on at least 4 occasions. (You should use the given rounded value of 0.01 from part (i) in your calculation.)
    6 Gordon is a cricketer. Over a long period he knows that his population mean score, in number of runs per innings, is 28, and the population standard deviation is 12. In a new season he adopts a different batting style and he finds that in 30 innings using this style his mean score is 28.98.
    (i) Stating a necessary assumption, test at the \(5 \%\) significance level whether his population mean score has increased.
    (ii) Explain whether it was necessary to use the Central Limit Theorem in part (i).
    7 The continuous random variable \(X\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). The mean of a random sample of \(n\) observations of \(X\) is denoted by \(\bar { X }\). It is given that \(\mathrm { P } ( \bar { X } < 35.0 ) = 0.9772\) and \(\mathrm { P } ( \bar { X } < 20.0 ) = 0.1587\).
    (i) Obtain a formula for \(\sigma\) in terms of \(n\).
    Two students are discussing this question. Aidan says "If you were told another probability, for instance \(\mathrm { P } ( \bar { X } > 32 ) = 0.1\), you could work out the value of \(\sigma\)." Binya says, "No, the value of \(\mathrm { P } ( \bar { X } > 32 )\) is fixed by the information you know already."
    (ii) State which of Aidan and Binya is right. If you think that Aidan is right, calculate the value of \(\sigma\) given that \(\mathrm { P } ( \bar { X } > 32 ) = 0.1\). If you think that Binya is right, calculate the value of \(\mathrm { P } ( \bar { X } > 32 )\).
    8 In a large city the number of traffic lights that fail in one day of 24 hours is denoted by \(Y\). It may be assumed that failures occur randomly.
    (i) Explain what the statement "failures occur randomly" means.
    (ii) State, in context, two different conditions that must be satisfied if \(Y\) is to be modelled by a Poisson distribution, and for each condition explain whether you think it is likely to be met in this context.
    (iii) For this part you may assume that \(Y\) is well modelled by the distribution \(\operatorname { Po } ( \lambda )\). It is given that \(\mathrm { P } ( Y = 7 ) = \mathrm { P } ( Y = 8 )\). Use an algebraic method to calculate the value of \(\lambda\) and hence calculate the corresponding value of \(\mathrm { P } ( Y = 7 )\).
    9 The random variable \(A\) has the distribution \(\mathrm { B } ( 30 , p )\). A test is carried out of the hypotheses \(\mathrm { H } _ { 0 } : p = 0.6\) against \(\mathrm { H } _ { 1 } : p < 0.6\). The critical region is \(A \leqslant 13\).
    (i) State the probability that \(\mathrm { H } _ { 0 }\) is rejected when \(p = 0.6\).
    (ii) Find the probability that a Type II error occurs when \(p = 0.5\).
    (iii) It is known that on average \(p = 0.5\) on one day in five, and on other days the value of \(p\) is 0.6. On each day two tests are carried out. If the result of the first test is that \(\mathrm { H } _ { 0 }\) is rejected, the value of \(p\) is adjusted if necessary, to ensure that \(p = 0.6\) for the rest of the day. Otherwise the value of \(p\) remains the same as for the first test. Calculate the probability that the result of the second test is to reject \(\mathrm { H } _ { 0 }\).
    \section*{June 2013}
    1 It is required to select a random sample of 30 pupils from a school with 853 pupils. A student suggests the following method.
      "Give each pupil sequentially a three-digit number from 001 to 853. Use a calculator to generate random three-digit numbers from 0.000 to 0.999 inclusive, multiply the answer by 853, add 1 and round off to the nearest whole number. Select the corresponding pupil, and repeat as necessary".
    (i) Determine which pupil would be picked for each of the following calculator outputs: $$0.103 , \quad 0.104 , \quad 0.105 , \quad 0.106 , \quad 0.107$$
    (ii) Use your answers to part (i) to show that this method is biased, and suggest an improvement.
    2 The number of neutrinos that pass through a certain region in one second is a random variable with the distribution \(\operatorname { Po } \left( 5 \times 10 ^ { 4 } \right)\). Use a suitable approximation to calculate the probability that the number of neutrinos passing through the region in 40 seconds is less than \(1.999 \times 10 ^ { 6 }\).
    3 The mean of a sample of 80 independent observations of a continuous random variable \(Y\) is denoted by \(\bar { Y }\). It is given that \(\mathrm { P } ( \bar { Y } \leqslant 157.18 ) = 0.1\) and \(\mathrm { P } ( \bar { Y } \geqslant 164.76 ) = 0.7\).
    (i) Calculate \(\mathrm { E } ( Y )\) and the standard deviation of \(Y\).
    (ii) State
      (a) where in your calculations you have used the Central Limit Theorem,
      (b) why it was necessary to use the Central Limit Theorem,
      (c) why it was possible to use the Central Limit Theorem.
    4 The number of floods in a certain river plain is known to have a Poisson distribution. It is known that up until 10 years ago the mean number of floods per year was 0.32. During the last 10 years there were 6 floods. Test at the \(1 \%\) significance level whether there is evidence of an increase in the mean number of floods per year.
    5 Two random variables \(S\) and \(T\) have probability density functions given by $$\begin{aligned} & f _ { S } ( x ) = \begin{cases} \frac { 3 } { a ^ { 3 } } ( x - a ) ^ { 2 } & 0 \leqslant x \leqslant a \\ 0 & \text { otherwise } \end{cases} \\ & f _ { T } ( x ) = \begin{cases} c & 0 \leqslant x \leqslant a \\ 0 & \text { otherwise } \end{cases} \end{aligned}$$ where \(a\) and \(c\) are constants.
    (i) On a single diagram sketch both probability density functions.
    (ii) Calculate the mean of \(S\), in terms of \(a\).
    (iii) Use your diagram to explain which of \(S\) or \(T\) has the bigger variance. (Answers obtained by calculation will score no marks.)
    6 The random variable \(X\) denotes the yield, in kilograms per acre, of a certain crop. Under the standard treatment it is known that \(\mathrm { E } ( X ) = 38.4\). Under a new treatment, the yields of 50 randomly chosen regions can be summarised as $$n = 50 , \quad \sum x = 1834.0 , \quad \sum x ^ { 2 } = 70027.37 .$$ Test at the \(1 \%\) level whether there has been a change in the mean crop yield.
    7 Past experience shows that \(35 \%\) of the senior pupils in a large school know the regulations about bringing cars to school. The head teacher addresses this subject in an assembly, and afterwards a random sample of 120 senior pupils is selected. In this sample it is found that 50 of these pupils know the regulations. Use a suitable approximation to test, at the \(10 \%\) significance level, whether there is evidence that the proportion of senior pupils who know the regulations has increased. Justify your approximation.
    8 The random variable \(R\) has the distribution \(\mathrm { B } ( 14 , p )\). A test is carried out at the \(\alpha \%\) significance level of the null hypothesis \(\mathrm { H } _ { 0 } : p = 0.25\), against \(\mathrm { H } _ { 1 } : p > 0.25\).
    (i) Given that \(\alpha\) is as close to 5 as possible, find the probability of a Type II error when the true value of \(p\) is 0.4.
    (ii) State what happens to the probability of a Type II error as
      (a) \(p\) increases from 0.4,
      (b) \(\alpha\) increases, giving a reason.
    9 The managers of a car breakdown recovery service are discussing whether the number of breakdowns per day can be modelled by a Poisson distribution. They agree that breakdowns occur randomly. Manager \(A\) says, "it must be assumed that breakdowns occur at a constant rate throughout the day".
    (i) Give an improved version of Manager \(A\)'s statement, and explain why the improvement is necessary.
    (ii) Explain whether you think your improved statement is likely to hold in this context.
    Assume now that the number \(B\) of breakdowns per day can be modelled by the distribution \(\operatorname { Po } ( \lambda )\).
    (iii) Given that \(\lambda = 9.0\) and \(\mathrm { P } \left( B > B _ { 0 } \right) < 0.1\), use tables to find the smallest possible value of \(B _ { 0 }\), and state the corresponding value of \(\mathrm { P } \left( B > B _ { 0 } \right)\).
    (iv) Given that \(\mathrm { P } ( B = 2 ) = 0.0072\), show that \(\lambda\) satisfies an equation of the form \(\lambda = 0.12 \mathrm { e } ^ { k \lambda }\), for a value of \(k\) to be stated. Evaluate the expression \(0.12 \mathrm { e } ^ { k \lambda }\) for \(\lambda = 8.5\) and \(\lambda = 8.6\), giving your answers correct to 4 decimal places. What can be deduced about a possible value of \(\lambda\)?
    1 The random variable \(F\) has the distribution \(\mathrm { B } ( 50,0.7 )\). Use a suitable approximation to find \(\mathrm { P } ( F > 40 )\).
    2 The events organiser of a school sends out invitations to 150 people to attend its prize day. From past experience the organiser knows that the number of those who will come to the prize day can be modelled by the distribution \(\mathrm { B } ( 150,0.98 )\).
    (i) Explain why this distribution cannot be well approximated by either a normal or a Poisson distribution.
    (ii) By considering the number of those who do not attend, use a suitable approximation to find the probability that fewer than 146 people attend.
    3 The random variable \(G\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). One hundred observations of \(G\) are taken. The results are summarised in the following table.
      Interval  | \(G < 40.0\) | \(40.0 \leqslant G < 60.0\) | \(G \geqslant 60.0\)
      Frequency | 17 | 58 | 25
    (i) By considering \(\mathrm { P } ( G < 40.0 )\), write down an equation involving \(\mu\) and \(\sigma\).
    (ii) Find a second equation involving \(\mu\) and \(\sigma\). Hence calculate values for \(\mu\) and \(\sigma\).
    (iii) Explain why your answers are only estimates.
    4 A zoologist investigates the number of snakes found in a given region of land. The zoologist intends to use a Poisson distribution to model the number of snakes.
    (i) One condition for a Poisson distribution to be valid is that snakes must occur at constant average rate. State another condition needed for a Poisson distribution to be valid.
    Assume now that the number of snakes found in 1 acre of a region can be modelled by the distribution \(\mathrm { Po } ( 4 )\).
    (ii) Find the probability that, in 1 acre of the region, at least 6 snakes are found.
    (iii) Find the probability that, in 0.77 acres of the region, the number of snakes found is either 2 or 3.
    \section*{June 2014}
    5 A continuous random variable \(X\) has probability density function $$f ( x ) = \begin{cases} \frac { 1 } { 2 } \pi \sin ( \pi x ) & 0 \leqslant x \leqslant 1 \\ 0 & \text { otherwise } \end{cases}$$
    (i) Show that this is a valid probability density function.
    (ii) Sketch the curve \(y = \mathrm { f } ( x )\) and write down the value of \(\mathrm { E } ( X )\).
    (iii) Find the value \(q\) such that \(\mathrm { P } ( X > q ) = 0.75\).
    (iv) Write down an expression, including an integral, for \(\operatorname { Var } ( X )\). (Do not attempt to evaluate the integral.)
    (v) A student states that "\(X\) is more likely to occur when \(x\) is close to \(\mathrm { E } ( X )\)." Give an improved version of this statement.
    6 In a city the proportion of inhabitants from ethnic group \(Z\) is known to be 0.4. A sample of 12 employees of a large company in this city is obtained and it is found that 2 of them are from ethnic group \(Z\). A test is carried out, at the \(5 \%\) significance level, of whether the proportion of employees in this company from ethnic group \(Z\) is less than in the city as a whole.
    (i) State an assumption that must be made about the sample for a significance test to be valid.
    (ii) Describe briefly an appropriate way of obtaining the sample.
    (iii) Carry out the test.
    (iv) A manager believes that the company discriminates against ethnic group \(Z\). Explain whether carrying out the test at the \(10 \%\) significance level would be more supportive or less supportive of the manager's belief.
    7 An examination board is developing a new syllabus and wants to know if the question papers are the right length. A random sample of 50 candidates was given a pre-test on a dummy paper. The times, \(t\) minutes, taken by these candidates to complete the paper can be summarised by $$n = 50 , \quad \sum t = 4050 , \quad \sum t ^ { 2 } = 329800$$ Assume that times are normally distributed.
    (i) Estimate the proportion of candidates that could not complete the paper within 90 minutes.
    (ii) Test, at the \(10 \%\) significance level, whether the mean time for all candidates to complete this paper is 80 minutes. Use a two-tail test.
    (iii) Explain whether the assumption that times are normally distributed is necessary in answering
      (a) part (i),
      (b) part (ii).
    8 The random variable \(W\) has the distribution \(\operatorname { Po } ( \lambda )\). A significance test is carried out of the null hypothesis \(\mathrm { H } _ { 0 } : \lambda = 3.60\), against the alternative hypothesis \(\mathrm { H } _ { 1 } : \lambda < 3.60\). The test is based on a single observation of \(W\). The critical region is \(W = 0\).
    (i) Find the significance level of the test.
    (ii) It is known that, when \(\lambda = \lambda _ { 0 }\), the probability that the test results in a Type II error is 0.8. Find the value of \(\lambda _ { 0 }\).
OCR S2 2010 January Q1
1 The values of 5 independent observations from a population can be summarised by $$\Sigma x = 75.8 , \quad \Sigma x ^ { 2 } = 1154.58 .$$ Find unbiased estimates of the population mean and variance.
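The summary-statistics formulas behind this kind of question can be checked numerically. The following is a minimal sketch, not a mark-scheme solution; the function name `unbiased_estimates` is a hypothetical helper introduced here for illustration.

```python
def unbiased_estimates(n, sum_x, sum_x2):
    """Return (mean, variance) estimates from n, sum of x and sum of x^2.

    Hypothetical helper: the variance uses the n - 1 divisor, which is
    what makes the estimate unbiased.
    """
    mean = sum_x / n
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, variance


# Summary values quoted in the question: n = 5, sum x = 75.8, sum x^2 = 1154.58
mean, variance = unbiased_estimates(5, 75.8, 1154.58)
print(round(mean, 2), round(variance, 3))
```

Dividing by \(n\) instead of \(n - 1\) would give the (biased) sample variance, which is smaller by a factor of \((n-1)/n\).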
OCR S2 2010 January Q2
2 A college has 400 students. A journalist wants to carry out a survey about food preferences and she obtains a sample of 30 pupils from the college by the following method.
  • Obtain a list of all the students.
  • Number the students, with numbers running sequentially from 0 to 399.
  • Select 30 random integers in the range 000 to 999 inclusive. If a random integer is in the range 0 to 399, then the student with that number is selected. If the number is greater than 399, then 400 is subtracted from the number (if necessary more than once) until an answer in the range 0 to 399 is selected, and the student with that number is selected.
    1. Explain why this method is unsatisfactory.
    2. Explain how it could be improved.
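One way to explore the method described above is to count how many of the 1000 possible random integers lead to each student. This is an illustrative sketch only; it assumes every integer from 000 to 999 is equally likely, and it uses the remainder on division by 400 to stand in for the repeated subtraction.

```python
from collections import Counter

# Map each possible random integer 0..999 to the student it selects.
# Repeatedly subtracting 400 until the result is in 0..399 is the same
# as taking the remainder modulo 400.
ways = Counter(r % 400 for r in range(1000))

# Number of random integers that lead to a few representative students
for student in (0, 199, 200, 399):
    print(student, ways[student])
```

The counts can then be compared across students to judge whether every student has the same chance of selection. The sketch does not address the separate question of whether the same student can be chosen twice.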
OCR S2 2010 January Q3
3 In a large town, 35\% of the inhabitants have access to television channel \(C\). A random sample of 60 inhabitants is obtained. Use a suitable approximation to find the probability that 18 or fewer inhabitants in the sample have access to channel \(C\).
OCR S2 2010 January Q4
4 80 randomly chosen people are asked to estimate a time interval of 60 seconds without using a watch or clock. The mean of the 80 estimates is 58.9 seconds. Previous evidence shows that the population standard deviation of such estimates is 5.0 seconds. Test, at the 5\% significance level, whether there is evidence that people tend to underestimate the time interval.
OCR S2 2010 January Q5
5 The number of customers arriving at a store between 8.50 am and 9 am on Saturday mornings is a random variable which can be modelled by the distribution \(\operatorname { Po } ( 11.0 )\). Following a series of price cuts, on one particular Saturday morning 19 customers arrive between 8.50 am and 9 am. The store's management claims, first, that the mean number of customers has increased, and second, that this is due to the price cuts.
  1. Test the first part of the claim, at the \(5 \%\) significance level.
  2. Comment on the second part of the claim.
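The tail probability needed for a one-tailed Poisson test like this one can be computed directly instead of being read from tables. This is a minimal sketch under the assumption that the hypotheses are \(\mathrm{H}_0 : \lambda = 11\) against \(\mathrm{H}_1 : \lambda > 11\); `poisson_cdf` is a hypothetical helper, not a library function.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam), summed term by term (hypothetical helper)."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))


# Probability of 19 or more arrivals if the mean is still 11.0
p_value = 1 - poisson_cdf(18, 11.0)

# Compare with the 5% significance level
reject_h0 = p_value < 0.05
print(round(p_value, 4), reject_h0)
```

The code only addresses the first part of the claim. Whether any increase is due to the price cuts is a question about causation that a single observation cannot settle.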
OCR S2 2010 January Q6
6 The continuous random variable \(X\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\).
  1. Each of the three following sets of probabilities is impossible. Give a reason in each case why the probabilities cannot both be correct. (You should not attempt to find \(\mu\) or \(\sigma\).)
    (a) \(\mathrm { P } ( X > 50 ) = 0.7\) and \(\mathrm { P } ( X < 50 ) = 0.2\)
    (b) \(\mathrm { P } ( X > 50 ) = 0.7\) and \(\mathrm { P } ( X > 70 ) = 0.8\)
    (c) \(\quad \mathrm { P } ( X > 50 ) = 0.3\) and \(\mathrm { P } ( X < 70 ) = 0.3\)
  2. Given that \(\mathrm { P } ( X > 50 ) = 0.7\) and \(\mathrm { P } ( X < 70 ) = 0.7\), find the values of \(\mu\) and \(\sigma\).
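For part 2, the two probabilities translate into two standardised equations, \((50 - \mu)/\sigma = -z\) and \((70 - \mu)/\sigma = z\) with \(z = \Phi^{-1}(0.7)\). The sketch below solves them numerically using Python's standard-library `statistics.NormalDist`; it is an illustration of the method, not a prescribed solution.

```python
from statistics import NormalDist

# z such that P(Z < z) = 0.7 for a standard normal Z
z = NormalDist().inv_cdf(0.7)

# P(X > 50) = 0.7  gives  (50 - mu) / sigma = -z
# P(X < 70) = 0.7  gives  (70 - mu) / sigma = +z
# Adding the equations gives mu; subtracting them gives sigma.
mu = (50 + 70) / 2
sigma = (70 - 50) / (2 * z)

print(mu, round(sigma, 2))
```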
OCR S2 2010 January Q7
7 The continuous random variable \(T\) is equally likely to take any value from 5.0 to 11.0 inclusive.
  1. Sketch the graph of the probability density function of \(T\).
  2. Write down the value of \(\mathrm { E } ( T )\) and find by integration the value of \(\operatorname { Var } ( T )\).
  3. A random sample of 48 observations of \(T\) is obtained. Find the approximate probability that the mean of the sample is greater than 8.3, and explain why the answer is an approximation.
OCR S2 2010 January Q8
8 The random variable \(R\) has the distribution \(\mathrm { B } ( 10 , p )\). The null hypothesis \(\mathrm { H } _ { 0 } : p = 0.7\) is to be tested against the alternative hypothesis \(\mathrm { H } _ { 1 } : p < 0.7\), at a significance level of \(5 \%\).
  1. Find the critical region for the test and the probability of making a Type I error.
  2. Given that \(p = 0.4\), find the probability that the test results in a Type II error.
  3. Given that \(p\) is equally likely to take the values 0.4 and 0.7, find the probability that the test results in a Type II error.
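The critical region and error probabilities for a lower-tail binomial test can be found by direct summation instead of tables. The following sketch assumes the critical region has the form \(R \leqslant c\) with the largest \(c\) whose tail probability does not exceed 5%; `binom_cdf` is a hypothetical helper. The last line reflects one reading of part 3, namely that a Type II error can only occur when \(p = 0.4\).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(R <= k) for R ~ B(n, p) (hypothetical helper)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


n, p0, alpha = 10, 0.7, 0.05

# Largest c with P(R <= c | p = 0.7) <= 5%: the critical region is R <= c
c = max(k for k in range(n + 1) if binom_cdf(k, n, p0) <= alpha)

# Type I error: H0 rejected although p = 0.7
type_i = binom_cdf(c, n, p0)

# Type II error when p = 0.4: R falls outside the critical region
type_ii = 1 - binom_cdf(c, n, 0.4)

# Assumed reading of part 3: H0 is false only when p = 0.4 (probability 0.5)
type_ii_mixed = 0.5 * type_ii

print(c, round(type_i, 4), round(type_ii, 4), round(type_ii_mixed, 4))
```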
OCR S2 2010 January Q9
9 Buttercups in a meadow are distributed independently of one another and at a constant average incidence of 3 buttercups per square metre.
  1. Find the probability that in 1 square metre there are more than 7 buttercups.
  2. Find the probability that in 4 square metres there are either 13 or 14 buttercups.
  3. Use a suitable approximation to find the probability that there are no more than 69 buttercups in 20 square metres.
  4. (a) Without using an approximation, find an expression for the probability that in \(m\) square metres there are at least 2 buttercups.
    (b) It is given that the probability that there are at least 2 buttercups in \(m\) square metres is 0.9 . Using your answer to part (a), show numerically that \(m\) lies between 1.29 and 1.3.
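For part 4, with a mean of \(3m\) buttercups in \(m\) square metres, the probability of at least 2 is \(1 - \mathrm{e}^{-3m}(1 + 3m)\). The sketch below evaluates that expression at the two bounds quoted in the question; the function name `p_at_least_two` is introduced here for illustration only.

```python
import math


def p_at_least_two(m, rate=3.0):
    """P(at least 2 events) in m square metres for a Poisson rate per square metre."""
    lam = rate * m
    # 1 - P(0) - P(1) = 1 - e^(-lam) * (1 + lam)
    return 1 - math.exp(-lam) * (1 + lam)


lower = p_at_least_two(1.29)
upper = p_at_least_two(1.3)
print(round(lower, 4), round(upper, 4))
```

Because the expression is increasing in \(m\), a value below 0.9 at one bound and above 0.9 at the other is enough to locate \(m\) between them.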