OCR Further Statistics (Further Statistics) 2018 December

Question 1
1 The performance of a piece of music is being recorded. The piece consists of three sections, \(A , B\) and \(C\). The times, in seconds, taken to perform the three sections are normally distributed random variables with the following means and standard deviations.
| Section | Mean | Standard deviation |
| --- | --- | --- |
| \(A\) | 264 | 13 |
| \(B\) | 173 | 9 |
| \(C\) | 264 | 13 |
  1. Assume first that the times for the three sections are independent. Find the probability that the total length of the performance is greater than 720.0 seconds.
  2. In fact sections \(A\) and \(C\) are musically identical, and the recording is made by using a single performance of section \(A\) twice, together with a performance of section \(B\). In this case find the probability that the total length of the performance is greater than 720.0 seconds.
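As a numerical check on Question 1, the two cases can be sketched as below. This is an illustration rather than a model answer: the variable names are invented here, and the second case assumes the total is \(2A + B\), so that the variance is \(4\operatorname{Var}(A) + \operatorname{Var}(B)\).

```python
from math import sqrt
from statistics import NormalDist

# Means and standard deviations (seconds) taken from the table above.
mean_a, sd_a = 264, 13
mean_b, sd_b = 173, 9
mean_c, sd_c = 264, 13
threshold = 720.0

# Part 1: A, B and C independent, so the variances simply add.
total_independent = NormalDist(
    mean_a + mean_b + mean_c,
    sqrt(sd_a**2 + sd_b**2 + sd_c**2),
)
p_independent = 1 - total_independent.cdf(threshold)

# Part 2: one performance of A is used twice, so the total is 2A + B
# and Var(2A + B) = 4 Var(A) + Var(B).
total_repeated = NormalDist(
    2 * mean_a + mean_b,
    sqrt(4 * sd_a**2 + sd_b**2),
)
p_repeated = 1 - total_repeated.cdf(threshold)

print(round(p_independent, 4), round(p_repeated, 4))
```

The means agree in both cases; only the variance changes, which is why reusing one performance makes a long total more likely.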
Question 2
2 In a fairground game a competitor scores \(0,1,2\) or 3 with probabilities given in the following table, where \(a\) and \(b\) are constants.
| Score | 0 | 1 | 2 | 3 |
| --- | --- | --- | --- | --- |
| Probability | \(a\) | \(b\) | \(b\) | \(b\) |

The competitor's expected score is 0.9.
  1. Show that \(b = 0.15\).
  2. Find the variance of the score.
  3. The competitor has to pay \(\pounds 2.50\) to take part, and wins a prize of \(\pounds 2 X\), where \(X\) is the score achieved. Find the expectation of the competitor's loss.
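The arithmetic in Question 2 can be checked with a short sketch. It assumes the probability table reads \(a, b, b, b\) for scores 0 to 3, and it takes the loss to be the entry fee minus the prize; the variable names are illustrative only.

```python
# Scores 0..3 with P(0) = a and P(1) = P(2) = P(3) = b (assumed table layout).
scores = [0, 1, 2, 3]
expected_score = 0.9

# E(X) = b(1 + 2 + 3) = 0.9 fixes b; the probabilities must sum to 1, which fixes a.
b = expected_score / (1 + 2 + 3)
a = 1 - 3 * b
probabilities = [a, b, b, b]

mean = sum(x * p for x, p in zip(scores, probabilities))
variance = sum(x**2 * p for x, p in zip(scores, probabilities)) - mean**2

# Loss = entry fee - prize = 2.50 - 2X, so E(loss) = 2.50 - 2 E(X).
expected_loss = 2.50 - 2 * mean

print(round(b, 2), round(variance, 2), round(expected_loss, 2))
```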
Question 3
3
  1. Alex places 20 black counters and 8 white counters into a bag. She removes 8 counters at random without replacement. Find the probability that the bag now contains exactly 5 white counters.
  2. Bill arranges 8 blue counters and 4 green counters in a random order in a straight line. Find the probability that exactly three of the green counters are next to one another.
Question 4
4 Leyla investigates the number of shoppers who visit a shop between 10.30 am and 11 am on Saturday mornings. She makes the following assumptions.
  • Shoppers visit the shop independently of one another.
  • The average rate at which shoppers visit the shop between these times is constant.
  1. State an appropriate distribution with which Leyla could model the number of shoppers who visit the shop between these times.
Leyla uses this distribution, with mean 14, as her model.
  2. Calculate the probability that, between 10.35 am and 10.50 am on a randomly chosen Saturday, at least 10 shoppers visit the shop.
Leyla chooses 25 Saturdays at random.
  3. Find the expected number of Saturdays, out of 25, on which there are no visitors to the shop between 10.35 am and 10.50 am.
  4. In fact on 5 of these Saturdays there were no visitors to the shop between 10.35 am and 10.50 am. Use this fact to comment briefly on the validity of the model that Leyla has used.
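The calculations in Question 4 can be sketched numerically. The sketch assumes a Poisson model and assumes that the mean of 14 for the half hour scales in proportion to the length of the interval, giving a mean of 7 for the 15 minutes from 10.35 am to 10.50 am. The helper `poisson_pmf` is written here for illustration.

```python
from math import exp, factorial

mean_half_hour = 14                       # model mean for 10.30 am to 11 am
mean_15_min = mean_half_hour * 15 / 30    # assumed proportional scaling

def poisson_pmf(k: int, mean: float) -> float:
    """P(X = k) for X ~ Po(mean)."""
    return exp(-mean) * mean**k / factorial(k)

# P(X >= 10) = 1 - P(X <= 9)
p_at_least_10 = 1 - sum(poisson_pmf(k, mean_15_min) for k in range(10))

# Expected number of the 25 Saturdays with no visitors in the 15 minutes.
expected_empty_saturdays = 25 * poisson_pmf(0, mean_15_min)

print(round(p_at_least_10, 4), round(expected_empty_saturdays, 4))
```

Comparing `expected_empty_saturdays` with the 5 empty Saturdays actually observed is the basis for the comment asked for in the final part.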
Question 5
5 The birth rate, \(x\) per thousand members of the population, and the life expectancy at birth, \(y\) years, in 14 randomly selected African countries are given in the table.

| Country | \(x\) | \(y\) | Country | \(x\) | \(y\) |
| --- | --- | --- | --- | --- | --- |
| Benin | 4.8 | 59.2 | Mozambique | 5.4 | 54.63 |
| Cameroon | 4.7 | 54.87 | Nigeria | 5.7 | 52.29 |
| Congo | 4.9 | 61.42 | Senegal | 5.1 | 65.81 |
| Gambia | 5.7 | 59.83 | Somalia | 6.5 | 54.88 |
| Liberia | 4.7 | 60.25 | Sudan | 4.4 | 63.08 |
| Malawi | 5.1 | 60.97 | Uganda | 5.8 | 57.25 |
| Mauretania | 4.6 | 62.77 | Zambia | 5.4 | 58.75 |

\(n = 14, \sum x = 72.8, \sum y = 826, \sum x^{2} = 392.96, \sum y^{2} = 48924.54, \sum xy = 4279.16\)
  1. Calculate Pearson's product-moment correlation coefficient \(r\) for the data.
  2. State what would be the effect on the value of \(r\) if the birth rate were given per hundred and not per thousand.
  3. Explain what the sign of \(r\) tells you about the relationship between life expectancy and birth rate for these countries.
  4. Test at the \(5 \%\) significance level whether there is correlation between birth rate and life expectancy at birth in African countries.
  5. A researcher wants to estimate the life expectancy at birth in Zimbabwe, where the birth rate is 3.9 per thousand. Explain whether a reliable estimate could be obtained using the regression line of \(y\) on \(x\) for the given data.
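For Question 5, the correlation coefficient can be computed directly from the summary statistics quoted under the table. The sketch below uses the standard formulas \(S_{xx} = \sum x^2 - (\sum x)^2/n\), \(S_{yy} = \sum y^2 - (\sum y)^2/n\), \(S_{xy} = \sum xy - \sum x \sum y / n\) and \(r = S_{xy}/\sqrt{S_{xx} S_{yy}}\); the variable names are chosen here for illustration.

```python
from math import sqrt

# Summary statistics as given in the question.
n = 14
sum_x, sum_y = 72.8, 826
sum_x2, sum_y2 = 392.96, 48924.54
sum_xy = 4279.16

# Corrected sums of squares and products.
s_xx = sum_x2 - sum_x**2 / n
s_yy = sum_y2 - sum_y**2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Pearson's product-moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

print(round(s_xx, 2), round(s_yy, 2), round(s_xy, 2), round(r, 4))
```

Because \(r\) is unchanged by a linear rescaling of either variable, multiplying or dividing every \(x\) by a positive constant leaves the value printed here the same.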
Question 6
6 The reaction times, in milliseconds, of all adult males in a standard experiment have a symmetrical distribution with mean and median both equal to 700 and standard deviation 125. The reaction times of a random sample of 6 international athletes are measured and the results are as follows:
\(\begin{array}{llllll} 702 & 631 & 540 & 714 & 575 & 480 \end{array}\)
It is required to test whether international athletes have a mean reaction time which is less than 700.
  1. Assume first that the reaction times of international athletes have the distribution \(\mathrm{N}\left(\mu, 125^{2}\right)\). Test at the \(5 \%\) significance level whether \(\mu < 700\).
  2. Now assume only that the distribution of the data is symmetrical, but not necessarily normal.
    1. State with a reason why a Wilcoxon test is preferable to a sign test.
    2. Use an appropriate Wilcoxon test at the \(5 \%\) significance level to test whether the median reaction time of international athletes is less than 700.
  3. Explain why the significance tests in part (a) and part (b)(ii) could produce different results.
Question 7
7 Sasha tends to forget his passwords. He investigates whether the number of attempts he needs to log on to a system with a password can be modelled by a geometric distribution. On 60 occasions he records the number of attempts he needs to log on, and the results are shown in the table.
| Number of attempts | 1 | 2 | 3 | 4 or more |
| --- | --- | --- | --- | --- |
| Frequency | 20 | 19 | 13 | 3 |
  1. Test at the \(1 \%\) significance level whether the results are consistent with the distribution \(\mathrm{Geo}(0.4)\).
  2. Suggest which two probabilities should be changed, and in what way, to produce an improved model. (Numerical values are not required.) You should give a reason for your suggestion. [3]
Question 8
8 A continuous random variable \(X\) has probability density function given by the following function, where \(a\) is a constant.
\(\mathrm{f}(x) = \left\{ \begin{array}{ll} \frac{2x}{a^{2}} & 0 \leqslant x \leqslant a, \\ 0 & \text{otherwise.} \end{array} \right.\)
The expected value of \(X\) is 4.
  1. Show that \(a = 6\).
Five independent observations of \(X\) are obtained, and the largest of them is denoted by \(M\).
  2. Find the cumulative distribution function of \(M\).