OCR MEI S3 (Statistics 3) 2012 June

Question 1
1 Technologists at a company that manufactures paint are trying to develop a new type of gloss paint with a shorter drying time than the current product. In order to test whether the drying time has been reduced, the technologists paint a square metre of each of the new and old paints on each of 10 different surfaces. The lengths of time, in hours, that each square metre takes to dry are as follows.
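| Surface | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| Old paint | 16.6 | 17.0 | 16.5 | 15.6 | 16.3 | 16.5 | 16.4 | 15.9 | 16.3 | 16.1 |
| New paint | 15.9 | 16.3 | 16.3 | 15.9 | 15.5 | 16.6 | 16.1 | 16.0 | 16.2 | 15.6 |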
  1. Explain why a paired sample is used in this context.
  2. The mean reduction in drying time is to be investigated. Why might a \(t\) test be appropriate in this context and what assumption needs to be made?
  3. Using a significance level of \(5 \%\), carry out a test to see if there appears to be any reduction in mean drying time.
  4. Find a 95\% confidence interval for the true mean reduction in drying time.
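The arithmetic behind parts 3 and 4 can be sketched in Python as below. This is a minimal illustration, not a model answer: it assumes the differences (old minus new) come from a Normal population, takes the hypotheses as H0: mean reduction = 0 against H1: mean reduction > 0, and hard-codes critical values read from standard t tables with 9 degrees of freedom. All variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Drying times (hours) for surfaces A to J, copied from the table in Question 1.
old = [16.6, 17.0, 16.5, 15.6, 16.3, 16.5, 16.4, 15.9, 16.3, 16.1]
new = [15.9, 16.3, 16.3, 15.9, 15.5, 16.6, 16.1, 16.0, 16.2, 15.6]

# Paired design: work with the per-surface reductions (old minus new).
diffs = [a - b for a, b in zip(old, new)]
n = len(diffs)
d_bar = mean(diffs)   # sample mean reduction
s = stdev(diffs)      # sample standard deviation (divisor n - 1)
se = s / sqrt(n)      # standard error of the mean reduction

# Test statistic for H0: mean reduction = 0 against H1: mean reduction > 0.
t_stat = d_bar / se

# Assumption: critical values taken from standard t tables, 9 degrees of freedom.
T_CRIT_ONE_TAILED_5PC = 1.833   # upper 5% point
T_CRIT_TWO_TAILED_5PC = 2.262   # upper 2.5% point, used for the 95% interval

reject_h0 = t_stat > T_CRIT_ONE_TAILED_5PC
ci = (d_bar - T_CRIT_TWO_TAILED_5PC * se, d_bar + T_CRIT_TWO_TAILED_5PC * se)

print(f"mean reduction = {d_bar:.3f}, s = {s:.4f}, t = {t_stat:.3f}")
print(f"reject H0 at the 5% level: {reject_h0}")
print(f"95% CI for the mean reduction: ({ci[0]:.4f}, {ci[1]:.4f})")
```

Under these assumptions the test statistic is about 2.30, which exceeds 1.833, so the sketch points towards a reduction in mean drying time; the interval lies only just above zero.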
Question 2 10 marks
2
  1.
    1. Give two reasons why an investigator might need to take a sample in order to obtain information about a population.
    2. State two requirements of a sample.
    3. Discuss briefly the advantage of the sampling being random.
  2.
    1. Under what circumstances might one use a Wilcoxon single sample test in order to test a hypothesis about the median of a population? What distributional assumption is needed for the test?
    2. On a stretch of road leading out of the centre of a town, highways officials have been monitoring the speed of the traffic in case it has increased. Previously it was known that the median speed on this stretch was 28.7 miles per hour. For a random sample of 12 vehicles on the stretch, the following speeds were recorded. $$\begin{array} { l l l l l l l l l l l l } 32.0 & 29.1 & 26.1 & 35.2 & 34.4 & 28.6 & 32.3 & 28.5 & 27.0 & 33.3 & 28.2 & 31.9 \end{array}$$ Carry out a test, with a \(5 \%\) significance level, to see whether the speed of the traffic on this stretch of road seems to have increased on the whole.
      [10]
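The ranking step in the Wilcoxon single sample test on the speed data can be sketched as follows. This is a rough illustration under stated assumptions: H0 is median = 28.7 against H1: median > 28.7, the simple ranking only works because this sample has no zero differences and no tied absolute differences, and the critical value 17 is read from standard Wilcoxon tables for n = 12 at the one-tailed 5% level. Variable names are illustrative.

```python
# Speeds (miles per hour) of the 12 sampled vehicles, copied from the question.
speeds = [32.0, 29.1, 26.1, 35.2, 34.4, 28.6, 32.3, 28.5, 27.0, 33.3, 28.2, 31.9]
m0 = 28.7  # hypothesised median under H0

# Differences from the hypothesised median, rounded to remove floating-point noise.
diffs = [round(x - m0, 1) for x in speeds]

# Rank the absolute differences from smallest (rank 1) to largest (rank n).
# Assumption: no zeros and no ties, which holds for this sample.
ordered = sorted(diffs, key=abs)
w_plus = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
w_minus = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)

# Assumption: critical value from standard Wilcoxon tables, n = 12, one-tailed 5%.
W_CRIT = 17

# For H1: median > 28.7, a small sum of negative ranks counts against H0.
reject_h0 = w_minus <= W_CRIT

print(f"W+ = {w_plus}, W- = {w_minus}, critical value = {W_CRIT}")
print(f"reject H0 at the 5% level: {reject_h0}")
```

With these data the sum of negative ranks is 18, just above the tabulated 17, so under the stated assumptions the sketch does not find significant evidence of an increase.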
Question 3
3 The triathlon is a sports event in which competitors take part in three stages, swimming, cycling and running, one straight after the other. The winner is the competitor with the shortest overall time. In this question the times for the separate stages are assumed to be Normally distributed and independent of each other. For a particular triathlon event in which there was a very large number of competitors, the mean and standard deviation of the times, measured in minutes, for each stage were as follows.
| Stage | Mean | Standard deviation |
|---|---|---|
| Swimming | 11.07 | 2.36 |
| Cycling | 57.33 | 8.76 |
| Running | 24.23 | 3.75 |
  1. For a randomly chosen competitor, find the probability that the swimming time is between 10 and 13 minutes.
  2. For a randomly chosen competitor, find the probability that the running time exceeds the swimming time by more than 10 minutes.
  3. For a randomly chosen competitor, find the probability that the swimming and running times combined exceed \(\frac { 2 } { 3 }\) of the cycling time.
  4. In a different triathlon event the total times, in minutes, for a random sample of 12 competitors were as follows. $$\begin{array} { l l l l l l l l l l l l } 103.59 & 99.04 & 85.03 & 81.34 & 106.79 & 89.14 & 98.55 & 98.22 & 108.87 & 116.29 & 102.51 & 92.44 \end{array}$$ Find a 95\% confidence interval for the mean time of all competitors in this event.
  5. Discuss briefly whether the assumptions of Normality and independence for the stages of triathlon events are reasonable.
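The calculations in parts 1 to 4 can be sketched with Python's standard-library `statistics.NormalDist`, whose instances can be added, subtracted and scaled by a constant when the underlying variables are independent. This is a minimal sketch that takes the question's Normality and independence assumptions at face value; the t critical value 2.201 (11 degrees of freedom) is read from standard tables, part 4 additionally assumes the total times are Normally distributed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Stage times in minutes: mean and standard deviation from the table in Question 3.
swim = NormalDist(11.07, 2.36)
cycle = NormalDist(57.33, 8.76)
run = NormalDist(24.23, 3.75)

# Part 1: P(10 < swimming time < 13).
p_swim = swim.cdf(13) - swim.cdf(10)

# Part 2: P(running - swimming > 10); variances add for independent variables.
p_run_exceeds = 1 - (run - swim).cdf(10)

# Part 3: P(swimming + running > (2/3) cycling), i.e. P(S + R - (2/3)C > 0).
combo = swim + run - cycle * (2 / 3)
p_combo = 1 - combo.cdf(0)

# Part 4: 95% confidence interval for the mean total time in the second event.
times = [103.59, 99.04, 85.03, 81.34, 106.79, 89.14,
         98.55, 98.22, 108.87, 116.29, 102.51, 92.44]
T_CRIT = 2.201  # assumption: from standard t tables, 11 degrees of freedom
se = stdev(times) / sqrt(len(times))
ci = (mean(times) - T_CRIT * se, mean(times) + T_CRIT * se)

print(f"P(10 < S < 13)      = {p_swim:.4f}")
print(f"P(R - S > 10)       = {p_run_exceeds:.4f}")
print(f"P(S + R > 2C/3)     = {p_combo:.4f}")
print(f"95% CI for the mean = ({ci[0]:.2f}, {ci[1]:.2f})")
```

A t interval is used in part 4 because the population variance for the second event is not given and the sample is small.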
Question 4
4 The numbers of call-outs per day received by a fire station for a random sample of 255 weekdays were recorded as follows.
| Number of call-outs | 0 | 1 | 2 | 3 | 4 | 5 or more |
|---|---|---|---|---|---|---|
| Frequency (days) | 145 | 79 | 22 | 6 | 3 | 0 |
The mean number of call-outs per day for these data is 0.6 . A Poisson model, using this sample mean of 0.6 , is fitted to the data, and gives the following expected frequencies (correct to 3 decimal places).
| Number of call-outs | 0 | 1 | 2 | 3 | 4 | 5 or more |
|---|---|---|---|---|---|---|
| Expected frequency | 139.947 | 83.968 | 25.190 | 5.038 | 0.756 | 0.101 |
  1. Using a \(5 \%\) significance level, carry out a test to examine the goodness of fit of the model to the data.
The time \(T\), measured in days, that elapses between successive call-outs can be modelled using the exponential distribution for which \(\mathrm { f } ( t )\), the probability density function, is $$\mathrm { f } ( t ) = \begin{cases} 0 & t < 0 , \\ \lambda \mathrm { e } ^ { - \lambda t } & t \geqslant 0 , \end{cases}$$ where \(\lambda\) is a positive constant.
  2. For the distribution above, it can be shown that \(\mathrm { E } ( T ) = \frac { 1 } { \lambda }\). Given that the mean time between successive call-outs is \(\frac { 5 } { 3 }\) days, write down the value of \(\lambda\).
  3. Find \(\mathrm { F } ( t )\), the cumulative distribution function.
  4. Find the probability that the time between successive call-outs is more than 1 day.
  5. Find the median time that elapses between successive call-outs.
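The numerical work in Question 4 can be sketched as below. This is an illustration under stated assumptions: upper-tail classes are pooled until the pooled expected frequency is at least 5 (a common convention, not stated in the question), one degree of freedom is deducted for the estimated mean, and the critical value 5.991 is read from standard chi-squared tables with 2 degrees of freedom. The exponential part uses \(\mathrm{F}(t) = 1 - \mathrm{e}^{-\lambda t}\) for \(t \geqslant 0\), obtained by integrating the density given in the question. Function and variable names are illustrative.

```python
from math import exp, log

# Observed and expected frequencies for 0, 1, 2, 3, 4 and "5 or more" call-outs.
observed = [145, 79, 22, 6, 3, 0]
expected = [139.947, 83.968, 25.190, 5.038, 0.756, 0.101]

# Assumption: pool upper-tail classes until the last expected frequency is >= 5.
while expected[-1] < 5:
    expected[-2:] = [expected[-2] + expected[-1]]
    observed[-2:] = [observed[-2] + observed[-1]]

# Chi-squared goodness-of-fit statistic over the pooled classes.
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 1   # classes - 1, less 1 for the estimated mean
CHI2_CRIT = 5.991            # assumption: standard tables, 2 df, 5% level
reject_fit = x2 > CHI2_CRIT

# Exponential model for the time between call-outs: E(T) = 1/lambda = 5/3 days.
lam = 3 / 5

def cdf(t):
    """Cumulative distribution function F(t) of the exponential model."""
    return 1 - exp(-lam * t) if t >= 0 else 0.0

p_more_than_one_day = 1 - cdf(1)   # equals exp(-lam)
median = log(2) / lam              # solves F(m) = 0.5

print(f"X^2 = {x2:.4f} on {df} df, reject Poisson model: {reject_fit}")
print(f"P(T > 1) = {p_more_than_one_day:.4f}, median = {median:.4f} days")
```

Pooling leaves four classes (0, 1, 2, and 3 or more), and the statistic of roughly 2.52 is well below 5.991, so under these assumptions the sketch finds no evidence against the Poisson model.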