OCR MEI S4 (Statistics 4) 2006 June

Question 1
1 A parcel is weighed, independently, on two scales. The weights are given by the random variables \(W _ { 1 }\) and \(W _ { 2 }\) which have underlying Normal distributions as follows. $$W _ { 1 } \sim \mathrm {~N} \left( \mu , \sigma _ { 1 } ^ { 2 } \right) , \quad W _ { 2 } \sim \mathrm {~N} \left( \mu , \sigma _ { 2 } ^ { 2 } \right) ,$$ where \(\mu\) is an unknown parameter and \(\sigma _ { 1 } ^ { 2 }\) and \(\sigma _ { 2 } ^ { 2 }\) are taken as known.
  1. Show that the maximum likelihood estimator of \(\mu\) is $$\hat { \mu } = \frac { \sigma _ { 2 } ^ { 2 } } { \sigma _ { 1 } ^ { 2 } + \sigma _ { 2 } ^ { 2 } } W _ { 1 } + \frac { \sigma _ { 1 } ^ { 2 } } { \sigma _ { 1 } ^ { 2 } + \sigma _ { 2 } ^ { 2 } } W _ { 2 } .$$ [You may quote the probability density function of the general Normal distribution from page 9 in the MEI Examination Formulae and Tables Booklet (MF2).]
  2. Show that \(\hat { \mu }\) is an unbiased estimator of \(\mu\).
  3. Obtain the variance of \(\hat { \mu }\).
  4. A simpler estimator \(T = \frac { 1 } { 2 } \left( W _ { 1 } + W _ { 2 } \right)\) is proposed. Write down the variance of \(T\) and hence show that the relative efficiency of \(T\) with respect to \(\hat { \mu }\) is $$y = \left( \frac { 2 \sigma _ { 1 } \sigma _ { 2 } } { \sigma _ { 1 } ^ { 2 } + \sigma _ { 2 } ^ { 2 } } \right) ^ { 2 }$$
  5. Show that \(y \leqslant 1\) for all values of \(\sigma _ { 1 } ^ { 2 }\) and \(\sigma _ { 2 } ^ { 2 }\). Explain why this means that \(\hat { \mu }\) is preferable to \(T\) as an estimator of \(\mu\).
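The comparison in parts 3 to 5 can be checked numerically. The sketch below is only an illustration, not a substitute for the algebraic argument: it assumes that relative efficiency means the ratio \(\operatorname{Var}(\hat{\mu}) / \operatorname{Var}(T)\), and the function names and the trial variances are hypothetical choices made for this example.

```python
def var_weighted(var1, var2):
    """Variance of the estimator with weights var2/(var1+var2) and var1/(var1+var2)."""
    total = var1 + var2
    w1, w2 = var2 / total, var1 / total
    # W1 and W2 are independent, so the variances add with squared weights.
    return w1 ** 2 * var1 + w2 ** 2 * var2


def var_simple_mean(var1, var2):
    """Variance of T = (W1 + W2) / 2 for independent W1 and W2."""
    return (var1 + var2) / 4.0


def relative_efficiency(var1, var2):
    """Assumed definition: Var(weighted estimator) / Var(T)."""
    return var_weighted(var1, var2) / var_simple_mean(var1, var2)


def closed_form_y(var1, var2):
    """The expression for y quoted in the question, written in terms of variances."""
    return 4.0 * var1 * var2 / (var1 + var2) ** 2


# Hypothetical pairs of known variances (sigma_1^2, sigma_2^2).
for var1, var2 in [(1.0, 1.0), (1.0, 4.0), (0.25, 9.0)]:
    y = relative_efficiency(var1, var2)
    print(f"var1={var1}, var2={var2}: y={y:.4f}, closed form={closed_form_y(var1, var2):.4f}")
```

For these trial values the ratio never exceeds 1, and it equals 1 only when the two variances are equal, which is consistent with the inequality in part 5.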
Question 2 8 marks
2 [In this question, you may use the result \(\int _ { 0 } ^ { \infty } u ^ { m } \mathrm { e } ^ { - u } \mathrm {~d} u = m !\) for any non-negative integer \(m\).]
The random variable \(X\) has probability density function $$\mathrm { f } ( x ) = \begin{cases} \frac { \lambda ^ { k + 1 } x ^ { k } \mathrm { e } ^ { - \lambda x } } { k ! } , & x > 0 \\ 0 , & \text { elsewhere } \end{cases}$$ where \(\lambda > 0\) and \(k\) is a non-negative integer.
  1. Show that the moment generating function of \(X\) is \(\left( \frac { \lambda } { \lambda - \theta } \right) ^ { k + 1 }\).
  2. The random variable \(Y\) is the sum of \(n\) independent random variables each distributed as \(X\). Find the moment generating function of \(Y\) and hence obtain the mean and variance of \(Y\). [8]
  3. State the probability density function of \(Y\).
  4. For the case \(\lambda = 1 , k = 2\) and \(n = 5\), it may be shown that the definite integral of the probability density function of \(Y\) between limits 10 and \(\infty\) is 0.9165 . Calculate the corresponding probability that would be given by a Normal approximation and comment briefly.
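The figures in part 4 can be reproduced with a short calculation. This is a minimal sketch under stated assumptions: it takes \(Y\) to have the same form of density as \(X\) with \(k + 1\) replaced by \(n(k + 1)\), so that its mean is \(n(k+1)/\lambda\) and its variance is \(n(k+1)/\lambda^2\), and it evaluates the exact tail through the standard identity linking an integer-shape gamma tail to a Poisson cumulative sum. The function name `gamma_tail_integer_shape` is hypothetical.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist


def gamma_tail_integer_shape(shape, rate, x):
    """P(Y > x) for a gamma variable with integer shape and given rate.

    Uses the identity P(Y > x) = P(N <= shape - 1), where N is Poisson(rate * x).
    """
    m = rate * x
    return sum(exp(-m) * m ** j / factorial(j) for j in range(shape))


lam, k, n = 1.0, 2, 5
shape = n * (k + 1)           # assumed shape parameter of Y
mean_y = shape / lam          # assumed mean of Y
var_y = shape / lam ** 2      # assumed variance of Y

exact = gamma_tail_integer_shape(shape, lam, 10.0)
approx = 1.0 - NormalDist(mean_y, sqrt(var_y)).cdf(10.0)

print(f"exact tail probability:  {exact:.4f}")
print(f"Normal approximation:    {approx:.4f}")
```

The exact value agrees with the 0.9165 quoted in the question, while the Normal approximation comes out a little lower, at roughly 0.90.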
Question 3
3 The human resources department of a large company is investigating two methods, A and B, for training employees to carry out a certain complicated and intricate task.
  1. Two separate random samples of employees who have not previously performed the task are taken. The first sample is of size 10 ; each of the employees in it is trained by method A. The second sample is of size 12; each of the employees in it is trained by method B. After completing the training, the time for each employee to carry out the task is measured, in controlled conditions. The times are as follows, in minutes.
    Employees trained by method A: 35.2, 47.8, 25.8, 38.0, 53.6, 31.0, 33.9, 35.4, 21.6, 42.5
    Employees trained by method B: 43.0, 57.5, 68.6, 20.9, 31.4, 44.9, 62.8, 27.6, 41.8, 46.1, 39.8, 61.6
    Stating appropriate assumptions concerning the underlying populations, use a \(t\) test at the \(5 \%\) significance level to examine whether either training method is better in respect of leading, on the whole, to a lower time to carry out the task.
  2. A further trial of method B is carried out to see if the performance of experienced and skilled workers can be improved by re-training them. A random sample of 8 such workers is taken. The times in minutes, under controlled conditions, for each worker to carry out the task before and after re-training are as follows.
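    | Worker | \(W _ { 1 }\) | \(W _ { 2 }\) | \(W _ { 3 }\) | \(W _ { 4 }\) | \(W _ { 5 }\) | \(W _ { 6 }\) | \(W _ { 7 }\) | \(W _ { 8 }\) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Time before | 32.6 | 28.5 | 22.9 | 27.6 | 34.9 | 28.8 | 34.2 | 31.3 |
    | Time after | 26.2 | 24.1 | 19.0 | 28.6 | 29.3 | 20.0 | 36.0 | 19.2 |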
    Stating an appropriate assumption, use a \(t\) test at the \(5 \%\) significance level to examine whether the re-training appears, on the whole, to lead to a lower time to carry out the task.
  3. Explain how the test procedure in part (ii) is enhanced by designing it as a paired comparison.
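The two test statistics in this question can be computed as follows. This is a sketch rather than a model answer: it assumes a pooled-variance (equal population variances) two-sample \(t\) statistic for the two training methods and a \(t\) statistic on the before-minus-after differences for the re-training trial. The helper names `pooled_t` and `paired_t` are hypothetical, and the statistics still need to be compared with tabulated critical values on the stated degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

method_a = [35.2, 47.8, 25.8, 38.0, 53.6, 31.0, 33.9, 35.4, 21.6, 42.5]
method_b = [43.0, 57.5, 68.6, 20.9, 31.4, 44.9, 62.8, 27.6, 41.8, 46.1, 39.8, 61.6]

before = [32.6, 28.5, 22.9, 27.6, 34.9, 28.8, 34.2, 31.3]
after = [26.2, 24.1, 19.0, 28.6, 29.3, 20.0, 36.0, 19.2]


def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance estimate; returns (t, df)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    std_error = sqrt(pooled_var * (1.0 / nx + 1.0 / ny))
    return (mean(x) - mean(y)) / std_error, df


def paired_t(first, second):
    """t statistic on the paired differences first - second; returns (t, df)."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / sqrt(variance(diffs) / n), n - 1


t_unpaired, df_unpaired = pooled_t(method_a, method_b)
t_paired, df_paired = paired_t(before, after)

print(f"two-sample: t = {t_unpaired:.3f} on {df_unpaired} degrees of freedom")
print(f"paired:     t = {t_paired:.3f} on {df_paired} degrees of freedom")
```

Working with the differences removes the variation between workers from the comparison, which is the point that part 3 asks about.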
Question 4 12 marks
4 An experiment is carried out to compare five industrial paints, A, B, C, D, E, that are intended to be used to protect exterior surfaces in polluted urban environments. Five different types of surface (I, II, III, IV, V) are to be used in the experiment, and five specimens of each type of surface are available. Five different external locations ( \(1,2,3,4,5\) ) are used in the experiment. The paints are applied to the specimens of the surfaces which are then left in the locations for a period of six months. At the end of this period, a "score" is given to indicate how effective the paint has been in protecting the surface.
  1. Name a suitable experimental design for this trial and give an example of an experimental layout.
Initial analysis of the data indicates that any differences between the types of surface are negligible, as also are any differences between the locations. It is therefore decided to analyse the data by one-way analysis of variance.
  2. State the usual model, including the accompanying distributional assumptions, for the one-way analysis of variance. Interpret the terms in the model.
  3. The data for analysis are as follows. Higher scores indicate better performance.
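    | Paint A | Paint B | Paint C | Paint D | Paint E |
    | --- | --- | --- | --- | --- |
    | 64 | 66 | 59 | 65 | 64 |
    | 58 | 68 | 56 | 78 | 52 |
    | 73 | 76 | 69 | 69 | 56 |
    | 60 | 70 | 60 | 72 | 61 |
    | 67 | 71 | 63 | 71 | 58 |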
    [The sum of these data items is 1626 and the sum of their squares is 106838 .]
    Construct the usual one-way analysis of variance table. Carry out the appropriate test, using a 5\% significance level. Report briefly on your conclusions.
    [12]
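The entries of the one-way analysis of variance table in part 3 can be computed from the data as sketched below. The layout follows the usual decomposition of the total sum of squares into between-paints and residual components; the variable names are hypothetical, and the resulting \(F\) ratio still has to be compared with the tabulated upper 5% point of the \(F\) distribution on the stated degrees of freedom.

```python
scores = {
    "A": [64, 58, 73, 60, 67],
    "B": [66, 68, 76, 70, 71],
    "C": [59, 56, 69, 60, 63],
    "D": [65, 78, 69, 72, 71],
    "E": [64, 52, 56, 61, 58],
}

all_scores = [x for group in scores.values() for x in group]
n_total = len(all_scores)
grand_total = sum(all_scores)                # quoted in the question as 1626
sum_of_squares = sum(x * x for x in all_scores)  # quoted in the question as 106838

# Correction factor: (grand total)^2 / (number of observations).
correction = grand_total ** 2 / n_total

ss_total = sum_of_squares - correction
ss_between = sum(sum(g) ** 2 / len(g) for g in scores.values()) - correction
ss_residual = ss_total - ss_between

df_between = len(scores) - 1
df_residual = n_total - len(scores)

ms_between = ss_between / df_between
ms_residual = ss_residual / df_residual
f_ratio = ms_between / ms_residual

print(f"{'Source':<16}{'SS':>10}{'df':>5}{'MS':>10}{'F':>8}")
print(f"{'Between paints':<16}{ss_between:>10.2f}{df_between:>5}{ms_between:>10.2f}{f_ratio:>8.2f}")
print(f"{'Residual':<16}{ss_residual:>10.2f}{df_residual:>5}{ms_residual:>10.2f}")
print(f"{'Total':<16}{ss_total:>10.2f}{n_total - 1:>5}")
```

The computed grand total and sum of squares match the values quoted in the question, which is a useful check that the table has been read correctly.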