OCR MEI S3 (Statistics 3) 2012 January

Question 1
1
  1. Define simple random sampling. Describe briefly one difficulty associated with simple random sampling.
  2. Freeze-drying is an economically important process used in the production of coffee. It improves the retention of the volatile aroma compounds. In order to maintain the quality of the coffee, technologists need to monitor the drying rate, measured in suitable units, at regular intervals. It is known that, for best results, the mean drying rate should be 70.3 units and anything substantially less than this would be detrimental to the coffee. Recently, a random sample of 12 observations of the drying rate was as follows. $$\begin{array} { l l l l l l l l l l l l } 66.0 & 66.1 & 59.8 & 64.0 & 70.9 & 71.4 & 66.9 & 76.2 & 65.2 & 67.9 & 69.2 & 68.5 \end{array}$$
    1. Carry out a test to investigate at the \(5 \%\) level of significance whether the mean drying rate appears to be less than 70.3. State the distributional assumption that is required for this test.
    2. Find a \(95 \%\) confidence interval for the true mean drying rate.
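The calculations behind parts 2.1 and 2.2 can be sketched numerically as follows. This is a minimal sketch, assuming a one-sample t test is the intended method (which relies on the drying rates coming from a Normal population). The critical values for 11 degrees of freedom are copied from standard t tables rather than computed, and the variable names are illustrative only.

```python
import math
import statistics

# Drying-rate sample from the question.
rates = [66.0, 66.1, 59.8, 64.0, 70.9, 71.4, 66.9, 76.2, 65.2, 67.9, 69.2, 68.5]
mu0 = 70.3  # hypothesised mean under H0; H1 is that the mean is less than 70.3

n = len(rates)
mean = statistics.fmean(rates)
s = statistics.stdev(rates)      # sample standard deviation (divisor n - 1)
se = s / math.sqrt(n)            # estimated standard error of the sample mean

t_stat = (mean - mu0) / se

# Critical values for 11 degrees of freedom, taken from standard t tables
# (an assumption of this sketch, not computed here).
T_CRIT_ONE_TAIL_5 = 1.796        # lower-tail test at the 5% level
T_CRIT_TWO_TAIL_5 = 2.201        # two-sided 95% confidence interval

reject_h0 = t_stat < -T_CRIT_ONE_TAIL_5
ci = (mean - T_CRIT_TWO_TAIL_5 * se, mean + T_CRIT_TWO_TAIL_5 * se)

print(f"mean = {mean:.3f}, s = {s:.3f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The test is one-tailed because the question asks only whether the mean is less than 70.3, whereas the confidence interval is two-sided, so the two parts use different critical values.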
Question 2
2 In a particular chain of supermarkets, one brand of pasta shapes is sold in small packets and large packets. Small packets have a mean weight of 505 g and a standard deviation of 11 g. Large packets have a mean weight of 1005 g and a standard deviation of 17 g. It is assumed that the weights of packets are Normally distributed and are independent of each other.
  1. Find the probability that a randomly chosen large packet weighs between 995 g and 1020 g.
  2. Find the probability that the weights of two randomly chosen small packets differ by less than 25 g.
  3. Find the probability that the total weight of two randomly chosen small packets exceeds the weight of a randomly chosen large packet.
  4. Find the probability that the weight of one randomly chosen small packet exceeds half the weight of a randomly chosen large packet by at least 5 g .
  5. A different brand of pasta shapes is sold in packets of which the weights are assumed to be Normally distributed with standard deviation 14 g. A random sample of 20 packets of this pasta is found to have a mean weight of 246 g. Find a \(95 \%\) confidence interval for the population mean weight of these packets.
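Each part of this question reduces to a Normal distribution for a linear combination of independent Normal variables. The sketch below is one way to check the arithmetic, using `statistics.NormalDist` from the Python standard library. The names `p1` to `p4` and `ci` are illustrative, and the combinations are written out explicitly so that the means and variances being used are visible.

```python
import math
from statistics import NormalDist

# Distributions stated in the question (weights in grams).
large = NormalDist(mu=1005, sigma=17)

# Part 1: P(995 < L < 1020) for one large packet.
p1 = large.cdf(1020) - large.cdf(995)

# Part 2: S1 - S2 ~ N(0, 11^2 + 11^2); we need P(-25 < S1 - S2 < 25).
diff = NormalDist(0, math.sqrt(2 * 11**2))
p2 = diff.cdf(25) - diff.cdf(-25)

# Part 3: S1 + S2 - L ~ N(2*505 - 1005, 2*11^2 + 17^2); we need P(> 0).
excess = NormalDist(2 * 505 - 1005, math.sqrt(2 * 11**2 + 17**2))
p3 = 1 - excess.cdf(0)

# Part 4: S - L/2 ~ N(505 - 1005/2, 11^2 + 17^2 / 4); we need P(>= 5).
half = NormalDist(505 - 1005 / 2, math.sqrt(11**2 + 17**2 / 4))
p4 = 1 - half.cdf(5)

# Part 5: known sigma = 14, n = 20, sample mean 246, so a z interval applies.
z = NormalDist().inv_cdf(0.975)
half_width = z * 14 / math.sqrt(20)
ci = (246 - half_width, 246 + half_width)

for label, p in [("1", p1), ("2", p2), ("3", p3), ("4", p4)]:
    print(f"part {label}: {p:.4f}")
print(f"part 5: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Note that in part 4 the variance of half a large packet's weight is \(17^2 / 4\), not \(17^2 / 2\), because scaling a variable by \(\tfrac12\) scales its variance by \(\tfrac14\).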
Question 3
3
  1. A medical researcher is looking into the delay, in years, between first and second myocardial infarctions (heart attacks). The following table shows the results for a random sample of 225 patients.
    | Delay (years) | \(0 -\) | \(1 -\) | \(2 -\) | \(3 -\) | \(4 - 10\) |
    | --- | --- | --- | --- | --- | --- |
    | Number of patients | 160 | 40 | 13 | 9 | 3 |
    The mean of this sample is used to construct a model which gives the following expected frequencies.
    | Delay (years) | \(0 -\) | \(1 -\) | \(2 -\) | \(3 -\) | \(4 - 10\) |
    | --- | --- | --- | --- | --- | --- |
    | Number of patients | 142.23 | 52.32 | 19.25 | 7.08 | 4.12 |
    Carry out a test, using a \(2.5 \%\) level of significance, of the goodness of fit of the model to the data.
  2. A further piece of research compares the incidence of myocardial infarction in men aged 55 to 70 with that in women aged 55 to 70. Incidence is measured by the number of infarctions per 10000 of the population. For a random sample of 8 health authorities across the UK, the following results for the year 2010 were obtained.
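    | Health authority | A | B | C | D | E | F | G | H |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Incidence in men | 47 | 56 | 15 | 51 | 45 | 54 | 50 | 32 |
    | Incidence in women | 36 | 30 | 30 | 47 | 54 | 55 | 27 | 27 |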
    A Wilcoxon paired sample test, using the hypotheses \(\mathrm { H } _ { 0 } : m = 0\) and \(\mathrm { H } _ { 1 } : m \neq 0\) where \(m\) is the population median difference, is to be carried out to investigate whether there is any difference between men and women on the whole.
    1. Explain why a paired test is being used in this context.
    2. Carry out the test using a \(10 \%\) level of significance.
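The two test statistics in this question can be sketched as below. Several choices here are assumptions rather than part of the question: the last two delay classes are merged because the final expected frequency is below 5, the degrees of freedom are taken as 4 - 1 - 1 = 2 on the basis that one parameter (the mean) was estimated from the data, and both critical values are copied from standard tables. The observed and expected frequencies, and the incidence figures, are read from the question's tables.

```python
# Part 1: chi-squared goodness of fit.
observed = [160, 40, 13, 9, 3]
expected = [142.23, 52.32, 19.25, 7.08, 4.12]

# Merge the last two classes (assumed convention: expected frequencies >= 5).
obs_merged = observed[:3] + [sum(observed[3:])]
exp_merged = expected[:3] + [sum(expected[3:])]

x2 = sum((o - e) ** 2 / e for o, e in zip(obs_merged, exp_merged))

# Upper 2.5% point of chi-squared with 2 degrees of freedom (from tables).
CHI2_CRIT = 7.378
poor_fit = x2 > CHI2_CRIT

# Part 2: Wilcoxon paired sample (signed rank) test.
men = [47, 56, 15, 51, 45, 54, 50, 32]
women = [36, 30, 30, 47, 54, 55, 27, 27]
diffs = [m - w for m, w in zip(men, women)]

# These differences contain no zeros and no tied absolute values,
# so ranking by absolute size needs no tie handling.
ordered = sorted(diffs, key=abs)
w_plus = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
w_minus = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)
t_wilcoxon = min(w_plus, w_minus)

# Two-tailed 10% critical value for n = 8 (from tables).
W_CRIT = 5
significant = t_wilcoxon <= W_CRIT

print(f"X^2 = {x2:.3f}, poor fit at 2.5%: {poor_fit}")
print(f"W+ = {w_plus}, W- = {w_minus}, T = {t_wilcoxon}, significant at 10%: {significant}")
```

The ranking shortcut would not be valid for data with zero differences or tied absolute differences. In that case zeros are usually dropped and tied values given averaged ranks.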
Question 4
4 At the school summer fair, one of the games involves throwing darts at a circular dartboard of radius \(a\) lying on the ground some distance away. Only darts that land on the board are counted. The distance from the centre of the board to the point where a dart lands is modelled by the random variable \(R\). It is assumed that the probability that a dart lands inside a circle of radius \(r\) is proportional to the area of the circle.
  1. By considering \(\mathrm { P } ( R < r )\) show that \(\mathrm { F } ( r )\), the cumulative distribution function of \(R\), is given by $$\mathrm { F } ( r ) = \begin{cases} 0 & r < 0 , \\ \frac { r ^ { 2 } } { a ^ { 2 } } & 0 \leqslant r \leqslant a , \\ 1 & r > a . \end{cases}$$
  2. Find \(\mathrm { f } ( r )\), the probability density function of \(R\).
  3. Find \(\mathrm { E } ( R )\) and show that \(\operatorname { Var } ( R ) = \frac { a ^ { 2 } } { 18 }\).
    The radius \(a\) of the dartboard is 22.5 cm.
  4. Let \(\bar { R }\) denote the mean distance from the centre of the board of a random sample of 100 darts. Write down an approximation to the distribution of \(\bar { R }\).
  5. A random sample of 100 darts is found to give a mean distance of 13.87 cm. Does this cast any doubt on the modelling?
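The moments in part 3 and the comparison in part 5 can be checked numerically. The sketch below assumes the density \(\mathrm{f}(r) = 2r / a^2\) on \(0 \leqslant r \leqslant a\), obtained by differentiating the stated \(\mathrm{F}(r)\), and uses a simple midpoint rule. The helper `midpoint_integral` is hypothetical, written only for this illustration.

```python
import math

a = 22.5  # radius of the dartboard in cm, as given in the question


def midpoint_integral(g, lo, hi, steps=100_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


def pdf(r):
    # Derivative of F(r) = r^2 / a^2 on 0 <= r <= a.
    return 2 * r / a**2


mean_r = midpoint_integral(lambda r: r * pdf(r), 0, a)
var_r = midpoint_integral(lambda r: r**2 * pdf(r), 0, a) - mean_r**2

# Closed forms from the question, for comparison.
mean_exact = 2 * a / 3
var_exact = a**2 / 18

# Central Limit Theorem approximation for the mean of 100 darts.
n = 100
se = math.sqrt(var_exact / n)
z = (13.87 - mean_exact) / se

print(f"E(R): numeric {mean_r:.4f}, exact {mean_exact:.4f}")
print(f"Var(R): numeric {var_r:.4f}, exact {var_exact:.4f}")
print(f"z for a sample mean of 13.87: {z:.3f}")
```

Under the model the sample mean is approximately Normal with mean \(2a/3\) and variance \(a^2 / 1800\). Whether the resulting z value counts as evidence against the model depends on the significance level chosen. For reference, its magnitude exceeds 1.96, the two-tailed 5% point of the standard Normal distribution.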