OCR MEI S2 (Statistics 2) 2009 January

Question 1
1 A researcher is investigating whether there is a relationship between the population size of cities and the average walking speed of pedestrians in the city centres. Data for the population size, \(x\) thousands, and the average walking speed of pedestrians, \(y \mathrm {~m} \mathrm {~s} ^ { - 1 }\), of eight randomly selected cities are given in the table below.

| \(x\) | 18 | 43 | 52 | 94 | 98 | 206 | 784 | 1530 |
|---|---|---|---|---|---|---|---|---|
| \(y\) | 1.15 | 0.97 | 1.26 | 1.35 | 1.28 | 1.42 | 1.32 | 1.64 |

  1. Calculate the value of Spearman's rank correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to determine whether there is any association between population size and average walking speed.

In another investigation, the researcher selects a random sample of six adult males of particular ages and measures their maximum walking speeds. The data are shown in the table below, where \(t\) years is the age of the adult and \(w \mathrm {~m} \mathrm {~s} ^ { - 1 }\) is the maximum walking speed. Also shown are summary statistics and a scatter diagram on which the regression line of \(w\) on \(t\) is drawn.

| \(t\) | 20 | 30 | 40 | 50 | 60 | 70 |
|---|---|---|---|---|---|---|
| \(w\) | 2.49 | 2.41 | 2.38 | 2.14 | 1.97 | 2.03 |

$$n = 6 \quad \Sigma t = 270 \quad \Sigma w = 13.42 \quad \Sigma t ^ { 2 } = 13900 \quad \Sigma w ^ { 2 } = 30.254 \quad \Sigma t w = 584.6$$

[Figure: scatter diagram of \(w\) against \(t\), with the regression line of \(w\) on \(t\) drawn]

  3. Calculate the equation of the regression line of \(w\) on \(t\).
  4. (A) Use this equation to calculate an estimate of maximum walking speed of an 80-year-old male.
    (B) Explain why it might not be appropriate to use the equation to calculate an estimate of maximum walking speed of a 10-year-old male.
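
The arithmetic behind parts 1, 3 and 4(A) can be checked with a short script. The following is a minimal sketch using only the data and summary statistics quoted in the question; the helper names (`ranks`, `spearman`, `predict_w`) are illustrative and not part of the paper, and the ranking helper assumes there are no tied values, which holds for this data set. The hypothesis test in part 2 still needs a tabulated critical value, which is not encoded here.

```python
# Data copied from the first table in Question 1.
x = [18, 43, 52, 94, 98, 206, 784, 1530]               # population, thousands
y = [1.15, 0.97, 1.26, 1.35, 1.28, 1.42, 1.32, 1.64]   # walking speed, m/s


def ranks(values):
    """Rank from 1 (smallest) upwards; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's coefficient via 1 - 6 * sum(d^2) / (n (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


r_s = spearman(x, y)

# Summary statistics quoted in the question for the walking-speed sample.
n, sum_t, sum_w, sum_tt, sum_tw = 6, 270, 13.42, 13900, 584.6

s_tw = sum_tw - sum_t * sum_w / n     # corrected sum of products
s_tt = sum_tt - sum_t ** 2 / n        # corrected sum of squares of t
b = s_tw / s_tt                       # gradient of the line of w on t
a = sum_w / n - b * sum_t / n         # intercept, since the line passes through the mean point


def predict_w(t):
    """Estimated maximum walking speed (m/s) at age t years."""
    return a + b * t


print(f"r_s = {r_s:.4f}")
print(f"w = {a:.4f} + ({b:.5f}) t")
print(f"estimate at t = 80: {predict_w(80):.3f} m/s")
```

Note that `predict_w(10)` would also return a number, but the sampled ages run only from 20 to 70, so the script cannot say whether the linear relationship holds at age 10.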
Question 2
2 Clover stems usually have three leaves. Occasionally a clover stem has four leaves. This is considered by some to be lucky and is known as a four-leaf clover. On average 1 in 10000 clover stems is a four-leaf clover. You may assume that four-leaf clovers occur randomly and independently. A random sample of 5000 clover stems is selected.
  1. State the exact distribution of \(X\), the number of four-leaf clovers in the sample.
  2. Explain why \(X\) may be approximated by a Poisson distribution. Write down the mean of this Poisson distribution.
  3. Use this Poisson distribution to find the probability that the sample contains at least one four-leaf clover.
  4. Find the probability that in 20 samples, each of 5000 clover stems, there are exactly 9 samples which contain at least one four-leaf clover.
  5. Find the expected number of these 20 samples which contain at least one four-leaf clover.

The table shows the numbers of four-leaf clovers in these 20 samples.

| Number of four-leaf clovers | 0 | 1 | 2 | \(> 2\) |
|---|---|---|---|---|
| Number of samples | 11 | 7 | 2 | 0 |

  6. Calculate the mean and variance of the data in the table.
  7. Briefly comment on whether your answers to parts 5 and 6 support the use of the Poisson approximating distribution in part 3.
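
The numerical parts of this question can be reproduced as follows. This is a minimal sketch under stated assumptions: the variable names are illustrative, and the variance is computed with divisor \(n - 1\) (the sample variance), which is an assumption about the convention intended by the paper.

```python
from math import comb, exp

n_stems, p_four_leaf = 5000, 1 / 10000
lam = n_stems * p_four_leaf                 # Poisson mean, n * p

p_at_least_one = 1 - exp(-lam)              # 1 - P(X = 0)

# Number of samples (out of 20) with at least one four-leaf clover
# is binomial with success probability p_at_least_one.
n_samples = 20
p_exactly_9 = (comb(n_samples, 9)
               * p_at_least_one ** 9
               * (1 - p_at_least_one) ** (n_samples - 9))
expected_samples = n_samples * p_at_least_one

# Frequency table: number of four-leaf clovers -> number of samples.
freq = {0: 11, 1: 7, 2: 2}
total = sum(freq.values())
mean = sum(k * f for k, f in freq.items()) / total
# Assumption: sample variance with divisor n - 1.
variance = (sum(k * k * f for k, f in freq.items()) - total * mean ** 2) / (total - 1)

print(f"P(X >= 1) = {p_at_least_one:.4f}")
print(f"P(exactly 9 of 20) = {p_exactly_9:.4f}")
print(f"expected number of samples = {expected_samples:.2f}")
print(f"mean = {mean:.3f}, variance = {variance:.4f}")
```

For a Poisson distribution the mean and variance are equal, so comparing `mean` and `variance` with each other and with `lam` is one way to approach the final part.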
Question 3
3 The number of minutes, \(X\), for which a particular model of laptop computer will run on battery power is Normally distributed with mean 115.3 and standard deviation 21.9.
  1. (A) Find \(\mathrm { P } ( X < 120 )\).
    (B) Find \(\mathrm { P } ( 100 < X < 110 )\).
    (C) Find the value of \(k\) for which \(\mathrm { P } ( X > k ) = 0.9\).

The number of minutes, \(Y\), for which a different model of laptop computer will run on battery power is known to be Normally distributed with mean \(\mu\) and standard deviation \(\sigma\).

  2. Given that \(\mathrm { P } ( Y < 180 ) = 0.7\) and \(\mathrm { P } ( Y < 140 ) = 0.15\), find the values of \(\mu\) and \(\sigma\).
  3. Find values of \(a\) and \(b\) for which \(\mathrm { P } ( a < Y < b ) = 0.95\).
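
The Normal calculations above can be checked with Python's standard-library `statistics.NormalDist`. This is a minimal sketch; the variable names are illustrative, and for part 3 it assumes the interval is taken symmetric about \(\mu\), which is only one of many valid choices of \(a\) and \(b\).

```python
from statistics import NormalDist

X = NormalDist(mu=115.3, sigma=21.9)

p_a = X.cdf(120)                    # P(X < 120)
p_b = X.cdf(110) - X.cdf(100)       # P(100 < X < 110)
k = X.inv_cdf(0.1)                  # P(X > k) = 0.9 means P(X < k) = 0.1

# Part 2: two simultaneous equations in mu and sigma,
#   180 = mu + z_180 * sigma   and   140 = mu + z_140 * sigma.
Z = NormalDist()
z_180, z_140 = Z.inv_cdf(0.7), Z.inv_cdf(0.15)
sigma = (180 - 140) / (z_180 - z_140)
mu = 180 - z_180 * sigma

# Part 3: assumption, a symmetric interval with 2.5% in each tail.
Y = NormalDist(mu, sigma)
z_975 = Z.inv_cdf(0.975)
a, b = mu - z_975 * sigma, mu + z_975 * sigma

print(f"P(X < 120) = {p_a:.4f}")
print(f"P(100 < X < 110) = {p_b:.4f}")
print(f"k = {k:.2f}")
print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"a = {a:.1f}, b = {b:.1f}")
```

Answers obtained from printed Normal tables may differ slightly in the last decimal place because of rounding of the \(z\) values.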
Question 4
4 A gardening research organisation is running a trial to examine the growth and the size of flowers of various plants.
  1. In the trial, seeds of three types of plant are sown. The growth of each plant is classified as good, average or poor. The results are shown in the table.

    | Type of plant | Good | Average | Poor | Row totals |
    |---|---|---|---|---|
    | Coriander | 12 | 28 | 15 | 55 |
    | Aster | 7 | 18 | 23 | 48 |
    | Fennel | 14 | 22 | 11 | 47 |
    | Column totals | 33 | 68 | 49 | 150 |

    Carry out a test at the \(5 \%\) significance level to examine whether there is any association between growth and type of plant. State carefully your null and alternative hypotheses. Include a table of the contributions of each cell to the test statistic.
  2. It is known that the diameter of marigold flowers is Normally distributed with mean 47 mm and standard deviation 8.5 mm. A certain fertiliser is expected to cause flowers to have a larger mean diameter, but without affecting the standard deviation. A large number of marigolds are grown using this fertiliser. The diameters of a random sample of 50 of the flowers are measured and the mean diameter is found to be 49.2 mm. Carry out a hypothesis test at the \(1 \%\) significance level to check whether flowers grown with this fertiliser appear to be larger on average. Use hypotheses \(\mathrm { H } _ { 0 } : \mu = 47 , \mathrm { H } _ { 1 } : \mu > 47\), where \(\mu\) mm represents the mean diameter of all marigold flowers grown with this fertiliser.
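
The two test statistics in this question can be computed as sketched below. The variable names are illustrative. The chi-squared critical value is hard-coded from standard tables (5% level, 4 degrees of freedom) because the Python standard library has no chi-squared distribution; the Normal critical value is computed with `statistics.NormalDist`. The script does not state hypotheses or conclusions in context, which the question also asks for.

```python
from math import sqrt
from statistics import NormalDist

# Part 1: contingency table of observed frequencies (Good, Average, Poor).
observed = {
    "Coriander": [12, 28, 15],
    "Aster": [7, 18, 23],
    "Fennel": [14, 22, 11],
}
row_totals = {plant: sum(counts) for plant, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# Contribution of each cell: (observed - expected)^2 / expected,
# with expected = row total * column total / grand total.
contributions = {}
for plant, counts in observed.items():
    contributions[plant] = []
    for obs, col_total in zip(counts, col_totals):
        expected = row_totals[plant] * col_total / grand_total
        contributions[plant].append((obs - expected) ** 2 / expected)

chi_squared = sum(sum(row) for row in contributions.values())
dof = (len(observed) - 1) * (len(col_totals) - 1)
CRITICAL_5_PERCENT_4_DOF = 9.488    # tabulated value, not computed here

# Part 2: one-tailed z-test for the mean with known standard deviation.
mu_0, sigma, n, sample_mean = 47, 8.5, 50, 49.2
z = (sample_mean - mu_0) / (sigma / sqrt(n))
z_critical = NormalDist().inv_cdf(0.99)     # upper 1% point
reject_h0 = z > z_critical

for plant, row in contributions.items():
    print(plant, [round(c, 4) for c in row])
print(f"X^2 = {chi_squared:.3f} on {dof} degrees of freedom")
print(f"z = {z:.3f}, critical value = {z_critical:.3f}, reject H0: {reject_h0}")
```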