OCR MEI S2 (Statistics 2) 2012 June

Question 1
1 The times, in seconds, taken by ten randomly selected competitors for the first and last sections of an Olympic bobsleigh run are denoted by \(x\) and \(y\) respectively. Summary statistics for these data are as follows. $$\Sigma x = 113.69 \quad \Sigma y = 52.81 \quad \Sigma x ^ { 2 } = 1292.56 \quad \Sigma y ^ { 2 } = 278.91 \quad \Sigma x y = 600.41 \quad n = 10$$
  1. Calculate the sample product moment correlation coefficient.
  2. Carry out a hypothesis test at the \(10 \%\) significance level to investigate whether there is any correlation between times taken for the first and last sections of the bobsleigh run.
  3. State the distributional assumption which is necessary for this test to be valid. Explain briefly how a scatter diagram may be used to check whether this assumption is likely to be valid.
  4. A commentator says that in order to have a fast time on the last section, you must have a fast time on the first section. Comment briefly on this suggestion.
  5. (A) Would your conclusion in part 2 have been different if you had carried out the hypothesis test at the \(1 \%\) level rather than the \(10 \%\) level? Explain your answer.
    (B) State one advantage and one disadvantage of using a \(1 \%\) significance level rather than a \(10 \%\) significance level in a hypothesis test.
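The calculation behind parts 1, 2 and 5(A) can be sketched as follows. This is an illustrative check, not a model answer: the variable names are made up for the example, and the two critical values are assumed to be the two-tailed entries for \(n = 10\) in a standard table of product moment correlation coefficient critical values, so they should be confirmed against the tables issued with the paper.

```python
import math

# Summary statistics quoted in Question 1.
n = 10
sum_x, sum_y = 113.69, 52.81
sum_x2, sum_y2, sum_xy = 1292.56, 278.91, 600.41

# Corrected sums of squares and products.
s_xx = sum_x2 - sum_x**2 / n
s_yy = sum_y2 - sum_y**2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Sample product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

# ASSUMPTION: two-tailed critical values for n = 10, read from standard
# PMCC tables; they are not given in the question itself.
CRITICAL_10_PERCENT = 0.5494
CRITICAL_1_PERCENT = 0.7646

print(f"r = {r:.4f}")
print("evidence of correlation at the 10% level:", abs(r) > CRITICAL_10_PERCENT)
print("evidence of correlation at the 1% level: ", abs(r) > CRITICAL_1_PERCENT)
```

Because \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\) are small differences of large numbers, the summary statistics should be entered at full quoted precision; rounding them first changes \(r\) noticeably.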
Question 2
2 A particular genetic mutation occurs in one in every 300 births on average. A random sample of 1200 births is selected.
  1. State the exact distribution of \(X\), the number of births in the sample which have the mutation.
  2. Explain why \(X\) has, approximately, a Poisson distribution.
  3. Use a Poisson approximating distribution to find
    (A) \(\mathrm { P } ( X = 1 )\),
    (B) \(\mathrm { P } ( X > 4 )\).
  4. Twenty independent samples, each of 1200 births, are selected. State the mean and variance of a Normal approximating distribution suitable for modelling the total number of births with the mutation in the twenty samples.
  5. Use this Normal approximating distribution to
    (A) find the probability that there are at least 90 births which have the mutation,
    (B) find the least value of \(k\) such that the probability that there are at most \(k\) births with this mutation is greater than \(5 \%\).
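The probabilities in parts 3 and 5 can be checked numerically with a short sketch like the one below. It assumes the Poisson mean is \(1200 \times \frac{1}{300} = 4\), that the total over twenty samples is modelled as \(\mathrm{N}(80, 80)\), and that a continuity correction is applied; the helper `poisson_pmf` and the variable names are invented for the example.

```python
import math
from statistics import NormalDist

# Poisson approximation to B(1200, 1/300): mean = n * p.
lam = 1200 / 300


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


p_exactly_1 = poisson_pmf(1, lam)
p_more_than_4 = 1 - sum(poisson_pmf(k, lam) for k in range(5))

# Total over twenty samples: Poisson(80), approximated by N(80, 80).
total_mean = 20 * lam
total = NormalDist(mu=total_mean, sigma=math.sqrt(total_mean))

# "At least 90" becomes "more than 89.5" with a continuity correction.
p_at_least_90 = 1 - total.cdf(89.5)

# Least k with P(total <= k) > 0.05, again using a continuity correction.
k = 0
while total.cdf(k + 0.5) <= 0.05:
    k += 1

print(f"P(X = 1) = {p_exactly_1:.4f}")
print(f"P(X > 4) = {p_more_than_4:.4f}")
print(f"P(total >= 90) = {p_at_least_90:.4f}")
print(f"least k = {k}")
```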
Question 3
3 At a vineyard, the process used to fill bottles with wine is subject to variation. The contents of bottles are independently Normally distributed with mean \(\mu = 751.4 \mathrm { ml }\) and standard deviation \(\sigma = 2.5 \mathrm { ml }\).
  1. Find the probability that a randomly selected bottle contains at least 750 ml.
  2. A case of wine consists of 6 bottles. Find the probability that all 6 bottles in a case contain at least 750 ml.
  3. Find the probability that, in a random sample of 25 cases, there are at least 2 cases in which all 6 bottles contain at least 750 ml.

  It is decided to increase the proportion of bottles which contain at least 750 ml to \(98 \%\).
  4. This can be done by changing the value of \(\mu\), but retaining the original value of \(\sigma\). Find the required value of \(\mu\).
  5. An alternative is to change the value of \(\sigma\), but retain the original value of \(\mu\). Find the required value of \(\sigma\).
  6. Comment briefly on which method might be easier to implement and which might be preferable to the vineyard owners.
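A numerical sketch of parts 1 to 5 is given below, using only the Python standard library. The variable names are illustrative, and the approach (a binomial model for the 25 cases, and the lower \(2 \%\) point of the standard Normal distribution for the revised parameters) is one reasonable reading of the question rather than an official solution.

```python
import math
from statistics import NormalDist

TARGET_ML = 750
bottles = NormalDist(mu=751.4, sigma=2.5)

# Part 1: P(one bottle contains at least 750 ml).
p_bottle = 1 - bottles.cdf(TARGET_ML)

# Part 2: all 6 independent bottles in a case reach 750 ml.
p_case = p_bottle**6

# Part 3: number of such cases out of 25 is B(25, p_case).
n_cases = 25
p_fewer_than_2 = sum(
    math.comb(n_cases, j) * p_case**j * (1 - p_case) ** (n_cases - j)
    for j in range(2)
)
p_at_least_2 = 1 - p_fewer_than_2

# Parts 4 and 5: need P(X >= 750) = 0.98, so 750 sits at the lower 2% point.
z = NormalDist().inv_cdf(0.02)  # negative value, about -2.05

new_mu = TARGET_ML - z * bottles.stdev        # keep sigma = 2.5
new_sigma = (TARGET_ML - bottles.mean) / z    # keep mu = 751.4

print(f"P(bottle >= 750) = {p_bottle:.4f}")
print(f"P(all 6 in a case) = {p_case:.4f}")
print(f"P(at least 2 of 25 cases) = {p_at_least_2:.4f}")
print(f"required mu = {new_mu:.2f} ml, required sigma = {new_sigma:.3f} ml")
```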
Question 4 9 marks
4
  1. Mary is opening a cake shop. As part of her market research, she carries out a survey into which type of cake people like best. She offers people 4 types of cake to taste: chocolate, carrot, lemon and ginger. She selects a random sample of 150 people and she classifies the people as children and adults. The results are as follows.
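    \begin{tabular}{|l|c|c|c|}
    \hline
     & \multicolumn{2}{c|}{Classification of person} & \\
    \cline{2-3}
    Type of cake & Child & Adult & Row totals \\
    \hline
    Chocolate & 34 & 23 & 57 \\
    Carrot & 16 & 18 & 34 \\
    Lemon & 4 & 18 & 22 \\
    Ginger & 13 & 24 & 37 \\
    \hline
    Column totals & 67 & 83 & 150 \\
    \hline
    \end{tabular}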
    The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.
    \begin{tabular}{|l|c|c|}
    \hline
     & \multicolumn{2}{c|}{Classification of person} \\
    \cline{2-3}
    Type of cake & Child & Adult \\
    \hline
    Chocolate & 2.8646 & 2.3124 \\
    Carrot & 0.0436 & 0.0352 \\
    Lemon & 3.4549 & 2.7889 \\
    Ginger & 0.7526 & 0.6075 \\
    \hline
    \end{tabular}
    The sum of these contributions, correct to 2 decimal places, is 12.86.
    1. Calculate the expected frequency for children preferring chocolate cake. Verify the corresponding contribution, 2.8646, to the test statistic.
    2. Carry out the test at the \(1 \%\) level of significance.
  2. Mary buys flour in bags which are labelled as containing 5 kg. She suspects that the average contents of these bags may be less than 5 kg. In order to test this, she selects a random sample of 8 bags and weighs their contents. Assuming that weights are Normally distributed with standard deviation 0.0072 kg, carry out a test at the \(5 \%\) level, given that the weights of the 8 bags in kg are as follows.
    4.992, 4.981, 4.982, 4.996, 4.991, 5.006, 5.009, 5.003
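Both parts of Question 4 can be checked with a short sketch such as the following. The names are illustrative; the \(\chi^{2}\) critical value for 3 degrees of freedom at the \(1 \%\) level is assumed from standard tables, and the flour test is treated as a one-tailed test of \(\mathrm{H}_0 : \mu = 5\) against \(\mathrm{H}_1 : \mu < 5\) with known standard deviation.

```python
import math
from statistics import NormalDist, fmean

# Part 1: observed frequencies as (child, adult) for each type of cake.
observed = {
    "Chocolate": (34, 23),
    "Carrot": (16, 18),
    "Lemon": (4, 18),
    "Ginger": (13, 24),
}
col_totals = [sum(row[j] for row in observed.values()) for j in range(2)]
grand_total = sum(col_totals)

# Expected frequency = row total * column total / grand total.
expected = {
    cake: tuple(sum(row) * col_totals[j] / grand_total for j in range(2))
    for cake, row in observed.items()
}
# Contribution of each cell = (observed - expected)^2 / expected.
contributions = {
    cake: tuple((row[j] - expected[cake][j]) ** 2 / expected[cake][j] for j in range(2))
    for cake, row in observed.items()
}
chi_squared = sum(sum(cells) for cells in contributions.values())
degrees_of_freedom = (len(observed) - 1) * (2 - 1)

# ASSUMPTION: 1% critical value for 3 degrees of freedom, from standard tables.
CHI2_CRITICAL_3DF_1PCT = 11.345
print(f"expected (child, chocolate) = {expected['Chocolate'][0]:.2f}")
print(f"chi-squared = {chi_squared:.2f} on {degrees_of_freedom} d.f.")
print("reject independence at 1%:", chi_squared > CHI2_CRITICAL_3DF_1PCT)

# Part 2: one-tailed z-test with known standard deviation.
weights = [4.992, 4.981, 4.982, 4.996, 4.991, 5.006, 5.009, 5.003]
sigma = 0.0072
z = (fmean(weights) - 5) / (sigma / math.sqrt(len(weights)))
z_critical = NormalDist().inv_cdf(0.05)  # lower 5% point, about -1.645
print(f"z = {z:.3f}, critical value = {z_critical:.3f}")
print("reject H0 at 5%:", z < z_critical)
```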