OCR MEI S2 (Statistics 2) 2014 June

Question 1
1 A medical student is investigating the claim that young adults with high diastolic blood pressure tend to have high systolic blood pressure. The student measures the diastolic and systolic blood pressures of a random sample of ten young adults. The data are shown in the table and illustrated in the scatter diagram.
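\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|}
\hline
Diastolic blood pressure & 60 & 61 & 62 & 63 & 73 & 76 & 84 & 87 & 90 & 95 \\
\hline
Systolic blood pressure & 98 & 121 & 118 & 114 & 108 & 112 & 132 & 130 & 134 & 139 \\
\hline
\end{tabular}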
\includegraphics[max width=\textwidth, alt={}, center]{17e474c4-f5be-4ca1-b7c3-e444b46c3bec-2_865_809_593_628}
  1. Calculate the value of Spearman's rank correlation coefficient for these data.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to examine whether there is positive association between diastolic blood pressure and systolic blood pressure in the population of young adults.
  3. Explain why, in the light of the scatter diagram, it might not be valid to carry out a test based on the product moment correlation coefficient.
The product moment correlation coefficient between the diastolic and systolic blood pressures of a random sample of 10 athletes is 0.707.
  4. Carry out a hypothesis test at the \(1 \%\) significance level to investigate whether there appears to be positive correlation between these two variables in the population of athletes. You may assume that in this case such a test is valid.
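As a check on part 1 of Question 1, Spearman's coefficient can be computed directly from the ranks using \(r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}\). The sketch below is illustrative only: the helper names `ranks` and `spearman` are not from the paper, and the ranking step assumes no tied values, which holds for this sample.

```python
def ranks(values):
    # Rank 1 = smallest value. Assumes no ties (true for this sample).
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    # r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), d = difference in ranks.
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


diastolic = [60, 61, 62, 63, 73, 76, 84, 87, 90, 95]
systolic = [98, 121, 118, 114, 108, 112, 132, 130, 134, 139]

r_s = spearman(diastolic, systolic)
print(round(r_s, 4))  # 0.7576
```

For part 2, this value would be compared with the tabulated one-tailed 5% critical value for \(n = 10\).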
Question 2
2 Manufacturing defects occur in a particular type of aluminium sheeting randomly, independently and at a constant average rate of 1.7 defects per square metre.
  1. Explain the meaning of the term 'independently' and name the distribution that models this situation.
  2. Find the probability that there are exactly 2 defects in a sheet of area 1 square metre.
  3. Find the probability that there are exactly 12 defects in a sheet of area 7 square metres.
In another type of aluminium sheet, defects occur randomly, independently and at a constant average rate of 0.8 defects per square metre.
  4. A large box is made from 2 square metres of the first type of sheet and 2 square metres of the second type of sheet, chosen independently. Show that the probability that there are at least 8 defects altogether in the box is 0.1334.
A random sample of 100 of these boxes is selected.
  5. State the exact distribution of the number of boxes which have at least 8 defects.
  6. Use a suitable approximating distribution to find the probability that there are at least 20 boxes in the sample which have at least 8 defects.
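The Poisson calculations in Question 2 can be checked numerically. The following is a minimal sketch, not a model answer: `poisson_pmf` is a hypothetical helper, and the last step assumes that a Normal approximation to the binomial with a continuity correction is the intended "suitable approximating distribution".

```python
from math import exp, factorial, sqrt
from statistics import NormalDist


def poisson_pmf(k, lam):
    # P(X = k) for X ~ Poisson(lam)
    return exp(-lam) * lam ** k / factorial(k)


# Part 2: exactly 2 defects in 1 square metre (mean 1.7)
p_two = poisson_pmf(2, 1.7)

# Part 3: exactly 12 defects in 7 square metres (mean 7 * 1.7 = 11.9)
p_twelve = poisson_pmf(12, 7 * 1.7)

# Part 4: independent Poisson counts add, so the box mean is 2*1.7 + 2*0.8 = 5
lam_box = 2 * 1.7 + 2 * 0.8
p_box = 1 - sum(poisson_pmf(k, lam_box) for k in range(8))  # P(X >= 8)
print(round(p_box, 4))  # 0.1334

# Part 6 (assumed approach): B(100, p_box) approximated by a Normal
# distribution with the same mean and variance, using P(X >= 20) ~ P(Y > 19.5)
n = 100
mean = n * p_box
sd = sqrt(n * p_box * (1 - p_box))
p_at_least_20 = 1 - NormalDist(mean, sd).cdf(19.5)
```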
Question 3
3 The wing lengths of native English male blackbirds, measured in mm, are Normally distributed with mean 130.5 and variance 11.84.
  1. Find the probability that a randomly selected native English male blackbird has a wing length greater than 135 mm.
  2. Given that \(1 \%\) of native English male blackbirds have wing length more than \(k \mathrm {~mm}\), find the value of \(k\).
  3. Find the probability that a randomly selected native English male blackbird has a wing length which is 131 mm correct to the nearest millimetre.
It is suspected that Scandinavian male blackbirds have, on average, longer wings than native English male blackbirds. A random sample of 20 Scandinavian male blackbirds has mean wing length 132.4 mm. You may assume that wing lengths in this population are Normally distributed with variance \(11.84 \mathrm {~mm} ^ { 2 }\).
  4. Carry out an appropriate hypothesis test, at the \(5 \%\) significance level.
  5. Discuss briefly one advantage and one disadvantage of using a \(10 \%\) significance level rather than a \(5 \%\) significance level in hypothesis testing in general.
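The Normal-distribution parts of Question 3 can be sketched with Python's standard library. This is an illustration under stated assumptions rather than a mark scheme: the variable names are invented here, and part 4 is read as a one-tailed test of \(H_0: \mu = 130.5\) against \(H_1: \mu > 130.5\), with \(\mu\) the mean wing length of Scandinavian male blackbirds.

```python
from math import sqrt
from statistics import NormalDist

wing = NormalDist(mu=130.5, sigma=sqrt(11.84))

# Part 1: P(X > 135)
p_gt_135 = 1 - wing.cdf(135)

# Part 2: k such that P(X > k) = 0.01
k = wing.inv_cdf(0.99)

# Part 3: 131 mm to the nearest millimetre means 130.5 <= X < 131.5
p_131 = wing.cdf(131.5) - wing.cdf(130.5)

# Part 4 (assumed one-tailed z-test): sample mean 132.4 from n = 20,
# population variance 11.84 taken as known
n = 20
z_stat = (132.4 - 130.5) / sqrt(11.84 / n)
z_crit = NormalDist().inv_cdf(0.95)  # upper 5% point of N(0, 1)
reject_h0 = z_stat > z_crit

print(round(z_stat, 3), round(z_crit, 3), reject_h0)
```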
Question 4
4 A researcher at a large company thinks that there may be some relationship between the numbers of working days lost due to illness per year and the ages of the workers in the company. The researcher selects a random sample of 190 workers. The ages of the workers and numbers of days lost for a period of 1 year are summarised below.
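\begin{tabular}{|c|l|c|c|c|}
\cline{3-5}
\multicolumn{2}{c|}{} & \multicolumn{3}{c|}{Working days lost} \\
\cline{3-5}
\multicolumn{2}{c|}{} & 0 to 4 & 5 to 9 & 10 or more \\
\hline
\multirow{3}{*}{Age} & Under 35 & 31 & 27 & 4 \\
\cline{2-5}
 & 35 to 50 & 28 & 32 & 8 \\
\cline{2-5}
 & Over 50 & 16 & 28 & 16 \\
\hline
\end{tabular}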
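The table above can be compared with what independence of age and days lost would predict. The sketch below computes expected frequencies as (row total × column total) / grand total and the cell contributions \((O - E)^2 / E\); it is a rough check under the usual chi-squared set-up, and all variable names are introduced here for illustration.

```python
# Observed frequencies: rows are Under 35, 35 to 50, Over 50;
# columns are 0 to 4, 5 to 9, 10 or more working days lost.
observed = [
    [31, 27, 4],
    [28, 32, 8],
    [16, 28, 16],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected frequency under no association
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Contribution of each cell to the test statistic
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

statistic = sum(sum(row) for row in contributions)
df = (len(observed) - 1) * (len(observed[0]) - 1)

for row in contributions:
    print([round(c, 4) for c in row])
print(round(statistic, 2), df)
```

The statistic would then be compared with the tabulated 1% critical value of the chi-squared distribution on `df` degrees of freedom.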
  1. Carry out a test at the \(1 \%\) significance level to investigate whether the researcher's belief appears to be true. Your working should include a table showing the contributions of each cell to the test statistic.
  2. For the 'Over 50' age group, comment briefly on how the working days lost compare with what would be expected if there were no association.
  3. A student decides to reclassify the 'working days lost' into two groups, '0 to 4' and '5 or more', but leave the age groups as before. The test statistic with this classification is 7.08. Carry out the test at the \(1 \%\) level with this new classification, using the same hypotheses as for the original test.
  4. Comment on the results of the two tests.
\section*{END OF QUESTION PAPER}