OCR MEI S2 (Statistics 2) 2006 January

Question 1 5 marks
1 A roller-coaster ride has a safety system to detect faults on the track.
  1. State conditions for a Poisson distribution to be a suitable model for the number of faults occurring on a randomly selected day.

  Faults are detected at an average rate of 0.15 per day. You may assume that a Poisson distribution is a suitable model.
  2. Find the probability that on a randomly chosen day there are
    (A) no faults,
    (B) at least 2 faults.
  3. Find the probability that, in a randomly chosen period of 30 days, there are at most 3 faults.

  There is also a separate safety system to detect faults on the roller-coaster train itself. Faults are detected by this system at an average rate of 0.05 per day, independently of the faults detected on the track. You may assume that a Poisson distribution is also suitable for modelling the number of faults detected on the train.
  4. State the distribution of the total number of faults detected by the two systems in a period of 10 days. Find the probability that a total of 5 faults is detected in a period of 10 days.
  5. The roller-coaster is operational for 200 days each year. Use a suitable approximating distribution to find the probability that a total of at least 50 faults is detected in 200 days. [5]
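The numerical parts of Question 1 can be checked with a short script. This is a minimal sketch rather than a model answer: the helper names `poisson_pmf` and `poisson_cdf` are illustrative, and the last part assumes that the intended approximation is a Normal distribution with a continuity correction.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


# Track faults on one day: X ~ Poisson(0.15)
p_none = poisson_pmf(0, 0.15)
p_at_least_2 = 1 - poisson_cdf(1, 0.15)

# Track faults in 30 days: mean 30 * 0.15 = 4.5
p_at_most_3 = poisson_cdf(3, 30 * 0.15)

# Track and train faults together in 10 days: mean 10 * (0.15 + 0.05) = 2
p_total_5 = poisson_pmf(5, 10 * (0.15 + 0.05))

# 200 days: Poisson(40), approximated here by N(40, 40).
# P(X >= 50) becomes P(Y > 49.5) after the continuity correction.
mean_200 = 200 * (0.15 + 0.05)
p_at_least_50 = 1 - NormalDist(mean_200, sqrt(mean_200)).cdf(49.5)

for label, value in [
    ("P(no faults in a day)", p_none),
    ("P(at least 2 faults in a day)", p_at_least_2),
    ("P(at most 3 faults in 30 days)", p_at_most_3),
    ("P(total of 5 faults in 10 days)", p_total_5),
    ("P(at least 50 faults in 200 days), Normal approx.", p_at_least_50),
]:
    print(f"{label}: {value:.4f}")
```

The sum of independent Poisson variables is itself Poisson, which is why the track and train rates are simply added before scaling by the number of days.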
Question 2
2 The drug EPO (erythropoietin) is taken by some athletes to improve their performance. This drug is in fact banned and blood samples taken from athletes are tested to measure their 'hematocrit level'. If the level is over 50 it is considered that the athlete is likely to have taken EPO and the result is described as 'positive'. The measured hematocrit level of each athlete varies over time, even if EPO has not been taken.
  1. For each athlete in a large population of innocent athletes, the variation in measured hematocrit level is described by the Normal distribution with mean 42.0 and standard deviation 3.0.
    (A) Show that the probability that such an athlete tests positive for EPO in a randomly chosen test is 0.0038 .
    (B) Find the probability that such an athlete tests positive on at least 1 of the 7 occasions during the year when hematocrit level is measured. (These occasions are spread at random through the year and all test results are assumed to be independent.)
    (C) It is standard policy to apply a penalty after testing positive. Comment briefly on this policy in the light of your answer to part (i)(B).
  2. Suppose that 1000 tests are carried out on innocent athletes whose variation in measured hematocrit level is as described in part (i). It may be assumed that the probability of a positive result in each test is 0.0038 , independently of all other test results.
    (A) State the exact distribution of the number of positive tests.
    (B) Use a suitable approximating distribution to find the probability that at least 10 tests are positive.
  3. Because of genetic factors, a particular innocent athlete has an abnormally high natural hematocrit level. This athlete's measured level is Normally distributed with mean 48.0 and standard deviation 2.0. The usual limit of 50 for a positive test is to be altered for this athlete to a higher value \(h\). Find the value of \(h\) for which this athlete would test positive on average just once in 200 occasions.
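The calculations in Question 2 can be sketched with Python's `statistics.NormalDist`. The variable names are illustrative, and the script assumes that a Poisson distribution with mean 1000 × 0.0038 is the intended approximation to the binomial in part (ii)(B).

```python
from math import exp, factorial
from statistics import NormalDist

# (i)(A) Innocent athletes: level ~ N(42.0, 3.0^2); positive if level > 50
innocent = NormalDist(mu=42.0, sigma=3.0)
p_positive = 1 - innocent.cdf(50)

# (i)(B) At least one positive in 7 independent tests, using p = 0.0038
p_at_least_one_of_7 = 1 - (1 - 0.0038) ** 7

# (ii)(B) Exact model is B(1000, 0.0038); approximate by Poisson(3.8)
lam = 1000 * 0.0038
p_at_most_9 = sum(exp(-lam) * lam**k / factorial(k) for k in range(10))
p_at_least_10 = 1 - p_at_most_9

# (iii) Athlete with level ~ N(48.0, 2.0^2): choose h so that P(level > h) = 1/200
h = NormalDist(mu=48.0, sigma=2.0).inv_cdf(1 - 1 / 200)

print(f"P(positive in one test):      {p_positive:.4f}")
print(f"P(at least one in 7 tests):   {p_at_least_one_of_7:.4f}")
print(f"P(at least 10 in 1000 tests): {p_at_least_10:.4f}")
print(f"Adjusted limit h:             {h:.2f}")
```

The second result shows that an innocent athlete has a chance of roughly 1 in 40 of at least one positive test in a year, which is the figure to weigh when commenting on the penalty policy.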
Question 3
3 A researcher is investigating the relationship between temperature and levels of the air pollutant nitrous oxide at a particular site. The researcher believes that there will be a positive correlation between the daily maximum temperature, \(x\), and nitrous oxide level, \(y\). Data are collected for 10 randomly selected days. The data, measured in suitable units, are given in the table and illustrated on the scatter diagram.
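\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline
\(x\) & 13.3 & 17.2 & 16.9 & 18.7 & 18.4 & 19.3 & 23.1 & 15.0 & 20.6 & 14.4 \\
\hline
\(y\) & 9 & 11 & 14 & 26 & 43 & 25 & 52 & 15 & 10 & 7 \\
\hline
\end{tabular}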
\includegraphics[max width=\textwidth, alt={}, center]{794b337f-6306-4d2e-bb5e-af8cedc9742e-4_823_1234_774_370}
  1. Calculate the value of Spearman's rank correlation coefficient for these data.
  2. Perform a hypothesis test at the \(5 \%\) level to check the researcher's belief, stating your hypotheses clearly.
  3. It is suggested that it would be preferable to carry out a test based on the product moment correlation coefficient. State the distributional assumption required for such a test to be valid. Explain how a scatter diagram can be used to check whether the distributional assumption is likely to be valid and comment on the validity in this case.
  4. A statistician investigates data over a much longer period and finds that the assumptions for the use of the product moment correlation coefficient are in fact valid. Give the critical region for the test at the \(1 \%\) level, based on a sample of 60 days.
  5. In a different research project, into the correlation between daily temperature and ozone pollution levels, a positive correlation is found. It is argued that this shows that high temperatures cause increased ozone levels. Comment on this claim.
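For part (i) of Question 3, Spearman's rank correlation coefficient can be computed from the rank differences. The sketch below takes the ten \(y\) values to be 9, 11, 14, 26, 43, 25, 52, 15, 10, 7, which is an assumed reading of the data row, and the helper `ranks` is illustrative and only valid when there are no tied values.

```python
x = [13.3, 17.2, 16.9, 18.7, 18.4, 19.3, 23.1, 15.0, 20.6, 14.4]
y = [9, 11, 14, 26, 43, 25, 52, 15, 10, 7]  # assumed reading of the y row


def ranks(values):
    """Rank each value from 1 (smallest) upward; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


rank_x = ranks(x)
rank_y = ranks(y)
n = len(x)

# r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1)), where d is the difference in ranks
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
r_s = 1 - 6 * sum_d2 / (n * (n**2 - 1))

print("ranks of x:", rank_x)
print("ranks of y:", rank_y)
print("sum of d^2:", sum_d2)
print(f"r_s = {r_s:.4f}")
```

The value of `r_s` is then compared with the one-tailed 5% critical value for \(n = 10\) from the statistical tables supplied with the examination.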
Question 4
4 The table summarises the usual method of travelling to school for 200 randomly selected pupils from primary and secondary schools in a city.
\begin{tabular}{|c|c|c|c|}
\hline
 & & Primary & Secondary \\
\hline
\multirow{3}{*}{Method of travel} & Bus & 21 & 49 \\
\cline{2-4}
 & Car & 65 & 15 \\
\cline{2-4}
 & Cycle or Walk & 34 & 16 \\
\hline
\end{tabular}
  1. Write down null and alternative hypotheses for a test to examine whether there is any association between method of travel and type of school.
  2. Calculate the expected frequency for primary school bus users. Calculate also the corresponding contribution to the test statistic for the usual \(\chi ^ { 2 }\) test.
  3. Given that the value of the test statistic for the usual \(\chi ^ { 2 }\) test is 42.64 , carry out the test at the \(5 \%\) level of significance, stating your conclusion clearly.

  The mean travel time for pupils who travel by bus is known to be 18.3 minutes. A survey is carried out to determine whether the mean travel time to school by car is different from 18.3 minutes. In the survey, 20 pupils who travel by car are selected at random. Their mean travel time is found to be 22.4 minutes.
  4. Assuming that car travel times are Normally distributed with standard deviation 8.0 minutes, carry out a test at the \(10 \%\) level, stating your hypotheses and conclusion clearly.
  5. Comment on the suggestion that pupils should use a bus if they want to get to school quickly.
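The arithmetic behind Question 4 can be sketched as follows. The observed frequencies are read as Bus 21 and 49, Car 65 and 15, Cycle or Walk 34 and 16 (Primary, Secondary), which sum to the stated 200 pupils. All names are illustrative. The \(\chi^2\) critical value uses the fact that, with 2 degrees of freedom, the upper tail probability is \(e^{-x/2}\); the test on car travel times is taken to be a two-tailed Normal (z) test.

```python
from math import log, sqrt
from statistics import NormalDist

# Observed frequencies as (Primary, Secondary)
observed = {"Bus": (21, 49), "Car": (65, 15), "Cycle or Walk": (34, 16)}
col_totals = [sum(row[j] for row in observed.values()) for j in range(2)]
grand_total = sum(col_totals)

# Expected frequency = row total * column total / grand total
expected = {}
contributions = {}
for method, row in observed.items():
    row_total = sum(row)
    for j, obs in enumerate(row):
        exp_freq = row_total * col_totals[j] / grand_total
        expected[(method, j)] = exp_freq
        contributions[(method, j)] = (obs - exp_freq) ** 2 / exp_freq

chi_sq = sum(contributions.values())

# Degrees of freedom: (3 - 1) * (2 - 1) = 2, so P(X > x) = exp(-x / 2)
chi_sq_critical = -2 * log(0.05)

# z test: H0 mean = 18.3, H1 mean != 18.3, sigma = 8.0, n = 20, sample mean 22.4
z = (22.4 - 18.3) / (8.0 / sqrt(20))
z_critical = NormalDist().inv_cdf(0.95)  # two-tailed test at the 10% level

print(f"expected primary bus users: {expected[('Bus', 0)]:.1f}")
print(f"contribution to chi-squared: {contributions[('Bus', 0)]:.2f}")
print(f"chi-squared = {chi_sq:.2f}, 5% critical value = {chi_sq_critical:.3f}")
print(f"z = {z:.3f}, critical values = +/-{z_critical:.3f}")
```

In each test the null hypothesis is rejected when the statistic exceeds the critical value in magnitude. A significant result for car travel times concerns the mean only, which is worth bearing in mind when commenting on part (v).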