OCR MEI S2 (Statistics 2) 2008 June

Question 1
1 A researcher believes that there is a negative correlation between money spent by the government on education and population growth in various countries. A random sample of 48 countries is selected to investigate this belief. The level of government spending on education \(x\), measured in suitable units, and the annual percentage population growth rate \(y\), are recorded for these countries. Summary statistics for these data are as follows. $$\Sigma x = 781.3 \quad \Sigma y = 57.8 \quad \Sigma x ^ { 2 } = 14055 \quad \Sigma y ^ { 2 } = 106.3 \quad \Sigma x y = 880.1 \quad n = 48$$
  1. Calculate the sample product moment correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to investigate the researcher's belief. State your hypotheses clearly, defining any symbols which you use.
  3. State the distributional assumption which is necessary for this test to be valid. Explain briefly how a scatter diagram may be used to check whether this assumption is likely to be valid.
  4. A student suggests that if the variables are negatively correlated then population growth rates can be reduced by increasing spending on education. Explain why the student may be wrong. Discuss an alternative explanation for the correlation.
  5. State briefly one advantage and one disadvantage of using a smaller sample size in this investigation.
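The calculation in part 1 can be sketched as follows. This is only an illustrative check using the summary statistics quoted above; the variable names (`s_xy`, `s_xx`, `s_yy`, `r`) are chosen here and are not part of the question paper.

```python
import math

# Summary statistics quoted in Question 1.
n = 48
sum_x, sum_y = 781.3, 57.8
sum_x2, sum_y2, sum_xy = 14055, 106.3, 880.1

# Corrected sums of squares and products.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n

# Sample product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"Sxy = {s_xy:.3f}, Sxx = {s_xx:.3f}, Syy = {s_yy:.3f}")
print(f"r = {r:.4f}")
```

For part 2, a typical approach would be to test \(\mathrm{H}_0: \rho = 0\) against \(\mathrm{H}_1: \rho < 0\), where \(\rho\) is the population correlation coefficient, comparing \(r\) with the tabulated one-tailed critical value for \(n = 48\).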
Question 2
2 A public water supply contains bacteria. Each day an analyst checks the water quality by counting the number of bacteria in a random sample of 5 ml of water. Throughout this question, you should assume that the bacteria occur randomly at a mean rate of 0.37 bacteria per 5 ml of water.
  1. Use a Poisson distribution to
    (A) find the probability that a 5 ml sample contains exactly 2 bacteria,
    (B) show that the probability that a 5 ml sample contains more than 2 bacteria is 0.0064.
  2. The month of September has 30 days. Find the probability that during September there is at most one day when a 5 ml sample contains more than 2 bacteria.
  The daily 5 ml sample is the first stage of the quality control process. The remainder of the process is as follows.
    • If the 5 ml sample contains more than 2 bacteria, then a 50 ml sample is taken.
    • If this 50 ml sample contains more than 8 bacteria, then a sample of 1000 ml is taken.
    • If this 1000 ml sample contains more than 90 bacteria, then the supply is declared to be 'questionable'.
  3. Find the probability that a random sample of 50 ml contains more than 8 bacteria.
  4. Use a suitable approximating distribution to find the probability that a random sample of 1000 ml contains more than 90 bacteria.
  5. Find the probability that the supply is declared to be questionable.
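The probabilities asked for in Question 2 could be checked numerically along these lines. This is a sketch under stated assumptions, not a model answer: the helper names are invented here, the mean is scaled in proportion to the sample volume (0.37 per 5 ml, 3.7 per 50 ml, 74 per 1000 ml), the 1000 ml case uses a Normal approximation with a continuity correction, and the three samples are assumed to be independent.

```python
import math
from statistics import NormalDist

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

rate_5ml = 0.37

# 5 ml sample: exactly 2 bacteria, and more than 2 bacteria.
p_exactly_2 = poisson_pmf(2, rate_5ml)
p_more_than_2 = 1 - poisson_cdf(2, rate_5ml)

# Number of days in September (out of 30) with more than 2 bacteria is
# Binomial(30, p_more_than_2); find P(at most one such day).
p = p_more_than_2
p_at_most_one_day = (1 - p) ** 30 + 30 * p * (1 - p) ** 29

# 50 ml sample: mean 3.7 (assumes the rate scales with volume).
p_50ml_over_8 = 1 - poisson_cdf(8, 3.7)

# 1000 ml sample: mean 74, approximated by N(74, 74) with a
# continuity correction, so P(X > 90) is taken as P(Y > 90.5).
approx = NormalDist(mu=74, sigma=math.sqrt(74))
p_1000ml_over_90 = 1 - approx.cdf(90.5)

# 'Questionable' requires all three stages to fail (assumes independence).
p_questionable = p_more_than_2 * p_50ml_over_8 * p_1000ml_over_90

print(f"P(exactly 2 in 5 ml)      = {p_exactly_2:.4f}")
print(f"P(more than 2 in 5 ml)    = {p_more_than_2:.4f}")
print(f"P(at most one such day)   = {p_at_most_one_day:.4f}")
print(f"P(more than 8 in 50 ml)   = {p_50ml_over_8:.4f}")
print(f"P(more than 90 in 1000 ml) = {p_1000ml_over_90:.4f}")
print(f"P(questionable)           = {p_questionable:.3e}")
```

The sketch keeps the unrounded value of the probability for the 5 ml sample; using the rounded 0.0064 from part 1(B) in the later parts would change the results slightly.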
Question 3
3 A company has a fleet of identical vans. Company policy is to replace all of the tyres on a van as soon as any one of them is worn out. The random variable \(X\) represents the number of miles driven before the tyres on a van are replaced. \(X\) is Normally distributed with mean 27500 and standard deviation 4000.
  1. Find \(\mathrm { P } ( X > 25000 )\).
  2. 10 vans in the fleet are selected at random. Find the probability that the tyres on exactly 7 of them last for more than 25000 miles.
  3. The tyres of \(99 \%\) of vans last for more than \(k\) miles. Find the value of \(k\).
  A tyre supplier claims that a different type of tyre will have a greater mean lifetime. A random sample of 15 vans is fitted with these tyres. For each van, the number of miles driven before the tyres are replaced is recorded. A hypothesis test is carried out to investigate the claim. You may assume that these lifetimes are also Normally distributed with standard deviation 4000.
  4. Write down suitable null and alternative hypotheses for the test.
  5. For the 15 vans, it is found that the mean lifetime of the tyres is 28630 miles. Carry out the test at the \(5 \%\) level.
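The Normal-distribution calculations in Question 3 might be checked with a short script such as the one below. It is a sketch only: the variable names are chosen here, and the test in part 5 is assumed to be a one-tailed test of \(\mathrm{H}_0: \mu = 27500\) against \(\mathrm{H}_1: \mu > 27500\), where \(\mu\) is the population mean lifetime of the new tyres.

```python
from math import comb, sqrt
from statistics import NormalDist

# X ~ N(27500, 4000^2): miles driven before the tyres are replaced.
lifetime = NormalDist(mu=27500, sigma=4000)

# Part 1: P(X > 25000).
p_over_25000 = 1 - lifetime.cdf(25000)

# Part 2: exactly 7 of 10 vans exceed 25000 miles, Binomial(10, p).
p_exactly_7 = comb(10, 7) * p_over_25000 ** 7 * (1 - p_over_25000) ** 3

# Part 3: 99% of vans last more than k miles, so P(X < k) = 0.01.
k = lifetime.inv_cdf(0.01)

# Part 5: one-tailed z-test with known sigma = 4000 and n = 15.
sample_mean, n = 28630, 15
z = (sample_mean - 27500) / (4000 / sqrt(n))
critical_value = NormalDist().inv_cdf(0.95)  # upper 5% point of N(0, 1)
reject_h0 = z > critical_value

print(f"P(X > 25000) = {p_over_25000:.4f}")
print(f"P(exactly 7 of 10) = {p_exactly_7:.4f}")
print(f"k = {k:.0f}")
print(f"z = {z:.3f}, critical value = {critical_value:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the test statistic falls below the critical value, so the sample would not provide sufficient evidence at the \(5 \%\) level that the new tyres have a greater mean lifetime.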
Question 4
4 A student is investigating whether there is any association between the species of shellfish that occur on a rocky shore and where they are located. A random sample of 160 shellfish is selected and the numbers of shellfish in each category are summarised in the table below.

| Species (rows) / Location (columns) | Exposed | Sheltered | Pool |
|---|---|---|---|
| Limpet | 24 | 32 | 16 |
| Mussel | 24 | 11 | 3 |
| Other | 5 | 22 | 23 |

  1. Write down null and alternative hypotheses for a test to examine whether there is any association between species and location.
  The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.

| Contribution: Species (rows) / Location (columns) | Exposed | Sheltered | Pool |
|---|---|---|---|
| Limpet | 0.0009 | 0.2585 | 0.4450 |
| Mussel | 10.3472 | 1.2756 | 4.8773 |
| Other | 8.0719 | 0.1402 | 7.4298 |

  The sum of these contributions is 32.85.
  2. Calculate the expected frequency for mussels in pools. Verify the corresponding contribution 4.8773 to the test statistic.
  3. Carry out the test at the \(5 \%\) level of significance, stating your conclusion clearly.
  4. For each species, comment briefly on how its distribution compares with what would be expected if there were no association.
  5. If 3 of the 160 shellfish are selected at random, one from each of the 3 types of location, find the probability that all 3 of them are limpets.
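The expected frequencies and contributions in Question 4 can be reproduced from the observed table, as sketched below. The variable names are illustrative, the critical value 9.488 is the standard tabulated upper \(5 \%\) point of \(\chi^2\) with 4 degrees of freedom, and part 5 is read here as picking one shellfish at random from those found at each location.

```python
# Observed frequencies: rows are species, columns are locations.
species = ["Limpet", "Mussel", "Other"]
locations = ["Exposed", "Sheltered", "Pool"]
observed = [
    [24, 32, 16],
    [24, 11, 3],
    [5, 22, 23],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected frequency = row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Contribution = (observed - expected)^2 / expected.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
statistic = sum(sum(row) for row in contributions)

# Degrees of freedom and tabulated 5% critical value for 4 d.f.
dof = (len(species) - 1) * (len(locations) - 1)
critical_value = 9.488
reject_h0 = statistic > critical_value

# Part 5: one shellfish from each location, all three limpets.
p_all_limpets = 1.0
for limpets, col_total in zip(observed[0], col_totals):
    p_all_limpets *= limpets / col_total

print(f"Expected mussels in pools = {expected[1][2]:.3f}")
print(f"Contribution (mussel, pool) = {contributions[1][2]:.4f}")
print(f"Test statistic = {statistic:.2f} on {dof} d.f., reject H0: {reject_h0}")
print(f"P(all three are limpets) = {p_all_limpets:.4f}")
```

Since the statistic is well above 9.488, this sketch would lead to rejecting the null hypothesis of no association between species and location at the \(5 \%\) level.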