Edexcel S4 (Statistics 4) 2002 June

Question 1
  1. The random variable \(X\) has an \(F\) distribution with 10 and 12 degrees of freedom. Find \(a\) and \(b\) such that \(\mathrm { P } ( a < X < b ) = 0.90\).
    (3)
Question 2
2. A chemist has developed a fuel additive and claims that it reduces the fuel consumption of cars. To test this claim, 8 randomly selected cars were each filled with 20 litres of fuel and driven around a race circuit. Each car was tested twice, once with the additive and once without. The distances, in miles, that each car travelled before running out of fuel are given in the table below.
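
| Car | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Distance without additive | 163 | 172 | 195 | 170 | 183 | 185 | 161 | 176 |
| Distance with additive | 168 | 185 | 187 | 172 | 180 | 189 | 172 | 175 |
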
Assuming that the distances travelled follow a normal distribution, and stating your hypotheses clearly, test, at the \(10 \%\) level of significance, whether or not there is evidence to support the chemist's claim.
(8)
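One possible way to set up the arithmetic for this question is a paired t-test on the differences (with additive minus without), taking \(\mathrm{H}_0: \mu_d = 0\) against \(\mathrm{H}_1: \mu_d > 0\). The sketch below assumes that reading of the question. The variable names are illustrative, and the critical value is the upper 10% point of the t distribution with 7 degrees of freedom as given in standard tables.

```python
from math import sqrt
from statistics import mean, stdev

# Distances (miles) for cars 1 to 8, copied from the table in the question.
without_additive = [163, 172, 195, 170, 183, 185, 161, 176]
with_additive = [168, 185, 187, 172, 180, 189, 172, 175]

# Paired differences: a positive value means the car went further with the additive.
diffs = [w - wo for w, wo in zip(with_additive, without_additive)]

n = len(diffs)
d_bar = mean(diffs)               # sample mean of the differences
s_d = stdev(diffs)                # sample standard deviation (divisor n - 1)
t_stat = d_bar / (s_d / sqrt(n))  # paired t statistic on n - 1 degrees of freedom

# Upper 10% point of t with 7 degrees of freedom (table value, one-tailed test).
T_CRIT = 1.415

print(f"differences = {diffs}")
print(f"mean = {d_bar:.3f}, s = {s_d:.3f}, t = {t_stat:.3f}")
print("reject H0" if t_stat > T_CRIT else "do not reject H0")
```

Under these assumptions the statistic comes out at roughly 1.16, which lies below the table value, so this working would not find evidence to support the claim at the 10% level.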
Question 3
3. A technician is trying to estimate the area \(\mu ^ { 2 }\) of a metal square. The independent random variables \(X _ { 1 }\) and \(X _ { 2 }\) are each distributed \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) and represent two measurements of the sides of the square. Two estimators of the area, \(A _ { 1 }\) and \(A _ { 2 }\), are proposed where $$A _ { 1 } = X _ { 1 } X _ { 2 } \quad \text { and } \quad A _ { 2 } = \left( \frac { X _ { 1 } + X _ { 2 } } { 2 } \right) ^ { 2 } .$$ [You may assume that if \(X _ { 1 }\) and \(X _ { 2 }\) are independent random variables then $$\mathrm { E } \left( X _ { 1 } X _ { 2 } \right) = \mathrm { E } \left( X _ { 1 } \right) \mathrm { E } \left( X _ { 2 } \right) .$$]
  1. Find \(\mathrm { E } \left( A _ { 1 } \right)\) and show that \(\mathrm { E } \left( A _ { 2 } \right) = \mu ^ { 2 } + \frac { \sigma ^ { 2 } } { 2 }\).
  2. Find the bias of each of these estimators.
  The technician is told that \(\operatorname { Var } \left( A _ { 1 } \right) = \sigma ^ { 4 } + 2 \mu ^ { 2 } \sigma ^ { 2 }\) and \(\operatorname { Var } \left( A _ { 2 } \right) = \frac { 1 } { 2 } \sigma ^ { 4 } + 2 \mu ^ { 2 } \sigma ^ { 2 }\). The technician decided to use \(A _ { 1 }\) as the estimator for \(\mu ^ { 2 }\).
  3. Suggest a possible reason for this decision.
  A statistician suggests taking a random sample of \(n\) measurements of sides of the square and finding the mean \(\bar { X }\). He knows that \(\mathrm { E } \left( \bar { X } ^ { 2 } \right) = \mu ^ { 2 } + \frac { \sigma ^ { 2 } } { n }\) and \(\operatorname { Var } \left( \bar { X } ^ { 2 } \right) = \frac { 2 \sigma ^ { 4 } } { n ^ { 2 } } + \frac { 4 \sigma ^ { 2 } \mu ^ { 2 } } { n }\).
  4. Explain whether or not \(\bar { X } ^ { 2 }\) is a consistent estimator of \(\mu ^ { 2 }\).
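The expectations quoted in this question can be checked numerically. The following is a rough Monte Carlo sketch, not a proof: the values of `MU` and `SIGMA`, the seed and the number of trials are hypothetical choices made only for illustration.

```python
import random

# Hypothetical parameter values chosen for illustration; they are not from the question.
MU, SIGMA = 5.0, 1.0
TRIALS = 200_000

rng = random.Random(2002)  # fixed seed so the run is repeatable

a1_total = 0.0
a2_total = 0.0
for _ in range(TRIALS):
    x1 = rng.gauss(MU, SIGMA)
    x2 = rng.gauss(MU, SIGMA)
    a1_total += x1 * x2              # A1 = X1 * X2
    a2_total += ((x1 + x2) / 2) ** 2  # A2 = ((X1 + X2) / 2)^2

a1_mean = a1_total / TRIALS
a2_mean = a2_total / TRIALS

print(f"mean of A1 = {a1_mean:.3f}  (mu^2 = {MU**2:.3f})")
print(f"mean of A2 = {a2_mean:.3f}  (mu^2 + sigma^2/2 = {MU**2 + SIGMA**2 / 2:.3f})")
```

With these settings the simulated mean of \(A_1\) should sit close to \(\mu^2\) and that of \(A_2\) close to \(\mu^2 + \frac{\sigma^2}{2}\), which matches the result the question asks to be shown.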
Question 4
4. A recent census in the U.K. revealed that the heights of females in the U.K. have a mean of 160.9 cm. A doctor is studying the heights of female Indians in a remote region of South America. The doctor measured the height, \(x \mathrm {~cm}\), of each of a random sample of 30 female Indians and obtained the following statistics. $$\Sigma x = 4400.7 , \quad \Sigma x ^ { 2 } = 646904.41 .$$ The heights of female Indians may be assumed to follow a normal distribution.
The doctor presented the results of the study in a medical journal and wrote 'the female Indians in this region are more than 10 cm shorter than females in the U.K.'
  1. Stating your hypotheses clearly and using a \(5 \%\) level of significance, test the doctor's statement.
    (6)
  The census also revealed that the standard deviation of the heights of U.K. females was 6.0 cm.
  2. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not there is evidence that the variance of the heights of female Indians is different from that of females in the U.K.
    (6)
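A minimal sketch of the summary statistics and test statistics for this question follows. It assumes part 1 is read as \(\mathrm{H}_0: \mu = 150.9\) against \(\mathrm{H}_1: \mu < 150.9\) (a one-tailed t-test on 29 degrees of freedom) and part 2 as a two-tailed chi-squared test of \(\mathrm{H}_0: \sigma = 6.0\). The variable names are illustrative.

```python
from math import sqrt

n = 30
sum_x = 4400.7
sum_x2 = 646904.41

x_bar = sum_x / n                        # sample mean
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)   # unbiased estimate of the variance

# Part 1 (assumed reading): H0: mu = 160.9 - 10 against H1: mu < 160.9 - 10.
mu_0 = 160.9 - 10
t_stat = (x_bar - mu_0) / sqrt(s2 / n)   # compare with t on n - 1 = 29 d.f.

# Part 2: H0: sigma = 6.0 against H1: sigma != 6.0.
sigma_0 = 6.0
chi2_stat = (n - 1) * s2 / sigma_0**2    # compare with chi-squared on 29 d.f.

print(f"x_bar = {x_bar:.2f}, s^2 = {s2:.3f}")
print(f"t = {t_stat:.3f}")
print(f"chi-squared = {chi2_stat:.3f}")
```

Standard tables give approximately 1.699 for the 5% one-tailed point of \(t_{29}\), and approximately 16.047 and 45.722 for the lower and upper 2.5% points of \(\chi^2_{29}\); the computed statistics can be compared against these.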
Question 5
5. The times, \(x\) seconds, taken by the competitors in the 100 m freestyle events at a school swimming gala are recorded. The following statistics are obtained from the data.

| | No. of competitors | Sample mean \(\bar { x }\) | \(\sum x ^ { 2 }\) |
| --- | --- | --- | --- |
| Girls | 8 | 83.10 | 55746 |
| Boys | 7 | 88.90 | 56130 |

Following the gala a proud parent claims that girls are faster swimmers than boys. Assuming that the times taken by the competitors are two independent random samples from normal distributions,
  1. test, at the \(10 \%\) level of significance, whether or not the variances of the two distributions are the same. State your hypotheses clearly.
  2. Stating your hypotheses clearly, test the parent's claim. Use a \(5 \%\) level of significance.
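The two test statistics for this question can be built from the tabulated summaries. This sketch assumes part 1 is a two-tailed F-test with the larger variance estimate in the numerator, and part 2 is a pooled two-sample t-test of \(\mathrm{H}_0: \mu_G = \mu_B\) against \(\mathrm{H}_1: \mu_G < \mu_B\) (faster meaning a shorter time). The helper `unbiased_variance` is a hypothetical name introduced here.

```python
from math import sqrt

def unbiased_variance(n: int, mean: float, sum_x2: float) -> float:
    """Unbiased variance estimate from n, the sample mean and the sum of squares."""
    return (sum_x2 - n * mean**2) / (n - 1)

n_g, mean_g, sum_x2_g = 8, 83.10, 55746   # girls
n_b, mean_b, sum_x2_b = 7, 88.90, 56130   # boys

s2_g = unbiased_variance(n_g, mean_g, sum_x2_g)
s2_b = unbiased_variance(n_b, mean_b, sum_x2_b)

# Part 1: variance ratio with the larger estimate on top.
# Here that is boys over girls, so the degrees of freedom are (6, 7).
f_stat = max(s2_g, s2_b) / min(s2_g, s2_b)

# Part 2: pooled variance estimate and two-sample t statistic on 13 d.f.
sp2 = ((n_g - 1) * s2_g + (n_b - 1) * s2_b) / (n_g + n_b - 2)
t_stat = (mean_g - mean_b) / sqrt(sp2 * (1 / n_g + 1 / n_b))

print(f"s_G^2 = {s2_g:.3f}, s_B^2 = {s2_b:.3f}, F = {f_stat:.3f}")
print(f"pooled s^2 = {sp2:.3f}, t = {t_stat:.3f}")
```

Standard tables give approximately 3.87 for the upper 5% point of \(F_{6,7}\) and approximately 1.771 for the 5% one-tailed point of \(t_{13}\), which are the values these statistics would be compared with under the assumptions above.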
Question 6
6. A nutritionist studied the levels of cholesterol, \(X \mathrm { mg } / \mathrm { cm } ^ { 3 }\), of male students at a large college. She assumed that \(X\) was distributed \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) and examined a random sample of 25 male students. Using this sample she obtained unbiased estimates of \(\mu\) and \(\sigma ^ { 2 }\) as $$\hat { \mu } = 1.68 , \quad \hat { \sigma } ^ { 2 } = 1.79 .$$
  1. Find a \(95 \%\) confidence interval for \(\mu\).
  2. Obtain a \(95 \%\) confidence interval for \(\sigma ^ { 2 }\).
  A cholesterol reading of more than \(2.5 \mathrm { mg } / \mathrm { cm } ^ { 3 }\) is regarded as high.
  3. Use appropriate confidence limits from parts (a) and (b) to find the lowest estimate of the proportion of male students in the college with high cholesterol.
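The interval calculations in this question can be sketched as follows. The critical values are hard-coded from standard tables for 24 degrees of freedom, and the last step assumes that the lowest proportion comes from pairing the lower limit for \(\mu\) with the lower limit for \(\sigma^2\), since 2.5 lies above the mean and a smaller spread then gives a smaller upper tail. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 25
mu_hat = 1.68
var_hat = 1.79

# Table values for 24 degrees of freedom (read from standard tables).
T_24 = 2.064            # t: upper 2.5% point
CHI2_24_LOWER = 12.401  # chi-squared: lower 2.5% point
CHI2_24_UPPER = 39.364  # chi-squared: upper 2.5% point

# Part 1: 95% confidence interval for mu.
half_width = T_24 * sqrt(var_hat / n)
mu_lo, mu_hi = mu_hat - half_width, mu_hat + half_width

# Part 2: 95% confidence interval for sigma^2.
var_lo = (n - 1) * var_hat / CHI2_24_UPPER
var_hi = (n - 1) * var_hat / CHI2_24_LOWER

# Part 3 (assumed approach): smallest P(X > 2.5) uses the lower limits for both.
lowest_prop = 1 - NormalDist(mu_lo, sqrt(var_lo)).cdf(2.5)

print(f"mu: ({mu_lo:.3f}, {mu_hi:.3f})")
print(f"sigma^2: ({var_lo:.3f}, {var_hi:.3f})")
print(f"lowest estimated proportion with high cholesterol = {lowest_prop:.3f}")
```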
Question 7
7. A proportion \(p\) of the items produced by a factory is defective. A quality assurance manager selects a random sample of 5 items from each batch produced to check whether or not there is evidence that \(p\) is greater than 0.10. The criterion that the manager uses for rejecting the hypothesis that \(p\) is 0.10 is that there are more than 2 defective items in the sample.
  1. Find the size of the test.
    (2)
    Table 1 gives some values, to 2 decimal places, of the power function of this test.

    Table 1

    | \(p\) | 0.15 | 0.20 | 0.25 | 0.30 | 0.35 | 0.40 |
    | --- | --- | --- | --- | --- | --- | --- |
    | Power | 0.03 | \(r\) | 0.10 | 0.16 | 0.24 | 0.32 |

  2. Find the value of \(r\).
  One day the manager is away and an assistant checks the production by taking a random sample of 10 items from each batch produced. The hypothesis that \(p = 0.10\) is rejected if more than 4 defectives are found in the sample.
  3. Find P(Type I error) using the assistant's test.
  Table 2 gives some values, to 2 decimal places, of the power function for this test.

    Table 2

    | \(p\) | 0.15 | 0.20 | 0.25 | 0.30 | 0.35 | 0.40 |
    | --- | --- | --- | --- | --- | --- | --- |
    | Power | 0.01 | 0.03 | 0.08 | 0.15 | 0.25 | \(s\) |

  4. Find the value of \(s\).
  5. Using the same axes, draw the graphs of the power functions of these two tests.
    1. State the value of \(p\) where these graphs cross.
    2. Explain the significance if \(p\) is greater than this value.
  The manager studies the graphs in part (e) but decides to carry on using the test based on a sample of size 5.
  6. Suggest 2 reasons why the manager might have made this decision.
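Both power functions in this question are upper-tail binomial probabilities, so they can be evaluated directly. The sketch below assumes the number of defectives in a sample of \(n\) items is \(\mathrm{B}(n, p)\); the function name `power` is a hypothetical helper introduced here.

```python
from math import comb

def power(n: int, c: int, p: float) -> float:
    """P(X > c) for X ~ B(n, p): the probability that the test rejects p = 0.10."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1, n + 1))

# Size of each test: the probability of rejecting when p really is 0.10.
manager_size = power(5, 2, 0.10)     # sample of 5, reject if more than 2 defective
assistant_size = power(10, 4, 0.10)  # sample of 10, reject if more than 4 defective

# Missing table entries, rounded to 2 decimal places like the rest of the tables.
r = round(power(5, 2, 0.20), 2)
s = round(power(10, 4, 0.40), 2)

print(f"size (n = 5)  = {manager_size:.4f}")
print(f"size (n = 10) = {assistant_size:.4f}")
print(f"r = {r}, s = {s}")

# Tabulate both power functions at the values of p used in Tables 1 and 2.
for p in (0.15, 0.20, 0.25, 0.30, 0.35, 0.40):
    print(f"p = {p:.2f}: n = 5 power = {power(5, 2, p):.2f}, n = 10 power = {power(10, 4, p):.2f}")
```

Evaluating `power` on a finer grid of \(p\) values gives the points needed to plot the two curves on the same axes and to locate where they cross.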