Edexcel S4 (Statistics 4) 2003 June

Question 1
  1. A beach is divided into two areas \(A\) and \(B\). A random sample of pebbles is taken from each of the two areas and the length of each pebble is measured. A sample of size 26 is taken from area \(A\) and the unbiased estimate for the population variance is \(s _ { A } ^ { 2 } = 0.495 \mathrm {~mm} ^ { 2 }\). A sample of size 25 is taken from area \(B\) and the unbiased estimate for the population variance is \(s _ { B } ^ { 2 } = 1.04 \mathrm {~mm} ^ { 2 }\).
    1. Stating your hypotheses clearly, test, at the \(10 \%\) significance level, whether or not there is a difference in variability of pebble length between area \(A\) and area \(B\).
    2. State the assumption you have made about the populations of pebble lengths in order to carry out the test.
Question 2
2. A random sample of 10 mustard plants had the following heights, in mm, after 4 days' growth.
$$5.0, 4.5, 4.8, 5.2, 4.3, 5.1, 5.2, 4.9, 5.1, 5.0$$
Those grown previously had a mean height of 5.1 mm after 4 days. Using a \(2.5 \%\) significance level, test whether or not the mean height of these plants is less than that of those grown previously.
(You may assume that the height of mustard plants after 4 days follows a normal distribution.)
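As a rough numerical check for the mustard plant question, the one-sample \(t\) statistic can be sketched as follows. The variable names are illustrative only, and the critical value would still need to be read from \(t\) tables with 9 degrees of freedom.

```python
import math
import statistics

# Heights (mm) of the 10 mustard plants, copied from the question.
heights = [5.0, 4.5, 4.8, 5.2, 4.3, 5.1, 5.2, 4.9, 5.1, 5.0]
mu_0 = 5.1  # mean height of the plants grown previously (value under H0)

n = len(heights)
mean = statistics.mean(heights)
s = statistics.stdev(heights)  # square root of the unbiased variance estimate

# One-sample t statistic on n - 1 = 9 degrees of freedom.
t_stat = (mean - mu_0) / (s / math.sqrt(n))

print(f"mean = {mean:.3f}, s = {s:.4f}, t = {t_stat:.3f}")
```

Because the alternative hypothesis is one-sided (mean less than 5.1), the statistic is compared with the lower-tail \(2.5 \%\) point only.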
Question 3
3. A train company claims that the probability \(p\) of one of its trains arriving late is \(10 \%\). A regular traveller on the company's trains believes that the probability is greater than \(10 \%\) and decides to test this by randomly selecting 12 trains and recording the number \(X\) of trains that were late. The traveller sets up the hypotheses \(\mathrm { H } _ { 0 } : p = 0.1\) and \(\mathrm { H } _ { 1 } : p > 0.1\) and accepts the null hypothesis if \(x \leq 2\).
  1. Find the size of the test.
  2. Show that the power function of the test is $$1 - ( 1 - p ) ^ { 10 } \left( 1 + 10 p + 55 p ^ { 2 } \right) .$$
  3. Calculate the power of the test when
    1. \(p = 0.2\),
    2. \(p = 0.6\).
  4. Comment on your results from part 3.
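The power function quoted in the question can be checked numerically. The sketch below assumes \(X \sim \mathrm{B}(12, p)\) with rejection when \(x \geq 3\), and compares the direct binomial sum with the closed form. The function names are hypothetical.

```python
from math import comb

N_TRAINS = 12    # number of trains sampled
ACCEPT_MAX = 2   # H0 is accepted when x <= 2


def power_exact(p: float) -> float:
    """P(X >= 3) for X ~ B(12, p), summed directly from the binomial pmf."""
    p_accept = sum(
        comb(N_TRAINS, k) * p**k * (1 - p) ** (N_TRAINS - k)
        for k in range(ACCEPT_MAX + 1)
    )
    return 1 - p_accept


def power_closed_form(p: float) -> float:
    """The expression for the power function given in the question."""
    return 1 - (1 - p) ** 10 * (1 + 10 * p + 55 * p**2)


# p = 0.1 gives the size of the test; p = 0.2 and p = 0.6 give the requested powers.
for p in (0.1, 0.2, 0.6):
    print(f"p = {p}: exact = {power_exact(p):.4f}, closed form = {power_closed_form(p):.4f}")
```

Evaluating the function at \(p = 0.1\) gives the size of the test, since the size is the power under the null hypothesis.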
Question 4
4. A random sample of 15 tomatoes is taken and the weight \(x\) grams of each tomato is found. The results are summarised by \(\sum x = 208\) and \(\sum x ^ { 2 } = 2962\).
  1. Assuming that the weights of the tomatoes are normally distributed, calculate the \(90 \%\) confidence interval for the variance \(\sigma ^ { 2 }\) of the weights of the tomatoes.
  2. State with a reason whether or not the confidence interval supports the assertion \(\sigma ^ { 2 } = 3\).
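A minimal sketch of the interval calculation for the tomato weights, using \(\frac{(n-1)s^2}{\chi^2}\) at each end. The two chi-squared percentage points are hard-coded from standard tables for 14 degrees of freedom and are an assumption of this sketch, not something stated in the question.

```python
n = 15
sum_x = 208
sum_x2 = 2962

# Unbiased estimate of the variance from the summary statistics.
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)

# Assumed table values: chi-squared percentage points on n - 1 = 14 degrees of freedom.
CHI2_UPPER_5 = 23.685  # P(chi2_14 > 23.685) = 0.05
CHI2_LOWER_5 = 6.571   # P(chi2_14 > 6.571) = 0.95

# 90% confidence limits for sigma^2.
lower = (n - 1) * s2 / CHI2_UPPER_5
upper = (n - 1) * s2 / CHI2_LOWER_5

print(f"s^2 = {s2:.4f}, 90% CI for variance = ({lower:.3f}, {upper:.3f})")
```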
Question 5
5. (a) Define
  1. a Type I error,
  2. a Type II error.

A small aviary, which leaves the eggs with the parent birds, rears chicks at an average rate of 5 per year. In order to increase the number of chicks reared per year it is decided to remove the eggs from the aviary as soon as they are laid and put them in an incubator. At the end of the first year of using an incubator 7 chicks had been successfully reared.

(b) Assuming that the number of chicks reared per year follows a Poisson distribution, test, at the \(5 \%\) significance level, whether or not there is evidence of an increase in the number of chicks reared per year. State your hypotheses clearly.
(c) Calculate the probability of the Type I error for this test.
(d) Given that the true average number of chicks reared per year when the eggs are hatched in an incubator is 8, calculate the probability of a Type II error.
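The Poisson probabilities needed for the aviary question can be sketched with the standard library alone. This assumes the critical region is the largest upper tail \(X \geq c\) whose probability under \(\lambda = 5\) does not exceed \(5 \%\). The helper name `poisson_cdf` is illustrative.

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Po(lam); returns 0 when k < 0."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


LAM_0 = 5     # mean number of chicks per year under H0
LAM_1 = 8     # true mean assumed in part (d)
ALPHA = 0.05  # nominal significance level

# Tail probability for the observed value: P(X >= 7 | lambda = 5).
p_value = 1 - poisson_cdf(6, LAM_0)

# Smallest c with P(X >= c | lambda = 5) <= 5%, so the critical region is X >= c.
c = 0
while 1 - poisson_cdf(c - 1, LAM_0) > ALPHA:
    c += 1

type_1 = 1 - poisson_cdf(c - 1, LAM_0)  # P(reject H0 | lambda = 5)
type_2 = poisson_cdf(c - 1, LAM_1)      # P(do not reject H0 | lambda = 8)

print(f"P(X >= 7) = {p_value:.4f}, critical region X >= {c}")
print(f"P(Type I) = {type_1:.4f}, P(Type II) = {type_2:.4f}")
```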
Question 6
6. A random sample of three independent variables \(X _ { 1 } , X _ { 2 }\) and \(X _ { 3 }\) is taken from a distribution with mean \(\mu\) and variance \(\sigma ^ { 2 }\).
  1. Show that \(\frac { 2 } { 3 } X _ { 1 } - \frac { 1 } { 2 } X _ { 2 } + \frac { 5 } { 6 } X _ { 3 }\) is an unbiased estimator for \(\mu\).

An unbiased estimator for \(\mu\) is given by \(\hat { \mu } = a X _ { 1 } + b X _ { 2 }\) where \(a\) and \(b\) are constants.

  2. Show that \(\operatorname { Var } ( \hat { \mu } ) = \left( 2 a ^ { 2 } - 2 a + 1 \right) \sigma ^ { 2 }\).
  3. Hence determine the value of \(a\) and the value of \(b\) for which \(\hat { \mu }\) has minimum variance.
Question 7
7. Two methods of extracting juice from an orange are to be compared. Eight oranges are halved. One half of each orange is chosen at random and allocated to Method \(A\) and the other half is allocated to Method \(B\). The amounts of juice extracted, in ml, are given in the table.

| Orange | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Method \(A\) | 29 | 30 | 26 | 25 | 26 | 22 | 23 | 28 |
| Method \(B\) | 27 | 25 | 28 | 24 | 23 | 26 | 22 | 25 |

One statistician suggests performing a two-sample \(t\)-test to investigate whether or not there is a difference between the mean amounts of juice extracted by the two methods.
  1. Stating your hypotheses clearly and using a \(5 \%\) significance level, carry out this test.
    (You may assume \(\bar { x } _ { A } = 26.125 , s _ { A } ^ { 2 } = 7.84 , \bar { x } _ { B } = 25 , s _ { B } ^ { 2 } = 4\) and \(\sigma _ { A } ^ { 2 } = \sigma _ { B } ^ { 2 }\).)

Another statistician suggests analysing these data using a paired \(t\)-test.

  2. Using a \(5 \%\) significance level, carry out this test.
  3. State which of these two tests you consider to be more appropriate. Give a reason for your choice.
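The two test statistics for the orange juice data can be computed side by side as a rough check. This sketch works from the raw table values rather than the rounded summary figures, so the pooled result may differ slightly in the third decimal place. Critical values would still come from \(t\) tables (14 degrees of freedom for the two-sample test, 7 for the paired test).

```python
import math
import statistics

# Juice extracted (ml) from oranges 1 to 8, copied from the table.
method_a = [29, 30, 26, 25, 26, 22, 23, 28]
method_b = [27, 25, 28, 24, 23, 26, 22, 25]
n = len(method_a)

# Two-sample t statistic with a pooled variance estimate (equal variances assumed).
var_a = statistics.variance(method_a)
var_b = statistics.variance(method_b)
pooled_var = ((n - 1) * var_a + (n - 1) * var_b) / (2 * n - 2)
mean_diff = statistics.mean(method_a) - statistics.mean(method_b)
t_two_sample = mean_diff / math.sqrt(pooled_var * (1 / n + 1 / n))

# Paired t statistic on the within-orange differences A - B.
diffs = [a - b for a, b in zip(method_a, method_b)]
t_paired = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

print(f"two-sample t = {t_two_sample:.3f} on {2 * n - 2} df")
print(f"paired t     = {t_paired:.3f} on {n - 1} df")
```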