Edexcel S4 (Statistics 4) 2006 January

Question 1
1. A diabetic patient records her blood glucose readings in \(\mathrm { mmol } / \mathrm { l }\) at random times of day over several days. Her readings are given below.
$$\begin{array} { l l l l l l l } 5.3 & 5.7 & 8.4 & 8.7 & 6.3 & 8.0 & 7.2 \end{array}$$
Assuming that the blood glucose readings are normally distributed, calculate
  1. an unbiased estimate for the variance \(\sigma ^ { 2 }\) of the blood glucose readings,
  2. a \(90 \%\) confidence interval for the variance \(\sigma ^ { 2 }\) of blood glucose readings.
  3. State whether or not the confidence interval supports the assertion that \(\sigma = 0.9\). Give a reason for your answer.
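The arithmetic behind Question 1 can be sketched in Python as follows. This is an illustrative check, not a mark scheme: the two \(\chi^2\) percentage points for 6 degrees of freedom are assumed to be the usual table values, and the variable names are chosen here for illustration.

```python
import statistics

readings = [5.3, 5.7, 8.4, 8.7, 6.3, 8.0, 7.2]
n = len(readings)

# statistics.variance divides by n - 1, giving the unbiased estimate s^2.
s2 = statistics.variance(readings)

# Assumed table values for chi-squared with n - 1 = 6 degrees of freedom.
CHI2_UPPER_5 = 12.592  # exceeded with probability 0.05
CHI2_LOWER_5 = 1.635   # exceeded with probability 0.95

# 90% confidence interval for sigma^2: (n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower.
ci_low = (n - 1) * s2 / CHI2_UPPER_5
ci_high = (n - 1) * s2 / CHI2_LOWER_5

# The assertion sigma = 0.9 corresponds to sigma^2 = 0.81.
claimed_variance = 0.9 ** 2
supported = ci_low <= claimed_variance <= ci_high

print(f"s^2 = {s2:.3f}, 90% CI = ({ci_low:.3f}, {ci_high:.3f}), supported = {supported}")
```

Under these assumptions the estimate comes out near 1.82 and the interval runs from roughly 0.87 to 6.67, so \(\sigma^2 = 0.81\) falls just below it.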
Question 2
2. (a) Define
    (i) a Type I error,
    (ii) a Type II error.

A manufacturer sells socks in boxes of 50. The mean number of faulty socks per box is 7.5. In order to reduce the number of faulty socks a new machine is tried. A box of socks made on the new machine was tested and the number of faulty socks was 2.

(b) (i) Assuming that the number of faulty socks per box follows a binomial distribution, derive a critical region needed to test whether or not there is evidence that the new machine has reduced the mean number of faulty socks per box. Use a \(5 \%\) significance level.
    (ii) Stating your hypotheses clearly, carry out the test in part (i).
(c) Find the probability of the Type I error for this test.
(d) Given that the true mean number of faulty socks per box on the new machine is 5, calculate the probability of a Type II error for this test.
(e) Explain what would have been the effect of changing the significance level for the test in part (b) to \(2 \frac { 1 } { 2 } \%\).
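A minimal sketch of the binomial calculations in Question 2, assuming \(X \sim \mathrm{B}(50, p)\) with \(p = 7.5/50 = 0.15\) under \(\mathrm{H}_0\) and a one-tailed lower critical region. The helper names `binom_cdf` and `critical_value` are hypothetical, introduced only for this illustration.

```python
from math import comb

N = 50          # socks per box
P0 = 7.5 / N    # proportion faulty under H0 (0.15)
P1 = 5 / N      # proportion faulty if the true mean is 5 (0.10)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed directly from the pmf."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def critical_value(alpha):
    """Largest c with P(X <= c | p = P0) <= alpha; the critical region is X <= c."""
    return max(k for k in range(N + 1) if binom_cdf(k, N, P0) <= alpha)


c = critical_value(0.05)
type_1 = binom_cdf(c, N, P0)        # P(reject H0 | H0 true)
type_2 = 1 - binom_cdf(c, N, P1)    # P(do not reject H0 | p = P1)
observed_in_region = 2 <= c

print(c, round(type_1, 4), round(type_2, 4), observed_in_region)
print(critical_value(0.025))        # critical value at the 2.5% level
```

Calling `critical_value(0.025)` shows how the region shrinks when the significance level is halved, which is the comparison part (e) asks about.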
Question 3
3. A population has mean \(\mu\) and variance \(\sigma ^ { 2 }\). A random sample of size 3 is to be taken from this population and \(\bar { X }\) denotes its sample mean. A second random sample of size 4 is to be taken from this population and \(\bar { Y }\) denotes its sample mean.
  1. Show that unbiased estimators for \(\mu\) are given by
    1. \(\hat { \mu } _ { 1 } = \frac { 1 } { 3 } \bar { X } + \frac { 2 } { 3 } \bar { Y }\),
    2. \(\hat { \mu } _ { 2 } = \frac { 5 \bar { X } + 4 \bar { Y } } { 9 }\).
  2. Calculate \(\operatorname { Var } \left( \hat { \mu } _ { 1 } \right)\).
  3. Given that \(\operatorname { Var } \left( \hat { \mu } _ { 2 } \right) = \frac { 37 } { 243 } \sigma ^ { 2 }\), state, giving a reason, which of these two estimators should be used.
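The variance comparison in Question 3 can be checked with exact fractions. The sketch below assumes the two samples are independent, so that \(\operatorname{Var}(a\bar{X} + b\bar{Y}) = a^2 \sigma^2/3 + b^2 \sigma^2/4\); the function name `var_coeff` is hypothetical.

```python
from fractions import Fraction as F


def var_coeff(a, b, n_x=3, n_y=4):
    """Coefficient of sigma^2 in Var(a*Xbar + b*Ybar), assuming independent samples."""
    return a**2 * F(1, n_x) + b**2 * F(1, n_y)


# Weights (a, b) for each estimator; a + b == 1 is what makes it unbiased for mu.
mu1 = (F(1, 3), F(2, 3))
mu2 = (F(5, 9), F(4, 9))

unbiased = [sum(w) == 1 for w in (mu1, mu2)]
var_mu1 = var_coeff(*mu1)
var_mu2 = var_coeff(*mu2)

print(unbiased, var_mu1, var_mu2, var_mu1 < var_mu2)
```

Both estimators are unbiased, so the comparison comes down to which coefficient of \(\sigma^2\) is smaller.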
Question 4
4. The number of accidents that occur at a crossroads has a mean of 3 per month. In order to improve the flow of traffic the priority given to traffic is changed. Colin believes that since the change in priority the number of accidents has increased. He tests his belief by recording the number of accidents \(x\) in the month following the change. Colin sets up the hypotheses \(\mathrm { H } _ { 0 } : \lambda = 3\) and \(\mathrm { H } _ { 1 } : \lambda > 3\), where \(\lambda\) is the mean number of accidents per month, and rejects the null hypothesis if \(x > 4\).
  1. Find the size of the test.

The table gives the values of the power function of the test to two decimal places.

| \(\lambda\) | 4 | 5 | 6 | 7 |
|---|---|---|---|---|
| Power | \(r\) | 0.56 | \(s\) | 0.83 |

  2. Calculate the value of \(r\) and the value of \(s\).
  3. Comment on the suitability of the test when \(\lambda = 4\).
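As a rough check on Question 4, the size and power function of Colin's test can be computed directly, assuming the monthly accident count is Poisson with mean \(\lambda\) and \(\mathrm{H}_0\) is rejected when \(x > 4\). The function names are hypothetical.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def power(lam, cutoff=4):
    """P(reject H0) = P(X > cutoff) when the true mean is lam."""
    return 1 - poisson_cdf(cutoff, lam)


size = power(3)  # probability of rejecting H0 when lambda = 3
table = {lam: round(power(lam), 2) for lam in (4, 5, 6, 7)}

print(round(size, 4), table)
```

The entries for \(\lambda = 5\) and \(\lambda = 7\) reproduce the 0.56 and 0.83 given in the table, which suggests the entries for \(\lambda = 4\) and \(\lambda = 6\) are the values of \(r\) and \(s\).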
Question 5
5. Seven pipes of equal length are selected at random. Each pipe is cut in half. One piece of each pipe is coated with protective paint and the other is left uncoated. All of the pieces of pipe are buried to the same depth in various soils for 6 months. The table gives the percentage area of the pieces of pipe in the various soils that are subject to corrosion.
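
| Soil | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| % Corrosion, coated pipe | 39 | 40 | 43 | 32 | 42 | 33 | 36 |
| % Corrosion, uncoated pipe | 41 | 36 | 61 | 48 | 42 | 48 | 45 |
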
  1. Stating your hypotheses clearly and using a \(5 \%\) significance level, carry out a paired \(t\)-test to assess whether or not there is a difference between the mean percentage of corrosion on the coated pipes and the mean percentage of corrosion on the uncoated pipes.
    1. State an assumption that has been made in order to carry out this test.
    2. Comment on the validity of this assumption.
  2. State what difference would be made to the conclusion in part (a) if the test had been to determine whether or not the percentage of corrosion on the uncoated pipes was higher than the mean percentage of corrosion on the coated pipes. Justify your answer.
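The paired \(t\) statistic for Question 5 can be sketched as below. The differences are taken as uncoated minus coated, and the two critical values for 6 degrees of freedom are assumed from standard \(t\) tables; neither choice comes from the question itself.

```python
import math
import statistics

coated = [39, 40, 43, 32, 42, 33, 36]
uncoated = [41, 36, 61, 48, 42, 48, 45]

# Paired differences, one per soil (uncoated minus coated).
diffs = [u - c for u, c in zip(uncoated, coated)]
n = len(diffs)

d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)          # sample standard deviation (n - 1 divisor)
t_stat = d_bar / (s_d / math.sqrt(n))  # compared with t on n - 1 = 6 d.f.

# Assumed table values for t with 6 degrees of freedom.
T_TWO_TAILED_5 = 2.447   # 2.5% in each tail
T_ONE_TAILED_5 = 1.943   # 5% in the upper tail

reject_two_tailed = abs(t_stat) > T_TWO_TAILED_5
reject_one_tailed = t_stat > T_ONE_TAILED_5

print(round(t_stat, 3), reject_two_tailed, reject_one_tailed)
```

With these table values the statistic sits between the one-tailed and two-tailed critical values, so the two forms of the test lead to different conclusions.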
Question 6
6. A tree is cut down and sawn into pieces. Half of the pieces are stored outside and half of the pieces are stored inside. After a year, a random sample of pieces is taken from each location and the hardness is measured. The hardness \(x\) units are summarised in the following table.

| | Number of pieces sampled | \(\Sigma x\) | \(\Sigma x ^ { 2 }\) |
|---|---|---|---|
| Stored outside | 20 | 2340 | 274050 |
| Stored inside | 37 | 4884 | 645282 |

  1. Show that unbiased estimates for the variance of the values of hardness for wood stored outside and for the wood stored inside are 14.2 and 16.5, to 1 decimal place, respectively.
    (2)

The hardness of wood stored outside and the hardness of wood stored inside can be assumed to be normally distributed with equal variances.

  2. Calculate \(95 \%\) confidence limits for the difference in mean hardness between the wood that was stored outside and the wood that was stored inside.
    (8)
  3. Using your answer to part (b), comment on the means of the hardness of wood stored outside and inside. Give a reason for your answer.
    (2)
    (Total 12 marks)
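A sketch of the Question 6 calculations from the summary statistics. The difference is taken as inside minus outside so that it is positive, and the critical value 2.004 for \(t\) with 55 degrees of freedom is an assumption, since printed tables often omit that row and require interpolation. The helper `summarise` is hypothetical.

```python
import math


def summarise(n, sum_x, sum_x2):
    """Return (mean, Sxx, unbiased variance estimate) from n, sum x and sum x^2."""
    mean = sum_x / n
    sxx = sum_x2 - sum_x**2 / n
    return mean, sxx, sxx / (n - 1)


n_out, n_in = 20, 37
mean_out, sxx_out, var_out = summarise(n_out, 2340, 274050)
mean_in, sxx_in, var_in = summarise(n_in, 4884, 645282)

# Pooled variance estimate under the equal-variance assumption.
pooled = (sxx_out + sxx_in) / (n_out + n_in - 2)
std_err = math.sqrt(pooled * (1 / n_out + 1 / n_in))

T_CRIT = 2.004  # assumed 2.5% point of t with 55 degrees of freedom

diff = mean_in - mean_out
lower, upper = diff - T_CRIT * std_err, diff + T_CRIT * std_err

print(round(var_out, 1), round(var_in, 1), round(lower, 2), round(upper, 2))
```

If zero lies outside the resulting limits, that points to a difference between the two mean hardness values.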
Question 7
7. A psychologist gives a test to students from two different schools, \(A\) and \(B\). A group of 9 students is randomly selected from school \(A\) and given instructions on how to do the test.
A group of 7 students is randomly selected from school \(B\) and given the test without the instructions. The table shows the time taken, to the nearest second, to complete the test by the two groups.

| School | Times (seconds) |
|---|---|
| \(A\) | 11, 12, 12, 13, 14, 15, 16, 17, 17 |
| \(B\) | 8, 10, 11, 13, 13, 14, 14 |

Stating your hypotheses clearly,
  1. test at the \(10 \%\) significance level, whether or not the variance of the times taken to complete the test by students from school \(A\) is the same as the variance of the times taken to complete the test by students from school \(B\). (You may assume that times taken for each school are normally distributed.)
  2. test at the \(5 \%\) significance level, whether or not the mean time taken to complete the test by students from school \(A\) is greater than the mean time taken to complete the test by students from school \(B\).
  3. Why does the result to part (a) enable you to carry out the test in part (b)?
  4. Give one factor that has not been taken into account in your analysis.
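The two test statistics in Question 7 can be sketched as follows. The variance ratio is formed with the larger estimate on top, and the critical values (the upper 5% point of \(F_{6,8}\) and the upper 5% point of \(t\) with 14 degrees of freedom) are assumed from standard tables rather than taken from the question.

```python
import math
import statistics

a = [11, 12, 12, 13, 14, 15, 16, 17, 17]  # school A, with instructions
b = [8, 10, 11, 13, 13, 14, 14]           # school B, without instructions

var_a = statistics.variance(a)  # unbiased estimates (n - 1 divisor)
var_b = statistics.variance(b)

# Variance-ratio statistic: larger over smaller. Here var_b is larger,
# so the reference distribution is F with (6, 8) degrees of freedom.
f_stat = max(var_a, var_b) / min(var_a, var_b)
F_CRIT = 3.58  # assumed upper 5% point of F(6, 8), for a 10% two-tailed test

# Pooled two-sample t statistic for H1: mean of A > mean of B.
df = len(a) + len(b) - 2
pooled = ((len(a) - 1) * var_a + (len(b) - 1) * var_b) / df
std_err = math.sqrt(pooled * (1 / len(a) + 1 / len(b)))
t_stat = (statistics.mean(a) - statistics.mean(b)) / std_err
T_CRIT = 1.761  # assumed upper 5% point of t with 14 degrees of freedom

print(round(f_stat, 3), f_stat > F_CRIT, round(t_stat, 3), t_stat > T_CRIT)
```

Pooling the variances in the second step is only reasonable when the first test gives no evidence that the variances differ.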