Edexcel S4 (Statistics 4)

Question 1
1. A beach is divided into two areas \(A\) and \(B\). A random sample of pebbles is taken from each of the two areas and the length of each pebble is measured. A sample of size 26 is taken from area \(A\) and the unbiased estimate for the population variance is \(s _ { A } ^ { 2 } = 0.495 \mathrm {~mm} ^ { 2 }\). A sample of size 25 is taken from area \(B\) and the unbiased estimate for the population variance is \(s _ { B } ^ { 2 } = 1.04 \mathrm {~mm} ^ { 2 }\).
  1. Stating your hypotheses clearly, test, at the \(10 \%\) significance level, whether or not there is a difference in variability of pebble length between area \(A\) and area \(B\).
  2. State the assumption you have made about the populations of pebble lengths in order to carry out the test. (1)
2. A random sample of 10 mustard plants had the following heights, in mm, after 4 days' growth.
$$5.0, 4.5, 4.8, 5.2, 4.3, 5.1, 5.2, 4.9, 5.1, 5.0$$
Those grown previously had a mean height of 5.1 mm after 4 days. Using a \(2.5 \%\) significance level, test whether or not the mean height of these plants is less than that of those grown previously.
(You may assume that the height of mustard plants after 4 days follows a normal distribution.)
(9)
3. A train company claims that the probability \(p\) of one of its trains arriving late is \(10 \%\). A regular traveller on the company's trains believes that the probability is greater than \(10 \%\) and decides to test this by randomly selecting 12 trains and recording the number \(X\) of trains that were late. The traveller sets up the hypotheses \(\mathrm { H } _ { 0 } : p = 0.1\) and \(\mathrm { H } _ { 1 } : p > 0.1\) and accepts the null hypothesis if \(x \leq 2\).
  1. Find the size of the test.
  2. Show that the power function of the test is $$1 - ( 1 - p ) ^ { 10 } \left( 1 + 10 p + 55 p ^ { 2 } \right)$$
  3. Calculate the power of the test when
    1. \(p = 0.2\),
    2. \(p = 0.6\).
  4. Comment on your results from part (c).
4. A random sample of 15 tomatoes is taken and the weight \(x\) grams of each tomato is found. The results are summarised by \(\sum x = 208\) and \(\sum x ^ { 2 } = 2962\).
  1. Assuming that the weights of the tomatoes are normally distributed, calculate the \(90 \%\) confidence interval for the variance \(\sigma ^ { 2 }\) of the weights of the tomatoes.
  2. State, with a reason, whether or not the confidence interval supports the assertion \(\sigma ^ { 2 } = 3\).
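The interval in the tomato question rests on \((n-1)s^2/\sigma^2 \sim \chi^2_{n-1}\). The following is a minimal Python sketch of that arithmetic, not a model answer. The two \(\chi^2_{14}\) percentage points are assumptions copied from standard statistical tables, and the variable names are illustrative.

```python
# Sketch: 90% confidence interval for a normal variance from summary statistics.
N = 15
SUM_X = 208
SUM_X2 = 2962

# Assumed table values for chi-squared with 14 degrees of freedom.
CHI2_14_UPPER_5 = 23.685   # P(chi2 > 23.685) = 0.05
CHI2_14_LOWER_5 = 6.571    # P(chi2 > 6.571) = 0.95

# Corrected sum of squares and unbiased variance estimate.
sxx = SUM_X2 - SUM_X**2 / N
s2 = sxx / (N - 1)

# (n-1)s^2 / chi2_upper < sigma^2 < (n-1)s^2 / chi2_lower
ci = (sxx / CHI2_14_UPPER_5, sxx / CHI2_14_LOWER_5)

print(f"s^2 = {s2:.3f}")
print(f"90% CI for sigma^2: ({ci[0]:.3f}, {ci[1]:.3f})")
print("sigma^2 = 3 inside interval:", ci[0] < 3 < ci[1])
```

Under these assumed table values the lower limit is a little above 3, so the value \(\sigma^2 = 3\) falls just outside the interval.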
5. (a) Define
  1. a Type I error,
  2. a Type II error.

A small aviary, that leaves the eggs with the parent birds, rears chicks at an average rate of 5 per year. In order to increase the number of chicks reared per year it is decided to remove the eggs from the aviary as soon as they are laid and put them in an incubator. At the end of the first year of using an incubator 7 chicks had been successfully reared.

(b) Assuming that the number of chicks reared per year follows a Poisson distribution, test, at the \(5 \%\) significance level, whether or not there is evidence of an increase in the number of chicks reared per year. State your hypotheses clearly.
(c) Calculate the probability of the Type I error for this test.
(d) Given that the true average number of chicks reared per year when the eggs are hatched in an incubator is 8, calculate the probability of a Type II error.
6. A random sample of three independent variables \(X _ { 1 } , X _ { 2 }\) and \(X _ { 3 }\) is taken from a distribution with mean \(\mu\) and variance \(\sigma ^ { 2 }\).
  1. Show that \(\frac { 2 } { 3 } X _ { 1 } - \frac { 1 } { 2 } X _ { 2 } + \frac { 5 } { 6 } X _ { 3 }\) is an unbiased estimator for \(\mu\).

An unbiased estimator for \(\mu\) is given by \(\hat { \mu } = a X _ { 1 } + b X _ { 2 }\) where \(a\) and \(b\) are constants.

  2. Show that \(\operatorname { Var } ( \hat { \mu } ) = \left( 2 a ^ { 2 } - 2 a + 1 \right) \sigma ^ { 2 }\).
  3. Hence determine the value of \(a\) and the value of \(b\) for which \(\hat { \mu }\) has minimum variance.
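The estimator question above turns on two facts: unbiasedness fixes \(b\) in terms of \(a\), and independence lets the variances add. A short worked sketch of that chain follows, as an outline of the reasoning rather than a full model answer:

$$\begin{aligned} \mathrm { E } ( \hat { \mu } ) & = ( a + b ) \mu = \mu \;\Rightarrow\; b = 1 - a , \\ \operatorname { Var } ( \hat { \mu } ) & = a ^ { 2 } \sigma ^ { 2 } + ( 1 - a ) ^ { 2 } \sigma ^ { 2 } = \left( 2 a ^ { 2 } - 2 a + 1 \right) \sigma ^ { 2 } , \\ 2 a ^ { 2 } - 2 a + 1 & = 2 \left( a - \tfrac { 1 } { 2 } \right) ^ { 2 } + \tfrac { 1 } { 2 } . \end{aligned}$$

Completing the square shows the variance is smallest at \(a = \frac { 1 } { 2 }\), which gives \(b = \frac { 1 } { 2 }\) and \(\operatorname { Var } ( \hat { \mu } ) = \frac { 1 } { 2 } \sigma ^ { 2 }\).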
7. Two methods of extracting juice from an orange are to be compared. Eight oranges are halved. One half of each orange is chosen at random and allocated to Method \(A\) and the other half is allocated to Method \(B\). The amounts of juice extracted, in ml, are given in the table.

\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Orange & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
Method \(A\) & 29 & 30 & 26 & 25 & 26 & 22 & 23 & 28 \\
\hline
Method \(B\) & 27 & 25 & 28 & 24 & 23 & 26 & 22 & 25 \\
\hline
\end{tabular}

One statistician suggests performing a two-sample \(t\)-test to investigate whether or not there is a difference between the mean amounts of juice extracted by the two methods.
  1. Stating your hypotheses clearly and using a \(5 \%\) significance level, carry out this test.
    (You may assume \(\bar { x } _ { A } = 26.125 , s _ { A } ^ { 2 } = 7.84 , \bar { x } _ { B } = 25 , s _ { B } ^ { 2 } = 4\) and \(\sigma _ { A } ^ { 2 } = \sigma _ { B } ^ { 2 }\).)

Another statistician suggests analysing these data using a paired \(t\)-test.

  2. Using a \(5 \%\) significance level, carry out this test.
  3. State which of these two tests you consider to be more appropriate. Give a reason for your choice.
(1)

\section*{END}

\section*{Advanced/Advanced Subsidiary}

Wednesday 16 June 2004 - Afternoon

Time: 1 hour 30 minutes

Materials required for examination: Answer Book (AB16), Graph Paper (ASG2), Mathematical Formulae (Lilac)

Items included with question papers: Nil

Candidates may use any calculator EXCEPT those with the facility for symbolic algebra, differentiation and/or integration. Thus candidates may NOT use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G. In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S4), the paper reference (6686), your surname, other name and signature.

Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.

Full marks may be obtained for answers to ALL questions.

This paper has seven questions. You must ensure that your answers to parts of questions are clearly labelled.

You must show sufficient working to make your methods clear to the Examiner. Answers without working may gain no credit.
1. The random variable \(X\) has an \(F\)-distribution with 8 and 12 degrees of freedom. Find \(\mathrm { P } \left( \frac { 1 } { 5.67 } < X < 2.85 \right)\).
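The lower limit \(\frac { 1 } { 5.67 }\) points to the reciprocal property of the \(F\)-distribution: if \(X \sim F _ { 8 , 12 }\) then \(\frac { 1 } { X } \sim F _ { 12 , 8 }\). A sketch of how the probability splits is given below. It assumes, from standard tables, that 2.85 is the upper \(5 \%\) point of \(F _ { 8 , 12 }\) and 5.67 is the upper \(1 \%\) point of \(F _ { 12 , 8 }\).

$$\begin{aligned} \mathrm { P } ( X < 2.85 ) & = 0.95 , \\ \mathrm { P } \left( X < \tfrac { 1 } { 5.67 } \right) & = \mathrm { P } \left( \tfrac { 1 } { X } > 5.67 \right) = 0.01 , \\ \mathrm { P } \left( \tfrac { 1 } { 5.67 } < X < 2.85 \right) & = 0.95 - 0.01 = 0.94 . \end{aligned}$$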
Question 3
3. A train company claims that the probability \(p\) of one of its trains arriving late is \(10 \%\). A regular traveller on the company's trains believes that the probability is greater than \(10 \%\) and decides to test this by randomly selecting 12 trains and recording the number \(X\) of trains that were late. The traveller sets up the hypotheses \(\mathrm { H } _ { 0 } : p = 0.1\) and \(\mathrm { H } _ { 1 } : p > 0.1\) and accepts the null hypothesis if \(x \leq 2\).
  1. Find the size of the test.
  2. Show that the power function of the test is $$1 - ( 1 - p ) ^ { 10 } \left( 1 + 10 p + 55 p ^ { 2 } \right)$$
  3. Calculate the power of the test when
    1. \(p = 0.2\),
    2. \(p = 0.6\).
  4. Comment on your results from part (c).
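The closed form in part 2 can be checked numerically against a direct binomial sum. The sketch below is illustrative only: the function names are hypothetical, and it assumes \(X \sim \mathrm { B } ( 12 , p )\) with \(\mathrm { H } _ { 0 }\) rejected when \(x \geq 3\), as the question describes.

```python
from math import comb

N_TRAINS = 12     # number of trains sampled
ACCEPT_MAX = 2    # H0 is accepted when x <= 2


def power_exact(p: float) -> float:
    """P(X >= 3) for X ~ B(12, p), i.e. the probability of rejecting H0."""
    p_accept = sum(
        comb(N_TRAINS, k) * p**k * (1 - p) ** (N_TRAINS - k)
        for k in range(ACCEPT_MAX + 1)
    )
    return 1 - p_accept


def power_formula(p: float) -> float:
    """Closed form quoted in the question."""
    return 1 - (1 - p) ** 10 * (1 + 10 * p + 55 * p**2)


# p = 0.1 gives the size of the test; 0.2 and 0.6 are the values in part 3.
for p in (0.1, 0.2, 0.6):
    print(f"p = {p}: exact = {power_exact(p):.4f}, formula = {power_formula(p):.4f}")
```

The two columns agree, giving a size of about 0.111 at \(p = 0.1\) and powers of about 0.442 and 0.997 at \(p = 0.2\) and \(p = 0.6\).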
Question 5
5. (a) Define
  1. a Type I error,
  2. a Type II error.

A small aviary, that leaves the eggs with the parent birds, rears chicks at an average rate of 5 per year. In order to increase the number of chicks reared per year it is decided to remove the eggs from the aviary as soon as they are laid and put them in an incubator. At the end of the first year of using an incubator 7 chicks had been successfully reared.

(b) Assuming that the number of chicks reared per year follows a Poisson distribution, test, at the \(5 \%\) significance level, whether or not there is evidence of an increase in the number of chicks reared per year. State your hypotheses clearly.
(c) Calculate the probability of the Type I error for this test.
(d) Given that the true average number of chicks reared per year when the eggs are hatched in an incubator is 8, calculate the probability of a Type II error.
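Parts (b) to (d) all come from Poisson tail probabilities, so they can be sketched in a few lines of Python. This is a hedged illustration, not the mark scheme. It assumes a one-tailed test of \(\mathrm { H } _ { 0 } : \lambda = 5\) against \(\mathrm { H } _ { 1 } : \lambda > 5\), with the critical region taken as the smallest upper tail whose probability does not exceed \(5 \%\). The helper `poisson_cdf` is a hypothetical name.

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Po(lam); returns 0 when k < 0."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


LAM_H0 = 5      # rate under H0
LAM_TRUE = 8    # rate given in part (d)
OBSERVED = 7    # chicks reared in the first incubator year
ALPHA = 0.05

# Part (b): probability of 7 or more chicks if the rate is still 5.
p_value = 1 - poisson_cdf(OBSERVED - 1, LAM_H0)

# Critical region X >= c, with c the smallest value whose tail is <= ALPHA.
c = 0
while 1 - poisson_cdf(c - 1, LAM_H0) > ALPHA:
    c += 1

# Part (c): P(reject H0 | lambda = 5).
type_i = 1 - poisson_cdf(c - 1, LAM_H0)

# Part (d): P(do not reject H0 | lambda = 8).
type_ii = poisson_cdf(c - 1, LAM_TRUE)

print(f"P(X >= 7 | lambda = 5) = {p_value:.4f}")
print(f"critical region: X >= {c}")
print(f"P(Type I)  = {type_i:.4f}")
print(f"P(Type II) = {type_ii:.4f}")
```

With these assumptions the observed count of 7 lies outside the critical region \(X \geq 10\), so \(\mathrm { H } _ { 0 }\) would not be rejected.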
Question 7
7. Two methods of extracting juice from an orange are to be compared. Eight oranges are halved. One half of each orange is chosen at random and allocated to Method \(A\) and the other half is allocated to Method \(B\). The amounts of juice extracted, in ml , are given in the table. The lengths of components produced by the machines can be assumed to follow normal distributions.
  1. Use a two tail test to show, at the \(10 \%\) significance level, that the variances of the lengths of components produced by each machine can be assumed to be equal.
  2. Showing your working clearly, find a \(95 \%\) confidence interval for \(\mu _ { B } - \mu _ { A }\), where \(\mu _ { A }\) and \(\mu _ { B }\) are the mean lengths of the populations of components produced by machine \(A\) and machine \(B\) respectively. There are serious consequences for the production at the factory if the difference in mean lengths of the components produced by the two machines is more than 0.7 cm .
  3. State, giving your reason, whether or not the factory manager should be concerned.
    5. Rolls of cloth delivered to a factory contain defects at an average rate of \(\lambda\) per metre. A quality assurance manager selects a random sample of 15 metres of cloth from each delivery to test whether or not there is evidence that \(\lambda > 0.3\). The criterion that the manager uses for rejecting the hypothesis that \(\lambda = 0.3\) is that there are 9 or more defects in the sample.
  4. Find the size of the test. Table 1 gives some values, to 2 decimal places, of the power function of this test.
  5. Use a paired \(t\)-test to determine, at the \(10 \%\) level of significance, whether or not there is a difference in the mean blood pressure measured using the two methods. State your hypotheses clearly.
  6. State an assumption about the underlying distribution of measured blood pressure required for this test.
    2. The value of orders, in \(\pounds\), made to a firm over the internet has distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). A random sample of \(n\) orders is taken and \(\bar { X }\) denotes the sample mean.
  7. Write down the mean and variance of \(\bar { X }\) in terms of \(\mu\) and \(\sigma ^ { 2 }\). A second sample of \(m\) orders is taken and \(\bar { Y }\) denotes the mean of this sample.
    An estimator of the population mean is given by $$U = \frac { n \bar { X } + m \bar { Y } } { n + m }$$
  8. Show that \(U\) is an unbiased estimator for \(\mu\).
  9. Show that the variance of \(U\) is \(\frac { \sigma ^ { 2 } } { n + m }\).
  10. State which of \(\bar { X }\) or \(U\) is a better estimator for \(\mu\). Give a reason for your answer.
    3. The lengths, \(x \mathrm {~mm}\), of the forewings of a random sample of male and female adult butterflies are measured. The following statistics are obtained from the data.
  11. Stating your hypotheses clearly, and using a \(10 \%\) level of significance, test whether or not there is evidence of a difference between the variances of the marks of the two groups.
  12. State clearly an assumption you have made to enable you to carry out the test in part (a).
  13. Use a two tailed test, with a \(5 \%\) level of significance, to determine if the playing of music during the test has made any difference in the mean marks of the two groups. State your hypotheses clearly.
  14. Write down what you can conclude about the effect of music on a student's performance during the test.
    3. The weights, in grams, of mice are normally distributed. A biologist takes a random sample of 10 mice. She weighs each mouse and records its weight. The ten mice are then fed on a special diet. They are weighed again after two weeks.
    Their weights in grams are as follows:
  15. State an assumption that needs to be made in order to carry out a \(t\)-test in this case.
  16. State why a paired \(t\)-test is suitable for use with these data.
  17. Using a \(5 \%\) level of significance, test whether or not there is evidence that the device reduces \(\mathrm { CO } _ { 2 }\) emissions from cars.
  18. Explain, in context, what a type II error would be in this case.
    3. Define, in terms of \(\mathrm { H } _ { 0 }\) and/or \(\mathrm { H } _ { 1 }\),
  19. the size of a hypothesis test,
  20. the power of a hypothesis test. The probability of getting a head when a coin is tossed is denoted by \(p\). This coin is tossed 12 times in order to test the hypotheses \(\mathrm { H } _ { 0 } : p = 0.5\) against \(\mathrm { H } _ { 1 } : p \neq 0.5\), using a \(5 \%\) level of significance.
  21. Find the largest critical region for this test, such that the probability in each tail is less than \(2.5 \%\).
  22. Given that \(p = 0.4\)
    1. find the probability of a type II error when using this test,
    2. find the power of this test.
  23. Suggest two ways in which the power of the test can be increased.
    4. A farmer set up a trial to assess whether adding water to dry feed increases the milk yield of his cows. He randomly selected 22 cows. Thirteen of the cows were given dry feed and the other 9 cows were given the feed with water added. The milk yields, in litres per day, were recorded with the following results. You may assume that the times taken to complete the task by the students are two independent random samples from normal distributions.
  24. Stating your hypotheses clearly, test, at the \(10 \%\) level of significance, whether or not the variances of the times taken to complete the task with and without background music are equal.
  25. Find a \(99 \%\) confidence interval for the difference in the mean times taken to complete the task with and without background music. Experiments like this are often performed using the same people in each group.
  26. Explain why this would not be appropriate in this case.
    2. As part of an investigation, a random sample of 10 people had their heart rate, in beats per minute, measured whilst standing up and whilst lying down. The results are summarized below. Stating your hypotheses clearly, test, at the \(10 \%\) level of significance, whether or not the mean amount of juice produced by machine \(B\) is more than the mean amount produced by machine \(A\).
    4. A proportion \(p\) of letters sent by a company are incorrectly addressed and if \(p\) is thought to be greater than 0.05 then action is taken. Using \(\mathrm { H } _ { 0 } : p = 0.05\) and \(\mathrm { H } _ { 1 } : p > 0.05\), a manager from the company takes a random sample of 40 letters and rejects \(\mathrm { H } _ { 0 }\) if the number of incorrectly addressed letters is more than 3 .
  27. Find the size of this test.
  28. Find the probability of a Type II error in the case where \(p\) is in fact 0.10 . Table 1 below gives some values, to 2 decimal places, of the power function of this test. The student decides to carry out a paired \(t\)-test to investigate whether, on average, the blood pressure of a person when sitting down is more than their blood pressure after standing up.
  29. State clearly the hypotheses that should be used and any necessary assumption that needs to be made.
  30. Carry out the test at the \(1 \%\) level of significance.
    2. A biologist investigating the shell size of turtles takes random samples of adult female and adult male turtles and records the length, \(x \mathrm {~cm}\), of the shell. The results are summarised below. Assuming that the scores are normally distributed and stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not there is evidence to support the teacher's belief.
    (8)
    6. A machine fills bottles with water. The amount of water in each bottle is normally distributed. To check the machine is working properly, a random sample of 12 bottles is selected and the amount of water, in ml , in each bottle is recorded. Unbiased estimates for the mean and variance are $$\mu = 502 \quad s ^ { 2 } = 5.6$$ Stating your hypotheses clearly, test at the \(1 \%\) level of significance
  31. whether or not the mean amount of water in a bottle is more than 500 ml ,
  32. whether or not the standard deviation of the amount of water in a bottle is less than 3 ml .
    7. A machine produces bricks. The lengths, \(x \mathrm {~mm}\), of the bricks are distributed \(\mathrm { N } \left( \mu , 2 ^ { 2 } \right)\). At the start of each week a random sample of \(n\) bricks is taken to check the machine is working correctly.
    A test is then carried out at the \(1 \%\) level of significance with $$\mathrm { H } _ { 0 } : \mu = 202 \quad \text { and } \quad \mathrm { H } _ { 1 } : \mu < 202$$
  33. Find, in terms of \(n\), the critical region of the test. The probability of a type II error, when \(\mu = 200\), is less than 0.05 .
  34. Find the minimum value of \(n\).
Question 8
8. A random sample \(W _ { 1 } , W _ { 2 } \ldots , W _ { n }\) is taken from a distribution with mean \(\mu\) and variance \(\sigma ^ { 2 }\).
  1. Write down \(\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } \right)\) and show that \(\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } \right) = n \left( \sigma ^ { 2 } + \mu ^ { 2 } \right)\).

An estimator for \(\mu\) is $$\bar { X } = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i }$$

  2. Show that \(\bar { X }\) is a consistent estimator for \(\mu\).

An estimator of \(\sigma ^ { 2 }\) is $$U = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } - \left( \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } \right) ^ { 2 }$$

  3. Find the bias of \(U\).
  4. Write down an unbiased estimator of \(\sigma ^ { 2 }\) in the form \(k U\), where \(k\) is in terms of \(n\).

\section*{Advanced/Advanced Subsidiary}

\section*{Friday 21 June 2013 - Morning}

Materials required for examination: Mathematical Formulae (Pink)

Items included with question papers: Nil

Candidates may use any calculator allowed by the regulations of the Joint Council for Qualifications. Calculators must not have the facility for symbolic algebra manipulation or symbolic differentiation/integration, or have retrievable mathematical formulae stored in them. In the boxes above, write your centre number, candidate number, your surname, initials and signature. Check that you have the correct question paper.

Answer ALL the questions.

You must write your answer for each question in the space following the question.

Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.

Full marks may be obtained for answers to ALL questions.

The marks for the parts of questions are shown in round brackets, e.g. (2).

There are 6 questions in this question paper. The total mark for this paper is 75.

There are 20 pages in this question paper. Any blank pages are indicated. You must ensure that your answers to parts of questions are clearly labelled.

You must show sufficient working to make your methods clear to the Examiner.

Answers without working may not gain full credit.
    1. George owns a garage and he records the mileage of cars, \(x\) thousands of miles, between services. The results from a random sample of 10 cars are summarised below.
    $$\sum x = 113.4 \quad \sum x ^ { 2 } = 1414.08$$ The mileage of cars between services is normally distributed and George believes that the standard deviation is 2.4 thousand miles. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not these data support George's belief.
    2. Every 6 months some engineers are tested to see if their times, in minutes, to assemble a particular component have changed. The times taken to assemble the component are normally distributed. A random sample of 8 engineers was chosen and their times to assemble the component were recorded in January and in July. The data are given in the table below.
  5. Using a suitable test, at the \(5 \%\) level of significance, state whether or not, on the basis of this trial, you would recommend using the new medicine. State your hypotheses clearly.
  6. State an assumption needed to carry out this test.
    2. The cloth produced by a certain manufacturer has defects that occur randomly at a constant rate of \(\lambda\) per square metre. If \(\lambda\) is thought to be greater than 1.5 then action has to be taken. Using \(\mathrm { H } _ { 0 } : \lambda = 1.5\) and \(\mathrm { H } _ { 1 } : \lambda > 1.5\) a quality control officer takes a \(4 \mathrm {~m} ^ { 2 }\) sample of cloth and rejects \(\mathrm { H } _ { 0 }\) if there are 11 or more defects. If there are 8 or fewer defects she accepts \(\mathrm { H } _ { 0 }\). If there are 9 or 10 defects a second sample of \(4 \mathrm {~m} ^ { 2 }\) is taken and \(\mathrm { H } _ { 0 }\) is rejected if there are 11 or more defects in this second sample, otherwise it is accepted.
  7. Find the size of this test.
  8. Find the power of this test when \(\lambda = 2\).
    3. A farmer is investigating the milk yields of two breeds of cow. He takes a random sample of 9 cows of breed \(A\) and an independent random sample of 12 cows of breed \(B\). For a 5 day period he measures the amount of milk, \(x\) gallons, produced by each cow. The results are summarised in the table below.
  9. State one assumption that needs to be made in order to carry out a paired \(t\)-test.
  10. Stating your hypotheses clearly, test, at the \(1 \%\) level of significance, whether or not the drug increases the mean number of hours of sleep per night by more than 10 minutes. State the critical value for this test.
5. A statistician believes a coin is biased and the probability, \(p\), of getting a head when the coin is tossed is less than 0.5. The statistician decides to test this by tossing the coin 10 times and recording the number, \(X\), of heads. He sets up the hypotheses \(\mathrm { H } _ { 0 } : p = 0.5\) and \(\mathrm { H } _ { 1 } : p < 0.5\) and rejects the null hypothesis if \(x < 3\).
  1. Find the size of the test.
  2. Show that the power function of this test is $$( 1 - p ) ^ { 8 } \left( 36 p ^ { 2 } + 8 p + 1 \right)$$

Table 1 gives values, to 2 decimal places, of the power function for the statistician's test.

  3. On the axes below draw the graph of the power function for the statistician's test.
  4. Find the range of values of \(p\) for which the probability of accepting the coin as unbiased, when in fact it is biased, is less than or equal to 0.4. (3)
    \includegraphics[max width=\textwidth, alt={}, center]{47023328-16c0-452b-be48-046187e4193e-38_747_792_731_351}
6. (a) Explain what is meant by the sampling distribution of an estimator \(T\) of the population parameter \(\theta\).
(b) Explain what you understand by the statement that \(T\) is a biased estimator of \(\theta\).

A population has mean \(\mu\) and variance \(\sigma ^ { 2 }\). A random sample \(X _ { 1 } , X _ { 2 } , \ldots , X _ { 10 }\) is taken from this population.

(c) Calculate the bias of each of the following estimators of \(\mu\).
$$\begin{aligned} & \hat { \mu } _ { 1 } = \frac { X _ { 3 } + X _ { 5 } + X _ { 7 } } { 3 } \\ & \hat { \mu } _ { 2 } = \frac { 5 X _ { 1 } + 2 X _ { 2 } + X _ { 9 } } { 6 } \\ & \hat { \mu } _ { 3 } = \frac { 3 X _ { 10 } - X _ { 1 } } { 3 } \end{aligned}$$
(d) Find the variance of each of these three estimators.
(e) State, giving a reason, which of these three estimators for \(\mu\) is
  1. the best estimator,
  2. the worst estimator.
7. Two groups of students take the same examination. A random sample of students is taken from each of the groups.

The marks of the 9 students from Group 1 are as follows $$\begin{array} { l l l l l l l l l } 30 & 29 & 35 & 27 & 23 & 33 & 33 & 35 & 28 \end{array}$$ The marks, \(x\), of the 7 students from Group 2 gave the following statistics $$\bar { x } = 31.29 \quad s ^ { 2 } = 12.9$$ A test is to be carried out to see whether or not there is a difference between the mean marks of the two groups of students. You may assume that the samples are taken from normally distributed populations and that they are independent.
  1. State one other assumption that must be made in order to apply this test and show that this assumption is reasonable by testing it at a \(10 \%\) level of significance. State your hypotheses clearly.
  2. Stating your hypotheses clearly, test, using a significance level of \(5 \%\), whether or not there is a difference between the mean marks of the two groups of students.

\section*{TOTAL FOR PAPER: 75 MARKS}

\section*{END}

Materials required for examination: Answer Book (AB16), Graph Paper (ASG2), Mathematical Formulae (Lilac)

Items included with question papers: Nil

Paper reference 6686

Candidates may use any calculator EXCEPT those with the facility for symbolic algebra, differentiation and/or integration. Thus candidates may NOT use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G. In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S4), the paper reference (6686), your surname, other name and signature.
    Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
    Full marks may be obtained for answers to ALL questions.
    This paper has seven questions. Pages 6, 7 and 8 are blank. You must ensure that your answers to parts of questions are clearly labelled.
    You must show sufficient working to make your methods clear to the Examiner. Answers without working may gain no credit.
1. The random variable \(X\) has an \(F\) distribution with 10 and 12 degrees of freedom. Find \(a\) and \(b\) such that \(\mathrm { P } ( a < X < b ) = 0.90\).
2. A chemist has developed a fuel additive and claims that it reduces the fuel consumption of cars. To test this claim, 8 randomly selected cars were each filled with 20 litres of fuel and driven around a race circuit. Each car was tested twice, once with the additive and once without. The distances, in miles, that each car travelled before running out of fuel are given in the table below.

\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Car & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
Distance without additive & 163 & 172 & 195 & 170 & 183 & 185 & 161 & 176 \\
\hline
Distance with additive & 168 & 185 & 187 & 172 & 180 & 189 & 172 & 175 \\
\hline
\end{tabular}

Assuming that the distances travelled follow a normal distribution and stating your hypotheses clearly, test, at the \(10 \%\) level of significance, whether or not there is evidence to support the chemist's claim.
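Because each car is driven both with and without the additive, the natural analysis is a paired \(t\)-test on the eight differences. The sketch below computes the test statistic with Python's standard library. It is an illustration only: it assumes the alternative hypothesis is that the mean difference (with minus without) is positive, and the critical value is an assumed table entry for \(t _ { 7 }\).

```python
from math import sqrt
from statistics import mean, stdev

# Distances in miles, taken from the table in the question.
without_additive = [163, 172, 195, 170, 183, 185, 161, 176]
with_additive = [168, 185, 187, 172, 180, 189, 172, 175]

# Paired differences: a positive value means the car went further with the additive.
d = [w - wo for w, wo in zip(with_additive, without_additive)]
n = len(d)

# Test statistic for H0: mu_d = 0 against H1: mu_d > 0.
t_stat = mean(d) / (stdev(d) / sqrt(n))

T7_UPPER_10 = 1.415  # assumed table value: upper 10% point of t with 7 d.f.

print("differences:", d)
print(f"mean difference = {mean(d):.3f}, s_d = {stdev(d):.3f}")
print(f"t = {t_stat:.3f}, critical value = {T7_UPPER_10}")
print("reject H0:", t_stat > T7_UPPER_10)
```

`statistics.stdev` uses the \(n - 1\) divisor, which matches the unbiased estimate that the \(t\)-test requires.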
3. A technician is trying to estimate the area \(\mu ^ { 2 }\) of a metal square. The independent random variables \(X _ { 1 }\) and \(X _ { 2 }\) are each distributed \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) and represent two measurements of the sides of the square. Two estimators of the area, \(A _ { 1 }\) and \(A _ { 2 }\), are proposed where $$A _ { 1 } = X _ { 1 } X _ { 2 } \quad \text { and } \quad A _ { 2 } = \left( \frac { X _ { 1 } + X _ { 2 } } { 2 } \right) ^ { 2 } .$$
[You may assume that if \(X _ { 1 }\) and \(X _ { 2 }\) are independent random variables then \(\mathrm { E } \left( X _ { 1 } X _ { 2 } \right) = \mathrm { E } \left( X _ { 1 } \right) \mathrm { E } \left( X _ { 2 } \right)\).]
  1. Find \(\mathrm { E } \left( A _ { 1 } \right)\) and show that \(\mathrm { E } \left( A _ { 2 } \right) = \mu ^ { 2 } + \frac { \sigma ^ { 2 } } { 2 }\).
  2. Find the bias of each of these estimators.

The technician is told that \(\operatorname { Var } \left( A _ { 1 } \right) = \sigma ^ { 4 } + 2 \mu ^ { 2 } \sigma ^ { 2 }\) and \(\operatorname { Var } \left( A _ { 2 } \right) = \frac { 1 } { 2 } \sigma ^ { 4 } + 2 \mu ^ { 2 } \sigma ^ { 2 }\). The technician decided to use \(A _ { 1 }\) as the estimator for \(\mu ^ { 2 }\).

  3. Suggest a possible reason for this decision.

A statistician suggests taking a random sample of \(n\) measurements of sides of the square and finding the mean \(\bar { X }\). He knows that \(\mathrm { E } \left( \bar { X } ^ { 2 } \right) = \mu ^ { 2 } + \frac { \sigma ^ { 2 } } { n }\) and \(\operatorname { Var } \left( \bar { X } ^ { 2 } \right) = \frac { 2 \sigma ^ { 4 } } { n ^ { 2 } } + \frac { 4 \sigma ^ { 2 } \mu ^ { 2 } } { n }\).

  4. Explain whether or not \(\bar { X } ^ { 2 }\) is a consistent estimator of \(\mu ^ { 2 }\).
4. A recent census in the U.K. revealed that the heights of females in the U.K. have a mean of 160.9 cm. A doctor is studying the heights of female Indians in a remote region of South America. The doctor measured the height, \(x \mathrm {~cm}\), of each of a random sample of 30 female Indians and obtained the following statistics. $$\Sigma x = 4400.7 , \quad \Sigma x ^ { 2 } = 646904.41 .$$ The heights of female Indians may be assumed to follow a normal distribution.
The doctor presented the results of the study in a medical journal and wrote 'the female Indians in this region are more than 10 cm shorter than females in the U.K.'
  1. Stating your hypotheses clearly and using a \(5 \%\) level of significance, test the doctor's statement.

The census also revealed that the standard deviation of the heights of U.K. females was 6.0 cm.

  2. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not there is evidence that the variance of the heights of female Indians is different from that of females in the U.K.
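Both parts of the heights question reduce to a statistic built from \(\Sigma x\) and \(\Sigma x ^ { 2 }\). The Python sketch below shows the arithmetic under stated assumptions: part 1 is read as a one-tailed \(t\)-test of \(\mathrm { H } _ { 0 } : \mu = 150.9\) against \(\mathrm { H } _ { 1 } : \mu < 150.9\), part 2 as a two-tailed \(\chi ^ { 2 }\) test of \(\sigma ^ { 2 } = 36\), and the critical values are assumed entries from standard tables.

```python
from math import sqrt

N = 30
SUM_X = 4400.7
SUM_X2 = 646904.41
UK_MEAN = 160.9
UK_SD = 6.0

xbar = SUM_X / N
sxx = SUM_X2 - SUM_X**2 / N   # corrected sum of squares
s2 = sxx / (N - 1)            # unbiased variance estimate

# Part 1: "more than 10 cm shorter" means mu < 160.9 - 10.
mu_0 = UK_MEAN - 10
t_stat = (xbar - mu_0) / sqrt(s2 / N)
T29_UPPER_5 = 1.699           # assumed table value, t with 29 d.f.

# Part 2: (n-1)s^2 / sigma_0^2 follows chi-squared with 29 d.f. under H0.
chi2_stat = sxx / UK_SD**2
CHI2_29_LOWER = 16.047        # assumed table value, lower 2.5% point
CHI2_29_UPPER = 45.722        # assumed table value, upper 2.5% point

print(f"xbar = {xbar:.2f}, s^2 = {s2:.3f}")
print(f"t = {t_stat:.3f}; reject H0: {t_stat < -T29_UPPER_5}")
print(f"chi2 = {chi2_stat:.3f}; reject H0: "
      f"{not (CHI2_29_LOWER < chi2_stat < CHI2_29_UPPER)}")
```

Under these assumptions the \(t\) statistic lies well inside the lower critical region, while the \(\chi ^ { 2 }\) statistic sits between the two critical values.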
5. The times, \(x\) seconds, taken by the competitors in the 100 m freestyle events at a school swimming gala are recorded. The following statistics are obtained from the data.

\begin{tabular}{|l|c|c|c|}
\hline
 & No. of competitors & Sample Mean \(\bar { x }\) & \(\sum x ^ { 2 }\) \\
\hline
Girls & 8 & 83.10 & 55746 \\
\hline
Boys & 7 & 88.90 & 56130 \\
\hline
\end{tabular}

Following the gala a proud parent claims that girls are faster swimmers than boys. Assuming that the times taken by the competitors are two independent random samples from normal distributions,
  1. test, at the \(10 \%\) level of significance, whether or not the variances of the two distributions are the same. State your hypotheses clearly.
  2. Stating your hypotheses clearly, test the parent's claim. Use a \(5 \%\) level of significance.
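The swimming question chains two tests: a variance-ratio \(F\)-test, then a pooled two-sample \(t\)-test that relies on its outcome. The sketch below derives both statistics from the summary table. It is a hedged illustration: the helper `unbiased_var` is a hypothetical name, the alternative for part 2 is taken as \(\mu _ { \text {girls} } < \mu _ { \text {boys} }\), and the critical values are assumed table entries.

```python
from math import sqrt

# Summary statistics from the table: (n, sample mean, sum of x^2).
n_g, mean_g, sumsq_g = 8, 83.10, 55746
n_b, mean_b, sumsq_b = 7, 88.90, 56130


def unbiased_var(n: int, xbar: float, sumsq: float) -> float:
    """s^2 = (sum x^2 - n * xbar^2) / (n - 1)."""
    return (sumsq - n * xbar**2) / (n - 1)


s2_g = unbiased_var(n_g, mean_g, sumsq_g)
s2_b = unbiased_var(n_b, mean_b, sumsq_b)

# Part 1: larger variance over smaller, here boys (6 d.f.) over girls (7 d.f.).
f_stat = s2_b / s2_g
F_6_7_UPPER_5 = 3.87          # assumed table value

# Part 2: pooled variance and t statistic with n_g + n_b - 2 = 13 d.f.
pooled = ((n_g - 1) * s2_g + (n_b - 1) * s2_b) / (n_g + n_b - 2)
t_stat = (mean_g - mean_b) / sqrt(pooled * (1 / n_g + 1 / n_b))
T13_UPPER_5 = 1.771           # assumed table value

print(f"s2_girls = {s2_g:.2f}, s2_boys = {s2_b:.2f}")
print(f"F = {f_stat:.3f}; variances differ: {f_stat > F_6_7_UPPER_5}")
print(f"t = {t_stat:.3f}; girls faster: {t_stat < -T13_UPPER_5}")
```

A negative \(t\) statistic means the girls' mean time is lower, but under these assumptions it does not reach the critical value.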
6. A nutritionist studied the levels of cholesterol, \(X \mathrm { mg } / \mathrm { cm } ^ { 3 }\), of male students at a large college. She assumed that \(X\) was distributed \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) and examined a random sample of 25 male students. Using this sample she obtained unbiased estimates of \(\mu\) and \(\sigma ^ { 2 }\) as $$\hat { \mu } = 1.68 , \quad \hat { \sigma } ^ { 2 } = 1.79 .$$
  1. Find a \(95 \%\) confidence interval for \(\mu\).
  2. Obtain a \(95 \%\) confidence interval for \(\sigma ^ { 2 }\).

A cholesterol reading of more than \(2.5 \mathrm { mg } / \mathrm { cm } ^ { 3 }\) is regarded as high.

  3. Use appropriate confidence limits from parts (a) and (b) to find the lowest estimate of the proportion of male students in the college with high cholesterol.
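The three parts of the cholesterol question build on one another, which a short Python sketch makes concrete. It is an illustration under stated assumptions: the \(t _ { 24 }\) and \(\chi ^ { 2 } _ { 24 }\) percentage points are assumed table values, and the "lowest estimate" is read as combining the lower limit for \(\mu\) with the lower limit for \(\sigma ^ { 2 }\), since both shrink the upper tail beyond 2.5.

```python
from math import sqrt
from statistics import NormalDist

N = 25
MU_HAT = 1.68
VAR_HAT = 1.79
HIGH = 2.5                    # readings above this count as high

# Assumed table values for 24 degrees of freedom.
T24_UPPER_2_5 = 2.064
CHI2_24_UPPER_2_5 = 39.364
CHI2_24_LOWER_2_5 = 12.401

# 95% CI for mu: mu_hat +/- t * sqrt(s^2 / n).
half_width = T24_UPPER_2_5 * sqrt(VAR_HAT / N)
mean_ci = (MU_HAT - half_width, MU_HAT + half_width)

# 95% CI for sigma^2: (n-1)s^2 / chi2_upper to (n-1)s^2 / chi2_lower.
var_ci = (
    (N - 1) * VAR_HAT / CHI2_24_UPPER_2_5,
    (N - 1) * VAR_HAT / CHI2_24_LOWER_2_5,
)

# Smallest P(X > 2.5): lowest mean combined with lowest variance.
lowest_prop = 1 - NormalDist(mean_ci[0], sqrt(var_ci[0])).cdf(HIGH)

print(f"95% CI for mu:      ({mean_ci[0]:.3f}, {mean_ci[1]:.3f})")
print(f"95% CI for sigma^2: ({var_ci[0]:.3f}, {var_ci[1]:.3f})")
print(f"lowest estimated proportion above {HIGH}: {lowest_prop:.4f}")
```

`statistics.NormalDist` takes a standard deviation, not a variance, hence the square root of the lower variance limit.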