OCR S4 (Statistics 4) 2011 June

Question 1
1 The random variable \(X\) has the distribution \(\mathrm { B } ( n , p )\).
  1. Show, from the definition, that the probability generating function of \(X\) is \(( q + p t ) ^ { n }\), where \(q = 1 - p\).
  2. The independent random variable \(Y\) has the distribution \(\mathrm { B } ( 2 n , p )\) and \(T = X + Y\). Use probability generating functions to determine the distribution of \(T\), giving its parameters.
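Both parts can be sanity-checked numerically. The sketch below is only an illustration: the function names and the values of `n`, `p` and `t` are assumptions, not part of the question. It compares the defining sum \(\mathrm{E}(t^X)\) with the closed form, then multiplies the two generating functions as independence allows.

```python
from math import comb, isclose


def pgf_from_definition(n, p, t):
    """E(t^X) for X ~ B(n, p), summed term by term from the definition."""
    q = 1 - p
    return sum(comb(n, k) * p**k * q**(n - k) * t**k for k in range(n + 1))


def pgf_closed_form(n, p, t):
    """The closed form (q + p t)^n with q = 1 - p."""
    return (1 - p + p * t) ** n


# Illustrative values only (assumed, not taken from the question).
n, p = 4, 0.3

for t in (0.0, 0.5, 1.0, 1.7):
    # Part 1: the definition and the closed form agree.
    assert isclose(pgf_from_definition(n, p, t), pgf_closed_form(n, p, t))
    # Part 2: for independent X and Y, G_T(t) = G_X(t) * G_Y(t), which
    # matches the generating function of B(3n, p).
    product = pgf_closed_form(n, p, t) * pgf_closed_form(2 * n, p, t)
    assert isclose(product, pgf_closed_form(3 * n, p, t))
```

A numerical agreement at a few points is a check, not a proof; the question still asks for the algebraic argument.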
Question 2
2 A botanist believes that some species of plants produce more flowers at high altitudes than at low altitudes. In order to investigate this belief the botanist randomly samples 11 species of plants each of which occurs at both altitudes. The numbers of flowers on the plants are shown in the table.
Species: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Number of flowers at low altitude534729654112
Number of flowers at high altitude161081416202115212
  1. Use the Wilcoxon signed rank test at the \(5 \%\) significance level to test the botanist's belief.
  2. Explain why the Wilcoxon rank sum test should not be used for this test.
Question 3
3 For the events \(A\) and \(B\), \(\mathrm { P } ( A ) = \mathrm { P } ( B ) = \frac { 3 } { 4 }\) and \(\mathrm { P } \left( A \mid B ^ { \prime } \right) = \frac { 1 } { 2 }\).
  1. Find \(\mathrm { P } ( A \cap B )\).

For a third event \(C\), \(\mathrm { P } ( C ) = \frac { 1 } { 4 }\) and \(C\) is independent of the event \(A \cap B\).

  2. Find \(\mathrm { P } ( A \cap B \cap C )\).
  3. Given that \(\mathrm { P } ( C \mid A ) = \lambda\) and \(\mathrm { P } ( B \mid C ) = 3 \lambda\), and that no event occurs outside \(A \cup B \cup C\), find the value of \(\lambda\).
Question 4
4 The discrete random variable \(X\) has moment generating function \(\left( \frac { 1 } { 4 } + \frac { 3 } { 4 } \mathrm { e } ^ { t } \right) ^ { 3 }\).
  1. Find \(\mathrm { E } ( X )\).
  2. Find \(\mathrm { P } ( X = 2 )\).
  3. Show that \(X\) can be expressed as a sum of 3 independent observations of a random variable \(Y\). Obtain the probability distribution of \(Y\), and the variance of \(Y\).
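One way to check answers to this question is to read a single factor \(\frac{1}{4} + \frac{3}{4}\mathrm{e}^{t}\) as the moment generating function of a variable taking the values 0 and 1, and to build the distribution of \(X\) by adding three independent copies. The sketch below does that with exact fractions; the helper names are illustrative assumptions.

```python
from fractions import Fraction


def convolve(pmf_a, pmf_b):
    """PMF of the sum of two independent variables on 0, 1, 2, ..."""
    out = [Fraction(0)] * (len(pmf_a) + len(pmf_b) - 1)
    for i, pa in enumerate(pmf_a):
        for j, pb in enumerate(pmf_b):
            out[i + j] += pa * pb
    return out


def mean(pmf):
    return sum(k * pk for k, pk in enumerate(pmf))


def variance(pmf):
    return sum(k * k * pk for k, pk in enumerate(pmf)) - mean(pmf) ** 2


# One factor (1/4 + 3/4 e^t) is the MGF of Y with P(Y=0)=1/4, P(Y=1)=3/4.
y_pmf = [Fraction(1, 4), Fraction(3, 4)]

# Cubing the MGF corresponds to adding three independent copies of Y.
x_pmf = convolve(convolve(y_pmf, y_pmf), y_pmf)

print(mean(x_pmf))      # E(X)
print(x_pmf[2])         # P(X = 2)
print(variance(y_pmf))  # Var(Y)
```

Using `Fraction` keeps every probability exact, so the printed values can be compared directly with hand-derived answers.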
Question 5
5 A test was carried out to compare the breaking strengths of two brands of elastic band, \(A\) and \(B\), of the same size. Random samples of 6 were selected from each brand and the breaking strengths were measured. The results, in suitable units and arranged in ascending order for each brand, are as follows.
Brand \(A\): 5.6, 8.7, 9.2, 10.7, 11.2, 12.6
Brand \(B\): 10.1, 11.6, 12.0, 12.2, 12.9, 13.5
  1. Give one advantage that a non-parametric test might have over a parametric test in this context.
  2. Carry out a suitable Wilcoxon test at the \(5 \%\) significance level of whether there is a difference between the average breaking strengths of the two brands.
  3. An extra elastic band of brand \(B\) was tested and found to have a breaking strength exceeding all of the other 12 bands. Determine whether this information alters the conclusion of your test.
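The rank sums needed for parts 2 and 3 can be checked with a short script. This is a sketch under stated assumptions: the function names are illustrative, the data contain no ties, and the tail probability is found by exact enumeration of rank subsets rather than by the tabulated critical values an exam solution would normally quote. The extra brand \(B\) band is given a hypothetical value just above the largest observation, since only its rank matters.

```python
from itertools import combinations
from math import comb

brand_a = [5.6, 8.7, 9.2, 10.7, 11.2, 12.6]
brand_b = [10.1, 11.6, 12.0, 12.2, 12.9, 13.5]


def rank_sum_of_first(sample, other):
    """Sum of the ranks of `sample` in the pooled data (assumes no ties)."""
    pooled = sorted(sample + other)
    return sum(pooled.index(x) + 1 for x in sample)


def two_sided_p_value(sample, other):
    """Exact two-sided p-value, doubling the smaller tail of the rank sum."""
    m, total = len(sample), len(sample) + len(other)
    observed = rank_sum_of_first(sample, other)
    # Under H0 every set of m ranks out of 1..total is equally likely.
    sums = [sum(c) for c in combinations(range(1, total + 1), m)]
    lower = sum(s <= observed for s in sums) / comb(total, m)
    upper = sum(s >= observed for s in sums) / comb(total, m)
    return min(1.0, 2 * min(lower, upper))


w_a = rank_sum_of_first(brand_a, brand_b)
p_original = two_sided_p_value(brand_a, brand_b)

# Hypothetical value for the extra band: only its rank (largest) matters.
extra_b = max(brand_a + brand_b) + 1.0
p_extended = two_sided_p_value(brand_a, brand_b + [extra_b])

print(w_a, p_original, p_extended)
```

The rank sum for brand \(A\) is 27 in both cases, because the extra band ranks above every existing observation; what changes is the reference distribution, which now has sample sizes 6 and 7.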
Question 6
6 A City Council comprises 16 Labour members, 14 Conservative members and 6 members of Other parties. A sample of two members was chosen at random to represent the Council at an event. The number of Labour members and the number of Conservative members in this sample are denoted by \(L\) and \(C\) respectively. The joint probability distribution of \(L\) and \(C\) is given in the following table.

| | \(L = 0\) | \(L = 1\) | \(L = 2\) |
| --- | --- | --- | --- |
| \(C = 0\) | \(\frac { 1 } { 42 }\) | \(\frac { 16 } { 105 }\) | \(\frac { 4 } { 21 }\) |
| \(C = 1\) | \(\frac { 2 } { 15 }\) | \(\frac { 16 } { 45 }\) | 0 |
| \(C = 2\) | \(\frac { 13 } { 90 }\) | 0 | 0 |

  1. Verify the two non-zero probabilities in the table for which \(C = 1\).
  2. Find the expected number of Conservatives in the sample.
  3. Find the expected number of Other members in the sample.
  4. Explain why \(L\) and \(C\) are not independent, and state what can be deduced about \(\operatorname { Cov } ( L , C )\).
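The table entries and the expectations can be reproduced by counting samples directly. The sketch below assumes sampling without replacement from the 36 members and uses illustrative names (`joint`, `e_other` and so on) that do not come from the question.

```python
from fractions import Fraction
from math import comb

LABOUR, CONSERVATIVE, OTHER = 16, 14, 6
SAMPLE_SIZE = 2
TOTAL_SAMPLES = comb(LABOUR + CONSERVATIVE + OTHER, SAMPLE_SIZE)


def joint(lab, con):
    """P(L = lab, C = con) when two members are drawn without replacement."""
    others = SAMPLE_SIZE - lab - con
    if others < 0:
        return Fraction(0)
    ways = comb(LABOUR, lab) * comb(CONSERVATIVE, con) * comb(OTHER, others)
    return Fraction(ways, TOTAL_SAMPLES)


cells = [(lab, con) for lab in range(3) for con in range(3)]
e_l = sum(lab * joint(lab, con) for lab, con in cells)
e_c = sum(con * joint(lab, con) for lab, con in cells)
e_other = SAMPLE_SIZE - e_l - e_c  # L + C + Other = 2 in every sample
cov_lc = sum(lab * con * joint(lab, con) for lab, con in cells) - e_l * e_c

print(joint(0, 1), joint(1, 1))  # the two non-zero entries with C = 1
print(e_c, e_other, cov_lc)
```

The sign of `cov_lc` is the relevant output for part 4; the zero cells in the table already show that the joint probabilities cannot factorise.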
Question 7
7 The continuous random variable \(U\) has unknown mean \(\mu\) and known variance \(\sigma ^ { 2 }\). In order to estimate \(\mu\), two random samples, one of 4 observations of \(U\) and the other of 6 observations of \(U\), are taken. The sample means are denoted by \(\bar { U } _ { 4 }\) and \(\bar { U } _ { 6 }\) respectively. One estimator \(S\), given by \(S = \frac { 1 } { 2 } \left( \bar { U } _ { 4 } + \bar { U } _ { 6 } \right)\), is proposed.
  1. Show that \(S\) is unbiased and find \(\operatorname { Var } ( S )\) in terms of \(\sigma ^ { 2 }\).

A second estimator \(T\) of the form \(a \bar { U } _ { 4 } + b \bar { U } _ { 6 }\) is proposed, where \(a\) and \(b\) are chosen such that \(T\) is an unbiased estimator for \(\mu\) with the smallest possible variance.

  2. Find the values of \(a\) and \(b\) and the corresponding variance of \(T\).
  3. State, giving a reason, which of \(S\) and \(T\) is the better estimator.
  4. Compare the efficiencies of this preferred estimator and the mean of all 10 observations.
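The variances in this question can be compared with exact arithmetic. The sketch below works in units of \(\sigma^2\) and assumes the two samples are independent; the names `variance`, `a_opt` and `var_pooled` are illustrative. The optimal weight comes from setting the derivative of \(\frac{a^2}{4} + \frac{(1-a)^2}{6}\) to zero, and a coarse grid search is included as a cross-check.

```python
from fractions import Fraction

# Variances of the two sample means, in units of sigma^2.
VAR_U4 = Fraction(1, 4)
VAR_U6 = Fraction(1, 6)


def variance(a, b):
    """Var(a * U4_bar + b * U6_bar) / sigma^2 for independent samples."""
    return a * a * VAR_U4 + b * b * VAR_U6


# Estimator S uses equal weights.
var_s = variance(Fraction(1, 2), Fraction(1, 2))

# Unbiasedness needs a + b = 1. Minimising a^2/4 + (1 - a)^2/6 gives
# a/2 = (1 - a)/3, i.e. weights inversely proportional to the variances.
a_opt = VAR_U6 / (VAR_U4 + VAR_U6)
b_opt = 1 - a_opt
var_t = variance(a_opt, b_opt)

# Cross-check: no weight on a grid of hundredths does better than a_opt.
grid = [Fraction(i, 100) for i in range(101)]
assert all(variance(a, 1 - a) >= var_t for a in grid)

# Mean of all 10 observations, for part 4.
var_pooled = Fraction(1, 10)

print(var_s, a_opt, b_opt, var_t, var_pooled)
```

Comparing `var_t` with `var_pooled` addresses part 4 directly, since relative efficiency is a ratio of variances.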