OCR S4 (Statistics 4) 2008 June


Question 1 7 marks
1 For the mutually exclusive events \(A\) and \(B\), \(\mathrm { P } ( A ) = \mathrm { P } ( B ) = x\), where \(x \neq 0\).
  1. Show that \(x \leqslant \frac { 1 } { 2 }\).
  2. Show that \(A\) and \(B\) are not independent.
The event \(C\) is independent of \(A\) and also independent of \(B\), and \(\mathrm { P } ( C ) = 2 x\).
  3. Show that \(\mathrm { P } ( A \cup B \cup C ) = 4 x ( 1 - x )\).
Question 2 8 marks
2 Part of Helen's psychology dissertation involved the reaction times to a certain stimulus. She measured the reaction times of 30 randomly selected students, in seconds correct to 2 decimal places. The results are shown in the following stem-and-leaf diagram.
14 | 1 2
15 | 2 4
16 | 0 3 6
17 | 1 5 7
18 | 3 4 5 7 9
19 | 2 4 6 7 8 9
20 | 0 1 3 4 5 7 8 9
21 | 7
Key: 18 | 3 means 1.83 seconds
Helen wishes to test whether the population median time exceeds 1.80 seconds.
  1. Give a reason why the Wilcoxon signed-rank test should not be used.
  2. Carry out a suitable non-parametric test at the \(5 \%\) significance level.
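As a rough check of the arithmetic, the sketch below works through a sign test on these data in Python. The choice of the sign test, the variable names and the one-tailed binomial calculation are assumptions made for illustration and are not taken from the mark scheme.

```python
from math import comb

# Reaction times in hundredths of a second, read from the stem-and-leaf diagram.
stems = {
    14: [1, 2],
    15: [2, 4],
    16: [0, 3, 6],
    17: [1, 5, 7],
    18: [3, 4, 5, 7, 9],
    19: [2, 4, 6, 7, 8, 9],
    20: [0, 1, 3, 4, 5, 7, 8, 9],
    21: [7],
}
times = [10 * stem + leaf for stem, leaves in stems.items() for leaf in leaves]

median_h0 = 180  # hypothesised median of 1.80 s, in hundredths of a second
below = sum(t < median_h0 for t in times)
above = sum(t > median_h0 for t in times)
n = below + above  # any values equal to the hypothesised median would be dropped

# Under H0 the number of values below the median is B(n, 0.5).
# H1 is "median > 1.80", so a small count below 1.80 is evidence against H0.
p_value = sum(comb(n, k) for k in range(below + 1)) / 2 ** n

print(len(times), below, above, round(p_value, 4))
```

The test then compares `p_value` with 0.05, the significance level given in the question.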
Question 3 11 marks
3 From the records of Mulcaster United Football Club the following distribution was suggested as a probability model for future matches. \(X\) and \(Y\) denoted the numbers of goals scored by the home team and the away team respectively.
$$\begin{array} { c | c c c c }
Y \backslash X & 0 & 1 & 2 & 3 \\ \hline
0 & 0.11 & 0.04 & 0.06 & 0.08 \\
1 & 0.08 & 0.05 & 0.12 & 0.05 \\
2 & 0.05 & 0.08 & 0.07 & 0.03 \\
3 & 0.03 & 0.06 & 0.07 & 0.02
\end{array}$$
Use the model to find
  1. \(\mathrm { E } ( X )\),
  2. the probability that the away team wins a randomly chosen match,
  3. the probability that the away team wins a randomly chosen match, given that the home team scores.
One of the directors, an amateur statistician, finds that \(\operatorname { Cov } ( X , Y ) = 0.007\). He states that, as this value is very close to zero, \(X\) and \(Y\) may be considered to be independent.
  4. Comment on the director's statement.
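The quantities asked for can be checked numerically from the joint distribution. The sketch below assumes that the columns of the table are the values of \(X\) and the rows are the values of \(Y\); if the table is oriented the other way, the roles of the two variables swap. All variable names are illustrative.

```python
# Joint probabilities P(X = x, Y = y).
# ASSUMED layout: each inner list is one value of y, entries run over x = 0..3.
joint = [
    [0.11, 0.04, 0.06, 0.08],  # y = 0
    [0.08, 0.05, 0.12, 0.05],  # y = 1
    [0.05, 0.08, 0.07, 0.03],  # y = 2
    [0.03, 0.06, 0.07, 0.02],  # y = 3
]
cells = [(x, y, joint[y][x]) for y in range(4) for x in range(4)]

total = sum(p for _, _, p in cells)
e_x = sum(x * p for x, _, p in cells)
e_y = sum(y * p for _, y, p in cells)
e_xy = sum(x * y * p for x, y, p in cells)
cov = e_xy - e_x * e_y

# Away team wins when y > x; home team scores when x > 0.
p_away_wins = sum(p for x, y, p in cells if y > x)
p_home_scores = sum(p for x, _, p in cells if x > 0)
p_both = sum(p for x, y, p in cells if y > x > 0)
p_conditional = p_both / p_home_scores

# Independence needs P(X = x, Y = y) = P(X = x) P(Y = y) in every cell.
p_x0 = sum(p for x, _, p in cells if x == 0)
p_y0 = sum(p for _, y, p in cells if y == 0)
product_00 = p_x0 * p_y0

print(round(e_x, 2), round(p_away_wins, 2), round(p_conditional, 4))
print(round(cov, 3), joint[0][0], round(product_00, 4))
```

Comparing `joint[0][0]` with `product_00` shows one way to assess the director's claim: zero covariance is necessary for independence but not sufficient, so a single cell where the product rule fails is enough to rule independence out.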
Question 4 7 marks
4 William takes a bus regularly on the same journey, sometimes in the morning and sometimes in the afternoon. He wishes to compare morning and afternoon journey times. He records the journey times on 7 randomly chosen mornings and 8 randomly chosen afternoons. The results, each correct to the nearest minute, are as follows, where M denotes a morning time and A denotes an afternoon time.
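$$\begin{array} { c c c c c c c c c c c c c c c }
\mathrm { M } & \mathrm { A } & \mathrm { A } & \mathrm { M } & \mathrm { M } & \mathrm { M } & \mathrm { M } & \mathrm { M } & \mathrm { M } & \mathrm { A } & \mathrm { A } & \mathrm { A } & \mathrm { A } & \mathrm { A } & \mathrm { A } \\
19 & 20 & 22 & 24 & 25 & 26 & 28 & 30 & 31 & 33 & 35 & 37 & 38 & 39 & 42
\end{array}$$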
William wishes to test for a difference between the average times of morning and afternoon journeys.
  1. Given that \(s _ { M } ^ { 2 } = 16.5\) and \(s _ { A } ^ { 2 } = 64.5\), with the usual notation, explain why a \(t\)-test is not appropriate in this case.
  2. William chooses a non-parametric test at the \(5 \%\) significance level. Carry out the test, stating the rejection region.
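The rank sums for a Wilcoxon rank-sum test on these data can be computed as in the sketch below. The choice of that test, the variable names and the exact enumeration of the null distribution (in place of printed tables) are assumptions made for illustration.

```python
from itertools import combinations

labels = "MAAMMMMMMAAAAAA"
times = [19, 20, 22, 24, 25, 26, 28, 30, 31, 33, 35, 37, 38, 39, 42]

# The times are listed in increasing order with no ties, so rank = position + 1.
ranks_m = [i + 1 for i, lab in enumerate(labels) if lab == "M"]
m = len(ranks_m)           # number of morning journeys
n = len(labels) - m        # number of afternoon journeys
r_m = sum(ranks_m)         # rank sum of the smaller (morning) sample
w = min(r_m, m * (m + n + 1) - r_m)

# Exact null distribution: every choice of m ranks out of m + n is equally likely.
all_sums = [sum(c) for c in combinations(range(1, m + n + 1), m)]
p_lower = sum(s <= w for s in all_sums) / len(all_sums)
p_two_sided = min(1.0, 2 * p_lower)  # the null distribution is symmetric

print(m, n, r_m, w, round(p_two_sided, 4))
```

In an exam setting the statistic `w` would be compared with the tabulated critical value for these sample sizes; the enumeration here only serves as a cross-check.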
Question 5 11 marks
5 The discrete random variable \(X\) has moment generating function \(\frac { 1 } { 4 } \mathrm { e } ^ { 2 t } + a \mathrm { e } ^ { 3 t } + b \mathrm { e } ^ { 4 t }\), where \(a\) and \(b\) are constants. It is given that \(\mathrm { E } ( X ) = 3 \frac { 3 } { 8 }\).
  1. Show that \(a = \frac { 1 } { 8 }\), and find the value of \(b\).
  2. Find \(\operatorname { Var } ( X )\).
  3. State the possible values of \(X\).
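The constants can be checked with exact arithmetic. The sketch below uses the two conditions \(\mathrm{M}(0) = 1\) and \(\mathrm{M}'(0) = \mathrm{E}(X)\); reading the coefficients of \(\mathrm{e}^{2t}\), \(\mathrm{e}^{3t}\) and \(\mathrm{e}^{4t}\) as the probabilities of \(X = 2, 3, 4\) is the standard interpretation of such a moment generating function. The variable names are illustrative.

```python
from fractions import Fraction as F

mean = F(27, 8)  # E(X) = 3 3/8
p2 = F(1, 4)     # coefficient of e^{2t}, i.e. P(X = 2)

# M(0) = 1      ->  a + b = 1 - p2
# M'(0) = E(X)  ->  3a + 4b = E(X) - 2 * p2
s = 1 - p2
a = 4 * s - (mean - 2 * p2)  # eliminate b using b = s - a
b = s - a

pmf = {2: p2, 3: a, 4: b}
e_x = sum(x * p for x, p in pmf.items())
e_x2 = sum(x * x * p for x, p in pmf.items())
var = e_x2 - e_x ** 2

print(a, b, var)
```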
Question 6 15 marks
6 The continuous random variable \(Y\) has cumulative distribution function given by $$\mathrm { F } ( y ) = \begin{cases} 0 & y < a , \\ 1 - \frac { a ^ { 3 } } { y ^ { 3 } } & y \geqslant a , \end{cases}$$ where \(a\) is a positive constant. A random sample of 3 observations, \(Y _ { 1 } , Y _ { 2 } , Y _ { 3 }\), is taken, and the smallest is denoted by \(S\).
  1. Show that \(\mathrm { P } ( S > s ) = \left( \frac { a } { s } \right) ^ { 9 }\) and hence obtain the probability density function of \(S\).
  2. Show that \(S\) is not an unbiased estimator of \(a\), and construct an unbiased estimator, \(T _ { 1 }\), based on \(S\).
It is given that \(T _ { 2 }\), where \(T _ { 2 } = \frac { 2 } { 9 } \left( Y _ { 1 } + Y _ { 2 } + Y _ { 3 } \right)\), is another unbiased estimator of \(a\).
  3. Given that \(\operatorname { Var } ( Y ) = \frac { 3 } { 4 } a ^ { 2 }\) and \(\operatorname { Var } ( S ) = \frac { 9 } { 448 } a ^ { 2 }\), determine which of \(T _ { 1 }\) and \(T _ { 2 }\) is the more efficient estimator.
  4. The values of \(Y\) for a particular sample are 12.8, 4.5 and 7.0. Find the values of \(T _ { 1 }\) and \(T _ { 2 }\) for this sample, and give a reason, unrelated to efficiency, why \(T _ { 1 }\) gives a better estimate of \(a\) than \(T _ { 2 }\) in this case.
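The estimator comparison can be checked numerically as sketched below. The form \(T_1 = \frac{8}{9} S\) is an assumption here: it follows if \(\mathrm{E}(S) = \frac{9}{8} a\), which is what integrating \(s\) against the density \(9 a^{9} / s^{10}\) over \(s \geqslant a\) gives. Variances are expressed as multiples of \(a^2\), and the variable names are illustrative.

```python
from fractions import Fraction as F

sample = [F("12.8"), F("4.5"), F("7.0")]
s_min = min(sample)

# ASSUMPTION: T1 = (8/9) S, from E(S) = 9a/8.
t1 = F(8, 9) * s_min
# T2 as given in the question.
t2 = F(2, 9) * sum(sample)

# Variances in units of a^2, using Var(S) = 9/448 and Var(Y) = 3/4.
var_t1 = F(8, 9) ** 2 * F(9, 448)
var_t2 = F(2, 9) ** 2 * 3 * F(3, 4)  # three independent observations

# Every observation is at least a, so a cannot exceed the sample minimum.
t2_exceeds_minimum = t2 > s_min

print(float(t1), float(t2), var_t1, var_t2, t2_exceeds_minimum)
```

The last flag points at the non-efficiency argument: an estimate of \(a\) larger than the smallest observation cannot be the true value.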
Question 7 13 marks
7 The probability generating function of the random variable \(X\) is given by $$\mathrm { G } ( t ) = \frac { 1 + a t } { 4 - t }$$ where \(a\) is a constant.
  1. Find the value of \(a\).
  2. Find \(\mathrm { P } ( X = 3 )\).
The sum of 3 independent observations of \(X\) is denoted by \(Y\). The probability generating function of \(Y\) is denoted by \(\mathrm { H } ( t )\).
  3. Use \(\mathrm { H } ( t )\) to find \(\mathrm { E } ( Y )\).
  4. By considering \(\mathrm { H } ( - 1 ) + \mathrm { H } ( 1 )\), show that \(\mathrm { P } ( Y \text { is an even number} ) = \frac { 62 } { 125 }\).
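The results in this question can be verified with exact fractions, as in the sketch below. It assumes \(a = 2\), which is what the condition \(\mathrm{G}(1) = 1\) gives, and uses \(\mathrm{H}(t) = \mathrm{G}(t)^3\) for the sum of three independent observations. The function names `G`, `H` and `p` are illustrative.

```python
from fractions import Fraction as F

a = F(2)  # from G(1) = (1 + a) / 3 = 1


def G(t):
    """Probability generating function of X."""
    return (1 + a * t) / (4 - t)


def H(t):
    """PGF of Y, the sum of 3 independent observations of X."""
    return G(t) ** 3


def p(n):
    """Coefficient of t^n in G(t), using 1/(4 - t) = sum_k t^k / 4^(k+1)."""
    term = F(1, 4 ** (n + 1))
    if n >= 1:
        term += a * F(1, 4 ** n)
    return term


p3 = p(3)

# E(Y) = H'(1) = 3 G(1)^2 G'(1), where G'(t) = (4a + 1) / (4 - t)^2.
g_prime_1 = (4 * a + 1) / F(4 - 1) ** 2
e_y = 3 * G(F(1)) ** 2 * g_prime_1

# H(1) adds all probabilities; H(-1) adds even terms and subtracts odd ones.
p_even = (H(F(1)) + H(F(-1))) / 2

print(p3, e_y, p_even)
```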