OCR S4 (Statistics 4) 2008 June

Question 1
1 For the mutually exclusive events \(A\) and \(B\), \(\mathrm { P } ( A ) = \mathrm { P } ( B ) = x\), where \(x \neq 0\).
  1. Show that \(x \leqslant \frac { 1 } { 2 }\).
  2. Show that \(A\) and \(B\) are not independent.
The event \(C\) is independent of \(A\) and also independent of \(B\), and \(\mathrm { P } ( C ) = 2 x\).
  3. Show that \(\mathrm { P } ( A \cup B \cup C ) = 4 x ( 1 - x )\).
Question 2
2 Part of Helen’s psychology dissertation involved the reaction times to a certain stimulus. She measured the reaction times of 30 randomly selected students, in seconds correct to 2 decimal places. The results are shown in the following stem-and-leaf diagram.
14 | 1 2
15 | 2 4
16 | 0 3 6
17 | 1 5 7
18 | 3 4 5 7 9
19 | 2 4 6 7 8 9
20 | 0 1 3 4 5 7 8 9
21 | 7
Key: 18 | 3 means 1.83 seconds
Helen wishes to test whether the population median time exceeds 1.80 seconds.
  1. Give a reason why the Wilcoxon signed-rank test should not be used.
  2. Carry out a suitable non-parametric test at the \(5 \%\) significance level.
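For readers who want to check their working numerically, here is a minimal sketch of a sign test on the data above. It assumes the sign test is the intended "suitable non-parametric test" (the question does not name one), and the variable names are illustrative only.

```python
from math import comb

# Reaction times transcribed from the stem-and-leaf diagram, stored in
# hundredths of a second (so 18 | 3 becomes 183) to avoid float comparisons.
stems = {
    14: "12", 15: "24", 16: "036", 17: "157",
    18: "34579", 19: "246789", 20: "01345789", 21: "7",
}
times = [10 * stem + int(leaf) for stem, leaves in stems.items() for leaf in leaves]

hypothesised_median = 180  # 1.80 seconds

above = sum(t > hypothesised_median for t in times)
below = sum(t < hypothesised_median for t in times)
n = above + below  # any observation equal to the median would be discarded

# One-tailed p-value under H0: number above the median ~ B(n, 0.5),
# and H1 (median > 1.80) is supported by a large count above.
p_value = sum(comb(n, k) for k in range(above, n + 1)) / 2 ** n

print(f"n = {n}, above = {above}, below = {below}, p-value = {p_value:.4f}")
```

The p-value is compared with 0.05 to decide whether to reject the null hypothesis that the median is 1.80 seconds.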
Question 3
3 From the records of Mulcaster United Football Club the following distribution was suggested as a probability model for future matches. \(X\) and \(Y\) denoted the numbers of goals scored by the home team and the away team respectively.
\begin{tabular}{c|cccc}
\(Y \backslash X\) & 0 & 1 & 2 & 3 \\ \hline
0 & 0.11 & 0.04 & 0.06 & 0.08 \\
1 & 0.08 & 0.05 & 0.12 & 0.05 \\
2 & 0.05 & 0.08 & 0.07 & 0.03 \\
3 & 0.03 & 0.06 & 0.07 & 0.02 \\
\end{tabular}
Use the model to find
  1. \(\mathrm { E } ( X )\),
  2. the probability that the away team wins a randomly chosen match,
  3. the probability that the away team wins a randomly chosen match, given that the home team scores.
One of the directors, an amateur statistician, finds that \(\operatorname { Cov } ( X , Y ) = 0.007\). He states that, as this value is very close to zero, \(X\) and \(Y\) may be considered to be independent.
  4. Comment on the director's statement.
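The quantities asked for in this question can be checked with a short script. This is only a sketch: it assumes the columns of the table are the home goals \(X\) and the rows are the away goals \(Y\), and every name in it is illustrative.

```python
from fractions import Fraction

# Assumed orientation: each row is a value of Y (away goals) and each
# column is a value of X (home goals), as read from the table header.
table = [
    ["0.11", "0.04", "0.06", "0.08"],
    ["0.08", "0.05", "0.12", "0.05"],
    ["0.05", "0.08", "0.07", "0.03"],
    ["0.03", "0.06", "0.07", "0.02"],
]
joint = {(x, y): Fraction(p) for y, row in enumerate(table) for x, p in enumerate(row)}
assert sum(joint.values()) == 1  # sanity check: probabilities sum to 1

e_x = sum(x * p for (x, y), p in joint.items())
e_y = sum(y * p for (x, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - e_x * e_y

# The away team wins when Y > X.
p_away_wins = sum(p for (x, y), p in joint.items() if y > x)
p_home_scores = sum(p for (x, y), p in joint.items() if x >= 1)
p_away_wins_given_home_scores = (
    sum(p for (x, y), p in joint.items() if y > x and x >= 1) / p_home_scores
)

# Independence requires P(X = x, Y = y) = P(X = x) P(Y = y) in every cell.
p_x = {x: sum(joint[(x, y)] for y in range(4)) for x in range(4)}
p_y = {y: sum(joint[(x, y)] for x in range(4)) for y in range(4)}
independent = all(joint[(x, y)] == p_x[x] * p_y[y] for (x, y) in joint)

print("E(X) =", e_x)
print("P(away wins) =", p_away_wins)
print("P(away wins | home scores) =", p_away_wins_given_home_scores)
print("Cov(X, Y) =", cov)
print("Factorises in every cell:", independent)
```

Exact fractions are used so that the covariance can be compared with the director's value of 0.007 without rounding error.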
Question 4
4 William takes a bus regularly on the same journey, sometimes in the morning and sometimes in the afternoon. He wishes to compare morning and afternoon journey times. He records the journey times on 7 randomly chosen mornings and 8 randomly chosen afternoons. The results, each correct to the nearest minute, are as follows, where M denotes a morning time and A denotes an afternoon time.
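\begin{tabular}{lccccccccccccccc}
Journey & M & A & A & M & M & M & M & M & M & A & A & A & A & A & A \\
Time (minutes) & 19 & 20 & 22 & 24 & 25 & 26 & 28 & 30 & 31 & 33 & 35 & 37 & 38 & 39 & 42 \\
\end{tabular}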
William wishes to test for a difference between the average times of morning and afternoon journeys.
  1. Given that \(s _ { M } ^ { 2 } = 16.5\) and \(s _ { A } ^ { 2 } = 64.5\), with the usual notation, explain why a \(t\)-test is not appropriate in this case.
  2. William chooses a non-parametric test at the \(5 \%\) significance level. Carry out the test, stating the rejection region.
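The rank sums for part 2 can be checked with the sketch below, which assumes the Wilcoxon rank-sum test is the non-parametric test William has in mind. In the examination the rejection region would come from statistical tables. The exact enumeration here is a substitute for the table lookup and is not part of the question.

```python
from itertools import combinations

labels = "MAAMMMMMMAAAAAA"
times = [19, 20, 22, 24, 25, 26, 28, 30, 31, 33, 35, 37, 38, 39, 42]

# The times are already in ascending order with no ties,
# so each observation's rank is simply its position.
assert times == sorted(times) and len(set(times)) == len(times)
ranks = range(1, len(times) + 1)

m = labels.count("M")  # smaller sample (mornings)
n = labels.count("A")  # larger sample (afternoons)

r_m = sum(rank for rank, label in zip(ranks, labels) if label == "M")
w = min(r_m, m * (m + n + 1) - r_m)  # usual two-tailed test statistic

# Exact null distribution: every choice of m ranks out of m + n is equally likely.
all_sums = [sum(c) for c in combinations(ranks, m)]
p_lower = sum(s <= w for s in all_sums) / len(all_sums)
p_two_sided = min(1.0, 2 * p_lower)

print(f"m = {m}, n = {n}, R_m = {r_m}, W = {w}, two-sided p = {p_two_sided:.4f}")
```

Comparing the two-sided p-value with 0.05 should lead to the same decision as comparing \(W\) with the tabulated critical value.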
Question 5
5 The discrete random variable \(X\) has moment generating function \(\frac { 1 } { 4 } \mathrm { e } ^ { 2 t } + a \mathrm { e } ^ { 3 t } + b \mathrm { e } ^ { 4 t }\), where \(a\) and \(b\) are constants. It is given that \(\mathrm { E } ( X ) = 3 \frac { 3 } { 8 }\).
  1. Show that \(a = \frac { 1 } { 8 }\), and find the value of \(b\).
  2. Find \(\operatorname { Var } ( X )\).
  3. State the possible values of \(X\).
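As a numerical check on parts 1 and 2, the sketch below reads the probabilities off the moment generating function: the coefficient of \(\mathrm{e}^{kt}\) is \(\mathrm{P}(X = k)\). The variable names are illustrative, and exact fractions are used throughout.

```python
from fractions import Fraction

p2 = Fraction(1, 4)    # coefficient of e^{2t}, i.e. P(X = 2)
mean = Fraction(27, 8)  # E(X) = 3 3/8

# Two conditions on a = P(X = 3) and b = P(X = 4):
#   probabilities sum to 1:  a + b = 1 - p2
#   mean:                    2*p2 + 3a + 4b = mean
# Writing 3a + 4b = 3(a + b) + b gives b directly.
total_ab = 1 - p2
b = (mean - 2 * p2) - 3 * total_ab
a = total_ab - b

pmf = {2: p2, 3: a, 4: b}
e_x2 = sum(x * x * p for x, p in pmf.items())
variance = e_x2 - mean ** 2

print("a =", a, " b =", b, " Var(X) =", variance)
```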
Question 6
6 The continuous random variable \(Y\) has cumulative distribution function given by $$\mathrm { F } ( y ) = \begin{cases} 0 & y < a , \\ 1 - \frac { a ^ { 3 } } { y ^ { 3 } } & y \geqslant a , \end{cases}$$ where \(a\) is a positive constant. A random sample of 3 observations, \(Y _ { 1 } , Y _ { 2 } , Y _ { 3 }\), is taken, and the smallest is denoted by \(S\).
  1. Show that \(\mathrm { P } ( S > s ) = \left( \frac { a } { s } \right) ^ { 9 }\) and hence obtain the probability density function of \(S\).
  2. Show that \(S\) is not an unbiased estimator of \(a\), and construct an unbiased estimator, \(T _ { 1 }\), based on \(S\).
It is given that \(T _ { 2 }\), where \(T _ { 2 } = \frac { 2 } { 9 } \left( Y _ { 1 } + Y _ { 2 } + Y _ { 3 } \right)\), is another unbiased estimator of \(a\).
  3. Given that \(\operatorname { Var } ( Y ) = \frac { 3 } { 4 } a ^ { 2 }\) and \(\operatorname { Var } ( S ) = \frac { 9 } { 448 } a ^ { 2 }\), determine which of \(T _ { 1 }\) and \(T _ { 2 }\) is the more efficient estimator.
  4. The values of \(Y\) for a particular sample are 12.8, 4.5 and 7.0. Find the values of \(T _ { 1 }\) and \(T _ { 2 }\) for this sample, and give a reason, unrelated to efficiency, why \(T _ { 1 }\) gives a better estimate of \(a\) than \(T _ { 2 }\) in this case.
Question 7
7 The probability generating function of the random variable \(X\) is given by $$\mathrm { G } ( t ) = \frac { 1 + a t } { 4 - t }$$ where \(a\) is a constant.
  1. Find the value of \(a\).
  2. Find \(\mathrm { P } ( X = 3 )\).
The sum of 3 independent observations of \(X\) is denoted by \(Y\). The probability generating function of \(Y\) is denoted by \(\mathrm { H } ( t )\).
  3. Use \(\mathrm { H } ( t )\) to find \(\mathrm { E } ( Y )\).
  4. By considering \(\mathrm { H } ( - 1 ) + \mathrm { H } ( 1 )\), show that \(\mathrm { P } ( Y \text{ is an even number} ) = \frac { 62 } { 125 }\).
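The results in Question 7 can be checked with exact arithmetic. The sketch below relies on three standard facts rather than anything stated in the paper: \(\mathrm{G}(1) = 1\), the expansion \(\frac{1}{4 - t} = \sum_{k \geqslant 0} \frac{t^{k}}{4^{k+1}}\), and \(\mathrm{H}(t) = \mathrm{G}(t)^{3}\) for a sum of three independent observations. The function names are illustrative.

```python
from fractions import Fraction

# G(1) = 1 gives (1 + a) / (4 - 1) = 1, so a = (4 - 1) - 1.
a = Fraction(4 - 1) - 1

def G(t):
    """Probability generating function of X."""
    return (1 + a * t) / (4 - t)

def dG(t):
    """Derivative of G by the quotient rule: (4a + 1) / (4 - t)^2."""
    return (4 * a + 1) / Fraction(4 - t) ** 2

def geometric_coeff(k):
    """Coefficient of t^k in 1 / (4 - t), taken as zero for k < 0."""
    return Fraction(1, 4 ** (k + 1)) if k >= 0 else Fraction(0)

def p_x(k):
    """P(X = k): coefficient of t^k in (1 + a t) / (4 - t)."""
    return geometric_coeff(k) + a * geometric_coeff(k - 1)

def H(t):
    """PGF of Y, the sum of 3 independent observations of X."""
    return G(t) ** 3

# E(Y) = H'(1) = 3 G(1)^2 G'(1)
e_y = 3 * G(1) ** 2 * dG(1)

# H(1) + H(-1) counts each even-valued term twice and cancels each odd one.
p_even = (H(1) + H(-1)) / 2

print("a =", a)
print("P(X = 3) =", p_x(3))
print("E(Y) =", e_y)
print("P(Y even) =", p_even)
```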