OCR S3 (Statistics 3) 2013 January

Question 1
1 The independent random variables \(X\) and \(Y\) have the distributions \(\mathrm { N } \left( 10 , \sigma ^ { 2 } \right)\) and \(\operatorname { Po } ( 2 )\) respectively. The random variable \(S\) is given by \(S = 5 X - 2 Y + c\), where \(c\) is a constant.
It is given that \(\mathrm { E } ( S ) = \operatorname { Var } ( S ) = 408\).
  1. Find the value of \(c\) and show that \(\sigma ^ { 2 } = 16\).
  2. Find \(\mathrm { P } ( X \geqslant \mathrm { E } ( Y ) )\).
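As a rough numerical check on Question 1 (a sketch, not a model answer), the linearity of expectation and the variance rule for independent variables can be applied directly. The variable names are illustrative only.

```python
from statistics import NormalDist

# X ~ N(10, sigma^2); Y ~ Po(2), so E(Y) = Var(Y) = 2.
mean_x, mean_y, var_y = 10, 2, 2
target = 408  # given: E(S) = Var(S) = 408

# E(S) = 5*E(X) - 2*E(Y) + c
c = target - (5 * mean_x - 2 * mean_y)

# Var(S) = 25*Var(X) + 4*Var(Y); the constant c adds no variance.
sigma_sq = (target - 4 * var_y) / 25

# P(X >= E(Y)) = P(X >= 2) for X ~ N(10, sigma_sq)
p = 1 - NormalDist(mu=mean_x, sigma=sigma_sq ** 0.5).cdf(mean_y)

print(c, sigma_sq, round(p, 4))
```

Under these rules the sketch gives \(c = 362\), \(\sigma ^ { 2 } = 16\) and a probability of about 0.977.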
Question 2
2 A new running track has been developed and part of the testing procedure involves 7 randomly chosen athletes. They each run 100 m on both the old and new tracks.
The results are as follows.
| Athlete | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Time on old track (s) | 12.2 | 10.3 | 11.5 | 13.0 | 11.8 | 11.7 | 11.9 |
| Time on new track (s) | 11.1 | 10.5 | 11.0 | 12.6 | 11.0 | 10.9 | 12.0 |
The population mean times on the old and new tracks are denoted by \(\mu _ { \mathrm { O } }\) seconds and \(\mu _ { \mathrm { N } }\) seconds respectively. Stating any necessary assumption, carry out a suitable \(t\)-test of the null hypothesis \(\mu _ { \mathrm { O } } - \mu _ { \mathrm { N } } = 0\) against the alternative hypothesis \(\mu _ { \mathrm { O } } - \mu _ { \mathrm { N } } > 0\). Use a \(2 \frac { 1 } { 2 } \%\) significance level.
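Since each athlete runs on both tracks, a paired-sample \(t\)-test on the differences (old minus new) is one natural reading of "a suitable \(t\)-test". The sketch below assumes the differences are normally distributed and takes the critical value 2.447 (upper \(2 \frac { 1 } { 2 } \%\) point of \(t\) with 6 degrees of freedom) from standard tables. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

old = [12.2, 10.3, 11.5, 13.0, 11.8, 11.7, 11.9]
new = [11.1, 10.5, 11.0, 12.6, 11.0, 10.9, 12.0]

# Paired differences, old minus new, one per athlete.
diffs = [a - b for a, b in zip(old, new)]
n = len(diffs)

d_bar = mean(diffs)   # sample mean difference
s_d = stdev(diffs)    # sample standard deviation (divisor n - 1)

# Test statistic under H0: mean difference = 0
t_stat = d_bar / (s_d / sqrt(n))

t_crit = 2.447        # assumption: tabulated value for 6 df, one-tailed 2.5%
reject_h0 = t_stat > t_crit

print(round(d_bar, 3), round(t_stat, 3), reject_h0)
```

On these figures the statistic comes out near 2.59, which exceeds 2.447, so the sketch rejects the null hypothesis at this level.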
Question 3
3 Two reading schemes, \(A\) and \(B\), are compared by using them with a random sample of 9 five-year-old children. The children are divided into two groups, 5 allotted to scheme \(A\) and 4 to scheme \(B\), and the schemes are taught under similar conditions.
After one year the children are given the same test and their scores, \(x _ { A }\) and \(x _ { B }\), are summarised below. With the usual notation, $$\begin{aligned} & n _ { A } = 5 , \quad \bar { x } _ { A } = 52.0 , \quad \sum \left( x _ { A } - \bar { x } _ { A } \right) ^ { 2 } = 248 , \\
& n _ { B } = 4 , \quad \bar { x } _ { B } = 56.5 , \quad \sum \left( x _ { B } - \bar { x } _ { B } \right) ^ { 2 } = 381 . \end{aligned}$$ It may be assumed that scores have normal distributions.
  1. Calculate an \(80 \%\) confidence interval for the difference in population mean scores for the two methods.
  2. State a further assumption required for the validity of the interval.
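A pooled two-sample \(t\) interval is one way to approach Question 3. The sketch below assumes the two populations share a common variance and takes 1.415 (upper \(10 \%\) point of \(t\) with 7 degrees of freedom) from standard tables. The difference is taken as \(B\) minus \(A\), and the names are illustrative.

```python
from math import sqrt

n_a, mean_a, ss_a = 5, 52.0, 248   # ss = sum of squared deviations
n_b, mean_b, ss_b = 4, 56.5, 381

df = n_a + n_b - 2
pooled_var = (ss_a + ss_b) / df    # pooled estimate of the common variance

# Standard error of the difference in sample means
se = sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_crit = 1.415                     # assumption: tabulated value, 7 df, 80% two-sided
diff = mean_b - mean_a
lower, upper = diff - t_crit * se, diff + t_crit * se

print(round(pooled_var, 3), round(lower, 2), round(upper, 2))
```

With these inputs the interval is roughly \(( -4.50 , 13.50 )\).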
Question 4
4 The continuous random variable \(X\) has probability density function given by $$f ( x ) = \begin{cases} \frac { 3 } { 2 } \sqrt { x } & 0 < x \leqslant 1 \\
0 & \text { otherwise } \end{cases}$$ The random variable \(Y\) is given by \(Y = \frac { 1 } { \sqrt { X } }\).
  1. Find the (cumulative) distribution function of \(Y\), and hence show that its probability density function is given by $$\mathrm { g } ( y ) = \frac { 3 } { y ^ { 4 } }$$ for a set of values of \(y\) to be stated.
  2. Find the value of \(\mathrm { E } \left( Y ^ { 2 } \right)\).
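One possible route through Question 4, sketched with the usual distribution-function method. Here \(\mathrm { F }\) and \(\mathrm { G }\) are introduced for illustration as the distribution functions of \(X\) and \(Y\).
$$\begin{aligned} \mathrm { F } ( x ) & = \int _ { 0 } ^ { x } \tfrac { 3 } { 2 } \sqrt { t } \, \mathrm { d } t = x ^ { 3 / 2 } , \quad 0 < x \leqslant 1 \\ \mathrm { G } ( y ) & = \mathrm { P } ( Y \leqslant y ) = \mathrm { P } \left( X \geqslant \tfrac { 1 } { y ^ { 2 } } \right) = 1 - y ^ { - 3 } , \quad y \geqslant 1 \\ \mathrm { g } ( y ) & = \mathrm { G } ^ { \prime } ( y ) = \frac { 3 } { y ^ { 4 } } , \quad y \geqslant 1 \\ \mathrm { E } \left( Y ^ { 2 } \right) & = \int _ { 1 } ^ { \infty } y ^ { 2 } \cdot \frac { 3 } { y ^ { 4 } } \, \mathrm { d } y = \left[ - \frac { 3 } { y } \right] _ { 1 } ^ { \infty } = 3 \end{aligned}$$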
Question 5
5 A constitutional change was proposed for a Golf Club with a large membership. This was to be voted on at the Annual General Meeting. A month before this meeting the secretary asked a random sample of 50 members for their opinions. Out of the 50 members \(70 \%\) said they approved.
  1. Calculate an approximate \(90 \%\) confidence interval for the proportion \(p\) of all members who would approve the proposal.
  2. Explain what is meant by a \(90 \%\) confidence interval in this context.
  3. Nearer the date of the meeting the secretary asked a random sample of \(n\) members, and, as before, \(70 \%\) said they approved. This time the secretary calculated an approximate \(99 \%\) confidence interval for \(p\). It is given that the confidence interval does not include 0.85 . Find the smallest possible value of \(n\).
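The interval in Question 5 can be sketched with the usual normal approximation \(\hat { p } \pm z \sqrt { \hat { p } ( 1 - \hat { p } ) / n }\). The code below uses that formula for part 1 and then searches for the first \(n\) whose \(99 \%\) upper limit falls below 0.85. The search loop is an illustrative choice, not the only way to solve the inequality.

```python
from math import sqrt
from statistics import NormalDist

p_hat = 0.7

def half_width(z, n):
    """Half-width of the approximate interval for a proportion."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

# Part 1: 90% interval with n = 50
z90 = NormalDist().inv_cdf(0.95)
ci_90 = (p_hat - half_width(z90, 50), p_hat + half_width(z90, 50))

# Part 3: smallest n whose 99% interval excludes 0.85.
# The lower limit is below 0.7, so only the upper limit matters.
z99 = NormalDist().inv_cdf(0.995)
n_min = 1
while p_hat + half_width(z99, n_min) >= 0.85:
    n_min += 1

print(tuple(round(v, 3) for v in ci_90), n_min)
```

This gives roughly \(( 0.593 , 0.807 )\) for part 1 and \(n = 62\) from the inequality. The sketch treats \(70 \%\) as exact for every \(n\); insisting that \(0.7 n\) is a whole number would raise the answer to 70.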
Question 6
6 A large population of plants consists of five species \(A , B , C , D\) and \(E\) in the proportions \(p _ { A } , p _ { B } , p _ { C } , p _ { D }\) and \(p _ { E }\) respectively. A random sample of 120 plants consisted of \(23,14,24,27\) and 32 of \(A , B , C , D\) and \(E\) respectively. Carry out a test at the \(10 \%\) significance level of the null hypothesis that the proportions are \(p _ { \mathrm { A } } = p _ { \mathrm { B } } = 0.15 , p _ { \mathrm { C } } = p _ { \mathrm { D } } = 0.25\) and \(p _ { \mathrm { E } } = 0.2\).
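Question 6 fits a chi-squared goodness-of-fit test. The sketch below computes \(\sum ( O - E ) ^ { 2 } / E\) and compares it with 7.779, the upper \(10 \%\) point of \(\chi ^ { 2 }\) with 4 degrees of freedom, taken here from standard tables.

```python
observed = [23, 14, 24, 27, 32]               # species A, B, C, D, E
h0_props = [0.15, 0.15, 0.25, 0.25, 0.20]     # proportions under H0

total = sum(observed)
expected = [total * p for p in h0_props]      # 18, 18, 30, 30, 24

# Pearson chi-squared statistic
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

chi_crit = 7.779   # assumption: tabulated value, 4 df, 10% level
reject_h0 = chi_sq > chi_crit

print(round(chi_sq, 3), reject_h0)
```

The statistic is about 6.44, below the critical value, so on this sketch there is insufficient evidence to reject the stated proportions.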
Question 7
7 The random variable \(X\) has distribution \(\mathrm { N } ( \mu , 1 )\). A random sample of 4 observations of \(X\) is taken. The sample mean is denoted by \(\bar { X }\).
  1. Find the value of the constant \(a\) for which \(( \bar { X } - a , \bar { X } + a )\) is a \(98 \%\) confidence interval for \(\mu\).
The independent random variable \(Y\) has distribution \(\mathrm { N } ( \mu , 9 )\). A random sample of 16 observations of \(Y\) is taken. The sample mean is denoted by \(\bar { Y }\).
  2. Write down the distribution of \(\bar { X } - \bar { Y }\).
  3. A \(90 \%\) confidence interval for \(\mu\) based on \(\bar { Y }\) is given by \(( \bar { Y } - 1.234 , \bar { Y } + 1.234 )\). Find the probability that this interval does not overlap with the interval in part 1.
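For Question 7, two intervals centred at \(\bar { X }\) and \(\bar { Y }\) fail to overlap exactly when \(| \bar { X } - \bar { Y } |\) exceeds the sum of their half-widths. The sketch below uses that observation together with \(\bar { X } - \bar { Y } \sim \mathrm { N } \left( 0 , \frac { 1 } { 4 } + \frac { 9 } { 16 } \right)\). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# Part 1: half-width of a 98% interval; sd of the mean is sqrt(1/4)
a = std_normal.inv_cdf(0.99) * sqrt(1 / 4)

# Part 2: X-bar - Y-bar has mean 0 and variance 1/4 + 9/16 = 13/16
var_diff = 1 / 4 + 9 / 16

# Part 3: no overlap when |X-bar - Y-bar| > a + 1.234
threshold = a + 1.234
diff_dist = NormalDist(mu=0, sigma=sqrt(var_diff))
p_no_overlap = 2 * (1 - diff_dist.cdf(threshold))

print(round(a, 3), var_diff, round(p_no_overlap, 4))
```

This gives \(a \approx 1.163\) and a non-overlap probability of roughly 0.008.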
Question 8
8 After contracting a particular disease, patients from a hospital are advised to have their blood tested monthly for a year. In order to test whether patients comply with this advice the hospital management commissioned a survey of 100 patients. A hospital statistician selected the patients randomly from records and asked the patients whether or not they had complied with the advice. The results classified by gender are as follows.
| Comply | Gender: Female | Gender: Male |
|---|---|---|
| Yes | 34 | 30 |
| No | 11 | 25 |
  1. Test at the \(5 \%\) significance level whether compliance with the advice is independent of gender.
  2. A manager believed that a greater proportion of female patients than male patients comply with the advice. Carry out an appropriate test of proportions at the \(10 \%\) significance level.
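Both parts of Question 8 can be sketched from the same \(2 \times 2\) table. The code below assumes Yates' continuity correction for the chi-squared test (a common convention for one degree of freedom, not stated in the question) and a pooled estimate of the proportion for the one-tailed \(z\)-test. The critical values 3.841 and 1.282 are taken from standard tables.

```python
from math import sqrt

# Rows: comply yes / no. Columns: female, male.
obs = [[34, 30],
       [11, 25]]

row_tot = [sum(r) for r in obs]              # [64, 36]
col_tot = [sum(c) for c in zip(*obs)]        # [45, 55]
n = sum(row_tot)

# Part 1: chi-squared with Yates' correction (assumed convention)
chi_sq_yates = 0.0
for i in range(2):
    for j in range(2):
        e = row_tot[i] * col_tot[j] / n      # expected frequency
        chi_sq_yates += (abs(obs[i][j] - e) - 0.5) ** 2 / e
chi_crit = 3.841                             # tabulated, 1 df, 5% level

# Part 2: one-tailed test of H0: p_F = p_M against H1: p_F > p_M
p_f = obs[0][0] / col_tot[0]
p_m = obs[0][1] / col_tot[1]
p_pool = row_tot[0] / n                      # pooled proportion under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / col_tot[0] + 1 / col_tot[1]))
z_stat = (p_f - p_m) / se
z_crit = 1.282                               # tabulated, one-tailed 10%

print(round(chi_sq_yates, 3), round(z_stat, 3))
```

The corrected statistic is about 3.87, just above 3.841, and \(z\) is about 2.18, above 1.282, so under these assumptions both tests reject their null hypotheses.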