OCR S2 (Statistics 2) 2007 June

Question 1
1 A random sample of observations of a random variable \(X\) is summarised by $$n = 100 , \quad \Sigma x = 4830.0 , \quad \Sigma x ^ { 2 } = 249\,509.16 .$$
  1. Obtain unbiased estimates of the mean and variance of \(X\).
  2. The sample mean of 100 observations of \(X\) is denoted by \(\bar { X }\). Explain whether you would need any further information about the distribution of \(X\) in order to estimate \(\mathrm { P } ( \bar { X } > 60 )\). [You should not attempt to carry out the calculation.]
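As a numerical check on part 1, the usual unbiased estimators \(\hat{\mu} = \Sigma x / n\) and \(s^2 = \left( \Sigma x^2 - (\Sigma x)^2 / n \right) / (n - 1)\) can be evaluated directly from the summary statistics. The following is a minimal Python sketch, not part of the original paper; the function name `unbiased_estimates` is an illustrative assumption.

```python
def unbiased_estimates(n, sum_x, sum_x2):
    """Return unbiased estimates of the mean and variance from summary sums."""
    mean = sum_x / n
    # Divide by n - 1 (not n) so the variance estimate is unbiased.
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, variance


mean_hat, var_hat = unbiased_estimates(100, 4830.0, 249509.16)
print(f"mean estimate     = {mean_hat:.2f}")
print(f"variance estimate = {var_hat:.2f}")
```

With the figures given, the estimates come out at about 48.3 for the mean and about 163.84 for the variance.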
Question 2
2 It is given that on average one car in forty is yellow. Using a suitable approximation, find the probability that, in a random sample of 130 cars, exactly 4 are yellow.
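A sketch of one way to check this numerically, assuming the intended approximation is Poisson with mean \(np = 130 / 40 = 3.25\) (large \(n\), small \(p\)). The exact binomial value is computed alongside for comparison; the variable names are illustrative only.

```python
from math import comb, exp, factorial

n, p, k = 130, 1 / 40, 4
lam = n * p  # Poisson mean used for the approximation, 3.25

# Poisson approximation: P(X = k) = e^(-lam) * lam^k / k!
poisson_prob = exp(-lam) * lam ** k / factorial(k)

# Exact binomial probability, for comparison only.
binomial_prob = comb(n, k) * p ** k * (1 - p) ** (n - k)

print(f"Poisson approximation: {poisson_prob:.4f}")
print(f"Exact binomial:        {binomial_prob:.4f}")
```

Under these assumptions the Poisson value is close to 0.180.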
Question 3
3 The proportion of adults in a large village who support a proposal to build a bypass is denoted by \(p\). A random sample of size 20 is selected from the adults in the village, and the members of the sample are asked whether or not they support the proposal.
  1. Name the probability distribution that would be used in a hypothesis test for the value of \(p\).
  2. State the properties of a random sample that explain why the distribution in part (i) is likely to be a good model.
Question 4
4 \(X\) is a continuous random variable.
  1. State two conditions needed for \(X\) to be well modelled by a normal distribution.
  2. It is given that \(X \sim \mathrm { N } \left( 50.0,8 ^ { 2 } \right)\). The mean of 20 random observations of \(X\) is denoted by \(\bar { X }\). Find \(\mathrm { P } ( \bar { X } > 47.0 )\).
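For Question 4 part 2, the sample mean of 20 observations from \(\mathrm{N}(50.0, 8^2)\) is itself normal with mean 50.0 and variance \(8^2 / 20\). The sketch below evaluates \(\mathrm{P}(\bar{X} > 47.0)\) with Python's standard-library `statistics.NormalDist`; it is offered as a check rather than a model answer.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50.0, 8.0, 20

# Distribution of the sample mean: N(mu, sigma^2 / n).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

prob = 1 - xbar_dist.cdf(47.0)
print(f"P(Xbar > 47.0) = {prob:.4f}")
```

The result is roughly 0.953, corresponding to a standardised value of about \(-1.677\).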
Question 5
5 The number of system failures per month in a large network is a random variable with the distribution \(\operatorname { Po } ( \lambda )\). A significance test of the null hypothesis \(\mathrm { H } _ { 0 } : \lambda = 2.5\) is carried out by counting \(R\), the number of system failures in a period of 6 months. The result of the test is that \(\mathrm { H } _ { 0 }\) is rejected if \(R > 23\) but is not rejected if \(R \leqslant 23\).
  1. State the alternative hypothesis.
  2. Find the significance level of the test.
  3. Given that \(\mathrm { P } ( R > 23 ) < 0.1\), use tables to find the largest possible actual value of \(\lambda\). You should show the values of any relevant probabilities.
Question 6
6 In a rearrangement code, the letters of a message are rearranged so that the frequency with which any particular letter appears is the same as in the original message. In ordinary German the letter \(e\) appears \(19 \%\) of the time. A certain encoded message of 20 letters contains one letter \(e\).
  1. Using an exact binomial distribution, test at the \(10 \%\) significance level whether there is evidence that the proportion of the letter \(e\) in the language from which this message is a sample is less than in German, i.e., less than \(19 \%\).
  2. Give a reason why a binomial distribution might not be an appropriate model in this context.
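For part 1, a one-tailed test of \(\mathrm{H}_0 : p = 0.19\) against \(\mathrm{H}_1 : p < 0.19\) compares \(\mathrm{P}(X \leqslant 1)\) for \(X \sim \mathrm{B}(20, 0.19)\) with 10%. The sketch below computes that tail probability; `binom_cdf` is an illustrative helper, not taken from the paper.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


p_value = binom_cdf(1, 20, 0.19)
reject_h0 = p_value < 0.10  # one-tailed test at the 10% level

print(f"P(X <= 1) = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The tail probability is about 0.084, which is below 0.10, so on this calculation \(\mathrm{H}_0\) would be rejected.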
Question 7
7 Two continuous random variables \(S\) and \(T\) have probability density functions as follows. $$\begin{array} { l l } S : & f ( x ) = \begin{cases} \frac { 1 } { 2 } & - 1 \leqslant x \leqslant 1 \\ 0 & \text { otherwise } \end{cases} \\ T : & g ( x ) = \begin{cases} \frac { 3 } { 2 } x ^ { 2 } & - 1 \leqslant x \leqslant 1 \\ 0 & \text { otherwise } \end{cases} \end{array}$$
  1. Sketch on the same axes the graphs of \(y = \mathrm { f } ( x )\) and \(y = \mathrm { g } ( x )\). [You should not use graph paper or attempt to plot points exactly.]
  2. Explain in everyday terms the difference between the two random variables.
  3. Find the value of \(t\) such that \(\mathrm { P } ( T > t ) = 0.2\).
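For part 3, integrating the density of \(T\) gives \(\mathrm{P}(T > t) = \int_t^1 \tfrac{3}{2} x^2 \, dx = \tfrac{1}{2}(1 - t^3)\) for \(-1 \leqslant t \leqslant 1\), so the condition reduces to \(t^3 = 0.6\). A short sketch that solves this and checks the answer against the integral (the function name `upper_tail` is an assumption made for illustration):

```python
def upper_tail(t):
    """P(T > t) for the density g(x) = 1.5 * x^2 on [-1, 1]."""
    return (1 - t ** 3) / 2


target = 0.2
# Solve (1 - t^3) / 2 = target for t; here 1 - 2 * target is positive.
t = (1 - 2 * target) ** (1 / 3)

print(f"t = {t:.4f}")
print(f"check: P(T > t) = {upper_tail(t):.4f}")
```

This yields \(t \approx 0.843\).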
Question 8
8 A random variable \(Y\) is normally distributed with mean \(\mu\) and variance 12.25. Two statisticians carry out significance tests of the hypotheses \(\mathrm { H } _ { 0 } : \mu = 63.0 , \mathrm { H } _ { 1 } : \mu > 63.0\).
  1. Statistician \(A\) uses the mean \(\bar { Y }\) of a sample of size 23, and the critical region for his test is \(\bar { Y } > 64.20\). Find the significance level for \(A\)'s test.
  2. Statistician \(B\) uses the mean of a sample of size 50 and a significance level of \(5 \%\).
    (a) Find the critical region for \(B\)'s test.
    (b) Given that \(\mu = 65.0\), find the probability that \(B\)'s test results in a Type II error.
  3. Given that, when \(\mu = 65.0\), the probability that \(A\)'s test results in a Type II error is 0.1365, state with a reason which test is better.
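The calculations in Question 8 all rest on \(\bar{Y} \sim \mathrm{N}(\mu, 12.25 / n)\). The sketch below works through them with `statistics.NormalDist`; variable names such as `crit_b` and `beta_b` are illustrative assumptions, and exact table-based answers may differ slightly in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

sigma = sqrt(12.25)        # population standard deviation, 3.5
mu0, mu1 = 63.0, 65.0      # mean under H0 and the alternative considered
std_normal = NormalDist()  # standard normal N(0, 1)

# Part 1: A's significance level, P(Ybar > 64.20 | mu = 63.0), n = 23.
se_a = sigma / sqrt(23)
alpha_a = 1 - std_normal.cdf((64.20 - mu0) / se_a)

# Part 2(a): B's critical value for a 5% one-tailed test, n = 50.
se_b = sigma / sqrt(50)
crit_b = mu0 + std_normal.inv_cdf(0.95) * se_b

# Part 2(b): B's Type II error, P(Ybar <= crit_b | mu = 65.0).
beta_b = NormalDist(mu1, se_b).cdf(crit_b)

print(f"A: significance level = {alpha_a:.4f}")
print(f"B: reject H0 when Ybar > {crit_b:.2f}")
print(f"B: P(Type II error)   = {beta_b:.4f}")
```

On these figures both tests have a significance level near 5%, while B's Type II error probability (about 0.008) is far smaller than the 0.1365 quoted for A.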
Question 9
9
  1. The random variable \(G\) has the distribution \(\mathrm { B } ( n , 0.75 )\). Find the set of values of \(n\) for which the distribution of \(G\) can be well approximated by a normal distribution.
  2. The random variable \(H\) has the distribution \(\mathrm { B } ( n , p )\). It is given that, using a normal approximation, \(\mathrm { P } ( H \geqslant 71 ) = 0.0401\) and \(\mathrm { P } ( H \leqslant 46 ) = 0.0122\).
    1. Find the mean and standard deviation of the approximating normal distribution.
    2. Hence find the values of \(n\) and \(p\).
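For the second part of Question 9, one plausible reading is that the normal approximation is applied with a continuity correction, so that \(\mathrm{P}(H \geqslant 71)\) becomes \(\mathrm{P}(Y > 70.5)\) and \(\mathrm{P}(H \leqslant 46)\) becomes \(\mathrm{P}(Y < 46.5)\) for \(Y \sim \mathrm{N}(\mu, \sigma^2)\). Under that assumption, the sketch below solves the two resulting equations for \(\mu\) and \(\sigma\), then uses \(\mu = np\) and \(\sigma^2 = np(1 - p)\) to recover \(n\) and \(p\).

```python
from statistics import NormalDist

std_normal = NormalDist()

# Standardised values matching the two given tail probabilities.
z_upper = std_normal.inv_cdf(1 - 0.0401)  # (70.5 - mu) / sigma
z_lower = std_normal.inv_cdf(0.0122)      # (46.5 - mu) / sigma

# Subtracting the two equations eliminates mu.
sigma = (70.5 - 46.5) / (z_upper - z_lower)
mu = 70.5 - z_upper * sigma

# Binomial parameters from mu = n * p and sigma^2 = n * p * (1 - p).
p = 1 - sigma ** 2 / mu
n = mu / p

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"p = {p:.3f}, n = {n:.1f}")
```

Because the given probabilities are rounded to four decimal places, the computed values are only approximately \(\mu = 60\), \(\sigma = 6\), \(p = 0.4\) and \(n = 150\).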