OCR S2 2007 June — Question 4 6 marks

Exam Board: OCR
Module: S2 (Statistics 2)
Year: 2007
Session: June
Marks: 6
Topic: Normal Distribution
Type: Single tail probability P(X < a) or P(X > a)
Difficulty: Moderate -0.3. This is a straightforward S2 question testing standard knowledge of the normal distribution and the sampling distribution of the mean. Part (i) requires recall of the conditions under which a normal distribution is a suitable model. Part (ii) is a routine application of the sampling distribution of the mean: recognizing that X̄ ~ N(50, 8²/20), standardizing, and using tables. Both parts are textbook exercises with no problem-solving required, making the question slightly easier than average.
Spec: 2.04e Normal distribution: as model N(mu, sigma^2); 5.01a Permutations and combinations: evaluate probabilities; 5.02l Poisson conditions: for modelling; 5.05a Sample mean distribution: central limit theorem; 5.05c Hypothesis test: normal distribution for population mean

  (i) State two conditions needed for \(X\) to be well modelled by a normal distribution.
  (ii) It is given that \(X \sim \mathrm{N}(50.0, 8^2)\). The mean of 20 random observations of \(X\) is denoted by \(\bar{X}\). Find \(\mathrm{P}(\bar{X} > 47.0)\).

5 The number of system failures per month in a large network is a random variable with the distribution \(\operatorname{Po}(\lambda)\). A significance test of the null hypothesis \(\mathrm{H}_0 : \lambda = 2.5\) is carried out by counting \(R\), the number of system failures in a period of 6 months. The result of the test is that \(\mathrm{H}_0\) is rejected if \(R > 23\) but is not rejected if \(R \leqslant 23\).
  (i) State the alternative hypothesis.
  (ii) Find the significance level of the test.
  (iii) Given that \(\mathrm{P}(R > 23) < 0.1\), use tables to find the largest possible actual value of \(\lambda\). You should show the values of any relevant probabilities.

6 In a rearrangement code, the letters of a message are rearranged so that the frequency with which any particular letter appears is the same as in the original message. In ordinary German the letter \(e\) appears \(19\%\) of the time. A certain encoded message of 20 letters contains one letter \(e\).
  (i) Using an exact binomial distribution, test at the \(10\%\) significance level whether there is evidence that the proportion of the letter \(e\) in the language from which this message is a sample is less than in German, i.e., less than \(19\%\).
  (ii) Give a reason why a binomial distribution might not be an appropriate model in this context.

7 Two continuous random variables \(S\) and \(T\) have probability density functions as follows.

$$\begin{array}{ll}
S : & f(x) = \begin{cases} \frac{1}{2} & -1 \leqslant x \leqslant 1 \\ 0 & \text{otherwise} \end{cases} \\
T : & g(x) = \begin{cases} \frac{3}{2} x^2 & -1 \leqslant x \leqslant 1 \\ 0 & \text{otherwise} \end{cases}
\end{array}$$

  (i) Sketch on the same axes the graphs of \(y = \mathrm{f}(x)\) and \(y = \mathrm{g}(x)\). [You should not use graph paper or attempt to plot points exactly.]
  (ii) Explain in everyday terms the difference between the two random variables.
  (iii) Find the value of \(t\) such that \(\mathrm{P}(T > t) = 0.2\).

8 A random variable \(Y\) is normally distributed with mean \(\mu\) and variance 12.25. Two statisticians carry out significance tests of the hypotheses \(\mathrm{H}_0 : \mu = 63.0\), \(\mathrm{H}_1 : \mu > 63.0\).
  (i) Statistician \(A\) uses the mean \(\bar{Y}\) of a sample of size 23, and the critical region for his test is \(\bar{Y} > 64.20\). Find the significance level for \(A\)'s test.
  (ii) Statistician \(B\) uses the mean of a sample of size 50 and a significance level of \(5\%\).
    (a) Find the critical region for \(B\)'s test.
    (b) Given that \(\mu = 65.0\), find the probability that \(B\)'s test results in a Type II error.
  (iii) Given that, when \(\mu = 65.0\), the probability that \(A\)'s test results in a Type II error is 0.1365, state with a reason which test is better.

9 (a) The random variable \(G\) has the distribution \(\mathrm{B}(n, 0.75)\). Find the set of values of \(n\) for which the distribution of \(G\) can be well approximated by a normal distribution.
  (b) The random variable \(H\) has the distribution \(\mathrm{B}(n, p)\). It is given that, using a normal approximation, \(\mathrm{P}(H \geqslant 71) = 0.0401\) and \(\mathrm{P}(H \leqslant 46) = 0.0122\).
    (i) Find the mean and standard deviation of the approximating normal distribution.
    (ii) Hence find the values of \(n\) and \(p\).

Answer | Marks | Guidance
(i) Two of: Distribution symmetric; No substantial truncation; Unimodal/Increasingly unlikely further from \(\mu\), etc. | B1 | One property
| B1 | Another definitely different property. Don't give both marks for just these two. "Bell-shaped": B1 only unless "no truncation"
(ii) Variance \(8^2/20\) | M1 | Standardise, allow cc, don't need r
\(z = \frac{47.0 - 50.0}{\sqrt{8^2/20}} = -1.677\) | A1 | Denominator (8 or \(8^2\) or \(\sqrt{8}\)) ÷ (20 or \(\sqrt{20}\) or \(20^2\))
\(\Phi(1.677) = 0.9532\) | A1 | \(z\)-value, a.r.t. \(-1.68\) or \(1.68\)
| A1 | Answer, a.r.t. 0.953
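
The part (ii) working in the mark scheme can be checked numerically. The sketch below is not part of the OCR mark scheme. It assumes Python 3.8 or later and uses the standard library's `statistics.NormalDist` in place of printed tables. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Values from part (ii): X ~ N(50.0, 8^2), sample size n = 20.
mu, sigma, n = 50.0, 8.0, 20

# Standard deviation of the sample mean: sqrt(8^2 / 20).
standard_error = sigma / sqrt(n)

# Standardise the boundary value 47.0.
z = (47.0 - mu) / standard_error

# P(X-bar > 47.0) = 1 - Phi(z), which equals Phi(-z) by symmetry.
probability = 1 - NormalDist().cdf(z)

print(f"z = {z:.3f}")
print(f"P(X-bar > 47.0) = {probability:.4f}")
```

The printed values should agree with the mark scheme's \(z = -1.677\) and final answer 0.9532, within the stated "a.r.t." tolerances.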

\hfill \mbox{\textit{OCR S2 2007 Q4 [6]}}