OCR S2 2007 June: Question 6 (9 marks)

Exam Board: OCR
Module: S2 (Statistics 2)
Year: 2007
Session: June
Marks: 9
Topic: Hypothesis test of binomial distributions
Type: One-tailed hypothesis test (lower tail, H₁: p < p₀)
Difficulty: Standard +0.3. This is a straightforward one-tailed binomial hypothesis test with a clearly stated alternative hypothesis (p < 0.19). It requires calculating P(X ≤ 1), where X ~ B(20, 0.19), and comparing the result with the 10% significance level. Part (ii) requires recognising that letters in words are not independent. It is slightly easier than average because the small n allows an exact calculation and the question gives explicit guidance on the direction of the test.
Spec: 2.04b Binomial distribution: as model B(n,p); 2.05b Hypothesis test for binomial proportion; 2.05c Significance levels: one-tail and two-tail

6 In a rearrangement code, the letters of a message are rearranged so that the frequency with which any particular letter appears is the same as in the original message. In ordinary German the letter \(e\) appears \(19 \%\) of the time. A certain encoded message of 20 letters contains one letter \(e\).
  (i) Using an exact binomial distribution, test at the \(10 \%\) significance level whether there is evidence that the proportion of the letter \(e\) in the language from which this message is a sample is less than in German, i.e., less than \(19 \%\).
  (ii) Give a reason why a binomial distribution might not be an appropriate model in this context.
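The calculation in the first part can be sketched in Python. This is a minimal illustration rather than an official solution: it assumes the hypotheses H₀: p = 0.19 and H₁: p < 0.19 with X ~ B(20, 0.19), and the helper name `binomial_cdf` and the variable names are hypothetical.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ B(n, p), summed exactly term by term."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k + 1))


# Values taken from the question: 20 letters, one 'e', German proportion 0.19.
n, p0, observed, alpha = 20, 0.19, 1, 0.10

# Assumed hypotheses: H0: p = 0.19 against H1: p < 0.19,
# so the p-value is the lower-tail probability P(X <= 1).
p_value = binomial_cdf(observed, n, p0)
reject_h0 = p_value < alpha

print(f"P(X <= {observed}) = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Under these assumptions the lower-tail probability comes out at about 0.0841, which is below 0.10, so the sketch reports that H₀ is rejected at the 10% level.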

Answer | Marks | Guidance
(i) Horizontal straight line | B1 |
Positive parabola, symmetric about 0 | B1 |
Completely correct, including correct relationship between the two; vertical lines, or horizontal lines outside the range, are not needed, but do not give the last B1 if the horizontal line continues past "±1" | B1 | Total for (i): 3
(ii) \(S\) is equally likely to take any value in range, \(T\) is more likely at extremities | B2 | Correct statement about distributions (not graphs) [Partial statement, or correct description for one only: B1]
(iii) \(\int_{-1}^{1} x^2 \, dx = \left[\frac{x^3}{3}\right]\) | M1 | Integrate f(x) with limits (−1, 1) or (t, 1) [recoverable if \(t\) used later]
\(\frac{1}{2}(1 - t^3) = 0.2\) or \(\frac{1}{2}(t^3 + 1) = 0.8\); \(t^3 = 0.6\); \(t = 0.8434\) | B1 | Correct indefinite integral
| M1 | Equate to 0.2, or 0.8 if [−1, t] used
| M1 | Solve cubic equation to find \(t\)
| A1 | Answer, in range [0.843, 0.844]; total for (iii): 5
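
The final step of part (iii) in the scheme above, going from \(t^3 = 0.6\) to \(t = 0.8434\), can be checked numerically. This is only a quick sketch; the variable names are hypothetical, and it assumes the equation being solved is \(\frac{1}{2}(1 - t^3) = 0.2\), which is the form consistent with \(t^3 = 0.6\).

```python
# Assumed equation: (1/2) * (1 - t**3) = 0.2, which rearranges to t**3 = 0.6.
target = 0.2
t_cubed = 1 - 2 * target

# Real cube root; a fractional power is valid here because t_cubed is positive.
t = t_cubed ** (1 / 3)

print(round(t, 4))
```

The value obtained lies inside the accepted range [0.843, 0.844] quoted in the scheme.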