| Exam Board | OCR MEI |
|---|---|
| Module | S3 (Statistics 3) |
| Year | 2007 |
| Session | June |
| Marks | 18 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Other continuous |
| Difficulty | Standard +0.3. A straightforward two-part question. In part (i) the class probabilities are given, so the chi-squared test needs only mechanical computation of the test statistic and comparison with the critical value. Part (ii) is a standard Wilcoxon signed-rank test for the median. Both parts follow routine procedures, with no conceptual challenges or novel insight required. |
| Spec | 5.06c Fit other distributions: discrete and continuous; 5.07b Sign test: and Wilcoxon signed-rank |

| Length \(x\) (hundreds of metres) | Observed frequency | Probability |
|---|---|---|
| \(0 < x \leqslant 0.5\) | 21 | 0.2653 |
| \(0.5 < x \leqslant 1\) | 24 | 0.1722 |
| \(1 < x \leqslant 2\) | 12 | 0.2025 |
| \(2 < x \leqslant 3\) | 15 | 0.1100 |
| \(3 < x \leqslant 5\) | 13 | 0.1094 |
| \(5 < x \leqslant 10\) | 9 | 0.0874 |
| \(x > 10\) | 6 | 0.0532 |

**Part (i):**

| Length $x$ | $0 < x \leqslant 0.5$ | $0.5 < x \leqslant 1$ | $1 < x \leqslant 2$ | $2 < x \leqslant 3$ | $3 < x \leqslant 5$ | $5 < x \leqslant 10$ | $x > 10$ |
|---|---|---|---|---|---|---|---|
| Obs | 21 | 24 | 12 | 15 | 13 | 9 | 6 |
| Exp | 26.53 | 17.22 | 20.25 | 11.00 | 10.94 | 8.74 | 5.32 |

| Answer | Marks | Guidance |
|---|---|---|
| Expected frequencies as in the table above | M1, A1 | Probabilities $\times$ 100. All expected frequencies correct. |
| $\therefore X^2 = \frac{(21-26.53)^2}{26.53} + \cdots = 1.1527 + 2.6695 + 3.3611 + 1.4545 + 0.3879 + 0.0077 + 0.0869 = 9.1203$ | M1, A1 | At least 4 values correct. |
| d.o.f. = 7 – 1 = 6 | M1 | No ft from here if wrong. |
| Refer to $\chi_6^2$. Upper 5% point is 12.59 | M1, A1 | No ft from here if wrong. |
| 9.1203 < 12.59 $\therefore$ Result is not significant. | E1 | ft only c's test statistic. |
| Evidence suggests the model fits the data at the 5% level. | E1 | ft only c's test statistic. |

Total: 9 marks
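
The arithmetic in part (i) can be checked with a short script. This is only a sketch using the standard library: the variable names are illustrative, and the critical value 12.59 is copied from the mark scheme rather than computed. Working at full precision gives a statistic of roughly 9.12, which may differ from the mark scheme's figure in the last decimal place because the mark scheme adds rounded contributions.

```python
# Observed frequencies and model probabilities, copied from the question's table.
observed = [21, 24, 12, 15, 13, 9, 6]
probabilities = [0.2653, 0.1722, 0.2025, 0.1100, 0.1094, 0.0874, 0.0532]

n = sum(observed)  # 100 observations in total

# Expected frequency for each class is n times the class probability.
expected = [n * p for p in probabilities]

# Each class contributes (O - E)^2 / E to the test statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = sum(contributions)

# No parameters were estimated from the data, so d.o.f. = classes - 1.
dof = len(observed) - 1

# Upper 5% point of chi-squared with 6 d.o.f., taken from the mark scheme.
critical_value = 12.59
significant = x2 > critical_value

print(f"X^2 = {x2:.2f}, d.o.f. = {dof}, significant at 5%: {significant}")
```
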
**Part (ii):**

| Data | Diff = data − 124 | Rank of $\lvert\text{diff}\rvert$ |
|------|------------------|--------|
| 239 | 115 | 9 |
| 77 | −47 | 3 |
| 179 | 55 | 4 |
| 221 | 97 | 7 |
| 100 | −24 | 2 |
| 312 | 188 | 10 |
| 52 | −72 | 5 |
| 129 | 5 | 1 |
| 236 | 112 | 8 |
| 42 | −82 | 6 |

| Answer | Marks | Guidance |
|---|---|---|
| Differences and ranks as in the table above | M1, M1, A1 | For differences. For ranks of $\lvert\text{difference}\rvert$. All correct. ft from here if ranks wrong. |
| $W_- = 3 + 2 + 5 + 6 = 16$ | B1 | Or $W_+ = 9 + 4 + 7 + 10 + 1 + 8 = 39$ |
| Refer to Wilcoxon single sample (/paired) tables for $n = 10$. Lower two-tail 10% point is 10. | M1 | No ft from here if wrong. If 39 is used, the upper point is 45. |
| 16 > 10 $\therefore$ Result is not significant. | E1 | Or 39 < 45. ft only c's test statistic. |
| Seems there is no evidence against the median length being 124. | E1 | ft only c's test statistic. |

Total: 9 marks
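
The ranking and rank sums in part (ii) can be reproduced as follows. This is a minimal sketch that assumes there are no tied or zero differences, which holds for this sample. The names `w_plus` and `w_minus` are illustrative, and the critical value 10 is copied from the mark scheme's tables rather than computed.

```python
# Sample of ten lengths (metres) and the hypothesised median from the question.
data = [239, 77, 179, 221, 100, 312, 52, 129, 236, 42]
hypothesised_median = 124

# Signed differences from the hypothesised median.
diffs = [x - hypothesised_median for x in data]

# Rank the absolute differences, smallest = rank 1.
# Simple ranking is enough here because there are no ties or zeros.
order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
ranks = [0] * len(diffs)
for rank, index in enumerate(order, start=1):
    ranks[index] = rank

# Sum the ranks separately for positive and negative differences.
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

# Two-tailed test: compare the smaller rank sum with the lower critical value.
test_statistic = min(w_plus, w_minus)
critical_value = 10  # n = 10, two-tail 10%, taken from the mark scheme
significant = test_statistic <= critical_value

print(f"W+ = {w_plus}, W- = {w_minus}, significant at 10%: {significant}")
```

The two rank sums always add up to $n(n+1)/2 = 55$, which is a quick check on the ranking.
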
4 A machine produces plastic strip in a continuous process. Occasionally there is a flaw at some point along the strip. The length of strip (in hundreds of metres) between successive flaws is modelled by a continuous random variable $X$ with probability density function $\mathrm{f}(x) = \frac{18}{(3+x)^{3}}$ for $x > 0$. The table below gives the frequencies for 100 randomly chosen observations of $X$. It also gives the probabilities for the class intervals using the model.
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Length $x$ (hundreds of metres) & Observed frequency & Probability \\
\hline
$0 < x \leqslant 0.5$ & 21 & 0.2653 \\
\hline
$0.5 < x \leqslant 1$ & 24 & 0.1722 \\
\hline
$1 < x \leqslant 2$ & 12 & 0.2025 \\
\hline
$2 < x \leqslant 3$ & 15 & 0.1100 \\
\hline
$3 < x \leqslant 5$ & 13 & 0.1094 \\
\hline
$5 < x \leqslant 10$ & 9 & 0.0874 \\
\hline
$x > 10$ & 6 & 0.0532 \\
\hline
\end{tabular}
\end{center}
(i) Examine the fit of this model to the data at the $5 \%$ level of significance.
You are given that the median length between successive flaws is 124 metres. At a later date the following random sample of ten lengths (in metres) between flaws is obtained.
$$\begin{array} { l l l l l l l l l l }
239 & 77 & 179 & 221 & 100 & 312 & 52 & 129 & 236 & 42
\end{array}$$
(ii) Test at the $10 \%$ level of significance whether the median length may still be assumed to be 124 metres.
\hfill \mbox{\textit{OCR MEI S3 2007 Q4 [18]}}
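
The class probabilities and the median quoted in the question both follow from the density. Integrating $\mathrm{f}(t) = 18/(3+t)^3$ from 0 to $x$ gives the cumulative distribution function $F(x) = 1 - 9/(3+x)^2$. The sketch below uses that formula to check the tabulated values. The function and variable names are illustrative and not part of the original question. The computed values agree with the table to within 0.0001; the last class comes out nearer 0.0533 than the tabulated 0.0532, which is presumably a rounding adjustment so that the tabulated probabilities sum to 1.

```python
import math


def cdf(x):
    """F(x) = 1 - 9/(3+x)^2, the integral of 18/(3+t)^3 from 0 to x."""
    return 1 - 9 / (3 + x) ** 2


# Class boundaries in hundreds of metres, from the question's table.
boundaries = [0, 0.5, 1, 2, 3, 5, 10]

# Probability of each bounded class is the difference of the CDF at its ends.
class_probs = [cdf(b) - cdf(a) for a, b in zip(boundaries, boundaries[1:])]
# The open-ended class x > 10 takes the remaining probability.
class_probs.append(1 - cdf(10))

# Median m solves F(m) = 0.5, i.e. (3 + m)^2 = 18.
median_hundreds = math.sqrt(18) - 3
median_metres = 100 * median_hundreds

print([round(p, 4) for p in class_probs])
print(f"median is about {median_metres:.0f} metres")
```
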