| Exam Board | OCR |
|---|---|
| Module | S3 (Statistics 3) |
| Year | 2010 |
| Session | June |
| Marks | 10 |
| Topic | Chi-squared goodness of fit |
| Type | Chi-squared goodness of fit: Other continuous |
| Difficulty | Standard +0.3. This is a straightforward chi-squared goodness of fit test for a continuous distribution in which most expected frequencies are already provided. Students need only calculate two expected frequencies from the given CDF (simple exponential calculations), then carry out a standard chi-squared test with a table lookup. The mechanics are routine for S3 level, with no conceptual challenges or novel problem-solving required. |
| Spec | 5.06b Fit prescribed distribution: chi-squared test |
# Question 5:
## Part (i):
| Working | Marks | Guidance |
|---------|-------|----------|
| $e^{-2.25} - e^{-4}$ | M1 | Or find last entry using $F(x)$ |
| $\times 150$ | A1 | |
| $= 13.1$ | A1 | Or 2.7 if found first |
| Last: $150 - \text{sum} = 2.7$ | A1 ft **4** | Or 13.1 any accuracy |
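
As a rough numerical check on the working above, the expected frequencies can be reproduced from the hypothesised $F(x) = 1 - e^{-x^2}$ as $150 \times (F(b) - F(a))$ for each cell $a \leqslant x < b$, with $150 \times (1 - F(2))$ for the open-ended last cell. The following is only a sketch; the names `cdf`, `edges` and `expected` are illustrative and do not come from the mark scheme.

```python
import math


def cdf(x: float) -> float:
    """Hypothesised CDF: F(x) = 1 - exp(-x^2) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-x * x) if x >= 0 else 0.0


n = 150                             # sample size given in the question
edges = [0.0, 0.5, 1.0, 1.5, 2.0]   # cell boundaries from the frequency table

# Bounded cells: n * (F(upper) - F(lower)).
expected = [n * (cdf(b) - cdf(a)) for a, b in zip(edges, edges[1:])]

# Open-ended last cell x >= 2: n * (1 - F(2)) = n * exp(-4).
expected.append(n * (1.0 - cdf(edges[-1])))

rounded = [round(e, 1) for e in expected]
print(rounded)
```

The first three values should agree with the 33.2, 61.6 and 39.4 quoted in the question, and the last two with the 13.1 and 2.7 in the working.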
## Part (ii):
| Working | Marks | Guidance |
|---------|-------|----------|
| $H_0$: Data fits the model, $H_1$: Data does not fit | B1 | At least two correct, All correct |
| Combine last two cells | M1\*Dep | In range 13.2 to 13.5 |
| $\chi^2 = 7.8^2/33.2 + 11.6^2/61.6 + 7.4^2/39.4 + 11.2^2/15.8$ | A1 | SR: If last 2 cells are not combined B0M1A1A1(for 13.5) M1A1 |
| $= 13.3(46)$ | M1 | If no explicit comparison B1 if conclusion follows |
| Compare with $9.348$ (or $11.14$), reject $H_0$ | A1 ft | |
| There is sufficient evidence at the $2\frac{1}{2}\%$ significance level that the model is not a good fit | Dep\* **6** | |
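
The test statistic in the working can be checked with a short calculation. This sketch assumes the expected frequencies rounded to 1 decimal place, as the mark scheme does, and merges the last two cells because the final expected frequency (2.7) is below 5. The critical value 9.348 is copied from the mark scheme, not computed here, and the variable names are illustrative.

```python
observed = [41, 50, 32, 23, 4]
expected = [33.2, 61.6, 39.4, 13.1, 2.7]   # rounded to 1 d.p., as in the working

# Merge the last two cells so that every expected frequency is at least 5.
obs = observed[:3] + [sum(observed[3:])]   # [41, 50, 32, 27]
exp = expected[:3] + [sum(expected[3:])]   # last cell is about 15.8

# Pearson statistic: sum over cells of (O - E)^2 / E.
chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))

df = len(obs) - 1    # 4 cells, no parameters estimated from the data
critical = 9.348     # 2.5% critical value for 3 degrees of freedom (from the mark scheme)
reject_h0 = chi2 > critical

print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```

With these inputs the statistic comes out close to the 13.3(46) quoted above, which exceeds 9.348, so $H_0$ is rejected.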
---
5 A random variable $X$ is believed to have (cumulative) distribution function given by
$$\mathrm { F } ( x ) = \begin{cases} 0 & x < 0 , \\ 1 - \mathrm { e } ^ { - x ^ { 2 } } & x \geqslant 0 . \end{cases}$$
In order to test this, a random sample of 150 observations of $X$ was taken, and their values are summarised in the following grouped frequency table.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | }
\hline
Values & $0 \leqslant x < 0.5$ & $0.5 \leqslant x < 1$ & $1 \leqslant x < 1.5$ & $1.5 \leqslant x < 2$ & $x \geqslant 2$ \\
\hline
Frequency & 41 & 50 & 32 & 23 & 4 \\
\hline
\end{tabular}
\end{center}
The expected frequencies, correct to 1 decimal place, corresponding to the above distribution, are 33.2, 61.6 and 39.4 respectively for the first 3 cells.\\
(i) Find the expected frequencies for the last 2 cells.\\
(ii) Carry out a goodness of fit test at the $2 \frac { 1 } { 2 } \%$ significance level.
\hfill \mbox{\textit{OCR S3 2010 Q5 [10]}}