OCR S3 Specimen, Question 4 (10 marks)

Exam Board: OCR
Module: S3 (Statistics 3)
Session: Specimen
Marks: 10
Topic: Chi-squared goodness of fit
Type: Chi-squared goodness of fit: Other continuous
Difficulty: Standard +0.3. This is a straightforward chi-squared goodness of fit test with expected frequencies provided. Part (i) requires routine calculation of P(5<X≤10) using the exponential distribution CDF, and part (ii) is a standard hypothesis test following a textbook procedure. The only minor complexity is combining cells to meet the expected frequency requirement, but this is a standard technique taught in S3.
Spec: 5.06c Fit other distributions: discrete and continuous


Answer | Marks | Guidance
**(i)** $f_c = 100 \times \int_5^{10} 0.1e^{-0.1x}dx$ | M1 | For attempting to integrate $f(x)$
$= 100\left[-e^{-0.1x}\right]_5^{10}$ | A1 | For correct indefinite integral
$= 100(e^{-0.5} - e^{-1}) = 23.87$ | M1 | For multiplying by total frequency
| M1 | For use of correct limits
| A1 | 5 | For obtaining given answer correctly
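
The integration in part (i) can be checked numerically. The following is a minimal sketch, assuming the exponential model from the question with rate 0.1 and 100 observations; the names `RATE`, `TOTAL`, `cdf` and `expected_frequency` are illustrative and do not come from the mark scheme.

```python
import math

RATE = 0.1    # rate parameter read off f(x) = 0.1 e^{-0.1x}
TOTAL = 100   # number of observations stated in the question


def cdf(x: float) -> float:
    """P(X <= x) for the exponential model, i.e. the integral of f from 0 to x."""
    return 1.0 - math.exp(-RATE * x)


def expected_frequency(lower: float, upper: float) -> float:
    """Expected count in (lower, upper]: total frequency times P(lower < X <= upper)."""
    return TOTAL * (cdf(upper) - cdf(lower))


# Class boundaries from Table 1; the last class is open-ended.
boundaries = [0, 5, 10, 20, 40, math.inf]
expected = [expected_frequency(a, b) for a, b in zip(boundaries, boundaries[1:])]

for (a, b), e in zip(zip(boundaries, boundaries[1:]), expected):
    print(f"{a} < x <= {b}: {e:.2f}")
```

The second value printed corresponds to $100(e^{-0.5} - e^{-1})$, and the remaining values can be compared against Table 2.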

**(ii)** Combining: $\frac{f_o}{f_c}: \frac{49}{39.35}, \frac{22}{23.87}, \frac{20}{23.25}, \frac{9}{13.53}$ | M1 | For combining the last two classes
Test statistic is $\frac{9.65^2}{39.35} + \frac{1.87^2}{23.87} + \frac{3.25^2}{23.25} + \frac{4.53^2}{13.53}$ | M1 | For correct calculation process
$= 4.484$ | A1 | For correct value 4.48
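This is less than 6.251, the critical value for $\chi^2$ with $4 - 1 = 3$ degrees of freedom at the stated significance level | M1 | For comparison with the correct critical value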
Hence there is a satisfactory fit | A1 | 5 | For correct conclusion, in terms of the fit
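
The arithmetic in part (ii) can be reproduced with a short script. This is a sketch only, assuming the usual rule that classes with expected frequency below 5 are merged with a neighbour and that the critical value 6.251 is taken from tables rather than computed; the helper `combine_small_tail` and the constant names are hypothetical.

```python
observed = [49, 22, 20, 7, 2]                   # Table 1
expected = [39.35, 23.87, 23.25, 11.70, 1.83]   # Table 2

MIN_EXPECTED = 5          # conventional minimum expected frequency per class
CRITICAL_VALUE = 6.251    # quoted in the mark scheme (chi-squared, 3 d.f.)


def combine_small_tail(obs, exp, minimum):
    """Merge the last class into the one before it while its expected
    frequency is below `minimum`. Returns new lists; inputs are unchanged."""
    obs, exp = list(obs), list(exp)
    while len(exp) > 1 and exp[-1] < minimum:
        last_o, last_e = obs.pop(), exp.pop()
        obs[-1] += last_o
        exp[-1] += last_e
    return obs, exp


obs_c, exp_c = combine_small_tail(observed, expected, MIN_EXPECTED)

# Sum of (observed - expected)^2 / expected over the combined classes.
statistic = sum((o - e) ** 2 / e for o, e in zip(obs_c, exp_c))

# No parameter is estimated from the data, so d.f. = number of classes - 1.
degrees_of_freedom = len(obs_c) - 1

print(f"combined observed: {obs_c}")
print(f"test statistic: {statistic:.3f} on {degrees_of_freedom} d.f.")
print("satisfactory fit" if statistic < CRITICAL_VALUE else "model rejected")
```

Merging only the final class is enough here because every other expected frequency already exceeds 5.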
4 The lengths of time, in seconds, between vehicles passing a fixed observation point on a road were recorded at a time when traffic was flowing freely. The frequency distribution in Table 1 is a summary of the data from 100 observations.

\begin{table}[h]
\begin{center}
\begin{tabular}{ | l | c c c c c | }
\hline
Time interval $( x$ seconds $)$ & $0 < x \leqslant 5$ & $5 < x \leqslant 10$ & $10 < x \leqslant 20$ & $20 < x \leqslant 40$ & $40 < x$ \\
Observed frequency & 49 & 22 & 20 & 7 & 2 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Table 1}
\end{center}
\end{table}

It is thought that the distribution of times might be modelled by the continuous random variable $X$ with probability density function given by

$$f ( x ) = \begin{cases} 0.1 e ^ { - 0.1 x } & x > 0 \\ 0 & \text { otherwise } \end{cases}$$

Using this model, the expected frequencies (correct to 2 decimal places) for the given time intervals are shown in Table 2.

\begin{table}[h]
\begin{center}
\begin{tabular}{ | l | c c c c c | }
\hline
Time interval $( x$ seconds $)$ & $0 < x \leqslant 5$ & $5 < x \leqslant 10$ & $10 < x \leqslant 20$ & $20 < x \leqslant 40$ & $40 < x$ \\
Expected frequency & 39.35 & 23.87 & 23.25 & 11.70 & 1.83 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Table 2}
\end{center}
\end{table}

(i) Show how the expected frequency of 23.87, corresponding to the interval $5 < x \leqslant 10$, is obtained.\\
(ii) Test, at the 10\% significance level, the goodness of fit of the model to the data.

\hfill \mbox{\textit{OCR S3  Q4 [10]}}