| Exam Board | OCR MEI |
|---|---|
| Module | Paper 2 |
| Year | 2018 |
| Session | June |
| Marks | 9 |
| Topic | Data representation |
| Type | Estimate mean and standard deviation from frequency table |
| Difficulty | Moderate (-0.8). This is a straightforward statistics question requiring identification of errors in a cumulative frequency diagram and basic interpretation of sampling issues. Part (i) involves recognizing systematic errors (likely plotting points incorrectly) and understanding median estimation, which is standard S1 content. Parts (ii) and (iii) require contextual reasoning about pre-release material rather than mathematical calculation. No complex calculations or novel problem-solving required; mainly testing understanding of basic statistical concepts and data interpretation. |
| Spec | 2.01d Select/critique sampling: in context; 2.02a Interpret single variable data: tables and diagrams |
## Question 14(i)A:
| Answer | Marks | AO | Guidance |
|---|---|---|---|
| the cumulative frequencies have been plotted against the mid-points of the class intervals | B1 | 2.4 | |
| mis-plotting [at centre of each class] reduces estimate (by 2.5) oe | B1 [2] | 2.4 | |

---
## Question 14(i)B:
| Answer | Marks | AO | Guidance |
|---|---|---|---|
| grouped data has been used | B1 | 2.4 | or eg Hodge used the graph (instead of the raw data) |
| grouping has slightly reduced the error introduced by misplotting (because the error is less than 2.5) | B1 [2] | 2.4 | |

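
The two effects in part (i) can be checked numerically. The following is a minimal sketch, assuming straight-line interpolation between the plotted points; the spreadsheet in Fig. 14.2 draws a smooth curve, so a reading taken from it (such as Hodge's 5.0) need not match exactly. The function and variable names are illustrative only. Under this assumption the correctly plotted grouped data give an estimate of about 7.4, the mis-plotted points give about 4.9, and the gap between them is exactly 2.5. Since the raw-data median is 6.9, grouping on its own overestimates slightly, which is why Hodge's overall error (1.9) is less than 2.5.

```python
# Class boundaries and frequencies taken from Fig. 14.1
lower = [0, 5, 10, 15, 20, 35]
upper = [5, 10, 15, 20, 35, 50]
freq = [15, 21, 5, 5, 2, 2]

def cumulative(freqs):
    """Running totals of the class frequencies."""
    total, out = 0, []
    for f in freqs:
        total += f
        out.append(total)
    return out

def interpolate_median(xs, cum):
    """Value at cumulative frequency n/2, by linear interpolation
    between consecutive plotted points (an assumption of this sketch)."""
    target = cum[-1] / 2
    points = list(zip(xs, cum))
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("n/2 is not bracketed by the plotted points")

cum = cumulative(freq)

# Correct plot: cumulative frequency against upper class boundaries,
# starting from (0, 0).
correct_estimate = interpolate_median([lower[0]] + upper, [0] + cum)

# Mis-plot described in the mark scheme: cumulative frequency against
# class mid-points.
midpoints = [(a + b) / 2 for a, b in zip(lower, upper)]
misplotted_estimate = interpolate_median(midpoints, cum)

print(round(correct_estimate, 2), round(misplotted_estimate, 2))
```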
---
## Question 14(ii):
| Answer | Marks | AO | Guidance |
|---|---|---|---|
| percentage unemployment is often estimated oe | E1 [1] | 2.4 | allow: data (on percentage unemployment) is not available for all countries **in Europe** oe |

---
## Question 14(iii):
| Answer | Marks | AO | Guidance |
|---|---|---|---|
| there are many other countries in the pre-release material; it is very unlikely that a random sample would only include European countries | E1 [1] | 2.4 | |

---
## Question 14(iv):
| Answer | Marks | AO | Guidance |
|---|---|---|---|
| negative correlation / association (may be embedded) | B1 | 2.2b | if **B0B0** allow **SC2** for eg comment on no significant association justified by comparison of $p$-value with appropriate significance level (eg 0.025) |
| comparison of $p$-value with 0.05 or 0.01 or other appropriate significance level and supporting comment | B1 [2] | 2.2b | |

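
The quoted $p$-value can be reproduced approximately from $r$ and $n$. The sketch below assumes the usual $t$-test for a product moment correlation coefficient, with $n - 2$ degrees of freedom, and integrates the $t$ density numerically so that only the standard library is needed; the helper names are hypothetical. Under these assumptions the one-tailed probability comes out at about 0.038, which suggests the quoted 0.0383 is a one-tailed value. That would be consistent with the mark scheme allowing a comparison with 0.025 for a conclusion of no significant association.

```python
import math

r, n = -0.2607, 47          # values quoted in the question
df = n - 2

# Test statistic for H0: no correlation (assumes the usual t-test for a PMCC)
t_stat = r * math.sqrt(df) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def tail_probability(t, df, steps=2000):
    """P(T > |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_one_tailed = tail_probability(t_stat, df)
p_two_tailed = 2 * p_one_tailed

print(round(t_stat, 3), round(p_one_tailed, 4), round(p_two_tailed, 4))
```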
---
## Question 14(v):
| Answer | Marks | AO | Guidance |
|---|---|---|---|
| (even though this is interpolation), the scatter / weak correlation / presence of an outlier would suggest that the use of a line of best fit is inappropriate | E1 [1] | 2.2b | allow explanation based on the value for Kosovo being an outlier or on it lying in the (large) gap in the scatter |

---
14 The pre-release material includes data on unemployment rates in different countries. A sample from this material has been taken. All the countries in the sample are in Europe. The data have been grouped and are shown in Fig. 14.1.
\begin{table}[h]
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Unemployment rate & $0 -$ & $5 -$ & $10 -$ & $15 -$ & $20 -$ & $35 - 50$ \\
\hline
Frequency & 15 & 21 & 5 & 5 & 2 & 2 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Fig. 14.1}
\end{center}
\end{table}
A cumulative frequency curve has been generated for the sample data using a spreadsheet. This is shown in Fig. 14.2.
\begin{figure}[h]
\begin{center}
\includegraphics[alt={},max width=\textwidth]{d8ff9511-aff7-45ea-ba55-e6667e8ba760-08_639_1081_808_466}
\captionsetup{labelformat=empty}
\caption{Fig. 14.2}
\end{center}
\end{figure}
Hodge used Fig. 14.2 to estimate the median unemployment rate in Europe. He obtained the answer 5.0. The correct value for this sample is 6.9.
\begin{enumerate}[label=(\roman*)]
\item (A) There is a systematic error in the diagram.
\begin{itemize}
\item Identify this error.
\item State how this error affects Hodge's estimate.
\end{itemize}
(B) There is another factor which has affected Hodge's estimate.
\begin{itemize}
\item Identify this factor.
\item State how this factor affects Hodge's estimate.
\end{itemize}
\item Use your knowledge of the pre-release material to give another reason why any estimation of the median unemployment rate in Europe may be unreliable.
\item Use your knowledge of the pre-release material to explain why it is very unlikely that the sample has been randomly selected from the pre-release material.
The scatter diagram shown in Fig. 14.3 shows the unemployment rate and life expectancy at birth for the 47 countries in the sample for which this information is available.
\begin{figure}[h]
\begin{center}
\captionsetup{labelformat=empty}
\caption{Scatter diagram to show life expectancy at birth against unemployment rate}
\includegraphics[alt={},max width=\textwidth]{d8ff9511-aff7-45ea-ba55-e6667e8ba760-09_627_1281_456_367}

Fig. 14.3
\end{center}
\end{figure}
The product moment correlation coefficient for the 47 items in the sample is $-0.2607$.\\
The $p$-value associated with $r = -0.2607$ and $n = 47$ is 0.0383.
\item Does this information suggest that there is an association between unemployment rate and life expectancy at birth in countries in Europe?
Hodge uses the spreadsheet tools to obtain the equation of a line of best fit for this data.
\item The unemployment rate in Kosovo is 35.3, but there is no data available on life expectancy. Is it reasonable to use Hodge's line of best fit to estimate life expectancy at birth in Kosovo?
\end{enumerate}
\hfill \mbox{\textit{OCR MEI Paper 2 2018 Q14 [9]}}