OCR MEI AS Paper 2 2021 November — Question 7 7 marks

Exam Board: OCR MEI
Module: AS Paper 2
Year: 2021
Session: November
Marks: 7
Topic: Data representation
Type: State advantages of diagram types
Difficulty: Easy (-1.2). This is a straightforward AS-level statistics question testing basic data cleaning concepts, outlier identification using the 2-standard-deviation rule, and recognition of systematic sampling. All parts require only recall and simple application of standard definitions, with no problem-solving or novel insight required.
Spec: 2.02h Recognize outliers; 2.02j Clean data: missing data, errors


## Question 7:

### Part (a):
| Answer | Mark | Guidance |
|--------|------|----------|
| Remove Western Sahara since there is no data available (#N/A) | B1 | Ignore any comments about removing outliers |
| **[1]** | | |

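As an illustration of the cleaning step in Part (a), the following is a minimal sketch that drops any record whose value is not numeric. It assumes the extract in Fig. 7.1 has been read in as text; the names `raw`, `clean`, `cleaned` and `removed` are hypothetical and not part of the question.

```python
# Extract from Fig. 7.1, held as text; "#N/A" marks a missing value.
raw = {
    "Algeria": "7.2", "Egypt": "5.6", "Libya": "5", "Morocco": "5.9",
    "Sudan": "8.4", "Tunisia": "7", "Western Sahara": "#N/A",
    "Angola": "3.3", "Benin": "4.6", "Botswana": "5.4", "Burkina Faso": "5",
}


def clean(records):
    """Keep only the entries whose value parses as a number."""
    cleaned = {}
    for country, value in records.items():
        try:
            cleaned[country] = float(value)
        except ValueError:
            # Non-numeric marker such as "#N/A": drop the record.
            continue
    return cleaned


cleaned = clean(raw)
removed = sorted(set(raw) - set(cleaned))
print(removed)
```

In this extract the only record removed is Western Sahara, which matches the B1 answer.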
### Part (b):
| Answer | Mark | Guidance |
|--------|------|----------|
| $6.79 + 2\times 2.78$ or $6.79 - 2\times 2.78$ **soi** | M1 | NB $1.23$ or $12.35$ implies M1 (outliers lie more than 2 standard deviations from the mean) |
| None of the smallest values are outliers | A1 | soi |
| At least 1 of largest values identified | A1 | |
| $13.7, 13.7, 16.5, 17.1, 17.1$ only | A1 | CAO |
| **[4]** | | |

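The Part (b) arithmetic can be checked with a short sketch. It assumes the rule implied by the mark scheme (a value is an outlier if it lies more than 2 standard deviations from the mean) and uses only the values listed in Fig. 7.2; the variable names are illustrative.

```python
mean, sd = 6.79, 2.78

# Outlier boundaries: mean plus or minus 2 standard deviations.
lower = mean - 2 * sd   # about 1.23
upper = mean + 2 * sd   # about 12.35

# Values listed in Fig. 7.2.
smallest = [1.5, 1.9, 2.1]
largest = [11.7, 11.9, 13.7, 13.7, 16.5, 17.1, 17.1]

# A value is flagged if it falls outside [lower, upper].
outliers = [x for x in smallest + largest if x < lower or x > upper]

print(round(lower, 2), round(upper, 2))
print(outliers)
```

None of the smallest values falls below the lower boundary, and the five largest values from 13.7 upwards exceed the upper boundary, in agreement with the final A1.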
### Part (c):
| Answer | Mark | Guidance |
|--------|------|----------|
| Not simple random sampling because not every possible sample has an equal chance of being selected | B2 | oe. Allow B1 for: He is using systematic sampling |
| **[2]** | | |

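To see why the procedure in Part (c) is not simple random sampling, the sketch below enumerates every sample Kareem's method can produce. It assumes "between 1 and 13" is inclusive; the function names `kareem_sample` and `has_adjacent` are hypothetical helpers introduced for illustration.

```python
import math


def kareem_sample(n, step=6, size=30):
    """Positions chosen by the procedure: n, n + 6, ..., n + 6 * 29."""
    return [n + step * k for k in range(size)]


def has_adjacent(sample):
    """True if the sample contains two consecutively numbered values."""
    return any(b - a == 1 for a, b in zip(sample, sample[1:]))


# One sample for each possible starting value n = 1, ..., 13.
possible = [kareem_sample(n) for n in range(1, 14)]

# Number of samples of size 30 that simple random sampling could give.
total_srs = math.comb(187, 30)

print(len(possible))                         # samples the procedure can give
print(possible[12][-1])                      # last position when n = 13
print(any(has_adjacent(s) for s in possible))
print(total_srs > len(possible))
```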
---
7 The pre-release material contains information about health expenditure. Fig. 7.1 shows an extract from the data.

\begin{table}[h]
\begin{center}
\begin{tabular}{|l|l|}
\hline
Country & Health expenditure (\% of GDP) \\
\hline
Algeria & 7.2 \\
\hline
Egypt & 5.6 \\
\hline
Libya & 5 \\
\hline
Morocco & 5.9 \\
\hline
Sudan & 8.4 \\
\hline
Tunisia & 7 \\
\hline
Western Sahara & \#N/A \\
\hline
Angola & 3.3 \\
\hline
Benin & 4.6 \\
\hline
Botswana & 5.4 \\
\hline
Burkina Faso & 5 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Fig. 7.1}
\end{center}
\end{table}
\begin{enumerate}[label=(\alph*)]
\item Explain how the data should be cleaned before any analysis takes place.

Kareem uses all the available data to conduct an investigation into health expenditure as a percentage of GDP in different countries.

He calculates the mean to be 6.79 and the standard deviation to be 2.78.

Fig. 7.2 shows the smallest values and the largest values of health expenditure as a percentage of GDP.

\begin{table}[h]
\begin{center}
\begin{tabular}{|l|l|}
\hline
Smallest values of Health expenditure (\% of GDP) & Largest values of Health expenditure (\% of GDP) \\
\hline
1.5 & 11.7 \\
\hline
1.9 & 11.9 \\
\hline
2.1 & 13.7 \\
\hline
 & 13.7 \\
\hline
 & 16.5 \\
\hline
 & 17.1 \\
\hline
 & 17.1 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Fig. 7.2}
\end{center}
\end{table}
\item Determine which of these values are outliers.

Kareem removes the outliers from the data and finds that there are 187 values left. He decides to collect a sample of size 30.

He uses the following sampling procedure.\\
Assign each value a number from 1 to 187.

Generate a random number, $n$, between 1 and 13.

Starting with the $n$th value, choose every 6th value after that until 30 values have been chosen.
\item Explain whether Kareem is using simple random sampling.
\end{enumerate}

\hfill \mbox{\textit{OCR MEI AS Paper 2 2021 Q7 [7]}}