7 The pre-release material contains information about health expenditure. Fig. 7.1 shows an extract from the data.
\begin{table}[h]
\begin{tabular}{|l|c|}
\hline
Country & Health expenditure (\% of GDP) \\
\hline
Algeria & 7.2 \\
Egypt & 5.6 \\
Libya & 5 \\
Morocco & 5.9 \\
Sudan & 8.4 \\
Tunisia & 7 \\
Western Sahara & \#N/A \\
Angola & 3.3 \\
Benin & 4.6 \\
Botswana & 5.4 \\
Burkina Faso & 5 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Fig. 7.1}
\end{table}
- Explain how the data should be cleaned before any analysis takes place.
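One possible cleaning step is sketched below in Python. It assumes that the extract in Fig. 7.1 is held as pairs of strings and that entries with no numerical value, such as \#N/A, are to be excluded. The data layout and the function name are illustrative assumptions and are not part of the pre-release material.

```python
# Extract from Fig. 7.1 held as raw strings (this layout is an assumption).
raw = [
    ("Algeria", "7.2"), ("Egypt", "5.6"), ("Libya", "5"), ("Morocco", "5.9"),
    ("Sudan", "8.4"), ("Tunisia", "7"), ("Western Sahara", "#N/A"),
    ("Angola", "3.3"), ("Benin", "4.6"), ("Botswana", "5.4"),
    ("Burkina Faso", "5"),
]


def clean(rows):
    """Keep only rows whose value parses as a number; drop entries like '#N/A'."""
    kept = []
    for country, value in rows:
        try:
            kept.append((country, float(value)))
        except ValueError:
            # Non-numeric entries carry no usable value, so they are skipped.
            continue
    return kept


cleaned = clean(raw)
print(len(raw), len(cleaned))
```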
Kareem uses all the available data to conduct an investigation into health expenditure as a percentage of GDP in different countries.
He calculates the mean to be 6.79 and the standard deviation to be 2.78.
Fig. 7.2 shows the smallest values and the largest values of health expenditure as a percentage of GDP.
\begin{table}[h]
\begin{tabular}{|c|c|}
\hline
Smallest values of Health expenditure (\% of GDP) & Largest values of Health expenditure (\% of GDP) \\
\hline
1.5 & 11.7 \\
1.9 & 11.9 \\
2.1 & 13.7 \\
 & 13.7 \\
 & 16.5 \\
 & 17.1 \\
 & 17.1 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Fig. 7.2}
\end{table}
- Determine which of these values are outliers.
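The question does not state which outlier rule is intended. The Python sketch below assumes one common convention, that a value is an outlier if it lies more than 2 standard deviations from the mean. It also takes the smallest values in Fig. 7.2 to be 1.5, 1.9 and 2.1, and the largest values to be 11.7, 11.9, 13.7, 13.7, 16.5, 17.1 and 17.1. A different rule would give different bounds.

```python
# Summary statistics quoted in the question.
mean, sd = 6.79, 2.78

# Values read from Fig. 7.2 (the column assignment is an assumption).
smallest = [1.5, 1.9, 2.1]
largest = [11.7, 11.9, 13.7, 13.7, 16.5, 17.1, 17.1]

# Assumed rule: outliers lie more than 2 standard deviations from the mean.
lower = mean - 2 * sd
upper = mean + 2 * sd

outliers = [x for x in smallest + largest if x < lower or x > upper]

print(round(lower, 2), round(upper, 2))
print(outliers)
```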
Kareem removes the outliers from the data and finds that there are 187 values left. He decides to collect a sample of size 30.
He uses the following sampling procedure.
Assign each value a number from 1 to 187.
Generate a random number, \(n\), between 1 and 13.
Starting with the \(n\)th value, choose every 6th value after that until 30 values have been chosen.
- Explain whether Kareem is using simple random sampling.
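The sampling procedure described above can be sketched in Python as follows. The sketch assumes that \(n\) is a whole number chosen with equal probability from 1 to 13 inclusive, and it uses the position numbers 1 to 187 as stand-ins for the data values. The function and parameter names are illustrative only.

```python
import random


def sample_positions(size=30, step=6, max_start=13, rng=random):
    """Return the positions chosen by the procedure described in the question."""
    # Random starting position n, assumed uniform on 1..max_start inclusive.
    n = rng.randint(1, max_start)
    # Take the nth value, then every 6th value after it, until 30 are chosen.
    return [n + step * k for k in range(size)]


positions = sample_positions()
print(positions)
```

Under this reading, the largest position that can be chosen is \(13 + 29 \times 6 = 187\), so the procedure never runs past the end of the data, and the whole sample is determined by the single random choice of \(n\).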