Shona calculated four correlation coefficients using data from the Large Data Set.
In each case she calculated the correlation coefficient between the masses of the cars and the CO₂ emissions for varying sample sizes.
A summary of these calculations, labelled A to D, is listed in the table below.
| Calculation | Sample size | Correlation coefficient |
|---|---|---|
| A | 3827 | 0.088 |
| B | 3735 | 0.246 |
| C | 24 | 0.400 |
| D | 1250 | -1.183 |
Shona would like to use calculation A to test whether there is evidence of positive correlation between mass and CO₂ emissions.
She finds that the critical value for a one-tailed test at the 5% level for a sample of size 3827 is 0.027.
- State appropriate hypotheses for Shona to use in her test. [1 mark]
- Determine if there is sufficient evidence to reject the null hypothesis.
Fully justify your answer. [1 mark]
- Shona's teacher tells her to remove calculation D from the table as it is incorrect.
Explain how the teacher knew it was incorrect. [1 mark]
- Before performing calculation B, Shona cleaned the data. She removed all cars from the Large Data Set that had incorrect masses.
Using your knowledge of the Large Data Set, explain what was incorrect about the masses which were removed from the calculation. [1 mark]
- Apart from CO₂ and CO emissions, state one other type of emission that Shona could investigate using the Large Data Set. [1 mark]
- Wesley claims that calculation C shows that a heavier car causes higher CO₂ emissions.
Give two reasons why Wesley's claim may be incorrect. [2 marks]
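
The numerical checks behind the test on calculation A and the validity of the tabulated coefficients can be sketched in Python. This is only an illustrative sketch: the function names `approx_critical_r` and `is_valid_r` are hypothetical, and the critical value is approximated by replacing the t distribution on n - 2 degrees of freedom with the standard normal, which is an assumption that is reasonable only for large samples such as calculation A.

```python
from math import sqrt
from statistics import NormalDist


def approx_critical_r(n: int, alpha: float = 0.05) -> float:
    """Approximate one-tailed critical value of the correlation coefficient.

    Assumption: for large n the t distribution with n - 2 degrees of
    freedom is replaced by the standard normal distribution.
    """
    z = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical z value
    return z / sqrt(n - 2 + z * z)


def is_valid_r(r: float) -> bool:
    """A correlation coefficient must lie between -1 and 1 inclusive."""
    return -1.0 <= r <= 1.0


# (sample size, correlation coefficient) copied from the table above
calculations = {
    "A": (3827, 0.088),
    "B": (3735, 0.246),
    "C": (24, 0.400),
    "D": (1250, -1.183),
}

n_a, r_a = calculations["A"]
crit_a = approx_critical_r(n_a)
print(round(crit_a, 3))  # 0.027, agreeing with the value Shona found
print(r_a > crit_a)      # True when the coefficient exceeds the critical value

# Labels of any tabulated coefficients that fall outside [-1, 1]
invalid = [label for label, (_, r) in calculations.items() if not is_valid_r(r)]
print(invalid)
```

For a small sample such as calculation C (n = 24) the normal approximation is too rough, and a printed table of critical values or the exact t distribution would be needed instead.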