- Sara was studying the relationship between rainfall, \(r \mathrm {~mm}\), and humidity, \(h \%\), in the UK. She took a random sample of 11 days from May 1987 for Leuchars from the large data set.
She obtained the following results.
| \(h\) | 93 | 86 | 95 | 97 | 86 | 94 | 97 | 97 | 87 | 97 | 86 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(r\) | 1.1 | 0.3 | 3.7 | 20.6 | 0 | 0 | 2.4 | 1.1 | 0.1 | 0.9 | 0.1 |
Sara examined the rainfall figures and found
$$Q _ { 1 } = 0.1 \quad Q _ { 2 } = 0.9 \quad Q _ { 3 } = 2.4$$
A value that is more than 1.5 times the interquartile range (IQR) above \(Q _ { 3 }\) is called an outlier.
- Show that \(r = 20.6\) is an outlier.
- Give a reason why Sara might:
  - include
  - exclude

  this day's reading.
Sara decided to exclude this day's reading and drew the following scatter diagram for the remaining 10 days' values of \(r\) and \(h\).
\includegraphics[max width=\textwidth, alt={}, center]{8f3dbcb4-3260-4493-a230-12577b4ed691-08_988_1081_1555_420}
- Give an interpretation of the correlation between rainfall and humidity.
The equation of the regression line of \(r\) on \(h\) for these 10 days is \(r = -12.8 + 0.15h\).
- Give an interpretation of the gradient of this regression line.
- Comment on the suitability of Sara's sampling method for this study.
- Suggest how Sara could make better use of the large data set for her study.
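
The arithmetic behind the outlier rule and the stated regression line can be checked numerically. The following is a minimal Python sketch, not a model answer: it uses the quartiles exactly as given in the question, and the helper names `upper_fence`, `is_outlier` and `least_squares` are hypothetical, introduced here only for illustration. The line is fitted by ordinary least squares of \(r\) on \(h\), which is assumed to be the method behind the stated equation.

```python
# Data copied from the table in the question (h in %, r in mm).
h = [93, 86, 95, 97, 86, 94, 97, 97, 87, 97, 86]
r = [1.1, 0.3, 3.7, 20.6, 0, 0, 2.4, 1.1, 0.1, 0.9, 0.1]

# Quartiles as stated in the question (not recomputed here, since
# quartile conventions differ between methods).
q1, q3 = 0.1, 2.4


def upper_fence(q1, q3, k=1.5):
    """Return Q3 + k * IQR, the threshold above which a value is an outlier."""
    return q3 + k * (q3 - q1)


def is_outlier(value, q1, q3):
    """True if value is more than 1.5 * IQR above Q3."""
    return value > upper_fence(q1, q3)


def least_squares(x, y):
    """Return (intercept, gradient) of the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    gradient = s_xy / s_xx
    intercept = mean_y - gradient * mean_x
    return intercept, gradient


# Drop any day whose rainfall is an outlier, as Sara decided to do.
kept = [(hi, ri) for hi, ri in zip(h, r) if not is_outlier(ri, q1, q3)]
h10 = [hi for hi, _ in kept]
r10 = [ri for _, ri in kept]

intercept, gradient = least_squares(h10, r10)

print("upper fence:", upper_fence(q1, q3))
print("20.6 is an outlier:", is_outlier(20.6, q1, q3))
print("days kept:", len(kept))
print("fitted line: r =", round(intercept, 2), "+", round(gradient, 3), "h")
```

With the stated quartiles the threshold is \(2.4 + 1.5 \times 2.3 = 5.85\), so only the 20.6 mm reading is dropped; the fitted intercept and gradient can then be compared with the equation given in the question.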