- A French test and a Spanish test were sat by 11 students.
The table below shows their marks.
| Student | A | B | C | D | E | F | G | H | I | J | K |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| French mark (\(f\)) | 24 | 30 | 32 | 32 | 36 | 36 | 40 | 44 | 50 | 60 | 68 |
| Spanish mark (\(s\)) | 16 | 90 | 24 | 28 | 32 | 36 | 38 | 44 | 48 | 48 | 68 |
Greg says that if these points were plotted on a scatter diagram, then the point \((30, 90)\) would be an outlier because 90 is an outlier for the Spanish marks.
An outlier is defined as a value that is
$$\text { greater than } Q _ { 3 } + 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { or smaller than } Q _ { 1 } - 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right)$$
- (a) Show that 90 is an outlier for the Spanish marks.
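The outlier check can be sketched in a few lines of Python. This is a minimal illustration, not a model answer: it assumes the common convention that, for \(n\) ordered values, a quartile position \(n/4\) or \(3n/4\) that is not a whole number is rounded up to the next value. The names `spanish`, `upper_fence` and `lower_fence` are chosen here for illustration.

```python
import math

# Spanish marks from the table, sorted into ascending order.
spanish = sorted([16, 90, 24, 28, 32, 36, 38, 44, 48, 48, 68])
n = len(spanish)  # 11 students

# Assumed convention: round a non-integer quartile position up.
# n/4 = 2.75 gives the 3rd value; 3n/4 = 8.25 gives the 9th value.
q1 = spanish[math.ceil(n / 4) - 1]
q3 = spanish[math.ceil(3 * n / 4) - 1]

iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr

is_outlier = 90 > upper_fence or 90 < lower_fence
print(q1, q3, upper_fence, is_outlier)  # prints: 28 48 78.0 True
```

Under this convention \(Q_1 = 28\) and \(Q_3 = 48\), so the upper boundary is \(48 + 1.5 \times 20 = 78\), and \(90 > 78\). Other quartile conventions may give slightly different boundaries.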
Ignoring the point (30, 90), Greg calculated the following summary statistics.
$$\sum f = 422 \quad \sum s = 382 \quad S _ { f f } = 1667.6 \quad S _ { f s } = 1735.6$$
- (b) Use these summary statistics to show that the equation of the least squares regression line of \(s\) on \(f\) for the remaining 10 students is
$$s = - 5.72 + 1.04 f$$
where the values of the intercept and gradient are given to 3 significant figures. You must show your working.
- (c) Give an interpretation of the gradient of the regression line.
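The regression coefficients can be checked numerically from the summary statistics. The sketch below assumes the standard least squares formulas for a line \(s = a + bf\), namely \(b = S_{fs} / S_{ff}\) and \(a = \bar{s} - b\bar{f}\). The variable names are illustrative only.

```python
# Summary statistics for the 10 students remaining after removing (30, 90).
n = 10
sum_f, sum_s = 422, 382
S_ff, S_fs = 1667.6, 1735.6

# Standard least squares formulas for the regression line of s on f.
gradient = S_fs / S_ff                     # b = S_fs / S_ff
mean_f, mean_s = sum_f / n, sum_s / n      # 42.2 and 38.2
intercept = mean_s - gradient * mean_f     # a = s_bar - b * f_bar

# Report both coefficients to 3 significant figures.
print(f"s = {intercept:.3g} + {gradient:.3g} f")  # prints: s = -5.72 + 1.04 f
```

Read this way, a gradient of about 1.04 suggests that, for these 10 students, each additional mark on the French test is associated with roughly 1.04 additional marks on the Spanish test on average.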
Two further students sat the French test but missed the Spanish test.
- (d) Using the equation given in part (b), estimate
  - (i) a Spanish mark for the student who scored 55 marks in their French test,
  - (ii) a Spanish mark for the student who scored 18 marks in their French test.
- (e) State, giving a reason, which of the two estimates found in part (d) would be the more reliable estimate.
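The two estimates, and whether each French mark lies inside the range of data used to fit the line, can be sketched as follows. This is an illustration under stated assumptions: the helper `estimate_spanish` is a hypothetical name, and it uses the rounded coefficients from the given equation rather than unrounded values.

```python
# French marks of the 10 students used to fit the line (student B excluded).
french_marks = [24, 32, 32, 36, 36, 40, 44, 50, 60, 68]

def estimate_spanish(f):
    """Estimate a Spanish mark from a French mark using s = -5.72 + 1.04 f."""
    return -5.72 + 1.04 * f

for f in (55, 18):
    # A value inside the observed range is interpolation; outside is extrapolation.
    inside = min(french_marks) <= f <= max(french_marks)
    kind = "interpolation" if inside else "extrapolation"
    print(f, round(estimate_spanish(f), 2), kind)
# prints:
# 55 51.48 interpolation
# 18 13.0 extrapolation
```

Because 55 lies within the observed French marks (24 to 68) while 18 lies below them, the estimate for 55 is generally regarded as the more reliable of the two.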