- Two students, Jim and Dora, collected data on the mean annual rainfall, \(w \mathrm {~cm}\), and the annual yield of leeks, \(l\) tonnes per hectare, for 10 years.
Jim summarised the data as follows:
$$S_{wl} = 42.786 \quad S_{ww} = 9936.9 \quad \sum l^{2} = 26.2326 \quad \sum l = 16.06$$
- Find the product moment correlation coefficient between \(l\) and \(w\).
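
The arithmetic for this part can be sketched as follows. This is a minimal sketch, assuming the standard identities \(S_{ll} = \sum l^{2} - (\sum l)^{2}/n\) and \(r = S_{wl}/\sqrt{S_{ww} S_{ll}}\) with \(n = 10\); the variable names are illustrative and do not come from the question.

```python
import math

# Summary statistics quoted in the question (n = 10 years of data)
n = 10
S_wl = 42.786
S_ww = 9936.9
sum_l2 = 26.2326
sum_l = 16.06

# S_ll from the raw sums, assuming the usual identity
S_ll = sum_l2 - sum_l**2 / n

# Product moment correlation coefficient between l and w
r = S_wl / math.sqrt(S_ww * S_ll)

print(f"S_ll = {S_ll:.5f}, r = {r:.3f}")
```

The result is a moderate positive correlation, which is consistent with the positive value of \(S_{wl}\).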
Dora decided to code the data first using \(s = w - 6\) and \(t = l - 20\).
- Write down the value of the product moment correlation coefficient between \(s\) and \(t\). Give a justification for your answer.
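
The effect of this kind of coding can be checked numerically. The sketch below uses a small set of made-up values (hypothetical, not the students' data, which the question does not give) and a helper `pmcc` written for illustration; it shows that subtracting a constant from each variable leaves the correlation coefficient unchanged.

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient from paired lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# HYPOTHETICAL values for illustration only, not the data from the question
w = [95.0, 102.0, 110.0, 118.0, 125.0, 131.0]
l = [1.2, 1.5, 1.4, 1.7, 1.6, 1.9]

# Dora's coding: s = w - 6, t = l - 20
s = [x - 6 for x in w]
t = [y - 20 for y in l]

r_wl = pmcc(w, l)
r_st = pmcc(s, t)
print(math.isclose(r_wl, r_st, rel_tol=1e-9))
```

A shift moves every value and the mean by the same amount, so each deviation from the mean, and therefore each of the \(S\) quantities, is unchanged.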
Dora calculates the equation of the regression line of \(t\) on \(s\) to be \(t = 0.00431 s - 18.87\).
- Find the equation of the regression line of \(l\) on \(w\) in the form \(l = a + b w\), giving the values of \(a\) and \(b\) to 3 significant figures.
- Use your equation to estimate the yield of leeks when \(w\) is 100 cm.
- Calculate the residual sum of squares.
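
The three calculations above can be sketched together. This is a sketch under stated assumptions: the coded line is decoded by substituting \(s = w - 6\) and \(t = l - 20\), and the residual sum of squares is taken as \(S_{ll} - S_{wl}^{2}/S_{ww}\), the standard result for a least squares line. Variable names are illustrative.

```python
# Summary statistics quoted in the question (n = 10)
n = 10
S_wl, S_ww = 42.786, 9936.9
sum_l2, sum_l = 26.2326, 16.06

# Dora's coded line: t = b_coded * s + a_coded
b_coded, a_coded = 0.00431, -18.87

# Substitute s = w - 6 and t = l - 20:
#   l - 20 = b_coded * (w - 6) + a_coded
#   l = (a_coded + 20 - 6 * b_coded) + b_coded * w
b = b_coded
a = a_coded + 20 - 6 * b_coded

# Estimated yield (tonnes per hectare) when w = 100 cm
l_at_100 = a + b * 100

# Residual sum of squares, assuming RSS = S_ll - S_wl^2 / S_ww
S_ll = sum_l2 - sum_l**2 / n
rss = S_ll - S_wl**2 / S_ww

print(f"a = {a:.3g}, b = {b:.3g}")
print(f"estimate at w = 100: {l_at_100:.2f}")
print(f"RSS = {rss:.3f}")
```

The estimate here uses the unrounded intercept; using \(a\) and \(b\) rounded to 3 significant figures changes it slightly in the third significant figure.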
The graph shows the residual for each value of \(l\).
\includegraphics[max width=\textwidth, alt={}, center]{7e46e14a-0f5a-4d02-8f00-a92bc4def6d7-08_716_1594_1594_239}

- State whether this graph suggests that the use of a linear regression model is suitable for these data. Give a reason for your answer.
- Other than collecting more data, suggest how to improve the fit of the model in part (c) to the data.