- Kwame is investigating a possible relationship between average March temperature, \(t ^ { \circ } \mathrm { C }\), and tea yield, \(y \mathrm {~kg/hectare}\), for tea grown in a particular location. He uses 30 years of past data to produce the following summary statistics for a linear regression model, with tea yield as the dependent variable.
$$\begin{aligned}
& \text { Residual Sum of Squares } ( \mathrm { RSS } ) = 1666567 \quad \mathrm {~S} _ { t t } = 52.0 \quad \mathrm {~S} _ { y y } = 1774155 \\
& \text { least squares regression line: } \quad \text { gradient } = 45.5 \quad y \text {-intercept } = 2080
\end{aligned}$$
- (a) Use the regression model to predict the tea yield for an average March temperature of \(20 ^ { \circ } \mathrm { C }\).
He also produces the following residual plot for the data.
\includegraphics[max width=\textwidth, alt={}, center]{d139840b-16ec-42ce-8501-f79c263c8017-02_663_880_868_589}
- (b) Explain what you understand by the term residual.
- (c) Calculate the product moment correlation coefficient between \(t\) and \(y\).
- (d) Explain why the linear model may not be a good fit for the data
  - (i) with reference to your answer to part (c)
  - (ii) with reference to the residual plot.
Kwame also collects data on total March rainfall, \(w \mathrm {~mm}\), for each of these 30 years. For a linear regression model of \(w\) on \(t\) the following summary statistic is found.
$$\text { Residual Sum of Squares } ( \mathrm { RSS } ) = 86754$$
Kwame concludes that, since this model has a smaller RSS, there must be a stronger linear relationship between \(w\) and \(t\) than between \(y\) and \(t\) (where \(\mathrm { RSS } = 1666567\)).
- (e) State, giving a reason, whether or not you agree with the reasoning that led to Kwame's conclusion.
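
A worked sketch of the quantities the parts above rely on. It assumes the standard identity for simple linear regression of \(y\) on \(t\), \(\mathrm{RSS} = S_{yy}(1 - r^2)\), with the sign of \(r\) taken from the sign of the gradient. The symbol \(\hat{y}\) for the predicted yield is notation introduced here, and the rounding is illustrative rather than prescribed by the question.

$$\begin{aligned}
& \hat{y} = 2080 + 45.5 \times 20 = 2990 \\
& r^2 = 1 - \frac{\mathrm{RSS}}{S_{yy}} = 1 - \frac{1666567}{1774155} \approx 0.0606 \\
& r \approx +0.246 \quad \text{(positive, since the gradient is positive)}
\end{aligned}$$

The same identity applies to the regression of \(w\) on \(t\) with \(S_{ww}\) in place of \(S_{yy}\). RSS is measured in the squared units of the dependent variable, so an RSS value is only meaningful relative to the corresponding total sum of squares, and \(S_{ww}\) is not given for the rainfall model.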