Linear regression

212 questions · 28 question types identified

Calculate y on x from raw data table

Questions that provide raw bivariate data in a table and ask to find the regression line of y on x.

66
31.1% of questions
5. The weight, \(w\) grams, and the length, \(l \mathrm {~mm}\), of 10 randomly selected newborn turtles are given in the table below.
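$$\begin{array} { c | c c c c c c c c c c }
l & 49.0 & 52.0 & 53.0 & 54.5 & 54.1 & 53.4 & 50.0 & 51.6 & 49.5 & 51.2 \\
w & 29 & 32 & 34 & 39 & 38 & 35 & 30 & 31 & 29 & 30
\end{array}$$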
$$\text { (You may use } \mathrm { S } _ { l l } = 33.381 \quad \mathrm {~S} _ { w l } = 59.99 \quad \mathrm {~S} _ { w w } = 120.1 \text { ) }$$
  1. Find the equation of the regression line of \(w\) on \(l\) in the form \(w = a + b l\).
  2. Use your regression line to estimate the weight of a newborn turtle of length 60 mm.
  3. Comment on the reliability of your estimate giving a reason for your answer.
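As a minimal sketch of this question type, the following Python computes the regression line of \(w\) on \(l\) directly from the tabulated turtle data. The helper name `regression_line` and the variable names are illustrative assumptions, not part of the question. The computed \(S_{ll}\) and \(S_{wl}\) can be compared with the values the question supplies (33.381 and 59.99).

```python
# Raw data copied from the table in the example question.
lengths = [49.0, 52.0, 53.0, 54.5, 54.1, 53.4, 50.0, 51.6, 49.5, 51.2]  # l, in mm
weights = [29, 32, 34, 39, 38, 35, 30, 31, 29, 30]                      # w, in grams


def regression_line(x, y):
    """Return (a, b, S_xx, S_xy) for the least squares line y = a + b x.

    Hypothetical helper written for this illustration only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = s_xy / s_xx           # gradient
    a = mean_y - b * mean_x   # intercept: the line passes through the mean point
    return a, b, s_xx, s_xy


a, b, s_ll, s_wl = regression_line(lengths, weights)
print(f"S_ll = {s_ll:.3f}, S_wl = {s_wl:.2f}")
print(f"w = {a:.2f} + {b:.3f} l")
```

The fitted line comes out at roughly \(w = -60.4 + 1.80 l\). A length of 60 mm lies outside the observed range of 49.0 to 54.5 mm, so any estimate there is an extrapolation.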
Convert regression equation between coded and original

Questions that require finding the regression equation in coded variables and then converting it to original variables, or vice versa, using the coding transformations.

13
6.1% of questions
3. Twenty pairs of observations are made of two variables \(x\) and \(y\), which are believed to be related. It is found that $$\sum x = 200 , \quad \sum y = 174 , \quad \sum x ^ { 2 } = 6201 , \quad \sum y ^ { 2 } = 5102 , \quad \sum x y = 5200 .$$ Find
  1. the product-moment correlation coefficient between \(x\) and \(y\),
  2. the equation of the regression line of \(y\) on \(x\).
Given that \(p = x + 30\) and \(q = y + 50\),
  3. find the equation of the regression line of \(q\) on \(p\), in the form \(q = m p + c\).
  4. Estimate the value of \(q\) when \(p = 46\), stating any assumptions you make.
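A possible sketch of the coded-to-original conversion for this example, assuming the standard summary-statistic formulas for \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\). Variable names are illustrative. Because \(p\) and \(q\) are pure shifts of \(x\) and \(y\), the gradient is unchanged and only the intercept moves.

```python
import math

# Summary statistics quoted in the example question.
n = 20
sum_x, sum_y = 200, 174
sum_x2, sum_y2, sum_xy = 6201, 5102, 5200

s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

r = s_xy / math.sqrt(s_xx * s_yy)   # product-moment correlation coefficient
b = s_xy / s_xx                     # gradient of y on x
a = sum_y / n - b * sum_x / n       # intercept of y on x

# Coding from the question: p = x + 30 and q = y + 50.
# Substituting x = p - 30 and y = q - 50 into y = a + b x gives
# q = b p + (a + 50 - 30 b).
m = b
c = a + 50 - 30 * b

print(f"r = {r:.4f}")
print(f"y = {a:.4f} + {b:.4f} x")
print(f"q = {m:.4f} p + {c:.4f}")
```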
Assess validity of predictions

A question is this type if and only if it asks whether a regression line provides reliable estimates, whether extrapolation is appropriate, or to comment on the validity of using the model for prediction.

11
5.2% of questions
1 A set of bivariate data ( \(X , Y\) ) is summarised as follows.
\(n = 25 , \sum x = 9.975 , \sum y = 11.175 , \sum x ^ { 2 } = 5.725 , \sum y ^ { 2 } = 46.200 , \sum x y = 11.575\)
  1. Calculate the value of Pearson's product-moment correlation coefficient.
  2. Calculate the equation of the regression line of \(y\) on \(x\).
It is desired to know whether the regression line of \(y\) on \(x\) will provide a reliable estimate of \(y\) when \(x = 0.75\).
  3. State one reason for believing that the estimate will be reliable.
  4. State what further information is needed in order to determine whether the estimate is reliable.
Calculate y on x from summary statistics

Questions that provide summary statistics (sums, means, variances, Sxx, Sxy, etc.) and ask to find the regression line of y on x.

11
5.2% of questions
10 The means and variances for a random sample of 8 pairs of values of \(x\) and \(y\) taken from a bivariate distribution are given in the following table.
$$\begin{array} { c | c c }
 & \text { Mean } & \text { Variance } \\
x & 3.3125 & 3.3086 \\
y & 6.7375 & 7.9473
\end{array}$$
The product moment correlation coefficient for the sample is 0.5815, correct to 4 decimal places.
  1. Find the equation of the regression line of \(y\) on \(x\).
  2. Test at the \(5 \%\) significance level whether there is evidence of positive correlation between \(x\) and \(y\).
  3. Calculate an estimate of \(y\) when \(x = 6.0\) and comment on the reliability of your estimate.
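When only means, variances and the correlation coefficient are given, the gradient can be recovered from the identity \(b = r \sqrt{\operatorname{Var}(y) / \operatorname{Var}(x)}\). The sketch below applies it to the figures in this example. It assumes both variances use the same divisor, so their ratio equals \(S_{yy} / S_{xx}\); variable names are illustrative.

```python
import math

# Values quoted in the example question.
mean_x, var_x = 3.3125, 3.3086
mean_y, var_y = 6.7375, 7.9473
r = 0.5815

# b = S_xy / S_xx = r * sqrt(S_yy / S_xx); the divisor cancels in the ratio.
b = r * math.sqrt(var_y / var_x)
a = mean_y - b * mean_x        # the line passes through (mean_x, mean_y)

estimate = a + b * 6.0         # estimate of y when x = 6.0
print(f"y = {a:.3f} + {b:.3f} x")
print(f"estimate at x = 6.0: {estimate:.2f}")
```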
Interpret regression line parameters

A question is this type if and only if it asks to interpret the meaning of the gradient, intercept, or other feature of a regression line in context.

9
4.2% of questions
1. The relationship between two variables \(p\) and \(t\) is modelled by the regression line with equation
$$p = 22 - 1.1 t$$
The model is based on observations of the independent variable, \(t\), between 1 and 10.
  1. Describe the correlation between \(p\) and \(t\) implied by this model.
Given that \(p\) is measured in centimetres and \(t\) is measured in days,
  2. state the units of the gradient of the regression line.
Using the model,
  3. calculate the change in \(p\) over a 3-day period.
Tisam uses this model to estimate the value of \(p\) when \(t = 19\).
  4. Comment, giving a reason, on the reliability of this estimate.
Relate two regression lines

A question is this type if and only if it asks to find the correlation coefficient or other relationship given both regression line equations (y on x and x on y).

9
4.2% of questions
7 For a random sample of 10 observations of pairs of values \(( x , y )\), the equation of the regression line of \(y\) on \(x\) is \(y = 3.25 x - 4.27\). The sum of the ten \(x\) values is 15.6 and the product moment correlation coefficient for the sample is 0.56. Find the equation of the regression line of \(x\) on \(y\). Test, at the \(5 \%\) significance level, whether there is evidence of non-zero correlation between the variables.
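The link between the two regression lines is that the product of their gradients equals \(r^2\), and both lines pass through the mean point. A rough sketch for the first part of this example, with illustrative variable names:

```python
# Values quoted in the example question.
n = 10
b_yx, a_yx = 3.25, -4.27   # y on x: y = a_yx + b_yx * x
sum_x = 15.6
r = 0.56

mean_x = sum_x / n
mean_y = a_yx + b_yx * mean_x   # the y on x line passes through the mean point

# Gradient of x on y, from b_yx * b_xy = r^2.
b_xy = r ** 2 / b_yx
a_xy = mean_x - b_xy * mean_y   # the x on y line also passes through the mean point

print(f"mean point: ({mean_x:.2f}, {mean_y:.2f})")
print(f"x = {a_xy:.3f} + {b_xy:.4f} y")
```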
Identify response/explanatory variables

A question is this type if and only if it asks to identify which variable is the independent/explanatory/controlled variable and which is the dependent/response variable.

9
4.2% of questions
4. Crickets make a noise. The pitch, \(v \mathrm { kHz }\), of the noise made by a cricket was recorded at 15 different temperatures, \(t ^ { \circ } \mathrm { C }\). These data are summarised below. $$\sum t ^ { 2 } = 10922.81 , \sum v ^ { 2 } = 42.3356 , \sum t v = 677.971 , \sum t = 401.3 , \sum v = 25.08$$
  1. Find \(S _ { t t } , S _ { v v }\) and \(S _ { t v }\) for these data.
  2. Find the product moment correlation coefficient between \(t\) and \(v\).
  3. State, with a reason, which variable is the explanatory variable.
  4. Give a reason to support fitting a regression model of the form \(v = a + b t\) to these data.
  5. Find the value of \(a\) and the value of \(b\). Give your answers to 3 significant figures.
  6. Using this model, predict the pitch of the noise at \(19 ^ { \circ } \mathrm { C }\).
Calculate PMCC from summary statistics

Questions that provide summary statistics (such as Sxx, Syy, Sxy, sums of x, y, x², y², xy) and require calculating the product moment correlation coefficient using these given values.

9
4.2% of questions
3 A firm wishes to assess whether there is a linear relationship between the annual amount spent on advertising, \(\pounds x\) thousand, and the annual profit, \(\pounds y\) thousand. A summary of the figures for 12 years is as follows. $$n = 12 \quad \Sigma x = 86.6 \quad \Sigma y = 943.8 \quad \Sigma x ^ { 2 } = 658.76 \quad \Sigma y ^ { 2 } = 83663.00 \quad \Sigma x y = 7351.12$$
  1. Calculate the product moment correlation coefficient, showing that it is greater than 0.9.
  2. Comment briefly on this value in this context.
  3. A manager claims that this result shows that spending more money on advertising in the future will result in greater profits. Make two criticisms of this claim.
  4. Calculate the equation of the regression line of \(y\) on \(x\).
  5. Estimate the annual profit during a year when \(\pounds 7400\) was spent on advertising.
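A short sketch of the PMCC calculation from the sums in this example, using \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\). Variable names are illustrative.

```python
import math

# Summary statistics quoted in the example question.
n = 12
sum_x, sum_y = 86.6, 943.8
sum_x2, sum_y2, sum_xy = 658.76, 83663.00, 7351.12

s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

r = s_xy / math.sqrt(s_xx * s_yy)
print(f"S_xx = {s_xx:.4f}, S_yy = {s_yy:.2f}, S_xy = {s_xy:.2f}")
print(f"r = {r:.4f}")
```

The result is consistent with the "greater than 0.9" claim in part 1.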
Calculate regression line then predict

A question is this sub-type if and only if the student must first calculate the regression line equation from summary statistics (using formulas for gradient and intercept) before making a prediction.

9
4.2% of questions
2. An experiment carried out by a student yielded pairs of \(( x , y )\) observations such that $$\bar { x } = 36 , \quad \bar { y } = 28.6 , \quad S _ { x x } = 4402 , \quad S _ { x y } = 3477.6$$
  1. Calculate the equation of the regression line of \(y\) on \(x\) in the form \(y = a + b x\). Give your values of \(a\) and \(b\) to 2 decimal places.
  2. Find the value of \(y\) when \(x = 45\).
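The two steps of this sub-type (fit, then predict) can be sketched as follows, using \(b = S_{xy} / S_{xx}\) and \(a = \bar{y} - b \bar{x}\). The function name `predict` is a hypothetical helper for illustration.

```python
# Values quoted in the example question.
mean_x, mean_y = 36, 28.6
s_xx, s_xy = 4402, 3477.6

b = s_xy / s_xx          # gradient
a = mean_y - b * mean_x  # intercept


def predict(x):
    """Evaluate the fitted line y = a + b x (illustrative helper)."""
    return a + b * x


print(f"y = {a:.2f} + {b:.2f} x")
print(f"y at x = 45: {predict(45):.2f}")
```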
Find unknown values from regression

A question is this type if and only if it requires finding unknown data values given the regression line equation and some of the data points.

8
3.8% of questions
9 A random sample of five pairs of values of \(x\) and \(y\) is taken from a bivariate distribution. The values are shown in the following table, where \(p\) and \(q\) are constants.
$$\begin{array} { c | c c c c c }
x & 1 & 2 & 3 & 4 & 5 \\
y & 4 & p & q & 2 & 1
\end{array}$$
The equation of the regression line of \(y\) on \(x\) is \(y = - 0.5 x + 3.5\).
  1. Find the values of \(p\) and \(q\).
  2. Find the value of the product moment correlation coefficient.
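One way to approach part 1 is to use two facts about the regression line: it passes through \((\bar{x}, \bar{y})\), and its gradient equals \(S_{xy} / S_{xx}\). Each fact gives a linear equation in \(p\) and \(q\). The sketch below sets these up and solves them; it is an illustration under those assumptions, with made-up variable names.

```python
# Known information from the example question.
x = [1, 2, 3, 4, 5]
a, b = 3.5, -0.5                 # given line y = a + b x
n = len(x)

mean_x = sum(x) / n
mean_y = a + b * mean_x          # line passes through the mean point
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = b * s_xx                  # because b = S_xy / S_xx

# The known y values are 4, 2, 1 at x = 1, 4, 5; p and q sit at x = 2 and x = 3.
known_sum_y = 4 + 2 + 1
known_sum_xy = 1 * 4 + 4 * 2 + 5 * 1

# Equation 1 (from the mean of y):  p + q = rhs1
rhs1 = n * mean_y - known_sum_y
# Equation 2 (from S_xy = sum(xy) - n * mean_x * mean_y):  2p + 3q = rhs2
rhs2 = s_xy + n * mean_x * mean_y - known_sum_xy

q = rhs2 - 2 * rhs1
p = rhs1 - q
print(f"p = {p}, q = {q}")
```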
Interpret features of scatter diagram

A question is this sub-type if and only if it provides a scatter diagram and requires interpretation of its features such as correlation strength, outliers, or relationship patterns without requiring drawing.

8
3.8% of questions
2 A road transport researcher is investigating the link between the age of a person, \(a\) years, and the distance, \(d\) metres, at which the person can read a large road sign. The researcher selects 13 individuals of different ages between 20 and 80 and measures the value of \(d\) for each of them. The spreadsheet below shows the data which the researcher obtained, together with a scatter diagram which illustrates the data.
\includegraphics[max width=\textwidth, alt={}, center]{691e8b55-e9a1-4fff-b9ee-a71ff1f73ead-3_725_1566_495_251}
  1. Explain which of the two variables \(a\) and \(d\) is the independent variable.
  2. Find the equation of the regression line of \(d\) on \(a\).
  3. Use the regression line to predict the average distance at which a 60-year-old person can read the road sign.
  4. Explain why it might not be sensible to use the regression line to predict the average distance at which a 5-year-old child can read the road sign.
  5. Determine the value of the residual for \(a = 40\).
  6. Explain why it would not be useful to find the equation of the regression line of \(a\) on \(d\).
Linearize non-linear relationships

A question is this type if and only if it involves transforming a non-linear relationship (e.g., y = Ca^x) into linear form by taking logarithms or other transformations to enable linear regression.

6
2.8% of questions
2 Two variable quantities \(x\) and \(y\) are believed to satisfy an equation of the form \(y = C \left( a ^ { x } \right)\), where \(C\) and \(a\) are constants. An experiment produced four pairs of values of \(x\) and \(y\). The table below gives the corresponding values of \(x\) and \(\ln y\).
$$\begin{array} { c | c c c c }
x & 0.9 & 1.6 & 2.4 & 3.2 \\
\ln y & 1.7 & 1.9 & 2.3 & 2.6
\end{array}$$
By plotting \(\ln y\) against \(x\) for these four pairs of values and drawing a suitable straight line, estimate the values of \(C\) and \(a\). Give your answers correct to 2 significant figures.
\includegraphics[max width=\textwidth, alt={}, center]{21878d10-7f16-4dbb-86ef-65a7ba5eeafb-03_759_944_749_596}
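Taking logarithms of \(y = C a^{x}\) gives \(\ln y = \ln C + x \ln a\), so the intercept of the straight line estimates \(\ln C\) and the gradient estimates \(\ln a\). The question asks for a line drawn by eye; the sketch below swaps in a least squares fit as a stand-in, so its values are only a guide to what a hand-drawn line should give. Names such as `C_est` and `a_est` are illustrative.

```python
import math

# Tabulated values from the example question.
x = [0.9, 1.6, 2.4, 3.2]
ln_y = [1.7, 1.9, 2.3, 2.6]

n = len(x)
mean_x = sum(x) / n
mean_ln_y = sum(ln_y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_ln_y) for xi, yi in zip(x, ln_y))

gradient = s_xy / s_xx                      # estimates ln a
intercept = mean_ln_y - gradient * mean_x   # estimates ln C

a_est = math.exp(gradient)
C_est = math.exp(intercept)
print(f"C is about {C_est:.2g}, a is about {a_est:.2g}")
```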
Calculate PMCC from raw data

Questions that provide raw bivariate data in a table and require calculating the product moment correlation coefficient directly from the individual data values.

6
2.8% of questions
3 An investor obtains data about the profits of 8 randomly chosen investment accounts over two one-year periods. The profit in the first year for each account is \(p \%\) and the profit in the second year for each account is \(q \%\). The results are shown in the table and in the scatter diagram.
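$$\begin{array} { c | c c c c c c c c }
\text { Account } & \mathrm { A } & \mathrm { B } & \mathrm { C } & \mathrm { D } & \mathrm { E } & \mathrm { F } & \mathrm { G } & \mathrm { H } \\
p & 1.6 & 2.1 & 2.4 & 2.7 & 2.8 & 3.3 & 5.2 & 8.4 \\
q & 1.6 & 2.3 & 2.2 & 2.2 & 3.1 & 2.9 & 7.6 & 4.8
\end{array}$$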
\(n = 8 \quad \sum \mathrm { p } = 28.5 \quad \sum \mathrm { q } = 26.7 \quad \sum \mathrm { p } ^ { 2 } = 136.35 \quad \sum \mathrm { q } ^ { 2 } = 116.35 \quad \sum \mathrm { pq } = 116.70\)
\includegraphics[max width=\textwidth, alt={}, center]{bf1468d1-e02e-47d2-bf41-5bc8f5b4d7c4-3_782_1280_998_242}
  1. State which, if either, of the variables \(p\) and \(q\) is independent.
  2. Calculate the equation of the regression line of \(q\) on \(p\).
    1. Use the regression line to estimate the value of \(q\) for an investment account for which \(p = 2.5\).
    2. Give two reasons why this estimate could be considered reliable.
  3. Comment on the reliability of using the regression line to predict the value of \(q\) when \(p = 7.0\).
Calculate from summary statistics

A question is this sub-type if and only if it provides summary statistics (such as Σx, Σy, Σx², Σy², Σxy, n) and asks to calculate Sxx, Syy, or Sxy using the standard formulas.

6
2.8% of questions
2. Paul believes there is a relationship between the value and the floor size of a house. He takes a random sample of 20 houses and records the value, \(\pounds v\), and the floor size, \(s \mathrm {~m} ^ { 2 }\). The data were coded using \(x = \frac { s - 50 } { 10 }\) and \(y = \frac { v } { 100000 }\) and the following statistics obtained. $$\sum x = 441.5 , \quad \sum y = 59.8 , \quad \sum x ^ { 2 } = 11261.25 , \quad \sum y ^ { 2 } = 196.66 , \quad \sum x y = 1474.1$$
  1. Find the value of \(S _ { x y }\) and the value of \(S _ { x x }\).
  2. Find the equation of the least squares regression line of \(y\) on \(x\) in the form \(y = a + b x\).
The least squares regression line of \(v\) on \(s\) is \(v = c + d s\).
  3. Show that \(d = 1020\) to 3 significant figures and find the value of \(c\).
  4. Estimate the value of a house of floor size \(130 \mathrm {~m} ^ { 2 }\).
  5. Interpret the value \(d\).
Paul wants to increase the value of his house. He decides to add an extension to increase the floor size by \(31 \mathrm {~m} ^ { 2 }\).
  6. Estimate the increase in the value of Paul's house after adding the extension.
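The standard formulas referred to in this sub-type are \(S_{xx} = \sum x^2 - (\sum x)^2 / n\) and \(S_{xy} = \sum xy - \sum x \sum y / n\). A minimal sketch applying them to the house data, then undoing the coding to check the stated value of \(d\); variable names are illustrative.

```python
# Summary statistics (coded variables) quoted in the example question.
n = 20
sum_x, sum_y = 441.5, 59.8
sum_x2, sum_xy = 11261.25, 1474.1

s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n

b = s_xy / s_xx                   # gradient of y on x (coded)
a = sum_y / n - b * sum_x / n     # intercept of y on x (coded)

# Undo the coding x = (s - 50) / 10, y = v / 100000:
# v = 100000 * (a + b * (s - 50) / 10) = c + d * s
d = 100000 * b / 10
c = 100000 * (a - 5 * b)

print(f"S_xy = {s_xy:.3f}, S_xx = {s_xx:.4f}")
print(f"v = {c:.0f} + {d:.0f} s")
```

Rounded to 3 significant figures, `d` agrees with the value of 1020 that part 3 asks the student to show.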
Explain least squares concept

A question is this type if and only if it asks to explain what is meant by 'least squares' in the context of regression, typically requiring reference to minimizing sum of squared residuals.

5
2.4% of questions
3
  1. Using the scatter diagram in the Printed Answer Booklet, explain what is meant by least squares in the context of a regression line of \(y\) on \(x\).
  2. A set of bivariate data \(( t , u )\) is summarised as follows.
    \(n = 5 \quad \sum t = 35 \quad \sum u = 54\)
    \(\sum t ^ { 2 } = 285 \quad \sum u ^ { 2 } = 758 \quad \sum \mathrm { tu } = 460\)
    1. Calculate the equation of the regression line of \(u\) on \(t\).
    2. The variables \(t\) and \(u\) are now scaled using the following scaling.
      \(\mathrm { v } = 2 \mathrm { t } , \mathrm { w } = \mathrm { u } + 4\)
      Find the equation of the regression line of \(w\) on \(v\), giving your equation in the form \(w = f ( v )\).
Hypothesis test for zero correlation

Questions that require testing whether the population correlation coefficient is zero (or equivalently, whether there is significant correlation) using the product moment correlation coefficient and t-distribution or critical value tables.

5
2.4% of questions
10 A random sample of 5 pairs of values \(( x , y )\) is given in the following table.
$$\begin{array} { c | c c c c c }
x & 1 & 2 & 4 & 5 & 8 \\
y & 7 & 5 & 8 & 6 & 4
\end{array}$$
  1. Find, showing all necessary working, the equation of the regression line of \(y\) on \(x\).
  2. Find, showing all necessary working, the value of the product moment correlation coefficient for this sample.
  3. Test, at the \(10 \%\) significance level, whether there is evidence of non-zero correlation between the variables.
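A sketch of the t-distribution route mentioned in the description, applied to this sample. It uses the statistic \(t = r \sqrt{(n - 2) / (1 - r^2)}\) on \(n - 2\) degrees of freedom. The critical value 2.353 (two-tailed 10% level, 3 degrees of freedom) is typed in from standard t tables as an assumption; an exam solution might instead compare \(r\) with a tabulated PMCC critical value.

```python
import math

# Data from the table in the example question.
x = [1, 2, 4, 5, 8]
y = [7, 5, 8, 6, 4]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b = s_xy / s_xx
a = mean_y - b * mean_x
r = s_xy / math.sqrt(s_xx * s_yy)

# Test H0: rho = 0 against H1: rho != 0 at the 10% level.
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))
T_CRIT = 2.353                      # assumed table value: t(3), 5% in each tail
reject_h0 = abs(t_stat) > T_CRIT

print(f"y = {a:.1f} + ({b:.1f}) x, r = {r:.4f}")
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```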
Calculate from raw data

A question is this sub-type if and only if it provides raw data values and asks to calculate Sxx, Syy, or Sxy directly from those values.

4
1.9% of questions
1. The percentage oil content, \(p\), and the weight, \(w\) milligrams, of each of 10 randomly selected sunflower seeds were recorded. These data are summarised below.
$$\sum w ^ { 2 } = 41252 \quad \sum w p = 27557.8 \quad \sum w = 640 \quad \sum p = 431 \quad \mathrm {~S} _ { p p } = 2.72$$
  1. Find the value of \(\mathrm { S } _ { w w }\) and the value of \(\mathrm { S } _ { w p }\).
  2. Calculate the product moment correlation coefficient between \(p\) and \(w\).
  3. Give an interpretation of your product moment correlation coefficient.
The equation of the regression line of \(p\) on \(w\) is given in the form \(p = a + b w\).
  4. Find the equation of the regression line of \(p\) on \(w\).
  5. Hence estimate the percentage oil content of a sunflower seed which weighs 60 milligrams.
Find means from regression lines

A question is this type if and only if it asks to find the mean values of x and y given the equations of both regression lines (using the fact that both pass through the mean point).

3
1.4% of questions
8 The equations of the regression lines for a random sample of 25 pairs of data \(( x , y )\) from a bivariate population are $$\begin{array} { c c } y \text { on } x : & y = 1.28 - 0.425 x , \\
x \text { on } y : & x = 1.05 - 0.516 y . \end{array}$$
  1. Find the sample means, \(\bar { x }\) and \(\bar { y }\).
  2. Find the product moment correlation coefficient for the sample.
  3. Test at the \(5 \%\) significance level whether the population correlation coefficient differs from zero.
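Since both regression lines pass through \((\bar{x}, \bar{y})\), the means are the solution of the two equations taken simultaneously. The sketch below solves them by substitution and also recovers \(r\) from \(r^2 = b_{yx} b_{xy}\), taking the sign of the gradients. Variable names are illustrative.

```python
import math

# Coefficients quoted in the example question.
a1, b1 = 1.28, -0.425   # y on x: y = a1 + b1 * x
a2, b2 = 1.05, -0.516   # x on y: x = a2 + b2 * y

# Substitute y = a1 + b1 * x into x = a2 + b2 * y and solve for x.
mean_x = (a2 + b2 * a1) / (1 - b1 * b2)
mean_y = a1 + b1 * mean_x

# r^2 is the product of the gradients; r carries their common sign.
r = math.copysign(math.sqrt(b1 * b2), b1)

print(f"mean x = {mean_x:.3f}, mean y = {mean_y:.3f}, r = {r:.3f}")
```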
Interpret correlation strength/direction

A question is this type if and only if it asks to describe, interpret, or comment on the type, strength, or direction of correlation from a given correlation coefficient or scatter diagram.

3
1.4% of questions
1 For each of the last five years the number of tourists, \(x\) thousands, visiting Sackton, and the average weekly sales, \(\pounds y\) thousands, in Sackton Stores were noted. The table shows the results.
$$\begin{array} { c | c c c c c }
\text { Year } & 2007 & 2008 & 2009 & 2010 & 2011 \\
x & 250 & 270 & 264 & 290 & 292 \\
y & 4.2 & 3.7 & 3.2 & 3.5 & 3.0
\end{array}$$
  1. Calculate the product moment correlation coefficient \(r\) between \(x\) and \(y\).
  2. It is required to estimate the average weekly sales at Sackton Stores in a year when the number of tourists is 280000. Calculate the equation of an appropriate regression line, and use it to find this estimate.
  3. Over a longer period the value of \(r\) is \(-0.8\). The mayor says, "This shows that having more tourists causes sales at Sackton Stores to decrease." Give a reason why this statement is not correct.
Calculate x on y regression line

Questions that ask to find the regression line of x on y (the reverse regression), either from summary statistics or raw data.

3
1.4% of questions
4 The table shows the load a lorry was carrying, \(x\) tonnes, and the fuel economy, \(y \mathrm {~km}\) per litre, for 8 different journeys. You should assume that neither variable is controlled.
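$$\begin{array} { l | c c c c c c c c }
\text { Load } ( x \text { tonnes} ) & 5.1 & 5.8 & 6.5 & 7.1 & 7.6 & 8.4 & 9.5 & 10.5 \\
\text { Fuel economy } ( y \mathrm {~km} \text { per litre} ) & 6.2 & 6.1 & 5.9 & 5.6 & 5.3 & 5.4 & 5.3 & 5.1
\end{array}$$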
$$n = 8 \quad \sum x = 60.5 \quad \sum y = 44.9 \quad \sum x ^ { 2 } = 481.13 \quad \sum y ^ { 2 } = 253.17 \quad \sum x y = 334.65$$
  1. Calculate the equation of the regression line of \(y\) on \(x\).
  2. Estimate the fuel economy for a load of 9.2 tonnes.
  3. An analyst calculated the equation of the regression line of \(x\) on \(y\). Without calculating this equation, state the coordinates of the point where the two regression lines intersect.
  4. Describe briefly the method required to estimate the load when the fuel economy is 5.8 km per litre.
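A sketch contrasting the two regression lines for the lorry data. The \(y\) on \(x\) line uses \(S_{xy} / S_{xx}\) and the \(x\) on \(y\) line uses \(S_{xy} / S_{yy}\); both pass through \((\bar{x}, \bar{y})\), which is where they intersect. Because neither variable is controlled, the \(x\) on \(y\) line is the one used here to estimate a load from a fuel economy. Variable names are illustrative.

```python
# Summary statistics quoted in the example question.
n = 8
sum_x, sum_y = 60.5, 44.9
sum_x2, sum_y2, sum_xy = 481.13, 253.17, 334.65

mean_x, mean_y = sum_x / n, sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Regression line of y on x: y = a_yx + b_yx * x
b_yx = s_xy / s_xx
a_yx = mean_y - b_yx * mean_x

# Regression line of x on y: x = a_xy + b_xy * y
b_xy = s_xy / s_yy
a_xy = mean_x - b_xy * mean_y

fuel_at_9_2 = a_yx + b_yx * 9.2   # fuel economy for a load of 9.2 tonnes
load_at_5_8 = a_xy + b_xy * 5.8   # load for a fuel economy of 5.8 km per litre

print(f"y on x: y = {a_yx:.3f} + ({b_yx:.4f}) x")
print(f"x on y: x = {a_xy:.3f} + ({b_xy:.4f}) y")
print(f"lines intersect at ({mean_x}, {mean_y})")
```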
Direct prediction from given regression line

A question is this sub-type if and only if the regression line equation is already provided in the question and the task is simply to substitute a value to make a prediction.

2
0.9% of questions
2 A student is investigating the link between temperature and electricity consumption in the winter months. The student finds the average minimum temperature, \(x ^ { \circ } \mathrm { C }\), from across the country on a day. The student then finds the total electricity consumption for that day, \(y \mathrm { GWh }\). The scatter diagram below shows the values of \(x\) and \(y\) obtained from a random sample of 10 winter days. It also shows the equation of the regression line of \(y\) on \(x\) and the value of \(r ^ { 2 }\), where \(r\) is the product moment correlation coefficient.
\includegraphics[max width=\textwidth, alt={}, center]{c692fb20-436f-4bc1-89bd-10fdba41ceba-03_776_1043_609_244}
  1. Use the regression line to estimate the electricity consumption at each of the following average minimum temperatures.
    • \(5 ^ { \circ } \mathrm { C }\)
    • \(- 4 ^ { \circ } \mathrm { C }\)
    • Comment on the reliability of your estimates.
Draw scatter diagram from data

A question is this sub-type if and only if it explicitly requires the student to draw or plot a scatter diagram from given data values.

2
0.9% of questions
3. A manufacturer stores drums of chemicals. During storage, evaporation takes place. A random sample of 10 drums was taken and the time in storage, \(x\) weeks, and the evaporation loss, \(y \mathrm { ml }\), are shown in the table below.
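$$\begin{array} { c | c c c c c c c c c c }
x & 3 & 5 & 6 & 8 & 10 & 12 & 13 & 15 & 16 & 18 \\
y & 36 & 50 & 53 & 61 & 69 & 79 & 82 & 90 & 88 & 96
\end{array}$$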
  1. On graph paper, draw a scatter diagram to represent these data.
  2. Give a reason to support fitting a regression model of the form \(y = a + b x\) to these data.
  3. Find, to 2 decimal places, the value of \(a\) and the value of \(b\). $$\text { (You may use } \Sigma x ^ { 2 } = 1352 , \Sigma y ^ { 2 } = 53112 \text { and } \Sigma x y = 8354 \text {.) }$$
  4. Give an interpretation of the value of \(b\).
  5. Using your model, predict the amount of evaporation that would take place after
    1. 19 weeks,
    2. 35 weeks.
  6. Comment, with a reason, on the reliability of each of your predictions.
Calculate variance from summations

A question is this type if and only if it asks to calculate the variance of x or y from given summary statistics like Σx, Σx², and n.

1
0.5% of questions
1. A company wants to pay its employees according to their performance at work. Last year's performance score \(x\) and annual salary \(y\), in thousands of dollars, were recorded for a random sample of 10 employees of the company.
The performance scores were $$\begin{array} { l l l l l l l l l l } 15 & 24 & 32 & 39 & 41 & 18 & 16 & 22 & 34 & 42 \end{array}$$ (You may use \(\sum x ^ { 2 } = 9011\).)
  1. Find the mean and the variance of these performance scores.
The corresponding \(y\) values for these 10 employees are summarised by $$\sum y = 306.1 \quad \text { and } \quad \mathrm { S } _ { y y } = 546.3$$
  2. Find the mean and the variance of these \(y\) values.
The regression line of \(y\) on \(x\) based on this sample is $$y = 12.0 + 0.659 x$$
  3. Find the product moment correlation coefficient for these data.
  4. State, giving a reason, whether or not the value of the product moment correlation coefficient supports the use of a regression line to model the relationship between performance score and annual salary.
The company decides to use this regression model to determine future salaries.
  5. Find the proposed annual salary, in dollars, for an employee who has a performance score of 35.
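A minimal sketch of parts 1 and 2, assuming the variance is taken with divisor \(n\), that is \(\sum x^2 / n - \bar{x}^2\) (equivalently \(S_{xx} / n\)). If a syllabus uses divisor \(n - 1\) instead, the last step changes accordingly. Variable names are illustrative.

```python
# Performance scores listed in the example question.
scores = [15, 24, 32, 39, 41, 18, 16, 22, 34, 42]

n = len(scores)
sum_x = sum(scores)
sum_x2 = sum(s * s for s in scores)   # the question quotes this as 9011

mean_x = sum_x / n
var_x = sum_x2 / n - mean_x ** 2      # assumed divisor n

# The y values are given only through their sum and S_yy.
sum_y, s_yy = 306.1, 546.3
mean_y = sum_y / n
var_y = s_yy / n                      # same divisor-n assumption

print(f"x: mean = {mean_x}, variance = {var_x:.2f}")
print(f"y: mean = {mean_y:.2f}, variance = {var_y:.2f}")
```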
Minimize sum of squared residuals

A question is this type if and only if it involves algebraically minimizing an expression for the sum of squared residuals to derive regression line parameters.

1
0.5% of questions
7 The coordinates of a set of 10 points are denoted by \(( x _ { i } , y _ { i } )\) for \(i = 1,2 , \ldots , 10\). For a particular set of values of \(( x _ { i } , y _ { i } )\) and any constants \(a\) and \(b\) it can be shown that
$$\sum \left( y _ { i } - a - b x _ { i } \right) ^ { 2 } = 10 ( 11 - a - 6 b ) ^ { 2 } + 126 \left( b - \frac { 83 } { 42 } \right) ^ { 2 } + \frac { 139 } { 14 } .$$
  1.
    1. Explain why \(\sum \left( y _ { i } - a - b x _ { i } \right) ^ { 2 }\) is minimised by taking \(b = \frac { 83 } { 42 }\) and \(a = 11 - 6 b\).
    2. Hence explain why the equation of the regression line of \(y\) on \(x\) for these points is given by the corresponding values of \(a\) and \(b\) (so that the equation is \(y = \frac { 83 } { 42 } x - \frac { 6 } { 7 }\)).
  2. State which of the following terms cannot apply to the variable \(X\) if the regression line of \(y\) on \(x\) can be used for estimating values of \(Y\): Dependent, Independent, Controlled, Response.
  3. Use the regression line to estimate the value of \(y\) corresponding to \(x = 8\).
  4. State what must be true of the value \(x = 8\) if the estimate in part (c) is to be reliable.
  5. Variables \(u\) and \(v\) are related to \(x\) and \(y\) by the following relationships.
    \(u = 2 + 4 x \quad v = 8 - 2 y\)
    Show that the gradient of the regression line of \(v\) on \(u\) is very close to \(-1\).
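The expression in this example is a sum of two squares plus a constant, so it is smallest when both squares are zero. The sketch below checks this numerically with exact fractions and also evaluates the gradient for the final part, using the fact that under \(u = 2 + 4x\), \(v = 8 - 2y\) the gradient is multiplied by \(-2/4\). The function name `sum_sq` is a hypothetical helper.

```python
from fractions import Fraction


def sum_sq(a, b):
    """Sum of squared residuals as given in the question (illustrative helper)."""
    return (10 * (11 - a - 6 * b) ** 2
            + 126 * (b - Fraction(83, 42)) ** 2
            + Fraction(139, 14))


# Both squared terms vanish at these values.
b_min = Fraction(83, 42)
a_min = 11 - 6 * b_min
minimum = sum_sq(a_min, b_min)

# Any small move away from (a_min, b_min) increases the sum.
step = Fraction(1, 10)
neighbours = [sum_sq(a_min + da, b_min + db)
              for da in (-step, 0, step)
              for db in (-step, 0, step)
              if (da, db) != (0, 0)]

# Gradient of v on u: S_uv / S_uu = (4 * -2 * S_xy) / (16 * S_xx) = -b / 2.
slope_v_on_u = Fraction(-2, 4) * b_min

print(f"a = {a_min}, b = {b_min}, minimum = {minimum}")
print(f"gradient of v on u = {slope_v_on_u} = {float(slope_v_on_u):.3f}")
```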
Prediction with confidence or prediction intervals

A question is this sub-type if and only if it requires constructing a confidence interval or prediction interval around the predicted value, involving variance and distributional assumptions.

1
0.5% of questions
9 The values of a set of bivariate data \(\left( x _ { i } , y _ { i } \right)\) can be summarised by $$n = 50 , \sum x = 1270 , \sum y = 5173 , \sum x ^ { 2 } = 42767 , \sum y ^ { 2 } = 701301 , \sum x y = 173161 .$$ Ten independent observations of \(Y\) are obtained, all corresponding to \(x = 20\). It may be assumed that the variance of \(Y\) is 1.9, independently of the value of \(x\). Find a \(95 \%\) confidence interval for the mean \(\bar { Y }\) of the 10 observations of \(Y\).
Hypothesis test for regression slope

Questions that require testing whether the regression slope coefficient is significantly different from zero using regression output, standard errors, and t-tests to assess the significance of the relationship.

1
0.5% of questions
1. Kwame is investigating a possible relationship between average March temperature, \(t ^ { \circ } \mathrm { C }\), and tea yield, \(y \mathrm {~kg} /\) hectare, for tea grown in a particular location. He uses 30 years of past data to produce the following summary statistics for a linear regression model, with tea yield as the dependent variable.
$$\begin{aligned} & \text { Residual Sum of Squares } ( \mathrm { RSS } ) = 1666567 \quad \mathrm {~S} _ { t t } = 52.0 \quad \mathrm {~S} _ { y y } = 1774155 \\
& \text { least squares regression line: } \quad \text { gradient } = 45.5 \quad y \text {-intercept } = 2080 \end{aligned}$$
  1. Use the regression model to predict the tea yield for an average March temperature of \(20 ^ { \circ } \mathrm { C }\).
He also produces the following residual plot for the data.
    \includegraphics[max width=\textwidth, alt={}, center]{d139840b-16ec-42ce-8501-f79c263c8017-02_663_880_868_589}
  2. Explain what you understand by the term residual.
  3. Calculate the product moment correlation coefficient between \(t\) and \(y\).
  4. Explain why the linear model may not be a good fit for the data
    1. with reference to your answer to part (c)
    2. with reference to the residual plot.
Kwame also collects data on total March rainfall, \(w \mathrm {~mm}\), for each of these 30 years. For a linear regression model of \(w\) on \(t\) the following summary statistic is found. $$\text { Residual Sum of Squares (RSS) = } 86754$$
Kwame concludes that since this model has a smaller RSS, there must be a stronger linear relationship between \(w\) and \(t\) than between \(y\) and \(t\) (where RSS \(= 1666567\)).
  5. State, giving a reason, whether or not you agree with the reasoning that led to Kwame's conclusion.
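A hedged sketch for this example. The prediction and the correlation coefficient follow from the quoted statistics, using \(r^2 = 1 - \mathrm{RSS} / S_{yy}\) with the sign of the gradient. The slope t-statistic at the end is not asked for in the parts shown; it is added only to illustrate the test named in the type description, using the usual standard error \(\sqrt{\mathrm{RSS} / ((n - 2) S_{tt})}\). Variable names are illustrative.

```python
import math

# Summary statistics quoted in the example question.
n = 30
rss = 1666567
s_tt = 52.0
s_yy = 1774155
gradient, intercept = 45.5, 2080

# Part 1: predicted tea yield at 20 degrees C.
predicted = intercept + gradient * 20

# Part 3: r^2 = 1 - RSS / S_yy; r takes the sign of the gradient.
r = math.copysign(math.sqrt(1 - rss / s_yy), gradient)

# Illustration of the slope test: t = gradient / standard error, n - 2 d.f.
std_err = math.sqrt(rss / (n - 2) / s_tt)
t_stat = gradient / std_err

print(f"predicted yield = {predicted} kg/hectare")
print(f"r = {r:.4f}")
print(f"t = {t_stat:.3f} on {n - 2} degrees of freedom")
```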
Convert correlation or regression statistics between scales

Questions that require converting summary statistics (like Sxx, Sxy, Syy, or correlation coefficient) between coded and original variables using properties of linear transformations.

0
0.0% of questions
Calculate with one S-value given

A question is this sub-type if and only if it provides summary statistics where one of Sxx, Syy, or Sxy is already given and asks to calculate one or both of the remaining S-values.

0
0.0% of questions