Explain least squares concept

A question is this type if and only if it asks to explain what is meant by 'least squares' in the context of regression, typically requiring reference to minimizing sum of squared residuals.

5 questions

OCR S1 2011 June Q7
7 The diagram shows the results of an experiment involving some bivariate data. The least squares regression line of \(y\) on \(x\) for these results is also shown.
\includegraphics[max width=\textwidth, alt={}, center]{48ffcd44-d933-40e0-818a-20d6db607298-5_748_919_390_612}
  1. Given that the least squares regression line of \(y\) on \(x\) is used for an estimation, state which of \(x\) or \(y\) is treated as the independent variable.
  2. Use the diagram to explain what is meant by 'least squares'.
  3. State, with a reason, the value of Spearman's rank correlation coefficient for these data.
  4. What can be said about the value of the product moment correlation coefficient for these data?
CAIE FP2 2009 November Q11 OR
\includegraphics[max width=\textwidth, alt={}]{20dd0f90-8c10-4a7b-a383-4ba89d167cd6-5_374_569_1123_788}
The scatter diagram shows a sample of size 5 of bivariate data, together with the regression line of \(y\) on \(x\). State what is minimised in obtaining this regression line, illustrating your answer on a copy of this diagram. State, giving a reason, whether, for the data shown, the regression line of \(y\) on \(x\) is the same as the regression line of \(x\) on \(y\). A car is travelling along a stretch of road with speed \(v \mathrm {~km} \mathrm {~h} ^ { - 1 }\) when the brakes are applied. The car comes to rest after travelling a further distance of \(z \mathrm {~m}\). The values of \(z\) (and \(\sqrt { } z\) ) for 8 different values of \(v\) are given in the table, correct to 2 decimal places.
\(v\)2530354045505560
\(z\)2.834.634.845.299.7310.3014.8215.21
\(\sqrt { } z\)1.682.152.202.303.123.213.853.90
$$\left[ \Sigma v = 340 , \Sigma v ^ { 2 } = 15500 , \Sigma \sqrt { } z = 22.41 , \Sigma z = 67.65 , \Sigma v \sqrt { } z = 1022.15 . \right]$$
  1. Calculate the product moment correlation coefficient between \(v\) and \(\sqrt { } z\). What does this indicate about the scatter diagram of the points \(( v , \sqrt { } z )\) ?
  2. Given that the product moment correlation coefficient between \(v\) and \(z\) is 0.965 , correct to 3 decimal places, state why the regression line of \(\sqrt { } z\) on \(v\) is more suitable than the regression line of \(z\) on \(v\), and find the equation of the regression line of \(\sqrt { } z\) on \(v\).
  3. Comment, in the context of the question, on the value of the constant term in the equation of the regression line of \(\sqrt { } z\) on \(v\).
OCR Further Statistics AS 2021 November Q3
3
  1. Using the scatter diagram in the Printed Answer Booklet, explain what is meant by least squares in the context of a regression line of \(y\) on \(x\).
  2. A set of bivariate data \(( t , u )\) is summarised as follows.
    \(n = 5 \quad \sum t = 35 \quad \sum u = 54\)
    \(\sum t ^ { 2 } = 285 \quad \sum u ^ { 2 } = 758 \quad \sum \mathrm { tu } = 460\)
    1. Calculate the equation of the regression line of \(u\) on \(t\).
    2. The variables \(t\) and \(u\) are now scaled using the following scaling.
      \(\mathrm { v } = 2 \mathrm { t } , \mathrm { w } = \mathrm { u } + 4\)
      Find the equation of the regression line of \(w\) on \(v\), giving your equation in the form \(w = f ( v )\).
OCR Further Statistics 2021 November Q1
1 At a seaside resort the number \(X\) of ice-creams sold and the temperature \(Y ^ { \circ } \mathrm { F }\) were recorded on 20 randomly chosen summer days. The data can be summarised as follows.
\(\sum x = 1506 \quad \sum x ^ { 2 } = 127542 \quad \sum y = 1431 \quad \sum y ^ { 2 } = 104451 \quad \sum x y = 111297\)
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\), giving your answer in the form \(y = a + b x\).
  2. Explain the significance for the regression line of the quantity \(\sum \left[ y _ { i } - \left( a x _ { i } + b \right) \right] ^ { 2 }\).
  3. It is decided to measure the temperature in degrees Centigrade instead of degrees Fahrenheit. If the same temperature is measured both as \(f ^ { \circ }\) Fahrenheit and \(c ^ { \circ }\) Centigrade, the relationship between \(f\) and \(c\) is \(\mathrm { c } = \frac { 5 } { 9 } ( \mathrm { f } - 32 )\). Find the equation of the new regression line.
SPS SPS FM Statistics 2022 February Q4
  1. (a) Using the scatter diagram below, explain what is meant by least squares in the context of a regression line of \(y\) on \(x\).
    \includegraphics[max width=\textwidth, alt={}, center]{5a60e87d-7a09-4ef5-96ca-8f33030c8747-08_481_889_276_219}
    (b) A set of bivariate data \(( t , u )\) is summarised as follows.
$$\begin{array} { l l l } n = 5 & \sum t = 35 & \sum u = 54
\sum t ^ { 2 } = 285 & \sum u ^ { 2 } = 758 & \sum t u = 460 \end{array}$$
  1. Calculate the equation of the regression line of \(u\) on \(t\).
  2. The variables \(t\) and \(u\) are now scaled using the following scaling. $$v = 2 t , w = u + 4$$ Find the equation of the regression line of \(w\) on \(v\), giving your equation in the form $$w = \mathrm { f } ( v ) .$$ [BLANK PAGE]