Find unknown values from regression

A question is of this type if and only if it requires finding unknown data values given the equation of a regression line and some of the data points.
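
As a reminder, questions of this type usually rest on the standard least-squares relations. The symbols \(a\), \(b\), \(a'\), \(b'\) are notation assumed here for the lines \(y = a + bx\) (of \(y\) on \(x\)) and \(x = a' + b'y\) (of \(x\) on \(y\)); they are not taken from any one question.
$$b = \frac { S _ { x y } } { S _ { x x } } , \qquad b ' = \frac { S _ { x y } } { S _ { y y } } , \qquad b b ' = r ^ { 2 } , \qquad \bar { y } = a + b \bar { x } , \qquad \bar { x } = a ' + b ' \bar { y }$$
Here \(S _ { x y } = \sum x y - \frac { 1 } { n } \sum x \sum y\), with \(S _ { x x }\) and \(S _ { y y }\) defined in the same way. An unknown data value enters these sums, so each relation gives an equation for it.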

8 questions

CAIE FP2 2013 June Q10 OR
The regression line of \(y\) on \(x\), obtained from a random sample of five pairs of values of \(x\) and \(y\), has equation $$y = x + k$$ where \(k\) is a constant. The following table shows the data.
$$\begin{array} { c | c c c c c }
x & 2 & 3 & 3 & 4 & p \\
\hline
y & 4 & 5 & 8 & 4 & 2
\end{array}$$
Find the two possible values of \(p\). For the smaller of these two values of \(p\), find
  1. the product moment correlation coefficient,
  2. the equation of the regression line of \(x\) on \(y\).
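One possible way to check the two values of \(p\) is sketched below. It assumes the table reads \(x = 2, 3, 3, 4, p\) and \(y = 4, 5, 8, 4, 2\), and uses the fact that a gradient of 1 means \(S_{xy} = S_{xx}\). The helper name `gradient_y_on_x` is illustrative only.

```python
from fractions import Fraction as F
from math import isqrt


def gradient_y_on_x(xs, ys):
    """Least-squares gradient S_xy / S_xx, computed exactly with fractions."""
    n = len(xs)
    s_xx = sum(F(x) ** 2 for x in xs) - F(sum(xs)) ** 2 / n
    s_xy = sum(F(x) * y for x, y in zip(xs, ys)) - F(sum(xs)) * sum(ys) / n
    return s_xy / s_xx


ys = [4, 5, 8, 4, 2]

# Setting S_xy = S_xx and multiplying through by n = 5 gives
#   5(63 + 2p) - 23(12 + p) = 5(38 + p^2) - (12 + p)^2,
# which simplifies to 4p^2 - 11p + 7 = 0.
a, b, c = 4, -11, 7
disc = b * b - 4 * a * c
root = isqrt(disc)  # exact here because the discriminant is a perfect square
candidates = sorted(F(-b + sign * root, 2 * a) for sign in (-1, 1))

# Each candidate should reproduce the stated gradient of 1.
for p in candidates:
    assert gradient_y_on_x([2, 3, 3, 4, p], ys) == 1

print(candidates)
```

Exact fractions are used so that the gradient check is an equality rather than a floating-point comparison.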
CAIE FP2 2018 June Q11 OR
The regression line of \(y\) on \(x\), obtained from a random sample of 6 pairs of values of \(x\) and \(y\), has equation $$y = 0.25 x + k$$ where \(k\) is a constant. The values from the sample are shown in the following table.
$$\begin{array} { c | c c c c c c }
x & 4 & 5 & 7 & 8 & 10 & 14 \\
\hline
y & 5 & 8 & p & 7 & p & 9
\end{array}$$
  1. Find the value of \(p\) and the value of \(k\).
  2. Find the product moment correlation coefficient for the data.
  3. Test, at the \(5 \%\) significance level, whether there is evidence of positive correlation between the variables.
CAIE FP2 2018 June Q8
For a random sample of 6 observations of pairs of values \(( x , y )\), the equation of the regression line of \(y\) on \(x\) is \(y = b x + 1.306\), where \(b\) is a constant. The corresponding equation of the regression line of \(x\) on \(y\) is \(x = 0.6331 y + d\), where \(d\) is a constant. The values of \(x\) from the sample are $$\begin{array} { l l l l l l } 2.3 & 2.8 & 3.7 & p & 6.1 & 6.4 \end{array}$$ and the sum of the values of \(y\) is 46.5. The product moment correlation coefficient is 0.9797.
  1. Find the value of \(b\) correct to 3 decimal places.
  2. Find the value of \(p\).
  3. Use the equation of the regression line of \(x\) on \(y\) to estimate the value of \(x\) when \(y = 8.5\).
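A minimal numerical sketch of one route through this question follows. It assumes the standard results that the two regression gradients multiply to \(r^2\) and that both lines pass through \((\bar{x}, \bar{y})\). All variable names are illustrative.

```python
r = 0.9797           # product moment correlation coefficient (given)
b_x_on_y = 0.6331    # gradient of the regression line of x on y (given)
a_y_on_x = 1.306     # intercept of the regression line of y on x (given)
n, sum_y = 6, 46.5
known_x = [2.3, 2.8, 3.7, 6.1, 6.4]

# The gradients of the two regression lines multiply to r^2.
b = r ** 2 / b_x_on_y

# (x_bar, y_bar) lies on y = b x + 1.306, which fixes x_bar and hence p.
y_bar = sum_y / n
x_bar = (y_bar - a_y_on_x) / b
p = n * x_bar - sum(known_x)

# The same mean point lies on x = 0.6331 y + d, which fixes d.
d = x_bar - b_x_on_y * y_bar
x_estimate = b_x_on_y * 8.5 + d

print(round(b, 3), p, x_estimate)
```

The sketch carries the unrounded value of \(b\) forward. Working from the 3-decimal-place value instead changes \(p\) only in the third decimal place.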
CAIE FP2 2019 June Q10
The values from a random sample of five pairs \(( x , y )\) taken from a bivariate distribution are shown below.
$$\begin{array} { c | c c c c c }
x & 3 & 4 & 4 & 6 & 8 \\
\hline
y & 5 & 7 & q & 6 & 7
\end{array}$$
The equation of the regression line of \(x\) on \(y\) is given by \(x = \frac { 5 } { 4 } y + c\).
  1. Given that \(q\) is an integer, find its value.
  2. Find the value of \(c\).
  3. Find the value of the product moment correlation coefficient.
CAIE FP2 2011 November Q10 OR
The regression line of \(y\) on \(x\) obtained from a random sample of five pairs of values of \(x\) and \(y\) is $$y = 2.5 x - 1.5$$ The data is given in the following table.
$$\begin{array} { c | c c c c c }
x & 1 & 2 & 4 & 2 & 6 \\
\hline
y & 2 & 3 & 6 & p & q
\end{array}$$
  1. Show that \(p + q = 19\).
  2. Find the values of \(p\) and \(q\).
  3. Determine the value of the product moment correlation coefficient for this sample.
  4. It is later discovered that the values of \(x\) given in the table have each been divided by 10 (that is, the actual values are \(10, 20, 40, 20, 60\)). Without any further calculation, state
    (a) the equation of the actual regression line of \(y\) on \(x\),
    (b) the value of the actual product moment correlation coefficient.
CAIE FP2 2019 November Q9
A random sample of five pairs of values of \(x\) and \(y\) is taken from a bivariate distribution. The values are shown in the following table, where \(p\) and \(q\) are constants.
$$\begin{array} { c | c c c c c }
x & 1 & 2 & 3 & 4 & 5 \\
\hline
y & 4 & p & q & 2 & 1
\end{array}$$
The equation of the regression line of \(y\) on \(x\) is \(y = - 0.5 x + 3.5\).
  1. Find the values of \(p\) and \(q\).
  2. Find the value of the product moment correlation coefficient.
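When two \(y\)-values are unknown, the mean-point condition and the gradient condition give two linear equations in \(p\) and \(q\). The sketch below is one way to set these up, assuming the table reads \(x = 1, 2, 3, 4, 5\) and \(y = 4, p, q, 2, 1\). The function names `solve_two_unknown_ys` and `pmcc` are hypothetical helpers, and `None` marks an unknown entry.

```python
from fractions import Fraction as F
from math import sqrt


def solve_two_unknown_ys(xs, ys, b, a):
    """Solve for two unknown y-values (marked None) given the line y = b x + a.

    Assumes exactly two unknowns, sitting at different x-values.
    """
    n = len(xs)
    x_bar = F(sum(xs), n)
    y_bar = b * x_bar + a                      # the line passes through the mean point
    s_xx = sum(x * x for x in xs) - n * x_bar ** 2
    known = [(x, y) for x, y in zip(xs, ys) if y is not None]
    x_p, x_q = [x for x, y in zip(xs, ys) if y is None]

    # Mean point:      p +     q = n*y_bar - sum of known y
    rhs_mean = n * y_bar - sum(y for _, y in known)
    # Gradient:    x_p*p + x_q*q = b*S_xx + n*x_bar*y_bar - sum of known x*y
    rhs_grad = b * s_xx + n * x_bar * y_bar - sum(x * y for x, y in known)

    det = x_q - x_p                            # Cramer's rule on the 2x2 system
    p = (rhs_mean * x_q - rhs_grad) / det
    q = (rhs_grad - x_p * rhs_mean) / det
    return p, q


def pmcc(xs, ys):
    """Product moment correlation coefficient S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    s_xx = sum(F(x) ** 2 for x in xs) - F(sum(xs)) ** 2 / n
    s_yy = sum(F(y) ** 2 for y in ys) - F(sum(ys)) ** 2 / n
    s_xy = sum(F(x) * y for x, y in zip(xs, ys)) - F(sum(xs)) * sum(ys) / n
    return float(s_xy) / sqrt(float(s_xx * s_yy))


xs = [1, 2, 3, 4, 5]
p, q = solve_two_unknown_ys(xs, [4, None, None, 2, 1], F(-1, 2), F(7, 2))
r = pmcc(xs, [4, p, q, 2, 1])
print(p, q, r)
```

The same helper applied to the 2011 November Q10 data (\(x = 1, 2, 4, 2, 6\); \(y = 2, 3, 6, p, q\); line \(y = 2.5x - 1.5\)) returns values satisfying \(p + q = 19\), as that question requires.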
OCR MEI Further Statistics A AS 2024 June Q4
A chemist is conducting an experiment in which the concentration of a certain chemical, A, is supposed to be recorded at the start of the experiment and then every 30 seconds after the start. The time after the start is denoted by \(t \mathrm {~s}\) and the concentration by \(z \mathrm { ~mg } \mathrm { ~cm } ^ { - 3 }\). The collected data are shown in the table below. Note that the concentration at \(t = 90\) was not recorded.
$$\begin{array} { l | c c c c c }
\text{Time, } t & 0 & 30 & 60 & 120 & 150 \\
\hline
\text{Concentration of A, } z & 40.0 & 31.3 & 27.5 & 12.8 & 11.4
\end{array}$$
The chemist wishes to plot the data on a graph.
  (a) Explain why \(t\) should be plotted on the horizontal axis.
You are given that the summary statistics for the data are as follows.
    \(n = 5 \quad \sum t = 360 \quad \sum z = 123.0 \quad \sum t ^ { 2 } = 41400 \quad \sum z ^ { 2 } = 3629.74 \quad \sum t z = 5835\)
The regression line of \(z\) on \(t\) is given by \(z = a + b t\) and is used to model the concentration of chemical A for \(t \geqslant 0\).
  (b) (i) Use the summary statistics to determine the value of \(a\) and the value of \(b\).
      (ii) Find the value of the residual at each of the following values of \(t\).
        • \(t = 60\)
        • \(t = 120\)
  (c) (i) Use the equation of the regression line to estimate the value of the concentration at 90 seconds.
      (ii) With reference to your answers to part (b)(ii), comment on the reliability of your answer to part (c)(i).
Further experiments indicate that the model is reasonably reliable for times greater than 150 seconds up to about 200 seconds.
  (d) Show that the model cannot be valid beyond a time of about 200 seconds.
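The calculations in this question can be sketched as follows. The sketch assumes the table reads \(t = 0, 30, 60, 120, 150\) and \(z = 40.0, 31.3, 27.5, 12.8, 11.4\), and recomputes the sums from those values rather than relying on the printed summary statistics. The name `model` is illustrative.

```python
ts = [0, 30, 60, 120, 150]
zs = [40.0, 31.3, 27.5, 12.8, 11.4]
n = len(ts)

# Summary statistics recomputed from the table.
sum_t, sum_z = sum(ts), sum(zs)
sum_tt = sum(t * t for t in ts)
sum_tz = sum(t * z for t, z in zip(ts, zs))

s_tt = sum_tt - sum_t ** 2 / n
s_tz = sum_tz - sum_t * sum_z / n

# Regression line of z on t: z = a + b t.
b = s_tz / s_tt
a = sum_z / n - b * sum_t / n


def model(t):
    """Concentration predicted by the regression line at time t."""
    return a + b * t


# Residual = observed value - value predicted by the line.
residuals = {t: z - model(t) for t, z in zip(ts, zs) if t in (60, 120)}

estimate_90 = model(90)

# The line predicts zero concentration at t = -a / b and negative values beyond it.
t_zero = -a / b

print(a, b, residuals, estimate_90, t_zero)
```

A negative concentration is impossible, so the time at which the line crosses zero gives an upper limit on where the linear model could apply.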
WJEC Further Unit 2 2024 June Q4
An author poses the following question: Does using cash for transactions affect people's financial behaviour?
She collects data on 'Cash transactions as a \% of all transactions' and 'Household debt as a \(\%\) of net disposable income' from a random sample of 25 countries. The table below shows the data she collected. There are missing values, \(p\) and \(q\), for Malta and Denmark respectively.
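$$\begin{array} { l c c | l c c }
\text{Country} & \text{Cash transactions as a \% of all transactions } \boldsymbol { x } & \text{Household debt as a \% of net disposable income } \boldsymbol { y } & \text{Country} & \text{Cash transactions as a \% of all transactions } \boldsymbol { x } & \text{Household debt as a \% of net disposable income } \boldsymbol { y } \\
\hline
\text{Malta} & 92 & p & \text{France} & 68 & 120 \\
\text{Mexico} & 90 & -14 & \text{Luxembourg} & 64 & 177 \\
\text{Greece} & 88 & 107 & \text{Belgium} & 63 & 113 \\
\text{Spain} & 87 & 110 & \text{Finland} & 54 & 137 \\
\text{Italy} & 86 & 87 & \text{Estonia} & 48 & 82 \\
\text{Austria} & 85 & 91 & \text{The Netherlands} & 45 & 247 \\
\text{Portugal} & 81 & 131 & \text{UK} & 42 & 147 \\
\text{Slovenia} & 80 & 56 & \text{Australia} & 37 & 214 \\
\text{Germany} & 80 & 95 & \text{USA} & 32 & 109 \\
\text{Ireland} & 79 & 154 & \text{Sweden} & 20 & 187 \\
\text{Slovakia} & 78 & 74 & \text{South Korea} & 14 & 182 \\
\text{Lithuania} & 75 & 46 & \text{Denmark} & q & 261 \\
\text{Latvia} & 71 & 43 & & &
\end{array}$$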
The summary statistics and scatter diagram below are for the other 23 countries.
[Figure: scatter diagram, "Household debt versus Cash transactions"]
$$\begin{gathered} \sum x = 1467 \quad \sum y = 2695 \quad \sum x ^ { 2 } = 105073 \quad S _ { x x } = 11503.91304 \quad S _ { y y } = 78669.30435 \\
\sum y ^ { 2 } = 394453 \quad \sum x y = 152999 \quad S _ { x y } = - 18895.13043 \end{gathered}$$
  1. Using the summary statistics for the 23 countries, calculate and interpret Pearson's product moment correlation coefficient.
  2. Calculate the equation of the least squares regression line of Household debt as a \% of net disposable income \(( y )\) on Cash transactions as a \% of all transactions \(( x )\).
The regression line of \(x\) on \(y\) is given below. $$x = - 0.24 y + 91.92$$
  3. By selecting the appropriate regression line in each case, estimate the values of \(p\) and \(q\) in the table.
  4. Comment on the reliability of your answers in part (c).
  5. Interpret the negative value of \(y\) for Mexico.
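A minimal sketch of the numerical parts of this question follows. It reads the raised dots in the printed summary statistics as decimal points. It also assumes the usual choice of line: \(y\) on \(x\) where \(x\) is known (Malta) and \(x\) on \(y\) where \(y\) is known (Denmark). Variable names are illustrative.

```python
from math import sqrt

n = 23
sum_x, sum_y = 1467, 2695
s_xx, s_yy, s_xy = 11503.91304, 78669.30435, -18895.13043

# Pearson's product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

# Least squares regression line of y on x: y = a + b x.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# Malta: x = 92 is known and y = p is missing, so use y on x.
p_estimate = a + b * 92

# Denmark: y = 261 is known and x = q is missing, so use the given x on y line.
q_estimate = -0.24 * 261 + 91.92

print(r, a, b, p_estimate, q_estimate)
```

Both estimates sit at the edge of the observed data (the largest \(x\) and the largest \(y\) in the table), which bears on the reliability comment asked for in the question.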