Use regression line for prediction

A question is of this type if and only if it asks to estimate or predict a value using a regression equation, or to assess the reliability of such predictions.

3 questions

OCR MEI Paper 2 2022 June Q15
15 The pre-release material includes information on life expectancy at birth in countries of the world.
Fig. 15.1 shows the data for Liberia, which is in Africa, together with a time series graph. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{57007d39-abb0-475e-9ed8-03021fa1273b-12_721_1284_342_242} \captionsetup{labelformat=empty} \caption{Fig. 15.1}
\end{figure} Sundip uses the LINEST function on a spreadsheet to model life expectancy as a function of calendar year by a straight line. The equation of this line is \(L = 0.473 y - 892\), where \(L\) is life expectancy at birth and \(y\) is calendar year.
  1. Use this model to find an estimate of the life expectancy at birth in Liberia in 1995.
    According to the model, the life expectancy at birth in Liberia in 2025 is estimated to be 65.83 years.
  2. Explain whether each of these two estimates is likely to be reliable.
  3. Use your knowledge of the pre-release material to explain whether this model could be used to obtain a reliable estimate of the life expectancy at birth in other countries in 1995.
    Fig. 15.2 shows the life expectancy at birth between 1960 and 2010 for Italy and South Africa. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{57007d39-abb0-475e-9ed8-03021fa1273b-13_652_1466_294_230} \captionsetup{labelformat=empty} \caption{Fig. 15.2}
    \end{figure}
  4. Use your knowledge of the pre-release material to
    • Explain whether series 1 or series 2 represents the data for Italy.
    • Explain how the data for South Africa differs from the data for most developed countries.
    Sundip is investigating whether there is an association between the wealth of a country and life expectancy at birth in that country. As part of her analysis she draws a scatter diagram of GDP per capita in US \$ and life expectancy at birth in 2010 for all the countries in Europe for which data is available. She accidentally includes the data for the Central African Republic. The diagram is shown in Fig. 15.3. \section*{Scatter diagram of life expectancy at birth in 2010 against GDP per capita in US \$} \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{57007d39-abb0-475e-9ed8-03021fa1273b-14_632_1554_607_244} \captionsetup{labelformat=empty} \caption{Fig. 15.3}
    \end{figure}
  5. On the copy of Fig. 15.3 in the Printed Answer Booklet, use your knowledge of the pre-release material to circle the point representing the data for the Central African Republic.
    Sundip states that as GDP per capita increases, life expectancy at birth increases.
  6. Explain to what extent the information in Fig. 15.3 supports Sundip's statement.
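The arithmetic behind part 1 of this question can be sketched as follows. This is a minimal illustration rather than a mark scheme: the function name `life_expectancy` is hypothetical, and the only values taken from the question are the coefficients of the line \(L = 0.473 y - 892\) and the years 1995 and 2025.

```python
def life_expectancy(year: float) -> float:
    """Straight-line model quoted in the question: L = 0.473 * y - 892."""
    return 0.473 * year - 892


# Part 1 asks for the estimate in 1995; the question itself quotes 65.83 for 2025.
estimate_1995 = life_expectancy(1995)
estimate_2025 = life_expectancy(2025)

print(f"1995 estimate: {estimate_1995:.2f} years")
print(f"2025 estimate: {estimate_2025:.2f} years")
```

The 1995 estimate comes out at roughly 51.6 years, and the 2025 value agrees with the 65.83 years stated in the question. Substituting a year into the line is the same calculation in both cases; how far each year lies from the data in Fig. 15.1 is a separate consideration for part 2.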
Edexcel S1 2019 June Q6
6. Ranpose hospital offers services to a large number of clinics that refer patients to a range of hospitals.
    The manager at Ranpose hospital took a random sample of 16 clinics and recorded
  • the distance, \(x \mathrm {~km}\), of the clinic from Ranpose hospital
  • the percentage, \(y \%\), of the referrals from the clinic who attend Ranpose hospital.
The data are summarised as $$\bar { x } = 8.1 \quad \bar { y } = 20.5 \quad \sum y ^ { 2 } = 8266 \quad \mathrm {~S} _ { x x } = 368.16 \quad \mathrm {~S} _ { x y } = - 630.9$$
  1. Find the product moment correlation coefficient for these data.
  2. Give an interpretation of your correlation coefficient.
    The manager at Ranpose hospital believes that there may be a linear relationship between the distance of a clinic from the hospital and the percentage of the referrals who attend the hospital. She drew the following scatter diagram for these data.
    \includegraphics[max width=\textwidth, alt={}, center]{9ac7647f-b291-4a64-9518-fa6438a0cc7d-20_1106_926_1133_511}
  3. State, giving a reason, whether or not these data support the manager's belief.
    (1)
    \section*{[The summary data and the scatter diagram are repeated below.]} The data are summarised as $$\bar { x } = 8.1 \quad \bar { y } = 20.5 \quad \sum y ^ { 2 } = 8266 \quad \mathrm {~S} _ { x x } = 368.16 \quad \mathrm {~S} _ { x y } = - 630.9$$ \includegraphics[max width=\textwidth, alt={}, center]{9ac7647f-b291-4a64-9518-fa6438a0cc7d-22_1118_936_612_504}
  4. Find the equation of the regression line of \(y\) on \(x\), giving your answer in the form $$y = a + b x$$
  5. Give an interpretation of the gradient of your regression line.
  6. Draw your regression line on the scatter diagram.
    The manager believes that Ranpose hospital should be attracting an "above average" percentage of referrals from clinics that are less than 5 km from the hospital. She proposes to target one clinic with some extra publicity about the services Ranpose offers.
  7. On the scatter diagram circle the point representing the clinic she should target.
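The summary statistics in this question are enough to compute the correlation coefficient and the regression line of \(y\) on \(x\). The sketch below assumes the standard formulas \(\mathrm{S}_{yy} = \sum y^2 - n\bar{y}^2\), \(r = \mathrm{S}_{xy} / \sqrt{\mathrm{S}_{xx}\mathrm{S}_{yy}}\), \(b = \mathrm{S}_{xy} / \mathrm{S}_{xx}\) and \(a = \bar{y} - b\bar{x}\); the variable names are illustrative only, and the working is not taken from a mark scheme.

```python
import math

# Summary data from the question (sample of n = 16 clinics).
n = 16
x_bar, y_bar = 8.1, 20.5
sum_y_sq = 8266
s_xx, s_xy = 368.16, -630.9

# S_yy is not given directly, so recover it from the sum of y^2 and the mean of y.
s_yy = sum_y_sq - n * y_bar ** 2

# Product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

# Regression line of y on x in the form y = a + b x.
b = s_xy / s_xx
a = y_bar - b * x_bar

print(f"S_yy = {s_yy:.0f}")
print(f"r = {r:.3f}")
print(f"y = {a:.1f} + ({b:.2f}) x")
```

Under these assumptions \(r\) is about \(-0.837\) and the line is approximately \(y = 34.4 - 1.71x\), so the gradient corresponds to a fall of roughly 1.7 percentage points in referrals for each additional kilometre from the hospital.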
OCR Further Statistics 2018 September Q1
1 An experiment involves releasing a coin on a sloping plane so that it slides down the slope and then slides along a horizontal plane at the bottom of the slope before coming to rest. The angle \(\theta ^ { \circ }\) of the sloping plane is varied, and for each value of \(\theta\), the distance \(d \mathrm {~cm}\) the coin slides on the horizontal plane is recorded. A scatter diagram to illustrate the results of the experiment is shown below, together with the least squares regression line of \(d\) on \(\theta\).
\includegraphics[max width=\textwidth, alt={}, center]{28c6a0d9-09a6-4743-af0e-fe2e43e256c9-2_639_972_561_548}
  1. State which two of the following correctly describe the variable \(\theta\).
    • Controlled variable
    • Correlation coefficient
    • Dependent variable
    • Independent variable
    • Response variable
    • Regression coefficient
    The least squares regression line of \(d\) on \(\theta\) has equation \(d = 1.96 + 0.11 \theta\).
  2. Use the diagram in the Printed Answer Booklet to explain the term "least squares".
  3. State what difference, if any, it would make to the equation of the regression line if \(d\) were measured in inches rather than centimetres. (1 inch \(\approx 2.54 \mathrm {~cm}\).)
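The effect of changing the units of \(d\) can be checked numerically. The sketch below assumes that rescaling the response variable by a constant rescales both the intercept and the gradient of the least squares line by the same constant, which follows from \(b = \mathrm{S}_{\theta d} / \mathrm{S}_{\theta\theta}\) and \(a = \bar{d} - b\bar{\theta}\). The names `CM_PER_INCH`, `a_in` and `b_in` are illustrative, and the angle of 30 degrees is an arbitrary test value, not one from the experiment.

```python
CM_PER_INCH = 2.54  # conversion given in the question (approximate)

# Regression line of d on theta with d in centimetres: d = 1.96 + 0.11 * theta.
a_cm, b_cm = 1.96, 0.11

# With d in inches, every d value is divided by 2.54, so both coefficients are too.
a_in = a_cm / CM_PER_INCH
b_in = b_cm / CM_PER_INCH

# Consistency check at an arbitrary angle: both lines describe the same distance.
theta = 30.0
d_cm = a_cm + b_cm * theta
d_in = a_in + b_in * theta

print(f"d (inches) = {a_in:.3f} + {b_in:.4f} * theta")
print(f"At theta = {theta}: {d_cm:.2f} cm vs {d_in * CM_PER_INCH:.2f} cm")
```

On this reasoning the line in inches is approximately \(d = 0.772 + 0.0433\theta\): the coefficients change, but the predicted physical distance for a given angle does not.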