5.09a Dependent/independent variables

164 questions

AQA S1 2012 January Q5
17 marks Moderate -0.8
5 An experiment was undertaken to collect information on the burning of a specific type of wood as a source of energy. At given fixed levels of the wood's moisture content, \(x\) per cent, its corresponding calorific value, \(y \mathrm { MWh } /\) tonne, on burning was determined. The results are shown in the table.
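
| \(x\) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 5.2 | 4.7 | 4.3 | 4.0 | 3.2 | 2.8 | 2.5 | 2.2 | 1.8 | 1.5 | 1.3 | 1.0 | 0.6 |
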
  1. Explain why calorific value is the response variable.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\), giving your answer in the form \(y = a + b x\).
  3. Interpret, in context, your values for \(a\) and \(b\).
  4. Use your equation to estimate the wood's calorific value when it has a moisture content of 27 per cent.
  5. Calculate the value of the residual for the point \(( 35,2.5 )\).
  6. Given that the values of the 13 residuals lie between \(-0.28\) and \(+0.23\), comment on the likely accuracy of your estimate in part (d).
    1. Give a general reason why your equation should not be used to estimate the wood's calorific value when it has a moisture content of 80 per cent.
    2. Give a specific reason, based on the context of this question and with numerical support, why your equation cannot be used to estimate the wood's calorific value when it has a moisture content of 80 per cent.
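The calculations this question asks for can be sketched in a few lines of Python. This is a minimal illustration, not a model answer: the helper `least_squares` and all variable names are hypothetical, and the data are read from the table in the question.

```python
# Data from the table: moisture content x (per cent), calorific value y (MWh/tonne).
xs = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
ys = [5.2, 4.7, 4.3, 4.0, 3.2, 2.8, 2.5, 2.2, 1.8, 1.5, 1.3, 1.0, 0.6]


def least_squares(xs, ys):
    """Return (a, b) for the least squares regression line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx          # gradient
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


a, b = least_squares(xs, ys)

estimate_27 = a + b * 27            # interpolation inside the observed range
residual_35 = 2.5 - (a + b * 35)    # residual = observed y minus fitted y
prediction_80 = a + b * 80          # extrapolation far beyond x = 65

print(f"y = {a:.3f} + ({b:.4f})x")
print(f"estimate at x = 27: {estimate_27:.2f}")
print(f"residual at (35, 2.5): {residual_35:.2f}")
print(f"prediction at x = 80: {prediction_80:.2f}")
```

Because \(x = 35\) is the mean of the moisture values, the fitted value there equals \(\bar{y}\), so the residual is simply \(2.5 - \bar{y}\). The prediction at 80 per cent comes out negative, which is the kind of numerical support the final part is looking for.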
AQA S1 2007 June Q5
13 marks Moderate -0.8
5 Bob, a gardener, measures the time taken, \(y\) minutes, for 60 grams of weedkiller pellets to dissolve in 10 litres of water at different set temperatures, \(x ^ { \circ } \mathrm { C }\). His results are shown in the table.

| \(x\) | 16 | 20 | 24 | 28 | 32 | 36 | 40 | 44 | 48 | 52 | 56 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 4.7 | 4.3 | 3.8 | 3.5 | 3.0 | 2.7 | 2.4 | 2.0 | 1.8 | 1.6 | 1.1 |

  1. State why the explanatory variable is temperature.
  2. Calculate the equation of the least squares regression line \(y = a + b x\).
    1. Interpret, in the context of this question, your value for \(b\).
    2. Explain why no sensible practical interpretation can be given for your value of \(a\).
    1. Estimate the time taken to dissolve 60 grams of weedkiller pellets in 10 litres of water at \(30 ^ { \circ } \mathrm { C }\).
    2. Show why the equation cannot be used to make a valid estimate of the time taken to dissolve 60 grams of weedkiller pellets in 10 litres of water at \(75 ^ { \circ } \mathrm { C }\). (2 marks)
AQA S1 2009 June Q2
10 marks Moderate -0.8
2 Hermione, who is studying reptiles, measures the length, \(x \mathrm {~cm}\), and the weight, \(y\) grams, of a sample of 11 adult snakes of the same type. Her results are shown in the table.
AQA S1 2012 June Q3
11 marks Moderate -0.3
3 The table shows the maximum weight, \(y _ { A }\) grams, of Salt \(A\) that will dissolve in 100 grams of water at various temperatures, \(x ^ { \circ } \mathrm { C }\).

| \(x\) | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 60 | 70 | 80 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y_A\) | 20 | 35 | 48 | 57 | 77 | 92 | 101 | 111 | 121 | 137 | 159 | 182 |

  1. Calculate the equation of the least squares regression line of \(y _ { A }\) on \(x\).
  2. The data in the above table are plotted on the scatter diagram on page 4. Draw your regression line on this scatter diagram.
  3. For water temperatures in the range \(10 ^ { \circ } \mathrm { C }\) to \(80 ^ { \circ } \mathrm { C }\), the maximum weight, \(y _ { B }\) grams, of Salt \(B\) that will dissolve in 100 grams of water is given by the equation $$y _ { B } = 60.1 + 0.255 x$$
    1. Draw this line on the scatter diagram.
    2. Estimate the water temperature at which the maximum weight of Salt \(A\) that will dissolve in 100 grams of water is the same as that of Salt B.
    3. For Salt \(A\) and Salt \(B\), compare the effects of water temperature on the maximum weight that will dissolve in 100 grams of water. Your answer should identify two distinct differences.

Scatter diagram: Temperatures and Maximum Weights
\includegraphics[max width=\textwidth, alt={}]{91466019-8feb-4292-b616-e8e8667e2e54-4_2023_1682_404_173}
AQA S1 2013 June Q4
17 marks Standard +0.3
4 The girth, \(g\) metres, the length, \(l\) metres, and the weight, \(y\) kilograms, of each of a sample of 20 pigs were measured. The data collected is summarised as follows. $$S _ { g g } = 0.1196 \quad S _ { l l } = 0.0436 \quad S _ { y y } = 5880 \quad S _ { g y } = 24.15 \quad S _ { l y } = 10.25$$
  1. Calculate the value of the product moment correlation coefficient between:
    1. girth and weight;
    2. length and weight.
  2. Interpret, in context, each of the values that you obtained in part (a).
  3. Weighing pigs requires expensive equipment, whereas measuring their girths and lengths simply requires a tape measure. With this in mind, the following formula is proposed to make an estimate of a pig's weight, \(x\) kilograms, from its girth and length. $$x = 69.3 \times g ^ { 2 } \times l$$ Applying this formula to the relevant data on the 20 pigs resulted in $$S _ { x x } = 5656.15 \quad S _ { x y } = 5662.97$$
    1. By calculating a third value of the product moment correlation coefficient, state which of \(g , l\) or \(x\) is the most strongly correlated with \(y\), the weight.
    2. Estimate the weight of a pig that has a girth of 1.25 metres and a length of 1.15 metres.
    3. Given the additional information that \(\bar { x } = 115.4\) and \(\bar { y } = 116.0\), calculate the equation of the least squares regression line of \(y\) on \(x\), in the form \(y = a + b x\).
    4. Comment on the likely accuracy of the estimated weight found in part (c)(ii). Your answer should make reference to the value of the product moment correlation coefficient found in part (c)(i) and to the values of \(b\) and \(a\) found in part (c)(iii).
      (4 marks)
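The three correlation coefficients and the regression line in part (c) all follow from the summary values quoted in the question. The sketch below shows one way to organise the arithmetic; the function `pmcc` and the variable names are illustrative choices, not part of the question.

```python
from math import sqrt


def pmcc(s_xy, s_xx, s_yy):
    """Product moment correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    return s_xy / sqrt(s_xx * s_yy)


S_yy = 5880  # weight

r_girth = pmcc(24.15, 0.1196, S_yy)       # girth and weight
r_length = pmcc(10.25, 0.0436, S_yy)      # length and weight
r_formula = pmcc(5662.97, 5656.15, S_yy)  # formula estimate x and weight

# Formula estimate for a pig with girth 1.25 m and length 1.15 m.
x_est = 69.3 * 1.25 ** 2 * 1.15

# Regression line of y on x from S_xy, S_xx and the two means.
b = 5662.97 / 5656.15
a = 116.0 - b * 115.4

print(f"r(g, y) = {r_girth:.3f}, r(l, y) = {r_length:.3f}, r(x, y) = {r_formula:.3f}")
print(f"formula estimate = {x_est:.1f} kg")
print(f"y = {a:.3f} + {b:.4f}x")
```

A gradient close to 1 and an intercept close to 0, together with \(r\) close to 1, suggest that the formula value \(x\) is itself a reasonable estimate of the weight \(y\).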
AQA S1 2014 June Q3
11 marks Moderate -0.8
3 The table shows the body mass index (BMI), \(x\), and the systolic blood pressure (SBP), \(y \mathrm { mmHg }\), for each of a random sample of 10 men, aged between 35 years and 40 years, from a particular population.

| \(x\) | 13 | 23 | 29 | 35 | 17 | 34 | 25 | 20 | 31 | 27 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 103 | 115 | 124 | 126 | 108 | 120 | 113 | 117 | 118 | 119 |

  1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  2. Use your equation to estimate the SBP of a man from this population who is aged 38 years and who has a BMI of 30.
  3. State why your equation might not be appropriate for estimating the SBP of a man from this population:
    1. who is aged 38 years and who has a BMI of 45;
    2. who is aged 50 years and who has a BMI of 25.
  4. Find the value of the residual for the point \(( 20,117 )\).
  5. The mean of the vertical distances of the 10 points from the regression line calculated in part (a) is 2.71, correct to three significant figures. Comment on the likely accuracy of your estimate in part (b).
    [1 mark]
AQA S1 2014 June Q5
13 marks Moderate -0.5
5 As part of a study of charity shops in a small market town, two such shops, \(X\) and \(Y\), were each asked to provide details of its takings on 12 randomly selected days. The table shows, for each of the 12 days, the day's takings, \(\pounds x\), of charity shop \(X\) and the day's takings, \(\pounds y\), of charity shop \(Y\).

| Day | A | B | C | D | E | F | G | H | I | J | K | L |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(x\) | 46 | 57 | 39 | 116 | 62 | 77 | 41 | 61 | 15 | 53 | 68 | 61 |
| \(y\) | 78 | 102 | 66 | 214 | 98 | 72 | 98 | 134 | 21 | 67 | 95 | 83 |

    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Interpret your value in the context of this question.
  1. Complete the scatter diagram shown on the opposite page.
  2. The investigator realised subsequently that one of the 12 selected days was a particularly popular town market day and another was a day on which the weather was extremely severe. Identify each of these days giving a reason for each choice.
  3. Removing the two days described in part (c) from the data gives the following information. $$S _ { x x } = 1292.5 \quad S _ { y y } = 3850.1 \quad S _ { x y } = 407.5$$
    1. Use this information to recalculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Hence revise, as necessary, your interpretation in part (a)(ii).
      [3 marks]

Scatter diagram (to be completed): Charity Shops, with Shop \(X\) takings (£) as an axis label.
\includegraphics[max width=\textwidth, alt={}]{ddf7f158-b6ae-42c6-98f1-d59c205646ad-17_1304_415_406_1391}
AQA S1 2014 June Q6
12 marks Moderate -0.8
6 A rubber seal is fitted to the bottom of a flood barrier. When no pressure is applied, the depth of the seal is 15 cm. When pressure is applied, a watertight seal is created between the flood barrier and the ground. The table shows the pressure, \(x\) kilopascals (kPa), applied to the seal and the resultant depth, \(y\) centimetres, of the seal.

| \(x\) | 25 | 50 | 75 | 100 | 125 | 150 | 175 | 200 | 250 | 300 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 14.7 | 13.4 | 12.8 | 11.9 | 11.0 | 10.3 | 9.7 | 9.0 | 7.5 | 6.7 |

    1. State the value that you would expect for \(a\) in the equation of the least squares regression line, \(y = a + b x\).
    2. Calculate the equation of the least squares regression line, \(y = a + b x\).
    3. Interpret, in context, your value for \(b\).
  1. Calculate an estimate of the depth of the seal when it is subjected to a pressure of 225 kPa.
    1. Give a statistical reason as to why your equation is unlikely to give a realistic estimate of the depth of the seal if it were to be subjected to a pressure of 400 kPa.
    2. Give a reason based on the context of this question as to why your equation will not give a realistic estimate of the depth of the seal if it were to be subjected to a pressure of 525 kPa.
      [3 marks]
      \includegraphics[max width=\textwidth, alt={}]{8aeacd54-a5a1-4f2d-b936-2faf635ffce7-20_946_1709_1761_153}
      \includegraphics[max width=\textwidth, alt={}]{8aeacd54-a5a1-4f2d-b936-2faf635ffce7-21_2484_1707_221_153}
      \includegraphics[max width=\textwidth, alt={}]{8aeacd54-a5a1-4f2d-b936-2faf635ffce7-23_2484_1707_221_153}
Edexcel S1 Q4
10 marks Moderate -0.8
4. An internet service provider runs a series of television adverts at weekly intervals. To investigate the effectiveness of the adverts the company recorded the viewing figures in millions, \(v\), for the programme in which the advert was shown, and the number of new customers, \(c\), who signed up for their service the next day. The results are summarised as follows. $$\bar { v } = 4.92 , \quad \bar { c } = 104.4 , \quad S _ { v c } = 594.05 , \quad S _ { v v } = 85.44 .$$
  1. Calculate the equation of the regression line of \(c\) on \(v\) in the form \(c = a + b v\).
  2. Give an interpretation of the constants \(a\) and \(b\) in this context.
  3. Estimate the number of customers that will sign up with the company the day after an advert is shown during a programme watched by 3.7 million viewers.
  4. State two other factors besides viewing figures that will affect the success of an advert in gaining new customers for the company.
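When only the means and the values \(S_{vc}\) and \(S_{vv}\) are given, the regression line follows from \(b = S_{vc} / S_{vv}\) and \(a = \bar{c} - b\bar{v}\). A short sketch of that arithmetic, with variable names chosen here purely for illustration:

```python
# Summary values quoted in the question.
v_bar, c_bar = 4.92, 104.4
S_vc, S_vv = 594.05, 85.44

b = S_vc / S_vv          # extra customers per additional million viewers
a = c_bar - b * v_bar    # customers predicted by the line at v = 0

c_hat = a + b * 3.7      # estimate for a programme watched by 3.7 million

print(f"c = {a:.1f} + {b:.2f}v")
print(f"estimate at v = 3.7: {c_hat:.0f} customers")
```

Whether 3.7 million lies inside the range of observed viewing figures is not stated in the question, so the reliability of the estimate cannot be judged from the summary values alone.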
Edexcel S1 Q7
15 marks Moderate -0.8
7. Pipes-R-us manufacture a special lightweight aluminium tubing. The price \(\pounds P\), for each length, \(l\) metres, that the company sells is shown in the table.

| \(l\) (metres) | 0.5 | 0.8 | 1.0 | 1.5 | 2 | 4 | 6 |
|---|---|---|---|---|---|---|---|
| \(P\) (£) | 2.50 | 3.40 | 4.00 | 5.20 | 6.00 | 10.50 | 15.00 |

  1. Represent these data on a scatter diagram. You may use $$\Sigma l = 15.8 , \quad \Sigma P = 46.6 , \quad \Sigma l ^ { 2 } = 60.14 , \quad \Sigma l P = 159.77$$
  2. Find the equation of the regression line of \(P\) on \(l\) in the form \(P = a + b l\).
  3. Give a practical interpretation of the constant \(b\). In response to customer demand, Pipes-R-us decide to start selling tubes cut to specific lengths. Initially the company decides to use the regression line found in part (b) as a pricing formula for this new service.
  4. Calculate the price that Pipes-R-us should charge for 5.2 metres of the tubing.
  5. Suggest a reason why Pipes-R-us might not offer prices based on the regression line for any length of tubing.
Edexcel S1 Q7
17 marks Standard +0.3
7. A new vaccine is tested over a six-month period in one health authority. The table shows the number of new cases of the disease, \(d\), reported in the \(m\) th month after the trials began.

| \(m\) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| \(d\) | 102 | 69 | 61 | 58 | 52 | 48 |

A doctor suggests that a relationship of the form \(d = a + b x\) where \(x = \frac { 1 } { m }\) can be used to model the situation.
  1. Tabulate the values of \(x\) corresponding to the given values of \(d\) and plot a scatter diagram of \(d\) against \(x\).
  2. Explain how your scatter diagram supports the suggested model. You may use $$\Sigma x = 2.45 , \quad \Sigma d = 390 , \quad \Sigma x ^ { 2 } = 1.491 , \quad \Sigma x d = 189.733$$
  3. Find an equation of the regression line \(d\) on \(x\) in the form \(d = a + b x\).
  4. Use your regression line to estimate how many new cases of the disease there will be in the 13th month after the trial began.
  5. Comment on the reliability of your answer to part (d).
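The doctor's model is linear in \(x = 1/m\) rather than in \(m\), so the regression is carried out on the transformed values. The sketch below works from the raw table rather than the rounded sums quoted in the question, so its coefficients may differ slightly in the later decimal places; all names are illustrative.

```python
# Month m and new cases d from the table.
months = [1, 2, 3, 4, 5, 6]
cases = [102, 69, 61, 58, 52, 48]

# Transform: the model d = a + b*x uses x = 1/m.
xs = [1 / m for m in months]

n = len(xs)
x_bar = sum(xs) / n
d_bar = sum(cases) / n
s_xx = sum(x * x for x in xs) - n * x_bar ** 2
s_xd = sum(x * d for x, d in zip(xs, cases)) - n * x_bar * d_bar

b = s_xd / s_xx
a = d_bar - b * x_bar

# Month 13 corresponds to x = 1/13, which is below the smallest observed x (1/6).
d_13 = a + b * (1 / 13)

print(f"d = {a:.1f} + {b:.1f}x")
print(f"estimate for month 13: {d_13:.0f} cases")
```

Since \(x = 1/13\) lies outside the range of \(x\) values used to fit the line, the month 13 figure is an extrapolation, which is the point part (e) invites a comment on.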
Edexcel S1 Q6
14 marks Moderate -0.8
6. A physics student recorded the length, \(l \mathrm {~cm}\), of a spring when different masses, \(m\) grams, were suspended from it giving the following results.

| \(m\) (g) | 50 | 100 | 200 | 300 | 400 | 500 | 600 | 700 |
|---|---|---|---|---|---|---|---|---|
| \(l\) (cm) | 7.8 | 10.7 | 16.5 | 22.1 | 28.0 | 33.9 | 35.2 | 35.6 |

  1. Represent these data on a scatter diagram with \(l\) on the vertical axis. The student decides to find the equation of a regression line of the form \(l = a + b m\) using only the data for \(m \leq 500 \mathrm {~g}\).
  2. Give a reason to support the fitting of such a regression line and explain why the student is excluding two of his values.
    (2 marks)
    You may use $$\Sigma m = 1550 , \quad \Sigma l = 119 , \quad \Sigma m ^ { 2 } = 552500 , \quad \Sigma l ^ { 2 } = 2869.2 , \quad \Sigma m l = 39540 .$$
  3. Find the values of \(a\) and \(b\).
  4. Explain the significance of the values of \(a\) and \(b\) in this situation.
Edexcel S1 Q4
11 marks Standard +0.3
4. An engineer tested a new material under extreme conditions in a wind tunnel. He recorded the number of microfractures, \(n\), that formed and the wind speed, \(v\) metres per second, for 8 different values of \(v\) with all other conditions remaining constant. He then coded the data using \(x = v - 700\) and \(y = n - 20\) and calculated the following summary statistics.
$$\Sigma x = 100 , \quad \Sigma y = 23 , \quad \Sigma x ^ { 2 } = 215000 , \quad \Sigma x y = 11600 .$$
  1. Find an equation of the regression line of \(y\) on \(x\).
  2. Hence, find an equation of the regression line of \(n\) on \(v\).
  3. Use your regression line to estimate the number of microfractures that would be formed if the material was tested in a wind speed of 900 metres per second with all other conditions remaining constant.
    (2 marks)
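Working with coded data involves two steps: fit the line in the coded variables, then substitute \(x = v - 700\) and \(y = n - 20\) to recover the line in the original variables. A minimal sketch, with variable names that are assumptions made for this illustration:

```python
# Summary statistics for the coded data (8 observations).
n_points = 8
sum_x, sum_y = 100, 23
sum_x2, sum_xy = 215000, 11600

# S_xx and S_xy from the sums.
s_xx = sum_x2 - sum_x ** 2 / n_points
s_xy = sum_xy - sum_x * sum_y / n_points

# Regression line of y on x: y = a + b*x.
b = s_xy / s_xx
a = sum_y / n_points - b * sum_x / n_points

# Decode: n - 20 = a + b*(v - 700)  =>  n = (a + 20 - 700*b) + b*v.
intercept_n = a + 20 - 700 * b
slope_n = b  # a shift of origin leaves the gradient unchanged

n_at_900 = intercept_n + slope_n * 900

print(f"y = {a:.3f} + {b:.4f}x")
print(f"n = {intercept_n:.2f} + {slope_n:.4f}v")
print(f"estimate at v = 900: {n_at_900:.1f} microfractures")
```

Both codings here are simple translations, so only the intercept changes when the line is rewritten in terms of \(n\) and \(v\).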
AQA S3 2012 June Q1
6 marks Moderate -0.8
1 A wildlife expert measured the neck lengths, \(x\) metres, and the tail lengths, \(y\) metres, of a sample of 12 mature male giraffes as part of a study into their physical characteristics. The results are shown in the table.
AQA S3 2015 June Q1
6 marks Moderate -0.8
1 A demographer measured the length of the right foot, \(x\) millimetres, and the length of the right hand, \(y\) millimetres, of each of a sample of 12 males aged between 19 years and 25 years. The results are given in the table.
OCR MEI Further Statistics A AS 2018 June Q6
9 marks Standard +0.3
6 A researcher is investigating various bodily characteristics of frogs of various species. She collects data on length, \(x \mathrm {~mm}\), and head width, \(y \mathrm {~mm}\), of a random sample of 14 frogs of a particular species. A scatter diagram of the data is shown in Fig. 6, together with the equation of the regression line of \(y\) on \(x\) and also the value of \(r ^ { 2 }\). \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e3ac0ba0-9692-4018-894e-2b04b07eaf32-6_949_1616_450_228} \captionsetup{labelformat=empty} \caption{Fig. 6}
\end{figure}
  1. (A) Use the equation of the regression line to estimate the mean head width for frogs of each of the following lengths.
OCR MEI Further Statistics A AS 2019 June Q5
13 marks Standard +0.3
5 A researcher is investigating births of females and males in a particular species of animal which very often produces litters of 7 offspring.
The table shows some data about the number of females per litter in 200 litters of 7 offspring. The researcher thinks that a binomial distribution \(\mathrm { B } ( 7 , p )\) may be an appropriate model for these data. (c) Complete the test at the \(5 \%\) significance level. Fig. 5 shows the probability distribution \(\mathrm { B } ( 7,0.35 )\) together with the relative frequencies of the observed data (the numbers of litters each divided by 200). \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{fd496303-10f1-450e-bbeb-421ab6f4de21-5_659_1285_342_319} \captionsetup{labelformat=empty} \caption{Fig. 5}
\end{figure} (d) Comment on the result of the test completed in part (c) by considering Fig. 5.
OCR MEI Further Statistics A AS 2019 June Q6
13 marks Standard +0.3
6 A meteorologist is investigating the relationship between altitude \(x\) metres and mean annual temperature \(y ^ { \circ } \mathrm { C }\) in an American state.
She selects 12 locations at various altitudes and then stations a remote monitoring device at each of them to measure the temperature over the course of a year. Fig. 6 illustrates the data which she obtains. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{fd496303-10f1-450e-bbeb-421ab6f4de21-6_686_1477_486_292} \captionsetup{labelformat=empty} \caption{Fig. 6}
\end{figure}
  1. Explain why it would not be appropriate to carry out a hypothesis test for correlation based on the product moment correlation coefficient.
  2. Explain why altitude has been plotted on the horizontal axis in Fig. 6. Summary statistics for \(x\) and \(y\) are as follows. $$\sum x = 21200 \quad \sum y = 105.4 \quad \sum x ^ { 2 } = 39100000 \quad \sum y ^ { 2 } = 1004 \quad \sum x y = 176090$$
  3. Calculate the equation of the regression line of \(y\) on \(x\).
  4. Use the equation of the regression line to predict the values of the mean annual temperature at each of the following altitudes.
OCR MEI Further Statistics A AS 2022 June Q6
10 marks Moderate -0.8
6 Tom has read in a newspaper that you can tell the air temperature by counting how often a cricket chirps in a period of 20 seconds. (A cricket is a type of insect.) He wants to know exactly how the temperature can be predicted. On 8 randomly selected days, when Tom can hear crickets chirping, he records the number of chirps, \(x\), made by a cricket in a 20-second interval, and also the temperature, \(y ^ { \circ } \mathrm { C }\), at that time. The data are summarised as follows. $$n = 8 \quad \sum x = 268 \quad \sum y = 141.9 \quad \sum x ^ { 2 } = 9618 \quad \sum y ^ { 2 } = 2630.55 \quad \sum x y = 5009.1$$ These data are illustrated below. \includegraphics[max width=\textwidth, alt={}, center]{8f1e0c68-a334-4657-823e-386ab0994c02-5_661_1035_699_242}
  1. Determine the equation of the regression line of \(y\) on \(x\). Give your answer in the form \(y = a x + b\), giving the values of \(a\) and \(b\) correct to \(\mathbf { 3 }\) significant figures.
  2. Use the equation of the regression line to predict the temperature for the following values of \(x\).
OCR MEI Further Statistics A AS 2024 June Q4
10 marks Standard +0.3
4 A chemist is conducting an experiment in which the concentration of a certain chemical, A, is supposed to be recorded at the start of the experiment and then every 30 seconds after the start. The time after the start is denoted by \(t \mathrm {~s}\) and the concentration by \(z \mathrm {~mg} \mathrm {~cm} ^ { - 3 }\). The collected data are shown in the table below. Note that the concentration at \(t = 90\) was not recorded.

| Time, \(t\) | 0 | 30 | 60 | 120 | 150 |
|---|---|---|---|---|---|
| Concentration of A, \(z\) | 40.0 | 31.3 | 27.5 | 12.8 | 11.4 |

The chemist wishes to plot the data on a graph.
  1. Explain why \(t\) should be plotted on the horizontal axis. You are given that the summary statistics for the data are as follows. $$n = 5 \quad \sum t = 360 \quad \sum z = 123.0 \quad \sum t ^ { 2 } = 41400 \quad \sum z ^ { 2 } = 3629.74 \quad \sum t z = 5835$$ The regression line of \(z\) on \(t\) is given by \(z = a + b t\) and is used to model the concentration of chemical A for \(t \geqslant 0\).
    1. Use the summary statistics to determine the value of \(a\) and the value of \(b\).
    2. Find the value of the residual at each of the following values of \(t\).
      • \(t = 60\)
      • \(t = 120\)
        1. Use the equation of the regression line to estimate the value of the concentration at 90 seconds.
        2. With reference to your answers to part (b)(ii), comment on the reliability of your answer to part (c)(i).
      Further experiments indicate that the model is reasonably reliable for times greater than 150 seconds up to about 200 seconds.
  2. Show that the model cannot be valid beyond a time of about 200 seconds.
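The regression line, the two residuals and the time at which the model breaks down can all be checked numerically. The sketch below recomputes everything from the five recorded data points; the helper `residual` and the other names are hypothetical, and the code is a check on the arithmetic rather than a model answer.

```python
# Recorded times (s) and concentrations (mg per cubic cm); t = 90 is missing.
ts = [0, 30, 60, 120, 150]
zs = [40.0, 31.3, 27.5, 12.8, 11.4]

n = len(ts)
s_tt = sum(t * t for t in ts) - sum(ts) ** 2 / n
s_tz = sum(t * z for t, z in zip(ts, zs)) - sum(ts) * sum(zs) / n

# Regression line of z on t: z = a + b*t.
b = s_tz / s_tt
a = sum(zs) / n - b * sum(ts) / n


def residual(t, z):
    """Observed concentration minus the concentration given by the line."""
    return z - (a + b * t)


res_60 = residual(60, 27.5)
res_120 = residual(120, 12.8)

z_90 = a + b * 90   # interpolated estimate for the missing reading
t_zero = -a / b     # time at which the modelled concentration reaches zero

print(f"z = {a:.2f} + ({b:.4f})t")
print(f"residuals: {res_60:.2f} at t = 60, {res_120:.2f} at t = 120")
print(f"estimate at t = 90: {z_90:.1f}")
print(f"model reaches zero at t = {t_zero:.0f} s")
```

Beyond the time where the line crosses zero, the model would predict a negative concentration, which is impossible.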
OCR MEI Further Statistics A AS 2020 November Q5
8 marks Moderate -0.3
5 A doctor is investigating the relationship between the levels in the blood of a particular hormone and of calcium in healthy adults. The levels of the hormone and of calcium, each measured in suitable units, are denoted by \(x\) and \(y\) respectively. The doctor selects a random sample of 14 adults and measures the hormone and calcium levels in each of them. The spreadsheet in Fig. 5 shows the values obtained, together with a scatter diagram which illustrates the data. The equation of the regression line of \(y\) on \(x\) is shown on the scatter diagram, together with the value of the square of the product moment correlation coefficient. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{ba3fcd3c-6834-4116-be0e-d5b27aed0a7e-5_801_1644_646_255} \captionsetup{labelformat=empty} \caption{Fig. 5}
\end{figure}
  1. Use the equation of the regression line to estimate the mean calcium level of people with the following hormone levels.
OCR MEI Further Statistics A AS 2021 November Q6
11 marks Moderate -0.3
6 A health researcher is investigating the relationship between age and maximum heart rate. A commonly quoted formula states that 'maximum heart rate \(= 220\) - age in years'. The researcher wants to check if this formula is a satisfactory model for people who work in the large hospital where she is employed. The researcher selects a random sample of 20 people who work in her hospital, and measures their maximum heart rates.
  1. Explain why the researcher selects a sample, rather than using all of the people who work in the hospital. The ages, \(x\) years, and maximum heart rates, \(y\) beats per minute, of the people in the researcher's sample are summarised as follows. \(n = 20 \quad \sum x = 922 \quad \sum y = 3638 \quad \sum x ^ { 2 } = 47250 \quad \sum y ^ { 2 } = 664610 \quad \sum x y = 164998\) These data are illustrated below. \includegraphics[max width=\textwidth, alt={}, center]{5be067ff-4668-48d6-8ed2-b8dfa3e678f7-5_758_1246_1027_244}
    1. Draw the line which represents the formula 'maximum heart rate \(= 220 -\) age in years' on the copy of the scatter diagram in the Printed Answer Booklet.
    2. Comment on how well this model fits the data.
  2. Determine the equation of the regression line of maximum heart rate on age.
  3. Use the equation of the regression line to predict the values of the maximum heart rate for each of the following ages.
OCR MEI Further Statistics Minor 2022 June Q2
13 marks Moderate -0.8
2 A forester is investigating the relationship between the diameter and the height of young beech trees. She selects a random sample of 15 young beech trees in a forest and records their diameters, \(d \mathrm {~cm}\), and their heights, \(h \mathrm {~m}\). The data are illustrated in the scatter diagram. \includegraphics[max width=\textwidth, alt={}, center]{e8624e9b-5143-49d2-9683-cc3a1082694e-3_649_1116_386_230}
  1. State whether either or both of the variables \(d\) and \(h\) are random variables. Summary data for the diameters and heights are as follows. $$n = 15 \quad \sum d = 84.9 \quad \sum h = 124.7 \quad \sum d ^ { 2 } = 624.55 \quad \sum h ^ { 2 } = 1230.57 \quad \sum d h = 866.63$$
  2. Find the equation of the regression line of \(h\) on \(d\). Give your answer in the form \(h = a d + b\), giving the values of \(a\) and \(b\) correct to \(\mathbf { 2 }\) decimal places.
  3. Use the regression line to predict the heights of beech trees with the following diameters.
    Comment on this in relation to your regression line.
  4. State the coordinates of the point at which the regression line of \(d\) on \(h\) meets the line which you calculated in part (b).
OCR MEI Further Statistics Minor 2023 June Q5
8 marks Moderate -0.8
5 An ornithologist is investigating the link between the wing length and the mass of small birds, in order to try to predict the mass from the wing length without having to weigh birds. The ornithologist takes a random sample of 9 birds and measures their wing lengths \(w \mathrm {~mm}\) and their masses \(m \mathrm {~g}\). The spreadsheet below shows the data, together with a scatter diagram which illustrates the data. \includegraphics[max width=\textwidth, alt={}, center]{72215d69-c3e6-492d-bb3e-bdc28aeb4613-5_719_1424_495_246}
  1. Find the equation of the regression line of \(m\) on \(w\), giving the coefficients correct to \(\mathbf { 3 }\) significant figures.
  2. Use the equation which you found in part (a) to estimate the mass for each of the following wing lengths.
    Comment on this suggestion.
OCR MEI Further Statistics Minor 2021 November Q2
9 marks Moderate -0.8
2 A road transport researcher is investigating the link between the age of a person, \(a\) years, and the distance, \(d\) metres, at which the person can read a large road sign. The researcher selects 13 individuals of different ages between 20 and 80 and measures the value of \(d\) for each of them. The spreadsheet below shows the data which the researcher obtained, together with a scatter diagram which illustrates the data. \includegraphics[max width=\textwidth, alt={}, center]{691e8b55-e9a1-4fff-b9ee-a71ff1f73ead-3_725_1566_495_251}
  1. Explain which of the two variables \(a\) and \(d\) is the independent variable.
  2. Find the equation of the regression line of \(d\) on \(a\).
  3. Use the regression line to predict the average distance at which a 60-year-old person can read the road sign.
  4. Explain why it might not be sensible to use the regression line to predict the average distance at which a 5-year-old child can read the road sign.
  5. Determine the value of the residual for \(a = 40\).
  6. Explain why it would not be useful to find the equation of the regression line of \(a\) on \(d\).