Calculate y on x from raw data table

Questions that provide raw bivariate data in a table and ask to find the regression line of y on x.

66 questions

Edexcel S1 2010 June Q6
6. A travel agent sells flights to different destinations from Beerow airport. The distance \(d\), measured in units of 100 km, of the destination from the airport and the fare \(\pounds f\) are recorded for a random sample of 6 destinations.
| Destination | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) |
|---|---|---|---|---|---|---|
| \(d\) | 2.2 | 4.0 | 6.0 | 2.5 | 8.0 | 5.0 |
| \(f\) | 18 | 20 | 25 | 23 | 32 | 28 |
$$\text { [You may use } \sum d ^ { 2 } = 152.09 \quad \sum f ^ { 2 } = 3686 \quad \sum f d = 723.1 \text { ] }$$
  1. Using the axes below, complete a scatter diagram to illustrate this information.
  2. Explain why a linear regression model may be appropriate to describe the relationship between \(f\) and \(d\).
  3. Calculate \(S _ { d d }\) and \(S _ { f d }\)
  4. Calculate the equation of the regression line of \(f\) on \(d\) giving your answer in the form \(f = a + b d\).
  5. Give an interpretation of the value of \(b\).
Jane is planning her holiday and wishes to fly from Beerow airport to a destination \(t \mathrm {~km}\) away. A rival travel agent charges 5p per km.
  6. Find the range of values of \(t\) for which the first travel agent is cheaper than the rival.
    \includegraphics[max width=\textwidth, alt={}, center]{039e6fcf-3222-40cc-95ea-37b8dc4a4ddb-11_1013_1701_1718_116}
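As a rough numerical check on parts (c) and (d), the sums can be recomputed directly from the table. The sketch below is not taken from the mark scheme; the variable names are illustrative and the data values are those read from the table above.

```python
# Distance d (in units of 100 km) and fare f (in pounds) for destinations A to F.
d_vals = [2.2, 4.0, 6.0, 2.5, 8.0, 5.0]
f_vals = [18, 20, 25, 23, 32, 28]
n = len(d_vals)

# Raw sums; the question quotes sum d^2 = 152.09 and sum fd = 723.1.
sum_d = sum(d_vals)
sum_f = sum(f_vals)
sum_dd = sum(d * d for d in d_vals)
sum_fd = sum(d * f for d, f in zip(d_vals, f_vals))

# S_dd = sum d^2 - (sum d)^2 / n and S_fd = sum fd - (sum d)(sum f) / n
S_dd = sum_dd - sum_d ** 2 / n
S_fd = sum_fd - sum_d * sum_f / n

# Regression line of f on d: gradient b = S_fd / S_dd, intercept a = mean f - b * mean d
b = S_fd / S_dd
a = sum_f / n - b * sum_d / n

print(f"S_dd = {S_dd:.4f}, S_fd = {S_fd:.4f}")
print(f"f = {a:.2f} + {b:.2f} d")
```

With these data the line comes out close to \(f = 15.0 + 2.03 d\), so each extra 100 km adds roughly £2.03 to the fare.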
Edexcel S1 2011 June Q7
7. A teacher took a random sample of 8 children from a class. For each child the teacher recorded the length of their left foot, \(f \mathrm {~cm}\), and their height, \(h \mathrm {~cm}\). The results are given in the table below.
| \(f\) | 23 | 26 | 23 | 22 | 27 | 24 | 20 | 21 |
|---|---|---|---|---|---|---|---|---|
| \(h\) | 135 | 144 | 134 | 136 | 140 | 134 | 130 | 132 |
(You may use \(\sum f = 186 \quad \sum h = 1085 \quad \mathrm {~S} _ { f f } = 39.5 \quad \mathrm {~S} _ { h h } = 139.875 \quad \sum f h = 25291\) )
  1. Calculate \(\mathrm { S } _ { f h }\)
  2. Find the equation of the regression line of \(h\) on \(f\) in the form \(h = a + b f\). Give the value of \(a\) and the value of \(b\) correct to 3 significant figures.
  3. Use your equation to estimate the height of a child with a left foot length of 25 cm.
  4. Comment on the reliability of your estimate in (c), giving a reason for your answer.
The left foot length of the teacher is 25 cm.
  5. Give a reason why the equation in (b) should not be used to estimate the teacher's height.
Edexcel S1 2012 June Q3
3. A scientist is researching whether or not birds of prey exposed to pollutants lay eggs with thinner shells. He collects a random sample of egg shells from each of 6 different nests and tests for pollutant level, \(p\), and measures the thinning of the shell, \(t\). The results are shown in the table below.
| \(p\) | 3 | 8 | 30 | 25 | 15 | 12 |
|---|---|---|---|---|---|---|
| \(t\) | 1 | 3 | 9 | 10 | 5 | 6 |
[You may use \(\sum p ^ { 2 } = 1967\) and \(\sum p t = 694\) ]
  1. Draw a scatter diagram on the axes on page 7 to represent these data.
  2. Explain why a linear regression model may be appropriate to describe the relationship between \(p\) and \(t\).
  3. Calculate the value of \(S _ { p t }\) and the value of \(S _ { p p }\).
  4. Find the equation of the regression line of \(t\) on \(p\), giving your answer in the form \(t = a + b p\).
  5. Plot the point ( \(\bar { p } , \bar { t }\) ) and draw the regression line on your scatter diagram.
The scientist reviews similar studies and finds that pollutant levels above 16 are likely to result in the death of a chick soon after hatching.
  6. Estimate the minimum thinning of the shell that is likely to result in the death of a chick.
    \includegraphics[max width=\textwidth, alt={}, center]{0593544d-392d-465b-b922-c9cb1435abb5-05_1257_1568_301_173}
Edexcel S1 2013 June Q1
  1. A meteorologist believes that there is a relationship between the height above sea level, \(h \mathrm {~m}\), and the air temperature, \(t ^ { \circ } \mathrm { C }\). Data is collected at the same time from 9 different places on the same mountain. The data is summarised in the table below.
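| \(h\) | 1400 | 1100 | 260 | 840 | 900 | 550 | 1230 | 100 | 770 |
|---|---|---|---|---|---|---|---|---|---|
| \(t\) | 3 | 10 | 20 | 9 | 10 | 13 | 5 | 24 | 16 |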
[You may assume that \(\sum h = 7150 , \sum t = 110 , \sum h ^ { 2 } = 7171500 , \sum t ^ { 2 } = 1716\), \(\sum t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  1. Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  2. Calculate the product moment correlation coefficient for this data.
  3. State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  4. Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  5. Interpret the value of \(b\).
  6. Estimate the difference in air temperature between a height of 500 m and a height of 1000 m.
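The calculations in parts (a), (b), (d) and (f) need only the summary sums quoted in the question. The following is a minimal sketch of that arithmetic, assuming \(n = 9\) as stated; the variable names are illustrative and this is not presented as the examiners' working.

```python
from math import sqrt

# Summary statistics quoted in the question (9 places on the mountain).
n = 9
sum_h, sum_t = 7150, 110
sum_hh, sum_tt, sum_th = 7171500, 1716, 64980

# S_th = sum th - (sum t)(sum h) / n, and similarly for S_hh and S_tt.
S_th = sum_th - sum_t * sum_h / n
S_hh = sum_hh - sum_h ** 2 / n
S_tt = sum_tt - sum_t ** 2 / n  # the question quotes this as 371.56

# Product moment correlation coefficient.
r = S_th / sqrt(S_hh * S_tt)

# Regression line of t on h: t = a + b h.
b = S_th / S_hh
a = sum_t / n - b * sum_h / n

# Difference in estimated temperature between 1000 m and 500 m depends only on b.
diff = b * (1000 - 500)

print(f"S_th = {S_th:.0f}, S_hh = {S_hh:.0f}, r = {r:.3f}")
print(f"t = {a:.1f} + ({b:.4f}) h, difference over 500 m = {diff:.1f} degrees C")
```

The strongly negative value of \(r\) (about \(-0.952\)) is what supports using a linear model here, and the gradient shows the temperature falling by about 1.5 °C for every 100 m climbed.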
Edexcel S1 2014 June Q3
3. A large company is analysing how much money it spends on paper in its offices every year. The number of employees, \(x\), and the amount of money spent on paper, \(p\) ( \(\pounds\) hundreds), in 8 randomly selected offices are given in the table below.
| \(x\) | 8 | 9 | 12 | 14 | 7 | 3 | 16 | 19 |
|---|---|---|---|---|---|---|---|---|
| \(p\) (£ hundreds) | 40.5 | 36.1 | 30.4 | 39.4 | 32.6 | 31.1 | 43.4 | 45.7 |
$$\text { (You may use } \sum x ^ { 2 } = 1160 \quad \sum p = 299.2 \quad \sum p ^ { 2 } = 11422 \quad \sum x p = 3449.5 \text { ) }$$
  1. Show that \(S _ { p p } = 231.92\) and find the value of \(S _ { x x }\) and the value of \(S _ { x p }\)
  2. Calculate the product moment correlation coefficient between \(x\) and \(p\).
The equation of the regression line of \(p\) on \(x\) is given in the form \(p = a + b x\).
  3. Show that, to 3 significant figures, \(b = 0.824\) and find the value of \(a\).
  4. Estimate the amount of money spent on paper in an office with 10 employees.
  5. Explain the effect each additional employee has on the amount of money spent on paper.
Later the company realised it had made a mistake in adding up its costs, \(p\). The true costs were actually half of the values recorded. The product moment correlation coefficient and the equation of the linear regression line are recalculated using this information.
  6. Write down the new value of
    1. the product moment correlation coefficient,
    2. the gradient of the regression line.
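Part (f) relies on how a change of scale in \(p\) affects the correlation coefficient and the gradient. A short sketch, assuming the data read from the table above and using a hypothetical helper `fit`, illustrates the effect by fitting the line twice, once to the recorded costs and once to the halved costs.

```python
from math import sqrt

def fit(xs, ys):
    """Return (r, a, b) for the least squares line y = a + b x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    return s_xy / sqrt(s_xx * s_yy), mean_y - b * mean_x, b

# Number of employees x and recorded paper cost p (in hundreds of pounds).
x = [8, 9, 12, 14, 7, 3, 16, 19]
p = [40.5, 36.1, 30.4, 39.4, 32.6, 31.1, 43.4, 45.7]

r_recorded, a_recorded, b_recorded = fit(x, p)
r_true, a_true, b_true = fit(x, [value / 2 for value in p])

print(f"recorded costs: r = {r_recorded:.3f}, b = {b_recorded:.3f}")
print(f"halved costs:   r = {r_true:.3f}, b = {b_true:.3f}")
```

Halving every value of \(p\) halves \(S_{xp}\) and quarters \(S_{pp}\), so \(r\) is unchanged while the gradient is halved.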
Edexcel S1 2014 June Q3
3. The table shows data on the number of visitors to the UK in a month, \(v\) (1000s), and the amount of money they spent, \(m\) ( \(\pounds\) millions), for each of 8 months.
| Number of visitors \(v\) (1000s) | 2450 | 2480 | 2540 | 2420 | 2350 | 2290 | 2400 | 2460 |
|---|---|---|---|---|---|---|---|---|
| Amount of money spent \(m\) (\(\pounds\) millions) | 1370 | 1350 | 1400 | 1330 | 1270 | 1210 | 1330 | 1350 |
You may use
\(S _ { v v } = 42587.5 \quad S _ { v m } = 31512.5 \quad S _ { m m } = 25187.5 \quad \sum v = 19390 \quad \sum m = 10610\)
  1. Find the product moment correlation coefficient between \(m\) and \(v\).
  2. Give a reason to support fitting a regression model of the form \(m = a + b v\) to these data.
  3. Find the value of \(b\) correct to 3 decimal places.
  4. Find the equation of the regression line of \(m\) on \(v\).
  5. Interpret your value of \(b\).
  6. Use your answer to part (d) to estimate the amount of money spent when the number of visitors to the UK in a month is 2500000
  7. Comment on the reliability of your estimate in part (f). Give a reason for your answer.
Edexcel S1 2016 June Q1
  1. A biologist is studying the behaviour of bees in a hive. Once a bee has located a source of food, it returns to the hive and performs a dance to indicate to the other bees how far away the source of the food is. The dance consists of a series of wiggles. The biologist records the distance, \(d\) metres, of the food source from the hive and the average number of wiggles, \(w\), in the dance.
| Distance, \(d\) m | 30 | 50 | 80 | 100 | 150 | 400 | 500 | 650 |
|---|---|---|---|---|---|---|---|---|
| Average number of wiggles, \(w\) | 0.725 | 1.210 | 1.775 | 2.250 | 3.518 | 6.382 | 8.185 | 9.555 |
[You may use \(\sum w = 33.6 \quad \sum d w = 13833 \quad \mathrm {~S} _ { d d } = 394600 \quad \mathrm {~S} _ { w w } = 80.481\) (to 3 decimal places)]
  1. Show that \(\mathrm { S } _ { d w } = 5601\)
  2. State, giving a reason, which is the response variable.
  3. Calculate the product moment correlation coefficient for these data.
  4. Calculate the equation of the regression line of \(w\) on \(d\), giving your answer in the form \(w = a + b d\).
A new source of food is located 350 m from the hive.
    1. Use your regression equation to estimate the average number of wiggles in the corresponding dance.
    2. Comment, giving a reason, on the reliability of your estimate.
Edexcel S1 Q6
6. To test the heating of tyre material, tyres are run on a test rig at chosen speeds under given conditions of load, pressure and surrounding temperature. The following table gives values of \(x\), the test rig speed in miles per hour (mph), and the temperature, \(y ^ { \circ } \mathrm { C }\), generated in the shoulder of the tyre for a particular tyre material.
| \(x\) (mph) | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|
| \(y\) (\({}^{\circ}\mathrm{C}\)) | 53 | 55 | 63 | 65 | 78 | 83 | 91 | 101 |
  1. Draw a scatter diagram to represent these data.
  2. Give a reason to support the fitting of a regression line of the form \(y = a + b x\) through these points.
  3. Find the values of \(a\) and \(b\).
    (You may use \(\Sigma x ^ { 2 } = 9500 , \Sigma y ^ { 2 } = 45483 , \Sigma x y = 20615\) )
  4. Give an interpretation for each of \(a\) and \(b\).
  5. Use your line to estimate the temperature at 50 mph and explain why this estimate differs from the value given in the table.
A tyre specialist wants to estimate the temperature of this tyre material at 12 mph and 85 mph.
  6. Explain briefly whether or not you would recommend the specialist to use this regression equation to obtain these estimates.
Edexcel S1 2003 November Q1
  1. A company wants to pay its employees according to their performance at work. The performance score \(x\) and the annual salary, \(y\) in \(\pounds 100\) s, for a random sample of 10 of its employees for last year were recorded. The results are shown in the table below.
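| \(x\) | 15 | 40 | 27 | 39 | 27 | 15 | 20 | 30 | 19 | 24 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 216 | 384 | 234 | 399 | 226 | 132 | 175 | 316 | 187 | 196 |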
$$\text { [You may assume } \left. \Sigma x y = 69798 , \Sigma x ^ { 2 } = 7266 \right]$$
  1. Draw a scatter diagram to represent these data.
  2. Calculate exact values of \(S _ { x y }\) and \(S _ { x x }\).
    1. Calculate the equation of the regression line of \(y\) on \(x\), in the form \(y = a + b x\). Give the values of \(a\) and \(b\) to 3 significant figures.
    2. Draw this line on your scatter diagram.
  3. Interpret the gradient of the regression line.
The company decides to use this regression model to determine future salaries.
  4. Find the proposed annual salary for an employee who has a performance score of 35.
AQA S1 2006 January Q1
1 At a certain small restaurant, the waiting time is defined as the time between sitting down at a table and a waiter first arriving at the table. This waiting time is dependent upon the number of other customers already seated in the restaurant. Alex is a customer who visited the restaurant on 10 separate days. The table shows, for each of these days, the number, \(x\), of customers already seated and his waiting time, \(y\) minutes.
| \(x\) | 9 | 3 | 4 | 10 | 8 | 12 | 7 | 11 | 2 | 6 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 11 | 6 | 5 | 11 | 9 | 13 | 9 | 12 | 4 | 7 |
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\) in the form \(y = a + b x\).
  2. Give an interpretation, in context, for each of your values of \(a\) and \(b\).
  3. Use your regression equation to estimate Alex's waiting time when the number of customers already seated in the restaurant is:
    1. 5;
    2. 25.
  4. Comment on the likely reliability of each of your estimates in part (c), given that, for the regression line calculated in part (a), the values of the 10 residuals lie between \(+1.1\) minutes and \(-1.1\) minutes.
AQA S1 2008 January Q4
4 [Figure 1, printed on the insert, is provided for use in this question.]
Roseen is a self-employed decorator who wishes to estimate the times that it will take her to decorate bedrooms based upon their floor areas. She records the floor area, \(x \mathrm {~m} ^ { 2 }\), and the decorating time, \(y\) hours, for each of 10 bedrooms she has recently decorated.
| \(x\) | 11.0 | 22.0 | 7.5 | 21.0 | 13.0 | 16.5 | 14.0 | 16.0 | 18.5 | 20.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 15.0 | 35.0 | 16.0 | 23.5 | 24.0 | 17.5 | 14.5 | 27.5 | 22.5 | 34.5 |
  1. On Figure 1, plot a scatter diagram of these data.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  3. Draw your regression line on Figure 1.
    1. Use your regression equation to estimate the time that Roseen will take to decorate a bedroom with a floor area of \(15 \mathrm {~m} ^ { 2 }\).
    2. Making reference to Figure 1, comment on the likely reliability of your estimate in part (d)(i).
AQA S1 2009 January Q6
6 [Figure 1, printed on the insert, is provided for use in this question.]
For a random sample of 10 patients who underwent hip-replacement operations, records were kept of their ages, \(x\) years, and of the number of days, \(y\), following their operations before they were able to walk unaided safely.
| Patient | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| \(x\) | 55 | 51 | 62 | 66 | 72 | 59 | 78 | 55 | 62 | 70 |
| \(y\) | 34 | 33 | 39 | 49 | 48 | 43 | 51 | 41 | 46 | 51 |
  1. On Figure 1, complete the scatter diagram for these data.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  3. Draw your regression line on Figure 1.
  4. In fact, patients H, I and J were males and the other 7 patients were females.
    1. Calculate the mean of the residuals for the 3 male patients.
    2. Hence estimate, for a male patient aged 65 years, the number of days following his hip-replacement operation before he is able to walk unaided safely.
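Part (d) uses residuals, that is, observed values of \(y\) minus the values predicted by the line, to adjust an estimate for one subgroup. The sketch below shows one way this could be computed, assuming the data read from the table above; names such as `male_mean` and `estimate_65` are illustrative only.

```python
# Ages x (years) and days y before walking unaided, for patients A to J.
patients = "ABCDEFGHIJ"
x = [55, 51, 62, 66, 72, 59, 78, 55, 62, 70]
y = [34, 33, 39, 49, 48, 43, 51, 41, 46, 51]
n = len(x)

# Least squares regression line of y on x: y = a + b x.
mean_x, mean_y = sum(x) / n, sum(y) / n
S_xx = sum((xi - mean_x) ** 2 for xi in x)
S_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = S_xy / S_xx
a = mean_y - b * mean_x

# Residual for each patient = observed y - predicted y.
residuals = {p: yi - (a + b * xi) for p, xi, yi in zip(patients, x, y)}

# Patients H, I and J are the males; average their residuals.
male_mean = sum(residuals[p] for p in "HIJ") / 3

# Adjust the line's prediction at age 65 by the mean male residual.
estimate_65 = a + b * 65 + male_mean

print(f"y = {a:.3f} + {b:.3f} x")
print(f"mean male residual = {male_mean:.2f}, estimate at 65 = {estimate_65:.1f} days")
```

All three male residuals are positive (roughly 3 days each), which is why the adjusted estimate of about 48 days is higher than the unadjusted value from the line.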
AQA S1 2011 January Q5
5 Craig uses his car to travel regularly from his home to the area hospital for treatment. He leaves home at \(x\) minutes after 7.30 am and then takes \(y\) minutes to arrive at the hospital's reception desk. His results for 11 mornings are shown in the table.
| \(x\) | 0 | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 31 | 42 | 32 | 58 | 47 | 56 | 79 | 68 | 89 | 95 | 85 |
  1. Explain why the time taken by Craig between leaving home and arriving at the hospital's reception desk is the response variable.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\), writing your answer in the form \(y = a + b x\).
  3. On a particular day, Craig needs to arrive at the hospital's reception desk no later than 9.00 am. He leaves home at 7.45 am. Estimate the number of minutes before 9.00 am that Craig will arrive at the hospital's reception desk. Give your answer to the nearest minute.
    1. Use your equation to estimate \(y\) when \(x = 85\).
    2. Give one statistical reason and one reason based on the context of this question as to why your estimate in part (d)(i) is unlikely to be realistic.
AQA S1 2012 January Q5
5 An experiment was undertaken to collect information on the burning of a specific type of wood as a source of energy. At given fixed levels of the wood's moisture content, \(x\) per cent, its corresponding calorific value, \(y \mathrm { MWh } /\) tonne, on burning was determined. The results are shown in the table.
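| \(x\) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 5.2 | 4.7 | 4.3 | 4.0 | 3.2 | 2.8 | 2.5 | 2.2 | 1.8 | 1.5 | 1.3 | 1.0 | 0.6 |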
  1. Explain why calorific value is the response variable.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\), giving your answer in the form \(y = a + b x\).
  3. Interpret, in context, your values for \(a\) and \(b\).
  4. Use your equation to estimate the wood's calorific value when it has a moisture content of 27 per cent.
  5. Calculate the value of the residual for the point \(( 35,2.5 )\).
  6. Given that the values of the 13 residuals lie between \(-0.28\) and \(+0.23\), comment on the likely accuracy of your estimate in part (d).
    1. Give a general reason why your equation should not be used to estimate the wood's calorific value when it has a moisture content of 80 per cent.
    2. Give a specific reason, based on the context of this question and with numerical support, why your equation cannot be used to estimate the wood's calorific value when it has a moisture content of 80 per cent.
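The residual in part (e) and the extrapolation in part (g) can both be checked numerically. The sketch below assumes the data read from the table above and is offered only as an illustration; the names `residual_35` and `predicted_80` are not from the question.

```python
# Moisture content x (per cent) and calorific value y (MWh per tonne).
x = list(range(5, 70, 5))  # 5, 10, ..., 65
y = [5.2, 4.7, 4.3, 4.0, 3.2, 2.8, 2.5, 2.2, 1.8, 1.5, 1.3, 1.0, 0.6]
n = len(x)

# Least squares regression line of y on x: y = a + b x.
mean_x, mean_y = sum(x) / n, sum(y) / n
S_xx = sum((xi - mean_x) ** 2 for xi in x)
S_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = S_xy / S_xx
a = mean_y - b * mean_x

# Residual at (35, 2.5) = observed value - value predicted by the line.
residual_35 = 2.5 - (a + b * 35)

# Interpolated estimate at 27 per cent and extrapolated value at 80 per cent.
predicted_27 = a + b * 27
predicted_80 = a + b * 80

print(f"y = {a:.2f} + ({b:.4f}) x")
print(f"residual at (35, 2.5) = {residual_35:.2f}")
print(f"prediction at 27% = {predicted_27:.2f}, at 80% = {predicted_80:.2f}")
```

Because 80 per cent lies outside the observed range of 5 to 65 per cent, the prediction there is an extrapolation, and the line gives a negative calorific value (about \(-0.7\)), which is physically impossible.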
AQA S1 2013 January Q1
1 Bob, a church warden, decides to investigate the lifetime of a particular manufacturer's brand of beeswax candle. Each candle is 30 cm in length. From a box containing a large number of such candles, he selects one candle at random. He lights the candle and, after it has burned continuously for \(x\) hours, he records its length, \(y \mathrm {~cm}\), to the nearest centimetre. His results are shown in the table.
| \(x\) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 |
|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 27 | 25 | 21 | 19 | 16 | 11 | 9 | 5 | 2 |
  1. State the value that you would expect for \(a\) in the equation of the least squares regression line, \(y = a + b x\).
    1. Calculate the equation of the least squares regression line, \(y = a + b x\).
    2. Interpret the value that you obtain for \(b\).
    3. It is claimed by the candle manufacturer that the total length of time that such candles are likely to burn for is more than 50 hours. Comment on this claim, giving a numerical justification for your answer.
AQA S1 2007 June Q5
5 Bob, a gardener, measures the time taken, \(y\) minutes, for 60 grams of weedkiller pellets to dissolve in 10 litres of water at different set temperatures, \(x ^ { \circ } \mathrm { C }\). His results are shown in the table.
| \(x\) | 16 | 20 | 24 | 28 | 32 | 36 | 40 | 44 | 48 | 52 | 56 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 4.7 | 4.3 | 3.8 | 3.5 | 3.0 | 2.7 | 2.4 | 2.0 | 1.8 | 1.6 | 1.1 |
  1. State why the explanatory variable is temperature.
  2. Calculate the equation of the least squares regression line \(y = a + b x\).
    1. Interpret, in the context of this question, your value for \(b\).
    2. Explain why no sensible practical interpretation can be given for your value of \(a\).
    1. Estimate the time taken to dissolve 60 grams of weedkiller pellets in 10 litres of water at \(30 ^ { \circ } \mathrm { C }\).
    2. Show why the equation cannot be used to make a valid estimate of the time taken to dissolve 60 grams of weedkiller pellets in 10 litres of water at \(75 ^ { \circ } \mathrm { C }\). (2 marks)
AQA S1 2009 June Q4
4 As part of an investigation, a chlorine block is immersed in a large tank of water held at a constant temperature. The block slowly dissolves, and its weight, \(y\) grams, is noted \(x\) days after immersion. The results are shown in the table.
| \(x\) days | 5 | 10 | 15 | 20 | 30 | 40 | 50 | 60 | 75 |
|---|---|---|---|---|---|---|---|---|---|
| \(y\) grams | 47 | 44 | 42 | 38 | 35 | 27 | 23 | 16 | 9 |
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  2. Hence estimate, to the nearest gram, the initial weight of the block.
  3. A company which markets the chlorine blocks claims that a block will usually dissolve completely after about 13 weeks. Comment, with justification, on this claim.
AQA S1 2010 June Q6
6 During a study of reaction times, each of a random sample of 12 people, aged between 40 and 80 years, was asked to react as quickly as possible to a stimulus displayed on a computer screen. Their ages, \(x\) years, and reaction times, \(y\) milliseconds, are shown in the table.
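| Person | Age (\(x\) years) | Reaction time (\(y \mathrm {~ms}\)) |
|---|---|---|
| A | 41 | 520 |
| B | 54 | 750 |
| C | 66 | 650 |
| D | 72 | 920 |
| E | 71 | 280 |
| F | 57 | 620 |
| G | 60 | 740 |
| H | 47 | 950 |
| I | 77 | 970 |
| J | 65 | 780 |
| K | 51 | 550 |
| L | 59 | 730 |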
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
    1. Draw your regression line on the scatter diagram on page 16.
    2. Comment on what this reveals.
  2. It was later discovered that the reaction times for persons E and H had been recorded incorrectly. The values should have been 820 and 590 respectively. After making these corrections, computations gave $$S _ { x x } = 1272 \quad S _ { x y } = 14760 \quad \bar { x } = 60 \quad \bar { y } = 720$$
    1. Using the symbol ⋅ , plot the correct values for persons E and H on the scatter diagram on page 16.
    2. Recalculate the equation of the least squares regression line of \(y\) on \(x\), and draw this regression line on the scatter diagram on page 16.
    3. Hence revise as necessary your comments in part (b)(ii).
      \includegraphics[max width=\textwidth, alt={}]{c4844a30-6a86-49e3-b6aa-8e213dfc8ca1-15_2484_1709_223_153}
      \section*{Reaction Times}
      \includegraphics[max width=\textwidth, alt={}]{c4844a30-6a86-49e3-b6aa-8e213dfc8ca1-16_1943_1301_351_292}
      \includegraphics[max width=\textwidth, alt={}]{c4844a30-6a86-49e3-b6aa-8e213dfc8ca1-17_2484_1707_223_155}
AQA S1 2011 June Q3
3
  1. During a particular summer holiday, Rick worked in a fish and chip shop at a seaside resort. He suspected that the shop's takings, \(\pounds y\), on a weekday were dependent upon the forecast of that day's maximum temperature, \(x ^ { \circ } \mathrm { C }\), in the resort, made at 6.00 pm on the previous day. To investigate this suspicion, he recorded values of \(x\) and \(y\) for a random sample of 7 weekdays during July.
    | \(x\) | 23 | 18 | 27 | 19 | 25 | 20 | 22 |
    |---|---|---|---|---|---|---|---|
    | \(y\) | 4290 | 3188 | 5106 | 3829 | 5057 | 4264 | 4485 |
    1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
    2. Estimate the shop's takings on a weekday during July when the maximum temperature was forecast to be \(24 ^ { \circ } \mathrm { C }\).
    3. Explain why your equation may not be suitable for estimating the shop's takings on a weekday during February.
    4. Describe, in the context of this question, a variable other than the maximum temperature, \(x\), that may affect \(y\).
  2. Seren, who also worked in the fish and chip shop, investigated the possible linear relationship between the shop's takings, \(\pounds z\), recorded in \(\pounds 000\) s, and each of two other explanatory variables, \(v\) and \(w\).
    1. She calculated correctly that the regression line of \(z\) on \(v\) had a \(z\)-intercept of \(-1\) and a gradient of 0.15. Draw this line, for values of \(v\) from 0 to 40, on Figure 1 on page 4.
    2. She also calculated correctly that the regression line of \(z\) on \(w\) had a \(z\)-intercept of 5 and a gradient of \(-0.40\). Draw this line, for values of \(w\) from 0 to 10, on Figure 2 below.
      \begin{figure}[h]
      \captionsetup{labelformat=empty} \caption{Figure 1} \includegraphics[alt={},max width=\textwidth]{767ec629-6350-41d9-bbb9-e059a5fd8c70-4_792_604_680_717}
      \end{figure}
      \begin{figure}[h]
      \captionsetup{labelformat=empty} \caption{Figure 2} \includegraphics[alt={},max width=\textwidth]{767ec629-6350-41d9-bbb9-e059a5fd8c70-4_792_696_1692_687}
      \end{figure}
AQA S1 2012 June Q3
3 The table shows the maximum weight, \(y _ { A }\) grams, of Salt \(A\) that will dissolve in 100 grams of water at various temperatures, \(x ^ { \circ } \mathrm { C }\).
| \(x\) | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 60 | 70 | 80 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y _ { A }\) | 20 | 35 | 48 | 57 | 77 | 92 | 101 | 111 | 121 | 137 | 159 | 182 |
  1. Calculate the equation of the least squares regression line of \(y _ { A }\) on \(x\).
  2. The data in the above table are plotted on the scatter diagram on page 4. Draw your regression line on this scatter diagram.
  3. For water temperatures in the range \(10 ^ { \circ } \mathrm { C }\) to \(80 ^ { \circ } \mathrm { C }\), the maximum weight, \(y _ { B }\) grams, of Salt \(B\) that will dissolve in 100 grams of water is given by the equation $$y _ { B } = 60.1 + 0.255 x$$
    1. Draw this line on the scatter diagram.
    2. Estimate the water temperature at which the maximum weight of Salt \(A\) that will dissolve in 100 grams of water is the same as that of Salt \(B\).
    3. For Salt \(A\) and Salt \(B\), compare the effects of water temperature on the maximum weight that will dissolve in 100 grams of water. Your answer should identify two distinct differences.
      \section*{Temperatures and Maximum Weights}
      \includegraphics[max width=\textwidth, alt={}]{91466019-8feb-4292-b616-e8e8667e2e54-4_2023_1682_404_173}
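Part (c)(ii) can be read from the scatter diagram, but it can also be estimated by equating the two lines. The sketch below assumes the Salt \(A\) data read from the table above and the Salt \(B\) equation quoted in part (c); the name `x_star` for the crossing temperature is illustrative.

```python
# Temperature x (degrees C) and maximum dissolved weight y_A (grams) for Salt A.
x = [10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80]
y_A = [20, 35, 48, 57, 77, 92, 101, 111, 121, 137, 159, 182]
n = len(x)

# Least squares regression line of y_A on x: y_A = a_A + b_A x.
mean_x, mean_y = sum(x) / n, sum(y_A) / n
S_xx = sum((xi - mean_x) ** 2 for xi in x)
S_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y_A))
b_A = S_xy / S_xx
a_A = mean_y - b_A * mean_x

# Line for Salt B as given in the question: y_B = 60.1 + 0.255 x.
a_B, b_B = 60.1, 0.255

# The lines meet where a_A + b_A x = a_B + b_B x.
x_star = (a_B - a_A) / (b_A - b_B)

print(f"y_A = {a_A:.2f} + {b_A:.3f} x")
print(f"lines cross at about x = {x_star:.1f} degrees C")
```

The crossing point of roughly 28 °C lies inside the 10 °C to 80 °C range for which both lines are stated to apply, and the much larger gradient for Salt \(A\) is one of the differences part (c)(iii) asks about.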
AQA S1 2014 June Q3
3 The table shows the body mass index (BMI), \(x\), and the systolic blood pressure (SBP), \(y \mathrm { mmHg }\), for each of a random sample of 10 men, aged between 35 years and 40 years, from a particular population.
| \(x\) | 13 | 23 | 29 | 35 | 17 | 34 | 25 | 20 | 31 | 27 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 103 | 115 | 124 | 126 | 108 | 120 | 113 | 117 | 118 | 119 |
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  2. Use your equation to estimate the SBP of a man from this population who is aged 38 years and who has a BMI of 30.
  3. State why your equation might not be appropriate for estimating the SBP of a man from this population:
    1. who is aged 38 years and who has a BMI of 45;
    2. who is aged 50 years and who has a BMI of 25.
  4. Find the value of the residual for the point \(( 20,117 )\).
  5. The mean of the vertical distances of the 10 points from the regression line calculated in part (a) is 2.71, correct to three significant figures. Comment on the likely accuracy of your estimate in part (b).
    [1 mark]
AQA S1 2014 June Q6
6 A rubber seal is fitted to the bottom of a flood barrier. When no pressure is applied, the depth of the seal is 15 cm. When pressure is applied, a watertight seal is created between the flood barrier and the ground. The table shows the pressure, \(x\) kilopascals (kPa), applied to the seal and the resultant depth, \(y\) centimetres, of the seal.
| \(x\) | 25 | 50 | 75 | 100 | 125 | 150 | 175 | 200 | 250 | 300 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 14.7 | 13.4 | 12.8 | 11.9 | 11.0 | 10.3 | 9.7 | 9.0 | 7.5 | 6.7 |
    1. State the value that you would expect for \(a\) in the equation of the least squares regression line, \(y = a + b x\).
    2. Calculate the equation of the least squares regression line, \(y = a + b x\).
    3. Interpret, in context, your value for \(b\).
  1. Calculate an estimate of the depth of the seal when it is subjected to a pressure of 225 kPa.
    1. Give a statistical reason as to why your equation is unlikely to give a realistic estimate of the depth of the seal if it were to be subjected to a pressure of 400 kPa.
    2. Give a reason based on the context of this question as to why your equation will not give a realistic estimate of the depth of the seal if it were to be subjected to a pressure of 525 kPa.
      [3 marks]
      \includegraphics[max width=\textwidth, alt={}]{8aeacd54-a5a1-4f2d-b936-2faf635ffce7-20_946_1709_1761_153}
      \includegraphics[max width=\textwidth, alt={}]{8aeacd54-a5a1-4f2d-b936-2faf635ffce7-21_2484_1707_221_153}
      \includegraphics[max width=\textwidth, alt={}]{8aeacd54-a5a1-4f2d-b936-2faf635ffce7-23_2484_1707_221_153}
AQA S1 2016 June Q4
4 As part of her science project, a student found the mass, \(y\) grams, of a particular compound that dissolved in 100 ml of water at each of 12 different set temperatures, \(x ^ { \circ } \mathrm { C }\). The results are shown in the table.
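| \(x\) | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 | 70 | 75 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 242 | 262 | 269 | 290 | 298 | 310 | 326 | 355 | 359 | 375 | 390 | 412 |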
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  2. Interpret, in context, your value for the gradient of this regression line.
  3. Use your equation to estimate the mass of the compound which will dissolve in 100 ml of water at \(68 ^ { \circ } \mathrm { C }\).
  4. Given that the values of the 12 residuals for the regression line of \(y\) on \(x\) lie between \(-7\) and \(+9\), comment, with justification, on the likely accuracy of your estimate in part (c).
    [2 marks]
Edexcel S1 Q5
5. The table shows the numbers of cars and vans in a company's fleet having registrations with the prefix letters shown.
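| Registration letter | \(K\) | \(L\) | \(M\) | \(N\) | \(P\) | \(R\) | \(S\) | \(T\) | \(V\) |
|---|---|---|---|---|---|---|---|---|---|
| Number of cars \(( x )\) | 6 | 7 | 9 | 11 | 15 | 14 | 12 | 10 | 7 |
| Number of vans \(( y )\) | 8 | 10 | 14 | 13 | 13 | 15 | 14 | 9 | 8 |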
  1. Plot a scatter graph of this data, with the number of cars on the horizontal axis and the number of vans on the vertical axis.
  2. If there were 4 \(J\)-registered cars, estimate the number of \(J\)-registered vans.
Given that \(\sum x ^ { 2 } = 1001 , \sum y ^ { 2 } = 1264\) and \(\sum x y = 1106\),
  3. calculate the product-moment correlation coefficient between \(x\) and \(y\). Give a brief interpretation of your answer.
Edexcel S1 Q6
6. The marks out of 75 obtained by a group of ten students in their first and second Statistics modules were as follows:
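| Student | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
|---|---|---|---|---|---|---|---|---|---|---|
| Module 1 \(( x )\) | 54 | 33 | 42 | 71 | 60 | 27 | 39 | 46 | 59 | 64 |
| Module 2 \(( y )\) | 50 | 22 | 44 | 58 | 42 | 19 | 35 | 46 | 55 | 60 |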
  1. Find \(\sum x\) and \(\sum y\).
Given that \(\sum x ^ { 2 } = 26353\) and \(\sum x y = 22991\),
  2. obtain the equation of the regression line of \(y\) on \(x\).
  3. Estimate the Module 2 result of a student whose mark in Module 1 was Explain why one of these estimates is less reliable than the other.
The equation of the regression line of \(x\) on \(y\) is \(x = 0.921 y + 9.81\).
  4. Deduce the product moment correlation coefficient between \(x\) and \(y\), and briefly interpret its value.
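Part (d) rests on the fact that the gradient of \(y\) on \(x\) is \(S_{xy}/S_{xx}\) and the gradient of \(x\) on \(y\) is \(S_{xy}/S_{yy}\), so their product equals \(r^2\). A minimal sketch of parts (a), (b) and (d), assuming the marks read from the table above and the quoted gradient 0.921, is given below; the variable names are illustrative.

```python
from math import sqrt

# Module 1 marks x and Module 2 marks y for students A to J.
x = [54, 33, 42, 71, 60, 27, 39, 46, 59, 64]
y = [50, 22, 44, 58, 42, 19, 35, 46, 55, 60]
n = len(x)

# Part (a): the raw sums.
sum_x, sum_y = sum(x), sum(y)

# Part (b): regression line of y on x, using the sums given in the question.
S_xx = 26353 - sum_x ** 2 / n
S_xy = 22991 - sum_x * sum_y / n
b_yx = S_xy / S_xx
a_yx = sum_y / n - b_yx * sum_x / n

# Part (d): gradient of the regression line of x on y, as quoted in the question.
b_xy = 0.921

# r^2 = b_yx * b_xy; r takes the (positive) sign shared by both gradients.
r = sqrt(b_yx * b_xy)

print(f"sum x = {sum_x}, sum y = {sum_y}")
print(f"y = {a_yx:.2f} + {b_yx:.3f} x")
print(f"r = {r:.3f}")
```

A value of \(r\) close to 0.91 indicates strong positive correlation between the two module marks.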