Calculate y on x from raw data table

Questions that provide raw bivariate data in a table and ask to find the regression line of y on x.
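
As a brief reminder, these questions rely on the standard least squares formulae below. The notation \(S_{xx}\), \(S_{xy}\) follows the Edexcel questions in this list, and \(n\) is assumed to denote the number of data pairs.

$$S _ { x x } = \sum x ^ { 2 } - \frac { \left( \sum x \right) ^ { 2 } } { n } , \quad S _ { x y } = \sum x y - \frac { \sum x \sum y } { n }$$
$$b = \frac { S _ { x y } } { S _ { x x } } , \quad a = \bar { y } - b \bar { x } , \quad \text { giving } y = a + b x$$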

65 questions · Moderate -0.6

5.09b Least squares regression: concepts · 5.09c Calculate regression line · 5.09e Use regression: for estimation in context
Edexcel S1 2013 June Q1
13 marks Moderate -0.8
  1. A meteorologist believes that there is a relationship between the height above sea level, \(h \mathrm {~m}\), and the air temperature, \(t ^ { \circ } \mathrm { C }\). Data is collected at the same time from 9 different places on the same mountain. The data is summarised in the table below.
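
| \(h\) | 1400 | 1100 | 260 | 840 | 900 | 550 | 1230 | 100 | 770 |
|---|---|---|---|---|---|---|---|---|---|
| \(t\) | 3 | 10 | 20 | 9 | 10 | 13 | 5 | 24 | 16 |
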
[You may assume that \(\sum h = 7150 , \sum t = 110 , \sum h ^ { 2 } = 7171500 , \sum t ^ { 2 } = 1716\), \(\sum t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  1. Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  2. Calculate the product moment correlation coefficient for this data.
  3. State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  4. Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  5. Interpret the value of \(b\).
  6. Estimate the difference in air temperature between a height of 500 m and a height of 1000 m .
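
The arithmetic behind parts (a) to (f) can be sketched in Python using only the summary statistics quoted in the question. This is an illustrative sketch rather than a model answer: the variable names are invented here, and the rounding a mark scheme would expect is left to the reader.

```python
from math import sqrt

# Summary statistics quoted in the question (9 places on the mountain).
n = 9
sum_h, sum_t = 7150, 110
sum_hh, sum_th = 7171500, 64980
s_tt = 371.56  # given in the question

# Corrected sums of squares and products.
s_th = sum_th - sum_h * sum_t / n
s_hh = sum_hh - sum_h ** 2 / n

# Product moment correlation coefficient.
r = s_th / sqrt(s_hh * s_tt)

# Regression line of t on h: t = a + b*h.
b = s_th / s_hh
a = sum_t / n - b * sum_h / n

# Estimated change in temperature going from 500 m up to 1000 m.
diff = b * (1000 - 500)

print(f"S_th = {s_th:.1f}, S_hh = {s_hh:.1f}, r = {r:.3f}")
print(f"t = {a:.2f} + ({b:.4f})h, change over 500 m = {diff:.2f}")
```

A value of \(r\) close to \(-1\) is what would support using the regression equation in part (c).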
Edexcel S1 2014 June Q3
16 marks Moderate -0.8
3. A large company is analysing how much money it spends on paper in its offices every year. The number of employees, \(x\), and the amount of money spent on paper, \(p\) ( \(\pounds\) hundreds), in 8 randomly selected offices are given in the table below.

| \(x\) | 8 | 9 | 12 | 14 | 7 | 3 | 16 | 19 |
|---|---|---|---|---|---|---|---|---|
| \(p\) (£ hundreds) | 40.5 | 36.1 | 30.4 | 39.4 | 32.6 | 31.1 | 43.4 | 45.7 |

$$\text { (You may use } \sum x ^ { 2 } = 1160 \quad \sum p = 299.2 \quad \sum p ^ { 2 } = 11422 \quad \sum x p = 3449.5 \text { ) }$$
  1. Show that \(S _ { p p } = 231.92\) and find the value of \(S _ { x x }\) and the value of \(S _ { x p }\)
  2. Calculate the product moment correlation coefficient between \(x\) and \(p\).
    The equation of the regression line of \(p\) on \(x\) is given in the form \(p = a + b x\).
  3. Show that, to 3 significant figures, \(b = 0.824\) and find the value of \(a\).
  4. Estimate the amount of money spent on paper in an office with 10 employees.
  5. Explain the effect each additional employee has on the amount of money spent on paper.
    Later the company realised it had made a mistake in adding up its costs, \(p\). The true costs were actually half of the values recorded. The product moment correlation coefficient and the equation of the linear regression line are recalculated using this information.
  6. Write down the new value of
    1. the product moment correlation coefficient,
    2. the gradient of the regression line.
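
The effect of halving every \(p\) value can be checked numerically. The sketch below assumes the eight data pairs as read from the table in the question; the helper names `sums_of_squares` and `pmcc_and_gradient` are illustrative, not part of any library.

```python
from math import sqrt

def sums_of_squares(x, y):
    """Return (S_xx, S_yy, S_xy) for paired data."""
    n = len(x)
    s_xx = sum(v * v for v in x) - sum(x) ** 2 / n
    s_yy = sum(v * v for v in y) - sum(y) ** 2 / n
    s_xy = sum(u * v for u, v in zip(x, y)) - sum(x) * sum(y) / n
    return s_xx, s_yy, s_xy

def pmcc_and_gradient(x, y):
    """Return (r, b) for the regression line of y on x."""
    s_xx, s_yy, s_xy = sums_of_squares(x, y)
    return s_xy / sqrt(s_xx * s_yy), s_xy / s_xx

x = [8, 9, 12, 14, 7, 3, 16, 19]                          # employees
p = [40.5, 36.1, 30.4, 39.4, 32.6, 31.1, 43.4, 45.7]      # £ hundreds
p_true = [v / 2 for v in p]                               # corrected costs

r, b = pmcc_and_gradient(x, p)
r_half, b_half = pmcc_and_gradient(x, p_true)

print(f"recorded:  r = {r:.3f}, b = {b:.3f}")
print(f"corrected: r = {r_half:.3f}, b = {b_half:.3f}")
```

Scaling one variable by a positive constant leaves the correlation coefficient unchanged and scales the gradient by the same constant, which is what part (f) is testing.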
Edexcel S1 2014 June Q3
13 marks Easy -1.2
3. The table shows data on the number of visitors to the UK in a month, \(v\) (1000s), and the amount of money they spent, \(m\) ( \(\pounds\) millions), for each of 8 months.

| Number of visitors \(v\) (1000s) | 2450 | 2480 | 2540 | 2420 | 2350 | 2290 | 2400 | 2460 |
|---|---|---|---|---|---|---|---|---|
| Amount of money spent \(m\) (\(\pounds\) millions) | 1370 | 1350 | 1400 | 1330 | 1270 | 1210 | 1330 | 1350 |

You may use \(S _ { v v } = 42587.5 \quad S _ { v m } = 31512.5 \quad S _ { m m } = 25187.5 \quad \sum v = 19390 \quad \sum m = 10610\)
  1. Find the product moment correlation coefficient between \(m\) and \(v\).
  2. Give a reason to support fitting a regression model of the form \(m = a + b v\) to these data.
  3. Find the value of \(b\) correct to 3 decimal places.
  4. Find the equation of the regression line of \(m\) on \(v\).
  5. Interpret your value of \(b\).
  6. Use your answer to part (d) to estimate the amount of money spent when the number of visitors to the UK in a month is 2500000
  7. Comment on the reliability of your estimate in part (f). Give a reason for your answer.
Edexcel S1 2016 June Q1
11 marks Moderate -0.8
  1. A biologist is studying the behaviour of bees in a hive. Once a bee has located a source of food, it returns to the hive and performs a dance to indicate to the other bees how far away the source of the food is. The dance consists of a series of wiggles. The biologist records the distance, \(d\) metres, of the food source from the hive and the average number of wiggles, \(w\), in the dance.

| Distance, \(\boldsymbol { d } \mathbf { m }\) | 30 | 50 | 80 | 100 | 150 | 400 | 500 | 650 |
|---|---|---|---|---|---|---|---|---|
| Average number of wiggles, \(\boldsymbol { w }\) | 0.725 | 1.210 | 1.775 | 2.250 | 3.518 | 6.382 | 8.185 | 9.555 |

[You may use \(\sum w = 33.6 , \quad \sum d w = 13833 , \quad \mathrm { S } _ { d d } = 394600 , \quad \mathrm { S } _ { w w } = 80.481\) (to 3 decimal places)]
  1. Show that \(\mathrm { S } _ { d w } = 5601\)
  2. State, giving a reason, which is the response variable.
  3. Calculate the product moment correlation coefficient for these data.
  4. Calculate the equation of the regression line of \(w\) on \(d\), giving your answer in the form \(w = a + b d\).
    A new source of food is located 350 m from the hive.
    1. Use your regression equation to estimate the average number of wiggles in the corresponding dance.
    2. Comment, giving a reason, on the reliability of your estimate.
Edexcel S1 Q6
16 marks Moderate -0.8
6. To test the heating of tyre material, tyres are run on a test rig at chosen speeds under given conditions of load, pressure and surrounding temperature. The following table gives values of \(x\), the test rig speed in miles per hour (mph), and the temperature, \(y ^ { \circ } \mathrm { C }\), generated in the shoulder of the tyre for a particular tyre material.

| \(x ( \mathrm { mph } )\) | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|
| \(y \left( { } ^ { \circ } \mathrm { C } \right)\) | 53 | 55 | 63 | 65 | 78 | 83 | 91 | 101 |

  1. Draw a scatter diagram to represent these data.
  2. Give a reason to support the fitting of a regression line of the form \(y = a + b x\) through these points.
  3. Find the values of \(a\) and \(b\).
    (You may use \(\Sigma x ^ { 2 } = 9500 , \Sigma y ^ { 2 } = 45483 , \Sigma x y = 20615\) )
  4. Give an interpretation for each of \(a\) and \(b\).
  5. Use your line to estimate the temperature at 50 mph and explain why this estimate differs from the value given in the table.
    A tyre specialist wants to estimate the temperature of this tyre material at 12 mph and 85 mph.
  6. Explain briefly whether or not you would recommend the specialist to use this regression equation to obtain these estimates.
Edexcel S1 2003 November Q1
16 marks Moderate -0.8
  1. A company wants to pay its employees according to their performance at work. The performance score \(x\) and the annual salary, \(y\) in \(\pounds 100\) s, for a random sample of 10 of its employees for last year were recorded. The results are shown in the table below.
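
| \(x\) | 15 | 40 | 27 | 39 | 27 | 15 | 20 | 30 | 19 | 24 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 216 | 384 | 234 | 399 | 226 | 132 | 175 | 316 | 187 | 196 |
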
$$\text { [You may assume } \left. \Sigma x y = 69798 , \Sigma x ^ { 2 } = 7266 \right]$$
  1. Draw a scatter diagram to represent these data.
  2. Calculate exact values of \(S _ { x y }\) and \(S _ { x x }\).
    1. Calculate the equation of the regression line of \(y\) on \(x\), in the form \(y = a + b x\). Give the values of \(a\) and \(b\) to 3 significant figures.
    2. Draw this line on your scatter diagram.
  3. Interpret the gradient of the regression line.
    The company decides to use this regression model to determine future salaries.
  4. Find the proposed annual salary for an employee who has a performance score of 35 .
AQA S1 2006 January Q1
11 marks Moderate -0.8
1 At a certain small restaurant, the waiting time is defined as the time between sitting down at a table and a waiter first arriving at the table. This waiting time is dependent upon the number of other customers already seated in the restaurant. Alex is a customer who visited the restaurant on 10 separate days. The table shows, for each of these days, the number, \(x\), of customers already seated and his waiting time, \(y\) minutes.
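
| \(\boldsymbol { x }\) | 9 | 3 | 4 | 10 | 8 | 12 | 7 | 11 | 2 | 6 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 11 | 6 | 5 | 11 | 9 | 13 | 9 | 12 | 4 | 7 |
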
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\) in the form \(y = a + b x\).
  2. Give an interpretation, in context, for each of your values of \(a\) and \(b\).
  3. Use your regression equation to estimate Alex's waiting time when the number of customers already seated in the restaurant is:
    1. 5 ;
    2. 25 .
  4. Comment on the likely reliability of each of your estimates in part (c), given that, for the regression line calculated in part (a), the values of the 10 residuals lie between + 1.1 minutes and - 1.1 minutes.
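
When no summary statistics are supplied, the line has to be found from the raw pairs. The sketch below does this for the restaurant data and also computes the residuals referred to in part (d). The ten \((x, y)\) pairs are assumed to be as read from the table above; a calculator's regression mode would normally be used in the exam.

```python
# Customers already seated (x) and Alex's waiting time in minutes (y).
x = [9, 3, 4, 10, 8, 12, 7, 11, 2, 6]
y = [11, 6, 5, 11, 9, 13, 9, 12, 4, 7]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least squares gradient and intercept for y = a + b*x.
s_xx = sum((u - x_bar) ** 2 for u in x)
s_xy = sum((u - x_bar) * (v - y_bar) for u, v in zip(x, y))
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residual = observed y minus the value predicted by the line.
residuals = [v - (a + b * u) for u, v in zip(x, y)]

print(f"y = {a:.3f} + {b:.3f}x")
print(f"residuals range from {min(residuals):.2f} to {max(residuals):.2f}")

# Flag whether each requested estimate is interpolation or extrapolation.
for x0 in (5, 25):
    inside = min(x) <= x0 <= max(x)
    kind = "interpolation" if inside else "extrapolation"
    print(f"x = {x0}: estimate {a + b * x0:.1f} minutes ({kind})")
```

The estimate at \(x = 25\) lies well outside the observed range of 2 to 12 customers, which is the usual basis for questioning its reliability.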
AQA S1 2008 January Q4
12 marks Moderate -0.3
4 [Figure 1, printed on the insert, is provided for use in this question.]
Roseen is a self-employed decorator who wishes to estimate the times that it will take her to decorate bedrooms based upon their floor areas. She records the floor area, \(x \mathrm {~m} ^ { 2 }\), and the decorating time, \(y\) hours, for each of 10 bedrooms she has recently decorated.
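
| \(\boldsymbol { x }\) | 11.0 | 22.0 | 7.5 | 21.0 | 13.0 | 16.5 | 14.0 | 16.0 | 18.5 | 20.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 15.0 | 35.0 | 16.0 | 23.5 | 24.0 | 17.5 | 14.5 | 27.5 | 22.5 | 34.5 |
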
  1. On Figure 1, plot a scatter diagram of these data.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  3. Draw your regression line on Figure 1.
    1. Use your regression equation to estimate the time that Roseen will take to decorate a bedroom with a floor area of \(15 \mathrm {~m} ^ { 2 }\).
    2. Making reference to Figure 1, comment on the likely reliability of your estimate in part (d)(i).
AQA S1 2009 January Q6
15 marks Moderate -0.3
6 [Figure 1, printed on the insert, is provided for use in this question.]
For a random sample of 10 patients who underwent hip-replacement operations, records were kept of their ages, \(x\) years, and of the number of days, \(y\), following their operations before they were able to walk unaided safely.
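
| Patient | \(\mathbf { A }\) | \(\mathbf { B }\) | \(\mathbf { C }\) | \(\mathbf { D }\) | \(\mathbf { E }\) | \(\mathbf { F }\) | \(\mathbf { G }\) | \(\mathbf { H }\) | \(\mathbf { I }\) | \(\mathbf { J }\) |
|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { x }\) | 55 | 51 | 62 | 66 | 72 | 59 | 78 | 55 | 62 | 70 |
| \(\boldsymbol { y }\) | 34 | 33 | 39 | 49 | 48 | 43 | 51 | 41 | 46 | 51 |
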
  1. On Figure 1, complete the scatter diagram for these data.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  3. Draw your regression line on Figure 1.
  4. In fact, patients H, I and J were males and the other 7 patients were females.
    1. Calculate the mean of the residuals for the 3 male patients.
    2. Hence estimate, for a male patient aged 65 years, the number of days following his hip-replacement operation before he is able to walk unaided safely.
AQA S1 2011 January Q5
14 marks Moderate -0.3
5 Craig uses his car to travel regularly from his home to the area hospital for treatment. He leaves home at \(x\) minutes after 7.30 am and then takes \(y\) minutes to arrive at the hospital's reception desk. His results for 11 mornings are shown in the table.
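
| \(\boldsymbol { x }\) | 0 | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 31 | 42 | 32 | 58 | 47 | 56 | 79 | 68 | 89 | 95 | 85 |
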
  1. Explain why the time taken by Craig between leaving home and arriving at the hospital's reception desk is the response variable.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\), writing your answer in the form \(y = a + b x\).
  3. On a particular day, Craig needs to arrive at the hospital's reception desk no later than 9.00 am . He leaves home at 7.45 am . Estimate the number of minutes before 9.00 am that Craig will arrive at the hospital's reception desk. Give your answer to the nearest minute.
    1. Use your equation to estimate \(y\) when \(x = 85\).
    2. Give one statistical reason and one reason based on the context of this question as to why your estimate in part (d)(i) is unlikely to be realistic.
AQA S1 2012 January Q5
17 marks Moderate -0.8
5 An experiment was undertaken to collect information on the burning of a specific type of wood as a source of energy. At given fixed levels of the wood's moisture content, \(x\) per cent, its corresponding calorific value, \(y \mathrm { MWh } /\) tonne, on burning was determined. The results are shown in the table.
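
| \(\boldsymbol { x }\) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 5.2 | 4.7 | 4.3 | 4.0 | 3.2 | 2.8 | 2.5 | 2.2 | 1.8 | 1.5 | 1.3 | 1.0 | 0.6 |
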
  1. Explain why calorific value is the response variable.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\), giving your answer in the form \(y = a + b x\).
  3. Interpret, in context, your values for \(a\) and \(b\).
  4. Use your equation to estimate the wood's calorific value when it has a moisture content of 27 per cent.
  5. Calculate the value of the residual for the point \(( 35,2.5 )\).
  6. Given that the values of the 13 residuals lie between - 0.28 and + 0.23 , comment on the likely accuracy of your estimate in part (d).
    1. Give a general reason why your equation should not be used to estimate the wood's calorific value when it has a moisture content of 80 per cent.
    2. Give a specific reason, based on the context of this question and with numerical support, why your equation cannot be used to estimate the wood's calorific value when it has a moisture content of 80 per cent.
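
The residual in part (e) and the numerical support asked for in part (g)(ii) can both be read off once the line is fitted. The following sketch assumes the thirteen data pairs as read from the table in the question; the function name `predict` is illustrative.

```python
# Moisture content (per cent) and calorific value (MWh/tonne).
x = list(range(5, 70, 5))  # 5, 10, ..., 65
y = [5.2, 4.7, 4.3, 4.0, 3.2, 2.8, 2.5, 2.2, 1.8, 1.5, 1.3, 1.0, 0.6]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((u - x_bar) ** 2 for u in x)
s_xy = sum((u - x_bar) * (v - y_bar) for u, v in zip(x, y))

b = s_xy / s_xx
a = y_bar - b * x_bar

def predict(x0):
    """Value of y on the fitted line at x = x0."""
    return a + b * x0

# Residual = observed value minus fitted value at the same x.
residual_35 = 2.5 - predict(35)
estimate_27 = predict(27)
estimate_80 = predict(80)

print(f"y = {a:.3f} + ({b:.4f})x")
print(f"residual at (35, 2.5): {residual_35:.2f}")
print(f"estimate at 27%: {estimate_27:.2f}, at 80%: {estimate_80:.2f}")
```

Because \(x = 35\) is the mean of the moisture levels, the fitted value there equals \(\bar{y}\). The fitted value at 80 per cent is negative, which is impossible for a calorific value.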
AQA S1 2013 January Q1
9 marks Moderate -0.8
1 Bob, a church warden, decides to investigate the lifetime of a particular manufacturer's brand of beeswax candle. Each candle is 30 cm in length. From a box containing a large number of such candles, he selects one candle at random. He lights the candle and, after it has burned continuously for \(x\) hours, he records its length, \(y \mathrm {~cm}\), to the nearest centimetre. His results are shown in the table.
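
| \(\boldsymbol { x }\) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 |
|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 27 | 25 | 21 | 19 | 16 | 11 | 9 | 5 | 2 |
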
  1. State the value that you would expect for \(a\) in the equation of the least squares regression line, \(y = a + b x\).
    1. Calculate the equation of the least squares regression line, \(y = a + b x\).
    2. Interpret the value that you obtain for \(b\).
    3. It is claimed by the candle manufacturer that the total length of time that such candles are likely to burn for is more than 50 hours. Comment on this claim, giving a numerical justification for your answer.
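
One way to test the manufacturer's claim is to find where the fitted line reaches \(y = 0\). The sketch below assumes the nine data pairs as read from the table in the question and treats the linear model as holding until the candle burns out, which is itself an assumption.

```python
# Hours burned (x) and remaining candle length in cm (y).
x = [5, 10, 15, 20, 25, 30, 35, 40, 45]
y = [27, 25, 21, 19, 16, 11, 9, 5, 2]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((u - x_bar) ** 2 for u in x)
s_xy = sum((u - x_bar) * (v - y_bar) for u, v in zip(x, y))

b = s_xy / s_xx          # cm lost per hour of burning (negative)
a = y_bar - b * x_bar    # fitted length at x = 0

# Solve a + b*x = 0 for the estimated total burning time.
burn_time = -a / b

print(f"y = {a:.2f} + ({b:.2f})x")
print(f"estimated burn-out after {burn_time:.1f} hours")
```

The fitted intercept can be compared with the stated candle length of 30 cm, and the burn-out time with the claimed 50 hours.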
AQA S1 2007 June Q5
13 marks Moderate -0.8
5 Bob, a gardener, measures the time taken, \(y\) minutes, for 60 grams of weedkiller pellets to dissolve in 10 litres of water at different set temperatures, \(x ^ { \circ } \mathrm { C }\). His results are shown in the table.
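
| \(\boldsymbol { x }\) | 16 | 20 | 24 | 28 | 32 | 36 | 40 | 44 | 48 | 52 | 56 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 4.7 | 4.3 | 3.8 | 3.5 | 3.0 | 2.7 | 2.4 | 2.0 | 1.8 | 1.6 | 1.1 |
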
  1. State why the explanatory variable is temperature.
  2. Calculate the equation of the least squares regression line \(y = a + b x\).
    1. Interpret, in the context of this question, your value for \(b\).
    2. Explain why no sensible practical interpretation can be given for your value of \(a\).
    1. Estimate the time taken to dissolve 60 grams of weedkiller pellets in 10 litres of water at \(30 ^ { \circ } \mathrm { C }\).
    2. Show why the equation cannot be used to make a valid estimate of the time taken to dissolve 60 grams of weedkiller pellets in 10 litres of water at \(75 ^ { \circ } \mathrm { C }\). (2 marks)
AQA S1 2009 June Q4
8 marks Moderate -0.8
4 As part of an investigation, a chlorine block is immersed in a large tank of water held at a constant temperature. The block slowly dissolves, and its weight, \(y\) grams, is noted \(x\) days after immersion. The results are shown in the table.
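
| \(\boldsymbol { x }\) days | 5 | 10 | 15 | 20 | 30 | 40 | 50 | 60 | 75 |
|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) grams | 47 | 44 | 42 | 38 | 35 | 27 | 23 | 16 | 9 |
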
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  2. Hence estimate, to the nearest gram, the initial weight of the block.
  3. A company which markets the chlorine blocks claims that a block will usually dissolve completely after about 13 weeks. Comment, with justification, on this claim.
AQA S1 2010 June Q6
14 marks Moderate -0.3
6 During a study of reaction times, each of a random sample of 12 people, aged between 40 and 80 years, was asked to react as quickly as possible to a stimulus displayed on a computer screen. Their ages, \(x\) years, and reaction times, \(y\) milliseconds, are shown in the table.
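
| Person | Age ( \(\boldsymbol { x }\) years) | Reaction time ( \(y \mathrm {~ms}\) ) |
|---|---|---|
| A | 41 | 520 |
| B | 54 | 750 |
| C | 66 | 650 |
| D | 72 | 920 |
| E | 71 | 280 |
| F | 57 | 620 |
| G | 60 | 740 |
| H | 47 | 950 |
| I | 77 | 970 |
| J | 65 | 780 |
| K | 51 | 550 |
| L | 59 | 730 |
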
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
    1. Draw your regression line on the scatter diagram on page 16.
    2. Comment on what this reveals.
  2. It was later discovered that the reaction times for persons E and H had been recorded incorrectly. The values should have been 820 and 590 respectively. After making these corrections, computations gave $$S _ { x x } = 1272 \quad S _ { x y } = 14760 \quad \bar { x } = 60 \quad \bar { y } = 720$$
    1. Using the symbol ⋅ , plot the correct values for persons E and H on the scatter diagram on page 16.
    2. Recalculate the equation of the least squares regression line of \(y\) on \(x\), and draw this regression line on the scatter diagram on page 16.
    3. Hence revise as necessary your comments in part (b)(ii).
      \includegraphics[max width=\textwidth, alt={}]{c4844a30-6a86-49e3-b6aa-8e213dfc8ca1-15_2484_1709_223_153}
      \section*{Reaction Times}
      \includegraphics[max width=\textwidth, alt={}]{c4844a30-6a86-49e3-b6aa-8e213dfc8ca1-16_1943_1301_351_292}
      \includegraphics[max width=\textwidth, alt={}]{c4844a30-6a86-49e3-b6aa-8e213dfc8ca1-17_2484_1707_223_155}
AQA S1 2011 June Q3
15 marks Moderate -0.8
3
  1. During a particular summer holiday, Rick worked in a fish and chip shop at a seaside resort. He suspected that the shop's takings, \(\pounds y\), on a weekday were dependent upon the forecast of that day's maximum temperature, \(x ^ { \circ } \mathrm { C }\), in the resort, made at 6.00 pm on the previous day. To investigate this suspicion, he recorded values of \(x\) and \(y\) for a random sample of 7 weekdays during July.

    | \(\boldsymbol { x }\) | 23 | 18 | 27 | 19 | 25 | 20 | 22 |
    |---|---|---|---|---|---|---|---|
    | \(\boldsymbol { y }\) | 4290 | 3188 | 5106 | 3829 | 5057 | 4264 | 4485 |

    1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
    2. Estimate the shop's takings on a weekday during July when the maximum temperature was forecast to be \(24 ^ { \circ } \mathrm { C }\).
    3. Explain why your equation may not be suitable for estimating the shop's takings on a weekday during February.
    4. Describe, in the context of this question, a variable other than the maximum temperature, \(x\), that may affect \(y\).
  2. Seren, who also worked in the fish and chip shop, investigated the possible linear relationship between the shop's takings, \(\pounds z\), recorded in \(\pounds 000\) s, and each of two other explanatory variables, \(v\) and \(w\).
    1. She calculated correctly that the regression line of \(z\) on \(v\) had a \(z\)-intercept of - 1 and a gradient of 0.15 . Draw this line, for values of \(v\) from 0 to 40, on Figure 1 on page 4.
    2. She also calculated correctly that the regression line of \(z\) on \(w\) had a \(z\)-intercept of 5 and a gradient of - 0.40 . Draw this line, for values of \(w\) from 0 to 10, on Figure 2 below.
      \begin{figure}[h]
      \captionsetup{labelformat=empty} \caption{Figure 1} \includegraphics[alt={},max width=\textwidth]{767ec629-6350-41d9-bbb9-e059a5fd8c70-4_792_604_680_717}
      \end{figure}
      \begin{figure}[h]
      \captionsetup{labelformat=empty} \caption{Figure 2} \includegraphics[alt={},max width=\textwidth]{767ec629-6350-41d9-bbb9-e059a5fd8c70-4_792_696_1692_687}
      \end{figure}
AQA S1 2012 June Q3
11 marks Moderate -0.3
3 The table shows the maximum weight, \(y _ { A }\) grams, of Salt \(A\) that will dissolve in 100 grams of water at various temperatures, \(x ^ { \circ } \mathrm { C }\).
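
| \(\boldsymbol { x }\) | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 60 | 70 | 80 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y } _ { \boldsymbol { A } }\) | 20 | 35 | 48 | 57 | 77 | 92 | 101 | 111 | 121 | 137 | 159 | 182 |
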
  1. Calculate the equation of the least squares regression line of \(y _ { A }\) on \(x\).
  2. The data in the above table are plotted on the scatter diagram on page 4. Draw your regression line on this scatter diagram.
  3. For water temperatures in the range \(10 ^ { \circ } \mathrm { C }\) to \(80 ^ { \circ } \mathrm { C }\), the maximum weight, \(y _ { B }\) grams, of Salt \(B\) that will dissolve in 100 grams of water is given by the equation $$y _ { B } = 60.1 + 0.255 x$$
    1. Draw this line on the scatter diagram.
    2. Estimate the water temperature at which the maximum weight of Salt \(A\) that will dissolve in 100 grams of water is the same as that of Salt B.
    3. For Salt \(A\) and Salt \(B\), compare the effects of water temperature on the maximum weight that will dissolve in 100 grams of water. Your answer should identify two distinct differences.
      \section*{Temperatures and Maximum Weights}
      \includegraphics[max width=\textwidth, alt={}]{91466019-8feb-4292-b616-e8e8667e2e54-4_2023_1682_404_173}
AQA S1 2014 June Q3
11 marks Moderate -0.8
3 The table shows the body mass index (BMI), \(x\), and the systolic blood pressure (SBP), \(y \mathrm { mmHg }\), for each of a random sample of 10 men, aged between 35 years and 40 years, from a particular population.
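
| \(\boldsymbol { x }\) | 13 | 23 | 29 | 35 | 17 | 34 | 25 | 20 | 31 | 27 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 103 | 115 | 124 | 126 | 108 | 120 | 113 | 117 | 118 | 119 |
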
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  2. Use your equation to estimate the SBP of a man from this population who is aged 38 years and who has a BMI of 30 .
  3. State why your equation might not be appropriate for estimating the SBP of a man from this population:
    1. who is aged 38 years and who has a BMI of 45 ;
    2. who is aged 50 years and who has a BMI of 25 .
  4. Find the value of the residual for the point \(( 20,117 )\).
  5. The mean of the vertical distances of the 10 points from the regression line calculated in part (a) is 2.71, correct to three significant figures. Comment on the likely accuracy of your estimate in part (b).
    [1 mark]
AQA S1 2014 June Q6
12 marks Moderate -0.8
6 A rubber seal is fitted to the bottom of a flood barrier. When no pressure is applied, the depth of the seal is 15 cm . When pressure is applied, a watertight seal is created between the flood barrier and the ground. The table shows the pressure, \(x\) kilopascals ( kPa ), applied to the seal and the resultant depth, \(y\) centimetres, of the seal.
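
| \(\boldsymbol { x }\) | 25 | 50 | 75 | 100 | 125 | 150 | 175 | 200 | 250 | 300 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 14.7 | 13.4 | 12.8 | 11.9 | 11.0 | 10.3 | 9.7 | 9.0 | 7.5 | 6.7 |
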
    1. State the value that you would expect for \(a\) in the equation of the least squares regression line, \(y = a + b x\).
    2. Calculate the equation of the least squares regression line, \(y = a + b x\).
    3. Interpret, in context, your value for \(b\).
  1. Calculate an estimate of the depth of the seal when it is subjected to a pressure of 225 kPa .
    1. Give a statistical reason as to why your equation is unlikely to give a realistic estimate of the depth of the seal if it were to be subjected to a pressure of 400 kPa .
    2. Give a reason based on the context of this question as to why your equation will not give a realistic estimate of the depth of the seal if it were to be subjected to a pressure of 525 kPa .
      [3 marks]
      \includegraphics[max width=\textwidth, alt={}]{8aeacd54-a5a1-4f2d-b936-2faf635ffce7-20_946_1709_1761_153}
      \includegraphics[max width=\textwidth, alt={}]{8aeacd54-a5a1-4f2d-b936-2faf635ffce7-21_2484_1707_221_153}
      \includegraphics[max width=\textwidth, alt={}]{8aeacd54-a5a1-4f2d-b936-2faf635ffce7-23_2484_1707_221_153}
AQA S1 2016 June Q4
9 marks Moderate -0.8
4 As part of her science project, a student found the mass, \(y\) grams, of a particular compound that dissolved in 100 ml of water at each of 12 different set temperatures, \(x ^ { \circ } \mathrm { C }\). The results are shown in the table.
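
| \(\boldsymbol { x }\) | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 | 70 | 75 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 242 | 262 | 269 | 290 | 298 | 310 | 326 | 355 | 359 | 375 | 390 | 412 |
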
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  2. Interpret, in context, your value for the gradient of this regression line.
  3. Use your equation to estimate the mass of the compound which will dissolve in 100 ml of water at \(68 ^ { \circ } \mathrm { C }\).
  4. Given that the values of the 12 residuals for the regression line of \(y\) on \(x\) lie between - 7 and + 9 , comment, with justification, on the likely accuracy of your estimate in part (c).
    [2 marks]
Edexcel S1 Q5
12 marks Moderate -0.8
  1. The table shows the numbers of cars and vans in a company's fleet having registrations with the prefix letters shown.
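
| Registration letter | \(K\) | \(L\) | \(M\) | \(N\) | \(P\) | \(R\) | \(S\) | \(T\) | \(V\) |
|---|---|---|---|---|---|---|---|---|---|
| Number of cars \(( x )\) | 6 | 7 | 9 | 11 | 15 | 14 | 12 | 10 | 7 |
| Number of vans \(( y )\) | 8 | 10 | 14 | 13 | 13 | 15 | 14 | 9 | 8 |
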
  1. Plot a scatter graph of this data, with the number of cars on the horizontal axis and the number of vans on the vertical axis.
  2. If there were 4 \(J\)-registered cars, estimate the number of \(J\)-registered vans.
    Given that \(\sum x ^ { 2 } = 1001 , \sum y ^ { 2 } = 1264\) and \(\sum x y = 1106\),
  3. calculate the product-moment correlation coefficient between \(x\) and \(y\). Give a brief interpretation of your answer.
Edexcel S1 Q7
17 marks Moderate -0.8
7. A doctor wished to investigate the effects of staying awake for long periods on a person's ability to complete simple tasks. She recorded the number of times, \(n\), that a subject could clench his or her fist in 30 seconds after being awake for \(h\) hours. The results for one subject were as follows.
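
| \(h\) (hours) | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 |
|---|---|---|---|---|---|---|---|---|---|
| \(n\) | 116 | 114 | 109 | 101 | 94 | 94 | 86 | 81 | 80 |
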
  1. Plot a scatter diagram of \(n\) against \(h\) for these results.
    You may use $$\Sigma h = 180 , \quad \Sigma n = 875 , \quad \Sigma h ^ { 2 } = 3660 , \quad \Sigma h n = 17204 .$$
  2. Obtain the equation of the regression line of \(n\) on \(h\) in the form \(n = a + b h\).
  3. Give a practical interpretation of the constant \(b\).
  4. Explain why this regression line would be unlikely to be appropriate for values of \(h\) between 0 and 16 .
    (2 marks)
    Another subject underwent the same tests giving rise to a regression line of \(n = 213.4 - 5.87 h\).
  5. After how many hours of being awake together would you expect these two subjects to be able to clench their fists the same number of times in 30 seconds?
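
Parts (b) and (e) can be sketched together: fit the first subject's line from the summary statistics, then solve for the value of \(h\) at which the two lines give the same \(n\). The variable names below are illustrative, and the second line's coefficients are taken directly from the question.

```python
# Summary statistics for the first subject (9 observations).
n_obs = 9
sum_h, sum_n = 180, 875
sum_hh, sum_hn = 3660, 17204

s_hh = sum_hh - sum_h ** 2 / n_obs        # 60
s_hn = sum_hn - sum_h * sum_n / n_obs     # -296

# First subject: n = a1 + b1*h.
b1 = s_hn / s_hh
a1 = sum_n / n_obs - b1 * sum_h / n_obs

# Second subject's line as stated in the question: n = 213.4 - 5.87h.
a2, b2 = 213.4, -5.87

# Equal fist clenches when a1 + b1*h = a2 + b2*h.
h_equal = (a2 - a1) / (b1 - b2)

print(f"first subject: n = {a1:.1f} + ({b1:.2f})h")
print(f"lines meet at h = {h_equal:.1f} hours")
```

The intersection falls inside the observed range of 16 to 24 hours, so no extrapolation is involved for the first subject's line.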
Edexcel S1 Q6
17 marks Moderate -0.8
6. A school introduced a new programme of support lessons in 1994 with a view to improving grades in GCSE English. The table below shows the number of years since 1994, \(n\), and the corresponding percentage of students achieving A to C grades in GCSE English, \(p\), for each year.

| \(n\) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| \(p ( \% )\) | 35.2 | 37.1 | 40.6 | 39.0 | 43.4 | 44.8 |

  1. Represent these data on a scatter diagram.
    You may use the following values. $$\Sigma n = 21 , \quad \Sigma p = 240.1 , \quad \Sigma n ^ { 2 } = 91 , \quad \Sigma p ^ { 2 } = 9675.41 , \quad \Sigma n p = 873 .$$
  2. Find an equation of the regression line of \(p\) on \(n\) and draw it on your graph.
  3. Calculate the product moment correlation coefficient for these data and comment on the suitability of a linear model for the relationship between \(n\) and \(p\) during this period.
Edexcel S1 Q7
15 marks Moderate -0.8
7. Pipes-R-us manufacture a special lightweight aluminium tubing. The price \(\pounds P\), for each length, \(l\) metres, that the company sells is shown in the table.

| \(l\) (metres) | 0.5 | 0.8 | 1.0 | 1.5 | 2 | 4 | 6 |
|---|---|---|---|---|---|---|---|
| \(P ( \pounds )\) | 2.50 | 3.40 | 4.00 | 5.20 | 6.00 | 10.50 | 15.00 |

  1. Represent these data on a scatter diagram.
    You may use $$\Sigma l = 15.8 , \quad \Sigma P = 46.6 , \quad \Sigma l ^ { 2 } = 60.14 , \quad \Sigma l P = 159.77$$
  2. Find the equation of the regression line of \(P\) on \(l\) in the form \(P = a + b l\).
  3. Give a practical interpretation of the constant \(b\).
    In response to customer demand Pipes-R-us decide to start selling tubes cut to specific lengths. Initially the company decides to use the regression line found in part (b) as a pricing formula for this new service.
  4. Calculate the price that Pipes-R-us should charge for 5.2 metres of the tubing.
  5. Suggest a reason why Pipes-R-us might not offer prices based on the regression line for any length of tubing.
Edexcel S1 Q6
14 marks Moderate -0.8
6. A physics student recorded the length, \(l \mathrm {~cm}\), of a spring when different masses, \(m\) grams, were suspended from it giving the following results.

| \(m ( \mathrm {~g} )\) | 50 | 100 | 200 | 300 | 400 | 500 | 600 | 700 |
|---|---|---|---|---|---|---|---|---|
| \(l ( \mathrm {~cm} )\) | 7.8 | 10.7 | 16.5 | 22.1 | 28.0 | 33.9 | 35.2 | 35.6 |

  1. Represent these data on a scatter diagram with \(l\) on the vertical axis.
    The student decides to find the equation of a regression line of the form \(l = a + b m\) using only the data for \(m \leq 500 \mathrm {~g}\).
  2. Give a reason to support the fitting of such a regression line and explain why the student is excluding two of his values.
    (2 marks)
    You may use $$\Sigma m = 1550 , \quad \Sigma l = 119 , \quad \Sigma m ^ { 2 } = 552500 , \quad \Sigma l ^ { 2 } = 2869.2 , \quad \Sigma m l = 39540 .$$
  3. Find the values of \(a\) and \(b\).
  4. Explain the significance of the values of \(a\) and \(b\) in this situation.
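
Restricting the fit to \(m \leq 500\) g can be sketched as a filter on the raw pairs before the usual calculation. The data are assumed to be as read from the table in the question, and the filtered sums should agree with the summary values quoted there.

```python
# Mass in grams (m) and spring length in cm (l), all eight readings.
m_all = [50, 100, 200, 300, 400, 500, 600, 700]
l_all = [7.8, 10.7, 16.5, 22.1, 28.0, 33.9, 35.2, 35.6]

# Keep only the readings the student uses (m <= 500 g).
pairs = [(m, l) for m, l in zip(m_all, l_all) if m <= 500]
n = len(pairs)

sum_m = sum(m for m, _ in pairs)
sum_l = sum(l for _, l in pairs)
sum_mm = sum(m * m for m, _ in pairs)
sum_ml = sum(m * l for m, l in pairs)

s_mm = sum_mm - sum_m ** 2 / n
s_ml = sum_ml - sum_m * sum_l / n

# Regression line of l on m: l = a + b*m.
b = s_ml / s_mm              # extension in cm per gram
a = sum_l / n - b * sum_m / n  # fitted length with no mass attached

print(f"n = {n}, sum m = {sum_m}, sum m^2 = {sum_mm}")
print(f"l = {a:.2f} + {b:.4f}m")
```

The last two readings barely increase with mass, which suggests the linear pattern has broken down by that point and is a plausible reason for excluding them.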