5.09a Dependent/independent variables

164 questions

Edexcel S1 2022 June Q2
14 marks Moderate -0.8
  1. Stuart is investigating the relationship between Gross Domestic Product (GDP) and the size of the population for a particular country.
    He takes a random sample of 9 years and records the size of the population, \(t\) millions, and the GDP, \(g\) billion dollars for each of these years.
The data are summarised as $$n = 9 \quad \sum t = 7.87 \quad \sum g = 144.84 \quad \sum g ^ { 2 } = 3624.41 \quad S _ { t t } = 1.29 \quad S _ { t g } = 40.25$$
  1. Calculate the product moment correlation coefficient between \(t\) and \(g\)
  2. Give an interpretation of your product moment correlation coefficient.
  3. Find the equation of the least squares regression line of \(g\) on \(t\) in the form \(g = a + b t\)
  4. Give an interpretation of the value of \(b\) in your regression line.
    1. Use the regression line from part (c) to estimate the GDP, in billions of dollars, for a population of 7000000
    2. Comment on the reliability of your answer in part (i). Give a reason, in context, for your answer.
Using the regression line from part (c), Stuart estimates that for a population increase of \(x\) million there will be an increase of 0.1 billion dollars in GDP.
  5. Find the value of \(x\)
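The calculations this question asks for can be sketched in a few lines of Python. This is only an illustrative check, not a mark scheme: the variable names are assumptions, and the only inputs are the summary statistics quoted in the question.

```python
from math import sqrt

# Summary statistics quoted in the question
# (t in millions of people, g in billions of dollars).
n = 9
sum_t, sum_g, sum_g2 = 7.87, 144.84, 3624.41
S_tt, S_tg = 1.29, 40.25

# S_gg is not given directly, so build it from the sums.
S_gg = sum_g2 - sum_g**2 / n

# Product moment correlation coefficient.
r = S_tg / sqrt(S_tt * S_gg)

# Least squares line of g on t: gradient first,
# then the intercept from the point of means.
b = S_tg / S_tt
a = sum_g / n - b * sum_t / n

# A population of 7 000 000 corresponds to t = 7 (millions).
g_at_7 = a + b * 7

# An increase of 0.1 in g corresponds to an increase of 0.1 / b in t.
x = 0.1 / b

print(f"r = {r:.3f}")
print(f"g = {a:.2f} + {b:.2f} t")
print(f"estimate at t = 7: {g_at_7:.1f}")
print(f"x = {x:.4f}")
```

The mean population in the sample is \(\sum t / n \approx 0.87\) million, which is worth comparing with \(t = 7\) when judging how reliable the estimate is.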
Edexcel S1 2024 June Q4
13 marks Moderate -0.3
  1. A biologist is studying bears. The biologist records the length, \(d \mathrm {~cm}\), and the girth, \(g \mathrm {~cm}\), of 8 bears. The biologist summarises the data as follows
$$\begin{gathered} \sum d = 1456.8 \quad \sum g = 713.2 \quad \sum d g = 141978.84 \quad \sum g ^ { 2 } = 72675.98 \\ S _ { d d } = 16769.78 \end{gathered}$$
  1. Calculate the exact value of \(S _ { d g }\) and the exact value of \(S _ { g g }\)
  2. Calculate the value of the product moment correlation coefficient between \(d\) and \(g\)
  3. Show that the equation of the regression line of \(g\) on \(d\) can be written as $$g = - 42.3 + 0.722 d$$ where the values of the intercept and gradient are given to 3 significant figures.
  4. Give an interpretation, in context, of the gradient of the regression line.
Using the equation of the regression line given in part (c)
    1. estimate the girth of a bear with a length of 2.5 metres,
    2. explain why an estimate for the girth of a bear with a length of 0.5 metres is not reliable.
Using the regression line from part (c), the biologist estimates that for each \(x \mathrm {~cm}\) increase in the length of a bear there will be a 17.3 cm increase in the girth.
  5. Find the value of \(x\)
Edexcel S1 Specimen Q6
14 marks Moderate -0.8
  1. A travel agent sells flights to different destinations from Beerow airport. The distance \(d\), measured in 100 km , of the destination from the airport and the fare \(\pounds f\) are recorded for a random sample of 6 destinations.
| Destination | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) |
|---|---|---|---|---|---|---|
| \(d\) | 2.2 | 4.0 | 6.0 | 2.5 | 8.0 | 5.0 |
| \(f\) | 18 | 20 | 25 | 23 | 32 | 28 |
$$\text { [You may use } \sum d ^ { 2 } = 152.09 \quad \sum f ^ { 2 } = 3686 \quad \sum f d = 723.1 \text { ] }$$
  1. Using the axes below, complete a scatter diagram to illustrate this information.
  2. Explain why a linear regression model may be appropriate to describe the relationship between \(f\) and \(d\).
  3. Calculate \(S _ { d d }\) and \(S _ { f d }\)
  4. Calculate the equation of the regression line of \(f\) on \(d\) giving your answer in the form \(f = a + b d\).
  5. Give an interpretation of the value of \(b\).
Jane is planning her holiday and wishes to fly from Beerow airport to a destination \(t \mathrm {~km}\) away. A rival travel agent charges 5p per km.
  6. Find the range of values of \(t\) for which the first travel agent is cheaper than the rival.
\includegraphics[max width=\textwidth, alt={}, center]{61983561-79f7-4883-8ae7-ab1f4955d444-20_967_1630_1722_164}
Edexcel S1 2001 January Q6
18 marks Moderate -0.8
6. A local authority is investigating the cost of reconditioning its incinerators. Data from 10 randomly chosen incinerators were collected. The variables monitored were the operating time \(x\) (in thousands of hours) since last reconditioning and the reconditioning cost \(y\) (in \(\pounds 1000\) ). None of the incinerators had been used for more than 3000 hours since last reconditioning. The data are summarised below, $$\Sigma x = 25.0 , \Sigma x ^ { 2 } = 65.68 , \Sigma y = 50.0 , \Sigma y ^ { 2 } = 260.48 , \Sigma x y = 130.64 .$$
  1. Find \(\mathrm { S } _ { x x } , \mathrm {~S} _ { x y } , \mathrm {~S} _ { y y }\).
  2. Calculate the product moment correlation coefficient between \(x\) and \(y\).
  3. Explain why this value might support the fitting of a linear regression model of the form \(y = a + b x\).
  4. Find the values of \(a\) and \(b\).
  5. Give an interpretation of \(a\).
  6. Estimate
    1. the reconditioning cost for an operating time of 2400 hours,
    2. the financial effect of an increase of 1500 hours in operating time.
  7. Suggest why the authority might be cautious about making a prediction of the reconditioning cost of an incinerator which had been operating for 4500 hours since its last reconditioning.
Edexcel S1 2003 January Q6
19 marks Moderate -0.3
6. The chief executive of Rex cars wants to investigate the relationship between the number of new car sales and the amount of money spent on advertising. She collects data from company records on the number of new car sales, \(c\), and the cost of advertising each year, \(p\) (£000). The data are shown in the table below.
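| Year | Number of new car sales, \(c\) | Cost of advertising (£000), \(p\) |
|---|---|---|
| 1990 | 4240 | 120 |
| 1991 | 4380 | 126 |
| 1992 | 4420 | 132 |
| 1993 | 4440 | 134 |
| 1994 | 4430 | 137 |
| 1995 | 4520 | 144 |
| 1996 | 4590 | 148 |
| 1997 | 4660 | 150 |
| 1998 | 4700 | 153 |
| 1999 | 4790 | 158 |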
  1. Using the coding \(x = ( p - 100 )\) and \(y = \frac { 1 } { 10 } ( c - 4000 )\), draw a scatter diagram to represent these data. Explain why \(x\) is the explanatory variable.
  2. Find the equation of the least squares regression line of \(y\) on \(x\). $$\text { [Use } \left. \Sigma x = 402 , \Sigma y = 517 , \Sigma x ^ { 2 } = 17538 \text { and } \Sigma x y = 22611 . \right]$$
  3. Deduce the equation of the least squares regression line of \(c\) on \(p\) in the form \(c = a + b p\).
  4. Interpret the value of \(a\).
  5. Predict the number of extra new car sales for an increase of \(\pounds 2000\) in advertising budget. Comment on the validity of your answer.
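The step from the coded line (\(y\) on \(x\)) back to the line of \(c\) on \(p\) is pure substitution, and it can be checked numerically. The sketch below is an illustration under stated assumptions: the variable names are invented here, \(n = 10\) is taken from the ten years in the table, and the sums are those quoted in the question.

```python
# Sums quoted in the question for the coded variables
# x = p - 100 and y = (c - 4000) / 10, over the 10 years in the table.
n = 10
sum_x, sum_y, sum_x2, sum_xy = 402, 517, 17538, 22611

S_xx = sum_x2 - sum_x**2 / n
S_xy = sum_xy - sum_x * sum_y / n

# Regression line of y on x: y = a_coded + b_coded * x
b_coded = S_xy / S_xx
a_coded = sum_y / n - b_coded * sum_x / n

# Undo the coding by substituting for x and y:
#   (c - 4000) / 10 = a_coded + b_coded * (p - 100)
#   c = 4000 + 10 * a_coded - 1000 * b_coded + 10 * b_coded * p
b = 10 * b_coded
a = 4000 + 10 * a_coded - 1000 * b_coded

print(f"y = {a_coded:.3f} + {b_coded:.3f} x")
print(f"c = {a:.0f} + {b:.2f} p")
```

Note that the gradient is scaled by 10 only, because subtracting a constant from either variable does not change the slope.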
Edexcel S1 2006 January Q3
18 marks Easy -1.2
3. A manufacturer stores drums of chemicals. During storage, evaporation takes place. A random sample of 10 drums was taken and the time in storage, \(x\) weeks, and the evaporation loss, \(y \mathrm { ml }\), are shown in the table below.
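| \(x\) | 3 | 5 | 6 | 8 | 10 | 12 | 13 | 15 | 16 | 18 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 36 | 50 | 53 | 61 | 69 | 79 | 82 | 90 | 88 | 96 |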
  1. On graph paper, draw a scatter diagram to represent these data.
  2. Give a reason to support fitting a regression model of the form \(y = a + b x\) to these data.
  3. Find, to 2 decimal places, the value of \(a\) and the value of \(b\). $$\text { (You may use } \Sigma x ^ { 2 } = 1352 , \Sigma y ^ { 2 } = 53112 \text { and } \Sigma x y = 8354 \text {.) }$$
  4. Give an interpretation of the value of \(b\).
  5. Using your model, predict the amount of evaporation that would take place after
    1. 19 weeks,
    2. 35 weeks.
  6. Comment, with a reason, on the reliability of each of your predictions.
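When raw data are given, as here, the sums of squares and the regression coefficients can be computed directly. The following is a minimal sketch using the ten data pairs from this question; the function name `predict` and the range check are illustrative assumptions, not part of the question.

```python
# Data from the question: time in storage (weeks) and evaporation loss (ml).
x = [3, 5, 6, 8, 10, 12, 13, 15, 16, 18]
y = [36, 50, 53, 61, 69, 79, 82, 90, 88, 96]
n = len(x)

# Sums of squares and products about the means.
S_xx = sum(v * v for v in x) - sum(x) ** 2 / n
S_xy = sum(u * v for u, v in zip(x, y)) - sum(x) * sum(y) / n

# Least squares line y = a + b x.
b = S_xy / S_xx
a = sum(y) / n - b * sum(x) / n


def predict(weeks):
    """Evaporation loss (ml) predicted by the fitted line."""
    return a + b * weeks


print(f"a = {a:.2f}, b = {b:.2f}")
for weeks in (19, 35):
    # Flag how far each prediction lies beyond the observed x-values.
    beyond = weeks - max(x)
    print(f"{weeks} weeks: {predict(weeks):.1f} ml "
          f"({beyond} weeks beyond the largest observed x)")
```

Printing the distance from the largest observed value makes the difference between the two predictions explicit: one is just outside the data, the other is far outside it.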
Edexcel S1 2008 January Q4
10 marks Moderate -0.8
4. A second hand car dealer has 10 cars for sale. She decides to investigate the link between the age of the cars, \(x\) years, and the mileage, \(y\) thousand miles. The data collected from the cars are shown in the table below.
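| Age, \(x\) (years) | 2 | 2.5 | 3 | 4 | 4.5 | 4.5 | 5 | 3 | 6 | 6.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| Mileage, \(y\) (thousands) | 22 | 34 | 33 | 37 | 40 | 45 | 49 | 30 | 58 | 58 |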
[You may assume that \(\sum x = 41 , \sum y = 406 , \sum x ^ { 2 } = 188 , \sum x y = 1818.5\) ]
  1. Find \(S _ { x x }\) and \(S _ { x y }\).
  2. Find the equation of the least squares regression line in the form \(y = a + b x\). Give the values of \(a\) and \(b\) to 2 decimal places.
  3. Give a practical interpretation of the slope \(b\).
  4. Using your answer to part (b), find the mileage predicted by the regression line for a 5 year old car.
Edexcel S1 2009 January Q1
11 marks Moderate -0.8
  1. A teacher is monitoring the progress of students using a computer based revision course. The improvement in performance, \(y\) marks, is recorded for each student along with the time, \(x\) hours, that the student spent using the revision course. The results for a random sample of 10 students are recorded below.
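| \(x\) (hours) | 1.0 | 3.5 | 4.0 | 1.5 | 1.3 | 0.5 | 1.8 | 2.5 | 2.3 | 3.0 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) (marks) | 5 | 30 | 27 | 10 | -3 | -5 | 7 | 15 | -10 | 20 |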
$$\text { [You may use } \sum x = 21.4 , \quad \sum y = 96 , \quad \sum x ^ { 2 } = 57.22 , \quad \sum x y = 313.7 \text { ] }$$
  1. Calculate \(S _ { x x }\) and \(S _ { x y }\).
  2. Find the equation of the least squares regression line of \(y\) on \(x\) in the form \(y = a + b x\).
  3. Give an interpretation of the gradient of your regression line.
Rosemary spends 3.3 hours using the revision course.
  4. Predict her improvement in marks.
Lee spends 8 hours using the revision course claiming that this should give him an improvement in performance of over 60 marks.
  5. Comment on Lee's claim.
Edexcel S1 2011 January Q4
6 marks Moderate -0.8
  1. A farmer collected data on the annual rainfall, \(x \mathrm {~cm}\), and the annual yield of peas, \(p\) tonnes per acre.
The data for annual rainfall was coded using \(v = \frac { x - 5 } { 10 }\) and the following statistics were found. $$S _ { v v } = 5.753 \quad S _ { p v } = 1.688 \quad S _ { p p } = 1.168 \quad \bar { p } = 3.22 \quad \bar { v } = 4.42$$
  1. Find the equation of the regression line of \(p\) on \(v\) in the form \(p = a + b v\).
  2. Using your regression line estimate the annual yield of peas per acre when the annual rainfall is 85 cm .
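Because the regression line here is fitted to the coded variable \(v\), a rainfall value must be coded before it is substituted. A short sketch of that order of operations follows; the helper name `code_rainfall` is an assumption made for illustration, and the statistics are those quoted in the question.

```python
# Statistics quoted in the question (v is coded rainfall, p is yield).
S_vv, S_pv = 5.753, 1.688
p_bar, v_bar = 3.22, 4.42

# Regression line of p on v: p = a + b v.
b = S_pv / S_vv
a = p_bar - b * v_bar


def code_rainfall(x_cm):
    """Apply the coding v = (x - 5) / 10 given in the question."""
    return (x_cm - 5) / 10


# Code the rainfall first, then substitute into the line for p on v.
v = code_rainfall(85)
p_est = a + b * v

print(f"p = {a:.3f} + {b:.3f} v")
print(f"v = {v}, estimated yield = {p_est:.2f} tonnes per acre")
```

Substituting \(x = 85\) directly into the line for \(p\) on \(v\) would be an error, since the line was fitted to \(v\), not \(x\).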
Edexcel S1 2012 January Q5
15 marks Moderate -0.8
  1. The age, \(t\) years, and weight, \(w\) grams, of each of 10 coins were recorded. These data are summarised below.
$$\sum t ^ { 2 } = 2688 \quad \sum t w = 1760.62 \quad \sum t = 158 \quad \sum w = 111.75 \quad S _ { w w } = 0.16$$
  1. Find \(S _ { t t }\) and \(S _ { t w }\) for these data.
  2. Calculate, to 3 significant figures, the product moment correlation coefficient between \(t\) and \(w\).
  3. Find the equation of the regression line of \(w\) on \(t\) in the form \(w = a + b t\)
  4. State, with a reason, which variable is the explanatory variable.
  5. Using this model, estimate
    1. the weight of a coin which is 5 years old,
    2. the effect of an increase of 4 years in age on the weight of a coin.
It was discovered that a coin in the original sample, which was 5 years old and weighed 20 grams, was a fake.
  6. State, without any further calculations, whether the exclusion of this coin would increase or decrease the value of the product moment correlation coefficient. Give a reason for your answer.
Edexcel S1 2013 January Q1
7 marks Easy -1.2
  1. A teacher asked a random sample of 10 students to record the number of hours of television, \(t\), they watched in the week before their mock exam. She then calculated their grade, \(g\), in their mock exam. The results are summarised as follows.
$$\sum t = 258 \quad \sum t ^ { 2 } = 8702 \quad \sum g = 63.6 \quad \mathrm {~S} _ { g g } = 7.864 \quad \sum g t = 1550.2$$
  1. Find \(\mathrm { S } _ { t t }\) and \(\mathrm { S } _ { g t }\)
  2. Calculate, to 3 significant figures, the product moment correlation coefficient between \(t\) and \(g\).
The teacher also recorded the number of hours of revision, \(v\), these 10 students completed during the week before their mock exam. The correlation coefficient between \(t\) and \(v\) was -0.753
  3. Describe, giving a reason, the nature of the correlation you would expect to find between \(v\) and \(g\).
Edexcel S1 2013 January Q3
10 marks Moderate -0.8
3. A biologist is comparing the intervals ( \(m\) seconds) between the mating calls of a certain species of tree frog and the surrounding temperature ( \(t { } ^ { \circ } \mathrm { C }\) ). The following results were obtained.
| \(t { } ^ { \circ } \mathrm { C }\) | 8 | 13 | 14 | 15 | 15 | 20 | 25 | 30 |
|---|---|---|---|---|---|---|---|---|
| \(m\) secs | 6.5 | 4.5 | 6 | 5 | 4 | 3 | 2 | 1 |
$$\text { (You may use } \sum t m = 469.5 , \quad \mathrm {~S} _ { t t } = 354 , \quad \mathrm {~S} _ { m m } = 25.5 \text { ) }$$
  1. Show that \(\mathrm { S } _ { t m } = - 90.5\)
  2. Find the equation of the regression line of \(m\) on \(t\) giving your answer in the form \(m = a + b t\).
  3. Use your regression line to estimate the time interval between mating calls when the surrounding temperature is \(10 ^ { \circ } \mathrm { C }\).
  4. Comment on the reliability of this estimate, giving a reason for your answer.
Edexcel S1 2001 June Q7
16 marks Moderate -0.3
7. A music teacher monitored the sight-reading ability of one of her pupils over a 10 week period. At the end of each week, the pupil was given a new piece to sight-read and the teacher noted the number of errors \(y\). She also recorded the number of hours \(x\) that the pupil had practised each week. The data are shown in the table below.
| \(x\) | 12 | 15 | 7 | 11 | 1 | 8 | 4 | 6 | 9 | 3 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 8 | 4 | 13 | 8 | 18 | 12 | 15 | 14 | 12 | 16 |
  1. Plot these data on a scatter diagram.
  2. Find the equation of the regression line of \(y\) on \(x\) in the form \(y = a + b x\). $$\text { (You may use } \left. \Sigma x ^ { 2 } = 746 , \Sigma x y = 749 . \right)$$
  3. Give an interpretation of the slope and the intercept of your regression line.
  4. State whether or not you think the regression model is reasonable
    1. for the range of \(x\)-values given in the table,
    2. for all possible \(x\)-values.
In each case justify your answer either by giving a reason for accepting the model or by suggesting an alternative model.
Edexcel S1 2010 June Q6
14 marks Moderate -0.8
6. A travel agent sells flights to different destinations from Beerow airport. The distance \(d\), measured in 100 km , of the destination from the airport and the fare \(\pounds f\) are recorded for a random sample of 6 destinations.
| Destination | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) |
|---|---|---|---|---|---|---|
| \(d\) | 2.2 | 4.0 | 6.0 | 2.5 | 8.0 | 5.0 |
| \(f\) | 18 | 20 | 25 | 23 | 32 | 28 |
$$\text { [You may use } \sum d ^ { 2 } = 152.09 \quad \sum f ^ { 2 } = 3686 \quad \sum f d = 723.1 \text { ] }$$
  1. Using the axes below, complete a scatter diagram to illustrate this information.
  2. Explain why a linear regression model may be appropriate to describe the relationship between \(f\) and \(d\).
  3. Calculate \(S _ { d d }\) and \(S _ { f d }\)
  4. Calculate the equation of the regression line of \(f\) on \(d\) giving your answer in the form \(f = a + b d\).
  5. Give an interpretation of the value of \(b\).
Jane is planning her holiday and wishes to fly from Beerow airport to a destination \(t \mathrm {~km}\) away. A rival travel agent charges 5p per km.
  6. Find the range of values of \(t\) for which the first travel agent is cheaper than the rival.
\includegraphics[max width=\textwidth, alt={}, center]{039e6fcf-3222-40cc-95ea-37b8dc4a4ddb-11_1013_1701_1718_116}
Edexcel S1 2012 June Q3
15 marks Moderate -0.5
3. A scientist is researching whether or not birds of prey exposed to pollutants lay eggs with thinner shells. He collects a random sample of egg shells from each of 6 different nests and tests for pollutant level, \(p\), and measures the thinning of the shell, \(t\). The results are shown in the table below.
| \(p\) | 3 | 8 | 30 | 25 | 15 | 12 |
|---|---|---|---|---|---|---|
| \(t\) | 1 | 3 | 9 | 10 | 5 | 6 |
[You may use \(\sum p ^ { 2 } = 1967\) and \(\sum p t = 694\) ]
  1. Draw a scatter diagram on the axes on page 7 to represent these data.
  2. Explain why a linear regression model may be appropriate to describe the relationship between \(p\) and \(t\).
  3. Calculate the value of \(S _ { p t }\) and the value of \(S _ { p p }\).
  4. Find the equation of the regression line of \(t\) on \(p\), giving your answer in the form \(t = a + b p\).
  5. Plot the point ( \(\bar { p } , \bar { t }\) ) and draw the regression line on your scatter diagram.
The scientist reviews similar studies and finds that pollutant levels above 16 are likely to result in the death of a chick soon after hatching.
  6. Estimate the minimum thinning of the shell that is likely to result in the death of a chick.
\includegraphics[max width=\textwidth, alt={}, center]{0593544d-392d-465b-b922-c9cb1435abb5-05_1257_1568_301_173}
Edexcel S1 2013 June Q1
13 marks Moderate -0.8
  1. A meteorologist believes that there is a relationship between the height above sea level, \(h \mathrm {~m}\), and the air temperature, \(t ^ { \circ } \mathrm { C }\). Data is collected at the same time from 9 different places on the same mountain. The data is summarised in the table below.
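| \(h\) | 1400 | 1100 | 260 | 840 | 900 | 550 | 1230 | 100 | 770 |
|---|---|---|---|---|---|---|---|---|---|
| \(t\) | 3 | 10 | 20 | 9 | 10 | 13 | 5 | 24 | 16 |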
[You may assume that \(\sum h = 7150 , \sum t = 110 , \sum h ^ { 2 } = 7171500 , \sum t ^ { 2 } = 1716\), \(\sum t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  1. Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  2. Calculate the product moment correlation coefficient for this data.
  3. State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  4. Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  5. Interpret the value of \(b\).
  6. Estimate the difference in air temperature between a height of 500 m and a height of 1000 m .
Edexcel S1 2014 June Q3
16 marks Moderate -0.8
3. A large company is analysing how much money it spends on paper in its offices every year. The number of employees, \(x\), and the amount of money spent on paper, \(p\) ( \(\pounds\) hundreds), in 8 randomly selected offices are given in the table below.
| \(x\) | 8 | 9 | 12 | 14 | 7 | 3 | 16 | 19 |
|---|---|---|---|---|---|---|---|---|
| \(p\) (£ hundreds) | 40.5 | 36.1 | 30.4 | 39.4 | 32.6 | 31.1 | 43.4 | 45.7 |
$$\text { (You may use } \sum x ^ { 2 } = 1160 \quad \sum p = 299.2 \quad \sum p ^ { 2 } = 11422 \quad \sum x p = 3449.5 \text { ) }$$
  1. Show that \(S _ { p p } = 231.92\) and find the value of \(S _ { x x }\) and the value of \(S _ { x p }\)
  2. Calculate the product moment correlation coefficient between \(x\) and \(p\).
The equation of the regression line of \(p\) on \(x\) is given in the form \(p = a + b x\).
  3. Show that, to 3 significant figures, \(b = 0.824\) and find the value of \(a\).
  4. Estimate the amount of money spent on paper in an office with 10 employees.
  5. Explain the effect each additional employee has on the amount of money spent on paper.
Later the company realised it had made a mistake in adding up its costs, \(p\). The true costs were actually half of the values recorded. The product moment correlation coefficient and the equation of the linear regression line are recalculated using this information.
  6. Write down the new value of
    1. the product moment correlation coefficient,
    2. the gradient of the regression line.
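The last part of this question turns on how rescaling one variable affects the correlation coefficient and the gradient. A numerical check using the eight data pairs from the table is sketched below; the helper name `pmcc_and_gradient` is an assumption introduced for illustration.

```python
from math import sqrt

# Data from the question: employees x and paper cost p (in hundreds of pounds).
x = [8, 9, 12, 14, 7, 3, 16, 19]
p = [40.5, 36.1, 30.4, 39.4, 32.6, 31.1, 43.4, 45.7]


def pmcc_and_gradient(xs, ys):
    """Return (r, b) for the regression line of ys on xs."""
    n = len(xs)
    s_xx = sum(v * v for v in xs) - sum(xs) ** 2 / n
    s_yy = sum(v * v for v in ys) - sum(ys) ** 2 / n
    s_xy = sum(u * v for u, v in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return s_xy / sqrt(s_xx * s_yy), s_xy / s_xx


r, b = pmcc_and_gradient(x, p)

# Halve every recorded cost and recompute.
p_half = [value / 2 for value in p]
r_half, b_half = pmcc_and_gradient(x, p_half)

print(f"original: r = {r:.3f}, b = {b:.3f}")
print(f"halved:   r = {r_half:.3f}, b = {b_half:.3f}")
```

Halving \(p\) halves \(S_{xp}\) and quarters \(S_{pp}\), so the factor cancels in \(r = S_{xp} / \sqrt{S_{xx} S_{pp}}\) but not in \(b = S_{xp} / S_{xx}\).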
Edexcel S1 2014 June Q3
13 marks Easy -1.2
3. The table shows data on the number of visitors to the UK in a month, \(v\) (1000s), and the amount of money they spent, \(m\) ( \(\pounds\) millions), for each of 8 months.
| Number of visitors, \(v\) (1000s) | 2450 | 2480 | 2540 | 2420 | 2350 | 2290 | 2400 | 2460 |
|---|---|---|---|---|---|---|---|---|
| Amount of money spent, \(m\) (\(\pounds\) millions) | 1370 | 1350 | 1400 | 1330 | 1270 | 1210 | 1330 | 1350 |
You may use \(S _ { v v } = 42587.5 \quad S _ { v m } = 31512.5 \quad S _ { m m } = 25187.5 \quad \sum v = 19390 \quad \sum m = 10610\)
  1. Find the product moment correlation coefficient between \(m\) and \(v\).
  2. Give a reason to support fitting a regression model of the form \(m = a + b v\) to these data.
  3. Find the value of \(b\) correct to 3 decimal places.
  4. Find the equation of the regression line of \(m\) on \(v\).
  5. Interpret your value of \(b\).
  6. Use your answer to part (d) to estimate the amount of money spent when the number of visitors to the UK in a month is 2500000
  7. Comment on the reliability of your estimate in part (f). Give a reason for your answer.
Edexcel S1 2015 June Q4
14 marks Easy -1.2
  1. Statistical models can provide a cheap and quick way to describe a real world situation.
    1. Give two other reasons why statistical models are used.
    A scientist wants to develop a model to describe the relationship between the average daily temperature, \(x ^ { \circ } \mathrm { C }\), and her household's daily energy consumption, \(y \mathrm { kWh }\), in winter. A random sample of the average daily temperature and her household's daily energy consumption are taken from 10 winter days and shown in the table.
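    | \(x\) | -0.4 | -0.2 | 0.3 | 0.8 | 1.1 | 1.4 | 1.8 | 2.1 | 2.5 | 2.6 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | \(y\) | 28 | 30 | 26 | 25 | 26 | 27 | 26 | 24 | 22 | 21 |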
    $$\text { [You may use } \sum x ^ { 2 } = 24.76 \quad \sum y = 255 \quad \sum x y = 283.8 \quad \mathrm {~S} _ { x x } = 10.36 \text { ] }$$
  2. Find \(\mathrm { S } _ { x y }\) for these data.
  3. Find the equation of the regression line of \(y\) on \(x\) in the form \(y = a + b x\) Give the value of \(a\) and the value of \(b\) to 3 significant figures.
  4. Give an interpretation of the value of \(a\)
  5. Estimate her household's daily energy consumption when the average daily temperature is \(2 ^ { \circ } \mathrm { C }\)
The scientist wants to use the linear regression model to predict her household's energy consumption in the summer.
  6. Discuss the reliability of using this model to predict her household's energy consumption in the summer.
Edexcel S1 Q6
16 marks Moderate -0.8
6. To test the heating of tyre material, tyres are run on a test rig at chosen speeds under given conditions of load, pressure and surrounding temperature. The following table gives values of \(x\), the test rig speed in miles per hour (mph), and the temperature, \(y ^ { \circ } \mathrm { C }\), generated in the shoulder of the tyre for a particular tyre material.
| \(x ( \mathrm { mph } )\) | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|
| \(y \left( { } ^ { \circ } \mathrm { C } \right)\) | 53 | 55 | 63 | 65 | 78 | 83 | 91 | 101 |
  1. Draw a scatter diagram to represent these data.
  2. Give a reason to support the fitting of a regression line of the form \(y = a + b x\) through these points.
  3. Find the values of \(a\) and \(b\).
    (You may use \(\Sigma x ^ { 2 } = 9500 , \Sigma y ^ { 2 } = 45483 , \Sigma x y = 20615\) )
  4. Give an interpretation for each of \(a\) and \(b\).
  5. Use your line to estimate the temperature at 50 mph and explain why this estimate differs from the value given in the table.
A tyre specialist wants to estimate the temperature of this tyre material at 12 mph and 85 mph.
  6. Explain briefly whether or not you would recommend the specialist to use this regression equation to obtain these estimates.
Edexcel S1 2003 November Q1
16 marks Moderate -0.8
  1. A company wants to pay its employees according to their performance at work. The performance score \(x\) and the annual salary, \(y\) in \(\pounds 100\) s, for a random sample of 10 of its employees for last year were recorded. The results are shown in the table below.
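| \(x\) | 15 | 40 | 27 | 39 | 27 | 15 | 20 | 30 | 19 | 24 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 216 | 384 | 234 | 399 | 226 | 132 | 175 | 316 | 187 | 196 |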
$$\text { [You may assume } \left. \Sigma x y = 69798 , \Sigma x ^ { 2 } = 7266 \right]$$
  1. Draw a scatter diagram to represent these data.
  2. Calculate exact values of \(S _ { x y }\) and \(S _ { x x }\).
    1. Calculate the equation of the regression line of \(y\) on \(x\), in the form \(y = a + b x\). Give the values of \(a\) and \(b\) to 3 significant figures.
    2. Draw this line on your scatter diagram.
  3. Interpret the gradient of the regression line.
The company decides to use this regression model to determine future salaries.
  4. Find the proposed annual salary for an employee who has a performance score of 35 .
Edexcel S1 2004 November Q2
4 marks Moderate -0.8
2. An experiment carried out by a student yielded pairs of \(( x , y )\) observations such that $$\bar { x } = 36 , \quad \bar { y } = 28.6 , \quad S _ { x x } = 4402 , \quad S _ { x y } = 3477.6$$
  1. Calculate the equation of the regression line of \(y\) on \(x\) in the form \(y = a + b x\). Give your values of \(a\) and \(b\) to 2 decimal places.
  2. Find the value of \(y\) when \(x = 45\).
AQA S1 2008 January Q4
12 marks Moderate -0.3
4 [Figure 1, printed on the insert, is provided for use in this question.]
Roseen is a self-employed decorator who wishes to estimate the times that it will take her to decorate bedrooms based upon their floor areas. She records the floor area, \(x \mathrm {~m} ^ { 2 }\), and the decorating time, \(y\) hours, for each of 10 bedrooms she has recently decorated.
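| \(\boldsymbol { x }\) | 11.0 | 22.0 | 7.5 | 21.0 | 13.0 | 16.5 | 14.0 | 16.0 | 18.5 | 20.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 15.0 | 35.0 | 16.0 | 23.5 | 24.0 | 17.5 | 14.5 | 27.5 | 22.5 | 34.5 |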
  1. On Figure 1, plot a scatter diagram of these data.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  3. Draw your regression line on Figure 1.
    1. Use your regression equation to estimate the time that Roseen will take to decorate a bedroom with a floor area of \(15 \mathrm {~m} ^ { 2 }\).
    2. Making reference to Figure 1, comment on the likely reliability of your estimate in part (d)(i).
AQA S1 2009 January Q6
15 marks Moderate -0.3
6 [Figure 1, printed on the insert, is provided for use in this question.]
For a random sample of 10 patients who underwent hip-replacement operations, records were kept of their ages, \(x\) years, and of the number of days, \(y\), following their operations before they were able to walk unaided safely.
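| Patient | \(\mathbf { A }\) | \(\mathbf { B }\) | \(\mathbf { C }\) | \(\mathbf { D }\) | \(\mathbf { E }\) | \(\mathbf { F }\) | \(\mathbf { G }\) | \(\mathbf { H }\) | \(\mathbf { I }\) | \(\mathbf { J }\) |
|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { x }\) | 55 | 51 | 62 | 66 | 72 | 59 | 78 | 55 | 62 | 70 |
| \(\boldsymbol { y }\) | 34 | 33 | 39 | 49 | 48 | 43 | 51 | 41 | 46 | 51 |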
  1. On Figure 1, complete the scatter diagram for these data.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\).
  3. Draw your regression line on Figure 1.
  4. In fact, patients H, I and J were males and the other 7 patients were females.
    1. Calculate the mean of the residuals for the 3 male patients.
    2. Hence estimate, for a male patient aged 65 years, the number of days following his hip-replacement operation before he is able to walk unaided safely.
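The idea in the final part, adjusting a regression estimate by the mean residual of a subgroup, can be sketched as follows. This is one plausible reading of "hence" rather than an official solution; the dictionary layout and the names `fitted` and `male_patients` are assumptions made for the example, and the data are the ten pairs from the table.

```python
# Data from the question: age x (years) and days y to walk unaided, by patient.
patients = {
    "A": (55, 34), "B": (51, 33), "C": (62, 39), "D": (66, 49), "E": (72, 48),
    "F": (59, 43), "G": (78, 51), "H": (55, 41), "I": (62, 46), "J": (70, 51),
}
male_patients = ["H", "I", "J"]  # stated in the question

xs = [x for x, _ in patients.values()]
ys = [y for _, y in patients.values()]
n = len(xs)

S_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
S_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

# Least squares line of y on x fitted to all 10 patients.
b = S_xy / S_xx
a = sum(ys) / n - b * sum(xs) / n


def fitted(age):
    """Days predicted by the regression line for a given age."""
    return a + b * age


# Residual = observed value minus fitted value.
residuals = [patients[k][1] - fitted(patients[k][0]) for k in male_patients]
mean_residual = sum(residuals) / len(residuals)

# Shift the line's estimate by the average amount males sat above it.
estimate_male_65 = fitted(65) + mean_residual

print(f"y = {a:.3f} + {b:.3f} x")
print(f"mean male residual = {mean_residual:.2f}")
print(f"estimate for a male aged 65 = {estimate_male_65:.1f} days")
```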
AQA S1 2011 January Q5
14 marks Moderate -0.3
5 Craig uses his car to travel regularly from his home to the area hospital for treatment. He leaves home at \(x\) minutes after 7.30 am and then takes \(y\) minutes to arrive at the hospital's reception desk. His results for 11 mornings are shown in the table.
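| \(\boldsymbol { x }\) | 0 | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| \(\boldsymbol { y }\) | 31 | 42 | 32 | 58 | 47 | 56 | 79 | 68 | 89 | 95 | 85 |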
  1. Explain why the time taken by Craig between leaving home and arriving at the hospital's reception desk is the response variable.
  2. Calculate the equation of the least squares regression line of \(y\) on \(x\), writing your answer in the form \(y = a + b x\).
  3. On a particular day, Craig needs to arrive at the hospital's reception desk no later than 9.00 am . He leaves home at 7.45 am . Estimate the number of minutes before 9.00 am that Craig will arrive at the hospital's reception desk. Give your answer to the nearest minute.
    1. Use your equation to estimate \(y\) when \(x = 85\).
    2. Give one statistical reason and one reason based on the context of this question as to why your estimate in part (d)(i) is unlikely to be realistic.