5.09a Dependent/independent variables

AQA S1 2006 June Q3
11 marks Moderate -0.8
3 A new car tyre is fitted to a wheel. The tyre is inflated to its recommended pressure of 265 kPa and the wheel left unused. At 3-month intervals thereafter, the tyre pressure is measured with the following results:
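\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|}
\hline
Time after fitting (\(x\) months) & 0 & 3 & 6 & 9 & 12 & 15 & 18 & 21 & 24 \\
\hline
Tyre pressure (\(y\) kPa) & 265 & 250 & 240 & 235 & 225 & 215 & 210 & 195 & 180 \\
\hline
\end{tabular}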
    1. Calculate the equation of the least squares regression line of \(y\) on \(x\).
    2. Interpret in context the value for the gradient of your line.
    3. Comment on the value for the intercept with the \(y\)-axis of your line.
  1. The tyre manufacturer states that, when one of these new tyres is fitted to the wheel of a car and then inflated to 265 kPa , a suitable regression equation is of the form $$y = 265 + b x$$ The manufacturer also states that, as the car is used, the tyre pressure will decrease at twice the rate of that found in part (a).
    1. Suggest a suitable value for \(b\).
    2. One of these new tyres is fitted to the wheel of a car and inflated to 265 kPa . The car is then used for 8 months, after which the tyre pressure is checked for the first time. Show that, accepting the manufacturer's statements, the tyre pressure can be expected to have fallen below its minimum safety value of 220 kPa .
      (2 marks)
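The arithmetic behind this question can be checked numerically. The following is a minimal sketch, assuming the nine data pairs in the table above and the usual least squares formulas \(b = S_{xy}/S_{xx}\) and \(a = \bar{y} - b\bar{x}\); the variable names are illustrative and not part of the original question.

```python
# Data transcribed from the question: months after fitting and pressure in kPa.
x = [0, 3, 6, 9, 12, 15, 18, 21, 24]
y = [265, 250, 240, 235, 225, 215, 210, 195, 180]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and products.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares regression line of y on x: y = a + b x.
b = s_xy / s_xx
a = y_bar - b * x_bar

# Manufacturer's model: intercept fixed at 265, pressure falling twice as fast.
manufacturer_b = 2 * b
pressure_after_8_months = 265 + manufacturer_b * 8

print(f"y = {a:.2f} + ({b:.3f}) x")
print(f"expected pressure after 8 months of use: {pressure_after_8_months:.1f} kPa")
```

The last value can be compared directly with the 220 kPa safety minimum quoted in the question.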
AQA S1 2015 June Q5
11 marks Moderate -0.8
5 The table shows the number of customers, \(x\), and the takings, \(\pounds y\), recorded to the nearest \(\pounds 10\), at a local butcher's shop on each of 10 randomly selected weekdays.
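\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline
\(\boldsymbol { x }\) & 86 & 60 & 65 & 46 & 71 & 93 & 56 & 81 & 75 & 57 \\
\hline
\(\boldsymbol { y }\) & 940 & 790 & 620 & 530 & 770 & 1050 & 690 & 780 & 860 & 550 \\
\hline
\end{tabular}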
  1. The first 6 pairs of data values in this table are plotted on the scatter diagram shown on the opposite page. Plot the final 4 pairs of data values on the scatter diagram.
    1. Calculate the equation of the least squares regression line in the form \(y = a + b x\) and draw your line on the scatter diagram.
    2. Interpret your value for \(b\) in the context of the question.
    3. State why your value for \(a\) has no practical interpretation.
  2. Estimate, to the nearest \(\pounds 10\), the shop's takings when the number of customers is 50 .
    [1 mark]
    \includegraphics[max width=\textwidth, alt={}]{4c679380-894f-4d36-aec8-296b662058e2-14_1255_1705_1448_155}
    \begin{figure}[h]
    \captionsetup{labelformat=empty} \caption{Answer space for question 5 (Butcher's shop)}
    \includegraphics[alt={},max width=\textwidth]{4c679380-894f-4d36-aec8-296b662058e2-15_2335_1760_372_100}
    \end{figure}
AQA S1 2015 June Q4
15 marks Moderate -0.3
4 Stephan is a roofing contractor who is often required to replace loose ridge tiles on house roofs. In order to help him to quote more accurately the prices for such jobs in the future, he records, for each of 11 recently repaired roofs, the number of ridge tiles replaced, \(x _ { i }\), and the time taken, \(y _ { i }\) hours. His results are shown in the table.
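\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|c|}
\hline
Roof \(( \boldsymbol { i } )\) & \(\mathbf { 1 }\) & \(\mathbf { 2 }\) & \(\mathbf { 3 }\) & \(\mathbf { 4 }\) & \(\mathbf { 5 }\) & \(\mathbf { 6 }\) & \(\mathbf { 7 }\) & \(\mathbf { 8 }\) & \(\mathbf { 9 }\) & \(\mathbf { 10 }\) & \(\mathbf { 11 }\) \\
\hline
\(\boldsymbol { x } _ { \boldsymbol { i } }\) & 8 & 11 & 14 & 14 & 16 & 20 & 22 & 22 & 25 & 27 & 30 \\
\hline
\(\boldsymbol { y } _ { \boldsymbol { i } }\) & 5.0 & 5.2 & 6.3 & 7.2 & 8.0 & 8.8 & 10.6 & 11.0 & 11.8 & 12.1 & 13.0 \\
\hline
\end{tabular}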
  1. The pairs of data values for roofs 1 to 7 are plotted on the scatter diagram shown on the opposite page. Plot the 4 pairs of data values for roofs 8 to 11 on the scatter diagram.
    1. Calculate the equation of the least squares regression line of \(y _ { i }\) on \(x _ { i }\), and draw your line on the scatter diagram.
    2. Interpret your values for the gradient and for the intercept of this regression line.
  2. Estimate the time that it would take Stephan to replace 15 loose ridge tiles on a house roof.
  3. Given that \(r _ { i }\) denotes the residual for the point representing roof \(i\) :
    1. calculate the value of \(r _ { 6 }\);
    2. state why the value of \(\sum _ { i = 1 } ^ { 11 } r _ { i }\) gives no useful information about the connection between the number of ridge tiles replaced and the time taken.
      [1 mark]
      Answer space for question 4
      \includegraphics[max width=\textwidth, alt={}]{6fbb8891-e6de-42fe-a195-ea643552fdcf-11_2385_1714_322_155}
OCR S1 Q4
8 marks Moderate -0.3
4 The table shows the latitude, \(x\) (in degrees correct to 3 significant figures), and the average rainfall \(y\) (in cm correct to 3 significant figures) of five European cities.
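\begin{tabular}{|l|c|c|}
\hline
City & \(x\) & \(y\) \\
\hline
Berlin & 52.5 & 58.2 \\
\hline
Bucharest & 44.4 & 58.7 \\
\hline
Moscow & 55.8 & 53.3 \\
\hline
St Petersburg & 60.0 & 47.8 \\
\hline
Warsaw & 52.3 & 56.6 \\
\hline
\end{tabular}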
$$\left[ n = 5 , \Sigma x = 265.0 , \Sigma y = 274.6 , \Sigma x ^ { 2 } = 14176.54 , \Sigma y ^ { 2 } = 15162.22 , \Sigma x y = 14464.10 . \right]$$
  1. Calculate the product moment correlation coefficient.
  2. The values of \(y\) in the table were in fact obtained from measurements in inches and converted into centimetres by multiplying by 2.54. State what effect it would have had on the value of the product moment correlation coefficient if it had been calculated using inches instead of centimetres.
  3. It is required to estimate the annual rainfall at Bergen, where \(x = 60.4\). Calculate the equation of an appropriate line of regression, giving your answer in simplified form, and use it to find the required estimate.
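The quoted totals are enough to work through this question numerically. The following is a rough sketch, assuming that the "appropriate" line is the regression line of \(y\) on \(x\) (since \(y\) is being estimated from \(x\)); the variable names are illustrative only.

```python
from math import sqrt

# Summary statistics quoted in the question.
n = 5
sum_x, sum_y = 265.0, 274.6
sum_x2, sum_y2, sum_xy = 14176.54, 15162.22, 14464.10

# Corrected sums of squares and products.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

# Regression line of y on x (assumed to be the appropriate line): y = a + b x.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n
rainfall_at_bergen = a + b * 60.4

print(f"r = {r:.3f}")
print(f"y = {a:.2f} + ({b:.3f}) x, estimate at x = 60.4: {rainfall_at_bergen:.1f} cm")
```

Rescaling \(y\) by a positive constant such as 2.54 multiplies \(S_{xy}\) by that constant and \(S_{yy}\) by its square, so `r` is unchanged by such a rescaling.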
Edexcel AS Paper 2 2018 June Q1
3 marks Moderate -0.8
  1. A company is introducing a job evaluation scheme. Points ( \(x\) ) will be awarded to each job based on the qualifications and skills needed and the level of responsibility. Pay ( \(\pounds y\) ) will then be allocated to each job according to the number of points awarded.
Before the scheme is introduced, a random sample of 8 employees was taken and the linear regression equation of pay on points was \(y = 4.5 x - 47\)
  1. Describe the correlation between points and pay.
  2. Give an interpretation of the gradient of this regression line.
  3. Explain why this model might not be appropriate for all jobs in the company.
Edexcel S1 2024 October Q2
Moderate -0.8
  1. A biologist records the length, \(y \mathrm {~cm}\), and the weight, \(w \mathrm {~kg}\), of 50 rabbits. The following summary statistics are calculated from these data.
$$\sum y = 2015 \quad \sum y ^ { 2 } = 81938.5 \quad \sum w = 125 \quad \mathrm {~S} _ { w w } = 72.25 \quad \mathrm {~S} _ { y w } = 219.55$$
    1. Show that \(\mathrm { S } _ { y y } = 734\)
    2. Calculate the product moment correlation coefficient for these data. Give your answer to 3 decimal places.
  1. Interpret your value of the product moment correlation coefficient.
The biologist believes that a linear regression model may be appropriate to describe these data.
  2. State, with a reason, whether or not your value of the product moment correlation coefficient is consistent with the biologist’s belief.
  3. Find the equation of the regression line of \(w\) on \(y\), giving your answer in the form \(w = a + b y\).
Jeff has a pet rabbit of length 45 cm.
  4. Use your regression equation to estimate the weight of Jeff's rabbit.
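A short numerical check of the calculations in this question, assuming only the summary statistics quoted above; the variable names are illustrative and the sketch is not a model answer.

```python
from math import sqrt

# Summary statistics quoted in the question (y = length in cm, w = weight in kg).
n = 50
sum_y, sum_y2, sum_w = 2015, 81938.5, 125
s_ww, s_yw = 72.25, 219.55

# S_yy from the raw sums.
s_yy = sum_y2 - sum_y ** 2 / n

# Product moment correlation coefficient.
r = s_yw / sqrt(s_yy * s_ww)

# Regression line of w on y: w = a + b y.
b = s_yw / s_yy
a = sum_w / n - b * sum_y / n
weight_at_45 = a + b * 45

print(f"S_yy = {s_yy:.1f}, r = {r:.3f}")
print(f"w = {a:.3f} + {b:.4f} y, estimate at y = 45: {weight_at_45:.2f} kg")
```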
CAIE FP2 2012 June Q11
24 marks Standard +0.3
Answer only one of the following two alternatives. EITHER A particle \(P\) of mass \(m\) is attached to one end of a light elastic string of modulus of elasticity \(4mg\) and natural length \(l\). The other end of the string is attached to a fixed point \(O\). The particle rests in equilibrium at the point \(E\), vertically below \(O\). The particle is pulled down a vertical distance \(\frac{3l}{4}\) from \(E\) and released from rest. Show that the motion of \(P\) is simple harmonic with period \(\pi\sqrt{\left(\frac{l}{g}\right)}\). [4] At an instant when \(P\) is moving vertically downwards through \(E\), the string is cut. When \(P\) has descended a further distance \(\frac{3l}{4}\) under gravity, it strikes a fixed smooth plane which is inclined at 30° to the horizontal. The coefficient of restitution between \(P\) and the plane is \(\frac{1}{3}\). Show that the speed of \(P\) immediately after the impact is \(\frac{1}{3}\sqrt{(5gl)}\). [8] OR A new restaurant \(S\) has recently opened in a particular town. In order to investigate any effect of \(S\) on an existing restaurant \(R\), the daily takings, \(x\) and \(y\) in thousands of dollars, at \(R\) and \(S\) respectively are recorded for a random sample of 8 days during a six-month period. The results are shown in the following table.
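\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Day & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
\(x\) & 1.2 & 1.4 & 0.9 & 1.1 & 0.8 & 1.0 & 0.6 & 1.5 \\
\hline
\(y\) & 0.3 & 0.4 & 0.6 & 0.6 & 0.25 & 0.75 & 0.6 & 0.35 \\
\hline
\end{tabular}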
  1. Calculate the product moment correlation coefficient for this sample. [4]
  2. Stating your hypotheses, test, at the 2.5% significance level, whether there is negative correlation between daily takings at the two restaurants and comment on your result in the context of the question. [5]
Another sample is taken over \(N\) randomly chosen days and the product moment correlation coefficient is found to be \(-0.431\). A test, at the 5% significance level, shows that there is evidence of negative correlation between daily takings in the two restaurants.
  1. Find the range of possible values of \(N\). [3]
CAIE FP2 2009 November Q11
28 marks Standard +0.3
Answer only one of the following two alternatives. EITHER A light elastic string, of natural length \(l\) and modulus of elasticity \(4mg\), is attached at one end to a fixed point and has a particle \(P\) of mass \(m\) attached to the other end. When \(P\) is hanging in equilibrium under gravity it is given a velocity \(\sqrt{(gl)}\) vertically downwards. At time \(t\) the downward displacement of \(P\) from its equilibrium position is \(x\). Show that, while the string is taut, $$\ddot{x} = -\frac{4g}{l}x.$$ [4] Find the speed of \(P\) when the length of the string is \(l\). [4] Show that the time taken for \(P\) to move from the lowest point to the highest point of its motion is $$\left(\frac{\pi}{3} + \frac{\sqrt{3}}{2}\right)\sqrt{\left(\frac{l}{g}\right)}.$$ [6] OR \includegraphics{figure_11} The scatter diagram shows a sample of size 5 of bivariate data, together with the regression line of \(y\) on \(x\). State what is minimised in obtaining this regression line, illustrating your answer on a copy of this diagram. [2] State, giving a reason, whether, for the data shown, the regression line of \(y\) on \(x\) is the same as the regression line of \(x\) on \(y\). [1] A car is travelling along a stretch of road with speed \(v\) km h\(^{-1}\) when the brakes are applied. The car comes to rest after travelling a further distance of \(z\) m. The values of \(z\) (and \(\sqrt{z}\)) for 8 different values of \(v\) are given in the table, correct to 2 decimal places.
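\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline
\(v\) & 25 & 30 & 35 & 40 & 45 & 50 & 55 & 60 \\
\hline
\(z\) & 2.83 & 4.63 & 4.84 & 5.29 & 9.73 & 10.30 & 14.82 & 15.21 \\
\hline
\(\sqrt{z}\) & 1.68 & 2.15 & 2.20 & 2.30 & 3.12 & 3.21 & 3.85 & 3.90 \\
\hline
\end{tabular}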
[\(\sum v = 340\), \(\sum v^2 = 15500\), \(\sum \sqrt{z} = 22.41\), \(\sum z = 67.65\), \(\sum v\sqrt{z} = 1022.15\).]
  1. Calculate the product moment correlation coefficient between \(v\) and \(\sqrt{z}\). What does this indicate about the scatter diagram of the points \((v, \sqrt{z})\)? [4]
  2. Given that the product moment correlation coefficient between \(v\) and \(z\) is 0.965, correct to 3 decimal places, state why the regression line of \(\sqrt{z}\) on \(v\) is more suitable than the regression line of \(z\) on \(v\), and find the equation of the regression line of \(\sqrt{z}\) on \(v\). [5]
  3. Comment, in the context of the question, on the value of the constant term in the equation of the regression line of \(\sqrt{z}\) on \(v\). [2]
CAIE FP2 2010 November Q10
13 marks Standard +0.3
For each month of a certain year, a weather station recorded the average rainfall per day, \(x\) mm, and the average amount of sunshine per day, \(y\) hours. The results are summarised below. \(n = 12\), \(\Sigma x = 24.29\), \(\Sigma x^2 = 50.146\), \(\Sigma y = 45.8\), \(\Sigma y^2 = 211.16\), \(\Sigma xy = 88.415\).
  1. Find the mean values, \(\bar{x}\) and \(\bar{y}\). [1]
  2. Calculate the gradient of the line of regression of \(y\) on \(x\). [2]
  3. Use the answers to parts (i) and (ii) to obtain the equation of the line of regression of \(y\) on \(x\). [2]
  4. Find the product moment correlation coefficient and comment, in context, on its value. [4]
  5. Stating your hypotheses, test at the 1% level of significance whether there is negative correlation between average rainfall per day and average amount of sunshine per day. [4]
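Parts 1 to 4 follow mechanically from the summary statistics. The following is a minimal sketch, assuming the standard formulas \(b = S_{xy}/S_{xx}\) and \(r = S_{xy}/\sqrt{S_{xx}S_{yy}}\); the variable names are illustrative. The hypothesis test in the final part also needs a tabulated critical value, which is not reproduced here.

```python
from math import sqrt

# Summary statistics quoted in the question.
n = 12
sum_x, sum_x2 = 24.29, 50.146
sum_y, sum_y2 = 45.8, 211.16
sum_xy = 88.415

# Means.
x_bar = sum_x / n
y_bar = sum_y / n

# Corrected sums of squares and products.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Regression line of y on x, passing through (x_bar, y_bar).
b = s_xy / s_xx
a = y_bar - b * x_bar

# Product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

print(f"x_bar = {x_bar:.3f}, y_bar = {y_bar:.3f}")
print(f"y = {a:.2f} + ({b:.2f}) x, r = {r:.3f}")
```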
CAIE FP2 2014 November Q9
11 marks Standard +0.3
A random sample of 10 pairs of values of \(x\) and \(y\) is given in the following table.
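\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline
\(x\) & 4 & 6 & 6 & 8 & 2 & 7 & 12 & 14 & 9 & 5 \\
\hline
\(y\) & 2 & 4 & 6 & 8 & 6 & 10 & 9 & 8 & 6 & 5 \\
\hline
\end{tabular}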
  1. Find the equation of the regression line of \(y\) on \(x\). [4]
  2. Find the product moment correlation coefficient for the sample. [2]
  3. Find the estimated value of \(y\) when \(x = 10\), and comment on the reliability of this estimate. [2]
  4. Another sample of \(N\) pairs of data from the same population has the same product moment correlation coefficient as the first sample given. A test, at the 1% significance level, on this second sample indicates that there is sufficient evidence to conclude that there is positive correlation. Find the set of possible values of \(N\). [3]
Edexcel S1 2023 June Q2
13 marks Moderate -0.3
Two students, Olive and Shan, collect data on the weight, \(w\) grams, and the tail length, \(t\) cm, of 15 mice. Olive summarised the data as follows: $$S_{tt} = 5.3173 \quad \sum w^2 = 6089.12 \quad \sum tw = 2304.53 \quad \sum w = 297.8 \quad \sum t = 114.8$$
  1. Calculate the value of \(S_{ww}\) and the value of \(S_{tw}\) [3]
  2. Calculate the value of the product moment correlation coefficient between \(w\) and \(t\) [2]
  3. Show that the equation of the regression line of \(w\) on \(t\) can be written as $$w = -16.7 + 4.77t$$ [3]
  4. Give an interpretation of the gradient of the regression line. [1]
  5. Explain why it would not be appropriate to use the regression line in part (c) to estimate the weight of a mouse with a tail length of 2 cm. [2]
Shan decided to code the data using \(x = t - 6\) and \(y = \frac{w}{2} - 5\)
  1. Write down the value of the product moment correlation coefficient between \(x\) and \(y\) [1]
  2. Write down an equation of the regression line of \(y\) on \(x\) You do not need to simplify your equation. [1]
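The summary statistics allow the quoted line \(w = -16.7 + 4.77t\) to be verified, and the effect of Shan's coding to be illustrated. This is a rough sketch under the assumption that the coding is \(x = t - 6\), \(y = \frac{w}{2} - 5\) as stated; the names `coded_gradient` and `coded_intercept` are introduced here for illustration only.

```python
from math import sqrt

# Summary statistics quoted in the question (t = tail length, w = weight).
n = 15
s_tt = 5.3173
sum_w2, sum_tw = 6089.12, 2304.53
sum_w, sum_t = 297.8, 114.8

# Corrected sums of squares and products.
s_ww = sum_w2 - sum_w ** 2 / n
s_tw = sum_tw - sum_t * sum_w / n

# Product moment correlation coefficient between w and t.
r = s_tw / sqrt(s_tt * s_ww)

# Regression line of w on t: w = a + b t.
b = s_tw / s_tt
a = sum_w / n - b * sum_t / n

# Linear coding with positive scale factors leaves r unchanged.
# Substituting t = x + 6 and w = 2(y + 5) into w = a + b t gives y = (a + 6b)/2 - 5 + (b/2) x.
coded_gradient = b / 2
coded_intercept = (a + 6 * b) / 2 - 5

print(f"S_ww = {s_ww:.2f}, S_tw = {s_tw:.2f}, r = {r:.3f}")
print(f"w = {a:.1f} + {b:.2f} t")
```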
Edexcel S1 2002 January Q7
19 marks Moderate -0.3
A number of people were asked to guess the calorific content of 10 foods. The mean \(s\) of the guesses for each food and the true calorific content \(t\) are given in the table below.
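\begin{tabular}{|l|c|c|}
\hline
Food & \(t\) & \(s\) \\
\hline
Packet of biscuits & 170 & 420 \\
\hline
1 potato & 90 & 160 \\
\hline
1 apple & 80 & 110 \\
\hline
Crisp breads & 10 & 70 \\
\hline
Chocolate bar & 260 & 360 \\
\hline
1 slice white bread & 75 & 135 \\
\hline
1 slice brown bread & 60 & 115 \\
\hline
Portion of beef curry & 270 & 350 \\
\hline
Portion of rice pudding & 165 & 390 \\
\hline
Half a pint of milk & 160 & 200 \\
\hline
\end{tabular}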
[You may assume that \(\Sigma t = 1340\), \(\Sigma s = 2310\), \(\Sigma ts = 396775\), \(\Sigma t^2 = 246050\), \(\Sigma s^2 = 694650\).]
  1. Draw a scatter diagram, indicating clearly which is the explanatory (independent) and which is the response (dependent) variable. [3]
  2. Calculate, to 3 significant figures, the product moment correlation coefficient for the above data. [7]
  3. State, with a reason, whether or not the value of the product moment correlation coefficient changes if all the guesses are 50 calories higher than the values in the table. [2]
The mean of the guesses for the portion of rice pudding and for the packet of biscuits are outside the linear relation of the other eight foods.
  1. Find the equation of the regression line of \(s\) on \(t\) excluding the values for rice pudding and biscuits. [3]
[You may now assume that \(S_{tt} = 72587\), \(S_{st} = 63671.875\), \(\bar{t} = 125.625\), \(\bar{s} = 187.5\).]
  1. Draw the regression line on your scatter diagram. [2]
  2. State, with a reason, what the effect would be on the regression line of including the values for a portion of rice pudding and a packet of biscuits. [2]
Edexcel S1 2010 January Q6
18 marks Moderate -0.8
The blood pressures, \(p\) mmHg, and the ages, \(t\) years, of 7 hospital patients are shown in the table below.
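\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
Patient & A & B & C & D & E & F & G \\
\hline
\(t\) & 42 & 74 & 48 & 35 & 56 & 26 & 60 \\
\hline
\(p\) & 98 & 130 & 120 & 88 & 182 & 80 & 135 \\
\hline
\end{tabular}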
[\(\sum t = 341\), \(\sum p = 833\), \(\sum t^2 = 18181\), \(\sum p^2 = 106397\), \(\sum tp = 42948\)]
  1. Find \(S_{tt}\), \(S_{pp}\) and \(S_{tp}\) for these data. [4]
  2. Calculate the product moment correlation coefficient for these data. [3]
  3. Interpret the correlation coefficient. [1]
  4. On the graph paper on page 17, draw the scatter diagram of blood pressure against age for these 7 patients. [2]
  5. Find the equation of the regression line of \(p\) on \(t\). [4]
  6. Plot your regression line on your scatter diagram. [2]
  7. Use your regression line to estimate the blood pressure of a 40 year old patient. [2]
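The numerical parts of this question can be checked from the quoted totals. The following is a minimal sketch, assuming the cross term required in the first part is \(S_{tp}\); variable names are illustrative.

```python
from math import sqrt

# Summary statistics quoted in the question (t = age, p = blood pressure).
n = 7
sum_t, sum_p = 341, 833
sum_t2, sum_p2, sum_tp = 18181, 106397, 42948

# Corrected sums of squares and products.
s_tt = sum_t2 - sum_t ** 2 / n
s_pp = sum_p2 - sum_p ** 2 / n
s_tp = sum_tp - sum_t * sum_p / n

# Product moment correlation coefficient.
r = s_tp / sqrt(s_tt * s_pp)

# Regression line of p on t: p = a + b t.
b = s_tp / s_tt
a = sum_p / n - b * sum_t / n
pressure_at_40 = a + b * 40

print(f"S_tt = {s_tt:.2f}, S_pp = {s_pp:.0f}, S_tp = {s_tp:.0f}, r = {r:.3f}")
print(f"p = {a:.1f} + {b:.2f} t, estimate at t = 40: {pressure_at_40:.0f} mmHg")
```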
Edexcel S1 2011 June Q7
12 marks Moderate -0.8
A teacher took a random sample of 8 children from a class. For each child the teacher recorded the length of their left foot, \(f\) cm, and their height, \(h\) cm. The results are given in the table below.
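\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline
\(f\) & 23 & 26 & 23 & 22 & 27 & 24 & 20 & 21 \\
\hline
\(h\) & 135 & 144 & 134 & 136 & 140 & 134 & 130 & 132 \\
\hline
\end{tabular}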
(You may use \(\sum f = 186 \quad \sum h = 1085 \quad S_{ff} = 39.5 \quad S_{hh} = 139.875 \quad \sum fh = 25291\))
  1. Calculate \(S_{fh}\) [2]
  2. Find the equation of the regression line of \(h\) on \(f\) in the form \(h = a + bf\). Give the value of \(a\) and the value of \(b\) correct to 3 significant figures. [5]
  3. Use your equation to estimate the height of a child with a left foot length of 25 cm. [2]
  4. Comment on the reliability of your estimate in (c), giving a reason for your answer. [2]
The left foot length of the teacher is 25 cm.
  1. Give a reason why the equation in (b) should not be used to estimate the teacher's height. [1]
Edexcel S1 2002 November Q5
12 marks Standard +0.3
An agricultural researcher collected data, in appropriate units, on the annual rainfall \(x\) and the annual yield of wheat \(y\) at 8 randomly selected places. The data were coded using \(s = x - 6\) and \(t = y - 20\) and the following summations were obtained. \(\Sigma s = 48.5\), \(\Sigma t = 65.0\), \(\Sigma s^2 = 402.11\), \(\Sigma t^2 = 701.80\), \(\Sigma st = 523.23\)
  1. Find the equation of the regression line of \(t\) on \(s\) in the form \(t = p + qs\). [7]
  2. Find the equation of the regression line of \(y\) on \(x\) in the form \(y = a + bx\), giving \(a\) and \(b\) to 3 decimal places. [3]
The value of the product moment correlation coefficient between \(s\) and \(t\) is 0.943, to 3 decimal places.
  1. Write down the value of the product moment correlation coefficient between \(x\) and \(y\). Give a justification for your answer. [2]
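The step from the coded line to the line in the original variables is where slips tend to happen, so a numerical sketch may help. It assumes the coding \(s = x - 6\), \(t = y - 20\) stated in the question; the variable names are illustrative.

```python
from math import sqrt

# Coded summary statistics quoted in the question.
n = 8
sum_s, sum_t = 48.5, 65.0
sum_s2, sum_t2, sum_st = 402.11, 701.80, 523.23

# Corrected sums of squares and products in the coded variables.
s_ss = sum_s2 - sum_s ** 2 / n
s_tt = sum_t2 - sum_t ** 2 / n
s_st = sum_st - sum_s * sum_t / n

# Regression line of t on s: t = p + q s.
q = s_st / s_ss
p = sum_t / n - q * sum_s / n

# Undo the coding: y - 20 = p + q (x - 6), so y = (20 + p - 6q) + q x.
b = q
a = 20 + p - 6 * q

# The correlation coefficient is unaffected by subtracting constants.
r = s_st / sqrt(s_ss * s_tt)

print(f"t = {p:.3f} + {q:.3f} s")
print(f"y = {a:.3f} + {b:.3f} x, r = {r:.3f}")
```

The computed `r` agrees with the value 0.943 quoted in the question, which is a useful consistency check on the sums.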
Edexcel S1 Specimen Q4
14 marks Moderate -0.3
A drilling machine can run at various speeds, but in general the higher the speed the sooner the drill needs to be replaced. Over several months, 15 pairs of observations relating to speed, \(s\) revolutions per minute, and life of drill, \(h\) hours, are collected. For convenience the data are coded so that \(x = s - 20\) and \(y = h - 100\) and the following summations obtained. \(\Sigma x = 143; \Sigma y = 391; \Sigma x^2 = 2413; \Sigma y^2 = 22441; \Sigma xy = 484\).
  1. Find the equation of the regression line of \(h\) on \(s\). [10]
  2. Interpret the slope of your regression line. [2]
  3. Estimate the life of a drill revolving at 30 revolutions per minute. [2]
Edexcel S1 Q7
15 marks Moderate -0.3
The following data was collected for seven cars, showing their engine size, \(x\) litres, and their fuel consumption, \(y\) km per litre, on a long journey.
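\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
Car & \(A\) & \(B\) & \(C\) & \(D\) & \(E\) & \(F\) & \(G\) \\
\hline
\(x\) & 0.95 & 1.20 & 1.37 & 1.76 & 2.25 & 2.50 & 2.875 \\
\hline
\(y\) & 21.3 & 17.2 & 15.5 & 19.1 & 14.7 & 11.4 & 9.0 \\
\hline
\end{tabular}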
\(\sum x = 12.905\), \(\sum x^2 = 26.8951\), \(\sum y = 108.2\), \(\sum y^2 = 1781.64\), \(\sum xy = 183.176\).
  1. Calculate the equation of the regression line of \(x\) on \(y\), expressing your answer in the form \(x = ay + b\). [6 marks]
  2. Calculate the product moment correlation coefficient between \(y\) and \(x\) and give a brief interpretation of its value. [4 marks]
  3. Use the equation of the regression line to estimate the value of \(x\) when \(y = 12\). State, with a reason, how accurate you would expect this estimate to be. [3 marks]
  4. Comment on the use of the line to find values of \(x\) as \(y\) gets very small. [2 marks]
Edexcel S1 Q5
13 marks Standard +0.3
The following marks out of 50 were given by two judges to the contestants in a talent contest:
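\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Contestant & \(A\) & \(B\) & \(C\) & \(D\) & \(E\) & \(F\) & \(G\) & \(H\) \\
\hline
Judge 1 (\(x\)) & 43 & 32 & 40 & 21 & 47 & 11 & 29 & 38 \\
\hline
Judge 2 (\(y\)) & 39 & 25 & 40 & 22 & 36 & 13 & 27 & 32 \\
\hline
\end{tabular}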
Given that \(\sum x = 261\), \(\sum x^2 = 9529\) and \(\sum xy = 8373\),
  1. calculate the product-moment correlation coefficient between the two judges' marks [5 marks]
  2. Find an equation of the regression line of \(x\) on \(y\). [4 marks]
Contestant \(I\) was awarded 45 marks by Judge 2.
  1. Estimate the mark that this contestant would have received from Judge 1. [2 marks]
  2. Comment, with explanation, on the probable accuracy of your answer. [2 marks]
Edexcel S1 Q6
21 marks Standard +0.3
A missile was fired vertically upwards and its height above ground level, \(h\) metres, was found at various times \(t\) seconds after it was released. The results are given in the following table:
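\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline
\(t\) & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\
\hline
\(h\) & 68 & 126 & 174 & 216 & 240 & 252 & 266 \\
\hline
\end{tabular}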
It is thought that this data can be fitted to the formula \(h = pt - qt^2\).
  1. Show that this equation can be written as \(\frac{h}{t} = p - qt\). [1 mark]
  2. Plot a scatter diagram of \(\frac{h}{t}\) against \(t\). [5 marks]
Given that \(\sum h = 1342\), \(\sum \frac{h}{t} = 371\) and \(\sum \frac{h^2}{t^2} = 20385\),
  1. find the equation of the regression line of \(\frac{h}{t}\) on \(t\) and hence write down the values of \(p\) and \(q\). [8 marks]
  2. Use your equation to find the value of \(h\) when \(t = 10\). Comment on the implication of your answer. [3 marks]
  3. Find the product-moment correlation coefficient between \(\frac{h}{t}\) and \(t\) and state the significance of its value. [4 marks]
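The linearisation \(\frac{h}{t} = p - qt\) can be illustrated numerically. The following is a rough sketch, assuming the seven data pairs in the table above; it works from the raw data rather than the quoted sums, and the variable names are illustrative.

```python
from math import sqrt

# Data transcribed from the question: time in seconds and height in metres.
t = [1, 2, 3, 4, 5, 6, 7]
h = [68, 126, 174, 216, 240, 252, 266]

# Transformed response h/t, which the model predicts is linear in t.
ratio = [hi / ti for hi, ti in zip(h, t)]

n = len(t)
t_bar = sum(t) / n
ratio_bar = sum(ratio) / n

# Corrected sums of squares and products for (t, h/t).
s_tt = sum((ti - t_bar) ** 2 for ti in t)
s_rr = sum((ri - ratio_bar) ** 2 for ri in ratio)
s_tr = sum((ti - t_bar) * (ri - ratio_bar) for ti, ri in zip(t, ratio))

# Regression line of h/t on t: h/t = intercept + gradient * t, so p = intercept and q = -gradient.
gradient = s_tr / s_tt
intercept = ratio_bar - gradient * t_bar
p, q = intercept, -gradient

# Predicted height at t = 10 from h = p t - q t^2 (an extrapolation beyond the data).
h_at_10 = p * 10 - q * 10 ** 2

# Product moment correlation coefficient between h/t and t.
r = s_tr / sqrt(s_tt * s_rr)

print(f"p = {p:.2f}, q = {q:.2f}, h(10) = {h_at_10:.0f} m, r = {r:.4f}")
```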
Edexcel S1 Q6
15 marks Standard +0.3
The marks out of 75 obtained by a group of ten students in their first and second Statistics modules were as follows:
\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|}
\hline
Student & \(A\) & \(B\) & \(C\) & \(D\) & \(E\) & \(F\) & \(G\) & \(H\) & \(I\) & \(J\) \\
\hline
Module 1 \((x)\) & 54 & 33 & 42 & 71 & 60 & 27 & 39 & 46 & 59 & 64 \\
\hline
Module 2 \((y)\) & 50 & 22 & 44 & 58 & 42 & 19 & 35 & 46 & 55 & 60 \\
\hline
\end{tabular}
  1. Find \(\sum x\) and \(\sum y\). [2 marks]
Given that \(\sum x^2 = 26353\) and \(\sum xy = 22991\),
  1. obtain the equation of the regression line of \(y\) on \(x\). [5 marks]
  2. Estimate the Module 2 result of a student whose mark in Module 1 was (i) 65, (ii) 5. Explain why one of these estimates is less reliable than the other. [4 marks]
The equation of the regression line of \(x\) on \(y\) is \(x = 0.921y + 9.81\).
  1. Deduce the product moment correlation coefficient between \(x\) and \(y\), and briefly interpret its value. [4 marks]
Edexcel S1 Q6
13 marks Standard +0.3
Two variables \(x\) and \(y\) are such that, for a sample of ten pairs of values, $$\sum x = 104.5, \quad \sum y = 113.6, \quad \sum x^2 = 1954.1, \quad \sum y^2 = 2100.6.$$ The regression line of \(x\) on \(y\) has gradient 0.8. Find
  1. \(\sum xy\), [4 marks]
  2. the equation of the regression line of \(y\) on \(x\), [5 marks]
  3. the product moment correlation coefficient between \(y\) and \(x\). [3 marks]
  4. Describe the kind of correlation indicated by your answer to (c). [1 mark]
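This question runs the usual calculation backwards: the gradient of the line of \(x\) on \(y\) is \(S_{xy}/S_{yy}\), so \(S_{xy}\) and then \(\sum xy\) can be recovered. A minimal sketch of that chain of reasoning, with illustrative variable names:

```python
from math import sqrt

# Summary statistics quoted in the question.
n = 10
sum_x, sum_y = 104.5, 113.6
sum_x2, sum_y2 = 1954.1, 2100.6
gradient_x_on_y = 0.8

# Corrected sums of squares.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n

# Gradient of x on y is S_xy / S_yy, so S_xy follows directly.
s_xy = gradient_x_on_y * s_yy
sum_xy = s_xy + sum_x * sum_y / n

# Regression line of y on x: y = a + b x.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# r squared equals the product of the two gradients; r takes their common sign.
r = sqrt(gradient_x_on_y * b)

print(f"sum xy = {sum_xy:.4f}")
print(f"y = {a:.2f} + {b:.3f} x, r = {r:.3f}")
```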
OCR S1 2010 January Q6
7 marks Standard +0.3
  1. A student calculated the values of the product moment correlation coefficient, \(r\), and Spearman's rank correlation coefficient, \(r_s\), for two sets of bivariate data, \(A\) and \(B\). His results are given below. $$A: \quad r = 0.9 \text{ and } r_s = 1$$ $$B: \quad r = 1 \quad \text{and } r_s = 0.9$$ With the aid of a diagram where appropriate, explain why the student's results for \(A\) could both be correct but his results for \(B\) cannot both be correct. [3]
  2. An old research paper has been partially destroyed. The surviving part of the paper contains the following incomplete information about some bivariate data from an experiment. \includegraphics{figure_6} The mean of \(x\) is 4.5. The equation of the regression line of \(y\) on \(x\) is \(y = 2.4x + 3.7\). The equation of the regression line of \(x\) on \(y\) is \(x = 0.40y\) + [missing constant] Calculate the missing constant at the end of the equation of the second regression line. [4]
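For the second part, both regression lines pass through the point \((\bar{x}, \bar{y})\), which is enough to recover the missing constant. A short sketch of that argument, with illustrative variable names:

```python
# Mean of x, as stated in the question.
x_bar = 4.5

# The line of y on x passes through (x_bar, y_bar), which gives y_bar.
y_bar = 2.4 * x_bar + 3.7

# The line of x on y, x = 0.40 y + c, passes through the same point, which gives c.
missing_constant = x_bar - 0.40 * y_bar

print(f"y_bar = {y_bar:.1f}, missing constant = {missing_constant:.1f}")
```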
OCR S1 2013 January Q3
12 marks Moderate -0.3
The Gross Domestic Product per Capita (GDP), \(x\) dollars, and the Infant Mortality Rate per thousand (IMR), \(y\), of 6 African countries were recorded and summarised as follows. $$n = 6 \quad \sum x = 7000 \quad \sum x^2 = 8700000 \quad \sum y = 456 \quad \sum y^2 = 36262 \quad \sum xy = 509900$$
  1. Calculate the equation of the regression line of \(y\) on \(x\) for these 6 countries. [4]
The original data were plotted on a scatter diagram and the regression line of \(y\) on \(x\) was drawn, as shown below. \includegraphics{figure_3}
  1. The GDP for another country, Tanzania, is 1300 dollars. Use the regression line in the diagram to estimate the IMR of Tanzania. [1]
  2. The GDP for Nigeria is 2400 dollars. Give two reasons why the regression line is unlikely to give a reliable estimate for the IMR for Nigeria. [2]
  3. The actual value of the IMR for Tanzania is 96. The data for Tanzania (\(x = 1300, y = 96\)) is now included with the original 6 countries. Calculate the value of the product moment correlation coefficient, \(r\), for all 7 countries. [4]
  4. The IMR is now redefined as the infant mortality rate per hundred instead of per thousand, and the value of \(r\) is recalculated for all 7 countries. Without calculation state what effect, if any, this would have on the value of \(r\) found in part (iv). [1]
OCR S1 2009 June Q3
8 marks Moderate -0.3
In an agricultural experiment, the relationship between the amount of water supplied, \(x\) units, and the yield, \(y\) units, was investigated. Six values of \(x\) were chosen and for each value of \(x\) the corresponding value of \(y\) was measured. The results are shown in the table.
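\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\(x\) & 1 & 2 & 3 & 4 & 5 & 6 \\
\hline
\(y\) & 3 & 6 & 8 & 8 & 11 & 10 \\
\hline
\end{tabular}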
These results, together with the regression line of \(y\) on \(x\), are plotted on the graph. \includegraphics{figure_1}
  1. Give a reason why the regression line of \(x\) on \(y\) is not suitable in this context. [1]
  2. Explain the significance, for the regression line of \(y\) on \(x\), of the distances shown by the vertical dotted lines in the diagram. [2]
  3. Calculate the value of the product moment correlation coefficient, \(r\). [3]
  4. Comment on your value of \(r\) in relation to the diagram. [2]