5.09a Dependent/independent variables

164 questions

OCR S1 2010 June Q3
10 marks Moderate -0.8
  1. Some values, \((x, y)\), of a bivariate distribution are plotted on a scatter diagram and a regression line is to be drawn. Explain how to decide whether the regression line of \(y\) on \(x\) or the regression line of \(x\) on \(y\) is appropriate. [2]
  2. In an experiment the temperature, \(x\) °C, of a rod was gradually increased from 0 °C, and the extension, \(y\), was measured nine times at 50 °C intervals. The results are summarised below. $$n = 9 \quad \Sigma x = 1800 \quad \Sigma y = 14.4 \quad \Sigma x^2 = 510000 \quad \Sigma y^2 = 32.6416 \quad \Sigma xy = 4080$$
    1. Show that the gradient of the regression line of \(y\) on \(x\) is 0.008 and find the equation of this line. [4]
    2. Use your equation to estimate the temperature when the extension is 2.5 mm. [1]
    3. Use your equation to estimate the extension for a temperature of \(-50\) °C. [1]
    4. Comment on the meaning and the reliability of your estimate in part (c). [2]
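A minimal sketch of how the gradient in part 2 could be checked from the summary statistics, assuming the usual formulas \(S_{xx} = \Sigma x^2 - (\Sigma x)^2/n\) and \(S_{xy} = \Sigma xy - \Sigma x \Sigma y/n\). The helper name `predict_extension` is hypothetical, introduced only for illustration.

```python
# Summary statistics quoted in the question (n = 9 readings).
n = 9
sum_x, sum_y = 1800, 14.4
sum_x2, sum_xy = 510000, 4080

# Corrected sum of squares and sum of products.
s_xx = sum_x2 - sum_x**2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Regression line of y on x: y = a + b x, passing through the mean point.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n


def predict_extension(temp_c):
    """Extension (mm) predicted by the fitted line; hypothetical helper."""
    return a + b * temp_c


print(f"gradient b = {b:.4f}, intercept a = {a:.4f}")
print(f"estimate at -50 °C: {predict_extension(-50):.2f} mm")
```

The estimate at \(-50\) °C lies outside the measured range of 0 °C to 400 °C, so the code can produce a number but cannot say whether that number is trustworthy.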
OCR S1 2013 June Q5
9 marks Moderate -0.3
The table shows some of the values of the seasonally adjusted Unemployment Rate (UR), \(x\)\%, and the Consumer Price Index (CPI), \(y\)\%, in the United Kingdom from April 2008 to July 2010.
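| Date | April 2008 | July 2008 | October 2008 | January 2009 | April 2009 | July 2009 | October 2009 | January 2010 | April 2010 | July 2010 |
|---|---|---|---|---|---|---|---|---|---|---|
| UR, \(x\)\% | 5.2 | 5.7 | 6.1 | 6.8 | 7.5 | 7.8 | 7.8 | 7.9 | 7.8 | 7.7 |
| CPI, \(y\)\% | 3.0 | 4.4 | 4.5 | 3.0 | 2.3 | 1.8 | 1.5 | 3.5 | 3.7 | 3.1 |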
These data are summarised below. $$n = 10 \quad \sum x = 70.3 \quad \sum x^2 = 503.45 \quad \sum y = 30.8 \quad \sum y^2 = 103.94 \quad \sum xy = 211.9$$
  1. Calculate the product moment correlation coefficient, \(r\), for the data, showing that \(-0.6 < r < -0.5\). [3]
  2. Karen says "The negative value of \(r\) shows that when the Unemployment Rate increases, it causes the Consumer Price Index to decrease." Give a criticism of this statement. [1]
    1. Calculate the equation of the regression line of \(x\) on \(y\). [3]
    2. Use your equation to estimate the value of the Unemployment Rate in a month when the Consumer Price Index is 4.0\%. [2]
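As a rough check on the working for this question, the sketch below computes the product moment correlation coefficient and the regression line of \(x\) on \(y\) from the quoted sums. The variable names `c` and `d` for the intercept and gradient of the \(x\) on \(y\) line are assumptions made for this illustration.

```python
from math import sqrt

# Summary statistics quoted in the question (n = 10 months).
n = 10
sum_x, sum_x2 = 70.3, 503.45
sum_y, sum_y2 = 30.8, 103.94
sum_xy = 211.9

s_xx = sum_x2 - sum_x**2 / n
s_yy = sum_y2 - sum_y**2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

# Regression line of x on y, x = c + d y: the divisor is S_yy, not S_xx.
d = s_xy / s_yy
c = sum_x / n - d * sum_y / n

# Estimated Unemployment Rate when the CPI is 4.0%.
ur_at_cpi_4 = c + d * 4.0

print(f"r = {r:.3f}")
print(f"x = {c:.2f} + ({d:.3f}) y")
print(f"UR estimate at CPI 4.0%: {ur_at_cpi_4:.2f}%")
```

Dividing by \(S_{yy}\) rather than \(S_{xx}\) is what distinguishes the \(x\) on \(y\) line from the \(y\) on \(x\) line.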
Edexcel S1 Q6
17 marks Moderate -0.3
Penshop have stores selling stationery in each of 6 towns. The population, \(P\), in tens of thousands and the monthly turnover, \(T\), in thousands of pounds for each of the shops are as recorded below.
| Town | Abberton | Bember | Claster | Deller | Edgeton | Figland |
|---|---|---|---|---|---|---|
| \(P\) (10 000's) | 3.2 | 7.6 | 5.2 | 9.0 | 8.1 | 4.8 |
| \(T\) (£ 000's) | 11.1 | 12.4 | 13.3 | 19.3 | 17.9 | 11.8 |
  1. Represent these data on a scatter diagram with \(T\) on the vertical axis. [4]
    1. Which town's shop might appear to be underachieving given the populations of the towns?
    2. Suggest two other factors that might affect each shop's turnover. [3]
You may assume that $$\Sigma P = 37.9, \quad \Sigma T = 85.8, \quad \Sigma P^2 = 264.69, \quad \Sigma T^2 = 1286, \quad \Sigma PT = 574.25.$$
  1. Find the equation of the regression line of \(T\) on \(P\). [7]
  2. Estimate the monthly turnover that might be expected if a shop were opened in Gratton, a town with a population of 68 000. [2]
  3. Why might the management of Penshop be reluctant to use the regression line to estimate the monthly turnover they could expect if a shop were opened in Haggin, a town with a population of 172 000? [1]
Edexcel S1 Q4
12 marks Standard +0.3
The owner of a mobile burger-bar believes that hot weather reduces his sales. To investigate the effect on his business he collected data on his daily sales, \(£P\), and the maximum temperature, \(T\)°C, on each of 20 days. He then coded the data, using \(x = T - 20\) and \(y = P - 300\), and calculated the summary statistics given below. $$\Sigma x = 57, \quad \Sigma y = 2222, \quad \Sigma x^2 = 401, \quad \Sigma y^2 = 305576, \quad \Sigma xy = 3871.$$
  1. Find an equation of the regression line of \(P\) on \(T\). [9 marks]
The owner of the bar doesn't believe it is profitable for him to run the bar if he takes less than £460 in a day.
  2. According to your regression line at what maximum daily temperature, to the nearest degree Celsius, does it become unprofitable for him to run the bar? [3 marks]
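The coding step in this question is easy to get wrong, so here is a hedged sketch of one way to fit the line in the coded variables and then decode it. It assumes the line is fitted as \(y = a + bx\) and then rewritten using \(x = T - 20\) and \(y = P - 300\); the names `gradient`, `intercept` and `break_even_temp` are illustrative only.

```python
# Summary statistics for the coded data (n = 20 days),
# where x = T - 20 and y = P - 300.
n = 20
sum_x, sum_y = 57, 2222
sum_x2, sum_xy = 401, 3871

s_xx = sum_x2 - sum_x**2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Regression line of y on x in the coded variables.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# Decode: P - 300 = a + b (T - 20)  =>  P = (300 + a - 20 b) + b T.
gradient = b
intercept = 300 + a - 20 * b

# Temperature at which predicted sales fall to £460.
break_even_temp = (460 - intercept) / gradient

print(f"P = {intercept:.1f} + ({gradient:.2f}) T")
print(f"sales reach £460 at about {break_even_temp:.1f} °C")
```

A linear coding that only shifts the variables leaves the gradient unchanged, which is why only the intercept needs adjusting here.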
Edexcel S1 Q6
16 marks Moderate -0.3
The Principal of a school believes that more students are absent on days when the temperature is lower. Over a two-week period in December she records the percentage of students who are absent, \(A\%\), and the temperature, \(T\) °C, at 9 am each morning giving these results.
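| \(T\) (°C) | 4 | \(-3\) | \(-2\) | \(-6\) | 0 | 3 | 7 | \(-1\) | 3 | 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(A\) (\%) | 8.5 | 14.1 | 17.0 | 20.3 | 17.9 | 15.5 | 12.4 | 12.8 | 13.7 | 11.6 |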
  1. Represent these data on a scatter diagram. [4 marks]
You may use $$\Sigma T = 7, \quad \Sigma A = 143.8, \quad \Sigma T^2 = 137, \quad \Sigma A^2 = 2172.66, \quad \Sigma TA = 20.7$$
  2. Calculate the product moment correlation coefficient for these data and comment on the Principal's hypothesis. [6 marks]
  3. Find an equation of the regression line of \(A\) on \(T\) in the form \(A = p + qT\). [4 marks]
  4. Draw the regression line on your scatter diagram. [2 marks]
OCR MEI S2 2007 January Q1
18 marks Moderate -0.8
In a science investigation into energy conservation in the home, a student is collecting data on the time taken for an electric kettle to boil as the volume of water in the kettle is varied. The student's data are shown in the table below, where \(v\) litres is the volume of water in the kettle and \(t\) seconds is the time taken for the kettle to boil (starting with the water at room temperature in each case). Also shown are summary statistics and a scatter diagram on which the regression line of \(t\) on \(v\) is drawn.
| \(v\) | 0.2 | 0.4 | 0.6 | 0.8 | 1.0 |
|---|---|---|---|---|---|
| \(t\) | 44 | 78 | 114 | 156 | 172 |
\(n = 5\), \(\Sigma v = 3.0\), \(\Sigma t = 564\), \(\Sigma v^2 = 2.20\), \(\Sigma vt = 405.2\). \includegraphics{figure_1}
  1. Calculate the equation of the regression line of \(t\) on \(v\), giving your answer in the form \(t = a + bv\). [5]
  2. Use this equation to predict the time taken for the kettle to boil when the amount of water which it contains is
    1. 0.5 litres,
    2. 1.5 litres.
    Comment on the reliability of each of these predictions. [4]
  3. In the equation of the regression line found in part (i), explain the role of the coefficient of \(v\) in the relationship between time taken and volume of water. [2]
  4. Calculate the values of the residuals for \(v = 0.8\) and \(v = 1.0\). [4]
  5. Explain how, on a scatter diagram with the regression line drawn accurately on it, a residual could be measured and its sign determined. [3]
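To illustrate what a residual is in this question, the sketch below fits the line of \(t\) on \(v\) to the five data points and computes each residual as observed time minus predicted time. This is one possible check, assuming the standard least squares formulas; the list name `residuals` is an illustrative choice.

```python
# Data from the table: volume (litres) and time to boil (seconds).
v = [0.2, 0.4, 0.6, 0.8, 1.0]
t = [44, 78, 114, 156, 172]
n = len(v)

s_vv = sum(x * x for x in v) - sum(v) ** 2 / n
s_vt = sum(x * y for x, y in zip(v, t)) - sum(v) * sum(t) / n

# Regression line of t on v: t = a + b v.
b = s_vt / s_vv
a = sum(t) / n - b * sum(v) / n

# Residual = observed value minus value predicted by the line.
# Positive means the point lies above the line, negative below.
residuals = [y - (a + b * x) for x, y in zip(v, t)]

for x, res in zip(v, residuals):
    print(f"v = {x:.1f}: residual = {res:+.1f} s")
```

On the scatter diagram, each residual corresponds to the vertical distance between a plotted point and the line.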
WJEC Unit 2 2018 June Q05
6 marks Easy -1.2
A baker is aware that the pH of his sourdough, \(y\), and the hydration, \(x\), affect the taste and texture of the final product. The hydration is measured in ml of water per 100 g of flour (ml/100 g). The baker researches how the pH of his sourdough changes as the hydration changes. The results of his research are shown in the diagram below. \includegraphics{figure_5}
  1. Describe the relationship between pH and hydration. [2]
  2. The equation of the regression line for \(y\) on \(x\) is $$y = 5.4 - 0.02x.$$
    1. Interpret the gradient and intercept of the regression line in this context.
    2. Estimate the pH of the sourdough when the hydration is 20 ml/100 g. Comment on the reliability of this estimate. [4]
WJEC Unit 2 Specimen Q4
7 marks Easy -1.3
A researcher wishes to investigate the relationship between the amount of carbohydrate and the number of calories in different fruits. He compiles a list of 90 different fruits, e.g. apricots, kiwi fruits, raspberries. As he does not have enough time to collect data for each of the 90 different fruits, he decides to select a simple random sample of 14 different fruits from the list. For each fruit selected, he then uses a dieting website to find the number of calories (kcal) and the amount of carbohydrate (g) in a typical adult portion (e.g. a whole apple, a bunch of 10 grapes, half a cup of strawberries). He enters these data into a spreadsheet for analysis.
  1. Explain how the random number function on a calculator could be used to select this sample of 14 different fruits. [3]
  2. The scatter graph represents 'Number of calories' against 'Carbohydrate' for the sample of 14 different fruits.
    1. Describe the correlation between 'Number of calories' and 'Carbohydrate'. [1]
    2. Interpret the correlation between 'Number of calories' and 'Carbohydrate' in this context. [1]
    \includegraphics{figure_1}
  3. The equation of the regression line for this dataset is: 'Number of calories' = 12.4 + 2.9 × 'Carbohydrate'
    1. Interpret the gradient of the regression line in this context. [1]
    2. Explain why it is reasonable for the regression line to have a non-zero intercept in this context. [1]
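The sampling procedure asked for in part 1 can be mimicked in code. The sketch below assumes the 90 fruits are numbered 1 to 90 and that repeated random numbers are ignored until 14 different fruits have been chosen; the function name `select_sample` and its parameters are hypothetical.

```python
import random


def select_sample(population_size=90, sample_size=14, seed=None):
    """Pick distinct item numbers by repeated random draws, ignoring repeats."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < sample_size:
        # Analogous to generating a random integer from 1 to 90 on a calculator.
        number = rng.randint(1, population_size)
        if number not in chosen:  # ignore repeats so the fruits are all different
            chosen.append(number)
    return chosen


sample = select_sample(seed=1)
print(sorted(sample))
```

Python's `random.sample(range(1, 91), 14)` would give an equivalent result in one call; the loop is written out here only to mirror the calculator method.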
WJEC Further Unit 2 2018 June Q7
7 marks Moderate -0.3
A university professor conducted some research into factors that affect job satisfaction. The four factors considered were Interesting work, Good wages, Job security and Appreciation of work done. The professor interviewed workers at 14 different companies and asked them to rate their companies on each of the factors. The workers' ratings were averaged to give each company a score out of 5 on each factor. Each company was also given a score out of 100 for Job satisfaction. The following graph shows the part of the research concerning Job Satisfaction versus Interesting work. \includegraphics{figure_2}
  1. Calculate the equation of the least squares regression line of Job satisfaction (\(y\)) on Interesting work (\(x\)), given the following summary statistics. $$\sum x = 46.2, \quad \sum y = 898, \quad S_{xx} = 3.48, \quad S_{xy} = 49.45, \quad S_{yy} = 1437.714, \quad n = 14$$ [5]
  2. Give two reasons why it would be inappropriate for the professor to use this equation to calculate the score for Interesting work from a Job satisfaction score of 90. [2]
WJEC Further Unit 2 2023 June Q2
8 marks Moderate -0.3
For a set of 30 pairs of observations of the variables \(x\) and \(y\), it is known that \(\sum x = 420\) and \(\sum y = 240\). The least squares regression line of \(y\) on \(x\) passes through the point with coordinates \((19, 20)\).
  1. Show that the equation of the regression line of \(y\) on \(x\) is \(y = 2.4x - 25.6\) and use it to predict the value of \(y\) when \(x = 26\). [6]
  2. State two reasons why your prediction in part (a) may not be reliable. [2]
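This question relies on the fact that a least squares regression line of \(y\) on \(x\) always passes through the mean point \((\bar{x}, \bar{y})\). A short sketch of that reasoning, with illustrative variable names, is:

```python
# Information given in the question.
n = 30
sum_x, sum_y = 420, 240

# The regression line of y on x passes through the mean point.
mean_point = (sum_x / n, sum_y / n)
given_point = (19, 20)

# Two known points on the line determine its gradient and intercept.
gradient = (given_point[1] - mean_point[1]) / (given_point[0] - mean_point[0])
intercept = mean_point[1] - gradient * mean_point[0]

prediction = intercept + gradient * 26

print(f"y = {gradient:.1f}x + ({intercept:.1f})")
print(f"predicted y at x = 26: {prediction:.1f}")
```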
WJEC Further Unit 2 Specimen Q4
9 marks Moderate -0.8
A year 12 student wishes to study at a Welsh university. For a randomly chosen year between 2000 and 2017 she collected data for seven universities in Wales from the Complete University Guide website. The data are for the variables:
• 'Entry standards' – the average UCAS tariff score of new undergraduate students;
• 'Student satisfaction' – a measure of student views of the teaching quality at the university taken from the National Student Survey (maximum 5);
• 'Graduate prospects' – a measure of the employability of a university's first degree graduates (maximum 100);
• 'Research quality' – a measure of the quality of the research undertaken in the university (maximum 4).
  1. Pearson's product-moment correlation coefficients, for each pairing of the four variables, are shown in the table below. Discuss the correlation between graduate prospects and the other three variables. [2]
    | Variable | Entry standards | Student satisfaction | Graduate prospects | Research quality |
    |---|---|---|---|---|
    | Entry standards | 1 | | | |
    | Student satisfaction | -0.030 | 1 | | |
    | Graduate prospects | 0.772 | 0.236 | 1 | |
    | Research quality | 0.866 | 0.066 | 0.827 | 1 |
  2. Calculate the equation of the least squares regression line to predict 'Entry standards' (\(y\)) from 'Research quality' (\(x\)), given the summary statistics: $$\sum x = 22.24, \quad \sum y = 2522, \quad S_{xx} = 1.0542, \quad S_{xy} = 20193.5, \quad S_{yy} = 122.72.$$ [5]
  3. The data for one of the Welsh universities are missing. This university has a research quality of 3.00. Use your equation to predict the entry standard for this university. [2]
SPS SPS FM Statistics 2021 January Q3
7 marks Moderate -0.3
A large field of wheat is split into 8 plots of equal area. Each plot is treated with a different amount of fertiliser, \(f\) grams/m². The yield of wheat, \(w\) tonnes, from each plot is recorded. The results are summarised below. $$\sum f = 28 \quad \sum w = 303 \quad \sum w^2 = 13447 \quad S_{ff} = 42 \quad S_{fw} = 269.5$$
  1. Calculate the product moment correlation coefficient between \(f\) and \(w\) [2]
  2. Interpret the value of your product moment correlation coefficient. [1]
  3. Find the equation of the regression line of \(w\) on \(f\) in the form \(w = a + bf\) [3]
  4. Using your equation, estimate the decrease in yield when the amount of fertiliser decreases by 0.5 grams/m² [1]
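A hedged sketch of the calculations in this question follows. It assumes \(S_{ww}\) must first be found from \(\sum w^2\) and \(\sum w\), since only \(S_{ff}\) and \(S_{fw}\) are given directly; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics quoted in the question (n = 8 plots).
n = 8
sum_f, sum_w, sum_w2 = 28, 303, 13447
s_ff, s_fw = 42, 269.5

# S_ww is not given directly, so compute it from the raw sums.
s_ww = sum_w2 - sum_w**2 / n

# Product moment correlation coefficient.
r = s_fw / sqrt(s_ff * s_ww)

# Regression line of w on f: w = a + b f.
b = s_fw / s_ff
a = sum_w / n - b * sum_f / n

# Change in predicted yield for a 0.5 grams/m² decrease in fertiliser.
decrease = b * 0.5

print(f"r = {r:.3f}")
print(f"w = {a:.2f} + {b:.2f} f")
print(f"predicted decrease in yield: {decrease:.2f} tonnes")
```

The last part needs only the gradient, because the intercept cancels when two predictions are subtracted.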
OCR Further Statistics 2017 Specimen Q1
6 marks Moderate -0.8
The table below shows the typical stopping distances \(d\) metres for a particular car travelling at \(v\) miles per hour.
| \(v\) | 20 | 30 | 40 | 50 | 60 | 70 |
|---|---|---|---|---|---|---|
| \(d\) | 13 | 24 | 36 | 52 | 72 | 94 |
  1. State each of the following words that describe the variable \(v\). Independent \quad Dependent \quad Controlled \quad Response [1]
  2. Calculate the equation of the regression line of \(d\) on \(v\). [2]
  3. Use the equation found in part (ii) to estimate the typical stopping distance when this car is travelling at 45 miles per hour. [1]
It is given that the product moment correlation coefficient for the data is 0.990 correct to three significant figures.
  4. Explain whether your estimate found in part (iii) is reliable. [2]
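For a question that gives raw data rather than summary sums, the line can be fitted directly from the table. The sketch below is one way to do this and to confirm the quoted correlation coefficient; the function name `fit_line` is hypothetical.

```python
from math import sqrt

# Data from the table: speed (mph) and stopping distance (metres).
v = [20, 30, 40, 50, 60, 70]
d = [13, 24, 36, 52, 72, 94]


def fit_line(xs, ys):
    """Return (a, b, r) for the least squares line y = a + b x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    r = s_xy / sqrt(s_xx * s_yy)
    return a, b, r


a, b, r = fit_line(v, d)
estimate_45 = a + b * 45

print(f"d = {a:.2f} + {b:.3f} v")
print(f"estimate at 45 mph: {estimate_45:.1f} m, r = {r:.3f}")
```

Here 45 mph lies inside the range of speeds in the table, so the estimate is an interpolation rather than an extrapolation.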
Pre-U Pre-U 9794/3 2013 November Q4
6 marks Moderate -0.8
As part of a study into the effects of alcohol, volunteers have their reaction times measured after they have consumed various fixed amounts of alcohol. For a random sample of 12 volunteers the following information was collected.
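| Units of alcohol consumed | 2 | 3 | 3 | 4 | 4.5 | 5.5 | 6 | 6 | 7 | 8 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Reaction time (seconds) | 1 | 2 | 5 | 5 | 3.8 | 5.5 | 4.8 | 8.5 | 7.2 | 6.8 | 9 | 8 |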
  1. Which is the independent variable in this experiment? [1]
  2. Find the least squares regression line of \(y\) (Reaction time) on \(x\) (Units of alcohol), and use it to estimate the reaction time of someone who has consumed 5 units of alcohol. [5]