Calculate r from raw bivariate data

Questions that provide raw paired data values (x, y) in a table and ask to calculate the product moment correlation coefficient r, requiring computation of all summary statistics from scratch.

21 questions · Moderate -0.5

5.08a Pearson correlation: calculate pmcc
OCR S1 2007 January Q2
6 marks Moderate -0.8
2 The table contains data concerning five households selected at random from a certain town.
Number of people in the household | 2 | 3 | 3 | 5 | 7
Number of cars belonging to people in the household | 1 | 1 | 3 | 2 | 4
  1. Calculate the product moment correlation coefficient, \(r\), for the data in the table.
  2. Give a reason why it would not be sensible to use your answer to draw a conclusion about all the households in the town.
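As a rough illustration of what "from scratch" means for questions of this type, the sketch below computes \(r\) directly from paired raw values using deviations from the means. The function name `pmcc` and the variable names are illustrative assumptions, and the data are the household values above, read as five pairs.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient from paired raw data.

    Hypothetical helper: uses S_xy / sqrt(S_xx * S_yy) with each S term
    computed as a sum of deviations from the mean.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy / sqrt(s_xx * s_yy)


# Household data from the table above (assumed reading of the five pairs).
people = [2, 3, 3, 5, 7]
cars = [1, 1, 3, 2, 4]
r_households = pmcc(people, cars)
print(round(r_households, 3))
```

With only five pairs the value is very sensitive to each point, which is relevant to the second part of the question.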
OCR S1 2016 June Q2
10 marks Moderate -0.3
2
  1. The table shows the amount, \(x\), in hundreds of pounds, spent on heating and the number of absences, \(y\), at a factory during each month in 2014.
    Amount, \(x\), spent on heating (£ hundreds) | 21 | 23 | 19 | 15 | 14 | 5 | 2 | 10 | 9 | 20 | 18 | 23
    Number of absences, \(y\) | 23 | 25 | 18 | 18 | 12 | 10 | 4 | 9 | 11 | 15 | 20 | 26
    \(n = 12 \quad \Sigma x = 179 \quad \Sigma x ^ { 2 } = 3215 \quad \Sigma y = 191 \quad \Sigma y ^ { 2 } = 3565 \quad \Sigma x y = 3343\)
    1. Calculate \(r\), the product moment correlation coefficient, showing that \(r > 0.92\).
    2. A manager says, 'The value of \(r\) shows that spending more money on heating causes more absences, so we should spend less on heating.' Comment on this claim.
    3. The months in 2014 were numbered \(1,2,3 , \ldots , 12\). The output, \(z\), in suitable units was recorded along with the month number, \(n\), for each month in 2014. The equation of the regression line of \(z\) on \(n\) was found to be \(z = 0.6 n + 17\).
      (a) Use this equation to explain whether output generally increased or decreased over these months.
      (b) Find the mean of \(n\) and use the equation of the regression line to calculate the mean of \(z\).
    4. Hence calculate the total output in 2014.
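When the summations are supplied, as in the question above, \(r\) can be obtained without touching the raw table. The following is a minimal sketch under that assumption; `pmcc_from_sums` is a hypothetical helper name, and the arguments are the six summary values quoted in the question.

```python
from math import sqrt


def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """r from summary statistics (hypothetical helper).

    S_xx = sum(x^2) - (sum x)^2 / n, and similarly for S_yy and S_xy.
    """
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xy / sqrt(s_xx * s_yy)


# Summary values quoted in the heating/absences question.
r_heating = pmcc_from_sums(
    n=12, sum_x=179, sum_y=191, sum_x2=3215, sum_y2=3565, sum_xy=3343
)
print(r_heating > 0.92)
```

The same function applies to any question in this list that quotes \(n\), \(\Sigma x\), \(\Sigma y\), \(\Sigma x^2\), \(\Sigma y^2\) and \(\Sigma xy\).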
OCR Further Statistics AS 2020 November Q1
5 marks Moderate -0.3
1 Five observations of bivariate data \(( x , y )\) are given in the table.
\(x\) | 7 | 8 | 12 | 6 | 4
\(y\) | 20 | 16 | 7 | 17 | 23
  1. Find the value of Pearson's product-moment correlation coefficient.
  2. State what your answer to part (a) tells you about a scatter diagram representing the data.
  3. A new variable \(a\) is defined by \(a = 3x + 4\). Dee says "The value of Pearson's product-moment correlation coefficient between \(a\) and \(y\) will not be the same as the answer to part (a)." State with a reason whether you agree with Dee.
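The claim about the coded variable can be checked numerically. The sketch below, which assumes the five pairs are \(x = 7, 8, 12, 6, 4\) and \(y = 20, 16, 7, 17, 23\), compares \(r\) for \((x, y)\) with \(r\) for \((a, y)\) where \(a = 3x + 4\). The helper `pmcc` is illustrative and is not taken from any mark scheme.

```python
from math import sqrt


def pmcc(xs, ys):
    """Illustrative helper: r = S_xy / sqrt(S_xx * S_yy) from raw pairs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy / sqrt(s_xx * s_yy)


x = [7, 8, 12, 6, 4]
y = [20, 16, 7, 17, 23]
a = [3 * xi + 4 for xi in x]  # linear coding with a positive multiplier

r_xy = pmcc(x, y)
r_ay = pmcc(a, y)
print(round(r_xy, 3), round(r_ay, 3))
```

Adding a constant leaves every deviation from the mean unchanged, and a positive multiplier scales \(S_{xy}\) and \(\sqrt{S_{xx}}\) by the same factor, so the two values agree.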
AQA S1 2009 January Q2
7 marks Moderate -0.3
2 A greengrocer sells bunches of 9 carrots at his Saturday market stall. Tom and Geri are two Statistics students who work on the stall. Each selects a bunch of carrots at random.
  1. At home, Tom measures the length, \(x\) centimetres, and the maximum diameter, \(y\) centimetres, of each carrot in his selected bunch with the following results.
    \(\boldsymbol { x }\) | 16.2 | 13.1 | 10.4 | 12.1 | 14.6 | 9.7 | 11.8 | 13.6 | 17.3
    \(\boldsymbol { y }\) | 4.2 | 3.9 | 4.7 | 3.3 | 3.7 | 2.4 | 3.1 | 3.5 | 2.7
    1. Calculate the value of the product moment correlation coefficient.
    2. Interpret your value in context.
  2. At her home, Geri measures the length, in centimetres, and the weight, in grams, of each carrot in her selected bunch and then obtains a value of \(-0.986\) for the product moment correlation coefficient. Comment, with a reason, on the likely validity of Geri's value.
AQA S1 2007 June Q1
5 marks Moderate -0.8
1 The table shows the length, in centimetres, and maximum diameter, in centimetres, of each of 10 honeydew melons selected at random from those on display at a market stall.
Length | 24 | 25 | 19 | 28 | 27 | 21 | 35 | 23 | 32 | 26
Maximum diameter | 18 | 14 | 16 | 11 | 13 | 14 | 12 | 16 | 15 | 14
  1. Calculate the value of the product moment correlation coefficient.
  2. Interpret your value in the context of this question.
AQA S1 2009 June Q2
10 marks Moderate -0.8
2 Hermione, who is studying reptiles, measures the length, \(x \mathrm {~cm}\), and the weight, \(y\) grams, of a sample of 11 adult snakes of the same type. Her results are shown in the table.
AQA S1 2010 June Q1
5 marks Moderate -0.5
1 The weight, \(x \mathrm {~kg}\), and the engine power, \(y \mathrm { bhp }\), of each car in a random sample of 10 hatchback cars are shown in the table.
\(\boldsymbol { x }\) | 1196 | 1062 | 1335 | 1429 | 1012 | 1355 | 1145 | 1417 | 1275 | 1284
\(\boldsymbol { y }\) | 123 | 88 | 150 | 158 | 69 | 120 | 94 | 143 | 107 | 128
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value in the context of the question.
AQA S1 2014 June Q5
13 marks Moderate -0.5
5 As part of a study of charity shops in a small market town, two such shops, \(X\) and \(Y\), were each asked to provide details of its takings on 12 randomly selected days. The table shows, for each of the 12 days, the day's takings, \(\pounds x\), of charity shop \(X\) and the day's takings, \(\pounds y\), of charity shop \(Y\).
Day | A | B | C | D | E | F | G | H | I | J | K | L
\(\boldsymbol { x }\) | 46 | 57 | 39 | 116 | 62 | 77 | 41 | 61 | 15 | 53 | 68 | 61
\(\boldsymbol { y }\) | 78 | 102 | 66 | 214 | 98 | 72 | 98 | 134 | 21 | 67 | 95 | 83
  1.
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Interpret your value in the context of this question.
  2. Complete the scatter diagram shown on the opposite page.
  3. The investigator realised subsequently that one of the 12 selected days was a particularly popular town market day and another was a day on which the weather was extremely severe. Identify each of these days giving a reason for each choice.
  4. Removing the two days described in part (c) from the data gives the following information. $$S _ { x x } = 1292.5 \quad S _ { y y } = 3850.1 \quad S _ { x y } = 407.5$$
    1. Use this information to recalculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Hence revise, as necessary, your interpretation in part (a)(ii).
      [3 marks]
      [Scatter diagram: "Charity Shops"; axis label: Shop \(X\) takings (£)]
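Where \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\) are quoted directly, as in the recalculation part of the question above, only one step remains. The sketch below shows that step; `pmcc_from_s` is a hypothetical helper name, and the inputs are the three values given after the two unusual days are removed.

```python
from math import sqrt


def pmcc_from_s(s_xx, s_yy, s_xy):
    """r = S_xy / sqrt(S_xx * S_yy); hypothetical helper for quoted S values."""
    if s_xx <= 0 or s_yy <= 0:
        raise ValueError("S_xx and S_yy must be positive")
    return s_xy / sqrt(s_xx * s_yy)


# Values quoted in the question for the remaining 10 days.
r_trimmed = pmcc_from_s(s_xx=1292.5, s_yy=3850.1, s_xy=407.5)
print(round(r_trimmed, 3))
```

The same one-line calculation applies to any other question in this list that supplies the three \(S\) values.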
AQA S1 2016 June Q1
5 marks Moderate -0.8
1 The table shows the heights, \(x \mathrm {~cm}\), and the arm spans, \(y \mathrm {~cm}\), of a random sample of 12 men aged between 21 years and 40 years.
\(\boldsymbol { x }\) | 152 | 166 | 154 | 159 | 179 | 167 | 155 | 168 | 174 | 182 | 161 | 163
\(\boldsymbol { y }\) | 143 | 154 | 151 | 153 | 168 | 160 | 146 | 163 | 170 | 175 | 155 | 158
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret, in context, your value calculated in part (a).
OCR MEI Further Statistics A AS 2018 June Q3
12 marks Standard +0.3
3 Samples of water are taken from 10 randomly chosen wells in an area of a country. A researcher is investigating whether there is any relationship between the levels of dissolved oxygen, \(x\), and the amounts of radium, \(y\), in the water from the wells. Both quantities are measured in suitable units. The table and the scatter diagram in Fig. 3 show the values of \(x\) and \(y\) for the ten wells.
\(x\) | 45.9 | 48.3 | 52.2 | 64.6 | 66.6 | 67.6 | 69.3 | 75.0 | 77.4 | 82.8
\(y\) | 25.4 | 23.9 | 26.6 | 18.8 | 18.9 | 19.0 | 16.8 | 16.3 | 17.8 | 17.2
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e3ac0ba0-9692-4018-894e-2b04b07eaf32-3_865_786_657_635} \captionsetup{labelformat=empty} \caption{Fig. 3}
\end{figure}
  1. Explain why it may not be appropriate to carry out a hypothesis test based on the product moment correlation coefficient.
  2. Calculate Spearman's rank correlation coefficient for these data.
  3. Using this value of Spearman's rank correlation coefficient, carry out a hypothesis test at the 1\% significance level to investigate whether there is any association between \(x\) and \(y\).
  4. Explain the meaning of the term 'significance level' in the context of the test carried out in part (iii).
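For the Spearman part of the question above, a minimal sketch of the rank-difference formula \(r_s = 1 - \frac{6 \Sigma d^2}{n(n^2 - 1)}\) is given below. It assumes there are no tied values (true for the well data as read here) and would need adjusting if ties were present. The helper names `ranks` and `spearman` are illustrative.

```python
def ranks(values):
    """Rank each value from 1 (smallest) upward; assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation via the sum of squared rank differences."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Dissolved oxygen (x) and radium (y) for the ten wells, as read from the table.
oxygen = [45.9, 48.3, 52.2, 64.6, 66.6, 67.6, 69.3, 75.0, 77.4, 82.8]
radium = [25.4, 23.9, 26.6, 18.8, 18.9, 19.0, 16.8, 16.3, 17.8, 17.2]
r_s = spearman(oxygen, radium)
print(round(r_s, 4))
```

The hypothesis test itself still requires the critical value from the appropriate statistical tables, which this sketch does not supply.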
OCR Further Statistics 2018 December Q5
10 marks Moderate -0.3
5 The birth rate, \(x\) per thousand members of the population, and the life expectancy at birth, \(y\) years, in 14 randomly selected African countries are given in the table.
Country | \(x\) | \(y\) | Country | \(x\) | \(y\)
Benin | 4.8 | 59.2 | Mozambique | 5.4 | 54.63
Cameroon | 4.7 | 54.87 | Nigeria | 5.7 | 52.29
Congo | 4.9 | 61.42 | Senegal | 5.1 | 65.81
Gambia | 5.7 | 59.83 | Somalia | 6.5 | 54.88
Liberia | 4.7 | 60.25 | Sudan | 4.4 | 63.08
Malawi | 5.1 | 60.97 | Uganda | 5.8 | 57.25
Mauretania | 4.6 | 62.77 | Zambia | 5.4 | 58.75
\(n = 14 , \sum x = 72.8 , \sum y = 826 , \sum x ^ { 2 } = 392.96 , \sum y ^ { 2 } = 48924.54 , \sum x y = 4279.16\)
  1. Calculate Pearson's product-moment correlation coefficient \(r\) for the data.
  2. State what would be the effect on the value of \(r\) if the birth rate were given per hundred and not per thousand.
  3. Explain what the sign of \(r\) tells you about the relationship between life expectancy and birth rate for these countries.
  4. Test at the \(5 \%\) significance level whether there is correlation between birth rate and life expectancy at birth in African countries.
  5. A researcher wants to estimate the life expectancy at birth in Zimbabwe, where the birth rate is 3.9 per thousand. Explain whether a reliable estimate could be obtained using the regression line of \(y\) on \(x\) for the given data.
Edexcel S1 2003 June Q3
10 marks Moderate -0.8
3. A company owns two petrol stations \(P\) and \(Q\) along a main road. Total daily sales in the same week for \(P ( \pounds p )\) and for \(Q ( \pounds q )\) are summarised in the table below.
Day | \(p\) | \(q\)
Monday | 4760 | 5380
Tuesday | 5395 | 4460
Wednesday | 5840 | 4640
Thursday | 4650 | 5450
Friday | 5365 | 4340
Saturday | 4990 | 5550
Sunday | 4365 | 5840
When these data are coded using \(x = \frac { p - 4365 } { 100 }\) and \(y = \frac { q - 4340 } { 100 }\), $$\Sigma x = 48.1 , \Sigma y = 52.8 , \Sigma x ^ { 2 } = 486.44 , \Sigma y ^ { 2 } = 613.22 \text { and } \Sigma x y = 204.95 .$$
  1. Calculate \(S _ { x y } , S _ { x x }\) and \(S _ { y y }\).
  2. Calculate, to 3 significant figures, the value of the product moment correlation coefficient between \(x\) and \(y\).
  3.
    1. Write down the value of the product moment correlation coefficient between \(p\) and \(q\).
    2. Give an interpretation of this value.
AQA S1 2005 January Q1
7 marks Moderate -0.3
1 Each Monday, Azher has a stall at a town's outdoor market. The table below shows, for each of a random sample of 10 Mondays during 2003, the air temperature, \(x ^ { \circ } \mathrm { C }\), at 9 am and Azher's takings, \(\pounds y\).
Monday\(\mathbf { 1 }\)\(\mathbf { 2 }\)\(\mathbf { 3 }\)\(\mathbf { 4 }\)\(\mathbf { 5 }\)\(\mathbf { 6 }\)\(\mathbf { 7 }\)\(\mathbf { 8 }\)\(\mathbf { 9 }\)\(\mathbf { 1 0 }\)
\(\boldsymbol { x }\)2691813712134
\(\boldsymbol { y }\)9710313624512178145128141312
  1. A scatter diagram of these data is shown below. \includegraphics[max width=\textwidth, alt={}, center]{7faa4a2d-f5cc-4cc3-a3a9-5d8290ceabdc-2_901_1068_1078_447} Give two distinct comments, in context, on what this diagram reveals.
  2. One of the Mondays is found to be Easter Monday, the busiest Monday market of the year. Identify which Monday this is most likely to be.
  3. Removing the data for the Monday you identified in part (b), calculate the value of the product moment correlation coefficient for the remaining 9 pairs of values of \(x\) and \(y\).
  4. Name one other variable that would have been likely to affect Azher's takings at this town's outdoor market.
    (1 mark)
AQA S1 2010 January Q7
13 marks Standard +0.3
7 [Figure 1, printed on the insert, is provided for use in this question.]
Harold considers himself to be an expert in assessing the auction value of antiques. He regularly visits car boot sales to buy items that he then sells at his local auction rooms. Harold's father, Albert, who is not convinced of his son's expertise, collects the following data from a random sample of 12 items bought by Harold.
Item | Purchase price (£ \(\boldsymbol { x }\)) | Auction price (£ \(\boldsymbol { y }\))
A | 20 | 30
B | 35 | 45
C | 18 | 25
D | 50 | 50
E | 45 | 38
F | 55 | 45
G | 43 | 50
H | 81 | 90
I | 90 | 85
J | 30 | 190
K | 57 | 65
L | 112 | 25
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value in the context of this question.
  3.
    1. On Figure 1, complete the scatter diagram for these data.
    2. Comment on what this reveals.
  4. When items J and L are omitted from the data, it is found that $$S _ { x x } = 4854.4 \quad S _ { y y } = 4216.1 \quad S _ { x y } = 4268.8$$
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\) for the remaining 10 items.
    2. Hence revise as necessary your interpretation in part (b).
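The effect of the two unusual items on \(r\) can be seen by computing it with and without them. The sketch below assumes item J is read as (30, 190) and item L as (112, 25), with the other ten pairs as listed. The names `pmcc` and `items` are illustrative.

```python
from math import sqrt


def pmcc(pairs):
    """Illustrative helper: r from a list of (x, y) pairs using raw sums."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    s_xx = sum(x * x for x, _ in pairs) - sum_x ** 2 / n
    s_yy = sum(y * y for _, y in pairs) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in pairs) - sum_x * sum_y / n
    return s_xy / sqrt(s_xx * s_yy)


# (purchase price, auction price) per item; J and L are the assumed readings.
items = {
    "A": (20, 30), "B": (35, 45), "C": (18, 25), "D": (50, 50),
    "E": (45, 38), "F": (55, 45), "G": (43, 50), "H": (81, 90),
    "I": (90, 85), "J": (30, 190), "K": (57, 65), "L": (112, 25),
}

r_all = pmcc(list(items.values()))
r_without_jl = pmcc([v for k, v in items.items() if k not in ("J", "L")])
print(round(r_all, 3), round(r_without_jl, 3))
```

Two points out of twelve are enough to move \(r\) from close to zero to strongly positive, which is why the question asks for the interpretation to be revised.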
AQA S1 2005 June Q1
6 marks Easy -1.2
1 For each of a random sample of 10 customers, a store records the time, \(x\) minutes, spent shopping and the value, \(\pounds y\), to the nearest 10p, of items purchased. The results are tabulated below.
Time (x)1345109172316216
Value (y)12.55.72.318.47.917.117.918.68.321.3
  1.
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Interpret your value in context.
  2. Write down the value of the product moment correlation coefficient if the time had been recorded in seconds and the value in pence to the nearest 10p.
AQA S1 2006 June Q1
8 marks Moderate -0.3
1 The table shows, for each of a random sample of 8 paperback fiction books, the number of pages, \(x\), and the recommended retail price, \(\pounds y\), to the nearest 10p.
\(\boldsymbol { x }\) | 223 | 276 | 374 | 433 | 564 | 612 | 704 | 766
\(\boldsymbol { y }\) | 6.50 | 4.00 | 5.50 | 8.00 | 4.50 | 5.00 | 8.00 | 5.50
  1.
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Interpret your value in the context of this question.
    3. Suggest one other variable, in addition to the number of pages, which may affect the recommended retail price of a paperback fiction book.
  2. The same 8 books were later included in a book sale. The value of the product moment correlation coefficient between the number of pages and the sale price was 0.959, correct to three decimal places. What can be concluded from this value?
AQA S1 2015 June Q3
11 marks Moderate -0.5
3 Fourteen candidates each sat two test papers, Paper 1 and Paper 2, on the same day. The marks, out of a total of 50, achieved by the students on each paper are shown in the table.
AQA S1 2015 June Q1
4 marks Moderate -0.8
1
The table shows the annual gas consumption, \(x \mathrm { kWh }\), and the annual electricity consumption, \(y \mathrm { kWh }\), for a sample of 10 bungalows of similar size and occupancy.
\(\boldsymbol { x }\) | 21371 | 18521 | 15222 | 17312 | 19854 | 23561 | 20738 | 22111 | 17897 | 24523
\(\boldsymbol { y }\) | 2281 | 2327 | 2221 | 2378 | 2787 | 2856 | 3078 | 2647 | 2566 | 2559
$$S _ { x x } = 76581640 \quad S _ { y y } = 694250 \quad S _ { x y } = 3629670$$
  1. Calculate the value of \(r _ { x y }\), the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value of \(r _ { x y }\) in the context of this question.
OCR S1 Q8
13 marks Moderate -0.3
8 The table shows the population, \(x\) million, of each of nine countries in Western Europe together with the population, \(y\) million, of its capital city.
Country | Germany | United Kingdom | France | Italy | Spain | The Netherlands | Portugal | Austria | Switzerland
\(x\) | 82.1 | 59.2 | 59.1 | 56.7 | 39.2 | 15.9 | 9.9 | 8.1 | 7.3
\(y\) | 3.5 | 7.0 | 9.0 | 2.7 | 2.9 | 0.8 | 0.7 | 1.6 | 0.1
$$\left[ n = 9 , \Sigma x = 337.5 , \Sigma x ^ { 2 } = 18959.11 , \Sigma y = 28.3 , \Sigma y ^ { 2 } = 161.65 , \Sigma x y = 1533.76 . \right]$$
  1. (a) Calculate Spearman's rank correlation coefficient, \(r _ { s }\).
    (b) Explain what your answer indicates about the populations of these countries and their capital cities.
  2. Calculate the product moment correlation coefficient, \(r\). The data are illustrated in the scatter diagram. \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-09_936_881_1162_632}
  3. By considering the diagram, state the effect on the value of the product moment correlation coefficient, \(r\), if the data for France and the United Kingdom were removed from the calculation.
  4. In a certain country in Africa, most people live in remote areas and hence the population of the country is unknown. However, the population of the capital city is known to be approximately 1 million. An official suggests that the population of this country could be estimated by using a regression line drawn on the above scatter diagram.
    (a) State, with a reason, whether the regression line of \(y\) on \(x\) or the regression line of \(x\) on \(y\) would need to be used.
    (b) Comment on the reliability of such an estimate in this situation.
1 Some observations of bivariate data were made and the equations of the two regression lines were found to be as follows. $$\begin{array} { c c } y \text { on } x : & y = - 0.6 x + 13.0 \\ x \text { on } y : & x = - 1.6 y + 21.0 \end{array}$$
  1. State, with a reason, whether the correlation between \(x\) and \(y\) is negative or positive.
  2. Neither variable is controlled. Calculate an estimate of the value of \(x\) when \(y = 7.0\).
  3. Find the values of \(\bar { x }\) and \(\bar { y }\).
2 A bag contains 5 black discs and 3 red discs. A disc is selected at random from the bag. If it is red it is replaced in the bag. If it is black, it is not replaced. A second disc is now selected at random from the bag. Find the probability that
  1. the second disc is black, given that the first disc was black,
  2. the second disc is black,
  3. the two discs are of different colours.
3 Each of the 7 letters in the word DIVIDED is printed on a separate card. The cards are arranged in a row.
  1. How many different arrangements of the letters are possible?
  2. In how many of these arrangements are all three Ds together?
  The 7 cards are now shuffled and 2 cards are selected at random, without replacement.
  3. Find the probability that at least one of these 2 cards has D printed on it.
4
  1. The random variable \(X\) has the distribution \(\mathrm { B } ( 25,0.2 )\). Using the tables of cumulative binomial probabilities, or otherwise, find \(\mathrm { P } ( X \geqslant 5 )\).
  2. The random variable \(Y\) has the distribution \(\mathrm { B } ( 10,0.27 )\). Find \(\mathrm { P } ( Y = 3 )\).
  3. The random variable \(Z\) has the distribution \(\mathrm { B } ( n , 0.27 )\). Find the smallest value of \(n\) such that \(\mathrm { P } ( Z \geqslant 1 ) > 0.95\).
5 The probability distribution of a discrete random variable, \(X\), is given in the table.
    \(x\) | 0 | 1 | 2 | 3
    \(\mathrm { P } ( X = x )\) | \(\frac { 1 } { 3 }\) | \(\frac { 1 } { 4 }\) | \(p\) | \(q\)
    It is given that the expectation, \(\mathrm { E } ( X )\), is \(1 \frac { 1 } { 4 }\).
  1. Calculate the values of \(p\) and \(q\).
  2. Calculate the standard deviation of \(X\).
OCR Further Statistics 2021 June Q2
12 marks Standard +0.3
2 A book collector compared the prices of some books, \(\pounds x\), when new in 1972 and the prices of copies of the same books, \(\pounds y\), on a second-hand website in 2018.
The results are shown in Table 1 and are summarised below the table. \begin{table}[h]
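Book | A | B | C | D | E | F | G | H | I | J | K | L
\(x\) | 0.95 | 0.65 | 0.70 | 0.90 | 0.55 | 1.40 | 1.50 | 0.50 | 1.15 | 0.35 | 0.20 | 0.35
\(y\) | 6.06 | 7.00 | 2.00 | 5.87 | 4.00 | 5.36 | 7.19 | 2.50 | 3.00 | 8.29 | 1.37 | 2.00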
\captionsetup{labelformat=empty} \caption{Table 1}
\end{table} $$n = 12 , \Sigma x = 9.20 , \Sigma y = 54.64 , \Sigma x ^ { 2 } = 8.9950 , \Sigma y ^ { 2 } = 310.4572 , \Sigma x y = 46.0545$$
  1. It is given that the value of Pearson's product-moment correlation coefficient for the data is 0.381, correct to 3 significant figures.
    1. State what this information tells you about a scatter diagram illustrating the data.
    2. Test at the \(5 \%\) significance level whether there is evidence of positive correlation between prices in 1972 and prices in 2018.
  2. The collector noticed that the second-hand copy of book J was unusually expensive and he decided to ignore the data for book J. Calculate the value of Pearson's product-moment correlation coefficient for the other 11 books.
Pre-U Pre-U 9794/3 2016 June Q1
4 marks Moderate -0.8
The following data refer to the annual rate of inflation and the annual percentage pay increase measured on 10 randomly chosen occasions.
Inflation rate (\%) | 0.9 | 1.2 | 1.6 | 1.5 | 1.7 | 3.0 | 4.1 | 3.7 | 2.8 | 4.2
Pay increase (\%) | 4.8 | 4.7 | 3.8 | 4.4 | 5.6 | 5.5 | 2.4 | 0.4 | 0.6 | 1.7
Show that, for these data, the product moment correlation coefficient between the rate of inflation and the annual pay increase is \(-0.679\), correct to 3 significant figures. [4]