Calculate r from raw bivariate data

Questions that provide raw paired data values (x, y) in a table and ask to calculate the product moment correlation coefficient r, requiring computation of all summary statistics from scratch.

22 questions

OCR S1 2007 January Q2
2 The table contains data concerning five households selected at random from a certain town.
Number of people in the household | 2 | 3 | 3 | 5 | 7
Number of cars belonging to people in the household | 1 | 1 | 3 | 2 | 4
  1. Calculate the product moment correlation coefficient, \(r\), for the data in the table.
  2. Give a reason why it would not be sensible to use your answer to draw a conclusion about all the households in the town.
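A minimal sketch of the calculation in part 1, assuming the table reads as the pairs (2, 1), (3, 1), (3, 3), (5, 2), (7, 4). The helper name `pmcc` is illustrative and not part of the question; it builds \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\) from the raw sums and returns \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\).

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient r for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # S_xx, S_yy and S_xy built from the raw sums
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xy / sqrt(s_xx * s_yy)


# Household data as read from the table (assumed parsing)
people = [2, 3, 3, 5, 7]
cars = [1, 1, 3, 2, 4]

r = pmcc(people, cars)
print(f"r = {r:.3f}")
```

With these values \(S_{xx} = 16\), \(S_{yy} = 6.8\) and \(S_{xy} = 8\), so \(r\) comes out at about 0.767.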
OCR S1 2016 June Q2
2
  1. The table shows the amount, \(x\), in hundreds of pounds, spent on heating and the number of absences, \(y\), at a factory during each month in 2014.
    Amount, \(x\), spent on heating (£ hundreds) | 21 | 23 | 19 | 15 | 14 | 5 | 2 | 10 | 9 | 20 | 18 | 23
    Number of absences, \(y\) | 23 | 25 | 18 | 18 | 12 | 10 | 4 | 9 | 11 | 15 | 20 | 26
    \(n = 12 \quad \Sigma x = 179 \quad \Sigma x ^ { 2 } = 3215 \quad \Sigma y = 191 \quad \Sigma y ^ { 2 } = 3565 \quad \Sigma x y = 3343\)
    (a) Calculate \(r\), the product moment correlation coefficient, showing that \(r > 0.92\).
    (b) A manager says, 'The value of \(r\) shows that spending more money on heating causes more absences, so we should spend less on heating.' Comment on this claim.
  2. The months in 2014 were numbered \(1,2,3 , \ldots , 12\). The output, \(z\), in suitable units was recorded along with the month number, \(n\), for each month in 2014. The equation of the regression line of \(z\) on \(n\) was found to be \(z = 0.6 n + 17\).
    (a) Use this equation to explain whether output generally increased or decreased over these months.
    (b) Find the mean of \(n\) and use the equation of the regression line to calculate the mean of \(z\).
    (c) Hence calculate the total output in 2014.
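When the summary sums are given, as in part 1(a) of this question, \(r\) can be found without touching the raw table. The following is a sketch under that assumption; the variable names are illustrative only.

```python
from math import sqrt

# Summary values quoted in the question
n = 12
sum_x, sum_x2 = 179, 3215
sum_y, sum_y2 = 191, 3565
sum_xy = 3343

# Corrected sums of squares and products
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

r = s_xy / sqrt(s_xx * s_yy)
print(f"r = {r:.4f}")  # the question asks to show that r > 0.92
```

The three intermediate values \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\) are the ones a written solution would normally show before quoting \(r\).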
OCR S1 2013 June Q5
5 The table shows some of the values of the seasonally adjusted Unemployment Rate (UR), \(x \%\), and the Consumer Price Index (CPI), \(y \%\), in the United Kingdom from April 2008 to July 2010.
Date | April 2008 | July 2008 | October 2008 | January 2009 | April 2009 | July 2009 | October 2009 | January 2010 | April 2010 | July 2010
UR, \(x \%\) | 5.2 | 5.7 | 6.1 | 6.8 | 7.5 | 7.8 | 7.8 | 7.9 | 7.8 | 7.7
CPI, \(y \%\) | 3.0 | 4.4 | 4.5 | 3.0 | 2.3 | 1.8 | 1.5 | 3.5 | 3.7 | 3.1
These data are summarised below. $$n = 10 \quad \sum x = 70.3 \quad \sum x ^ { 2 } = 503.45 \quad \sum y = 30.8 \quad \sum y ^ { 2 } = 103.94 \quad \sum x y = 211.9$$
  1. Calculate the product moment correlation coefficient, \(r\), for the data, showing that \(- 0.6 < r < - 0.5\).
  2. Karen says "The negative value of \(r\) shows that when the Unemployment Rate increases, it causes the Consumer Price Index to decrease." Give a criticism of this statement.
  3. (a) Calculate the equation of the regression line of \(x\) on \(y\).
    (b) Use your equation to estimate the value of the Unemployment Rate in a month when the Consumer Price Index is 4.0\%.
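Part 3 asks for the regression line of \(x\) on \(y\), so the gradient is \(S_{xy} / S_{yy}\) rather than \(S_{xy} / S_{xx}\). A short sketch using only the summary values quoted in the question; the function name `estimate_ur` is hypothetical.

```python
# Summary values quoted in the question
n = 10
sum_x, sum_y = 70.3, 30.8
sum_y2, sum_xy = 103.94, 211.9

s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Regression of x on y: x = a + b*y, with b = S_xy / S_yy
b = s_xy / s_yy
a = sum_x / n - b * sum_y / n


def estimate_ur(cpi):
    """Estimated Unemployment Rate (%) for a given CPI (%)."""
    return a + b * cpi


print(f"x = {a:.3f} + ({b:.3f}) y")
print(f"estimate at y = 4.0: {estimate_ur(4.0):.2f}")
```

The line passes through \((\bar{y}, \bar{x}) = (3.08, 7.03)\), which is a quick way to check the intercept.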
OCR Further Statistics AS 2020 November Q1
1 Five observations of bivariate data \(( x , y )\) are given in the table.
\(x\) | 7 | 8 | 12 | 6 | 4
\(y\) | 20 | 16 | 7 | 17 | 23
  1. Find the value of Pearson's product-moment correlation coefficient.
  2. State what your answer to part (a) tells you about a scatter diagram representing the data.
  3. A new variable \(a\) is defined by \(\mathrm { a } = 3 \mathrm { x } + 4\). Dee says "The value of Pearson's product-moment correlation coefficient between \(a\) and \(y\) will not be the same as the answer to part (a)." State with a reason whether you agree with Dee.
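The claim in part 3 can be checked numerically. The sketch below assumes the table reads as \(x\) = 7, 8, 12, 6, 4 and \(y\) = 20, 16, 7, 17, 23, and compares \(r\) for \((x, y)\) with \(r\) for \((a, y)\) where \(a = 3x + 4\). The helper `pmcc` is illustrative, not part of the question.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient r for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xy / sqrt(s_xx * s_yy)


# Data as read from the table (assumed parsing)
x = [7, 8, 12, 6, 4]
y = [20, 16, 7, 17, 23]

# Linear coding a = 3x + 4 with a positive multiplier
a = [3 * v + 4 for v in x]

r_xy = pmcc(x, y)
r_ay = pmcc(a, y)
print(f"r(x, y) = {r_xy:.4f}")
print(f"r(a, y) = {r_ay:.4f}")
```

The two values agree because a linear coding with a positive multiplier scales \(S_{xy}\) and \(\sqrt{S_{xx}}\) by the same factor, which cancels in \(r\).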
AQA S1 2009 January Q2
2 A greengrocer sells bunches of 9 carrots at his Saturday market stall. Tom and Geri are two Statistics students who work on the stall. Each selects a bunch of carrots at random.
  1. At home, Tom measures the length, \(x\) centimetres, and the maximum diameter, \(y\) centimetres, of each carrot in his selected bunch with the following results.
    \(\boldsymbol { x }\) | 16.2 | 13.1 | 10.4 | 12.1 | 14.6 | 9.7 | 11.8 | 13.6 | 17.3
    \(\boldsymbol { y }\) | 4.2 | 3.9 | 4.7 | 3.3 | 3.7 | 2.4 | 3.1 | 3.5 | 2.7
    1. Calculate the value of the product moment correlation coefficient.
    2. Interpret your value in context.
  2. At her home, Geri measures the length, in centimetres, and the weight, in grams, of each carrot in her selected bunch and then obtains a value of - 0.986 for the product moment correlation coefficient. Comment, with a reason, on the likely validity of Geri's value.
AQA S1 2007 June Q1
1 The table shows the length, in centimetres, and maximum diameter, in centimetres, of each of 10 honeydew melons selected at random from those on display at a market stall.
Length | 24 | 25 | 19 | 28 | 27 | 21 | 35 | 23 | 32 | 26
Maximum diameter | 18 | 14 | 16 | 11 | 13 | 14 | 12 | 16 | 15 | 14
  1. Calculate the value of the product moment correlation coefficient.
  2. Interpret your value in the context of this question.
AQA S1 2009 June Q2
2 Hermione, who is studying reptiles, measures the length, \(x \mathrm {~cm}\), and the weight, \(y\) grams, of a sample of 11 adult snakes of the same type. Her results are shown in the table.
AQA S1 2010 June Q1
1 The weight, \(x \mathrm {~kg}\), and the engine power, \(y \mathrm { bhp }\), of each car in a random sample of 10 hatchback cars are shown in the table.
\(\boldsymbol { x }\) | 1196 | 1062 | 1335 | 1429 | 1012 | 1355 | 1145 | 1417 | 1275 | 1284
\(\boldsymbol { y }\) | 123 | 88 | 150 | 158 | 69 | 120 | 94 | 143 | 107 | 128
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value in the context of the question.
AQA S1 2014 June Q5
3 marks
5 As part of a study of charity shops in a small market town, two such shops, \(X\) and \(Y\), were each asked to provide details of its takings on 12 randomly selected days. The table shows, for each of the 12 days, the day's takings, \(\pounds x\), of charity shop \(X\) and the day's takings, \(\pounds y\), of charity shop \(Y\).
Day | \(\mathbf { A }\) | \(\mathbf { B }\) | \(\mathbf { C }\) | \(\mathbf { D }\) | \(\mathbf { E }\) | \(\mathbf { F }\) | \(\mathbf { G }\) | \(\mathbf { H }\) | \(\mathbf { I }\) | \(\mathbf { J }\) | \(\mathbf { K }\) | \(\mathbf { L }\)
\(\boldsymbol { x }\) | 46 | 57 | 39 | 116 | 62 | 77 | 41 | 61 | 15 | 53 | 68 | 61
\(\boldsymbol { y }\) | 78 | 102 | 66 | 214 | 98 | 72 | 98 | 134 | 21 | 67 | 95 | 83
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Interpret your value in the context of this question.
  1. Complete the scatter diagram shown on the opposite page.
  2. The investigator realised subsequently that one of the 12 selected days was a particularly popular town market day and another was a day on which the weather was extremely severe. Identify each of these days giving a reason for each choice.
  3. Removing the two days described in part (c) from the data gives the following information. $$S _ { x x } = 1292.5 \quad S _ { y y } = 3850.1 \quad S _ { x y } = 407.5$$
    1. Use this information to recalculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Hence revise, as necessary, your interpretation in part (a)(ii).
      [3 marks]
      [Figure: scatter diagram for completion, titled "Charity Shops", with an axis labelled Shop \(X\) takings (£)]
AQA S1 2016 June Q1
1 The table shows the heights, \(x \mathrm {~cm}\), and the arm spans, \(y \mathrm {~cm}\), of a random sample of 12 men aged between 21 years and 40 years.
\(\boldsymbol { x }\) | 152 | 166 | 154 | 159 | 179 | 167 | 155 | 168 | 174 | 182 | 161 | 163
\(\boldsymbol { y }\) | 143 | 154 | 151 | 153 | 168 | 160 | 146 | 163 | 170 | 175 | 155 | 158
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret, in context, your value calculated in part (a).
OCR MEI Further Statistics A AS 2018 June Q3
3 Samples of water are taken from 10 randomly chosen wells in an area of a country. A researcher is investigating whether there is any relationship between the levels of dissolved oxygen, \(x\), and the amounts of radium, \(y\), in the water from the wells. Both quantities are measured in suitable units. The table and the scatter diagram in Fig. 3 show the values of \(x\) and \(y\) for the ten wells.
\(x\) | 45.9 | 48.3 | 52.2 | 64.6 | 66.6 | 67.6 | 69.3 | 75.0 | 77.4 | 82.8
\(y\) | 25.4 | 23.9 | 26.6 | 18.8 | 18.9 | 19.0 | 16.8 | 16.3 | 17.8 | 17.2
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e3ac0ba0-9692-4018-894e-2b04b07eaf32-3_865_786_657_635} \captionsetup{labelformat=empty} \caption{Fig. 3}
\end{figure}
  1. Explain why it may not be appropriate to carry out a hypothesis test based on the product moment correlation coefficient.
  2. Calculate Spearman's rank correlation coefficient for these data.
  3. Using this value of Spearman's rank correlation coefficient, carry out a hypothesis test at the 1\% significance level to investigate whether there is any association between \(x\) and \(y\).
  4. Explain the meaning of the term 'significance level' in the context of the test carried out in part (iii).
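For part 2, Spearman's coefficient can be sketched with the formula \(r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}\), which is valid here because neither variable has tied values. The helper names `ranks` and `spearman` are illustrative; the data are as read from the table.

```python
def ranks(values):
    """Rank each value from 1 (smallest); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation via the sum of squared rank differences."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Dissolved oxygen (x) and radium (y) for the ten wells
oxygen = [45.9, 48.3, 52.2, 64.6, 66.6, 67.6, 69.3, 75.0, 77.4, 82.8]
radium = [25.4, 23.9, 26.6, 18.8, 18.9, 19.0, 16.8, 16.3, 17.8, 17.2]

r_s = spearman(oxygen, radium)
print(f"r_s = {r_s:.4f}")
```

Here \(\sum d^2 = 300\), giving \(r_s = 1 - 1800/990 \approx -0.818\). The comparison with a critical value in part 3 still needs the tables supplied with the paper.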
SPS SPS FM Statistics 2022 January Q4
4. The strength of beams compared against the moisture content of the beam is indicated in the following table.
Strength | 21.1 | 22.7 | 23.1 | 21.5 | 22.4 | 22.6 | 21.1 | 21.7 | 21.0 | 21.4
Moisture content | 11.1 | 8.9 | 8.8 | 8.9 | 8.8 | 9.9 | 10.7 | 10.5 | 10.5 | 10.7
a. Use your calculator to write down the value of the product moment correlation coefficient for these data.
b. Perform a two-tailed test, at the \(5 \%\) level of significance, to investigate whether there is correlation between strength and moisture content.
c. Use your calculator to write down the equation of the regression line of strength on moisture content.
d. Use the regression line to estimate the strength of a beam that has a moisture content of 9.5.
OCR Further Statistics 2018 December Q5
5 The birth rate, \(x\) per thousand members of the population, and the life expectancy at birth, \(y\) years, in 14 randomly selected African countries are given in the table.
Country | \(x\) | \(y\) | Country | \(x\) | \(y\)
Benin | 4.8 | 59.2 | Mozambique | 5.4 | 54.63
Cameroon | 4.7 | 54.87 | Nigeria | 5.7 | 52.29
Congo | 4.9 | 61.42 | Senegal | 5.1 | 65.81
Gambia | 5.7 | 59.83 | Somalia | 6.5 | 54.88
Liberia | 4.7 | 60.25 | Sudan | 4.4 | 63.08
Malawi | 5.1 | 60.97 | Uganda | 5.8 | 57.25
Mauretania | 4.6 | 62.77 | Zambia | 5.4 | 58.75
\(n = 14 , \sum x = 72.8 , \sum y = 826 , \sum x ^ { 2 } = 392.96 , \sum y ^ { 2 } = 48924.54 , \sum x y = 4279.16\)
  1. Calculate Pearson's product-moment correlation coefficient \(r\) for the data.
  2. State what would be the effect on the value of \(r\) if the birth rate were given per hundred and not per thousand.
  3. Explain what the sign of \(r\) tells you about the relationship between life expectancy and birth rate for these countries.
  4. Test at the \(5 \%\) significance level whether there is correlation between birth rate and life expectancy at birth in African countries.
  5. A researcher wants to estimate the life expectancy at birth in Zimbabwe, where the birth rate is 3.9 per thousand. Explain whether a reliable estimate could be obtained using the regression line of \(y\) on \(x\) for the given data.
Edexcel S1 2003 June Q3
3. A company owns two petrol stations \(P\) and \(Q\) along a main road. Total daily sales in the same week for \(P ( \pounds p )\) and for \(Q ( \pounds q )\) are summarised in the table below.
Day | \(p\) | \(q\)
Monday | 4760 | 5380
Tuesday | 5395 | 4460
Wednesday | 5840 | 4640
Thursday | 4650 | 5450
Friday | 5365 | 4340
Saturday | 4990 | 5550
Sunday | 4365 | 5840
When these data are coded using \(x = \frac { p - 4365 } { 100 }\) and \(y = \frac { q - 4340 } { 100 }\), $$\Sigma x = 48.1 , \Sigma y = 52.8 , \Sigma x ^ { 2 } = 486.44 , \Sigma y ^ { 2 } = 613.22 \text { and } \Sigma x y = 204.95 .$$
  1. Calculate \(S _ { x y } , S _ { x x }\) and \(S _ { y y }\).
  2. Calculate, to 3 significant figures, the value of the product moment correlation coefficient between \(x\) and \(y\).
    1. Write down the value of the product moment correlation coefficient between \(p\) and \(q\).
    2. Give an interpretation of this value.
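The coding in this question can be checked directly. The sketch below applies \(x = (p - 4365)/100\) and \(y = (q - 4340)/100\) to the daily sales as read from the table, confirms the quoted sums, and compares \(r\) for the coded and uncoded data. The helper `pmcc` is illustrative, not part of the question.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient r for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xy / sqrt(s_xx * s_yy)


# Daily sales, Monday to Sunday
p = [4760, 5395, 5840, 4650, 5365, 4990, 4365]
q = [5380, 4460, 4640, 5450, 4340, 5550, 5840]

# Coding given in the question
x = [(v - 4365) / 100 for v in p]
y = [(v - 4340) / 100 for v in q]

print(f"sum x = {sum(x):.1f}, sum y = {sum(y):.1f}")
r_xy = pmcc(x, y)
r_pq = pmcc(p, q)
print(f"r(x, y) = {r_xy:.3f}, r(p, q) = {r_pq:.3f}")
```

Both codings subtract a constant and divide by a positive constant, so the coefficient between \(p\) and \(q\) is the same as that between \(x\) and \(y\).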
AQA S1 2005 January Q1
1 Each Monday, Azher has a stall at a town's outdoor market. The table below shows, for each of a random sample of 10 Mondays during 2003, the air temperature, \(x ^ { \circ } \mathrm { C }\), at 9 am and Azher's takings, £y.
Monday\(\mathbf { 1 }\)\(\mathbf { 2 }\)\(\mathbf { 3 }\)\(\mathbf { 4 }\)\(\mathbf { 5 }\)\(\mathbf { 6 }\)\(\mathbf { 7 }\)\(\mathbf { 8 }\)\(\mathbf { 9 }\)\(\mathbf { 1 0 }\)
\(\boldsymbol { x }\)2691813712134
\(\boldsymbol { y }\)9710313624512178145128141312
  1. A scatter diagram of these data is shown below.
    \includegraphics[max width=\textwidth, alt={}, center]{7faa4a2d-f5cc-4cc3-a3a9-5d8290ceabdc-2_901_1068_1078_447} Give two distinct comments, in context, on what this diagram reveals.
  2. One of the Mondays is found to be Easter Monday, the busiest Monday market of the year. Identify which Monday this is most likely to be.
  3. Removing the data for the Monday you identified in part (b), calculate the value of the product moment correlation coefficient for the remaining 9 pairs of values of \(x\) and \(y\).
  4. Name one other variable that would have been likely to affect Azher's takings at this town's outdoor market.
    (1 mark)
AQA S1 2010 January Q7
7 [Figure 1, printed on the insert, is provided for use in this question.]
Harold considers himself to be an expert in assessing the auction value of antiques. He regularly visits car boot sales to buy items that he then sells at his local auction rooms. Harold's father, Albert, who is not convinced of his son's expertise, collects the following data from a random sample of 12 items bought by Harold.
ItemPurchase price (£ \(\boldsymbol { x }\) )Auction price (£ y)
A2030
B3545
C1825
D5050
E4538
F5545
G4350
H8190
I9085
J30190
K5765
L11225
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value in the context of this question.
    1. On Figure 1, complete the scatter diagram for these data.
    2. Comment on what this reveals.
  3. When items J and L are omitted from the data, it is found that $$S _ { x x } = 4854.4 \quad S _ { y y } = 4216.1 \quad S _ { x y } = 4268.8$$
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\) for the remaining 10 items.
    2. Hence revise as necessary your interpretation in part (b).
AQA S1 2005 June Q1
1 For each of a random sample of 10 customers, a store records the time, \(x\) minutes, spent shopping and the value, \(\pounds y\), to the nearest 10 p, of items purchased. The results are tabulated below.
Time (x)1345109172316216
Value (y)12.55.72.318.47.917.117.918.68.321.3
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Interpret your value in context.
  1. Write down the value of the product moment correlation coefficient if the time had been recorded in seconds and the value in pence to the nearest 10p.
AQA S1 2006 June Q1
1 The table shows, for each of a random sample of 8 paperback fiction books, the number of pages, \(x\), and the recommended retail price, \(\pounds y\), to the nearest 10 p.
\(\boldsymbol { x }\) | 223 | 276 | 374 | 433 | 564 | 612 | 704 | 766
\(\boldsymbol { y }\) | 6.50 | 4.00 | 5.50 | 8.00 | 4.50 | 5.00 | 8.00 | 5.50
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Interpret your value in the context of this question.
    3. Suggest one other variable, in addition to the number of pages, which may affect the recommended retail price of a paperback fiction book.
  1. The same 8 books were later included in a book sale. The value of the product moment correlation coefficient between the number of pages and the sale price was 0.959 , correct to three decimal places. What can be concluded from this value?
AQA S1 2015 June Q3
2 marks
3 Fourteen candidates each sat two test papers, Paper 1 and Paper 2, on the same day. The marks, out of a total of 50, achieved by the students on each paper are shown in the table.
AQA S1 2015 June Q1
1 The table shows the annual gas consumption, \(x \mathrm { kWh }\), and the annual electricity consumption, \(y \mathrm { kWh }\), for a sample of 10 bungalows of similar size and occupancy.
\(\boldsymbol { x }\) | 21371 | 18521 | 15222 | 17312 | 19854 | 23561 | 20738 | 22111 | 17897 | 24523
\(\boldsymbol { y }\) | 2281 | 2327 | 2221 | 2378 | 2787 | 2856 | 3078 | 2647 | 2566 | 2559
$$S _ { x x } = 76581640 \quad S _ { y y } = 694250 \quad S _ { x y } = 3629670$$
  1. Calculate the value of \(r _ { x y }\), the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value of \(r _ { x y }\) in the context of this question.
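As a worked check of part 1, a sketch using only the three summary values quoted in the question and the standard definition of \(r\) in terms of \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\):

$$r _ { x y } = \frac { S _ { x y } } { \sqrt { S _ { x x } S _ { y y } } } = \frac { 3629670 } { \sqrt { 76581640 \times 694250 } } \approx 0.498$$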
OCR S1 Q8
8 The table shows the population, \(x\) million, of each of nine countries in Western Europe together with the population, \(y\) million, of its capital city.
Country | Germany | United Kingdom | France | Italy | Spain | The Netherlands | Portugal | Austria | Switzerland
\(x\) | 82.1 | 59.2 | 59.1 | 56.7 | 39.2 | 15.9 | 9.9 | 8.1 | 7.3
\(y\) | 3.5 | 7.0 | 9.0 | 2.7 | 2.9 | 0.8 | 0.7 | 1.6 | 0.1
$$\left[ n = 9 , \Sigma x = 337.5 , \Sigma x ^ { 2 } = 18959.11 , \Sigma y = 28.3 , \Sigma y ^ { 2 } = 161.65 , \Sigma x y = 1533.76 . \right]$$
  1. (a) Calculate Spearman's rank correlation coefficient, \(r _ { s }\).
    (b) Explain what your answer indicates about the populations of these countries and their capital cities.
  2. Calculate the product moment correlation coefficient, \(r\). The data are illustrated in the scatter diagram.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-09_936_881_1162_632}
  3. By considering the diagram, state the effect on the value of the product moment correlation coefficient, \(r\), if the data for France and the United Kingdom were removed from the calculation.
  4. In a certain country in Africa, most people live in remote areas and hence the population of the country is unknown. However, the population of the capital city is known to be approximately 1 million. An official suggests that the population of this country could be estimated by using a regression line drawn on the above scatter diagram.
    (a) State, with a reason, whether the regression line of \(y\) on \(x\) or the regression line of \(x\) on \(y\) would need to be used.
    (b) Comment on the reliability of such an estimate in this situation.
1 Some observations of bivariate data were made and the equations of the two regression lines were found to be as follows. $$\begin{array} { c c } y \text { on } x : & y = - 0.6 x + 13.0 \\ x \text { on } y : & x = - 1.6 y + 21.0 \end{array}$$
  5. State, with a reason, whether the correlation between \(x\) and \(y\) is negative or positive.
  6. Neither variable is controlled. Calculate an estimate of the value of \(x\) when \(y = 7.0\).
  7. Find the values of \(\bar { x }\) and \(\bar { y }\).
2 A bag contains 5 black discs and 3 red discs. A disc is selected at random from the bag. If it is red it is replaced in the bag. If it is black, it is not replaced. A second disc is now selected at random from the bag. Find the probability that
  8. the second disc is black, given that the first disc was black,
  9. the second disc is black,
  10. the two discs are of different colours.
3 Each of the 7 letters in the word DIVIDED is printed on a separate card. The cards are arranged in a row.
  11. How many different arrangements of the letters are possible?
  12. In how many of these arrangements are all three Ds together? The 7 cards are now shuffled and 2 cards are selected at random, without replacement.
  13. Find the probability that at least one of these 2 cards has D printed on it.
4
  14. The random variable \(X\) has the distribution \(\mathrm { B } ( 25,0.2 )\). Using the tables of cumulative binomial probabilities, or otherwise, find \(\mathrm { P } ( X \geqslant 5 )\).
  15. The random variable \(Y\) has the distribution \(\mathrm { B } ( 10,0.27 )\). Find \(\mathrm { P } ( Y = 3 )\).
  16. The random variable \(Z\) has the distribution \(B ( n , 0.27 )\). Find the smallest value of \(n\) such that \(\mathrm { P } ( Z \geqslant 1 ) > 0.95\).
5 The probability distribution of a discrete random variable, \(X\), is given in the table.
    \(x\) | 0 | 1 | 2 | 3
    \(\mathrm { P } ( X = x )\) | \(\frac { 1 } { 3 }\) | \(\frac { 1 } { 4 }\) | \(p\) | \(q\)
    It is given that the expectation, \(\mathrm { E } ( X )\), is \(1 \frac { 1 } { 4 }\).
  17. Calculate the values of \(p\) and \(q\).
  18. Calculate the standard deviation of \(X\).
June 2006
6 The table shows the total distance travelled, in thousands of miles, and the amount of commission earned, in thousands of pounds, by each of seven sales agents in 2005.
    Agent | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\)
    Distance travelled | 18 | 15 | 12 | 14 | 16 | 24 | 13
    Commission earned | 18 | 45 | 19 | 24 | 27 | 22 | 23
  19. (a) Calculate Spearman's rank correlation coefficient, \(r _ { s }\), for these data.
    (b) Comment briefly on your value of \(r _ { s }\) with reference to this context.
    (c) After these data were collected, agent \(A\) found that he had made a mistake. He had actually travelled 19000 miles in 2005. State, with a reason, but without further calculation, whether the value of Spearman's rank correlation coefficient will increase, decrease or stay the same.
    The agents were asked to indicate their level of job satisfaction during 2005. A score of 0 represented no job satisfaction, and a score of 10 represented high job satisfaction. Their scores, \(y\), together with the data for distance travelled, \(x\), are illustrated in the scatter diagram below.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-11_680_972_1235_589}
  20. For this scatter diagram, what can you say about the value of
    (a) Spearman's rank correlation coefficient,
    (b) the product moment correlation coefficient?
7 In a UK government survey in 2000, smokers were asked to estimate the time between their waking and their having the first cigarette of the day. For heavy smokers, the results were as follows.
    Time between waking and first cigarette | 1 to 4 minutes | 5 to 14 minutes | 15 to 29 minutes | 30 to 59 minutes | At least 60 minutes
    Percentage of smokers | 31 | 27 | 19 | 14 | 9
    Times are given correct to the nearest minute.
  21. Assuming that 'At least 60 minutes' means 'At least 60 minutes but less than 240 minutes', calculate estimates for the mean and standard deviation of the time between waking and first cigarette for these smokers.
  22. Find an estimate for the interquartile range of the time between waking and first cigarette for these smokers. Give your answer correct to the nearest minute.
  23. The meaning of 'At least 60 minutes' is now changed to 'At least 60 minutes but less than 480 minutes'. Without further calculation, state whether this would cause an increase, a decrease or no change in the estimated value of
    (a) the mean,
    (b) the standard deviation,
    (c) the interquartile range.
8 Henry makes repeated attempts to light his gas fire. He makes the modelling assumption that the probability that the fire will light on any attempt is \(\frac { 1 } { 3 }\). Let \(X\) be the number of attempts at lighting the fire, up to and including the successful attempt.
  24. Name the distribution of \(X\), stating a further modelling assumption needed. In the rest of this question, you should use the distribution named in part (i).
  25. Calculate
    (a) \(\mathrm { P } ( X = 4 )\),
    (b) \(\mathrm { P } ( X < 4 )\).
  26. State the value of \(\mathrm { E } ( X )\).
  27. Henry has to light the fire once a day, starting on March 1st. Calculate the probability that the first day on which fewer than 4 attempts are needed to light the fire is March 3rd.
1 Part of the probability distribution of a variable, \(X\), is given in the table.
    \(x\) | 0 | 1 | 2 | 3
    \(\mathrm { P } ( X = x )\) |  | \(\frac { 3 } { 10 }\) | \(\frac { 1 } { 5 }\) | \(\frac { 2 } { 5 }\)
  28. Find \(\mathrm { P } ( X = 0 )\).
  29. Find \(\mathrm { E } ( X )\).
2 The table contains data concerning five households selected at random from a certain town.
    Number of people in the household | 2 | 3 | 3 | 5 | 7
    Number of cars belonging to people in the household | 1 | 1 | 3 | 2 | 4
  30. Calculate the product moment correlation coefficient, \(r\), for the data in the table.
  31. Give a reason why it would not be sensible to use your answer to draw a conclusion about all the households in the town.
3 The digits 1, 2, 3, 4 and 5 are arranged in random order, to form a five-digit number.
  32. How many different five-digit numbers can be formed?
  33. Find the probability that the five-digit number is
    (a) odd,
    (b) less than 23000.
4 Each of the variables \(W , X , Y\) and \(Z\) takes eight integer values only. The probability distributions are illustrated in the following diagrams.
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-14_423_385_404_287}
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-14_419_376_406_687}
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-14_419_378_406_1082}
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-14_419_376_406_1482}
  34. For which one or more of these variables is
    (a) the mean equal to the median,
    (b) the mean greater than the median?
  35. Give a reason why none of these diagrams could represent a geometric distribution.
  36. Which one of these diagrams could not represent a binomial distribution? Explain your answer briefly.
5 A chemical solution was gradually heated. At five-minute intervals the time, \(x\) minutes, and the temperature, \(y ^ { \circ } \mathrm { C }\), were noted.
    \(x\) | 0 | 5 | 10 | 15 | 20 | 25 | 30 | 35
    \(y\) | 0.8 | 3.0 | 6.8 | 10.9 | 15.6 | 19.6 | 23.4 | 26.7
    $$\left[ n = 8 , \Sigma x = 140 , \Sigma y = 106.8 , \Sigma x ^ { 2 } = 3500 , \Sigma y ^ { 2 } = 2062.66 , \Sigma x y = 2685.0 . \right]$$
  37. Calculate the equation of the regression line of \(y\) on \(x\).
  38. Use your equation to estimate the temperature after 12 minutes.
  39. It is given that the value of the product moment correlation coefficient is close to + 1 . Comment on the reliability of using your equation to estimate \(y\) when
    (a) \(x = 17\),
    (b) \(x = 57\).
6 A coin is biased so that the probability that it will show heads on any throw is \(\frac { 2 } { 3 }\). The coin is thrown repeatedly. The number of throws up to and including the first head is denoted by \(X\). Find
  40. \(\mathrm { P } ( X = 4 )\),
  41. \(\mathrm { P } ( X < 4 )\),
  42. \(\mathrm { E } ( X )\).
7 A bag contains three 1p coins and seven 2p coins. Coins are removed at random one at a time, without replacement, until the total value of the coins removed is at least 3p. Then no more coins are removed.
  43. Copy and complete the probability tree diagram. First coin
    \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-15_350_317_1279_568} Find the probability that
  44. exactly two coins are removed,
  45. the total value of the coins removed is 4p.
Jan 2007
8 In the 2001 census, the household size (the number of people living in each household) was recorded. The percentages of households of different sizes were then calculated. The table shows the percentages for two wards, Withington and Old Moat, in Manchester.
    Household size | 1 | 2 | 3 | 4 | 5 | 6 | 7 or more
    Withington | 34.1 | 26.1 | 12.7 | 12.8 | 8.2 | 4.0 | 2.1
    Old Moat | 35.1 | 27.1 | 14.7 | 11.4 | 7.6 | 2.8 | 1.3
  46. Calculate the median and interquartile range of the household size for Withington.
  47. Making an appropriate assumption for the last class, which should be stated, calculate the mean and standard deviation of the household size for Withington. Give your answers to an appropriate degree of accuracy.
    The corresponding results for Old Moat are as follows.
    Median | Interquartile range | Mean | Standard deviation
    2 | 2 | 2.4 | 1.5
  48. State one advantage of using the median rather than the mean as a measure of the average household size.
  49. By comparing the values for Withington with those for Old Moat, explain briefly why the interquartile range may be less suitable than the standard deviation as a measure of the variation in household size.
  50. For one of the above wards, the value of Spearman's rank correlation coefficient between household size and percentage is - 1 . Without any calculation, state which ward this is. Explain your answer.
OCR Further Statistics 2021 June Q2
2 A book collector compared the prices of some books, \(\pounds x\), when new in 1972 and the prices of copies of the same books, \(\pounds y\), on a second-hand website in 2018.
The results are shown in Table 1 and are summarised below the table. \begin{table}[h]
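Book | A | B | C | D | E | F | G | H | I | J | K | L
\(x\) | 0.95 | 0.65 | 0.70 | 0.90 | 0.55 | 1.40 | 1.50 | 0.50 | 1.15 | 0.35 | 0.20 | 0.35
\(y\) | 6.06 | 7.00 | 2.00 | 5.87 | 4.00 | 5.36 | 7.19 | 2.50 | 3.00 | 8.29 | 1.37 | 2.00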
\captionsetup{labelformat=empty} \caption{Table 1}
\end{table} $$n = 12 , \Sigma x = 9.20 , \Sigma y = 54.64 , \Sigma x ^ { 2 } = 8.9950 , \Sigma y ^ { 2 } = 310.4572 , \Sigma x y = 46.0545$$
  1. It is given that the value of Pearson's product-moment correlation coefficient for the data is 0.381 , correct to 3 significant figures.
    1. State what this information tells you about a scatter diagram illustrating the data.
    2. Test at the \(5 \%\) significance level whether there is evidence of positive correlation between prices in 1972 and prices in 2018.
  2. The collector noticed that the second-hand copy of book J was unusually expensive and he decided to ignore the data for book J. Calculate the value of Pearson's product-moment correlation coefficient for the other 11 books.