5.08e Spearman rank correlation

107 questions

OCR S1 2005 January Q3
6 marks Moderate -0.8
3 Two commentators gave ratings out of 100 for seven sports personalities. The ratings are shown in the table below.
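$$\begin{array} { l c c c c c c c } \text { Personality } & A & B & C & D & E & F & G \\ \text { Commentator I } & 73 & 76 & 78 & 65 & 86 & 82 & 91 \\ \text { Commentator II } & 77 & 78 & 79 & 80 & 86 & 89 & 95 \end{array}$$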
  1. Calculate Spearman's rank correlation coefficient for these ratings.
  2. State what your answer tells you about the ratings given by the two commentators.
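A calculation of this kind can be checked by computer. The following is a minimal sketch, assuming there are no tied ratings and using the standard formula \(r _ { s } = 1 - \frac { 6 \Sigma d ^ { 2 } } { n \left( n ^ { 2 } - 1 \right) }\); the helper names `ranks` and `spearman` are illustrative and are not part of the question.

```python
def ranks(values):
    """Rank values from smallest (1) to largest (n). Assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's coefficient via 1 - 6*sum(d^2) / (n(n^2 - 1)). Assumes no ties."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Ratings for personalities A to G, read from the table in the question.
commentator_1 = [73, 76, 78, 65, 86, 82, 91]
commentator_2 = [77, 78, 79, 80, 86, 89, 95]

r_s = spearman(commentator_1, commentator_2)
print(round(r_s, 3))
```

Ranking from smallest to largest or from largest to smallest gives the same coefficient, provided both sets of ratings are ranked in the same direction.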
OCR S1 2008 January Q3
6 marks Standard +0.3
3 A sample of bivariate data was taken and the results were summarised as follows. $$n = 5 \quad \Sigma x = 24 \quad \Sigma x ^ { 2 } = 130 \quad \Sigma y = 39 \quad \Sigma y ^ { 2 } = 361 \quad \Sigma x y = 212$$
  1. Show that the value of the product moment correlation coefficient \(r\) is 0.855, correct to 3 significant figures.
  2. The ranks of the data were found. One student calculated Spearman's rank correlation coefficient \(r _ { s }\), and found that \(r _ { s } = 0.7\). Another student calculated the product moment coefficient, \(R\), of these ranks. State which one of the following statements is true, and explain your answer briefly.
    (A) \(R = 0.855\) (B) \(R = 0.7\) (C) It is impossible to give the value of \(R\) without carrying out a calculation using the original data.
  3. All the values of \(x\) are now multiplied by a scaling factor of 2. State the new values of \(r\) and \(r _ { s }\).
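The summary statistics in this question are enough to compute \(r\) directly. The following is a minimal sketch, assuming the usual definitions \(S _ { x x } = \Sigma x ^ { 2 } - \frac { ( \Sigma x ) ^ { 2 } } { n }\), \(S _ { y y } = \Sigma y ^ { 2 } - \frac { ( \Sigma y ) ^ { 2 } } { n }\) and \(S _ { x y } = \Sigma x y - \frac { \Sigma x \Sigma y } { n }\); the function name `pmcc` is illustrative. It also recomputes \(r\) after every \(x\) value is doubled, which doubles \(\Sigma x\) and \(\Sigma x y\) and quadruples \(\Sigma x ^ { 2 }\).

```python
from math import sqrt


def pmcc(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Product moment correlation coefficient from summary statistics."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xy / sqrt(s_xx * s_yy)


# Summary statistics as given in the question.
r = pmcc(5, 24, 130, 39, 361, 212)

# Doubling every x value: sum x and sum xy double, sum x^2 is multiplied by 4.
r_scaled = pmcc(5, 2 * 24, 4 * 130, 39, 361, 2 * 212)

print(round(r, 3), round(r_scaled, 3))
```

Doubling every \(x\) value does not change the order of the \(x\) values, so the ranks are unchanged as well.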
OCR S1 2005 June Q1
6 marks Moderate -0.8
1
  1. Calculate the value of Spearman's rank correlation coefficient between the two sets of rankings, \(A\) and \(B\), shown in Table 1.
    $$\begin{array} { c c c c c c } A & 1 & 2 & 3 & 4 & 5 \\ B & 4 & 1 & 3 & 2 & 5 \end{array}$$
    Table 1
  2. The value of Spearman's rank correlation coefficient between the set of rankings \(B\) and a third set of rankings, \(C\), is known to be \(-1\). Copy and complete Table 2 showing the set of rankings \(C\).
    $$\begin{array} { c c c c c c } B & 4 & 1 & 3 & 2 & 5 \\ C & & & & & \end{array}$$
    Table 2
OCR S1 2006 June Q6
10 marks Standard +0.3
6 The table shows the total distance travelled, in thousands of miles, and the amount of commission earned, in thousands of pounds, by each of seven sales agents in 2005.
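$$\begin{array} { l c c c c c c c } \text { Agent } & A & B & C & D & E & F & G \\ \text { Distance travelled } & 18 & 15 & 12 & 14 & 16 & 24 & 13 \\ \text { Commission earned } & 18 & 45 & 19 & 24 & 27 & 22 & 23 \end{array}$$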
  1. (a) Calculate Spearman's rank correlation coefficient, \(r _ { s }\), for these data.
    (b) Comment briefly on your value of \(r _ { s }\) with reference to this context.
    (c) After these data were collected, agent \(A\) found that he had made a mistake. He had actually travelled 19000 miles in 2005. State, with a reason, but without further calculation, whether the value of Spearman's rank correlation coefficient will increase, decrease or stay the same.
  The agents were asked to indicate their level of job satisfaction during 2005. A score of 0 represented no job satisfaction, and a score of 10 represented high job satisfaction. Their scores, \(y\), together with the data for distance travelled, \(x\), are illustrated in the scatter diagram below.
    [diagram]
  2. For this scatter diagram, what can you say about the value of
    (a) Spearman's rank correlation coefficient,
    (b) the product moment correlation coefficient?
OCR S1 2007 June Q2
5 marks Easy -1.2
2 Two judges each placed skaters from five countries in rank order.
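$$\begin{array} { l c c c c c } \text { Position } & \text { 1st } & \text { 2nd } & \text { 3rd } & \text { 4th } & \text { 5th } \\ \text { Judge 1 } & \text { UK } & \text { France } & \text { Russia } & \text { Poland } & \text { Canada } \\ \text { Judge 2 } & \text { Russia } & \text { Canada } & \text { France } & \text { UK } & \text { Poland } \end{array}$$
Calculate Spearman's rank correlation coefficient, \(r _ { s }\), for the two judges' rankings.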
OCR S1 2007 June Q6
12 marks Moderate -0.3
6 A machine with artificial intelligence is designed to improve its efficiency rating with practice. The table shows the values of the efficiency rating, \(y\), after the machine has carried out its task various numbers of times, \(x\).
$$\begin{array} { c c c c c c c c c } x & 0 & 1 & 2 & 3 & 4 & 7 & 13 & 30 \\ y & 0 & 4 & 8 & 10 & 11 & 12 & 13 & 14 \end{array}$$
$$\left[ n = 8 , \Sigma x = 60 , \Sigma y = 72 , \Sigma x ^ { 2 } = 1148 , \Sigma y ^ { 2 } = 810 , \Sigma x y = 767 . \right]$$ These data are illustrated in the scatter diagram.
[diagram]
  1. (a) Calculate the value of \(r\), the product moment correlation coefficient.
    (b) Without calculation, state with a reason the value of \(r _ { s }\), Spearman's rank correlation coefficient.
  2. A researcher suggests that the data for \(x = 0\) and \(x = 1\) should be ignored. Without calculation, state with a reason what effect this would have on the value of
    (a) \(r\),
    (b) \(r _ { s }\).
  3. Use the diagram to estimate the value of \(y\) when \(x = 29\).
  4. Jack finds the equation of the regression line of \(y\) on \(x\) for all the data, and uses it to estimate the value of \(y\) when \(x = 29\). Without calculation, state with a reason whether this estimate or the one found in part (iii) will be the more reliable.
OCR S1 2016 June Q4
8 marks Moderate -0.3
4 In this question the product moment correlation coefficient is denoted by \(r\) and Spearman's rank correlation coefficient is denoted by \(r _ { s }\).
  1. The scatter diagram in Fig. 1 shows the results of an experiment involving some bivariate data.
    [diagram: Fig. 1]
    Write down the value of \(r _ { s }\) for these data.
  2. On the diagram in the Answer Booklet, draw five points such that \(r _ { s } = 1\) and \(r \neq 1\).
  3. The scatter diagram in Fig. 2 shows the results of another experiment involving 5 items of bivariate data.
    [diagram: Fig. 2]
    Calculate the value of \(r _ { s }\).
  4. A random variable \(X\) has the distribution \(\mathrm { B } ( 25,0.6 )\). Find
    1. \(\mathrm { P } ( X \leqslant 14 )\),
    2. \(\mathrm { P } ( X = 14 )\),
    3. \(\quad \operatorname { Var } ( X )\).
    4. A random variable \(Y\) has the distribution \(\mathrm { B } ( 24,0.3 )\). Write down an expression for \(\mathrm { P } ( Y = y )\) and evaluate this probability in the case where \(y = 8\).
    5. A random variable \(Z\) has the distribution \(\mathrm { B } ( 2,0.2 )\). Find the probability that two randomly chosen values of \(Z\) are equal.
      (a) Find the number of ways in which 12 people can be divided into three groups containing 5 people, 4 people and 3 people, without regard to order.
      (b) The diagram shows 7 cards, each with a letter on it. $$\mathrm { A } \mathrm {~A} \mathrm {~A} \mathrm {~B} \text { } \mathrm { B } \text { } \mathrm { R } \text { } \mathrm { R }$$ The 7 cards are arranged in a random order in a straight line.
      1. Find the number of possible arrangements of the 7 letters.
      2. Find the probability that the 7 letters form the name BARBARA. The 7 cards are shuffled. Now 4 of the 7 cards are chosen at random and arranged in a random order in a straight line.
      3. Find the probability that the letters form the word ABBA .
OCR S1 Specimen Q2
7 marks Standard +0.3
2 Two independent assessors awarded marks to each of 5 projects. The results were as shown in the table.
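$$\begin{array} { l c c c c c } \text { Project } & A & B & C & D & E \\ \text { First assessor } & 38 & 91 & 62 & 83 & 61 \\ \text { Second assessor } & 56 & 84 & 41 & 85 & 62 \end{array}$$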
  1. Calculate Spearman's rank correlation coefficient for the data.
  2. Show, by sketching a suitable scatter diagram, how two assessors might have assessed 5 projects in such a way that Spearman's rank correlation coefficient for their marks was \(+1\) while the product moment correlation coefficient for their marks was not \(+1\). (Your scatter diagram need not be drawn accurately to scale.)
OCR MEI S2 2006 January Q3
18 marks Standard +0.3
3 A researcher is investigating the relationship between temperature and levels of the air pollutant nitrous oxide at a particular site. The researcher believes that there will be a positive correlation between the daily maximum temperature, \(x\), and nitrous oxide level, \(y\). Data are collected for 10 randomly selected days. The data, measured in suitable units, are given in the table and illustrated on the scatter diagram.
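$$\begin{array} { c c c c c c c c c c c } x & 13.3 & 17.2 & 16.9 & 18.7 & 18.4 & 19.3 & 23.1 & 15.0 & 20.6 & 14.4 \\ y & 9 & 11 & 14 & 26 & 43 & 25 & 52 & 15 & 10 & 7 \end{array}$$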
[diagram]
  1. Calculate the value of Spearman's rank correlation coefficient for these data.
  2. Perform a hypothesis test at the \(5 \%\) level to check the researcher's belief, stating your hypotheses clearly.
  3. It is suggested that it would be preferable to carry out a test based on the product moment correlation coefficient. State the distributional assumption required for such a test to be valid. Explain how a scatter diagram can be used to check whether the distributional assumption is likely to be valid and comment on the validity in this case.
  4. A statistician investigates data over a much longer period and finds that the assumptions for the use of the product moment correlation coefficient are in fact valid. Give the critical region for the test at the \(1 \%\) level, based on a sample of 60 days.
  5. In a different research project, into the correlation between daily temperature and ozone pollution levels, a positive correlation is found. It is argued that this shows that high temperatures cause increased ozone levels. Comment on this claim.
OCR MEI S2 2007 June Q2
19 marks Standard +0.3
2 A medical student is trying to estimate the birth weight of babies using pre-natal scan images. The actual weights, \(x \mathrm {~kg}\), and the estimated weights, \(y \mathrm {~kg}\), of ten randomly selected babies are given in the table below.
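$$\begin{array} { c c c c c c c c c c c } x & 2.61 & 2.73 & 2.87 & 2.96 & 3.05 & 3.14 & 3.17 & 3.24 & 3.76 & 4.10 \\ y & 3.2 & 2.6 & 3.5 & 3.1 & 2.8 & 2.7 & 3.4 & 3.3 & 4.4 & 4.1 \end{array}$$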
  1. Calculate the value of Spearman's rank correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) level to determine whether there is positive association between the student's estimates and the actual birth weights of babies in the underlying population.
  3. Calculate the value of the product moment correlation coefficient of the sample. You may use the following summary statistics in your calculations: $$\Sigma x = 31.63 , \quad \Sigma y = 33.1 , \quad \Sigma x ^ { 2 } = 101.92 , \quad \Sigma y ^ { 2 } = 112.61 , \quad \Sigma x y = 106.51 .$$
  4. Explain why, if the underlying population has a bivariate Normal distribution, it would be preferable to carry out a hypothesis test based on the product moment correlation coefficient. Comment briefly on the significance of the product moment correlation coefficient in relation to that of Spearman's rank correlation coefficient.
OCR S1 2009 January Q4
7 marks Moderate -0.3
4 Three tutors each marked the coursework of five students. The marks are given in the table.
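$$\begin{array} { l c c c c c } \text { Student } & A & B & C & D & E \\ \text { Tutor 1 } & 73 & 67 & 60 & 48 & 39 \\ \text { Tutor 2 } & 62 & 50 & 61 & 76 & 65 \\ \text { Tutor 3 } & 42 & 50 & 63 & 54 & 71 \end{array}$$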
  1. Calculate Spearman's rank correlation coefficient, \(r _ { \mathrm { s } }\), between the marks for tutors 1 and 2.
  2. The values of \(r _ { \mathrm { s } }\) for the other pairs of tutors are as follows. $$\begin{array} { c c } \text { Tutors } 1 \text { and 3: } & r _ { \mathrm { s } } = - 0.9 \\ \text { Tutors } 2 \text { and 3: } & r _ { \mathrm { s } } = 0.3 \end{array}$$ State which two tutors differ most widely in their judgements. Give your reason.
OCR S1 2012 January Q4
8 marks Standard +0.8
4
  1. The table gives the heights and masses of 5 people.
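    $$\begin{array} { l c c c c c } \text { Person } & A & B & C & D & E \\ \text { Height (m) } & 1.72 & 1.63 & 1.77 & 1.68 & 1.74 \\ \text { Mass (kg) } & 75 & 62 & 64 & 60 & 70 \end{array}$$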
    Calculate Spearman's rank correlation coefficient.
  2. In an art competition the value of Spearman's rank correlation coefficient, \(r _ { s }\), calculated from two judges' rankings was 0.75. A late entry for the competition was received and both judges ranked this entry lower than all the others. By considering the formula for \(r _ { s }\), explain whether the new value of \(r _ { s }\) will be less than 0.75, equal to 0.75, or greater than 0.75.
OCR S1 2011 June Q2
5 marks Easy -1.2
2 The orders in which 4 contestants, \(P , Q , R\) and \(S\), were placed in two competitions are shown in the table.
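$$\begin{array} { l c c c c } \text { Position } & \text { 1st } & \text { 2nd } & \text { 3rd } & \text { 4th } \\ \text { Competition 1 } & Q & R & S & P \\ \text { Competition 2 } & Q & P & R & S \end{array}$$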
Calculate Spearman's rank correlation coefficient between these two orders.
OCR S1 2011 June Q7
6 marks Moderate -0.8
7 The diagram shows the results of an experiment involving some bivariate data. The least squares regression line of \(y\) on \(x\) for these results is also shown.
[diagram]
  1. Given that the least squares regression line of \(y\) on \(x\) is used for an estimation, state which of \(x\) or \(y\) is treated as the independent variable.
  2. Use the diagram to explain what is meant by 'least squares'.
  3. State, with a reason, the value of Spearman's rank correlation coefficient for these data.
  4. What can be said about the value of the product moment correlation coefficient for these data?
OCR S1 2012 June Q5
8 marks Easy -1.2
5
  1. All the discs are replaced in the bag. Tony now takes three discs from the bag at random without replacement.
  2. Given that the first disc Tony takes is red, find the probability that the third disc Tony takes is also red.
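  3. Write down the value of Spearman's rank correlation coefficient, \(r _ { s }\), for the following sets of ranks.
    (a)
    $$\begin{array} { l c c c c } \text { Judge } A \text { ranks } & 1 & 2 & 3 & 4 \\ \text { Judge } B \text { ranks } & 1 & 2 & 3 & 4 \end{array}$$
    (b)
    $$\begin{array} { l c c c c } \text { Judge } A \text { ranks } & 1 & 2 & 3 & 4 \\ \text { Judge } C \text { ranks } & 4 & 3 & 2 & 1 \end{array}$$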
  4. Calculate the value of \(r _ { s }\) for the following ranks.
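    $$\begin{array} { l c c c c } \text { Judge } A \text { ranks } & 1 & 2 & 3 & 4 \\ \text { Judge } D \text { ranks } & 2 & 4 & 1 & 3 \end{array}$$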
  5. For each of parts (i)(a), (i)(b) and (ii), describe in everyday terms the relationship between the two judges' opinions.
OCR S1 2014 June Q6
5 marks Moderate -0.8
6 Fiona and James collected the results for six hockey teams at the end of the season. They then carried out various calculations using Spearman's rank correlation coefficient, \(r _ { s }\).
  1. Fiona calculated the value of \(r _ { s }\) between the number of goals scored FOR each team and the number of goals scored AGAINST each team. She found that \(r _ { s } = - 1\). Complete the table in the answer book showing the ranks.
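    $$\begin{array} { l c c c c c c } \text { Team } & \text { A } & \text { B } & \text { C } & \text { D } & \text { E } & \text { F } \\ \text { Number of goals FOR (rank) } & 1 & 2 & 3 & 4 & 5 & 6 \\ \text { Number of goals AGAINST (rank) } & & & & & & \end{array}$$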
  2. James calculated the value of \(r _ { s }\) between the number of goals scored and the number of points gained by the 6 teams. He found the value of \(r _ { s }\) to be 1. He then decided to include the results of another two teams in the calculation of \(r _ { s }\). The table shows the ranks for these two teams.
    $$\begin{array} { l c c } \text { Team } & \text { G } & \text { H } \\ \text { Number of goals scored (rank) } & 7 & 8 \\ \text { Number of points gained (rank) } & 8 & 7 \end{array}$$
    Calculate the value of \(r _ { s }\) for all 8 teams.
OCR S1 2015 June Q3
6 marks Moderate -0.8
3 An expert tested the quality of the wines produced by a vineyard in 9 particular years. He placed them in the following order, starting with the best. $$\begin{array} { l l l l l l l l l } 1980 & 1983 & 1981 & 1982 & 1984 & 1985 & 1987 & 1986 & 1988 \end{array}$$
  1. Calculate Spearman's rank correlation coefficient, \(r _ { s }\), between the year of production and the quality of these wines. The years should be ranked from the earliest (1) to the latest (9).
  2. State what this value of \(r _ { s }\) shows in this context.
OCR MEI S2 2009 January Q1
20 marks Moderate -0.3
1 A researcher is investigating whether there is a relationship between the population size of cities and the average walking speed of pedestrians in the city centres. Data for the population size, \(x\) thousands, and the average walking speed of pedestrians, \(y \mathrm {~m} \mathrm {~s} ^ { - 1 }\), of eight randomly selected cities are given in the table below.
$$\begin{array} { c c c c c c c c c } x & 18 & 43 & 52 & 94 & 98 & 206 & 784 & 1530 \\ y & 1.15 & 0.97 & 1.26 & 1.35 & 1.28 & 1.42 & 1.32 & 1.64 \end{array}$$
  1. Calculate the value of Spearman's rank correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to determine whether there is any association between population size and average walking speed.
  In another investigation, the researcher selects a random sample of six adult males of particular ages and measures their maximum walking speeds. The data are shown in the table below, where \(t\) years is the age of the adult and \(w \mathrm {~m} \mathrm {~s} ^ { - 1 }\) is the maximum walking speed. Also shown are summary statistics and a scatter diagram on which the regression line of \(w\) on \(t\) is drawn.
    $$\begin{array} { c c c c c c c } t & 20 & 30 & 40 & 50 & 60 & 70 \\ w & 2.49 & 2.41 & 2.38 & 2.14 & 1.97 & 2.03 \end{array}$$
    $$n = 6 \quad \Sigma t = 270 \quad \Sigma w = 13.42 \quad \Sigma t ^ { 2 } = 13900 \quad \Sigma w ^ { 2 } = 30.254 \quad \Sigma t w = 584.6$$
    [diagram]
  3. Calculate the equation of the regression line of \(w\) on \(t\).
  4. (A) Use this equation to calculate an estimate of maximum walking speed of an 80-year-old male.
    (B) Explain why it might not be appropriate to use the equation to calculate an estimate of maximum walking speed of a 10-year-old male.
OCR MEI S2 2012 January Q1
17 marks Standard +0.3
1 Nine long-distance runners are starting an exercise programme to improve their strength. During the first session, each of them has to do a 100 metre run and to do as many push-ups as possible in one minute. The times taken for the run, together with the number of push-ups each runner achieves, are shown in the table.
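$$\begin{array} { l c c c c c c c c c } \text { Runner } & \text { A } & \text { B } & \text { C } & \text { D } & \text { E } & \text { F } & \text { G } & \text { H } & \text { I } \\ \text { 100 metre time (seconds) } & 13.2 & 11.6 & 10.9 & 12.3 & 14.7 & 13.1 & 11.7 & 13.6 & 12.4 \\ \text { Push-ups achieved } & 32 & 42 & 22 & 36 & 41 & 27 & 37 & 38 & 33 \end{array}$$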
  1. Draw a scatter diagram to illustrate the data.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a hypothesis test at the \(5 \%\) significance level to examine whether there is any association between time taken for the run and number of push-ups achieved.
  4. Under what circumstances is it appropriate to carry out a hypothesis test based on the product moment correlation coefficient? State, with a reason, which test is more appropriate for these data.
OCR MEI S2 2010 June Q1
16 marks Standard +0.3
1 Two celebrities judge a talent contest. Each celebrity gives a score out of 20 to each of a random sample of 8 contestants. The scores, \(x\) and \(y\), given by the celebrities to each contestant are shown below.
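$$\begin{array} { l c c c c c c c c } \text { Contestant } & \text { A } & \text { B } & \text { C } & \text { D } & \text { E } & \text { F } & \text { G } & \text { H } \\ x & 6 & 17 & 9 & 20 & 13 & 15 & 11 & 14 \\ y & 6 & 13 & 10 & 11 & 9 & 7 & 12 & 15 \end{array}$$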
  1. Calculate the value of Spearman's rank correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to determine whether there is positive association between the scores allocated by the two celebrities.
  3. State the distributional assumption required for a test based on the product moment correlation coefficient. Sketch a scatter diagram of the scores above, and discuss whether it appears that the assumption is likely to be valid.
OCR MEI S2 2014 June Q1
18 marks Standard +0.3
1 A medical student is investigating the claim that young adults with high diastolic blood pressure tend to have high systolic blood pressure. The student measures the diastolic and systolic blood pressures of a random sample of ten young adults. The data are shown in the table and illustrated in the scatter diagram.
$$\begin{array} { l c c c c c c c c c c } \text { Diastolic blood pressure } & 60 & 61 & 62 & 63 & 73 & 76 & 84 & 87 & 90 & 95 \\ \text { Systolic blood pressure } & 98 & 121 & 118 & 114 & 108 & 112 & 132 & 130 & 134 & 139 \end{array}$$
[diagram]
  1. Calculate the value of Spearman's rank correlation coefficient for these data.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to examine whether there is positive association between diastolic blood pressure and systolic blood pressure in the population of young adults.
  3. Explain why, in the light of the scatter diagram, it might not be valid to carry out a test based on the product moment correlation coefficient.
  The product moment correlation coefficient between the diastolic and systolic blood pressures of a random sample of 10 athletes is 0.707.
  4. Carry out a hypothesis test at the \(1 \%\) significance level to investigate whether there appears to be positive correlation between these two variables in the population of athletes. You may assume that in this case such a test is valid.
OCR MEI S2 2016 June Q1
18 marks Standard +0.3
1 A researcher believes that there may be negative association between the quantity of fertiliser used and the percentage of the population who live in rural areas in different countries. The data below show the percentage of the population who live in rural areas and the fertiliser use measured in kg per hectare, for a random sample of 11 countries.
Percentage of population33658358169617747117
Fertiliser use764466831071765137157
  1. Draw a scatter diagram to illustrate the data.
  2. Explain why it might not be valid to carry out a test based on the product moment correlation coefficient in this case.
  3. Calculate the value of Spearman's rank correlation coefficient.
  4. Carry out a hypothesis test at the \(1 \%\) significance level to investigate the researcher's belief.
  5. Explain the meaning of ' \(1 \%\) significance level'.
  6. In order to carry out a test based on Spearman's rank correlation coefficient, what modelling assumptions, if any, are required about the underlying distribution?
OCR Further Statistics AS 2019 June Q3
6 marks Challenging +1.2
3
  1. Shula calculates the value of Spearman's rank correlation coefficient \(r _ { s }\) for 9 pairs of rankings.
    Find the largest possible value of \(r _ { s }\) that Shula can obtain that is less than 1 .
  2. A set of bivariate data consists of 5 pairs of values. It is known that for this data the value of Spearman's rank correlation coefficient is \(-1\) but the value of Pearson's product-moment correlation coefficient is not \(-1\). Sketch a possible scatter diagram illustrating the data.
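For part 1, the formula \(r _ { s } = 1 - \frac { 6 \Sigma d ^ { 2 } } { n \left( n ^ { 2 } - 1 \right) }\) shows that the largest value below 1 comes from the smallest non-zero value of \(\Sigma d ^ { 2 }\). The following is a minimal sketch, assuming there are no tied ranks, so that the smallest non-zero total arises from swapping two adjacent ranks (differences \(+1\) and \(-1\), giving \(\Sigma d ^ { 2 } = 2\)); the function name `rs_from_d2` is illustrative.

```python
from fractions import Fraction


def rs_from_d2(n, d_squared):
    """Spearman's coefficient as an exact fraction, given n and sum of d^2."""
    return 1 - Fraction(6 * d_squared, n * (n * n - 1))


n = 9
# Assumption: no tied ranks, so the smallest non-zero sum of d^2 is 2
# (two adjacent ranks swapped, all other ranks in agreement).
largest_below_one = rs_from_d2(n, 2)
print(largest_below_one)
```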
OCR Further Statistics AS 2024 June Q5
9 marks Standard +0.3
5 In a fashion competition, two judges gave marks to a large number of contestants. The value of Spearman's rank correlation coefficient, \(\mathrm { r } _ { \mathrm { s } }\), between the marks given to 7 randomly chosen contestants is \(\frac { 27 } { 28 }\).
  1. An excerpt from the table of critical values of \(\mathrm { r } _ { \mathrm { s } }\) is shown below.
    Critical values of Spearman's rank correlation coefficient
    $$\begin{array} { c c c c c c } \text { 1-tail test } & & 5 \% & 2.5 \% & 1 \% & 0.5 \% \\ \text { 2-tail test } & & 10 \% & 5 \% & 2 \% & 1 \% \\ n & 6 & 0.8286 & 0.8857 & 0.9429 & 1.0000 \\ & 7 & 0.7143 & 0.7857 & 0.8929 & 0.9286 \\ & 8 & 0.6429 & 0.7381 & 0.8333 & 0.8810 \end{array}$$
    Test whether there is evidence, at the 1\% significance level, that the judges agree with each other.
  The marks given by the two judges to the 7 randomly chosen contestants were as follows, where \(x\) is an integer.
    $$\begin{array} { l c c c c c c c } \text { Contestant } & A & B & C & D & E & F & G \\ \text { Judge 1 } & 64 & 65 & 67 & 78 & 79 & 80 & 86 \\ \text { Judge 2 } & 61 & 63 & 78 & 80 & 81 & 90 & x \end{array}$$
  2. Use the value \(\mathrm { r } _ { \mathrm { s } } = \frac { 27 } { 28 }\) to determine the range of possible values of \(x\).
  3. Give a reason why it might be preferable to use the product moment correlation coefficient rather than Spearman's rank correlation coefficient in this context.
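The reasoning in part 2 can be checked by brute force. The following is a minimal sketch, assuming marks are integers from 0 to 100 (the question does not state the scale) and that \(x\) does not tie with another Judge 2 mark; the helper names are illustrative. It tries each candidate \(x\) and keeps those for which \(r _ { s }\) is exactly \(\frac { 27 } { 28 }\).

```python
from fractions import Fraction


def ranks(values):
    """Rank values from smallest (1) to largest (n). Assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Exact Spearman's coefficient via 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - Fraction(6 * d_squared, n * (n * n - 1))


judge_1 = [64, 65, 67, 78, 79, 80, 86]
judge_2_known = [61, 63, 78, 80, 81, 90]  # contestant G's mark is x
target = Fraction(27, 28)

# Assumption: marks lie between 0 and 100 and x creates no tie.
possible_x = [
    x
    for x in range(0, 101)
    if x not in judge_2_known
    and spearman(judge_1, judge_2_known + [x]) == target
]
print(possible_x)
```

Working with `Fraction` avoids rounding when testing for equality with \(\frac { 27 } { 28 }\).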
OCR Further Statistics AS 2020 November Q4
9 marks Standard +0.3
4 After a holiday organised for a group, the company organising the holiday obtained scores out of 10 for six different aspects of the holiday. The company obtained responses from 100 couples and 100 single travellers. The total scores for each of the aspects are given in the following table.
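$$\begin{array} { l c c } \text { Aspect } & \text { Couples } & \text { Single travellers } \\ \text { Organisation } & 884 & 867 \\ \text { Travel } & 710 & 633 \\ \text { Food } & 692 & 675 \\ \text { Leader } & 898 & 898 \\ \text { Included visits } & 561 & 736 \\ \text { Optional visits } & 683 & 712 \end{array}$$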
Fred wishes to test whether there is significant positive correlation between the scores given by the two categories.
  1. Explain why it is probably not appropriate to use Pearson's product-moment correlation coefficient.
  2. Carry out an appropriate test at the \(1 \%\) level.
  3. Explain what is meant by the statement that the test carried out in part (b) is a non-parametric test.