5.08f Hypothesis test: Spearman rank

95 questions

OCR S1 2005 January Q3
6 marks Moderate -0.8
3 Two commentators gave ratings out of 100 for seven sports personalities. The ratings are shown in the table below.
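| Personality | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) |
|---|---|---|---|---|---|---|---|
| Commentator I | 73 | 76 | 78 | 65 | 86 | 82 | 91 |
| Commentator II | 77 | 78 | 79 | 80 | 86 | 89 | 95 |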
  1. Calculate Spearman's rank correlation coefficient for these ratings.
  2. State what your answer tells you about the ratings given by the two commentators.
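As a check on part 1, the minimal sketch below ranks each commentator's ratings and applies \(r _ { s } = 1 - \frac { 6 \sum d ^ { 2 } } { n ( n ^ { 2 } - 1 ) }\). The helper names `ranks` and `spearman_rs` are illustrative, and the formula assumes there are no tied values, which holds for these ratings.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Spearman's coefficient via 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


commentator_1 = [73, 76, 78, 65, 86, 82, 91]
commentator_2 = [77, 78, 79, 80, 86, 89, 95]

r_s = spearman_rs(commentator_1, commentator_2)
print(r_s)  # 0.75 (sum of d^2 is 14 with n = 7)
```

Ranking from largest to smallest instead would give the same coefficient, provided both rows are ranked in the same direction.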
OCR S1 2006 June Q6
10 marks Standard +0.3
6 The table shows the total distance travelled, in thousands of miles, and the amount of commission earned, in thousands of pounds, by each of seven sales agents in 2005.
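| Agent | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) |
|---|---|---|---|---|---|---|---|
| Distance travelled | 18 | 15 | 12 | 14 | 16 | 24 | 13 |
| Commission earned | 18 | 45 | 19 | 24 | 27 | 22 | 23 |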
  1. (a) Calculate Spearman's rank correlation coefficient, \(r _ { s }\), for these data.
    (b) Comment briefly on your value of \(r _ { s }\) with reference to this context.
    (c) After these data were collected, agent \(A\) found that he had made a mistake. He had actually travelled 19000 miles in 2005. State, with a reason, but without further calculation, whether the value of Spearman's rank correlation coefficient will increase, decrease or stay the same.
    The agents were asked to indicate their level of job satisfaction during 2005. A score of 0 represented no job satisfaction, and a score of 10 represented high job satisfaction. Their scores, \(y\), together with the data for distance travelled, \(x\), are illustrated in the scatter diagram below.
    [diagram]
  2. For this scatter diagram, what can you say about the value of
    (a) Spearman's rank correlation coefficient,
    (b) the product moment correlation coefficient?
OCR S1 2016 June Q4
8 marks Moderate -0.3
4 In this question the product moment correlation coefficient is denoted by \(r\) and Spearman's rank correlation coefficient is denoted by \(r _ { s }\).
  1. The scatter diagram in Fig. 1 shows the results of an experiment involving some bivariate data.
    [diagram: Fig. 1]
    Write down the value of \(r _ { s }\) for these data.
  2. On the diagram in the Answer Booklet, draw five points such that \(r _ { s } = 1\) and \(r \neq 1\).
  3. The scatter diagram in Fig. 2 shows the results of another experiment involving 5 items of bivariate data.
    [diagram: Fig. 2]
    Calculate the value of \(r _ { s }\).
  4. A random variable \(X\) has the distribution \(\mathrm { B } ( 25,0.6 )\). Find
    1. \(\mathrm { P } ( X \leqslant 14 )\),
    2. \(\mathrm { P } ( X = 14 )\),
    3. \(\quad \operatorname { Var } ( X )\).
    4. A random variable \(Y\) has the distribution \(\mathrm { B } ( 24,0.3 )\). Write down an expression for \(\mathrm { P } ( Y = y )\) and evaluate this probability in the case where \(y = 8\).
    5. A random variable \(Z\) has the distribution \(\mathrm { B } ( 2,0.2 )\). Find the probability that two randomly chosen values of \(Z\) are equal.
      (a) Find the number of ways in which 12 people can be divided into three groups containing 5 people, 4 people and 3 people, without regard to order.
      (b) The diagram shows 7 cards, each with a letter on it. $$\mathrm { A } \mathrm {~A} \mathrm {~A} \mathrm {~B} \text { } \mathrm { B } \text { } \mathrm { R } \text { } \mathrm { R }$$ The 7 cards are arranged in a random order in a straight line.
      1. Find the number of possible arrangements of the 7 letters.
      2. Find the probability that the 7 letters form the name BARBARA.
      The 7 cards are shuffled. Now 4 of the 7 cards are chosen at random and arranged in a random order in a straight line.
      3. Find the probability that the letters form the word ABBA.
OCR S1 Specimen Q2
7 marks Standard +0.3
2 Two independent assessors awarded marks to each of 5 projects. The results were as shown in the table.
| Project | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) |
|---|---|---|---|---|---|
| First assessor | 38 | 91 | 62 | 83 | 61 |
| Second assessor | 56 | 84 | 41 | 85 | 62 |
  1. Calculate Spearman's rank correlation coefficient for the data.
  2. Show, by sketching a suitable scatter diagram, how two assessors might have assessed 5 projects in such a way that Spearman's rank correlation coefficient for their marks was \(+1\) while the product moment correlation coefficient for their marks was not \(+1\). (Your scatter diagram need not be drawn accurately to scale.)
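To illustrate the idea behind part 2, here is a small sketch using hypothetical marks (not taken from the question) that increase together but not along a straight line. Because the two lists have identical rank orders, Spearman's coefficient is \(+1\), while the product moment coefficient computed by the illustrative helper `pearson_r` falls short of 1.

```python
from math import sqrt


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def pearson_r(xs, ys):
    """Product moment correlation coefficient S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical marks: the second assessor agrees on the order but not the spacing.
first_assessor = [10, 20, 30, 40, 50]
second_assessor = [10, 12, 15, 40, 90]

same_order = ranks(first_assessor) == ranks(second_assessor)  # True, so r_s = +1
r = pearson_r(first_assessor, second_assessor)                # roughly 0.88
print(same_order, round(r, 2))
```

Any set of points that rises monotonically without lying on a single straight line would serve equally well in a sketch.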
OCR MEI S2 2006 January Q3
18 marks Standard +0.3
3 A researcher is investigating the relationship between temperature and levels of the air pollutant nitrous oxide at a particular site. The researcher believes that there will be a positive correlation between the daily maximum temperature, \(x\), and nitrous oxide level, \(y\). Data are collected for 10 randomly selected days. The data, measured in suitable units, are given in the table and illustrated on the scatter diagram.
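| \(x\) | 13.3 | 17.2 | 16.9 | 18.7 | 18.4 | 19.3 | 23.1 | 15.0 | 20.6 | 14.4 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 9 | 11 | 14 | 26 | 43 | 25 | 52 | 15 | 10 | 7 |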
[diagram]
  1. Calculate the value of Spearman's rank correlation coefficient for these data.
  2. Perform a hypothesis test at the \(5 \%\) level to check the researcher's belief, stating your hypotheses clearly.
  3. It is suggested that it would be preferable to carry out a test based on the product moment correlation coefficient. State the distributional assumption required for such a test to be valid. Explain how a scatter diagram can be used to check whether the distributional assumption is likely to be valid and comment on the validity in this case.
  4. A statistician investigates data over a much longer period and finds that the assumptions for the use of the product moment correlation coefficient are in fact valid. Give the critical region for the test at the \(1 \%\) level, based on a sample of 60 days.
  5. In a different research project, into the correlation between daily temperature and ozone pollution levels, a positive correlation is found. It is argued that this shows that high temperatures cause increased ozone levels. Comment on this claim.
OCR MEI S2 2007 June Q2
19 marks Standard +0.3
2 A medical student is trying to estimate the birth weight of babies using pre-natal scan images. The actual weights, \(x \mathrm {~kg}\), and the estimated weights, \(y \mathrm {~kg}\), of ten randomly selected babies are given in the table below.
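| \(x\) | 2.61 | 2.73 | 2.87 | 2.96 | 3.05 | 3.14 | 3.17 | 3.24 | 3.76 | 4.10 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 3.2 | 2.6 | 3.5 | 3.1 | 2.8 | 2.7 | 3.4 | 3.3 | 4.4 | 4.1 |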
  1. Calculate the value of Spearman's rank correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) level to determine whether there is positive association between the student's estimates and the actual birth weights of babies in the underlying population.
  3. Calculate the value of the product moment correlation coefficient of the sample. You may use the following summary statistics in your calculations: $$\Sigma x = 31.63 , \quad \Sigma y = 33.1 , \quad \Sigma x ^ { 2 } = 101.92 , \quad \Sigma y ^ { 2 } = 112.61 , \quad \Sigma x y = 106.51 .$$
  4. Explain why, if the underlying population has a bivariate Normal distribution, it would be preferable to carry out a hypothesis test based on the product moment correlation coefficient. Comment briefly on the significance of the product moment correlation coefficient in relation to that of Spearman's rank correlation coefficient.
OCR S1 2012 January Q4
8 marks Standard +0.8
4
  1. The table gives the heights and masses of 5 people.
    | Person | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) |
    |---|---|---|---|---|---|
    | Height (m) | 1.72 | 1.63 | 1.77 | 1.68 | 1.74 |
    | Mass (kg) | 75 | 62 | 64 | 60 | 70 |
    Calculate Spearman's rank correlation coefficient.
  2. In an art competition the value of Spearman's rank correlation coefficient, \(r _ { s }\), calculated from two judges' rankings was 0.75. A late entry for the competition was received and both judges ranked this entry lower than all the others. By considering the formula for \(r _ { s }\), explain whether the new value of \(r _ { s }\) will be less than 0.75, equal to 0.75, or greater than 0.75.
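One way to set out the reasoning for part 2, assuming the original \(n\) entries were ranked without ties, is to compare the formula before and after the late entry is added:

$$r _ { s } = 1 - \frac { 6 \sum d ^ { 2 } } { n \left( n ^ { 2 } - 1 \right) } = 0.75 \quad \longrightarrow \quad r _ { s } ^ { \prime } = 1 - \frac { 6 \sum d ^ { 2 } } { ( n + 1 ) \left( ( n + 1 ) ^ { 2 } - 1 \right) }$$

Both judges place the late entry last, so its rank difference is \(d = 0\) and the existing ranks are unaffected. Hence \(\sum d ^ { 2 }\) is unchanged (and positive, since \(r _ { s } < 1\)), while the denominator increases. The fraction being subtracted therefore gets smaller, which suggests the new value is greater than 0.75.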
OCR S1 2014 June Q6
5 marks Moderate -0.8
6 Fiona and James collected the results for six hockey teams at the end of the season. They then carried out various calculations using Spearman's rank correlation coefficient, \(r _ { s }\).
  1. Fiona calculated the value of \(r _ { s }\) between the number of goals scored FOR each team and the number of goals scored AGAINST each team. She found that \(r _ { s } = - 1\). Complete the table in the answer book showing the ranks.
    | Team | A | B | C | D | E | F |
    |---|---|---|---|---|---|---|
    | Number of goals FOR (rank) | 1 | 2 | 3 | 4 | 5 | 6 |
    | Number of goals AGAINST (rank) | | | | | | |
  2. James calculated the value of \(r _ { s }\) between the number of goals scored and the number of points gained by the 6 teams. He found the value of \(r _ { s }\) to be 1. He then decided to include the results of another two teams in the calculation of \(r _ { s }\). The table shows the ranks for these two teams.
    | Team | G | H |
    |---|---|---|
    | Number of goals scored (rank) | 7 | 8 |
    | Number of points gained (rank) | 8 | 7 |
    Calculate the value of \(r _ { s }\) for all 8 teams.
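A sketch of the arithmetic for part 2, assuming the ranks of the original six teams are unchanged by the two additions: since \(r _ { s } = 1\) for those six teams, each has \(d = 0\), and teams G and H each contribute \(d ^ { 2 } = 1\).

$$\sum d ^ { 2 } = 6 \times 0 + ( 7 - 8 ) ^ { 2 } + ( 8 - 7 ) ^ { 2 } = 2 , \qquad r _ { s } = 1 - \frac { 6 \times 2 } { 8 \left( 8 ^ { 2 } - 1 \right) } = 1 - \frac { 12 } { 504 } = \frac { 41 } { 42 } \approx 0.976$$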
OCR MEI S2 2009 January Q1
20 marks Moderate -0.3
1 A researcher is investigating whether there is a relationship between the population size of cities and the average walking speed of pedestrians in the city centres. Data for the population size, \(x\) thousands, and the average walking speed of pedestrians, \(y \mathrm {~m} \mathrm {~s} ^ { - 1 }\), of eight randomly selected cities are given in the table below.
| \(x\) | 18 | 43 | 52 | 94 | 98 | 206 | 784 | 1530 |
|---|---|---|---|---|---|---|---|---|
| \(y\) | 1.15 | 0.97 | 1.26 | 1.35 | 1.28 | 1.42 | 1.32 | 1.64 |
  1. Calculate the value of Spearman's rank correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to determine whether there is any association between population size and average walking speed.
    In another investigation, the researcher selects a random sample of six adult males of particular ages and measures their maximum walking speeds. The data are shown in the table below, where \(t\) years is the age of the adult and \(w \mathrm {~m} \mathrm {~s} ^ { - 1 }\) is the maximum walking speed. Also shown are summary statistics and a scatter diagram on which the regression line of \(w\) on \(t\) is drawn.
    | \(t\) | 20 | 30 | 40 | 50 | 60 | 70 |
    |---|---|---|---|---|---|---|
    | \(w\) | 2.49 | 2.41 | 2.38 | 2.14 | 1.97 | 2.03 |
    $$n = 6 \quad \Sigma t = 270 \quad \Sigma w = 13.42 \quad \Sigma t ^ { 2 } = 13900 \quad \Sigma w ^ { 2 } = 30.254 \quad \Sigma t w = 584.6$$
    [diagram]
  3. Calculate the equation of the regression line of \(w\) on \(t\).
  4. (A) Use this equation to calculate an estimate of maximum walking speed of an 80-year-old male.
    (B) Explain why it might not be appropriate to use the equation to calculate an estimate of maximum walking speed of a 10-year-old male.
OCR MEI S2 2012 January Q1
17 marks Standard +0.3
1 Nine long-distance runners are starting an exercise programme to improve their strength. During the first session, each of them has to do a 100 metre run and to do as many push-ups as possible in one minute. The times taken for the run, together with the number of push-ups each runner achieves, are shown in the table.
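| Runner | A | B | C | D | E | F | G | H | I |
|---|---|---|---|---|---|---|---|---|---|
| 100 metre time (seconds) | 13.2 | 11.6 | 10.9 | 12.3 | 14.7 | 13.1 | 11.7 | 13.6 | 12.4 |
| Push-ups achieved | 32 | 42 | 22 | 36 | 41 | 27 | 37 | 38 | 33 |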
  1. Draw a scatter diagram to illustrate the data.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a hypothesis test at the \(5 \%\) significance level to examine whether there is any association between time taken for the run and number of push-ups achieved.
  4. Under what circumstances is it appropriate to carry out a hypothesis test based on the product moment correlation coefficient? State, with a reason, which test is more appropriate for these data.
OCR MEI S2 2010 June Q1
16 marks Standard +0.3
1 Two celebrities judge a talent contest. Each celebrity gives a score out of 20 to each of a random sample of 8 contestants. The scores, \(x\) and \(y\), given by the celebrities to each contestant are shown below.
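| Contestant | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| \(x\) | 6 | 17 | 9 | 20 | 13 | 15 | 11 | 14 |
| \(y\) | 6 | 13 | 10 | 11 | 9 | 7 | 12 | 15 |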
  1. Calculate the value of Spearman's rank correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to determine whether there is positive association between the scores allocated by the two celebrities.
  3. State the distributional assumption required for a test based on the product moment correlation coefficient. Sketch a scatter diagram of the scores above, and discuss whether it appears that the assumption is likely to be valid.
OCR MEI S2 2014 June Q1
18 marks Standard +0.3
1 A medical student is investigating the claim that young adults with high diastolic blood pressure tend to have high systolic blood pressure. The student measures the diastolic and systolic blood pressures of a random sample of ten young adults. The data are shown in the table and illustrated in the scatter diagram.
| Diastolic blood pressure | 60 | 61 | 62 | 63 | 73 | 76 | 84 | 87 | 90 | 95 |
|---|---|---|---|---|---|---|---|---|---|---|
| Systolic blood pressure | 98 | 121 | 118 | 114 | 108 | 112 | 132 | 130 | 134 | 139 |

[diagram]
  1. Calculate the value of Spearman's rank correlation coefficient for these data.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to examine whether there is positive association between diastolic blood pressure and systolic blood pressure in the population of young adults.
  3. Explain why, in the light of the scatter diagram, it might not be valid to carry out a test based on the product moment correlation coefficient.
    The product moment correlation coefficient between the diastolic and systolic blood pressures of a random sample of 10 athletes is 0.707.
  4. Carry out a hypothesis test at the \(1 \%\) significance level to investigate whether there appears to be positive correlation between these two variables in the population of athletes. You may assume that in this case such a test is valid.
OCR MEI S2 2016 June Q1
18 marks Standard +0.3
1 A researcher believes that there may be negative association between the quantity of fertiliser used and the percentage of the population who live in rural areas in different countries. The data below show the percentage of the population who live in rural areas and the fertiliser use measured in kg per hectare, for a random sample of 11 countries.
Percentage of population33658358169617747117
Fertiliser use764466831071765137157
  1. Draw a scatter diagram to illustrate the data.
  2. Explain why it might not be valid to carry out a test based on the product moment correlation coefficient in this case.
  3. Calculate the value of Spearman's rank correlation coefficient.
  4. Carry out a hypothesis test at the \(1 \%\) significance level to investigate the researcher's belief.
  5. Explain the meaning of ' \(1 \%\) significance level'.
  6. In order to carry out a test based on Spearman's rank correlation coefficient, what modelling assumptions, if any, are required about the underlying distribution?
OCR Further Statistics AS 2022 June Q2
7 marks Standard +0.3
2 Eight runners took part in two races. The positions in which the runners finished in the two races are shown in the table.
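| Runner | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| First race | 3 | 1 | 5 | 6 | 2 | 8 | 7 | 4 |
| Second race | 4 | 3 | 8 | 7 | 2 | 5 | 6 | 1 |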
Test at the 5\% significance level whether those runners who do better in one race tend to do better in the other.
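The test can be sketched as follows, taking \(\mathrm { H } _ { 0 }\): no association between the finishing positions, against \(\mathrm { H } _ { 1 }\): positive association. The positions are already ranks, so \(d\) is computed directly. The constant `CRITICAL_VALUE` is the tabulated 1-tail 5% value for \(n = 8\) (0.6429, as listed in the critical-values excerpt quoted in the 2024 June Q5 question of this collection); the variable names are illustrative.

```python
first_race = [3, 1, 5, 6, 2, 8, 7, 4]
second_race = [4, 3, 8, 7, 2, 5, 6, 1]

n = len(first_race)
# Finishing positions are ranks already, so d is just the difference in position.
d_squared = sum((a - b) ** 2 for a, b in zip(first_race, second_race))
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))

CRITICAL_VALUE = 0.6429  # tabulated value: n = 8, 1-tail test, 5% level
reject_h0 = r_s > CRITICAL_VALUE

print(d_squared, round(r_s, 4), reject_h0)  # 34 0.5952 False
```

On this sketch the coefficient falls below the critical value, so there would be insufficient evidence at the 5% level that runners who do better in one race tend to do better in the other.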
OCR Further Statistics AS 2024 June Q5
9 marks Standard +0.3
5 In a fashion competition, two judges gave marks to a large number of contestants. The value of Spearman's rank correlation coefficient, \(\mathrm { r } _ { \mathrm { s } }\), between the marks given to 7 randomly chosen contestants is \(\frac { 27 } { 28 }\).
  1. An excerpt from the table of critical values of \(\mathrm { r } _ { \mathrm { s } }\) is shown below.
    Critical values of Spearman's rank correlation coefficient
    | \(n\) | 1-tail 5\% (2-tail 10\%) | 1-tail 2.5\% (2-tail 5\%) | 1-tail 1\% (2-tail 2\%) | 1-tail 0.5\% (2-tail 1\%) |
    |---|---|---|---|---|
    | 6 | 0.8286 | 0.8857 | 0.9429 | 1.0000 |
    | 7 | 0.7143 | 0.7857 | 0.8929 | 0.9286 |
    | 8 | 0.6429 | 0.7381 | 0.8333 | 0.8810 |
    Test whether there is evidence, at the 1\% significance level, that the judges agree with each other.
    The marks given by the two judges to the 7 randomly chosen contestants were as follows, where \(x\) is an integer.
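    | Contestant | A | B | C | D | E | F | G |
    |---|---|---|---|---|---|---|---|
    | Judge 1 | 64 | 65 | 67 | 78 | 79 | 80 | 86 |
    | Judge 2 | 61 | 63 | 78 | 80 | 81 | 90 | \(x\) |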
  2. Use the value \(\mathrm { r } _ { \mathrm { s } } = \frac { 27 } { 28 }\) to determine the range of possible values of \(x\).
  3. Give a reason why it might be preferable to use the product moment correlation coefficient rather than Spearman's rank correlation coefficient in this context.
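For part 2, a brute-force check can confirm an algebraic answer. The sketch below tries each integer \(x\) and keeps those for which \(r _ { s }\) is exactly \(\frac { 27 } { 28 }\). Two assumptions are made here that are not stated in the question: marks lie between 0 and 100, and \(x\) does not tie with another of Judge 2's marks (the \(\sum d ^ { 2 }\) formula assumes no ties). Exact fractions avoid any rounding issues.

```python
from fractions import Fraction


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Exact value of 1 - 6*sum(d^2) / (n(n^2 - 1)) as a Fraction."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - Fraction(6 * d_squared, n * (n ** 2 - 1))


judge_1 = [64, 65, 67, 78, 79, 80, 86]
judge_2_known = [61, 63, 78, 80, 81, 90]  # marks for contestants A to F
target = Fraction(27, 28)

# Assumed mark range 0..100; values that would tie with a known mark are skipped.
possible_x = [
    x
    for x in range(0, 101)
    if x not in judge_2_known and spearman_rs(judge_1, judge_2_known + [x]) == target
]
print(possible_x)  # [82, 83, 84, 85, 86, 87, 88, 89]
```

This agrees with the observation that \(r _ { s } = \frac { 27 } { 28 }\) with \(n = 7\) requires \(\sum d ^ { 2 } = 2\), which corresponds to exactly one swap of adjacent ranks.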
OCR Further Statistics AS 2020 November Q4
9 marks Standard +0.3
4 After a holiday organised for a group, the company organising the holiday obtained scores out of 10 for six different aspects of the holiday. The company obtained responses from 100 couples and 100 single travellers. The total scores for each of the aspects are given in the following table.
| Aspect | Couples | Single travellers |
|---|---|---|
| Organisation | 884 | 867 |
| Travel | 710 | 633 |
| Food | 692 | 675 |
| Leader | 898 | 898 |
| Included visits | 561 | 736 |
| Optional visits | 683 | 712 |
Fred wishes to test whether there is significant positive correlation between the scores given by the two categories.
  1. Explain why it is probably not appropriate to use Pearson's product-moment correlation coefficient.
  2. Carry out an appropriate test at the \(1 \%\) level.
  3. Explain what is meant by the statement that the test carried out in part (b) is a non-parametric test.
OCR Further Statistics 2019 June Q5
7 marks Standard +0.8
5 Five runners, \(A , B , C , D\) and \(E\), take part in two different races.
Spearman's rank correlation coefficient for the orders in which the runners finish is calculated and a test for positive agreement is carried out at the \(5 \%\) significance level.
  1. State suitable hypotheses for the test.
  2. Find the largest possible value of \(\sum d ^ { 2 }\) for which the result of the test is to reject the null hypothesis.
  3. In the first race, the order in which the five runners finished was: \(A , B , C , D , E\). In the second race, three of the runners finished in the same positions as in the first race. The result of the test is to reject the null hypothesis. Find a possible order for the runners to finish in the second race.
OCR Further Statistics 2023 June Q4
10 marks Standard +0.3
4 Two magazines give numerical ratings to hi-fi systems. Li wishes to test whether there is agreement between the opinions of the magazines. Li chooses a random sample of 5 hi-fi systems and looks up the ratings given by the two magazines. The results are shown in the table.
| System | A | B | C | D | E |
|---|---|---|---|---|---|
| Magazine I | 68 | 75 | 77 | 83 | 92 |
| Magazine II | 30 | 25 | 40 | 35 | 45 |
  1. Give a reason why Li might choose to use a test based on Spearman's rank correlation coefficient rather than on Pearson's product-moment correlation coefficient.
  2. Calculate the value of Spearman's rank correlation coefficient for the data.
  3. Use your answer to part (b) to carry out a hypothesis test at the \(5 \%\) significance level.
  4. The value of Spearman's rank correlation coefficient between the ratings given by magazine I and by a third magazine, magazine III, has the same numerical value as the answer to part (b) but with the sign changed. In the Printed Answer Booklet, complete the table showing the rankings given by magazine III.
OCR Further Statistics 2021 November Q5
10 marks Standard +0.3
5 The numbers of each of 9 items sold in two different supermarkets in a week are given in the following table.
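| Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Supermarket \(A\) | 17 | 28 | 41 | 43 | 62 | 69 | 75 | 93 | 115 |
| Supermarket \(B\) | 24 | 7 | 18 | 12 | 47 | 29 | 58 | 42 | 37 |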
A researcher wants to test whether there is association between the numbers of these items sold in the two supermarkets. However, it is known that the collection of data in Supermarket \(B\) was done inaccurately and each of the numbers in the corresponding row of the table could have been in error by as much as 2 items greater or 2 items fewer.
  1. Explain why Spearman's rank correlation coefficient might be preferred to the use of Pearson's product-moment correlation coefficient in this context.
  2. Carry out the test at the \(5 \%\) significance level using Spearman's rank correlation coefficient.
Edexcel S3 2021 January Q2
9 marks Standard +0.3
2. A teacher believes that those of her students with strong mathematical ability may also have enhanced short-term memory. She shows a random sample of 11 students a tray of different objects for eight seconds and then asks them to write down as many of the objects as they can remember. The results, along with their percentage score in a recent mathematics test, are given in the table below.
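| Student | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) | \(K\) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No. of objects | 8 | 11 | 9 | 15 | 17 | 6 | 10 | 14 | 12 | 13 | 5 |
| \% in maths test | 30 | 62 | 57 | 80 | 75 | 43 | 65 | 51 | 48 | 55 | 32 |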
  1. Calculate Spearman's rank correlation coefficient for these data. Show your working clearly.
  2. Stating your hypotheses clearly, carry out a suitable test to assess the teacher's belief. Use a \(5 \%\) level of significance and state your critical value.
    The teacher shows these results to her class and argues that spending more time trying to improve their short-term memory would improve their mathematical ability.
  3. Explain whether or not you agree with the teacher's argument.
Edexcel S3 2022 January Q3
8 marks Standard +0.3
3. The table shows the time, in seconds, of the fastest qualifying lap for 10 different Formula One racing drivers and their finishing position in the actual race.
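| Driver | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
|---|---|---|---|---|---|---|---|---|---|---|
| Fastest qualifying lap | 62.94 | 63.92 | 63.63 | 62.95 | 63.97 | 63.87 | 64.31 | 64.64 | 65.18 | 64.21 |
| Finishing position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |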
  1. Calculate the value of Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test at the \(1 \%\) level of significance, whether or not there is evidence of a positive correlation between the fastest qualifying lap time and finishing position for these Formula One racing drivers.
Edexcel S3 2023 January Q2
12 marks Standard +0.3
2 The table shows the season's best times, \(x\) seconds, for the 8 athletes who took part in the 200 m final in the 2021 Tokyo Olympics. It also shows their finishing position in the race.
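| Athlete | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| Season's best time | 19.89 | 19.83 | 19.74 | 19.84 | 19.91 | 19.99 | 20.13 | 20.10 |
| Finishing position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |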
Given that the fastest season's best time is ranked number 1
  1. calculate the value of the Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test, at the \(1 \%\) level of significance, whether or not there is evidence of a positive correlation between the rank of the season's best time and the finishing position for these athletes.
    Chris suggests that it would be better to use the actual finishing time, \(y\) seconds, of these athletes rather than their finishing position. Given that $$S _ { x x } = 0.1286875 \quad S _ { y y } = 0.55275 \quad S _ { x y } = 0.225175$$
  3. calculate the product moment correlation coefficient between the season's best time and the finishing time for these athletes.
    Give your answer correct to 3 decimal places.
  4. Use your value of the product moment correlation coefficient to test, at the \(1 \%\) level of significance, whether or not there is evidence of a positive correlation between the season's best time and the finishing time for these athletes.
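For part 3, the product moment correlation coefficient follows from the given summary statistics via \(r = \frac { S _ { x y } } { \sqrt { S _ { x x } S _ { y y } } }\). A minimal sketch of that arithmetic, with illustrative variable names:

```python
from math import sqrt

# Summary statistics quoted in the question.
s_xx = 0.1286875
s_yy = 0.55275
s_xy = 0.225175

# Product moment correlation coefficient r = S_xy / sqrt(S_xx * S_yy).
r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 3))  # 0.844
```

The rounded value would then be compared with the tabulated critical value for \(n = 8\) at the \(1 \%\) level in part 4.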
Edexcel S3 2024 January Q3
12 marks Standard +0.3
  1. The table shows the annual tea consumption, \(t\) (kg/person), and population, \(p\) (millions), for a random sample of 7 European countries.
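| Country | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| Annual tea consumption, \(\boldsymbol { t }\) (kg/person) | 0.27 | 0.15 | 0.42 | 0.06 | 1.94 | 0.78 | 0.44 |
| Population, \(\boldsymbol { p }\) (millions) | 5.4 | 5.8 | 9 | 10.2 | 67.9 | 17.1 | 8.7 |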
$$\text { (You may use } \mathrm { S } _ { t t } = 2.486 \quad \mathrm {~S} _ { p p } = 3026.234 \quad \mathrm {~S} _ { p t } = 83.634 \text { ) }$$
Angela suggests using the product moment correlation coefficient to calculate the correlation between annual tea consumption and population.
  1. Use Angela's suggestion to test, at the \(5 \%\) level of significance, whether or not there is evidence of any correlation between annual tea consumption and population. State your hypotheses clearly and the critical value used.
    Johan suggests using Spearman's rank correlation coefficient to calculate the correlation between the rank of annual tea consumption and the rank of population.
  2. Calculate Spearman's rank correlation coefficient between the rank of annual tea consumption and the rank of population.
  3. Use Johan's suggestion to test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between annual tea consumption and population.
    State your hypotheses clearly and the critical value used.
Edexcel S3 2014 June Q4
12 marks Standard +0.3
4. In a survey 10 randomly selected men had their systolic blood pressure, \(x\), and weight, \(w\), measured. Their results are as follows
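| Man | \(\boldsymbol { A }\) | \(\boldsymbol { B }\) | \(\boldsymbol { C }\) | \(\boldsymbol { D }\) | \(\boldsymbol { E }\) | \(\boldsymbol { F }\) | \(\boldsymbol { G }\) | \(\boldsymbol { H }\) | \(\boldsymbol { I }\) | \(\boldsymbol { J }\) |
|---|---|---|---|---|---|---|---|---|---|---|
| \(x\) | 123 | 128 | 137 | 143 | 149 | 153 | 154 | 159 | 162 | 168 |
| \(w\) | 78 | 93 | 85 | 83 | 75 | 98 | 88 | 87 | 95 | 99 |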
  1. Calculate the value of Spearman's rank correlation coefficient between \(x\) and \(w\).
  2. Stating your hypotheses clearly, test at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight.
    The product moment correlation coefficient for these data is 0.5114.
  3. Use the value of the product moment correlation coefficient to test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight.
  4. Using your conclusions to part (b) and part (c), describe the relationship between systolic blood pressure and weight.
Edexcel S3 2016 June Q1
9 marks Standard +0.3
  1. The table below shows the distance travelled by car and the amount of commission earned by each of 8 salespersons in 2015
| Salesperson | Distance travelled (in 1000's of km) | Commission earned (in $1000's) |
|---|---|---|
| A | 20.4 | 17.7 |
| B | 22.2 | 24.1 |
| C | 29.9 | 20.3 |
| D | 37.8 | 28.3 |
| E | 25.5 | 34.9 |
| F | 30.2 | 29.3 |
| G | 35.3 | 23.6 |
| H | 16.5 | 26.8 |
  1. Find Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between the distance travelled by car and the amount of commission earned.