5.08e Spearman rank correlation

107 questions

Edexcel S3 2013 June Q2
8 marks Standard +0.3
2. The table below shows the number of students per member of staff and the student satisfaction scores for 7 universities.
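
| University | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of students per member of staff | 14.2 | 13.1 | 13.3 | 11.7 | 10.5 | 15.9 | 10.8 |
| Student satisfaction score | 4.1 | 4.2 | 3.8 | 4.0 | 3.9 | 4.3 | 3.7 |
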
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not there is evidence of a correlation between the number of students per member of staff and the student satisfaction score.
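The calculation in part (a) can be sketched in Python. This is a minimal illustration, not a mark-scheme method: it assumes the table reads 14.2, 13.1, 13.3, 11.7, 10.5, 15.9, 10.8 and 4.1, 4.2, 3.8, 4.0, 3.9, 4.3, 3.7 for universities \(A\) to \(G\), ranks each variable from smallest to largest, and relies on there being no tied values. The helper names `ranks` and `spearman_no_ties` are hypothetical.

```python
def ranks(values):
    """Rank values from smallest (rank 1) to largest; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_no_ties(x, y):
    """Return (r_s, sum of d^2) using r_s = 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1)), d_squared


# Universities A..G, values as read from the table in the question
students_per_staff = [14.2, 13.1, 13.3, 11.7, 10.5, 15.9, 10.8]
satisfaction = [4.1, 4.2, 3.8, 4.0, 3.9, 4.3, 3.7]

r_s, d_squared = spearman_no_ties(students_per_staff, satisfaction)
print(d_squared, round(r_s, 3))  # 20 0.643
```

Ranking both variables in the same direction matters: reversing one of them would flip the sign of the coefficient.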
Edexcel S3 2014 June Q1
11 marks Standard +0.3
  1. A journalist is investigating factors which influence people when they buy a new car. One possible factor is fuel efficiency. The journalist randomly selects 8 car models. Each model's annual sales and fuel efficiency, in km/litre, are shown in the table below.
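
| Car model | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Annual sales | 1800 | 5400 | 18100 | 7100 | 9300 | 4800 | 12200 | 10700 |
| Fuel efficiency | 5.2 | 18.6 | 14.8 | 13.2 | 18.3 | 11.9 | 16.5 | 17.7 |
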
  1. Calculate Spearman's rank correlation coefficient for these data.
    The journalist believes that car models with higher fuel efficiency will achieve higher sales.
  2. Stating your hypotheses clearly, test whether or not the data support the journalist's belief. Use a \(5 \%\) level of significance.
  3. State the assumption necessary for a product moment correlation coefficient to be valid in this case.
  4. The mean and median fuel efficiencies of the car models in the random sample are 14.5 km/litre and 15.65 km/litre respectively. Considering these statistics, as well as the distribution of the fuel efficiency data, state whether or not the data suggest that the assumption in part (c) might be true in this case. Give a reason for your answer. (No further calculations are required.)
Edexcel S3 2014 June Q8
16 marks Standard +0.3
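8. The heights, in metres, and weights, in kilograms, of a random sample of 9 men are shown in the table below.

| Man | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Height \(( x )\) | 1.68 | 1.74 | 1.75 | 1.76 | 1.78 | 1.82 | 1.84 | 1.88 | 1.98 |
| Weight \(( y )\) | 75 | 76 | 100 | 77 | 90 | 95 | 110 | 96 | 120 |
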
  1. Given that \(\mathrm { S } _ { x x } = 0.0632 , \mathrm {~S} _ { y y } = 1957.5556\) and \(\mathrm { S } _ { x y } = 9.3433\) calculate, to 3 decimal places, the product moment correlation coefficient between height and weight for these men.
  2. Use your value of the product moment correlation coefficient to test whether or not there is evidence of a positive correlation between the height and weight of men. Use a \(5 \%\) significance level. State your hypotheses clearly.
    Peter does not know the heights or weights of the 9 men. He is given photographs of them and asked to put them in order of increasing weight. He puts them in the order $$A C E B G D I F H$$
  3. Find, to 3 decimal places, Spearman's rank correlation coefficient between Peter's order and the actual order.
  4. Use your value of Spearman's rank correlation coefficient to test for evidence of Peter's ability to correctly order men, by their weight, from their photographs. Use a 5\% significance level and state your hypotheses clearly.
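Parts (a) and (c) can be sketched as follows. The code assumes the weights row reads 75, 76, 100, 77, 90, 95, 110, 96, 120 for men \(A\) to \(I\), uses \(r = \mathrm{S}_{xy} / \sqrt{\mathrm{S}_{xx} \mathrm{S}_{yy}}\) for the product moment correlation coefficient, and turns each ordering into ranks with 1 for the lightest man. The variable names are illustrative only.

```python
from math import sqrt

# Summary statistics given in the question
s_xx, s_yy, s_xy = 0.0632, 1957.5556, 9.3433
r = s_xy / sqrt(s_xx * s_yy)

# Weights of men A..I as read from the table
weights = dict(zip("ABCDEFGHI", [75, 76, 100, 77, 90, 95, 110, 96, 120]))
actual_order = sorted(weights, key=weights.get)  # lightest first
peter_order = list("ACEBGDIFH")                  # Peter's order from the photographs

# Position in each ordering is the rank (1 = lightest)
actual_rank = {man: i + 1 for i, man in enumerate(actual_order)}
peter_rank = {man: i + 1 for i, man in enumerate(peter_order)}

n = len(weights)
d_squared = sum((actual_rank[m] - peter_rank[m]) ** 2 for m in weights)
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(round(r, 3), d_squared, round(r_s, 3))  # 0.84 70 0.417
```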
Edexcel S3 2015 June Q1
9 marks Standard +0.3
A mobile library has 160 books for children on its records. The librarian believes that books with fewer pages are borrowed more often. He takes a random sample of 10 books for children.
  1. Explain how the librarian should select this random sample.
    (2)
    The librarian ranked the 10 books according to how often they had been borrowed, with 1 for the book borrowed the most and 10 for the book borrowed the least. He also recorded the number of pages in each book. The results are in the table below.

    | Book | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Borrowing rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
    | Number of pages | 50 | 212 | 115 | 80 | 30 | 190 | 356 | 283 | 152 | 317 |

  2. Calculate Spearman's rank correlation coefficient for these data.
  3. Test the librarian's belief using a \(5 \%\) level of significance. State your hypotheses clearly.
Edexcel S3 2017 June Q3
10 marks Standard +0.3
  1. A junior judge is being trained by a senior judge to learn how to assess ice skaters. After the training, the judges each assess 6 ice skaters \(A , B , C , D , E\) and \(F\). They each list them in order of preference with the best ice skater first. The results are shown in the table below.
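
| Rank | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| Senior Judge | \(A\) | \(B\) | \(D\) | \(C\) | \(F\) | \(E\) |
| Junior Judge | \(B\) | \(D\) | \(A\) | \(F\) | \(C\) | \(E\) |
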
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between the rankings of the junior judge and the senior judge. State your hypotheses clearly.
  3. Comment on the effectiveness of the training delivered by the senior judge.
Edexcel S3 2018 June Q1
13 marks Standard +0.3
  1. Phil measures the concentration of a radioactive element, \(c\), and the amount of dissolved solids, \(a\), of 8 random samples of groundwater. His results are shown in the table below.
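
| Sample | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(c\) | 625 | 700 | 650 | 645 | 720 | 600 | 825 | 665 |
| \(a\) | 1.28 | 1.30 | 1.00 | 1.20 | 1.55 | 1.15 | 1.40 | 1.45 |
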
Given that $$\mathrm { S } _ { c c } = 34787.5 \quad \mathrm {~S} _ { a a } = 0.2172875 \quad \mathrm {~S} _ { c a } = 47.7625$$
  1. calculate, to 3 decimal places, the product moment correlation coefficient between the concentration of the radioactive element and the amount of dissolved solids for these groundwater samples.
  2. Use your value of the product moment correlation coefficient to test whether or not there is evidence of a positive correlation between the concentration of this radioactive element and the amount of dissolved solids in groundwater. Use a \(5 \%\) significance level. State your hypotheses clearly.
  3. Calculate, to 3 decimal places, Spearman's rank correlation coefficient between the concentration of the radioactive element and the amount of dissolved solids.
  4. Use your value of Spearman's rank correlation coefficient to test for evidence of a positive correlation between the concentration of the radioactive element and the amount of dissolved solids. Use a \(5 \%\) significance level. State your hypotheses clearly.
  5. Using your conclusions in part (b) and part (d), comment on the possible relationship between these variables.
Edexcel S3 Q5
12 marks Standard +0.3
5. A marathon runner believes that she is more likely to win a medal at her national championships the higher the temperature is on the day of the race. She records the temperature at the start of each of eight races against fields of a similar standard and her finishing position in each race. Her results are shown in the table below.

| Temperature (\(^{\circ}\mathrm{C}\)) | 16 | 9 | 11 | 5 | 7 | 21 | 12 | 15 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Finishing position | 2 | 15 | 5 | 19 | 10 | 4 | 6 | 11 |

  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Using a 5\% level of significance and stating your hypotheses clearly, interpret your result.
    Another runner suggests that she should use her time in each race instead of her finishing position and calculate the product moment correlation coefficient for the data.
  3. Comment on this suggestion.
Edexcel S3 Q4
12 marks Standard +0.3
4. For a project a student collects data on engine size and sales over a period of time for the models of cars made by one particular manufacturer. Her results are shown in the table below.
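
| Engine capacity (litres) | 1.1 | 1.3 | 1.6 | 2.1 | 2.4 | 2.6 | 2.8 | 3.0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sales | 527 | 632 | 840 | 619 | 350 | 425 | 487 | 401 |
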
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not there is any evidence of correlation.
  3. Explain why it is more appropriate to use Spearman's rank correlation coefficient for this test than the product moment correlation coefficient.
    (2 marks)
OCR MEI Further Statistics A AS 2018 June Q3
12 marks Standard +0.3
3 Samples of water are taken from 10 randomly chosen wells in an area of a country. A researcher is investigating whether there is any relationship between the levels of dissolved oxygen, \(x\), and the amounts of radium, \(y\), in the water from the wells. Both quantities are measured in suitable units. The table and the scatter diagram in Fig. 3 show the values of \(x\) and \(y\) for the ten wells.
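
| \(x\) | 45.9 | 48.3 | 52.2 | 64.6 | 66.6 | 67.6 | 69.3 | 75.0 | 77.4 | 82.8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(y\) | 25.4 | 23.9 | 26.6 | 18.8 | 18.9 | 19.0 | 16.8 | 16.3 | 17.8 | 17.2 |
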
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e3ac0ba0-9692-4018-894e-2b04b07eaf32-3_865_786_657_635} \captionsetup{labelformat=empty} \caption{Fig. 3}
\end{figure}
  1. Explain why it may not be appropriate to carry out a hypothesis test based on the product moment correlation coefficient.
  2. Calculate Spearman's rank correlation coefficient for these data.
  3. Using this value of Spearman's rank correlation coefficient, carry out a hypothesis test at the 1\% significance level to investigate whether there is any association between \(x\) and \(y\).
  4. Explain the meaning of the term 'significance level' in the context of the test carried out in part (iii).
OCR MEI Further Statistics A AS 2022 June Q3
10 marks Standard +0.3
3 A biology student is doing an experiment in which plants are inoculated with a particular microorganism in an attempt to help them grow. She is investigating whether there is any association between the percentage of roots which have been colonised by the microorganism and the dry weight of the plant shoots. After the plants have grown for a few weeks, the student takes a random sample of 10 plants and measures the percentage of roots which have been colonised by the microorganism and the dry weight of the plant shoots. The spreadsheet output shows the data, together with a scatter diagram to illustrate the data. \includegraphics[max width=\textwidth, alt={}, center]{8f1e0c68-a334-4657-823e-386ab0994c02-3_722_1648_635_244}
  1. The student decides that a test based on Pearson's product moment correlation coefficient may not be valid. Explain why she comes to this conclusion.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a test based on this coefficient, at the \(5 \%\) significance level, to investigate whether there is any association between percentage colonisation and shoot dry weight.
OCR MEI Further Statistics Minor 2022 June Q5
14 marks Standard +0.3
5 A medical researcher is investigating whether there is any relationship between the age of a person and the level of a particular protein in the person's blood. She measures the levels of the protein (measured in suitable units) in a random sample of 12 hospital patients of various ages (in years). The spreadsheet shows the values obtained, together with a scatter diagram which illustrates the data. \includegraphics[max width=\textwidth, alt={}, center]{e8624e9b-5143-49d2-9683-cc3a1082694e-5_736_1470_1087_246}
  1. The researcher decides that a test based on Pearson's product moment correlation coefficient may not be valid. Explain why she comes to this conclusion.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a test based on this coefficient at the \(5 \%\) significance level to investigate whether there is any association between age and protein level.
  4. Explain why the researcher chose a sample that was random.
  5. The researcher had originally intended to use a sample size of 6 rather than the 12 that she actually used. Explain what advantage there is in using the larger sample size.
OCR MEI Further Statistics Minor 2023 June Q6
10 marks Standard +0.3
6 Each competitor in a lumberjacking competition has to perform various disciplines for which they are timed. A spectator thinks that the times for two of the disciplines, chopping wood and sawing wood, are related. The table and the scatter diagram below show the times of a random sample of 8 competitors in these two disciplines.
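
| Competitor | A | B | C | D | E | F | G | H |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sawing | 17.1 | 16.7 | 14.3 | 14.0 | 12.8 | 21.5 | 15.3 | 14.4 |
| Chopping | 23.5 | 20.6 | 21.9 | 18.8 | 21.5 | 24.8 | 19.7 | 19.3 |
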
\includegraphics[max width=\textwidth, alt={}, center]{72215d69-c3e6-492d-bb3e-bdc28aeb4613-6_786_1130_708_239}
  1. The spectator decides to carry out a hypothesis test to investigate whether there is any relationship. Explain why the spectator decides that a test based on Pearson's product moment correlation coefficient may not be valid.
  2. Determine the value of Spearman's rank correlation coefficient.
  3. Carry out a hypothesis test at the \(5 \%\) significance level to investigate whether there is positive association between sawing and chopping times.
OCR MEI Further Statistics Minor 2020 November Q5
17 marks Moderate -0.3
5 A student is investigating immunisation. He wonders if there is any relationship between the percentage of young children who have been given measles vaccine and the percentage who have been given BCG vaccine in various countries. He takes a random sample of 8 countries and finds the data for the two variables. The spreadsheet in Fig. 5.1 shows the values obtained, together with a scatter diagram which illustrates the data. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{882f9f3c-40d8-4abb-822a-49bd505a33ea-5_910_1653_541_246} \captionsetup{labelformat=empty} \caption{Fig. 5.1}
\end{figure}
  1. The student decides that a test based on Pearson's product moment correlation coefficient is not valid. Explain why he comes to this conclusion.
    The student carries out a test based on Spearman's rank correlation coefficient.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a test based on this coefficient at the \(5 \%\) significance level to investigate whether there is any association between measles and BCG vaccination levels.
    The student then decides to investigate the relationship between number of doctors per 1000 people in a country and unemployment rate in that country (unemployment rate is the percentage of the working age population who are not in paid work). He selects a random sample of 6 countries. The spreadsheet in Fig. 5.2 shows the values obtained, together with a scatter diagram which illustrates the data.
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{882f9f3c-40d8-4abb-822a-49bd505a33ea-6_776_1649_495_248} \captionsetup{labelformat=empty} \caption{Fig. 5.2}
    \end{figure}
  4. Use your calculator to write down the equation of the regression line of unemployment rate on doctors per 1000.
  5. Use the regression line to estimate the unemployment rate for a country with 2.00 doctors per 1000.
  6. Comment briefly on the reliability of your answer to part (e).
    The student decides to add the data for another country with 3.99 doctors per 1000 and unemployment rate 11.42 to his diagram.
  7. Add this point to the scatter diagram in the Printed Answer Booklet.
  8. Without doing any further calculations, comment on what difference, if any, including this extra data point would make to the usefulness of a regression line of unemployment rate on doctors per 1000.
OCR MEI Further Statistics Major 2022 June Q8
14 marks Standard +0.3
8 A swimming coach is investigating whether there is correlation between the times taken by teenage swimmers to swim 50 m Butterfly and 50 m Freestyle. The coach selects a random sample of 11 teenage swimmers and records the times that each of them take for each event. The spreadsheet shows the data, together with a scatter diagram to illustrate the data. \includegraphics[max width=\textwidth, alt={}, center]{77eabbd6-a058-457f-9601-d66f3c2db005-06_712_1465_456_274}
  1. In the scatter diagram, Butterfly times have been plotted on the horizontal axis and Freestyle times on the vertical axis. A student states that the variables should have been plotted the other way around. Explain whether the student is correct.
    The student decides to carry out a hypothesis test to investigate whether there is any correlation between the times taken for the two events.
  2. Explain why the student decides to carry out a test based on Spearman's rank correlation coefficient.
  3. In this question you must show detailed reasoning. Carry out the test at the 5\% significance level.
  4. The student concludes that there is definitely no correlation between the times. Comment on the student's conclusion.
OCR MEI Further Statistics Major 2024 June Q6
11 marks Standard +0.3
6 A student is investigating the relationship between age and grip strength in adults. The student selects 10 people and records their ages in years and the grip strengths of their dominant hand, measured in kg. The data are shown in the table below, together with a scatter diagram to illustrate the data.
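
| Age | 22 | 29 | 36 | 39 | 53 | 57 | 60 | 71 | 76 | 82 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Grip strength | 38 | 46 | 42 | 49 | 37 | 47 | 36 | 33 | 34 | 24 |
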
\includegraphics[max width=\textwidth, alt={}]{bab116b3-6e5f-44db-ac86-670e4040d649-05_634_1107_641_239}
The student decides to carry out a hypothesis test to investigate whether there is negative association between age and grip strength.
  1. Explain why the student decides to carry out a test based on Spearman's rank correlation coefficient.
  2. State what property of the sample is required in order for it to be valid to carry out a hypothesis test.
  3. In this question you must show detailed reasoning. Assuming that the property in part (b) holds, carry out the test at the \(5 \%\) significance level.
WJEC Further Unit 2 2019 June Q1
7 marks Standard +0.3
  1. Sketch a scatter diagram of a dataset for which Spearman's rank correlation coefficient is +1, but the product moment correlation coefficient is less than 1.
    Two judges were judging cheese at the UK Cheese Festival. There were 8 blue cheeses in a particular category. The rankings are shown below.

    | Cheese | A | B | C | D | E | F | G | H |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Judge 1 | 1 | 5 | 8 | 7 | 6 | 4 | 3 | 2 |
    | Judge 2 | 1 | 3 | 8 | 5 | 2 | 4 | 6 | 7 |

  2. Calculate Spearman's rank correlation coefficient for this dataset.
  3. By sketching a scatter diagram of the rankings, or otherwise, comment on the extent to which the judges agree.
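The idea behind part (a) can be illustrated numerically: any strictly increasing but non-linear dataset has perfectly agreeing ranks, so Spearman's coefficient is 1, while the product moment correlation coefficient falls below 1. The data below (\(y = x^3\)) are hypothetical and chosen only for illustration, as are the helper names `pmcc` and `ranks`; Spearman's coefficient is computed here as the product moment correlation coefficient of the ranks.

```python
from math import sqrt


def pmcc(x, y):
    """Product moment correlation coefficient S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def ranks(values):
    """Rank values from smallest (rank 1) to largest; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # hypothetical data: increasing but curved

r = pmcc(x, y)                   # measures linear association
r_s = pmcc(ranks(x), ranks(y))   # measures agreement of the ranks
print(round(r, 3), r_s)  # 0.938 1.0
```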
Edexcel FS2 AS 2018 June Q3
12 marks Standard +0.8
  1. The table below shows the heights cleared, in metres, for each of 6 competitors in a high jump competition.

| Competitor | A | B | C | D | E | F |
| --- | --- | --- | --- | --- | --- | --- |
| Height (m) | 2.05 | 1.93 | 2.02 | 1.96 | 1.81 | 2.02 |

These 6 competitors also took part in a long jump competition and finished in the following order, with C jumping the furthest.
C, A, F, D, B, E
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not there is a positive correlation between results in the high jump and results in the long jump.
    The product moment correlation coefficient between the height of the high jump and the length of the long jump for each competitor is found to be 0.678.
  3. Use this value to test, at the \(5 \%\) level of significance, for evidence of positive correlation between results in the high jump and results in the long jump.
  4. State the condition required for the test in part (c) to be valid.
  5. Explain what your conclusions in part (b) and part (c) suggest about the relationship between results in the high jump and results in the long jump.
Edexcel FS2 AS 2019 June Q1
10 marks Standard +0.3
  1. Bara is investigating whether or not the two judges of a skating competition are in agreement. The two judges gave a score to each of the 8 skaters in the competition as shown in the table below.

| Skater | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Judge 1 | 71 | 70 | 72 | 62 | 63 | 61 | 57 | 53 |
| Judge 2 | 73 | 71 | 67 | 64 | 62 | 56 | 52 | 53 |

Bara decided to calculate Spearman's rank correlation coefficient for these data.
  1. Calculate Spearman's rank correlation coefficient between the ranks of the two judges.
  2. Test, at the \(1 \%\) level of significance, whether or not the two judges are in agreement.
    Judge 1 accidentally swapped the scores for skaters \(D\) and \(E\). The score for skater \(D\) should be 63 and the score for skater \(E\) should be 62.
  3. Without carrying out any further calculations, explain how Spearman's rank correlation coefficient will change. Give a reason for your answer.
Edexcel FS2 AS 2020 June Q2
9 marks Standard +0.3
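  1. Mary, Jahil and Dawn are judging the cakes in a village show. They have 5 features to consider and each feature is awarded up to 5 points. The total scores the judges gave each cake are given in the table below.

| Cake | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mary | 19 | 17 | 23 | 10 | 21 | 15 | 12 | 8 | 14 |
| Jahil | 22 | 18 | 21 | 10 | 24 | 20 | 16 | 12 | 15 |
| Dawn | 9 | 11 | 6 | 18 | 9 | 15 | 13 | 20 | 13 |
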
  1. Calculate Spearman's rank correlation coefficient between Mary's scores and Jahil's scores.
  2. Calculate Spearman's rank correlation coefficient between Jahil's scores and Dawn's scores.
    The judges discussed their interpretation of the points system and agreed that the first prize should go to cake \(C\).
  3. Explain how different interpretations of the points system could give rise to the results in part (a) and part (b).
Edexcel FS2 AS 2022 June Q1
7 marks Standard +0.3
  1. Abena and Meghan are both given the same list of 10 films.
Each of them ranks the 10 films from most favourite to least favourite.
For the differences, \(d\), between their ranks for these 10 films, \(\sum d ^ { 2 } = 84\)
  1. Calculate Spearman's rank correlation coefficient between Abena's ranks and Meghan's ranks.
    A test is carried out at the 5\% level of significance to see if there is agreement between their ranks for the films. The hypotheses for the test are $$\mathrm { H } _ { 0 } : \rho _ { \mathrm { S } } = 0 \quad \mathrm { H } _ { 1 } : \rho _ { \mathrm { S } } > 0$$
    1. Find the critical region for the test.
    2. State the conclusion of the test.
    An 11th film is added to the list. Abena and Meghan both agree that this film is their least favourite. A new test is carried out at the \(5 \%\) level of significance using the same hypotheses.
  2. Determine the conclusion of this test. You should state the test statistic and the critical value used.
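As a worked sketch of the arithmetic in this question, substituting \(\sum d^2 = 84\) and \(n = 10\) into the formula for Spearman's rank correlation coefficient gives

$$r_s = 1 - \frac{6 \sum d^2}{n\left(n^2 - 1\right)} = 1 - \frac{6 \times 84}{10 \times 99} = 1 - \frac{504}{990} \approx 0.491$$

For the second test, assume the 11th film is ranked 11th by both Abena and Meghan. Its difference is then \(d = 0\) and the other ranks are unchanged, so \(\sum d^2\) stays at 84 while \(n\) becomes 11:

$$r_s = 1 - \frac{6 \times 84}{11 \times 120} = 1 - \frac{504}{1320} \approx 0.618$$

Each value is then compared with the tabulated one-tailed critical value for the corresponding \(n\).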
Edexcel FS2 AS 2023 June Q1
10 marks Standard +0.3
  1. Every applicant for a job at Donala is given three different tasks, \(P , Q\) and \(R\).
For each task the applicant is awarded a score.
The scores awarded to 9 of the applicants, for the tasks \(P\) and \(Q\), are given below.
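
| Applicant | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Task \(P\) | 19 | 16 | 16 | 12 | 8 | 17 | 12 | 12 | 5 |
| Task \(Q\) | 17 | 11 | 14 | 7 | 6 | 18 | 15 | 11 | 10 |
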
  1. Calculate Spearman's rank correlation coefficient for the scores awarded for the tasks \(P\) and \(Q\).
  2. Test, at the \(1 \%\) level of significance, whether or not there is evidence for a positive correlation between the ranks of scores for tasks \(P\) and \(Q\). You should state your hypotheses and critical value clearly.
    The Spearman's rank correlation coefficient for \(P\) and \(R\) is 0.290 and for \(Q\) and \(R\) is 0.795. The manager of Donala wishes to reduce the number of tasks given to job applicants from three to two.
  3. Giving a reason for your answer, state which 2 tasks you would recommend the manager uses.
Edexcel FS2 AS Specimen Q1
10 marks Standard +0.3
  1. In a gymnastics competition, two judges scored each of 8 competitors on the vault.
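
| Competitor | A | B | C | D | E | F | G | H |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Judge 1's scores | 4.6 | 9.1 | 8.4 | 8.8 | 9.0 | 9.5 | 9.2 | 9.4 |
| Judge 2's scores | 7.8 | 8.8 | 8.6 | 8.5 | 9.1 | 9.6 | 9.0 | 9.3 |
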
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test at the \(1 \%\) level of significance, whether or not the two judges are generally in agreement.
  3. Give a reason to support the use of Spearman's rank correlation coefficient in this case.
    The judges also scored the competitors on the beam. Spearman's rank correlation coefficient for their ranks on the beam was found to be 0.952.
  4. Compare the judges' ranks on the vault with their ranks on the beam.
Edexcel FS2 2019 June Q8
11 marks Challenging +1.8
8 Nine athletes, \(A , B , C , D , E , F , G , H\) and \(I\), competed in both the 100 m sprint and the long jump. After the two events the positions of each athlete were recorded and Spearman's rank correlation coefficient was calculated and found to be 0.85
  1. Stating your hypotheses clearly, test whether or not there is evidence to suggest that the higher an athlete's position is in the 100 m sprint, the higher their position is in the long jump. Use a \(5 \%\) level of significance.
    The piece of paper the positions were recorded on was mislaid. Although some of the athletes agreed their positions, there was some disagreement between athletes \(B , C\) and \(D\) over their long jump results. The table shows the results that are agreed to be correct.

    | Athlete | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Position in 100 m sprint | 4 | 6 | 7 | 9 | 2 | 8 | 3 | 1 | 5 |
    | Position in long jump | 5 |  |  |  | 4 | 9 | 3 | 1 | 2 |

    Given that there were no tied ranks,
  2. find the correct positions of athletes \(B , C\) and \(D\) in the long jump. You must show your working clearly and give reasons for your answers.
  3. Without recalculating the coefficient, explain how Spearman's rank correlation coefficient would change if athlete \(H\) was disqualified from both the 100 m sprint and the long jump.
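Part (b) can be checked by brute force. The sketch below assumes the long jump row of the table reads 5, 4, 9, 3, 1, 2 for athletes \(A, E, F, G, H, I\), with \(B\), \(C\) and \(D\) left blank, so the unused positions are the ones to be shared among them. It converts the given coefficient 0.85 into a required value of \(\sum d^2\) and tries every assignment; the variable names are illustrative.

```python
from itertools import permutations

athletes = "ABCDEFGHI"
sprint = dict(zip(athletes, [4, 6, 7, 9, 2, 8, 3, 1, 5]))
# Agreed long jump positions; B, C and D are the disputed ones
long_jump = {"A": 5, "E": 4, "F": 9, "G": 3, "H": 1, "I": 2}

n = len(athletes)
# r_s = 0.85 fixes the total: sum(d^2) = (1 - r_s) * n * (n^2 - 1) / 6
target = round((1 - 0.85) * n * (n ** 2 - 1) / 6)

# Contribution of the agreed results, and the positions still unused
known = sum((sprint[a] - long_jump[a]) ** 2 for a in long_jump)
missing = sorted(set(range(1, n + 1)) - set(long_jump.values()))

solutions = []
for perm in permutations(missing):
    trial = dict(zip("BCD", perm))
    if known + sum((sprint[a] - trial[a]) ** 2 for a in trial) == target:
        solutions.append(trial)

print(target, known, solutions)
# 18 15 [{'B': 7, 'C': 6, 'D': 8}]
```

A written answer would reach the same assignment by reasoning that the three missing values of \(d^2\) must sum to \(18 - 15 = 3\).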
Edexcel FS2 2021 June Q1
7 marks Standard +0.3
  1. Anisa is investigating the relationship between marks on a History test and marks on a Geography test. She collects information from 7 students. She wants to calculate the Spearman's rank correlation coefficient for the 7 students so she ranks their performance on each test.
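
| Student | History mark | Geography mark | History rank | Geography rank |
| --- | --- | --- | --- | --- |
| A | 76 | 58 | 1 | 3 |
| B | 70 | 60 | 2 | 2 |
| C | 64 | 57 | \(s\) | \(t\) |
| D | 64 | 63 | \(s\) | 1 |
| E | 64 | 57 | \(s\) | \(t\) |
| F | 59 | 50 | 6 | 7 |
| G | 55 | 52 | 7 | 6 |
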
  1. Write down the value of \(s\) and the value of \(t\).
    The full product moment correlation coefficient (pmcc) formula is used with the ranks to calculate the Spearman's rank correlation coefficient instead of \(r _ { s } = 1 - \frac { 6 \Sigma d ^ { 2 } } { n \left( n ^ { 2 } - 1 \right) }\) and the value obtained is 0.7106 to 4 significant figures.
  2. Explain why the full pmcc formula is used to carry out the calculation.
  3. Stating your hypotheses clearly, test whether or not there is evidence to suggest that the higher a student ranks in the History test, the higher the student ranks in the Geography test. Use a \(5 \%\) level of significance.
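The tied-rank calculation described in this question can be sketched as follows. The code assumes the usual convention that tied marks share the mean of the ranks they would otherwise occupy, with rank 1 for the highest mark, and then applies the pmcc formula to the ranks. The helper names `average_ranks` and `pmcc` are hypothetical.

```python
from math import sqrt


def average_ranks(marks):
    """Rank 1 = highest mark; tied marks share the mean of the ranks they occupy."""
    ordered = sorted(marks, reverse=True)
    result = []
    for m in marks:
        first = ordered.index(m) + 1   # first rank occupied by this mark
        count = ordered.count(m)       # number of students sharing the mark
        result.append(first + (count - 1) / 2)
    return result


def pmcc(x, y):
    """Product moment correlation coefficient S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    return s_xy / sqrt(s_xx * s_yy)


history = [76, 70, 64, 64, 64, 59, 55]    # students A..G
geography = [58, 60, 57, 63, 57, 50, 52]

h_rank = average_ranks(history)
g_rank = average_ranks(geography)
print(h_rank)  # [1.0, 2.0, 4.0, 4.0, 4.0, 6.0, 7.0]
print(g_rank)  # [3.0, 2.0, 4.5, 1.0, 4.5, 7.0, 6.0]
print(round(pmcc(h_rank, g_rank), 4))  # 0.7106
```

The result agrees with the value 0.7106 quoted in the question, which supports the average-rank reading of \(s\) and \(t\).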
Edexcel FS2 2024 June Q2
7 marks Standard +0.3
  1. An estate agent asks customers to rank 7 features of a house, \(A , B , C , D , E , F\) and \(G\), in order of importance. The responses for two randomly selected customers are in the table below.
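
| Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Customer 1 | \(A\) | \(E\) | \(C\) | \(F\) | \(G\) | \(B\) | \(D\) |
| Customer 2 | \(E\) | \(F\) | \(C\) | \(G\) | \(A\) | \(D\) | \(B\) |
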
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses and critical value clearly, test at the \(5 \%\) level of significance, whether or not the two customers are generally in agreement.