5.08g Compare: Pearson vs Spearman

19 questions

OCR S1 2006 June Q6
10 marks Standard +0.3
6 The table shows the total distance travelled, in thousands of miles, and the amount of commission earned, in thousands of pounds, by each of seven sales agents in 2005.
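| Agent | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) |
|---|---|---|---|---|---|---|---|
| Distance travelled | 18 | 15 | 12 | 14 | 16 | 24 | 13 |
| Commission earned | 18 | 45 | 19 | 24 | 27 | 22 | 23 |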
  1. (a) Calculate Spearman's rank correlation coefficient, \(r _ { s }\), for these data.
    (b) Comment briefly on your value of \(r _ { s }\) with reference to this context.
    (c) After these data were collected, agent \(A\) found that he had made a mistake. He had actually travelled 19000 miles in 2005. State, with a reason, but without further calculation, whether the value of Spearman's rank correlation coefficient will increase, decrease or stay the same. The agents were asked to indicate their level of job satisfaction during 2005. A score of 0 represented no job satisfaction, and a score of 10 represented high job satisfaction. Their scores, \(y\), together with the data for distance travelled, \(x\), are illustrated in the scatter diagram below.
    [diagram]
  2. For this scatter diagram, what can you say about the value of
    (a) Spearman's rank correlation coefficient,
    (b) the product moment correlation coefficient?
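As a check on the rank calculation in part 1(a) above, here is a minimal Python sketch. It assumes ranks run from smallest (rank 1) to largest and that there are no tied values, which holds for this table. The helper names `ranks` and `spearman` are illustrative only.

```python
def ranks(values):
    """Rank values from smallest (rank 1) to largest; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's r_s via 1 - 6*sum(d^2) / (n(n^2 - 1)); valid only without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Data from the table, agents A to G
distance = [18, 15, 12, 14, 16, 24, 13]      # thousands of miles
commission = [18, 45, 19, 24, 27, 22, 23]    # thousands of pounds

r_s = spearman(distance, commission)
print(round(r_s, 4))
```

Ranking both variables in the same direction matters; reversing one of them would flip the sign of the result.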
OCR S1 2016 June Q4
8 marks Moderate -0.3
4 In this question the product moment correlation coefficient is denoted by \(r\) and Spearman's rank correlation coefficient is denoted by \(r _ { s }\).
  1. The scatter diagram in Fig. 1 shows the results of an experiment involving some bivariate data. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{b5ce3230-7528-439c-9e85-ef159a49cba3-4_597_595_434_733} \captionsetup{labelformat=empty} \caption{Fig. 1}
    \end{figure} Write down the value of \(r _ { s }\) for these data.
  2. On the diagram in the Answer Booklet, draw five points such that \(r _ { s } = 1\) and \(r \neq 1\).
  3. The scatter diagram in Fig. 2 shows the results of another experiment involving 5 items of bivariate data. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{b5ce3230-7528-439c-9e85-ef159a49cba3-4_604_608_1484_731} \captionsetup{labelformat=empty} \caption{Fig. 2}
    \end{figure} Calculate the value of \(r _ { s }\).
  4. A random variable \(X\) has the distribution \(\mathrm { B } ( 25,0.6 )\). Find
    1. \(\mathrm { P } ( X \leqslant 14 )\),
    2. \(\mathrm { P } ( X = 14 )\),
    3. \(\quad \operatorname { Var } ( X )\).
    4. A random variable \(Y\) has the distribution \(\mathrm { B } ( 24,0.3 )\). Write down an expression for \(\mathrm { P } ( Y = y )\) and evaluate this probability in the case where \(y = 8\).
    5. A random variable \(Z\) has the distribution \(\mathrm { B } ( 2,0.2 )\). Find the probability that two randomly chosen values of \(Z\) are equal.
      (a) Find the number of ways in which 12 people can be divided into three groups containing 5 people, 4 people and 3 people, without regard to order.
      (b) The diagram shows 7 cards, each with a letter on it. $$\mathrm { A } \mathrm {~A} \mathrm {~A} \mathrm {~B} \text { } \mathrm { B } \text { } \mathrm { R } \text { } \mathrm { R }$$ The 7 cards are arranged in a random order in a straight line.
      1. Find the number of possible arrangements of the 7 letters.
      2. Find the probability that the 7 letters form the name BARBARA. The 7 cards are shuffled. Now 4 of the 7 cards are chosen at random and arranged in a random order in a straight line.
      3. Find the probability that the letters form the word ABBA.
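Part 2 of the Spearman question above (five points with \(r _ { s } = 1\) and \(r \neq 1\)) can be explored numerically. The sketch below uses hypothetical points on the curve \(y = x ^ { 2 }\), which are not taken from the question. Any strictly increasing but non-linear set of points should behave the same way. Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks.

```python
from math import sqrt


def pearson(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank values from smallest (rank 1) to largest; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


# Hypothetical points: strictly increasing, but not on a straight line
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]

r = pearson(xs, ys)                    # below 1, since the points are not collinear
r_s = pearson(ranks(xs), ranks(ys))    # 1, since the rank orders agree exactly
print(round(r, 4), r_s)
```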
OCR Further Statistics 2023 June Q4
10 marks Standard +0.3
4 Two magazines give numerical ratings to hi-fi systems. Li wishes to test whether there is agreement between the opinions of the magazines. Li chooses a random sample of 5 hi-fi systems and looks up the ratings given by the two magazines. The results are shown in the table.
| System | A | B | C | D | E |
|---|---|---|---|---|---|
| Magazine I | 68 | 75 | 77 | 83 | 92 |
| Magazine II | 30 | 25 | 40 | 35 | 45 |
  1. Give a reason why Li might choose to use a test based on Spearman's rank correlation coefficient rather than on Pearson's product-moment correlation coefficient.
  2. Calculate the value of Spearman's rank correlation coefficient for the data.
  3. Use your answer to part (b) to carry out a hypothesis test at the \(5 \%\) significance level.
  4. The value of Spearman's rank correlation coefficient between the ratings given by magazine I and by a third magazine, magazine III, has the same numerical value as the answer to part (b) but with the sign changed. In the Printed Answer Booklet, complete the table showing the rankings given by magazine III.
Edexcel S3 2014 June Q1
11 marks Standard +0.3
  1. A journalist is investigating factors which influence people when they buy a new car. One possible factor is fuel efficiency. The journalist randomly selects 8 car models. Each model's annual sales and fuel efficiency, in km/litre, are shown in the table below.
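| Car model | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| Annual sales | 1800 | 5400 | 18100 | 7100 | 9300 | 4800 | 12200 | 10700 |
| Fuel efficiency | 5.2 | 18.6 | 14.8 | 13.2 | 18.3 | 11.9 | 16.5 | 17.7 |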
  1. Calculate Spearman's rank correlation coefficient for these data. The journalist believes that car models with higher fuel efficiency will achieve higher sales.
  2. Stating your hypotheses clearly, test whether or not the data support the journalist's belief. Use a \(5 \%\) level of significance.
  3. State the assumption necessary for a product moment correlation coefficient to be valid in this case.
  4. The mean and median fuel efficiencies of the car models in the random sample are 14.5 km/litre and 15.65 km/litre respectively. Considering these statistics, as well as the distribution of the fuel efficiency data, state whether or not the data suggest that the assumption in part (c) might be true in this case. Give a reason for your answer. (No further calculations are required.)
OCR MEI Further Statistics A AS Specimen Q6
12 marks Standard +0.3
6 A motorist decides to check the fuel consumption, \(y\) miles per gallon, of her car at particular speeds, \(x \mathrm { mph }\), on flat roads. She carries out the check on a suitable stretch of motorway. Fig. 6 shows her results. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{880026ad-1cd3-40bb-bc87-8dcc94bd9bbd-4_707_1091_1320_477} \captionsetup{labelformat=empty} \caption{Fig. 6}
\end{figure}
  1. Explain why it would not be appropriate to carry out a hypothesis test for correlation based on the product moment correlation coefficient.
  2. (A) One of the results is an outlier. Circle the outlier on the copy of Fig. 6 in the Printed Answer Booklet.
    (B) Suggest one possible reason for the outlier in part (ii) (A) not being used in any analysis. The motorist decides to remove this item of data from any analysis. The table below shows part of a spreadsheet that was used to analyse the 14 remaining data items (with the outlier removed). Some rows of the spreadsheet have been deliberately omitted.
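    | Data item | \(x\) | \(y\) | \(x ^ { 2 }\) | \(y ^ { 2 }\) | \(x y\) |
    |---|---|---|---|---|---|
    | 1 | 50 | 53.6 | 2500 | 2872.96 | 2680 |
    | 2 | 50 | 53.3 | 2500 | 2840.89 | 2665 |
    | 13 | 70 | 44.8 | 4900 | 2007.04 | 3136 |
    | 14 | 70 | 44.2 | 4900 | 1953.64 | 3094 |
    | Sum | 840 | 686 | 51150 | 33779.7 | 40812 |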
  3. Calculate the equation of the regression line of \(y\) on \(x\).
  4. Use the equation of the regression line to predict the fuel consumption of the car at
    (A) 58 mph ,
    (B) 30 mph .
  5. Comment on the reliability of your predictions in part (iv).
OCR MEI Further Statistics Minor 2022 June Q5
14 marks Standard +0.3
5 A medical researcher is investigating whether there is any relationship between the age of a person and the level of a particular protein in the person's blood. She measures the levels of the protein (measured in suitable units) in a random sample of 12 hospital patients of various ages (in years). The spreadsheet shows the values obtained, together with a scatter diagram which illustrates the data. \includegraphics[max width=\textwidth, alt={}, center]{e8624e9b-5143-49d2-9683-cc3a1082694e-5_736_1470_1087_246}
  1. The researcher decides that a test based on Pearson's product moment correlation coefficient may not be valid. Explain why she comes to this conclusion.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a test based on this coefficient at the \(5 \%\) significance level to investigate whether there is any association between age and protein level.
  4. Explain why the researcher chose a sample that was random.
  5. The researcher had originally intended to use a sample size of 6 rather than the 12 that she actually used. Explain what advantage there is in using the larger sample size.
OCR MEI Further Statistics Minor 2023 June Q6
10 marks Standard +0.3
6 Each competitor in a lumberjacking competition has to perform various disciplines for which they are timed. A spectator thinks that the times for two of the disciplines, chopping wood and sawing wood, are related. The table and the scatter diagram below show the times of a random sample of 8 competitors in these two disciplines.
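| Competitor | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| Sawing | 17.1 | 16.7 | 14.3 | 14.0 | 12.8 | 21.5 | 15.3 | 14.4 |
| Chopping | 23.5 | 20.6 | 21.9 | 18.8 | 21.5 | 24.8 | 19.7 | 19.3 |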
\includegraphics[max width=\textwidth, alt={}, center]{72215d69-c3e6-492d-bb3e-bdc28aeb4613-6_786_1130_708_239}
  1. The spectator decides to carry out a hypothesis test to investigate whether there is any relationship. Explain why the spectator decides that a test based on Pearson's product moment correlation coefficient may not be valid.
  2. Determine the value of Spearman's rank correlation coefficient.
  3. Carry out a hypothesis test at the \(5 \%\) significance level to investigate whether there is positive association between sawing and chopping times.
OCR MEI Further Statistics Minor 2020 November Q5
17 marks Moderate -0.3
5 A student is investigating immunisation. He wonders if there is any relationship between the percentage of young children who have been given measles vaccine and the percentage who have been given BCG vaccine in various countries. He takes a random sample of 8 countries and finds the data for the two variables. The spreadsheet in Fig. 5.1 shows the values obtained, together with a scatter diagram which illustrates the data. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{882f9f3c-40d8-4abb-822a-49bd505a33ea-5_910_1653_541_246} \captionsetup{labelformat=empty} \caption{Fig. 5.1}
\end{figure}
  1. The student decides that a test based on Pearson's product moment correlation coefficient is not valid. Explain why he comes to this conclusion. The student carries out a test based on Spearman's rank correlation coefficient.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a test based on this coefficient at the \(5 \%\) significance level to investigate whether there is any association between measles and BCG vaccination levels. The student then decides to investigate the relationship between number of doctors per 1000 people in a country and unemployment rate in that country (unemployment rate is the percentage of the working age population who are not in paid work). He selects a random sample of 6 countries. The spreadsheet in Fig. 5.2 shows the values obtained, together with a scatter diagram which illustrates the data. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{882f9f3c-40d8-4abb-822a-49bd505a33ea-6_776_1649_495_248} \captionsetup{labelformat=empty} \caption{Fig. 5.2}
    \end{figure}
  4. Use your calculator to write down the equation of the regression line of unemployment rate on doctors per 1000.
  5. Use the regression line to estimate the unemployment rate for a country with 2.00 doctors per 1000.
  6. Comment briefly on the reliability of your answer to part (e). The student decides to add the data for another country with 3.99 doctors per 1000 and unemployment rate 11.42 to his diagram.
  7. Add this point to the scatter diagram in the Printed Answer Booklet.
  8. Without doing any further calculations, comment on what difference, if any, including this extra data point would make to the usefulness of a regression line of unemployment rate on doctors per 1000.
Edexcel FS2 AS Specimen Q1
10 marks Standard +0.3
  1. In a gymnastics competition, two judges scored each of 8 competitors on the vault.
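| Competitor | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| Judge 1's scores | 4.6 | 9.1 | 8.4 | 8.8 | 9.0 | 9.5 | 9.2 | 9.4 |
| Judge 2's scores | 7.8 | 8.8 | 8.6 | 8.5 | 9.1 | 9.6 | 9.0 | 9.3 |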
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test at the \(1 \%\) level of significance, whether or not the two judges are generally in agreement.
  3. Give a reason to support the use of Spearman's rank correlation coefficient in this case. The judges also scored the competitors on the beam.
    Spearman's rank correlation coefficient for their ranks on the beam was found to be 0.952.
  4. Compare the judges' ranks on the vault with their ranks on the beam.
OCR FS1 AS 2017 December Q6
9 marks Standard +0.3
6 Arlosh, Sarah and Desi are investigating the ratings given to six different films by two critics.
  1. Arlosh calculates Spearman's rank correlation coefficient \(r _ { s }\) for the critics' ratings. He calculates that \(\Sigma d ^ { 2 } = 72\). Show that this value must be incorrect.
  2. Arlosh checks his working with Sarah, whose answer \(r _ { s } = \frac { 29 } { 35 }\) is correct. Find the correct value of \(\Sigma d ^ { 2 }\).
  3. Carry out an appropriate two-tailed significance test of the value of \(r _ { s }\) at the \(5 \%\) significance level, stating your hypotheses clearly. Each critic gives a score out of 100 to each film. Desi uses these scores to calculate Pearson's product-moment correlation coefficient. She carries out a two-tailed significance test of this value at the \(5 \%\) significance level.
  4. Explain with a reason whether you would expect the conclusion of Desi's test to be the same as the result of the test in part (iii).
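The arithmetic behind parts 1 and 2 above can be sketched with exact fractions. The sketch assumes the six films are ranked without ties, so that \(r _ { s } = 1 - \frac { 6 \Sigma d ^ { 2 } } { n ( n ^ { 2 } - 1 ) }\) applies with \(n = 6\). The function name is illustrative only.

```python
from fractions import Fraction

n = 6  # six films, assumed ranked without ties


def r_s_from_sum_d2(sum_d2, n):
    """Spearman's r_s from the sum of squared rank differences (no ties)."""
    return 1 - Fraction(6 * sum_d2, n * (n ** 2 - 1))


# Part 1: the claimed total gives a coefficient below -1, which is impossible
claimed = r_s_from_sum_d2(72, n)
print(claimed, claimed < -1)

# Part 2: invert the formula for the correct coefficient 29/35
correct_r_s = Fraction(29, 35)
sum_d2 = (1 - correct_r_s) * n * (n ** 2 - 1) / 6
print(sum_d2)
```

Using `Fraction` rather than floats keeps the comparison with \(-1\) and the recovered \(\Sigma d ^ { 2 }\) exact.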
OCR FS1 AS 2018 March Q5
4 marks Easy -1.8
5 The speed \(v \mathrm {~ms} ^ { - 1 }\) of a car at time \(t\) seconds after it starts to accelerate was measured at 1-second intervals. The results are shown in the following diagram. \includegraphics[max width=\textwidth, alt={}, center]{d5843350-52f9-4fed-adf4-86ceb958033f-3_661_1186_1078_443}
  1. State whether \(t\) or \(v\) or neither is a controlled variable. The value of the product moment correlation coefficient \(r\) for the data is 0.987 correct to 3 significant figures.
  2. The speed of the car is converted to miles per hour and the time to minutes. State the value of \(r\) for the converted data.
  3. State the value of Spearman's rank correlation coefficient \(r _ { s }\) for the data.
  4. What information does \(r\) give about the data that is not given by \(r _ { s }\) ?
OCR MEI Further Statistics Major Specimen Q3
11 marks Standard +0.3
3 A researcher is investigating factors that might affect how many hours per day different species of mammals spend asleep. First she investigates human beings. She collects data on body mass index, \(x\), and hours of sleep, \(y\), for a random sample of people. A scatter diagram of the data is shown in Fig. 3.1 together with the regression line of \(y\) on \(x\). \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e6ee3a4a-3e76-4422-9a78-17b64b458f83-04_885_1584_598_274} \captionsetup{labelformat=empty} \caption{Fig. 3.1}
\end{figure}
  1. Calculate the residual for the data point which has the residual with the greatest magnitude.
  2. Use the equation of the regression line to estimate the mean number of hours spent asleep by a person with body mass index
    (A) 26,
    (B) 16,
    commenting briefly on each of your predictions. The researcher then collects additional data for a large number of species of mammals and analyses different factors for effect size. Definitions of the variables measured for a typical animal of the species, the correlations between these variables, and guidelines often used when considering effect size are given in Fig. 3.2.
    | Variable | Definition |
    |---|---|
    | Body mass | Mass of animal in kg |
    | Brain mass | Mass of brain in g |
    | Hours of sleep/day | Number of hours per day spent asleep |
    | Life span | How many years the animal lives |
    | Danger | A measure of how dangerous the animal's situation is when asleep, taking into account predators and how protected the animal's den is: higher value indicates greater danger. |

    | Correlations (pmcc) | Body Mass | Brain Mass | Hours of sleep/day | Life span | Danger |
    |---|---|---|---|---|---|
    | Body Mass | 1.00 | | | | |
    | Brain Mass | 0.93 | 1.00 | | | |
    | Hours of sleep/day | -0.31 | -0.36 | 1.00 | | |
    | Life span | 0.30 | 0.51 | -0.41 | 1.00 | |
    | Danger | 0.13 | 0.15 | -0.59 | 0.06 | 1.00 |

    \begin{table}[h]
    | Product moment correlation coefficient | Effect size |
    |---|---|
    | 0.1 | Small |
    | 0.3 | Medium |
    | 0.5 | Large |
    \captionsetup{labelformat=empty} \caption{Fig. 3.2}
    \end{table}
  3. State two conclusions the researcher might draw from these tables, relevant to her investigation into how many hours mammals spend asleep. One of the researcher's students notices the high correlation between body mass and brain mass and produces a scatter diagram for these two variables, shown in Fig. 3.3 below. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e6ee3a4a-3e76-4422-9a78-17b64b458f83-05_675_698_1802_735} \captionsetup{labelformat=empty} \caption{Fig. 3.3}
    \end{figure}
  4. Comment on the suitability of a linear model for these two variables.
OCR MEI Further Statistics Major Specimen Q6
3 marks Standard +0.3
6 Fig. 6 shows the wages earned in the last 12 months by each of a random sample of American males aged between 16 and 65. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e6ee3a4a-3e76-4422-9a78-17b64b458f83-07_771_1278_340_392} \captionsetup{labelformat=empty} \caption{Fig. 6}
\end{figure} A researcher wishes to test whether the sample provides evidence of a tendency for higher wages to be earned by older men in the age range 16 to 65 in America.
  1. The researcher needs to decide whether to use a test based on Pearson's product moment correlation coefficient or Spearman's rank correlation coefficient. Use the information in Fig. 6 to decide which test is more appropriate.
  2. Should it be a one-tail or a two-tail test? Justify your answer.
Edexcel S3 2009 June Q3
11 marks Standard +0.3
A doctor is interested in the relationship between a person's Body Mass Index (BMI) and their level of fitness. She believes that a lower BMI leads to a greater level of fitness. She randomly selects 10 female 18 year-olds and calculates each individual's BMI. The females then run a race and the doctor records their finishing positions. The results are shown in the table.
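| Individual | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
|---|---|---|---|---|---|---|---|---|---|---|
| BMI | 17.4 | 21.4 | 18.9 | 24.4 | 19.4 | 20.1 | 22.6 | 18.4 | 25.8 | 28.1 |
| Finishing position | 3 | 5 | 1 | 9 | 6 | 4 | 10 | 2 | 7 | 8 |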
  1. Calculate Spearman's rank correlation coefficient for these data. [5]
  2. Stating your hypotheses clearly and using a one tailed test with a 5\% level of significance, interpret your rank correlation coefficient. [5]
  3. Give a reason to support the use of the rank correlation coefficient rather than the product moment correlation coefficient with these data. [1]
Edexcel S3 2016 June Q3
Moderate -0.3
  1. Describe when you would use Spearman's rank correlation coefficient rather than the product moment correlation coefficient to measure the strength of the relationship between two variables. (1) A shop sells sunglasses and ice cream. For one week in the summer the shopkeeper ranked the daily sales of ice cream and sunglasses. The ranks are shown in the table below.
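    | | Sun | Mon | Tues | Weds | Thurs | Fri | Sat |
    |---|---|---|---|---|---|---|---|
    | Ice cream | 6 | 4 | 7 | 5 | 3 | 2 | 1 |
    | Sunglasses | 6 | 5 | 7 | 2 | 3 | 4 | 1 |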
  2. Calculate Spearman's rank correlation coefficient for these data. (3)
  3. Test, at the 5\% level of significance, whether or not there is a positive correlation between sales of ice cream and sales of sunglasses. State your hypotheses clearly. (4) The shopkeeper calculates the product moment correlation coefficient from his raw data and finds \(r = 0.65\)
  4. Using this new coefficient, test, at the 5\% level of significance, whether or not there is a positive correlation between sales of ice cream and sales of sunglasses. (2)
  5. Using your answers to part (c) and part (d), comment on the nature of the relationship between sales of sunglasses and sales of ice cream. (1)
Edexcel S3 Q7
16 marks Standard +0.3
For one of the activities at a gymnastics competition, 8 gymnasts were awarded marks out of 10 for each of artistic performance and technical ability. The results were as follows.
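| Gymnast | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| Technical ability | 8.5 | 8.6 | 9.5 | 7.5 | 6.8 | 9.1 | 9.4 | 9.2 |
| Artistic performance | 6.2 | 7.5 | 8.2 | 6.7 | 6.0 | 7.2 | 8.0 | 9.1 |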
The value of the product moment correlation coefficient for these data is 0.774.
  1. Stating your hypotheses clearly and using a 1% level of significance, interpret this value. [5]
  2. Calculate the value of the rank correlation coefficient for these data. [6]
  3. Stating your hypotheses clearly and using a 1% level of significance, interpret this coefficient. [3]
  4. Explain why the rank correlation coefficient might be the better one to use with these data. [2]
OCR S1 2013 January Q7
7 marks Standard +0.3
  1. Two judges rank \(n\) competitors, where \(n\) is an even number. Judge 2 reverses each consecutive pair of ranks given by Judge 1, as shown.
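    | Competitor | \(C_1\) | \(C_2\) | \(C_3\) | \(C_4\) | \(C_5\) | \(C_6\) | \(\ldots\) | \(C_{n-1}\) | \(C_n\) |
    |---|---|---|---|---|---|---|---|---|---|
    | Judge 1 rank | 1 | 2 | 3 | 4 | 5 | 6 | \(\ldots\) | \(n-1\) | \(n\) |
    | Judge 2 rank | 2 | 1 | 4 | 3 | 6 | 5 | \(\ldots\) | \(n\) | \(n-1\) |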
    Given that the value of Spearman's coefficient of rank correlation is \(\frac{63}{65}\), find \(n\). [4]
  2. An experiment produced some data from a bivariate distribution. The product moment correlation coefficient is denoted by \(r\), and Spearman's rank correlation coefficient is denoted by \(r_s\).
    1. Explain whether the statement $$r = 1 \Rightarrow r_s = 1$$ is true or false. [1]
    2. Use a diagram to explain whether the statement $$r \neq 1 \Rightarrow r_s \neq 1$$ is true or false. [2]
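For part 1 above, each swapped pair gives two rank differences of \(\pm 1\), so \(\Sigma d^2 = n\) and \(r_s = 1 - \frac{6}{n^2 - 1}\). The sketch below checks this by building the two rankings directly and searching even values of \(n\). The search limit of 100 and the function name are arbitrary assumptions, not part of the question.

```python
from fractions import Fraction


def r_s_for_swapped_pairs(n):
    """Spearman's r_s when Judge 2 swaps each consecutive pair of Judge 1's ranks."""
    judge1 = list(range(1, n + 1))
    # Odd ranks move up by one, even ranks move down by one: 2, 1, 4, 3, ...
    judge2 = [r + 1 if r % 2 == 1 else r - 1 for r in judge1]
    sum_d2 = sum((a - b) ** 2 for a, b in zip(judge1, judge2))
    return 1 - Fraction(6 * sum_d2, n * (n ** 2 - 1))


target = Fraction(63, 65)
# Search even n only, since the question states that n is even
n_found = next(n for n in range(2, 101, 2) if r_s_for_swapped_pairs(n) == target)
print(n_found)
```

Because \(r_s\) increases with \(n\) here, at most one value of \(n\) can match the target.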
WJEC Further Unit 2 2023 June Q5
12 marks Standard +0.3
  1. Give two circumstances where it may be more appropriate to use Spearman's rank correlation coefficient rather than Pearson's product moment correlation coefficient. [2]
  2. A farmer needs a new tractor. The tractor salesman selects 6 tractors at random to show the farmer. The farmer ranks these tractors, in order of preference, according to their ability to meet his needs on the farm. The tractor salesman makes a note of the price and power take-off (PTO) of the tractors.
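    | Tractor | Farmer's rank | PTO (horsepower) | Price (£1000s) |
    |---|---|---|---|
    | A | 1 | 77·5 | 80 |
    | B | 6 | 87·9 | 45 |
    | C | 5 | 53·0 | 47 |
    | D | 4 | 41·0 | 53 |
    | E | 2 | 112·0 | 60 |
    | F | 3 | 90·0 | 61 |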
    Spearman's rank correlation coefficient between the farmer's ranks and the price is 0·9429.
    1. Test at the 5% significance level whether there is an association between the price of a tractor and the farmer's judgement of the ability of the tractor to meet his needs on the farm. [4]
    2. Calculate Spearman's rank correlation coefficient between the farmer's rank and PTO. [4]
    3. How should the tractor salesman interpret the results in (i) and (ii)? [2]
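The rank calculations in this question can be sketched in Python as follows. The sketch assumes that the largest price and the largest PTO are each given rank 1, to match the farmer's rank 1 for the preferred tractor. That direction is an assumption, chosen because it reproduces the stated coefficient of 0·9429 for price. The helper names are illustrative only.

```python
def rank_descending(values):
    """Rank values from largest (rank 1) to smallest; assumes no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def spearman_from_ranks(ranks_a, ranks_b):
    """Spearman's r_s from two sets of untied ranks."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Data from the table, tractors A to F
farmer_rank = [1, 6, 5, 4, 2, 3]
pto = [77.5, 87.9, 53.0, 41.0, 112.0, 90.0]   # horsepower
price = [80, 45, 47, 53, 60, 61]              # thousands of pounds

r_price = spearman_from_ranks(farmer_rank, rank_descending(price))
r_pto = spearman_from_ranks(farmer_rank, rank_descending(pto))
print(round(r_price, 4), round(r_pto, 4))
```

Ranking price or PTO in the opposite direction would change only the sign of each coefficient, not its size.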
WJEC Further Unit 2 Specimen Q3
9 marks Standard +0.3
A class of 8 students sit examinations in History and Geography. The marks obtained by these students are given below.
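| Student | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| History mark | 73 | 59 | 83 | 49 | 57 | 82 | 67 | 60 |
| Geography mark | 55 | 51 | 58 | 59 | 44 | 66 | 49 | 67 |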
  1. Calculate Spearman's rank correlation coefficient for this data set. [6]
  2. Hence determine whether or not, at the 5% significance level, there is evidence of a positive association between marks in History and marks in Geography. [2]
  3. Explain why it might not have been appropriate to use Pearson's product moment correlation coefficient to test association using this data set. [1]