5.08f Hypothesis test: Spearman rank

95 questions

Edexcel S3 Q4
12 marks Standard +0.3
4. For a project a student collects data on engine size and sales over a period of time for the models of cars made by one particular manufacturer. Her results are shown in the table below.
| Engine capacity (litres) | 1.1 | 1.3 | 1.6 | 2.1 | 2.4 | 2.6 | 2.8 | 3.0 |
|---|---|---|---|---|---|---|---|---|
| Sales | 527 | 632 | 840 | 619 | 350 | 425 | 487 | 401 |
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not there is any evidence of correlation.
  3. Explain why it is more appropriate to use Spearman's rank correlation coefficient for this test than the product moment correlation coefficient.
    (2 marks)
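
As a check on the hand calculation in the first part, here is a minimal Python sketch. It assumes the table reads as eight models with engine capacities 1.1 to 3.0 litres and sales 527 to 401, and that there are no tied values; the helper names `ranks` and `spearman` are illustrative only.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Return (r_s, sum of d^2) using r_s = 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1)), d2


engine = [1.1, 1.3, 1.6, 2.1, 2.4, 2.6, 2.8, 3.0]
sales = [527, 632, 840, 619, 350, 425, 487, 401]

r_s, d2 = spearman(engine, sales)
print(d2, round(r_s, 3))  # 140 -0.667
```

The hypothesis test still needs the critical value for \(n = 8\) from the Spearman tables supplied with the paper; the sketch only produces the test statistic.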
OCR MEI Further Statistics A AS 2018 June Q3
12 marks Standard +0.3
3 Samples of water are taken from 10 randomly chosen wells in an area of a country. A researcher is investigating whether there is any relationship between the levels of dissolved oxygen, \(x\), and the amounts of radium, \(y\), in the water from the wells. Both quantities are measured in suitable units. The table and the scatter diagram in Fig. 3 show the values of \(x\) and \(y\) for the ten wells.
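| \(x\) | 45.9 | 48.3 | 52.2 | 64.6 | 66.6 | 67.6 | 69.3 | 75.0 | 77.4 | 82.8 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 25.4 | 23.9 | 26.6 | 18.8 | 18.9 | 19.0 | 16.8 | 16.3 | 17.8 | 17.2 |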
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e3ac0ba0-9692-4018-894e-2b04b07eaf32-3_865_786_657_635} \captionsetup{labelformat=empty} \caption{Fig. 3}
\end{figure}
  1. Explain why it may not be appropriate to carry out a hypothesis test based on the product moment correlation coefficient.
  2. Calculate Spearman's rank correlation coefficient for these data.
  3. Using this value of Spearman's rank correlation coefficient, carry out a hypothesis test at the 1\% significance level to investigate whether there is any association between \(x\) and \(y\).
  4. Explain the meaning of the term 'significance level' in the context of the test carried out in part (iii).
OCR MEI Further Statistics A AS 2019 June Q5
13 marks Standard +0.3
5 A researcher is investigating births of females and males in a particular species of animal which very often produces litters of 7 offspring.
The table shows some data about the number of females per litter in 200 litters of 7 offspring. The researcher thinks that a binomial distribution \(\mathrm { B } ( 7 , p )\) may be an appropriate model for these data. (c) Complete the test at the \(5 \%\) significance level. Fig. 5 shows the probability distribution \(\mathrm { B } ( 7,0.35 )\) together with the relative frequencies of the observed data (the numbers of litters each divided by 200). \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{fd496303-10f1-450e-bbeb-421ab6f4de21-5_659_1285_342_319} \captionsetup{labelformat=empty} \caption{Fig. 5}
\end{figure} (d) Comment on the result of the test completed in part (c) by considering Fig. 5.
OCR MEI Further Statistics A AS 2022 June Q3
10 marks Standard +0.3
3 A biology student is doing an experiment in which plants are inoculated with a particular microorganism in an attempt to help them grow. She is investigating whether there is any association between the percentage of roots which have been colonised by the microorganism and the dry weight of the plant shoots. After the plants have grown for a few weeks, the student takes a random sample of 10 plants and measures the percentage of roots which have been colonised by the microorganism and the dry weight of the plant shoots. The spreadsheet output shows the data, together with a scatter diagram to illustrate the data. \includegraphics[max width=\textwidth, alt={}, center]{8f1e0c68-a334-4657-823e-386ab0994c02-3_722_1648_635_244}
  1. The student decides that a test based on Pearson's product moment correlation coefficient may not be valid. Explain why she comes to this conclusion.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a test based on this coefficient, at the \(5 \%\) significance level, to investigate whether there is any association between percentage colonisation and shoot dry weight.
OCR MEI Further Statistics A AS Specimen Q6
12 marks Standard +0.3
6 A motorist decides to check the fuel consumption, \(y\) miles per gallon, of her car at particular speeds, \(x \mathrm { mph }\), on flat roads. She carries out the check on a suitable stretch of motorway. Fig. 6 shows her results. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{880026ad-1cd3-40bb-bc87-8dcc94bd9bbd-4_707_1091_1320_477} \captionsetup{labelformat=empty} \caption{Fig. 6}
\end{figure}
  1. Explain why it would not be appropriate to carry out a hypothesis test for correlation based on the product moment correlation coefficient.
  2. (A) One of the results is an outlier. Circle the outlier on the copy of Fig. 6 in the Printed Answer Booklet.
    (B) Suggest one possible reason for the outlier in part (ii) (A) not being used in any analysis.
    The motorist decides to remove this item of data from any analysis. The table below shows part of a spreadsheet that was used to analyse the 14 remaining data items (with the outlier removed). Some rows of the spreadsheet have been deliberately omitted.
    | Data item | \(x\) | \(y\) | \(x ^ { 2 }\) | \(y ^ { 2 }\) | \(x y\) |
    |---|---|---|---|---|---|
    | 1 | 50 | 53.6 | 2500 | 2872.96 | 2680 |
    | 2 | 50 | 53.3 | 2500 | 2840.89 | 2665 |
    | 13 | 70 | 44.8 | 4900 | 2007.04 | 3136 |
    | 14 | 70 | 44.2 | 4900 | 1953.64 | 3094 |
    | Sum | 840 | 686 | 51150 | 33779.7 | 40812 |
  3. Calculate the equation of the regression line of \(y\) on \(x\).
  4. Use the equation of the regression line to predict the fuel consumption of the car at
    (A) 58 mph,
    (B) 30 mph.
  5. Comment on the reliability of your predictions in part (iv).
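
The regression parts of the motorist question can be checked from the spreadsheet totals alone. This is a sketch that assumes the Sum row reads \(\Sigma x = 840\), \(\Sigma y = 686\), \(\Sigma x ^ { 2 } = 51150\) and \(\Sigma x y = 40812\) with \(n = 14\); the variable names and the `predict` helper are illustrative.

```python
n = 14
sum_x, sum_y = 840, 686
sum_x2, sum_xy = 51150, 40812

# Corrected sums of squares and products
s_xx = sum_x2 - sum_x ** 2 / n       # 750.0
s_xy = sum_xy - sum_x * sum_y / n    # -348.0

b = s_xy / s_xx                      # gradient of y on x
a = sum_y / n - b * sum_x / n        # intercept, from the means


def predict(x):
    """Fuel consumption (mpg) predicted by the line y = a + b*x."""
    return a + b * x


print(round(a, 2), round(b, 3))      # 76.84 -0.464
print(round(predict(58), 2), round(predict(30), 2))
```

If the speeds run from 50 to 70 mph, as the visible rows suggest, then 58 mph is an interpolation and 30 mph is an extrapolation, which is the distinction the reliability part is looking for.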
OCR MEI Further Statistics Minor 2022 June Q5
14 marks Standard +0.3
5 A medical researcher is investigating whether there is any relationship between the age of a person and the level of a particular protein in the person's blood. She measures the levels of the protein (measured in suitable units) in a random sample of 12 hospital patients of various ages (in years). The spreadsheet shows the values obtained, together with a scatter diagram which illustrates the data. \includegraphics[max width=\textwidth, alt={}, center]{e8624e9b-5143-49d2-9683-cc3a1082694e-5_736_1470_1087_246}
  1. The researcher decides that a test based on Pearson's product moment correlation coefficient may not be valid. Explain why she comes to this conclusion.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a test based on this coefficient at the \(5 \%\) significance level to investigate whether there is any association between age and protein level.
  4. Explain why the researcher chose a sample that was random.
  5. The researcher had originally intended to use a sample size of 6 rather than the 12 that she actually used. Explain what advantage there is in using the larger sample size.
OCR MEI Further Statistics Minor 2023 June Q6
10 marks Standard +0.3
6 Each competitor in a lumberjacking competition has to perform various disciplines for which they are timed. A spectator thinks that the times for two of the disciplines, chopping wood and sawing wood, are related. The table and the scatter diagram below show the times of a random sample of 8 competitors in these two disciplines.
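| Competitor | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| Sawing | 17.1 | 16.7 | 14.3 | 14.0 | 12.8 | 21.5 | 15.3 | 14.4 |
| Chopping | 23.5 | 20.6 | 21.9 | 18.8 | 21.5 | 24.8 | 19.7 | 19.3 |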
\includegraphics[max width=\textwidth, alt={}, center]{72215d69-c3e6-492d-bb3e-bdc28aeb4613-6_786_1130_708_239}
  1. The spectator decides to carry out a hypothesis test to investigate whether there is any relationship. Explain why the spectator decides that a test based on Pearson's product moment correlation coefficient may not be valid.
  2. Determine the value of Spearman's rank correlation coefficient.
  3. Carry out a hypothesis test at the \(5 \%\) significance level to investigate whether there is positive association between sawing and chopping times.
OCR MEI Further Statistics Minor 2020 November Q5
17 marks Moderate -0.3
5 A student is investigating immunisation. He wonders if there is any relationship between the percentage of young children who have been given measles vaccine and the percentage who have been given BCG vaccine in various countries. He takes a random sample of 8 countries and finds the data for the two variables. The spreadsheet in Fig. 5.1 shows the values obtained, together with a scatter diagram which illustrates the data. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{882f9f3c-40d8-4abb-822a-49bd505a33ea-5_910_1653_541_246} \captionsetup{labelformat=empty} \caption{Fig. 5.1}
\end{figure}
  1. The student decides that a test based on Pearson's product moment correlation coefficient is not valid. Explain why he comes to this conclusion. The student carries out a test based on Spearman's rank correlation coefficient.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a test based on this coefficient at the \(5 \%\) significance level to investigate whether there is any association between measles and BCG vaccination levels. The student then decides to investigate the relationship between number of doctors per 1000 people in a country and unemployment rate in that country (unemployment rate is the percentage of the working age population who are not in paid work). He selects a random sample of 6 countries. The spreadsheet in Fig. 5.2 shows the values obtained, together with a scatter diagram which illustrates the data. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{882f9f3c-40d8-4abb-822a-49bd505a33ea-6_776_1649_495_248} \captionsetup{labelformat=empty} \caption{Fig. 5.2}
    \end{figure}
  4. Use your calculator to write down the equation of the regression line of unemployment rate on doctors per 1000.
  5. Use the regression line to estimate the unemployment rate for a country with 2.00 doctors per 1000.
  6. Comment briefly on the reliability of your answer to part (e). The student decides to add the data for another country with 3.99 doctors per 1000 and unemployment rate 11.42 to his diagram.
  7. Add this point to the scatter diagram in the Printed Answer Booklet.
  8. Without doing any further calculations, comment on what difference, if any, including this extra data point would make to the usefulness of a regression line of unemployment rate on doctors per 1000.
OCR MEI Further Statistics Major 2022 June Q8
14 marks Standard +0.3
8 A swimming coach is investigating whether there is correlation between the times taken by teenage swimmers to swim 50 m Butterfly and 50 m Freestyle. The coach selects a random sample of 11 teenage swimmers and records the times that each of them take for each event. The spreadsheet shows the data, together with a scatter diagram to illustrate the data. \includegraphics[max width=\textwidth, alt={}, center]{77eabbd6-a058-457f-9601-d66f3c2db005-06_712_1465_456_274}
  1. In the scatter diagram, Butterfly times have been plotted on the horizontal axis and Freestyle times on the vertical axis. A student states that the variables should have been plotted the other way around. Explain whether the student is correct. The student decides to carry out a hypothesis test to investigate whether there is any correlation between the times taken for the two events.
  2. Explain why the student decides to carry out a test based on Spearman's rank correlation coefficient.
  3. In this question you must show detailed reasoning. Carry out the test at the 5\% significance level.
  4. The student concludes that there is definitely no correlation between the times. Comment on the student's conclusion.
OCR MEI Further Statistics Major 2024 June Q6
11 marks Standard +0.3
6 A student is investigating the relationship between age and grip strength in adults. The student selects 10 people and records their ages in years and the grip strengths of their dominant hand, measured in kg. The data are shown in the table below, together with a scatter diagram to illustrate the data.
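| Age | 22 | 29 | 36 | 39 | 53 | 57 | 60 | 71 | 76 | 82 |
|---|---|---|---|---|---|---|---|---|---|---|
| Grip strength | 38 | 46 | 42 | 49 | 37 | 47 | 36 | 33 | 34 | 24 |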
\includegraphics[max width=\textwidth, alt={}]{bab116b3-6e5f-44db-ac86-670e4040d649-05_634_1107_641_239}
The student decides to carry out a hypothesis test to investigate whether there is negative association between age and grip strength.
  1. Explain why the student decides to carry out a test based on Spearman's rank correlation coefficient.
  2. State what property of the sample is required in order for it to be valid to carry out a hypothesis test.
  3. In this question you must show detailed reasoning. Assuming that the property in part (b) holds, carry out the test at the \(5 \%\) significance level.
WJEC Further Unit 2 2019 June Q1
7 marks Standard +0.3
  1. Sketch a scatter diagram of a dataset for which Spearman's rank correlation coefficient is +1, but the product moment correlation coefficient is less than 1.
    Two judges were judging cheese at the UK Cheese Festival. There were 8 blue cheeses in a particular category. The rankings are shown below.
    | Cheese | A | B | C | D | E | F | G | H |
    |---|---|---|---|---|---|---|---|---|
    | Judge 1 | 1 | 5 | 8 | 7 | 6 | 4 | 3 | 2 |
    | Judge 2 | 1 | 3 | 8 | 5 | 2 | 4 | 6 | 7 |
  2. Calculate Spearman's rank correlation coefficient for this dataset.
  3. By sketching a scatter diagram of the rankings, or otherwise, comment on the extent to which the judges agree.
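
Because the cheese data are already ranks, the coefficient follows directly from the differences. A short sketch, assuming the rankings read Judge 1: 1, 5, 8, 7, 6, 4, 3, 2 and Judge 2: 1, 3, 8, 5, 2, 4, 6, 7; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

judge1 = [1, 5, 8, 7, 6, 4, 3, 2]
judge2 = [1, 3, 8, 5, 2, 4, 6, 7]

n = len(judge1)
d2 = sum((a - b) ** 2 for a, b in zip(judge1, judge2))

# r_s = 1 - 6*sum(d^2) / (n(n^2 - 1)), valid here because there are no ties
r_s = 1 - Fraction(6 * d2, n * (n * n - 1))

print(d2, r_s, round(float(r_s), 3))  # 58 13/42 0.31
```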
Edexcel FS2 AS 2018 June Q3
12 marks Standard +0.8
  1. The table below shows the heights cleared, in metres, for each of 6 competitors in a high jump competition.
| Competitor | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| Height (m) | 2.05 | 1.93 | 2.02 | 1.96 | 1.81 | 2.02 |
These 6 competitors also took part in a long jump competition and finished in the following order, with C jumping the furthest.
C
A
F
D
B
E
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not there is a positive correlation between results in the high jump and results in the long jump. The product moment correlation coefficient between the height of the high jump and the length of the long jump for each competitor is found to be 0.678
  3. Use this value to test, at the \(5 \%\) level of significance, for evidence of positive correlation between results in the high jump and results in the long jump.
  4. State the condition required for the test in part (c) to be valid.
  5. Explain what your conclusions in part (b) and part (c) suggest about the relationship between results in the high jump and results in the long jump.
Edexcel FS2 AS 2019 June Q1
10 marks Standard +0.3
  1. Bara is investigating whether or not the two judges of a skating competition are in agreement. The two judges gave a score to each of the 8 skaters in the competition as shown in the table below.
| Skater | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| Judge 1 | 71 | 70 | 72 | 62 | 63 | 61 | 57 | 53 |
| Judge 2 | 73 | 71 | 67 | 64 | 62 | 56 | 52 | 53 |
Bara decided to calculate Spearman's rank correlation coefficient for these data.
  1. Calculate Spearman's rank correlation coefficient between the ranks of the two judges.
  2. Test, at the \(1 \%\) level of significance, whether or not the two judges are in agreement. Judge 1 accidentally swapped the scores for skaters \(D\) and \(E\). The score for skater \(D\) should be 63 and the score for skater \(E\) should be 62
  3. Without carrying out any further calculations, explain how Spearman's rank correlation coefficient will change. Give a reason for your answer.
Edexcel FS2 AS 2022 June Q1
7 marks Standard +0.3
  1. Abena and Meghan are both given the same list of 10 films.
Each of them ranks the 10 films from most favourite to least favourite.
For the differences, \(d\), between their ranks for these 10 films, \(\sum d ^ { 2 } = 84\)
  1. Calculate Spearman's rank correlation coefficient between Abena's ranks and Meghan's ranks. A test is carried out at the 5\% level of significance to see if there is agreement between their ranks for the films. The hypotheses for the test are $$\mathrm { H } _ { 0 } : \rho _ { \mathrm { S } } = 0 \quad \mathrm { H } _ { 1 } : \rho _ { \mathrm { S } } > 0$$
    1. Find the critical region for the test.
    2. State the conclusion of the test. An 11th film is added to the list. Abena and Meghan both agree that this film is their least favourite. A new test is carried out at the \(5 \%\) level of significance using the same hypotheses.
  2. Determine the conclusion of this test. You should state the test statistic and the critical value used.
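
The two test statistics in the film-ranking question can be sketched as follows. The only assumption beyond the question is the reasoning that a film ranked 11th by both people contributes \(d = 0\), so \(\sum d ^ { 2 }\) stays at 84 while \(n\) rises to 11. The function name is illustrative.

```python
def r_s_from_d2(d2, n):
    """Spearman's coefficient from the sum of squared rank differences (no ties)."""
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


before = r_s_from_d2(84, 10)   # original list of 10 films
after = r_s_from_d2(84, 11)    # 11th film ranked last by both, so d = 0

print(round(before, 4), round(after, 4))  # 0.4909 0.6182
```

Each value is then compared with the one-tailed 5% critical value for the corresponding \(n\) from the tables provided; those critical values are not reproduced here.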
Edexcel FS2 AS 2023 June Q1
10 marks Standard +0.3
  1. Every applicant for a job at Donala is given three different tasks, \(P , Q\) and \(R\).
For each task the applicant is awarded a score.
The scores awarded to 9 of the applicants, for the tasks \(P\) and \(Q\), are given below.
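| Applicant | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) |
|---|---|---|---|---|---|---|---|---|---|
| Task \(P\) | 19 | 16 | 16 | 12 | 8 | 17 | 12 | 12 | 5 |
| Task \(Q\) | 17 | 11 | 14 | 7 | 6 | 18 | 15 | 11 | 10 |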
  1. Calculate Spearman's rank correlation coefficient for the scores awarded for the tasks \(P\) and \(Q\).
  2. Test, at the \(1 \%\) level of significance, whether or not there is evidence for a positive correlation between the ranks of scores for tasks \(P\) and \(Q\). You should state your hypotheses and critical value clearly. The Spearman's rank correlation coefficient for \(P\) and \(R\) is 0.290 and for \(Q\) and \(R\) is 0.795 The manager of Donala wishes to reduce the number of tasks given to job applicants from three to two.
  3. Giving a reason for your answer, state which 2 tasks you would recommend the manager uses.
Edexcel FS2 AS 2024 June Q2
7 marks Standard +0.3
  1. A random sample of size \(n = 8\) of paired data is taken from a population. The data are plotted below. \includegraphics[max width=\textwidth, alt={}, center]{ba41c616-0805-4466-81b8-b985b0bdd94b-06_572_983_335_541}
Test, at the \(1 \%\) level of significance, whether or not there is evidence of a negative rank correlation between the two variables. You should state your hypotheses and critical value and show your working clearly.
Edexcel FS2 AS Specimen Q1
10 marks Standard +0.3
  1. In a gymnastics competition, two judges scored each of 8 competitors on the vault.
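| Competitor | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| Judge 1's scores | 4.6 | 9.1 | 8.4 | 8.8 | 9.0 | 9.5 | 9.2 | 9.4 |
| Judge 2's scores | 7.8 | 8.8 | 8.6 | 8.5 | 9.1 | 9.6 | 9.0 | 9.3 |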
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test at the \(1 \%\) level of significance, whether or not the two judges are generally in agreement.
  3. Give a reason to support the use of Spearman's rank correlation coefficient in this case. The judges also scored the competitors on the beam.
    Spearman's rank correlation coefficient for their ranks on the beam was found to be 0.952
  4. Compare the judges' ranks on the vault with their ranks on the beam.
Edexcel FS2 2019 June Q8
11 marks Challenging +1.8
8 Nine athletes, \(A , B , C , D , E , F , G , H\) and \(I\), competed in both the 100 m sprint and the long jump. After the two events the positions of each athlete were recorded and Spearman's rank correlation coefficient was calculated and found to be 0.85
  1. Stating your hypotheses clearly, test whether or not there is evidence to suggest that the higher an athlete's position is in the 100 m sprint, the higher their position is in the long jump. Use a \(5 \%\) level of significance. The piece of paper the positions were recorded on was mislaid. Although some of the athletes agreed their positions, there was some disagreement between athletes \(B , C\) and \(D\) over their long jump results. The table shows the results that are agreed to be correct.
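    | Athlete | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) |
    |---|---|---|---|---|---|---|---|---|---|
    | Position in 100 m sprint | 4 | 6 | 7 | 9 | 2 | 8 | 3 | 1 | 5 |
    | Position in long jump | 5 | | | | 4 | 9 | 3 | 1 | 2 |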
    Given that there were no tied ranks,
  2. find the correct positions of athletes \(B , C\) and \(D\) in the long jump. You must show your working clearly and give reasons for your answers.
  3. Without recalculating the coefficient, explain how Spearman's rank correlation coefficient would change if athlete \(H\) was disqualified from both the 100 m sprint and the long jump.
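
The missing long jump positions can be found by trying every possible assignment. The sketch below assumes the agreed long jump positions belong to athletes \(A , E , F , G , H\) and \(I\) (the six athletes not in dispute), and uses \(r _ { s } = 0.85\) with \(n = 9\) to fix the target value of \(\sum d ^ { 2 }\). A written answer still needs the reasoning, not just the search.

```python
from itertools import permutations

sprint = {"A": 4, "B": 6, "C": 7, "D": 9, "E": 2, "F": 8, "G": 3, "H": 1, "I": 5}
long_jump_known = {"A": 5, "E": 4, "F": 9, "G": 3, "H": 1, "I": 2}

n = len(sprint)
# Rearranging r_s = 1 - 6*sum(d^2) / (n(n^2 - 1)) for sum(d^2)
target_d2 = round((1 - 0.85) * n * (n ** 2 - 1) / 6)

unknown = ["B", "C", "D"]
missing = sorted(set(range(1, n + 1)) - set(long_jump_known.values()))

solutions = []
for perm in permutations(missing):
    long_jump = {**long_jump_known, **dict(zip(unknown, perm))}
    d2 = sum((sprint[a] - long_jump[a]) ** 2 for a in sprint)
    if d2 == target_d2:
        solutions.append(dict(zip(unknown, perm)))

print(target_d2, missing, solutions)
```

Under these assumptions the search returns a single assignment, which is consistent with the question's statement that there were no tied ranks.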
Edexcel FS2 2021 June Q1
7 marks Standard +0.3
  1. Anisa is investigating the relationship between marks on a History test and marks on a Geography test. She collects information from 7 students. She wants to calculate the Spearman's rank correlation coefficient for the 7 students so she ranks their performance on each test.
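| Student | History mark | Geography mark | History rank | Geography rank |
|---|---|---|---|---|
| A | 76 | 58 | 1 | 3 |
| B | 70 | 60 | 2 | 2 |
| C | 64 | 57 | \(s\) | \(t\) |
| D | 64 | 63 | \(s\) | 1 |
| E | 64 | 57 | \(s\) | \(t\) |
| F | 59 | 50 | 6 | 7 |
| G | 55 | 52 | 7 | 6 |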
  1. Write down the value of \(s\) and the value of \(t\) The full product moment correlation coefficient (pmcc) formula is used with the ranks to calculate the Spearman's rank correlation coefficient instead of \(r _ { s } = 1 - \frac { 6 \Sigma d ^ { 2 } } { n \left( n ^ { 2 } - 1 \right) }\) and the value obtained is 0.7106 to 4 significant figures.
  2. Explain why the full pmcc formula is used to carry out the calculation.
  3. Stating your hypotheses clearly, test whether or not there is evidence to suggest that the higher a student ranks in the History test, the higher the student ranks in the Geography test. Use a \(5 \%\) level of significance.
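
The effect of the ties in Anisa's data can be illustrated numerically. This sketch assumes the usual convention that tied marks share the mean of the ranks they would otherwise occupy, then applies the product moment formula to the ranks and compares it with the \(\Sigma d ^ { 2 }\) shortcut. The helper names are illustrative.

```python
from math import sqrt


def average_ranks(marks):
    """Rank 1 = highest mark; tied marks share the mean of their positions."""
    ordered = sorted(marks, reverse=True)
    # A mark first appearing at 0-based index i with c copies fills positions
    # i+1 .. i+c, whose mean is i + (c + 1) / 2.
    return [ordered.index(m) + (ordered.count(m) + 1) / 2 for m in marks]


def pmcc(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


history = [76, 70, 64, 64, 64, 59, 55]
geography = [58, 60, 57, 63, 57, 50, 52]

h_ranks = average_ranks(history)     # ties at 64 share one rank
g_ranks = average_ranks(geography)   # ties at 57 share one rank

r_full = pmcc(h_ranks, g_ranks)
n = len(history)
d2 = sum((a - b) ** 2 for a, b in zip(h_ranks, g_ranks))
r_shortcut = 1 - 6 * d2 / (n * (n ** 2 - 1))

print(round(r_full, 4), round(r_shortcut, 4))  # 0.7106 0.7232
```

The full formula reproduces the 0.7106 quoted in the question, while the shortcut gives a different value, which is why the shortcut is not used when ranks are tied.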
Edexcel FS2 2024 June Q2
7 marks Standard +0.3
  1. An estate agent asks customers to rank 7 features of a house, \(A , B , C , D , E , F\) and \(G\), in order of importance. The responses for two randomly selected customers are in the table below.
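| Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Customer 1 | \(A\) | \(E\) | \(C\) | \(F\) | \(G\) | \(B\) | \(D\) |
| Customer 2 | \(E\) | \(F\) | \(C\) | \(G\) | \(A\) | \(D\) | \(B\) |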
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses and critical value clearly, test at the \(5 \%\) level of significance, whether or not the two customers are generally in agreement.
Edexcel FS2 Specimen Q2
9 marks Standard +0.3
  1. A researcher claims that, at a river bend, the water gradually gets deeper as the distance from the inner bank increases. He measures the distance from the inner bank, \(b \mathrm {~cm}\), and the depth of a river, \(s \mathrm {~cm}\), at 7 positions. The results are shown in the table below.
| Position | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| Distance from inner bank \(b\) cm | 100 | 200 | 300 | 400 | 500 | 600 | 700 |
| Depth \(s\) cm | 60 | 75 | 85 | 76 | 110 | 120 | 104 |
The Spearman's rank correlation coefficient between \(b\) and \(s\) is \(\frac { 6 } { 7 }\)
  1. Stating your hypotheses clearly, test whether or not the data provides support for the researcher's claim. Use a \(1 \%\) level of significance.
  2. Without re-calculating the correlation coefficient, explain how the Spearman's rank correlation coefficient would change if
    1. the depth for G is 109 instead of 104
    2. an extra value H with distance from the inner bank of 800 cm and depth 130 cm is included. The researcher decided to collect extra data and found that there were now many tied ranks.
  3. Describe how you would find the correlation with many tied ranks.
OCR FS1 AS 2017 December Q6
9 marks Standard +0.3
6 Arlosh, Sarah and Desi are investigating the ratings given to six different films by two critics.
  1. Arlosh calculates Spearman's rank correlation coefficient \(r _ { s }\) for the critics' ratings. He calculates that \(\Sigma d ^ { 2 } = 72\). Show that this value must be incorrect.
  2. Arlosh checks his working with Sarah, whose answer \(r _ { s } = \frac { 29 } { 35 }\) is correct. Find the correct value of \(\Sigma d ^ { 2 }\).
  3. Carry out an appropriate two-tailed significance test of the value of \(r _ { s }\) at the \(5 \%\) significance level, stating your hypotheses clearly. Each critic gives a score out of 100 to each film. Desi uses these scores to calculate Pearson's product-moment correlation coefficient. She carries out a two-tailed significance test of this value at the \(5 \%\) significance level.
  4. Explain with a reason whether you would expect the conclusion of Desi's test to be the same as the result of the test in part (iii).
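
The first two parts of the film critics question rest on the bound \(r _ { s } \geqslant - 1\). A sketch with exact fractions, assuming six films with no tied ranks; the names `max_d2`, `r_s_claimed` and `d2_correct` are illustrative.

```python
from fractions import Fraction

n = 6
scale = Fraction(6, n * (n ** 2 - 1))   # r_s = 1 - scale * sum(d^2)

# r_s >= -1 forces sum(d^2) <= 2 / scale = n(n^2 - 1) / 3
max_d2 = 2 / scale

# Arlosh's value of 72 would give a coefficient below -1
r_s_claimed = 1 - scale * 72

# Sarah's correct coefficient, rearranged for sum(d^2)
d2_correct = (1 - Fraction(29, 35)) / scale

print(max_d2, r_s_claimed, d2_correct)  # 70 -37/35 6
```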
OCR Further Statistics 2018 March Q8
11 marks Challenging +1.2
8 At a wine-tasting competition, two judges give marks out of 100 to 7 wines as follows.
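| Wine | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) |
|---|---|---|---|---|---|---|---|
| Judge I | 86.3 | 87.5 | 87.6 | 88.8 | 89.4 | 89.9 | 90.5 |
| Judge II | 85.3 | 88.1 | 82.7 | 87.7 | 89.0 | 89.4 | 91.5 |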
  1. A spectator claims that there is a high level of agreement between the rank orders of the marks given by the two judges. Test the spectator's claim at the \(1 \%\) significance level.
  2. A competitor ranks the wines in a random order. The value of Spearman's rank correlation coefficient between the competitor and Judge I is \(r _ { s }\).
    1. Find the probability that \(r _ { s } = 1\).
    2. Show that \(r _ { s }\) cannot take the value \(\frac { 55 } { 56 }\).
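
One possible line of argument for the random-ranking parts, sketched under the assumption that there are no tied ranks: exactly one of the \(7 !\) equally likely orderings matches Judge I, and the rank differences \(d _ { i }\) always sum to zero, which forces \(\Sigma d ^ { 2 }\) to be even.

$$\mathrm { P } \left( r _ { s } = 1 \right) = \frac { 1 } { 7 ! } = \frac { 1 } { 5040 }$$

$$r _ { s } = \frac { 55 } { 56 } \iff \frac { 6 \Sigma d ^ { 2 } } { 7 \left( 7 ^ { 2 } - 1 \right) } = \frac { 1 } { 56 } \iff \Sigma d ^ { 2 } = 1$$

$$\Sigma d _ { i } = 0 \text { and } d _ { i } ^ { 2 } \equiv d _ { i } \pmod 2 \implies \Sigma d ^ { 2 } \equiv 0 \pmod 2$$

Since \(\Sigma d ^ { 2 } = 1\) is odd, no ranking can produce \(r _ { s } = \frac { 55 } { 56 }\).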
OCR FS1 AS 2018 March Q8
8 marks Challenging +1.2
8 In a competition, entrants have to give ranks from 1 to 7 to each of seven resorts. The correct ranks for the resorts are decided by an expert.
  1. One competitor chooses his ranks randomly. By considering all the possible rankings, find the probability that the value of Spearman's rank correlation coefficient \(r _ { s }\) between the competitor's ranks and the expert's ranks is at least \(\frac { 27 } { 28 }\).
  2. Another competitor ranks the seven resorts. A significance test is carried out to test whether there is evidence that this competitor is merely guessing the rank order of the seven resorts. The critical region is \(r _ { s } \geqslant \frac { 27 } { 28 }\). State the significance level of the test.
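
The probability in the resorts question can be confirmed by brute force over all \(7 !\) rankings. This is a sketch that assumes every ranking is equally likely under random guessing; the question itself expects the counting argument rather than a program.

```python
from fractions import Fraction
from itertools import permutations

n = 7
expert = tuple(range(1, n + 1))
threshold = Fraction(27, 28)


def r_s(perm):
    """Spearman's coefficient between the expert's ranks and one ranking."""
    d2 = sum((a - b) ** 2 for a, b in zip(expert, perm))
    return 1 - Fraction(6 * d2, n * (n ** 2 - 1))


all_rankings = list(permutations(expert))
favourable = sum(1 for p in all_rankings if r_s(p) >= threshold)
prob = Fraction(favourable, len(all_rankings))

print(favourable, len(all_rankings), prob)  # 7 5040 1/720
```

The seven favourable rankings are the exact match plus the six that swap one adjacent pair. Read as a one-tailed test with this critical region, the same fraction is the significance level, roughly 0.14%.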
OCR S1 Q8
13 marks Moderate -0.3
8 The table shows the population, \(x\) million, of each of nine countries in Western Europe together with the population, \(y\) million, of its capital city.
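| | Germany | United Kingdom | France | Italy | Spain | The Netherlands | Portugal | Austria | Switzerland |
|---|---|---|---|---|---|---|---|---|---|
| \(x\) | 82.1 | 59.2 | 59.1 | 56.7 | 39.2 | 15.9 | 9.9 | 8.1 | 7.3 |
| \(y\) | 3.5 | 7.0 | 9.0 | 2.7 | 2.9 | 0.8 | 0.7 | 1.6 | 0.1 |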
$$\left[ n = 9 , \Sigma x = 337.5 , \Sigma x ^ { 2 } = 18959.11 , \Sigma y = 28.3 , \Sigma y ^ { 2 } = 161.65 , \Sigma x y = 1533.76 . \right]$$
  1. (a) Calculate Spearman's rank correlation coefficient, \(r _ { s }\).
    (b) Explain what your answer indicates about the populations of these countries and their capital cities.
  2. Calculate the product moment correlation coefficient, \(r\). The data are illustrated in the scatter diagram. \includegraphics[max width=\textwidth, alt={}, center]{11316ea6-3999-4003-b77d-bee8b547c1da-09_936_881_1162_632}
  3. By considering the diagram, state the effect on the value of the product moment correlation coefficient, \(r\), if the data for France and the United Kingdom were removed from the calculation.
  4. In a certain country in Africa, most people live in remote areas and hence the population of the country is unknown. However, the population of the capital city is known to be approximately 1 million. An official suggests that the population of this country could be estimated by using a regression line drawn on the above scatter diagram.
    (a) State, with a reason, whether the regression line of \(y\) on \(x\) or the regression line of \(x\) on \(y\) would need to be used.
    (b) Comment on the reliability of such an estimate in this situation.
1 Some observations of bivariate data were made and the equations of the two regression lines were found to be as follows. $$\begin{array} { c c } y \text { on } x : & y = - 0.6 x + 13.0 \\ x \text { on } y : & x = - 1.6 y + 21.0 \end{array}$$
  1. State, with a reason, whether the correlation between \(x\) and \(y\) is negative or positive.
  2. Neither variable is controlled. Calculate an estimate of the value of \(x\) when \(y = 7.0\).
  3. Find the values of \(\bar { x }\) and \(\bar { y }\).
2 A bag contains 5 black discs and 3 red discs. A disc is selected at random from the bag. If it is red it is replaced in the bag. If it is black, it is not replaced. A second disc is now selected at random from the bag. Find the probability that
  1. the second disc is black, given that the first disc was black,
  2. the second disc is black,
  3. the two discs are of different colours.
3 Each of the 7 letters in the word DIVIDED is printed on a separate card. The cards are arranged in a row.
  1. How many different arrangements of the letters are possible?
  2. In how many of these arrangements are all three Ds together? The 7 cards are now shuffled and 2 cards are selected at random, without replacement.
  3. Find the probability that at least one of these 2 cards has D printed on it.
4
  1. The random variable \(X\) has the distribution \(\mathrm { B } ( 25,0.2 )\). Using the tables of cumulative binomial probabilities, or otherwise, find \(\mathrm { P } ( X \geqslant 5 )\).
  2. The random variable \(Y\) has the distribution \(\mathrm { B } ( 10,0.27 )\). Find \(\mathrm { P } ( Y = 3 )\).
  3. The random variable \(Z\) has the distribution \(\mathrm { B } ( n , 0.27 )\). Find the smallest value of \(n\) such that \(\mathrm { P } ( Z \geqslant 1 ) > 0.95\).
5 The probability distribution of a discrete random variable, \(X\), is given in the table.
    | \(x\) | 0 | 1 | 2 | 3 |
    |---|---|---|---|---|
    | \(\mathrm { P } ( X = x )\) | \(\frac { 1 } { 3 }\) | \(\frac { 1 } { 4 }\) | \(p\) | \(q\) |
    It is given that the expectation, \(\mathrm { E } ( X )\), is \(1 \frac { 1 } { 4 }\).
  1. Calculate the values of \(p\) and \(q\).
  2. Calculate the standard deviation of \(X\).