Wilcoxon rank-sum test (Mann-Whitney U test)

A question is of this type if and only if it asks for a test of differences between two independent samples using the Wilcoxon rank-sum or Mann-Whitney U test.

26 questions · Standard +0.4

5.07d Paired vs two-sample: selection
CAIE Further Paper 4 2024 June Q1
7 marks Standard +0.3
1 A college uses two assessments, \(X\) and \(Y\), when interviewing applicants for research posts at the college. These assessments have been used for a large number of applicants this year. The scores for a random sample of 9 applicants who took assessment \(X\) are as follows. $$\begin{array} { l l l l l l l l l } 21.4 & 24.6 & 25.3 & 22.7 & 20.8 & 21.5 & 22.9 & 21.3 & 22.3 \end{array}$$ The scores for a random sample of 10 applicants who took assessment \(Y\) are as follows. $$\begin{array} { l l l l l l l l l l } 20.9 & 23.5 & 24.8 & 21.9 & 23.4 & 24.0 & 23.8 & 24.1 & 25.1 & 25.8 \end{array}$$ The interviewer believes that the population median score from assessment \(X\) is lower than the population median score from assessment \(Y\). Carry out a Wilcoxon rank-sum test, at the \(1 \%\) significance level, to test whether the interviewer's belief is supported by the data.
CAIE Further Paper 4 2021 November Q6
9 marks Standard +0.3
6 The blood cholesterol levels, measured in suitable units, of a random sample of 11 women and a random sample of 12 men are shown below.
Women51552421671522567513798238235
Men3112621703021753202202607235186333
Carry out a Wilcoxon rank-sum test, at the \(5 \%\) significance level, to test whether, on average, there is a difference in cholesterol levels between women and men.
CAIE Further Paper 4 2021 November Q4
8 marks Standard +0.3
4 Applicants for a particular college take a written test when they attend for interview. There are two different written tests, \(A\) and \(B\), and each applicant takes one or the other. The interviewer wants to determine whether the medians of the distribution of marks obtained in the two tests are equal. The marks obtained by a random sample of 8 applicants who took test \(A\) and a random sample of 8 applicants who took test \(B\) are as follows.
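Test \(A\): \(46 \quad 32 \quad 29 \quad 12 \quad 33 \quad 18 \quad 25 \quad 40\)
Test \(B\): \(36 \quad 28 \quad 49 \quad 37 \quad 48 \quad 35 \quad 41 \quad 31\)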
  1. Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to determine whether there is a difference in the population median marks obtained in the two tests.
    The interviewer considers using the given information to carry out a paired sample \(t\)-test to determine whether there is a difference in the population means for the two tests.
  2. Give two reasons why it is not appropriate to use this test.
CAIE Further Paper 4 2023 November Q5
16 marks Standard +0.8
5 A company is deciding which of two machines, \(X\) and \(Y\), can make a certain type of electrical component more quickly. The times taken, in minutes, to make one component of this type are recorded for a random sample of 8 components made by machine \(X\) and a random sample of 9 components made by machine \(Y\). These times are as follows.
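Machine \(X\): \(4.0 \quad 4.6 \quad 4.7 \quad 4.8 \quad 5.0 \quad 5.2 \quad 5.6 \quad 5.8\)
Machine \(Y\): \(4.5 \quad 4.9 \quad 5.1 \quad 5.3 \quad 5.4 \quad 5.7 \quad 5.9 \quad 6.3 \quad 6.4\)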
The manager claims that on average the time taken by machine \(X\) to make one component is less than that taken by machine \(Y\).
  1. Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to test whether the manager's claim is supported by the data.
  2. Assuming that the times taken to produce the components by the two machines are normally distributed with equal variances, carry out a \(t\)-test at the \(5 \%\) significance level to test whether the manager's claim is supported by the data.
  3. In general, would you expect the conclusions from the tests in parts (a) and (b) to be the same? Give a reason for your answer.
OCR S4 2007 June Q2
7 marks Standard +0.3
2 Of 9 randomly chosen students attending a lecture, 4 were found to be smokers and 5 were nonsmokers. During the lecture their pulse-rates were measured, with the following results in beats per minute.
Smokers: \(77 \quad 85 \quad 90 \quad 98\)
Non-smokers: \(59 \quad 64 \quad 68 \quad 80 \quad 88\)
It may be assumed that these two groups of students were random samples from the student populations of smokers and non-smokers. Using a suitable Wilcoxon test at the \(10 \%\) significance level, test whether there is a difference in the median pulse-rates of the two populations.
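The ranking step for a question like this can be sketched as follows. This is a minimal illustration, not a mark-scheme method: the function names `rank_sums` and `wilcoxon_w` are made up for the example, the data are the four smokers' and five non-smokers' pulse-rates as read from the table above, ties are assumed absent, and the statistic is taken as \(W = \min(R_m, m(m+n+1) - R_m)\), where \(R_m\) is the rank sum of the smaller sample (a common tabulation convention, assumed here).

```python
def rank_sums(sample_a, sample_b):
    """Return (rank sum of sample_a, rank sum of sample_b) in the pooled ordering.

    Assumes no tied values, so the ranks are simply 1, 2, ..., len(a) + len(b).
    """
    pooled = sorted(sample_a + sample_b)
    if len(set(pooled)) != len(pooled):
        raise ValueError("tied values need average ranks, not handled in this sketch")
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    return sum(rank[v] for v in sample_a), sum(rank[v] for v in sample_b)


def wilcoxon_w(smaller, larger):
    """W = min(R_m, m(m + n + 1) - R_m), with R_m the rank sum of the smaller sample."""
    m, n = len(smaller), len(larger)
    r_m, _ = rank_sums(smaller, larger)
    return min(r_m, m * (m + n + 1) - r_m)


smokers = [77, 85, 90, 98]
non_smokers = [59, 64, 68, 80, 88]

print(rank_sums(smokers, non_smokers))   # (27, 18)
print(wilcoxon_w(smokers, non_smokers))  # 13
```

The value of \(W\) would then be compared with the tabulated critical value for \(m = 4\), \(n = 5\) at the required significance level.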
OCR S4 2008 June Q4
7 marks Standard +0.3
4 William takes a bus regularly on the same journey, sometimes in the morning and sometimes in the afternoon. He wishes to compare morning and afternoon journey times. He records the journey times on 7 randomly chosen mornings and 8 randomly chosen afternoons. The results, each correct to the nearest minute, are as follows, where M denotes a morning time and A denotes an afternoon time.
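$$\begin{array} { c c c c c c c c c c c c c c c } \mathrm { M } & \mathrm { A } & \mathrm { A } & \mathrm { M } & \mathrm { M } & \mathrm { M } & \mathrm { M } & \mathrm { M } & \mathrm { M } & \mathrm { A } & \mathrm { A } & \mathrm { A } & \mathrm { A } & \mathrm { A } & \mathrm { A } \\ 19 & 20 & 22 & 24 & 25 & 26 & 28 & 30 & 31 & 33 & 35 & 37 & 38 & 39 & 42 \end{array}$$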
William wishes to test for a difference between the average times of morning and afternoon journeys.
  1. Given that \(s _ { M } ^ { 2 } = 16.5\) and \(s _ { A } ^ { 2 } = 64.5\), with the usual notation, explain why a \(t\)-test is not appropriate in this case.
  2. William chooses a non-parametric test at the \(5 \%\) significance level. Carry out the test, stating the rejection region.
OCR S4 2011 June Q5
11 marks Standard +0.3
5 A test was carried out to compare the breaking strengths of two brands of elastic band, \(A\) and \(B\), of the same size. Random samples of 6 were selected from each brand and the breaking strengths were measured. The results, in suitable units and arranged in ascending order for each brand, are as follows.
Brand \(A\): \(5.6 \quad 8.7 \quad 9.2 \quad 10.7 \quad 11.2 \quad 12.6\)
Brand \(B\): \(10.1 \quad 11.6 \quad 12.0 \quad 12.2 \quad 12.9 \quad 13.5\)
  1. Give one advantage that a non-parametric test might have over a parametric test in this context.
  2. Carry out a suitable Wilcoxon test at the \(5 \%\) significance level of whether there is a difference between the average breaking strengths of the two brands.
  3. An extra elastic band of brand \(B\) was tested and found to have a breaking strength exceeding all of the other 12 bands. Determine whether this information alters the conclusion of your test.
OCR S4 2012 June Q3
9 marks Standard +0.3
3 Because of the large number of students enrolled for a university geography course and the limited accommodation in the lecture theatre, the department provides a filmed lecture. Students are randomly assigned to two groups, one to attend the lecture theatre and the other the film. At the end of term the two groups are given the same examination. The geography professor wishes to test whether there is a difference in the performance of the two groups and selects the marks of two random samples of students, 6 from the group attending the lecture theatre and 7 from the group attending the films. The marks for the two samples, ordered for convenience, are shown below.
Lecture theatre: \(30 \quad 36 \quad 48 \quad 51 \quad 59 \quad 62\)
Filmed lecture: \(40 \quad 49 \quad 52 \quad 56 \quad 63 \quad 64 \quad 68\)
  1. Stating a necessary assumption, carry out a suitable non-parametric test, at the \(10 \%\) significance level, for a difference between the median marks of the two groups.
  2. State conditions under which a two-sample \(t\)-test could have been used.
  3. Assuming that the tests in parts (i) and (ii) are both valid, state with a reason which test would be preferable.
OCR S4 2013 June Q4
10 marks Standard +0.3
4 The effect of water salinity on the growth of a type of grass was studied by a biologist. A random sample of 22 seedlings was divided into two groups \(A\) and \(B\), each of size 11.
Group \(A\) was treated with water of \(0 \%\) salinity and group \(B\) was treated with water of \(0.5 \%\) salinity. After three weeks the height (in cm) of each seedling was measured with the following results, which are ordered for convenience.
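Group \(A\): \(8.6 \quad 9.4 \quad 9.7 \quad 9.8 \quad 10.1 \quad 10.5 \quad 11.0 \quad 11.2 \quad 11.8 \quad 12.7\)
Group \(B\): \(7.4 \quad 8.4 \quad 8.5 \quad 8.8 \quad 9.2 \quad 9.3 \quad 9.5 \quad 9.9 \quad 10.0 \quad 11.1\)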
Jeffery was asked to test whether the two treatments resulted, on average, in a difference in growth. He chose the Wilcoxon rank sum test.
  1. Justify Jeffery's choice of test.
  2. Carry out the test at the \(5 \%\) significance level.
OCR MEI S4 2012 June Q3
24 marks Standard +0.3
3 At an agricultural research station, trials are being made of two fertilisers, A and B, to see whether they differ in their effects on the yield of a crop. Preliminary investigations have established that the underlying variances of the distributions of yields using the two fertilisers may be assumed equal. Scientific analysis of the fertilisers has suggested that fertiliser A may be inferior in that it leads, on the whole, to lower yield. A statistical analysis is being carried out to investigate this. The crop is grown in carefully controlled conditions in 14 experimental plots, 6 with fertiliser A and 8 with fertiliser B. The yields, in kg per plot, are as follows, arranged in ascending order for each fertiliser.
Fertiliser A: \(9.8 \quad 10.2 \quad 10.9 \quad 11.5 \quad 12.7 \quad 13.3\)
Fertiliser B: \(10.8 \quad 11.9 \quad 12.0 \quad 12.2 \quad 12.9 \quad 13.5 \quad 13.6 \quad 13.7\)
  1. Carry out a Wilcoxon rank sum test at the \(5 \%\) significance level to examine appropriate hypotheses.
  2. Carry out a \(t\) test at the \(5 \%\) significance level to examine appropriate hypotheses.
  3. Goodness of fit tests based on more extensive data sets from other trials with these fertilisers have failed to reject hypotheses of underlying Normal distributions. Discuss the relative merits of the analyses in parts (i) and (ii).
OCR S4 2016 June Q2
8 marks Standard +0.3
2 Low density lipoprotein (LDL) cholesterol is known as 'bad' cholesterol.
15 randomly chosen patients, each with an LDL level of 190 mg per decilitre of blood, were given one of two treatments, chosen at random. After twelve weeks their LDL levels, in mg per decilitre, were as follows.
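Treatment \(A\): \(189 \quad 168 \quad 176 \quad 186 \quad 183 \quad 187 \quad 188\)
Treatment \(B\): \(177 \quad 179 \quad 173 \quad 180 \quad 178 \quad 170 \quad 175 \quad 174\)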
Use a Wilcoxon rank sum test, at the \(5 \%\) level of significance, to test whether the LDL levels of patients given treatment \(B\) are lower than the LDL levels of patients given treatment \(A\).
OCR S4 2017 June Q4
12 marks Standard +0.3
4 The heights of eleven randomly selected primary school children are measured. The results, in metres, are
Girls: \(1.48 \quad 1.31 \quad 1.63 \quad 1.38 \quad 1.56 \quad 1.57\)
Boys: \(1.44 \quad 1.35 \quad 1.32 \quad 1.28 \quad 1.27\)
  1. Use a Wilcoxon rank-sum test, at the \(1 \%\) significance level, to test whether primary school girls are taller than primary school boys.
  2. It is decided to repeat the test, using larger random samples. The heights of twenty girls and eighteen boys are measured. Find the greatest value of the test statistic \(W\) which will result in the conclusion that there is evidence, at the \(1 \%\) level of significance, that primary school girls are taller than primary school boys.
OCR S4 2015 June Q6
9 marks Moderate -0.8
6 In a two-tail Wilcoxon rank-sum test, the sample sizes are 13 and 15. The sum of the ranks for the sample of size 13 is 135. Carry out the test at the \(5 \%\) level of significance.
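When only the sample sizes and a rank sum are given, as here, one possible route is the normal approximation to the rank-sum distribution. The sketch below assumes that approximation is intended and applies a continuity correction; the helper name `rank_sum_z` is illustrative only. It uses the standard results that a rank sum \(W\) from the sample of size \(m\) has mean \(\frac{1}{2} m(m+n+1)\) and variance \(\frac{1}{12} mn(m+n+1)\) under the null hypothesis.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_z(m, n, w):
    """Continuity-corrected z-score for a rank sum w from the sample of size m."""
    mean = m * (m + n + 1) / 2
    variance = m * n * (m + n + 1) / 12
    # Move w half a unit towards the mean before standardising.
    correction = 0.5 if w < mean else -0.5 if w > mean else 0.0
    return (w + correction - mean) / sqrt(variance)


# Sample sizes 13 and 15, rank sum 135 for the sample of size 13.
z = rank_sum_z(13, 15, 135)
p_two_tail = 2 * NormalDist().cdf(-abs(z))

print(round(z, 2))        # about -2.44
print(p_two_tail < 0.05)  # True
```

Under these assumptions the mean is 188.5 and the variance is 471.25, so the observed rank sum lies well over two standard deviations below the mean.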
OCR S4 2018 June Q2
8 marks Standard +0.3
2 The distances from home to work, in km, of 8 men and 5 women were recorded and are given below. The workers were chosen at random.
Men: \(4 \quad 7 \quad 10 \quad 13 \quad 16 \quad 17 \quad 20 \quad 21\)
Women: \(1 \quad 2 \quad 14 \quad 18 \quad 22\)
Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to test whether there is a significant difference between the distances from home to work between men and women.
OCR Further Statistics 2019 June Q8
10 marks Standard +0.3
8 A university course was taught by two different professors. Students could choose whether to attend the lectures given by Professor \(Q\) or the lectures given by Professor \(R\). At the end of the course all the students took the same examination. The examination marks of a random sample of 30 students taught by Professor \(Q\) and a random sample of 24 students taught by Professor \(R\) were ranked. The sum of the ranks of the students taught by Professor \(Q\) was 726. Test at the 5\% significance level whether there is a difference in the ranks of the students taught by the two professors.
OCR Further Statistics 2023 June Q5
10 marks Challenging +1.2
5 An historian has reason to believe that the average age at which men got married in the seventeenth century was higher in urban areas compared to rural areas. The historian collected data from a random sample of 8 men in an urban area and a random sample of 6 men in a rural area, all of whom were married in the seventeenth century. The results were as follows, given in the form years/months.
Urban: \(18 / 3\) \(18 / 5\) \(19 / 9\) \(20 / 7\) \(25 / 6\) \(34 / 6\) \(41 / 8\) \(46 / 3\)
Rural: \(18 / 0\) \(18 / 1\) \(18 / 4\) \(19 / 11\) \(22 / 2\) \(28 / 11\)
  1. Use an appropriate non-parametric method to test at the \(5 \%\) significance level whether the average age at marriage of men is higher in urban areas than in rural areas.
  2. When checking the data, the historian found that the age of one of the men, Mr X, which had been recorded as 28/11, had been wrongly recorded. When corrected, the result of the test in part (a) was unchanged. Determine the youngest age that Mr X could have been, given that it was not the same, in years and months, as that of any of the other men in the sample.
OCR Further Statistics 2021 November Q7
12 marks Standard +0.3
7 In a school opinion poll a random sample of 8 pupils were asked to rate school lunches on a scale of 0 to 20. The results were as follows. \(\begin{array} { l l l l l l l l } 0 & 1 & 2 & 3 & 4 & 10 & 11 & 13 \end{array}\) After a new menu was introduced, the test was repeated with a different random sample of 8 pupils. The results were as follows. \(\begin{array} { l l l l l l l l } 7 & 8 & 9 & 14 & 15 & 17 & 19 & 20 \end{array}\)
  1. Carry out an appropriate Wilcoxon test at the \(5 \%\) significance level to test whether pupils' opinions of school lunches have changed.
    A statistics student tells the organisers of the opinion poll that it would have been better to have asked the same 8 pupils both times.
  2. Explain why the statistics student's suggestion would produce a better test.
  3. State which test should be used if the student's suggestion is followed.
  4. You are given that there are 12870 ways in which 8 different integers can be chosen from the integers 1 to 16 inclusive. Estimate the number of ways of selecting 8 different digits between 1 and 16 inclusive that have a sum less than or equal to the critical value used in the test in part (a).
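For part 4, one natural estimate (assuming a two-tailed \(5 \%\) test, so about \(2.5 \%\) in the lower tail) is roughly \(0.025 \times 12870 \approx 322\). As a cross-check, the count can also be found exactly by brute force. The sketch below is illustrative only: the function name `count_rank_sets_at_most` and its `limit` parameter are made up for the example, and the critical value itself must still be read from the tables used in part 1.

```python
from itertools import combinations
from math import comb


def count_rank_sets_at_most(m, total, limit):
    """Count the m-element subsets of {1, ..., total} whose sum is <= limit."""
    return sum(
        1 for ranks in combinations(range(1, total + 1), m) if sum(ranks) <= limit
    )


print(comb(16, 8))                         # 12870 possible sets of 8 ranks
print(count_rank_sets_at_most(8, 16, 36))  # 1, only {1, 2, ..., 8} sums to 36
```

Calling `count_rank_sets_at_most(8, 16, c)` with `c` set to the tabulated critical value gives the exact number of rank sets in the lower tail, which can be compared with the estimate.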
WJEC Further Unit 5 2023 June Q6
7 marks Standard +0.3
6. A triathlon race organiser wishes to know whether competitors who are members of a triathlon club race more frequently than competitors who are not members of a triathlon club. Six competitors from a triathlon club and six competitors who are not members of a triathlon club are selected at random. The table below shows the number of triathlon races they each entered last year.
Club members: 11412537
Not club members: \(2 \quad 9 \quad 4 \quad 0 \quad 8 \quad 6\)
  1. Use a Mann-Whitney U test at a significance level as close to \(5 \%\) as possible to carry out the race organiser's investigation.
  2. Briefly explain why a Wilcoxon signed-rank test is not appropriate in this case.
OCR Further Statistics 2018 September Q8
8 marks Standard +0.3
8 In an experiment to investigate the effect of background music in carrying out work, ten students were each given a task. Five of the students did the task in silence and the other five did the task with background music. The scores on the tasks were as follows.
Silence: \(43 \quad 46 \quad 55 \quad 58 \quad 61\)
Background music: \(19 \quad 31 \quad 38 \quad 52 \quad 70\)
  1. Use a Wilcoxon rank-sum test to test at the 10\% level whether the presence of background music affects scores.
  2. A statistician suggests that the experiment is redesigned so that each student takes one task in silence and another task with background music. The differences in the test scores would then be analysed using a paired-sample method. State an advantage in redesigning the experiment in this way.
WJEC Further Unit 5 2022 June Q3
8 marks Standard +0.3
3. A statistics teacher wants to investigate whether students from the north of a county and students from the south of the same county feel similarly stressed about examinations. The teacher carries out a psychometric test on 10 randomly selected students to give a score between 0 (low stress) and 100 (high stress) to measure their stress levels before a set of examinations. The results are shown in the table below.
Student     Area    Stress Level
Heledd      North   67
Mair        North   55
Hywel       South   26
Gwyn        South   70
Liam        South   36
Marcin      South   57
Gosia       South   32
Kestutas    North   64
Erica       North   60
Tomos       North   22
  1. State one reason why a Mann-Whitney test is appropriate.
  2. Conduct a Mann-Whitney test at a significance level as close to \(5 \%\) as possible. State your conclusion clearly.
  3. How could this investigation be improved?
OCR Further Statistics 2020 November Q3
9 marks Challenging +1.2
Jo can use either of two different routes, A or B, for her journey to school. She believes that route A has shorter journey times. She measures how long her journey takes for 17 journeys by route A and 12 journeys by route B. She ranks the 29 journeys in increasing order of time taken, and she finds that the sum of the ranks of the journeys by route B is 219.
  1. Test at the 10\% significance level whether route A has shorter journey times than route B. [8]
  2. State an assumption about the 29 journeys which is necessary for the conclusion of the test to be valid. [1]
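One way to set up part 1 is sketched below. It assumes that the normal approximation with a continuity correction is intended (an assumption, since the source does not say which method is expected) and writes \(W\) for the rank sum of the 12 route B journeys, so that large values of \(W\) support the belief that route A is quicker.

$$\mathrm{E}(W) = \tfrac{1}{2} \times 12 \times (12 + 17 + 1) = 180, \qquad \mathrm{Var}(W) = \tfrac{1}{12} \times 12 \times 17 \times (12 + 17 + 1) = 510, \qquad z = \frac{218.5 - 180}{\sqrt{510}} \approx 1.70$$

Under these assumptions the one-tailed \(10 \%\) critical value is approximately 1.28, so a value of about 1.70 would fall in the rejection region.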
WJEC Further Unit 5 2019 June Q7
7 marks Standard +0.3
Nathan believes that shearers from Wales can shear more sheep, on average, in a given time than shearers from New Zealand. He takes a random sample of 8 shearers from Wales and 7 shearers from New Zealand. The numbers below indicate how many sheep were sheared in 45 minutes by the 15 shearers.
Wales: \(60 \quad 53 \quad 42 \quad 38 \quad 37 \quad 36 \quad 31 \quad 28\)
New Zealand: \(39 \quad 35 \quad 27 \quad 26 \quad 17 \quad 16 \quad 15\)
Use a Mann-Whitney U test at the 1\% significance level to test whether Nathan is correct. You must state your hypotheses clearly and state the critical region. [7]
WJEC Further Unit 5 2024 June Q6
6 marks Standard +0.8
Alana is a PhD student researching language acquisition. She gives one group of randomly selected participants, Group A, 4 minutes to memorise 40 words that are similar in meaning. She gives a different, randomly selected group of participants, Group B, 4 minutes to memorise 40 words that are different in meaning. Alana believes that the students in Group B will do better than the students in Group A. The following results are the number of words recalled on testing the students from the two groups.
Group A32824161020221823212614
Group B302911253836281217
Conduct a Mann-Whitney U test at a significance level as close as possible to 5\% to test Alana's belief. [6]
WJEC Further Unit 5 Specimen Q3
9 marks Challenging +1.2
A motoring organisation wishes to determine whether or not the petrol consumption of two different car models A and B are the same. A trial is therefore carried out in which 6 cars of each model are given 10 litres of petrol and driven at a predetermined speed around a track until the petrol is used up. The distances travelled, in miles, are shown below.
Model A: \(86.3 \quad 84.2 \quad 85.8 \quad 83.1 \quad 84.7 \quad 85.3\)
Model B: \(84.9 \quad 85.9 \quad 84.8 \quad 86.5 \quad 85.2 \quad 85.5\)
It is proposed to use a test with significance level 5% based on the Mann-Whitney statistic \(U\).
  1. State suitable hypotheses. [2]
  2. Find the critical region for the test. [3]
  3. Determine the value of \(U\) for the above data and state your conclusion in context. You must justify your answer. [4]
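The value of \(U\) in part 3 can be checked by direct pair counting. The sketch below is a minimal illustration with a made-up function name, `mann_whitney_u`; it assumes no tied distances (true for the data given) and defines \(U\) for a sample as the number of pairs in which that sample's value exceeds the other sample's value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Number of (a, b) pairs with a > b; assumes no tied values."""
    return sum(1 for a in sample_a for b in sample_b if a > b)


model_a = [86.3, 84.2, 85.8, 83.1, 84.7, 85.3]
model_b = [84.9, 85.9, 84.8, 86.5, 85.2, 85.5]

u_a = mann_whitney_u(model_a, model_b)
u_b = mann_whitney_u(model_b, model_a)

print(u_a, u_b)  # 12 24
```

The two counts sum to \(6 \times 6 = 36\), which is a useful arithmetic check. Which of the two is compared with the critical region depends on how the tables being used define \(U\).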
OCR Further Statistics 2021 June Q5
10 marks Standard +0.8
A university course was taught by two different professors. Students could choose whether to attend the lectures given by Professor \(Q\) or the lectures given by Professor \(R\). At the end of the course all the students took the same examination. The examination marks of a random sample of 30 students taught by Professor \(Q\) and a random sample of 24 students taught by Professor \(R\) were ranked. The sum of the ranks of the students taught by Professor \(Q\) was 726. Test at the 5% significance level whether there is a difference in the ranks of the students taught by the two professors. [10]