Wilcoxon rank-sum test (Mann-Whitney U test)

A question is this type if and only if it asks to test for differences between two independent samples using the Wilcoxon rank-sum or Mann-Whitney U test.

26 questions · Standard +0.3

CAIE Further Paper 4 2024 June Q1
7 marks Standard +0.3
1 A college uses two assessments, \(X\) and \(Y\), when interviewing applicants for research posts at the college. These assessments have been used for a large number of applicants this year. The scores for a random sample of 9 applicants who took assessment \(X\) are as follows. $$\begin{array} { l l l l l l l l l } 21.4 & 24.6 & 25.3 & 22.7 & 20.8 & 21.5 & 22.9 & 21.3 & 22.3 \end{array}$$ The scores for a random sample of 10 applicants who took assessment \(Y\) are as follows. $$\begin{array} { l l l l l l l l l l } 20.9 & 23.5 & 24.8 & 21.9 & 23.4 & 24.0 & 23.8 & 24.1 & 25.1 & 25.8 \end{array}$$ The interviewer believes that the population median score from assessment \(X\) is lower than the population median score from assessment \(Y\). Carry out a Wilcoxon rank-sum test, at the \(1 \%\) significance level, to test whether the interviewer's belief is supported by the data.
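The ranking step for this question can be checked with a short sketch. It pools the two samples and sums the ranks of each one. The function name `rank_sum` is hypothetical, the code assumes there are no tied scores (true for these data), and the critical value still has to be read from the tables supplied with the paper.

```python
# Scores copied from the question above.
x_scores = [21.4, 24.6, 25.3, 22.7, 20.8, 21.5, 22.9, 21.3, 22.3]
y_scores = [20.9, 23.5, 24.8, 21.9, 23.4, 24.0, 23.8, 24.1, 25.1, 25.8]


def rank_sum(sample, other):
    """Sum of the ranks of `sample` in the pooled, ascending ordering.

    Assumes every pooled value is distinct (no ties).
    """
    pooled = sorted(sample + other)
    rank_of = {value: position for position, value in enumerate(pooled, start=1)}
    return sum(rank_of[value] for value in sample)


r_x = rank_sum(x_scores, y_scores)
r_y = rank_sum(y_scores, x_scores)
total = len(x_scores) + len(y_scores)

# The two rank sums together must account for every rank from 1 to 19.
assert r_x + r_y == total * (total + 1) // 2
print(r_x, r_y)  # prints 70 120
```

A small rank sum for \(X\) is what the interviewer's belief would predict, so the lower tail is the relevant one here.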
CAIE Further Paper 4 2021 November Q6
9 marks Standard +0.3
6 The blood cholesterol levels, measured in suitable units, of a random sample of 11 women and a random sample of 12 men are shown below.
Women51552421671522567513798238235
Men3112621703021753202202607235186333
Carry out a Wilcoxon rank-sum test, at the \(5 \%\) significance level, to test whether, on average, there is a difference in cholesterol levels between women and men.
CAIE Further Paper 4 2021 November Q4
8 marks Standard +0.3
4 Applicants for a particular college take a written test when they attend for interview. There are two different written tests, \(A\) and \(B\), and each applicant takes one or the other. The interviewer wants to determine whether the medians of the distribution of marks obtained in the two tests are equal. The marks obtained by a random sample of 8 applicants who took test \(A\) and a random sample of 8 applicants who took test \(B\) are as follows.
Test \(A\): 46 32 29 12 33 18 25 40
Test \(B\): 36 28 49 37 48 35 41 31
  1. Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to determine whether there is a difference in the population median marks obtained in the two tests.
    The interviewer considers using the given information to carry out a paired sample \(t\)-test to determine whether there is a difference in the population means for the two tests.
  2. Give two reasons why it is not appropriate to use this test.
CAIE Further Paper 4 2023 November Q5
16 marks Standard +0.8
5 A company is deciding which of two machines, \(X\) and \(Y\), can make a certain type of electrical component more quickly. The times taken, in minutes, to make one component of this type are recorded for a random sample of 8 components made by machine \(X\) and a random sample of 9 components made by machine \(Y\). These times are as follows.
Machine \(X\): 4.0 4.6 4.7 4.8 5.0 5.2 5.6 5.8
Machine \(Y\): 4.5 4.9 5.1 5.3 5.4 5.7 5.9 6.3 6.4
The manager claims that on average the time taken by machine \(X\) to make one component is less than that taken by machine \(Y\).
  1. Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to test whether the manager's claim is supported by the data.
  2. Assuming that the times taken to produce the components by the two machines are normally distributed with equal variances, carry out a \(t\)-test at the \(5 \%\) significance level to test whether the manager's claim is supported by the data.
  3. In general, would you expect the conclusions from the tests in parts (a) and (b) to be the same? Give a reason for your answer.
OCR S4 2007 June Q2
7 marks Standard +0.3
2 Of 9 randomly chosen students attending a lecture, 4 were found to be smokers and 5 were nonsmokers. During the lecture their pulse-rates were measured, with the following results in beats per minute.
Smokers: 77 85 90 98
Non-smokers: 59 64 68 80 88
It may be assumed that these two groups of students were random samples from the student populations of smokers and non-smokers. Using a suitable Wilcoxon test at the \(10 \%\) significance level, test whether there is a difference in the median pulse-rates of the two populations.
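With samples this small the null distribution of the rank sum can be enumerated directly, which shows where tabulated critical values come from. The sketch below is one way to do that, not the method a mark scheme would expect: it assumes no ties, and it doubles the one-tail probability to get a two-tail figure, which is a common convention rather than something stated in the question.

```python
from itertools import combinations

smokers = [77, 85, 90, 98]
non_smokers = [59, 64, 68, 80, 88]

pooled = sorted(smokers + non_smokers)
# Rank of each value in the pooled sample (no ties in these data).
rank_of = {value: rank for rank, value in enumerate(pooled, start=1)}
observed = sum(rank_of[value] for value in smokers)

# Under H0 every choice of 4 ranks out of 9 is equally likely for the smokers.
all_rank_sums = [
    sum(ranks)
    for ranks in combinations(range(1, len(pooled) + 1), len(smokers))
]
upper_tail = sum(1 for s in all_rank_sums if s >= observed)

p_one_tail = upper_tail / len(all_rank_sums)
print(observed, upper_tail, len(all_rank_sums), round(2 * p_one_tail, 3))
# prints 27 7 126 0.111
```

The doubled probability can then be compared with the \(10 \%\) level named in the question.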
OCR S4 2008 June Q4
7 marks Standard +0.3
4 William takes a bus regularly on the same journey, sometimes in the morning and sometimes in the afternoon. He wishes to compare morning and afternoon journey times. He records the journey times on 7 randomly chosen mornings and 8 randomly chosen afternoons. The results, each correct to the nearest minute, are as follows, where M denotes a morning time and A denotes an afternoon time.
Time of day: M A A M M M M M M A A A A A A
Journey time (minutes): 19 20 22 24 25 26 28 30 31 33 35 37 38 39 42
William wishes to test for a difference between the average times of morning and afternoon journeys.
  1. Given that \(s _ { M } ^ { 2 } = 16.5\) and \(s _ { A } ^ { 2 } = 64.5\), with the usual notation, explain why a \(t\)-test is not appropriate in this case.
  2. William chooses a non-parametric test at the \(5 \%\) significance level. Carry out the test, stating the rejection region.
OCR S4 2011 June Q5
11 marks Standard +0.3
5 A test was carried out to compare the breaking strengths of two brands of elastic band, \(A\) and \(B\), of the same size. Random samples of 6 were selected from each brand and the breaking strengths were measured. The results, in suitable units and arranged in ascending order for each brand, are as follows.
Brand \(A\): 5.6 8.7 9.2 10.7 11.2 12.6
Brand \(B\): 10.1 11.6 12.0 12.2 12.9 13.5
  1. Give one advantage that a non-parametric test might have over a parametric test in this context.
  2. Carry out a suitable Wilcoxon test at the \(5 \%\) significance level of whether there is a difference between the average breaking strengths of the two brands.
  3. An extra elastic band of brand \(B\) was tested and found to have a breaking strength exceeding all of the other 12 bands. Determine whether this information alters the conclusion of your test.
OCR S4 2012 June Q3
9 marks Standard +0.3
3 Because of the large number of students enrolled for a university geography course and the limited accommodation in the lecture theatre, the department provides a filmed lecture. Students are randomly assigned to two groups, one to attend the lecture theatre and the other the film. At the end of term the two groups are given the same examination. The geography professor wishes to test whether there is a difference in the performance of the two groups and selects the marks of two random samples of students, 6 from the group attending the lecture theatre and 7 from the group attending the films. The marks for the two samples, ordered for convenience, are shown below.
Lecture theatre: 30 36 48 51 59 62
Filmed lecture: 40 49 52 56 63 64 68
  1. Stating a necessary assumption, carry out a suitable non-parametric test, at the \(10 \%\) significance level, for a difference between the median marks of the two groups.
  2. State conditions under which a two-sample \(t\)-test could have been used.
  3. Assuming that the tests in parts (i) and (ii) are both valid, state with a reason which test would be preferable.
OCR S4 2013 June Q4
10 marks Standard +0.3
4 The effect of water salinity on the growth of a type of grass was studied by a biologist. A random sample of 22 seedlings was divided into two groups \(A\) and \(B\), each of size 11.
Group \(A\) was treated with water of \(0 \%\) salinity and group \(B\) was treated with water of \(0.5 \%\) salinity. After three weeks the height (in cm) of each seedling was measured with the following results, which are ordered for convenience.
Group \(A\): 8.6 9.4 9.7 9.8 10.1 10.5 11.0 11.2 11.8 12.7
Group \(B\): 7.4 8.4 8.5 8.8 9.2 9.3 9.5 9.9 10.0 11.1
Jeffery was asked to test whether the two treatments resulted, on average, in a difference in growth. He chose the Wilcoxon rank sum test.
  1. Justify Jeffery's choice of test.
  2. Carry out the test at the \(5 \%\) significance level.
OCR MEI S4 2012 June Q3
24 marks Standard +0.3
3 At an agricultural research station, trials are being made of two fertilisers, A and B, to see whether they differ in their effects on the yield of a crop. Preliminary investigations have established that the underlying variances of the distributions of yields using the two fertilisers may be assumed equal. Scientific analysis of the fertilisers has suggested that fertiliser A may be inferior in that it leads, on the whole, to lower yield. A statistical analysis is being carried out to investigate this. The crop is grown in carefully controlled conditions in 14 experimental plots, 6 with fertiliser A and 8 with fertiliser B. The yields, in kg per plot, are as follows, arranged in ascending order for each fertiliser.
Fertiliser A: 9.8 10.2 10.9 11.5 12.7 13.3
Fertiliser B: 10.8 11.9 12.0 12.2 12.9 13.5 13.6 13.7
  1. Carry out a Wilcoxon rank sum test at the \(5 \%\) significance level to examine appropriate hypotheses.
  2. Carry out a \(t\) test at the \(5 \%\) significance level to examine appropriate hypotheses.
  3. Goodness of fit tests based on more extensive data sets from other trials with these fertilisers have failed to reject hypotheses of underlying Normal distributions. Discuss the relative merits of the analyses in parts (i) and (ii).
OCR S4 2016 June Q2
8 marks Standard +0.3
2 Low density lipoprotein (LDL) cholesterol is known as 'bad' cholesterol.
15 randomly chosen patients, each with an LDL level of 190 mg per decilitre of blood, were given one of two treatments, chosen at random. After twelve weeks their LDL levels, in mg per decilitre, were as follows.
Treatment \(A\): 189 168 176 186 183 187 188
Treatment \(B\): 177 179 173 180 178 170 175 174
Use a Wilcoxon rank sum test, at the \(5 \%\) level of significance, to test whether the LDL levels of patients given treatment \(B\) are lower than the LDL levels of patients given treatment \(A\).
OCR S4 2017 June Q4
12 marks Standard +0.3
4 The heights of eleven randomly selected primary school children are measured. The results, in metres, are
Girls: 1.48 1.31 1.63 1.38 1.56 1.57
Boys: 1.44 1.35 1.32 1.28 1.27
  1. Use a Wilcoxon rank-sum test, at the \(1 \%\) significance level, to test whether primary school girls are taller than primary school boys.
  2. It is decided to repeat the test, using larger random samples. The heights of twenty girls and eighteen boys are measured. Find the greatest value of the test statistic \(W\) which will result in the conclusion that there is evidence, at the \(1 \%\) level of significance, that primary school girls are taller than primary school boys.
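For samples of 20 and 18 a normal approximation to the rank-sum statistic is the usual route. The sketch below is a rough illustration under stated assumptions: \(W\) is taken to be the rank sum of the smaller (boys') sample, so small values support the claim; a continuity correction of 0.5 is applied; and the variable names are hypothetical. A mark scheme may use a rounded table value of \(z\) instead of the library value used here.

```python
from math import floor, sqrt
from statistics import NormalDist

m, n = 18, 20  # boys (smaller sample), girls

# Mean and variance of the rank sum of the smaller sample under H0.
mean_w = m * (m + n + 1) / 2
var_w = m * n * (m + n + 1) / 12

# Lower-tail 1% point of the standard normal distribution.
z_crit = NormalDist().inv_cdf(0.01)

# Largest integer w with P(W <= w) <= 0.01 under the approximation:
# (w + 0.5 - mean_w) / sd <= z_crit.
w_max = floor(mean_w + z_crit * sqrt(var_w) - 0.5)
print(mean_w, var_w, w_max)  # prints 351.0 1170.0 270
```

Dropping the continuity correction or rounding \(z\) differently can move the boundary, so the working should state which convention is used.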
OCR S4 2015 June Q6
9 marks Moderate -0.8
6 In a two-tail Wilcoxon rank-sum test, the sample sizes are 13 and 15. The sum of the ranks for the sample of size 13 is 135. Carry out the test at the \(5 \%\) level of significance.
OCR S4 2018 June Q2
8 marks Standard +0.3
2 The distances from home to work, in km, of 8 men and 5 women were recorded and are given below. The workers were chosen at random.
Men: 4 7 10 13 16 17 20 21
Women: 1 2 14 18 22
Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to test whether there is a significant difference between the distances from home to work between men and women.
OCR Further Statistics 2019 June Q8
10 marks Standard +0.3
8 A university course was taught by two different professors. Students could choose whether to attend the lectures given by Professor \(Q\) or the lectures given by Professor \(R\). At the end of the course all the students took the same examination. The examination marks of a random sample of 30 students taught by Professor \(Q\) and a random sample of 24 students taught by Professor \(R\) were ranked. The sum of the ranks of the students taught by Professor \(Q\) was 726. Test at the \(5 \%\) significance level whether there is a difference in the ranks of the students taught by the two professors.
OCR Further Statistics 2023 June Q5
10 marks Challenging +1.2
5 An historian has reason to believe that the average age at which men got married in the seventeenth century was higher in urban areas compared to rural areas. The historian collected data from a random sample of 8 men in an urban area and a random sample of 6 men in a rural area, all of whom were married in the seventeenth century. The results were as follows, given in the form years/months.
Urban: \(18 / 3\) \(18 / 5\) \(19 / 9\) \(20 / 7\) \(25 / 6\) \(34 / 6\) \(41 / 8\) \(46 / 3\)
Rural: \(18 / 0\) \(18 / 1\) \(18 / 4\) \(19 / 11\) \(22 / 2\) \(28 / 11\)
  1. Use an appropriate non-parametric method to test at the \(5 \%\) significance level whether the average age at marriage of men is higher in urban areas than in rural areas.
  2. When checking the data, the historian found that the age of one of the men, Mr X, which had been recorded as 28/11, had been wrongly recorded. When corrected, the result of the test in part (a) was unchanged. Determine the youngest age that Mr X could have been, given that it was not the same, in years and months, as that of any of the other men in the sample.
OCR Further Statistics 2020 November Q3
9 marks Standard +0.3
3 Jo can use either of two different routes, A or B, for her journey to school. She believes that route A has shorter journey times. She measures how long her journey takes for 17 journeys by route A and 12 journeys by route B. She ranks the 29 journeys in increasing order of time taken, and she finds that the sum of the ranks of the journeys by route B is 219.
  1. Test at the \(10 \%\) significance level whether route A has shorter journey times than route B.
  2. State an assumption about the 29 journeys which is necessary for the conclusion of the test to be valid.
OCR Further Statistics 2021 November Q7
12 marks Standard +0.3
7 In a school opinion poll a random sample of 8 pupils were asked to rate school lunches on a scale of 0 to 20. The results were as follows. \(\begin{array} { l l l l l l l l } 0 & 1 & 2 & 3 & 4 & 10 & 11 & 13 \end{array}\) After a new menu was introduced, the test was repeated with a different random sample of 8 pupils. The results were as follows. \(\begin{array} { l l l l l l l l } 7 & 8 & 9 & 14 & 15 & 17 & 19 & 20 \end{array}\)
  1. Carry out an appropriate Wilcoxon test at the \(5 \%\) significance level to test whether pupils' opinions of school lunches have changed.
    A statistics student tells the organisers of the opinion poll that it would have been better to have asked the same 8 pupils both times.
  2. Explain why the statistics student's suggestion would produce a better test.
  3. State which test should be used if the student's suggestion is followed.
  4. You are given that there are 12870 ways in which 8 different integers can be chosen from the integers 1 to 16 inclusive. Estimate the number of ways of selecting 8 different digits between 1 and 16 inclusive that have a sum less than or equal to the critical value used in the test in part (a).
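The counting idea behind the last part can be explored by brute force, since 12870 selections are few enough to enumerate. The sketch below counts exactly how many selections have a sum at or below a threshold. The function name `count_rank_sets` is hypothetical and the threshold of 49 is an assumed illustrative value: replace it with whatever critical value was read from tables in the earlier part. The question itself only asks for an estimate.

```python
from itertools import combinations
from math import comb


def count_rank_sets(total_ranks, sample_size, max_sum):
    """Count sets of `sample_size` distinct ranks from 1..total_ranks with sum <= max_sum."""
    return sum(
        1
        for ranks in combinations(range(1, total_ranks + 1), sample_size)
        if sum(ranks) <= max_sum
    )


total = comb(16, 8)          # the 12870 ways quoted in the question
assumed_critical_value = 49  # assumption: substitute the tabulated value actually used
exact_count = count_rank_sets(16, 8, assumed_critical_value)

print(total, exact_count, round(exact_count / total, 4))
# prints 12870 321 0.0249
```

Under the null hypothesis each selection is equally likely, so the proportion printed is the exact one-tail probability attached to the assumed threshold.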
WJEC Further Unit 5 2019 June Q7
7 marks Standard +0.3
7. Nathan believes that shearers from Wales can shear more sheep, on average, in a given time than shearers from New Zealand. He takes a random sample of 8 shearers from Wales and 7 shearers from New Zealand. The numbers below indicate how many sheep were sheared in 45 minutes by the 15 shearers.
Wales: 60 53 42 38 37 36 31 28
New Zealand: 39 35 27 26 17 16 15
Use a Mann-Whitney U test at the \(1 \%\) significance level to test whether Nathan is correct. You must state your hypotheses clearly and state the critical region.
WJEC Further Unit 5 2023 June Q6
7 marks Standard +0.3
6. A triathlon race organiser wishes to know whether competitors who are members of a triathlon club race more frequently than competitors who are not members of a triathlon club. Six competitors from a triathlon club and six competitors who are not members of a triathlon club are selected at random. The table below shows the number of triathlon races they each entered last year.
Club members: 11412537
Not club members: 2 9 4 0 8 6
  1. Use a Mann-Whitney U test at a significance level as close to \(5 \%\) as possible to carry out the race organiser's investigation.
  2. Briefly explain why a Wilcoxon signed-rank test is not appropriate in this case.
WJEC Further Unit 5 2024 June Q6
6 marks Standard +0.8
6. Alana is a PhD student researching language acquisition. She gives one group of randomly selected participants, Group A, 4 minutes to memorise 40 words that are similar in meaning. She gives a different, randomly selected group of participants, Group B, 4 minutes to memorise 40 words that are different in meaning. Alana believes that the students in Group B will do better than the students in Group A. The following results are the number of words recalled on testing the students from the two groups.
Group A32824161020221823212614
Group B302911253836281217
Conduct a Mann-Whitney U test at a significance level as close as possible to \(5 \%\) to test Alana's belief.
WJEC Further Unit 5 Specimen Q3
9 marks Standard +0.3
3. A motoring organisation wishes to determine whether or not the petrol consumption of two different car models \(A\) and \(B\) are the same. A trial is therefore carried out in which 6 cars of each model are given 10 litres of petrol and driven at a predetermined speed around a track until the petrol is used up. The distances travelled, in miles, are shown below
Model A: 86.3 84.2 85.8 83.1 84.7 85.3
Model B: 84.9 85.9 84.8 86.5 85.2 85.5
It is proposed to use a test with significance level \(5 \%\) based on the Mann-Whitney statistic \(U\).
  1. State suitable hypotheses.
  2. Find the critical region for the test.
  3. Determine the value of \(U\) for the above data and state your conclusion in context. You must justify your answer.
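The Mann-Whitney statistic can be obtained by counting pairs rather than ranking. The sketch below does this for the distances above. It assumes no two distances are equal (true here) and takes \(U\) to be the smaller of the two counts, which is one common convention; the variable names are hypothetical, and the critical region still has to come from tables.

```python
model_a = [86.3, 84.2, 85.8, 83.1, 84.7, 85.3]
model_b = [84.9, 85.9, 84.8, 86.5, 85.2, 85.5]

# Count the (A, B) pairs in which the model A car travelled less far, and vice versa.
# No two distances are equal here, so ties are not handled.
u_a_below_b = sum(1 for a in model_a for b in model_b if a < b)
u_b_below_a = sum(1 for a in model_a for b in model_b if b < a)

# Every one of the 6 x 6 pairs is counted exactly once.
assert u_a_below_b + u_b_below_a == len(model_a) * len(model_b)

u_statistic = min(u_a_below_b, u_b_below_a)
print(u_a_below_b, u_b_below_a, u_statistic)  # prints 24 12 12
```

The two counts always add up to the product of the sample sizes, which gives a quick check on hand working.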
OCR Further Statistics 2018 September Q8
8 marks Standard +0.3
8 In an experiment to investigate the effect of background music in carrying out work, ten students were each given a task. Five of the students did the task in silence and the other five did the task with background music. The scores on the tasks were as follows.
Silence: 43 46 55 58 61
Background music: 19 31 38 52 70
  1. Use a Wilcoxon rank-sum test to test at the 10\% level whether the presence of background music affects scores.
  2. A statistician suggests that the experiment is redesigned so that each student takes one task in silence and another task with background music. The differences in the test scores would then be analysed using a paired-sample method. State an advantage in redesigning the experiment in this way.
WJEC Further Unit 5 2022 June Q3
8 marks Standard +0.3
3. A statistics teacher wants to investigate whether students from the north of a county and students from the south of the same county feel similarly stressed about examinations. The teacher carries out a psychometric test on 10 randomly selected students to give a score between 0 (low stress) and 100 (high stress) to measure their stress levels before a set of examinations. The results are shown in the table below.
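\begin{tabular}{llr}
Student & Area & Stress Level \\
Heledd & North & 67 \\
Mair & North & 55 \\
Hywel & South & 26 \\
Gwyn & South & 70 \\
Liam & South & 36 \\
Marcin & South & 57 \\
Gosia & South & 32 \\
Kestutas & North & 64 \\
Erica & North & 60 \\
Tomos & North & 22 \\
\end{tabular}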
  1. State one reason why a Mann-Whitney test is appropriate.
  2. Conduct a Mann-Whitney test at a significance level as close to \(5 \%\) as possible. State your conclusion clearly.
  3. How could this investigation be improved?
OCR Further Statistics 2021 June Q5
10 marks Standard +0.3
5 A university course was taught by two different professors. Students could choose whether to attend the lectures given by Professor \(Q\) or the lectures given by Professor \(R\). At the end of the course all the students took the same examination. The examination marks of a random sample of 30 students taught by Professor \(Q\) and a random sample of 24 students taught by Professor \(R\) were ranked. The sum of the ranks of the students taught by Professor \(Q\) was 726. Test at the \(5 \%\) significance level whether there is a difference in the ranks of the students taught by the two professors.
\section*{Mark scheme}
\section*{Marking Instructions}
a An element of professional judgement is required in the marking of any written paper. Remember that the mark scheme is designed to assist in marking incorrect solutions. Correct solutions leading to correct answers are awarded full marks but work must not always be judged on the answer alone, and answers that are given in the question, especially, must be validly obtained; key steps in the working must always be looked at and anything unfamiliar must be investigated thoroughly. Correct but unfamiliar or unexpected methods are often signalled by a correct result following an apparently incorrect method. Such work must be carefully assessed.
b The following types of marks are available.
\section*{M}
A suitable method has been selected and applied in a manner which shows that the method is essentially understood. Method marks are not usually lost for numerical errors, algebraic slips or errors in units. However, it is not usually sufficient for a candidate just to indicate an intention of using some method or just to quote a formula; the formula or idea must be applied to the specific problem in hand, e.g. by substituting the relevant quantities into the formula. In some cases the nature of the errors allowed for the award of an M mark may be specified.
A method mark may usually be implied by a correct answer unless the question includes the DR statement, the command words "Determine" or "Show that", or some other indication that the method must be given explicitly.
\section*{A}
Accuracy mark, awarded for a correct answer or intermediate step correctly obtained. Accuracy marks cannot be given unless the associated Method mark is earned (or implied). Therefore M0 A1 cannot ever be awarded.
\section*{B}
Mark for a correct result or statement independent of Method marks.
\section*{E}
A given result is to be established or a result has to be explained. This usually requires more working or explanation than the establishment of an unknown result.
Unless otherwise indicated, marks once gained cannot subsequently be lost, e.g. wrong working following a correct form of answer is ignored. Sometimes this is reinforced in the mark scheme by the abbreviation isw. However, this would not apply to a case where a candidate passes through the correct answer as part of a wrong argument.
c When a part of a question has two or more 'method' steps, the M marks are in principle independent unless the scheme specifically says otherwise; and similarly where there are several B marks allocated. (The notation 'dep*' is used to indicate that a particular mark is dependent on an earlier, asterisked, mark in the scheme.) Of course, in practice it may happen that when a candidate has once gone wrong in a part of a question, the work from there on is worthless so that no more marks can sensibly be given. On the other hand, when two or more steps are successfully run together by the candidate, the earlier marks are implied and full credit must be given.
d The abbreviation FT implies that the A or B mark indicated is allowed for work correctly following on from previously incorrect results. Otherwise, A and B marks are given for correct work only - differences in notation are of course permitted. A (accuracy) marks are not given for answers obtained from incorrect working. When A or B marks are awarded for work at an intermediate stage of a solution, there may be various alternatives that are equally acceptable. In such cases, what is acceptable will be detailed in the mark scheme. Sometimes the answer to one part of a question is used in a later part of the same question. In this case, A marks will often be 'follow through'.
e We are usually quite flexible about the accuracy to which the final answer is expressed; over-specification is usually only penalised where the scheme explicitly says so.
  • When a value is given in the paper only accept an answer correct to at least as many significant figures as the given value.
  • When a value is not given in the paper accept any answer that agrees with the correct value to 3 s.f. unless a different level of accuracy has been asked for in the question, or the mark scheme specifies an acceptable range.
Follow through should be used so that only one mark in any question is lost for each distinct accuracy error.
Candidates using a value of \(9.80,9.81\) or 10 for \(g\) should usually be penalised for any final accuracy marks which do not agree to the value found with 9.8 which is given in the rubric.
f Rules for replaced work and multiple attempts:
  • If one attempt is clearly indicated as the one to mark, or only one is left uncrossed out, then mark that attempt and ignore the others.
  • If more than one attempt is left not crossed out, then mark the last attempt unless it only repeats part of the first attempt or is substantially less complete.
  • If a candidate crosses out all of their attempts, the assessor should attempt to mark the crossed out answer(s) as above and award marks appropriately.
g For a genuine misreading (of numbers or symbols) which is such that the object and the difficulty of the question remain unaltered, mark according to the scheme but following through from the candidate's data. A penalty is then applied; 1 mark is generally appropriate, though this may differ for some units. This is achieved by withholding one A or B mark in the question. Marks designated as cao may be awarded as long as there are no other errors.
If a candidate corrects the misread in a later part, do not continue to follow through. Note that a miscopy of the candidate's own working is not a misread but an accuracy error.
h If a calculator is used, some answers may be obtained with little or no working visible. Allow full marks for correct answers, provided that there is nothing in the wording of the question specifying that analytical methods are required such as the bold "In this question you must show detailed reasoning", or the command words "Show" or "Determine". Where an answer is wrong but there is some evidence of method, allow appropriate method marks. Wrong answers with no supporting method score zero.
\begin{table}[h]
\captionsetup{labelformat=empty} \caption{Abbreviations}
\begin{tabular}{ll}
Abbreviations used in the mark scheme & Meaning \\
dep* & Mark dependent on a previous mark, indicated by *. The * may be omitted if only one previous M mark \\
cao & Correct answer only \\
oe & Or equivalent \\
rot & Rounded or truncated \\
soi & Seen or implied \\
www & Without wrong working \\
AG & Answer given \\
awrt & Anything which rounds to \\
BC & By Calculator \\
DR & This question included the instruction: In this question you must show detailed reasoning. \\
\end{tabular}
\end{table}
\begin{tabular}{lllll}
Question & Answer & Mark & AO & Guidance \\
5 & \(\mathrm{H}_0 : m_Q = m_R\), \(\mathrm{H}_1 : m_Q \neq m_R\), where \(m_Q\) and \(m_R\) are the medians of the rankings given to \(Q\) and \(R\) & B1 & 1.1 & Allow \(m\) undefined. If verbal, must mention medians, \(m\) or distribution. Allow \(m_d = 0\) as opposed to \(m_Q = m_R\). Not anything that might be \(\mu\) unless symbol clearly defined as median. Not "there is no difference in the ranks ..." \\
 & Sum of ranks \(= \frac{1}{2} \times 54 \times 55 = 1485\) & M1 & 1.1 & Find sum of ranks \\
 & \(R_m = 1485 - 726 = 759\) [or 561] & A1 & 1.1 & Correct value of \(R_m\) seen. Allow even if 726 used later \\
 & \(R_m \sim \mathrm{N}(660, 3300)\) & M1 A1 & 3.1b 3.3 & Normal, mean their \(\frac{1}{2} \times 24 \times 55\). Allow SD/Var muddle \\
 & \(\mathrm{P}(R_m \geq 759) = 0.0432\) (3 s.f.) [or \(z = 1.715\)] & M1 A1 & 3.4 1.1 & Both parameters correct. Standardise, their \(R_m\). Correct test statistic (0.0432). 0.0424 or 0.0416 (no/wrong cc): M1A0. (Same for \(\mathrm{P}(R_m \leq 561)\).) Allow \(z \in [1.71, 1.715]\), allow \(z = 1.72\) only if cc demonstrated correct \\
 & Alternatively: CV \(660 + 1.96\sqrt{3300}\) [\(= 772.6\)]; \(758.5 < 772.6\) & M1 A1 & & Not 759 - or 726 - ...; not wrong tail for comparison, but allow \(\pm\). Needs correct cc. Or \(561.5 > 547.4\). Wrong \(z\)-value: M1A1ft B0 \\
 & \(p > 0.025\), \(2p > 0.05\), \(z < 1.96\), or 1.96 used in CV & B1 & 1.1 & Explicit correct comparison. Needs like-with-like (e.g. \(p\) must be \(< 0.5\)) \\
 & Do not reject \(\mathrm{H}_0\). Insufficient evidence of a difference between the ranks. & M1ft A1ft [10] & 1.1 2.2b & Correct first conclusion, needs correct method and like-with-like. Contextualised, not too definite. ft on wrong ts, or 1-tail/2-tail confusions, e.g. \(p\) compared with 0.05 or not explicit, or \(z \geq 1.645\) \\
\end{tabular}
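The arithmetic in the mark scheme for this question can be reproduced with a short sketch. It follows the figures quoted above (rank total 1485, \(R_m = 759\), \(\mathrm{N}(660, 3300)\), continuity correction of 0.5). The variable names are hypothetical, and the normal distribution function comes from Python's standard library rather than from the tables a candidate would use.

```python
from math import sqrt
from statistics import NormalDist

n_q, n_r = 30, 24  # sample sizes for Professors Q and R
rank_sum_q = 726   # given in the question

total_ranks = (n_q + n_r) * (n_q + n_r + 1) // 2  # 1/2 * 54 * 55
rank_sum_r = total_ranks - rank_sum_q              # R_m for the smaller sample

# Mean and variance of R_m under H0.
mean_r = n_r * (n_q + n_r + 1) / 2
var_r = n_q * n_r * (n_q + n_r + 1) / 12

# Upper-tail probability with a continuity correction of 0.5.
z = (rank_sum_r - 0.5 - mean_r) / sqrt(var_r)
p_upper = 1 - NormalDist().cdf(z)

print(total_ranks, rank_sum_r, mean_r, var_r, round(z, 3), round(p_upper, 4))
# prints 1485 759 660.0 3300.0 1.715 0.0432
```

Since the test is two-tailed at the \(5 \%\) level, the one-tail probability is compared with 0.025, matching the comparison row in the scheme.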