5.07b Sign test and Wilcoxon signed-rank

84 questions

OCR S4 2012 June Q5
11 marks Standard +0.3
5 A one-tail sign test of a population median is to be carried out at the \(5 \%\) significance level using a sample of size \(n\).
  1. Show by calculation that the test can never result in rejection of the null hypothesis when \(n = 4\). The coach of a college swimming team expects Elena, the best 50 m freestyle swimmer, to have a median time less than 30 seconds. Elena found from records of her previous 72 swims that 44 were less than 30 seconds and 28 were greater than 30 seconds.
  2. Stating a necessary assumption, test at the \(5 \%\) significance level whether Elena's median time for the 50 m freestyle is less than 30 seconds.
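A minimal sketch of the arithmetic behind both parts, assuming the count of signs is Binomial(n, 0.5) under the null hypothesis. The helper name `upper_tail` is illustrative and not part of the question; an exam solution would more likely use a Normal approximation for the 72 swims.

```python
from math import comb

def upper_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# Part 1: with n = 4 the most extreme outcome (all four signs alike)
# has probability 0.5**4, which already exceeds 5%.
smallest_p = upper_tail(4, 4)

# Part 2: 44 of Elena's 72 swims were under 30 seconds.
p_elena = upper_tail(44, 72)

print(smallest_p, p_elena)
```

Since `smallest_p` is the smallest p-value a one-tail test can produce with four observations, no outcome can fall in a 5% critical region.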
OCR S4 2013 June Q2
7 marks Standard +0.3
2 Two drugs, I and II, for alleviating hay fever are trialled in a hospital on each of 12 volunteer patients. Each received drug I on one day and drug II on a different day. After receiving a drug, the number of times each patient sneezed over a period of one hour was noted. The results are given in the table.
Patient123456789101112
Drug I1134191610296172013425
Drug II122010183219131019912
The patients may be considered to be a random sample of all hay fever sufferers.
A researcher believes that patients taking drug II sneeze less than patients taking drug I.
Test this belief using the Wilcoxon signed rank test at the \(5 \%\) significance level.
OCR S4 2013 June Q4
10 marks Standard +0.3
4 The effect of water salinity on the growth of a type of grass was studied by a biologist. A random sample of 22 seedlings was divided into two groups \(A\) and \(B\), each of size 11.
Group \(A\) was treated with water of \(0 \%\) salinity and group \(B\) was treated with water of \(0.5 \%\) salinity. After three weeks the height (in cm) of each seedling was measured with the following results, which are ordered for convenience.
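$$\begin{array}{l|cccccccccc}
\text{Group } A & 8.6 & 9.4 & 9.7 & 9.8 & 10.1 & 10.5 & 11.0 & 11.2 & 11.8 & 12.7 \\
\text{Group } B & 7.4 & 8.4 & 8.5 & 8.8 & 9.2 & 9.3 & 9.5 & 9.9 & 10.0 & 11.1
\end{array}$$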
Jeffery was asked to test whether the two treatments resulted, on average, in a difference in growth. He chose the Wilcoxon rank sum test.
  1. Justify Jeffery's choice of test.
  2. Carry out the test at the \(5 \%\) significance level.
OCR S4 2014 June Q1
8 marks Standard +0.3
1 A teacher believes that the calculator paper in a GCSE Mathematics examination was easier than the non-calculator paper. The marks of a random sample of ten students are shown in the table.
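$$\begin{array}{l|cccccccccc}
\text{Student} & A & B & C & D & E & F & G & H & I & J \\
\text{Mark on paper 1 (non-calculator)} & 66 & 79 & 58 & 87 & 67 & 55 & 75 & 62 & 50 & 84 \\
\text{Mark on paper 2 (calculator)} & 57 & 84 & 70 & 90 & 75 & 42 & 82 & 72 & 65 & 82
\end{array}$$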
  1. Use a Wilcoxon signed-rank test, at the \(5 \%\) significance level, to test the teacher's belief.
  2. State the assumption necessary for this test to be applied.
OCR S4 2014 June Q6
8 marks Standard +0.3
6 A Wilcoxon rank-sum test with samples of sizes 11 and 12 is carried out.
  1. What is the least possible value of the test statistic \(W\) ?
  2. The null hypothesis is that the two samples came from identical populations. Given that the null hypothesis was rejected at the \(1 \%\) level using a 2-tail test, find the set of possible values of \(W\).
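A worked line for part 1, assuming \(W\) is defined as the sum of the ranks of the smaller sample (size 11) within the combined ranking of all 23 observations; some texts define the statistic differently. Under that assumption the least value arises when the smaller sample occupies ranks 1 to 11.

$$W _ { \min } = \sum _ { i = 1 } ^ { 11 } i = \frac { 11 \times 12 } { 2 } = 66$$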
OCR MEI S4 2006 June Q3
24 marks Standard +0.3
3 The human resources department of a large company is investigating two methods, A and B, for training employees to carry out a certain complicated and intricate task.
  1. Two separate random samples of employees who have not previously performed the task are taken. The first sample is of size 10; each of the employees in it is trained by method A. The second sample is of size 12; each of the employees in it is trained by method B. After completing the training, the time for each employee to carry out the task is measured, in controlled conditions. The times are as follows, in minutes.
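    Employees trained by method A: $$\begin{array} { l l l l l l l l l l } 35.2 & 47.8 & 25.8 & 38.0 & 53.6 & 31.0 & 33.9 & 35.4 & 21.6 & 42.5 \end{array}$$
    Employees trained by method B: $$\begin{array} { l l l l l l l l l l l l } 43.0 & 57.5 & 68.6 & 20.9 & 31.4 & 44.9 & 62.8 & 27.6 & 41.8 & 46.1 & 39.8 & 61.6 \end{array}$$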
    Stating appropriate assumptions concerning the underlying populations, use a \(t\) test at the \(5 \%\) significance level to examine whether either training method is better in respect of leading, on the whole, to a lower time to carry out the task.
  2. A further trial of method B is carried out to see if the performance of experienced and skilled workers can be improved by re-training them. A random sample of 8 such workers is taken. The times in minutes, under controlled conditions, for each worker to carry out the task before and after re-training are as follows.
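    $$\begin{array}{l|cccccccc}
    \text{Worker} & W _ { 1 } & W _ { 2 } & W _ { 3 } & W _ { 4 } & W _ { 5 } & W _ { 6 } & W _ { 7 } & W _ { 8 } \\
    \text{Time before} & 32.6 & 28.5 & 22.9 & 27.6 & 34.9 & 28.8 & 34.2 & 31.3 \\
    \text{Time after} & 26.2 & 24.1 & 19.0 & 28.6 & 29.3 & 20.0 & 36.0 & 19.2
    \end{array}$$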
    Stating an appropriate assumption, use a \(t\) test at the \(5 \%\) significance level to examine whether the re-training appears, on the whole, to lead to a lower time to carry out the task.
  3. Explain how the test procedure in part 2 is enhanced by designing it as a paired comparison.
OCR S4 2016 June Q1
8 marks Moderate -0.8
1 Ten archers shot at targets with two types of bow. Their scores out of 100 are shown in the table.
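$$\begin{array}{l|cccccccccc}
\text{Archer} & A & B & C & D & E & F & G & H & I & J \\
\text{Bow type } P & 95 & 97 & 92 & 85 & 87 & 92 & 90 & 89 & 98 & 77 \\
\text{Bow type } Q & 91 & 91 & 88 & 90 & 80 & 88 & 93 & 85 & 94 & 84
\end{array}$$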
  1. Use the sign test, at the \(5 \%\) level of significance, to test the hypothesis that bow type \(P\) is better than bow type \(Q\).
  2. Why would a Wilcoxon signed rank test, if valid, be a better test than the sign test?
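A short sketch of the sign-test arithmetic for part 1. The two lists are read from the table on the assumption that every entry is a two-digit score, and the number of archers scoring higher with bow \(P\) is treated as Binomial(n, 0.5) under the null hypothesis. Variable names are illustrative.

```python
from math import comb

# Scores read from the table, assuming every entry is a two-digit score.
bow_p = [95, 97, 92, 85, 87, 92, 90, 89, 98, 77]
bow_q = [91, 91, 88, 90, 80, 88, 93, 85, 94, 84]

diffs = [p - q for p, q in zip(bow_p, bow_q) if p != q]  # drop ties, if any
n = len(diffs)
plus = sum(d > 0 for d in diffs)  # archers who scored higher with bow P

# One-tailed p-value: P(X >= plus) for X ~ Binomial(n, 0.5).
p_value = sum(comb(n, i) for i in range(plus, n + 1)) / 2 ** n
print(n, plus, p_value)
```

The p-value is then compared with 0.05. The test uses only the direction of each difference, which is the point part 2 asks about.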
OCR S4 2017 June Q1
4 marks Standard +0.3
1 A meteorologist claims that the median daily rainfall in London is 2.2 mm. A single sample sign test is to be used to test the claim, using the following hypotheses: \(\mathrm { H } _ { 0 }\) : a sample comes from a population with median 2.2, \(\mathrm { H } _ { 1 }\) : the sample does not come from a population with median 2.2.
30 randomly selected observations of daily rainfall in London are compared with 2.2, and given a '+' sign if greater than 2.2 and a '-' sign if less than 2.2. (You may assume that no data values are exactly equal to 2.2.) The test is to be carried out at the \(5 \%\) level of significance. Let the number of ' + ' signs be \(k\). Find, in terms of \(k\), the critical region for the test showing the values of any relevant probabilities.
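One way to locate the critical region numerically, sketched under the assumption that \(k\) is Binomial(30, 0.5) under \(\mathrm { H } _ { 0 }\) and that the 5% is split equally between the two tails. The function name `lower_tail` is illustrative.

```python
from math import comb

n, alpha = 30, 0.05

def lower_tail(k):
    """P(X <= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

# Largest k whose lower-tail probability is at most alpha / 2.
lower = max(k for k in range(n + 1) if lower_tail(k) <= alpha / 2)
upper = n - lower  # by symmetry of Binomial(n, 0.5)

# The two probabilities either side of the boundary are the ones to quote.
print(lower, lower_tail(lower), lower_tail(lower + 1))
print(f"critical region: k <= {lower} or k >= {upper}")
```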
CAIE Further Paper 4 2020 Specimen Q1
7 marks Moderate -0.5
1
  1. State briefly the circumstances under which a non-parametric test of significance should be used rather than a parametric test. The level of pollution in a river was measured at 12 randomly chosen locations. The results, in suitable units, are shown below, where higher values represent greater pollution.
    $$\begin{array} { l l l l l l l l l l l l } 5.62 & 5.73 & 6.55 & 6.81 & 6.10 & 5.75 & 5.87 & 6.47 & 5.86 & 6.26 & 6.99 & 5.91 \end{array}$$
  2. Use a Wilcoxon signed-rank test to test whether the average pollution level in the river is more than 6.00. Use a \(5\%\) significance level.
    [6]
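A sketch of how the signed-rank sums for part 2 could be computed, assuming the twelve readings are two-decimal values and the hypothesised median is 6.00. The helper `signed_rank_sums` is hypothetical; it uses exact fractions so that decimal inputs are ranked without floating-point noise.

```python
from fractions import Fraction

def signed_rank_sums(values, median):
    """Return (T_plus, T_minus) for a single-sample Wilcoxon signed-rank test.

    Zero differences are dropped and tied absolute differences share the
    average of the ranks they occupy.
    """
    diffs = [Fraction(str(v)) - Fraction(str(median)) for v in values]
    diffs = sorted((d for d in diffs if d != 0), key=abs)
    t_plus = t_minus = Fraction(0)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        rank = Fraction(i + 1 + j, 2)  # average of ranks i+1 .. j
        for d in diffs[i:j]:
            if d > 0:
                t_plus += rank
            else:
                t_minus += rank
        i = j
    return t_plus, t_minus

pollution = [5.62, 5.73, 6.55, 6.81, 6.10, 5.75,
             5.87, 6.47, 5.86, 6.26, 6.99, 5.91]
t_plus, t_minus = signed_rank_sums(pollution, 6.00)
print(t_plus, t_minus)
```

The smaller of the two sums is the value that would be compared with the tabulated one-tail critical value for \(n = 12\).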
CAIE Further Paper 4 2020 Specimen Q3
8 marks Standard +0.3
3 Employees at a particular company have been working seven hours each day, from 9 am to 4 pm. To try to reduce absence, the company decides to introduce 'flexi-time' and allow employees to work their seven hours each day at any time between 7 am and 9 pm. For a random sample of 10 employees, the numbers of hours of absence in the year before and the year after the introduction of flexi-time are given in the following table.
Employee\(A\)\(B\)\(C\)\(D\)\(E\)\(F\)\(G\)\(H\)\(I\)\(J\)
Before4235967420578451460
After34321007231261351400
Test, at the \(10\%\) significance level, whether the population mean number of hours of absence has decreased following the introduction of flexi-time, stating any assumption that you make.
OCR MEI S3 2009 January Q1
18 marks Standard +0.3
1
  1. A continuous random variable \(X\) has probability density function $$\mathrm { f } ( x ) = \lambda x ^ { c } , \quad 0 \leqslant x \leqslant 1 ,$$ where \(c\) is a constant and the parameter \(\lambda\) is greater than 1.
    1. Find \(c\) in terms of \(\lambda\).
    2. Find \(\mathrm { E } ( X )\) in terms of \(\lambda\).
    3. Show that \(\operatorname { Var } ( X ) = \frac { \lambda } { ( \lambda + 2 ) ( \lambda + 1 ) ^ { 2 } }\).
  2. Every day, Godfrey does a puzzle from the newspaper and records the time taken in minutes. Last year, his median time was 32 minutes. His times for a random sample of 12 puzzles this year are as follows. $$\begin{array} { l l l l l l l l l l l l } 40 & 20 & 18 & 11 & 47 & 36 & 38 & 35 & 22 & 14 & 12 & 21 \end{array}$$ Use an appropriate test, with a 5\% significance level, to examine whether Godfrey's times this year have decreased on the whole.
OCR MEI S3 2010 January Q2
19 marks Standard +0.3
2
  1. A continuous random variable, \(X\), has probability density function $$f ( x ) = \begin{cases} \frac { 1 } { 72 } \left( 8 x - x ^ { 2 } \right) & 2 \leqslant x \leqslant 8 \\ 0 & \text { otherwise } \end{cases}$$
    1. Find \(\mathrm { F } ( x )\), the cumulative distribution function of \(X\).
    2. Sketch \(\mathrm { F } ( x )\).
    3. The median of \(X\) is \(m\). Show that \(m\) satisfies the equation \(m ^ { 3 } - 12 m ^ { 2 } + 148 = 0\). Verify that \(m \approx 4.42\).
  2. The random variable in part (a) is thought to model the weights, in kilograms, of lambs at birth. The birth weights, in kilograms, of a random sample of 12 lambs, given in ascending order, are as follows. $$\begin{array} { l l l l l l l l l l l l } 3.16 & 3.62 & 3.80 & 3.90 & 4.02 & 4.72 & 5.14 & 6.36 & 6.50 & 6.58 & 6.68 & 6.78 \end{array}$$ Test at the 5\% level of significance whether a median of 4.42 is consistent with these data.
OCR MEI S3 2011 January Q2
18 marks Standard +0.3
2
    1. What is stratified sampling? Why would it be used?
    2. A local authority official wishes to conduct a survey of households in the borough. He decides to select a stratified sample of 2000 households using Council Tax property bands as the strata. At the time of the survey there are 79368 households in the borough. The table shows the numbers of households in the different tax bands.
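      $$\begin{array}{l|cccc}
      \text{Tax band} & \text{A - B} & \text{C - D} & \text{E - F} & \text{G - H} \\
      \text{Number of households} & 32298 & 33211 & 9739 & 4120
      \end{array}$$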
      Calculate the number of households that the official should choose from each stratum in order to obtain his sample of 2000 households so that each stratum is represented proportionally.
    1. What assumption needs to be made when using a Wilcoxon single sample test?
    2. As part of an investigation into trends in local authority spending, one of the categories of expenditure considered was 'Highways and the Environment'. For a random sample of 10 local authorities, the percentages of their total expenditure spent on Highways and the Environment in 1999 and then in 2009 are shown in the table.
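      $$\begin{array}{l|cccccccccc}
      \text{Local authority} & A & B & C & D & E & F & G & H & I & J \\
      1999 & 9.60 & 8.40 & 8.67 & 9.32 & 9.89 & 9.35 & 7.91 & 8.08 & 9.61 & 8.55 \\
      2009 & 8.94 & 8.42 & 7.87 & 8.41 & 10.17 & 10.11 & 8.31 & 9.76 & 9.54 & 9.67
      \end{array}$$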
      Use a Wilcoxon test, with a significance level of \(10 \%\), to determine whether there appears to be any change to the average percentage of total expenditure spent on Highways and the Environment between 1999 and 2009.
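For the stratified-sampling part of this question, proportional allocation can be sketched as below. The band counts are read from the tax-band table on the assumption that they are the four values summing to 79368; rounding each stratum to the nearest whole household is one reasonable convention and is not prescribed by the question.

```python
households = {"A-B": 32298, "C-D": 33211, "E-F": 9739, "G-H": 4120}
total = sum(households.values())
sample_size = 2000

# Each stratum gets its share of the sample, rounded to a whole household.
# Rounding does not always preserve the total, so the sum is printed as a check.
allocation = {band: round(sample_size * count / total)
              for band, count in households.items()}
print(total, allocation, sum(allocation.values()))
```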
OCR MEI S3 2012 January Q3
18 marks Standard +0.3
3
  1. A medical researcher is looking into the delay, in years, between first and second myocardial infarctions (heart attacks). The following table shows the results for a random sample of 225 patients.
    $$\begin{array}{l|ccccc}
    \text{Delay (years)} & 0 - & 1 - & 2 - & 3 - & 4 - 10 \\
    \text{Number of patients} & 160 & 40 & 13 & 9 & 3
    \end{array}$$
    The mean of this sample is used to construct a model which gives the following expected frequencies.
    $$\begin{array}{l|ccccc}
    \text{Delay (years)} & 0 - & 1 - & 2 - & 3 - & 4 - 10 \\
    \text{Number of patients} & 142.23 & 52.32 & 19.25 & 7.08 & 4.12
    \end{array}$$
    Carry out a test, using a \(2.5 \%\) level of significance, of the goodness of fit of the model to the data.
  2. A further piece of research compares the incidence of myocardial infarction in men aged 55 to 70 with that in women aged 55 to 70. Incidence is measured by the number of infarctions per 10000 of the population. For a random sample of 8 health authorities across the UK, the following results for the year 2010 were obtained.
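    $$\begin{array}{l|cccccccc}
    \text{Health authority} & A & B & C & D & E & F & G & H \\
    \text{Incidence in men} & 47 & 56 & 15 & 51 & 45 & 54 & 50 & 32 \\
    \text{Incidence in women} & 36 & 30 & 30 & 47 & 54 & 55 & 27 & 27
    \end{array}$$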
    A Wilcoxon paired sample test, using the hypotheses \(\mathrm { H } _ { 0 } : m = 0\) and \(\mathrm { H } _ { 1 } : m \neq 0\) where \(m\) is the population median difference, is to be carried out to investigate whether there is any difference between men and women on the whole.
    1. Explain why a paired test is being used in this context.
    2. Carry out the test using a \(10 \%\) level of significance.
OCR MEI S3 2013 January Q4
18 marks Moderate -0.3
4
  1. At a college, two examiners are responsible for marking, independently, the students' projects. Each examiner awards a mark out of 100 to each project. There is some concern that the examiners' marks do not agree, on average. Consequently a random sample of 12 projects is selected and the marks awarded to them are compared.
    1. Describe how a random sample of projects should be chosen.
    2. The marks given for the projects in the sample are as follows.
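      $$\begin{array}{l|cccccccccccc}
      \text{Project} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\
      \text{Examiner A} & 58 & 37 & 72 & 78 & 67 & 77 & 62 & 41 & 80 & 60 & 65 & 70 \\
      \text{Examiner B} & 73 & 47 & 74 & 71 & 78 & 96 & 54 & 27 & 97 & 73 & 60 & 66
      \end{array}$$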
      Carry out a test at the \(10 \%\) level of significance of the hypotheses \(\mathrm { H } _ { 0 } : m = 0 , \mathrm { H } _ { 1 } : m \neq 0\), where \(m\) is the population median difference.
  2. A calculator has a built-in random number function which can be used to generate a list of random digits. If it functions correctly then each digit is equally likely to be generated. When it was used to generate 100 random digits, the frequencies of the digits were as follows.
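    $$\begin{array}{l|cccccccccc}
    \text{Digit} & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
    \text{Frequency} & 6 & 8 & 11 & 14 & 12 & 9 & 15 & 5 & 14 & 6
    \end{array}$$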
    Use a goodness of fit test, with a significance level of \(10 \%\), to investigate whether the random number function is generating digits with equal probability.
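A sketch of the goodness-of-fit statistic for the random digits, assuming the ten frequencies are the values that sum to 100 and that each digit has expected frequency 10 under the null hypothesis. Variable names are illustrative.

```python
# Observed frequencies of the digits 0 to 9, read from the table.
observed = [6, 8, 11, 14, 12, 9, 15, 5, 14, 6]

expected = sum(observed) / len(observed)  # equal probabilities under H0
chi_sq = sum((o - expected) ** 2 / expected for o in observed)
dof = len(observed) - 1  # no parameters estimated from the data

print(expected, chi_sq, dof)
```

The statistic would then be compared with the upper 10% point of the chi-squared distribution on `dof` degrees of freedom, taken from tables.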
OCR MEI S3 2009 June Q3
17 marks Standard +0.3
3 A company which employs 600 staff wishes to improve its image by introducing new uniforms for the staff to wear. The human resources manager would like to obtain the views of the staff. She decides to do this by means of a systematic sample of \(10 \%\) of the staff.
  1. How should she go about obtaining such a sample, ensuring that all members of staff are equally likely to be selected? Explain whether this constitutes a simple random sample. At a later stage in the process, the choice of uniform has been reduced to two possibilities. Twelve members of staff are selected to take part in deciding which of the two uniforms to adopt. Each of the twelve assesses each uniform for comfort, appearance and practicality, giving it a total score out of 10. The scores are as follows.
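    $$\begin{array}{l|cccccccccccc}
    \text{Staff member} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\
    \text{Uniform A} & 4.2 & 2.6 & 10.0 & 9.0 & 8.2 & 2.8 & 5.0 & 7.4 & 2.8 & 6.8 & 10.0 & 9.8 \\
    \text{Uniform B} & 5.0 & 5.2 & 1.4 & 2.8 & 2.2 & 6.4 & 7.4 & 7.8 & 6.8 & 1.2 & 3.4 & 7.6
    \end{array}$$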
    A Wilcoxon signed rank test is to be used to decide whether there is any evidence of a preference for one of the uniforms.
  2. Explain why this test is appropriate in these circumstances and state the hypotheses that should be used.
  3. Carry out the test at the \(5 \%\) significance level.
OCR MEI S3 2012 June Q2
18 marks Easy -1.8
2
    1. Give two reasons why an investigator might need to take a sample in order to obtain information about a population.
    2. State two requirements of a sample.
    3. Discuss briefly the advantage of the sampling being random.
    1. Under what circumstances might one use a Wilcoxon single sample test in order to test a hypothesis about the median of a population? What distributional assumption is needed for the test?
    2. On a stretch of road leading out of the centre of a town, highways officials have been monitoring the speed of the traffic in case it has increased. Previously it was known that the median speed on this stretch was 28.7 miles per hour. For a random sample of 12 vehicles on the stretch, the following speeds were recorded. $$\begin{array} { l l l l l l l l l l l l } 32.0 & 29.1 & 26.1 & 35.2 & 34.4 & 28.6 & 32.3 & 28.5 & 27.0 & 33.3 & 28.2 & 31.9 \end{array}$$ Carry out a test, with a \(5 \%\) significance level, to see whether the speed of the traffic on this stretch of road seems to have increased on the whole.
      [10]
OCR MEI S3 2013 June Q1
18 marks Standard +0.3
1 In the past, the times for workers in a factory to complete a particular task had a known median of 7.4 minutes. Following a review, managers at the factory wish to know if the median time to complete the task has been reduced.
  1. A random sample of 12 times, in minutes, gives the following results. $$\begin{array} { l l l l l l l l l l l l } 6.90 & 7.23 & 6.54 & 7.62 & 7.04 & 7.33 & 6.74 & 6.45 & 7.81 & 7.71 & 7.50 & 6.32 \end{array}$$ Carry out an appropriate test using a \(5 \%\) level of significance.
  2. Some time later, a much larger random sample of times gives the following results. $$n = 80 \quad \sum x = 555.20 \quad \sum x ^ { 2 } = 3863.9031$$ Find a \(95 \%\) confidence interval for the true mean time for the task. Justify your choice of which distribution to use.
  3. Describe briefly one advantage and one disadvantage of having a \(99 \%\) confidence interval instead of a \(95 \%\) confidence interval.
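For part 2, a sketch of the interval calculation from the summary statistics. It assumes a Normal-based interval, which is one defensible choice for a sample of 80; an exam answer might instead justify a \(t\) distribution. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sum_x, sum_x2 = 80, 555.20, 3863.9031

mean = sum_x / n
variance = (sum_x2 - n * mean ** 2) / (n - 1)  # unbiased sample variance
std_error = sqrt(variance / n)

z = NormalDist().inv_cdf(0.975)  # two-sided 95% point of N(0, 1)
lower, upper = mean - z * std_error, mean + z * std_error
print(mean, variance, lower, upper)
```

Replacing 0.975 with 0.995 gives the 99% interval discussed in part 3, which is wider.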
OCR MEI S3 2014 June Q2
19 marks Standard +0.3
2
  1. Explain what is meant by a simple random sample. A manufacturer produces tins of paint which nominally contain 1 litre. The quantity of paint delivered by the machine that fills the tins can be assumed to be a Normally distributed random variable. The machine is designed to deliver an average of 1.05 litres to each tin. However, over time paint builds up in the delivery nozzle of the machine, reducing the quantity of paint delivered. Random samples of 10 tins are taken regularly from the production process. If a significance test, carried out at the \(5 \%\) level, suggests that the average quantity of paint delivered is less than 1.02 litres, the machine is cleaned.
  2. By carrying out an appropriate test, determine whether or not the sample below leads to the machine being cleaned. $$\begin{array} { l l l l l l l l l l } 0.994 & 1.010 & 1.021 & 1.015 & 1.016 & 1.022 & 1.009 & 1.007 & 1.011 & 1.026 \end{array}$$ Each time the machine has been cleaned, a random sample of 10 tins is taken to determine whether or not the average quantity of paint delivered has returned to 1.05 litres.
  3. On one occasion after the machine has been cleaned, the quality control manager thinks that the distribution of the quantity of paint is symmetrical but not necessarily Normal. The sample on this occasion is as follows.
    $$\begin{array} { l l l l l l l l l l } 1.055 & 1.064 & 1.063 & 1.043 & 1.062 & 1.070 & 1.059 & 1.044 & 1.054 & 1.053 \end{array}$$
    By carrying out an appropriate test at the \(5 \%\) level of significance, determine whether or not this sample supports the conclusion that the average quantity of paint delivered is 1.05 litres.
OCR MEI S3 2016 June Q2
18 marks Standard +0.3
2
  1. A genetic model involving body colour and eye colour of fruit flies predicts that offspring will consist of four phenotypes in the ratio \(9 : 3 : 3 : 1\). A random sample of 200 such offspring is taken. Their phenotypes are found to be as follows.
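    $$\begin{array}{l|cccc}
    \text{Phenotype} & \text{Brown body Red eye} & \text{Brown body Brown eye} & \text{Black body Red eye} & \text{Black body Brown eye} \\
    \text{Frequency} & 125 & 37 & 32 & 6 \\
    \text{Relative proportion from model} & 9 & 3 & 3 & 1
    \end{array}$$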
    Carry out a test, using a \(2.5 \%\) level of significance, of the goodness of fit of the genetic model to these data.
  2. The median length of European fruit flies is 2.5 mm. South American fruit flies are believed to be larger than European fruit flies. A random sample of 12 South American fruit flies is taken. The flies are found to have the following lengths (in mm). $$\begin{array} { l l l l l l l l l l l l } 1.7 & 1.4 & 3.1 & 3.5 & 3.8 & 4.2 & 2.2 & 2.9 & 4.4 & 2.6 & 3.9 & 3.2 \end{array}$$ Carry out a Wilcoxon signed rank test, using a \(5 \%\) level of significance, to test this belief.
OCR S4 2009 June Q2
11 marks Standard +0.8
2 A company wishes to buy a new lathe for making chair legs. Two models of lathe, 'Allegro' and 'Vivace', were trialled. The company asked 12 randomly selected employees to make a particular type of chair leg on each machine. The times, in seconds, for each employee are shown in the table.
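$$\begin{array}{l|cccccccccccc}
\text{Employee} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\
\text{Time on Allegro} & 162 & 111 & 194 & 159 & 202 & 210 & 183 & 168 & 165 & 150 & 185 & 160 \\
\text{Time on Vivace} & 182 & 130 & 193 & 181 & 192 & 205 & 186 & 184 & 192 & 180 & 178 & 189
\end{array}$$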
The company wishes to test whether there is any difference in average times for the two machines.
  1. State the circumstances under which a non-parametric test should be used.
  2. Use two different non-parametric tests and show that they lead to different conclusions at the 5\% significance level.
  3. State, with a reason, which conclusion is to be preferred.
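A sketch of how the two tests in part 2 might be compared, assuming they are the sign test and the Wilcoxon signed-rank test applied to the paired differences (Vivace minus Allegro), with times read as three-digit values. Exact two-sided p-values are computed from the null distributions rather than looked up in tables; all names are illustrative.

```python
from math import comb

allegro = [162, 111, 194, 159, 202, 210, 183, 168, 165, 150, 185, 160]
vivace = [182, 130, 193, 181, 192, 205, 186, 184, 192, 180, 178, 189]
diffs = [v - a for a, v in zip(allegro, vivace)]
n = len(diffs)  # no zero differences and no tied |differences| in these data

# Sign test: the smaller sign count is Binomial(n, 0.5) under H0.
fewer = min(sum(d > 0 for d in diffs), sum(d < 0 for d in diffs))
p_sign = min(1.0, 2 * sum(comb(n, i) for i in range(fewer + 1)) / 2 ** n)

# Wilcoxon signed-rank test: T is the smaller of the two rank sums.
order = sorted(diffs, key=abs)
t_minus = sum(rank for rank, d in enumerate(order, start=1) if d < 0)
t_plus = n * (n + 1) // 2 - t_minus
t = min(t_plus, t_minus)

# Exact null distribution of a rank sum: count subsets of {1..n} by total.
counts = [1] + [0] * (n * (n + 1) // 2)
for r in range(1, n + 1):
    for s in range(len(counts) - 1, r - 1, -1):
        counts[s] += counts[s - r]
p_wilcoxon = min(1.0, 2 * sum(counts[: t + 1]) / 2 ** n)

print(fewer, p_sign, t, p_wilcoxon)
```

The signed-rank test uses the sizes of the differences as well as their signs, which is why the two p-values can fall on opposite sides of 5%.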
OCR S4 2010 June Q5
11 marks Standard +0.3
5 In order to test whether the median salary of employees in a certain industry who had worked for three years was \(\pounds 19500\), the salaries \(x\), in thousands of pounds, of 50 randomly chosen employees were obtained.
  1. The values \(| x - 19.5 |\) were calculated and ranked. No two values of \(x\) were identical and none was equal to 19.5. The sum of the ranks corresponding to positive values of \(( x - 19.5 )\) was 867. Stating a required assumption, carry out a suitable test at the \(5 \%\) significance level.
  2. If the assumption you stated in part 1 does not hold, what test could have been used?
OCR S4 2015 June Q2
8 marks Standard +0.3
2 The manufacturer of a painkiller, designed to relieve headaches, claims that people taking the painkiller feel relief in at most 30 minutes, on average. A random sample of eight users of the painkiller recorded the times it took for them to feel relief from their headaches. These times, in minutes, were as follows: $$\begin{array} { l l l l l l l l } 33 & 39 & 29 & 35 & 40 & 32 & 26 & 37 \end{array}$$ Use a Wilcoxon single-sample signed-rank test at the \(5 \%\) significance level to test the manufacturer's claim, stating a necessary assumption.
OCR S4 2018 June Q1
5 marks Moderate -0.5
1 A Wilcoxon signed-rank test is carried out at the \(5 \%\) level of significance on a random sample of size 32. The hypotheses are \(\mathrm { H } _ { 0 } : m = m _ { 0 } , \mathrm { H } _ { 1 } : m < m _ { 0 }\) where \(m\) is the population median and \(m _ { 0 }\) is a specific numerical value. The value obtained for the test statistic \(T\) is 162. Find the outcome of the test.
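With a sample this large the usual route is the Normal approximation to \(T\). The sketch below assumes \(T\) is the rank sum for which small values support \(\mathrm { H } _ { 1 }\), and applies a continuity correction; both are assumptions rather than details given in the question.

```python
from math import sqrt
from statistics import NormalDist

n, t = 32, 162

# Under H0 the signed-rank statistic has these moments.
mean = n * (n + 1) / 4
variance = n * (n + 1) * (2 * n + 1) / 24

z = (t + 0.5 - mean) / sqrt(variance)  # continuity correction, lower tail
p_value = NormalDist().cdf(z)
print(mean, variance, z, p_value)
```

The null hypothesis is rejected at the 5% level when `p_value` is below 0.05, or equivalently when `z` is below the lower 5% point of the standard Normal distribution.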
OCR MEI S3 2008 January Q1
18 marks Moderate -0.3
1
  1. The time (in milliseconds) taken by my computer to perform a particular task is modelled by the random variable \(T\). The probability that it takes more than \(t\) milliseconds to perform this task is given by the expression \(\mathrm { P } ( T > t ) = \frac { k } { t ^ { 2 } }\) for \(t \geqslant 1\), where \(k\) is a constant.
    1. Write down the cumulative distribution function of \(T\) and hence show that \(k = 1\).
    2. Find the probability density function of \(T\).
    3. Find the mean time for the task.
  2. For a different task, the times (in milliseconds) taken by my computer on 10 randomly chosen occasions were as follows. $$\begin{array} { c c c c c c c c c c } 6.4 & 5.9 & 5.0 & 6.2 & 6.8 & 6.0 & 5.2 & 6.5 & 5.7 & 5.3 \end{array}$$ From past experience it is thought that the median time for this task is 5.4 milliseconds. Carry out a test at the \(5 \%\) level of significance to investigate this, stating your hypotheses carefully.