5.07d Paired vs two-sample: selection

38 questions

CAIE Further Paper 4 2020 June Q6
11 marks Moderate -0.3
6 A biologist is studying the effect of nutrients on the heights to which plants grow. A random sample of 24 similar young plants is divided into two equal groups \(A\) and \(B\). The plants in group \(A\) are fed with nutrients and water and the plants in group \(B\) are given only water. After four weeks, the height, in cm, of each plant is measured and the results are as follows.
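$$\begin{array}{l|cccccccccccc}
\text{Group } A & 12.3 & 11.8 & 12.1 & 13.2 & 11.1 & 10.6 & 13.8 & 12.0 & 12.2 & 12.4 & 13.5 & 13.9 \\
\text{Group } B & 11.7 & 10.8 & 10.9 & 11.3 & 11.2 & 12.6 & 11.0 & 10.5 & 11.9 & 12.5 & 10.7 & 11.6
\end{array}$$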
The biologist decides to carry out a test at the \(5 \%\) significance level to test whether the nutrients have resulted in an increase in growth.
  1. She carries out a Wilcoxon rank-sum test. Give a reason why this is an appropriate choice of test.
  2. Carry out the Wilcoxon rank-sum test for these results.
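As a rough check on the ranking step in part 2, the rank sums for the two groups can be computed directly. This is a minimal sketch in plain Python, not part of the question or its mark scheme; `rank_sums` is a hypothetical helper name, and tied values (none occur in these data) are assumed to share the average of the ranks they occupy.

```python
def rank_sums(a, b):
    """Return (rank sum of a, rank sum of b) in the pooled, sorted sample."""
    pooled = sorted(a + b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Equal values occupy ranks i+1 .. j; give each the average rank.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return sum(rank_of[v] for v in a), sum(rank_of[v] for v in b)


group_a = [12.3, 11.8, 12.1, 13.2, 11.1, 10.6, 13.8, 12.0, 12.2, 12.4, 13.5, 13.9]
group_b = [11.7, 10.8, 10.9, 11.3, 11.2, 12.6, 11.0, 10.5, 11.9, 12.5, 10.7, 11.6]

r_a, r_b = rank_sums(group_a, group_b)
print("rank sum A:", r_a, "rank sum B:", r_b)
```

The two rank sums should add up to \(\tfrac{1}{2} \times 24 \times 25 = 300\), which gives a quick arithmetic check before the statistic is compared with tables or a Normal approximation.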
CAIE Further Paper 4 2022 June Q1
8 marks Standard +0.3
1 A manager is investigating the times taken by employees to complete a particular task as a result of the introduction of new technology. He claims that the mean time taken to complete the task is reduced by more than 0.4 minutes. He chooses a random sample of 10 employees. The times taken, in minutes, before and after the introduction of the new technology are recorded in the table.
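$$\begin{array}{l|cccccccccc}
\text{Employee} & A & B & C & D & E & F & G & H & I & J \\
\hline
\text{Time before new technology} & 10.2 & 9.8 & 12.4 & 11.6 & 10.8 & 11.2 & 14.6 & 10.6 & 12.3 & 11.0 \\
\text{Time after new technology} & 9.6 & 8.5 & 12.4 & 10.9 & 10.2 & 10.6 & 12.8 & 10.8 & 12.5 & 10.6
\end{array}$$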
  1. Test at the 10\% significance level whether the manager's claim is justified.
  2. State an assumption that is necessary for this test to be valid.
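Because each employee is timed both before and after, the data are paired and the test works on the differences. The sketch below is one way to compute the paired \(t\) statistic, assuming the hypotheses are \(H_0\): mean reduction \(= 0.4\) against \(H_1\): mean reduction \(> 0.4\); the variable names are illustrative only, and the comparison with the critical value from tables is left out.

```python
from math import sqrt
from statistics import mean, stdev

before = [10.2, 9.8, 12.4, 11.6, 10.8, 11.2, 14.6, 10.6, 12.3, 11.0]
after = [9.6, 8.5, 12.4, 10.9, 10.2, 10.6, 12.8, 10.8, 12.5, 10.6]

# Reduction in time for each employee (before minus after).
d = [b - a for b, a in zip(before, after)]

claimed_reduction = 0.4  # value of the mean reduction under H0 (assumed)
n = len(d)

# stdev() uses the unbiased (n - 1) divisor, as required for a t-test.
t_stat = (mean(d) - claimed_reduction) / (stdev(d) / sqrt(n))
print(f"t = {t_stat:.3f} on {n - 1} degrees of freedom")
```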
CAIE Further Paper 4 2024 June Q1
7 marks Standard +0.3
1 A college uses two assessments, \(X\) and \(Y\), when interviewing applicants for research posts at the college. These assessments have been used for a large number of applicants this year. The scores for a random sample of 9 applicants who took assessment \(X\) are as follows. $$\begin{array} { l l l l l l l l l } 21.4 & 24.6 & 25.3 & 22.7 & 20.8 & 21.5 & 22.9 & 21.3 & 22.3 \end{array}$$ The scores for a random sample of 10 applicants who took assessment \(Y\) are as follows. $$\begin{array} { l l l l l l l l l l } 20.9 & 23.5 & 24.8 & 21.9 & 23.4 & 24.0 & 23.8 & 24.1 & 25.1 & 25.8 \end{array}$$ The interviewer believes that the population median score from assessment \(X\) is lower than the population median score from assessment \(Y\). Carry out a Wilcoxon rank-sum test, at the \(1 \%\) significance level, to test whether the interviewer's belief is supported by the data.
CAIE Further Paper 4 2021 November Q6
9 marks Standard +0.3
6 The blood cholesterol levels, measured in suitable units, of a random sample of 11 women and a random sample of 12 men are shown below.
Women51552421671522567513798238235
Men3112621703021753202202607235186333
Carry out a Wilcoxon rank-sum test, at the \(5 \%\) significance level, to test whether, on average, there is a difference in cholesterol levels between women and men.
CAIE Further Paper 4 2021 November Q4
8 marks Standard +0.3
4 Applicants for a particular college take a written test when they attend for interview. There are two different written tests, \(A\) and \(B\), and each applicant takes one or the other. The interviewer wants to determine whether the medians of the distribution of marks obtained in the two tests are equal. The marks obtained by a random sample of 8 applicants who took test \(A\) and a random sample of 8 applicants who took test \(B\) are as follows.
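$$\begin{array}{l|cccccccc}
\text{Test } A & 46 & 32 & 29 & 12 & 33 & 18 & 25 & 40 \\
\text{Test } B & 36 & 28 & 49 & 37 & 48 & 35 & 41 & 31
\end{array}$$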
  1. Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to determine whether there is a difference in the population median marks obtained in the two tests.
    The interviewer considers using the given information to carry out a paired sample \(t\)-test to determine whether there is a difference in the population means for the two tests.
  2. Give two reasons why it is not appropriate to use this test.
CAIE Further Paper 4 2022 November Q6
10 marks Standard +0.3
6 The manager of a technology company \(A\) claims that his employees earn more per year than the employees at technology company \(B\). The amounts earned per year, in hundreds of dollars, by a random sample of 12 employees from company \(A\) and an independent random sample of 12 employees from company \(B\) are shown below.
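$$\begin{array}{l|cccccccccccc}
\text{Company } A & 461 & 482 & 374 & 512 & 415 & 452 & 502 & 427 & 398 & 545 & 612 & 359 \\
\text{Company } B & 454 & 506 & 491 & 384 & 361 & 443 & 401 & 472 & 414 & 342 & 355 & 437
\end{array}$$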
  1. Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to test whether the manager's claim is supported by the data.
  2. Explain whether a paired sample \(t\)-test would be appropriate to test the manager's claim if earnings are normally distributed.
CAIE Further Paper 4 2023 November Q5
16 marks Standard +0.8
5 A company is deciding which of two machines, \(X\) and \(Y\), can make a certain type of electrical component more quickly. The times taken, in minutes, to make one component of this type are recorded for a random sample of 8 components made by machine \(X\) and a random sample of 9 components made by machine \(Y\). These times are as follows.
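$$\begin{array}{l|ccccccccc}
\text{Machine } X & 4.0 & 4.6 & 4.7 & 4.8 & 5.0 & 5.2 & 5.6 & 5.8 & \\
\text{Machine } Y & 4.5 & 4.9 & 5.1 & 5.3 & 5.4 & 5.7 & 5.9 & 6.3 & 6.4
\end{array}$$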
The manager claims that on average the time taken by machine \(X\) to make one component is less than that taken by machine \(Y\).
  1. Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to test whether the manager's claim is supported by the data.
  2. Assuming that the times taken to produce the components by the two machines are normally distributed with equal variances, carry out a \(t\)-test at the \(5 \%\) significance level to test whether the manager's claim is supported by the data.
  3. In general, would you expect the conclusions from the tests in parts (a) and (b) to be the same? Give a reason for your answer.
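For part 2, the samples are independent, so the equal-variance (pooled) two-sample \(t\) statistic applies rather than a paired one. The following is a sketch only, under the question's stated assumptions of Normality and equal variances; `pooled_t` is a hypothetical helper name, and the critical value is not looked up here.

```python
from math import sqrt
from statistics import mean, variance

x = [4.0, 4.6, 4.7, 4.8, 5.0, 5.2, 5.6, 5.8]        # machine X
y = [4.5, 4.9, 5.1, 5.3, 5.4, 5.7, 5.9, 6.3, 6.4]   # machine Y


def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance estimate, and its df."""
    n_x, n_y = len(x), len(y)
    df = n_x + n_y - 2
    # Weighted average of the two unbiased sample variances.
    pooled_var = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / df
    se = sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return (mean(x) - mean(y)) / se, df


t_stat, df = pooled_t(x, y)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

A negative value of `t_stat` points in the direction of the manager's claim (machine \(X\) faster), since the statistic is built from \(\bar{x} - \bar{y}\).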
OCR MEI S3 2007 January Q4
18 marks Standard +0.3
4
  1. An amateur weather forecaster has been keeping records of air pressure, measured in atmospheres. She takes the measurement at the same time every day using a barometer situated in her garden. A random sample of 100 of her observations is summarised in the table below. The corresponding expected frequencies for a Normal distribution, with its two parameters estimated by sample statistics, are also shown in the table.
    $$\begin{array}{l|c|c}
    \text{Pressure (} a \text{ atmospheres)} & \text{Observed frequency} & \text{Frequency as given by Normal model} \\
    \hline
    a \leqslant 0.98 & 4 & 1.45 \\
    0.98 < a \leqslant 0.99 & 6 & 5.23 \\
    0.99 < a \leqslant 1.00 & 9 & 13.98 \\
    1.00 < a \leqslant 1.01 & 15 & 23.91 \\
    1.01 < a \leqslant 1.02 & 37 & 26.15 \\
    1.02 < a \leqslant 1.03 & 21 & 18.29 \\
    1.03 < a & 8 & 10.99
    \end{array}$$
    Carry out a test at the \(5 \%\) level of significance of the goodness of fit of the Normal model. State your conclusion carefully and comment on your findings.
  2. The forecaster buys a new digital barometer that can be linked to her computer for easier recording of observations. She decides that she wishes to compare the readings of the new barometer with those of the old one. For a random sample of 10 days, the readings (in atmospheres) of the two barometers are shown below.
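    $$\begin{array}{l|cccccccccc}
    \text{Day} & A & B & C & D & E & F & G & H & I & J \\
    \hline
    \text{Old} & 0.992 & 1.005 & 1.001 & 1.011 & 1.026 & 0.980 & 1.020 & 1.025 & 1.042 & 1.009 \\
    \text{New} & 0.985 & 1.003 & 1.002 & 1.014 & 1.022 & 0.988 & 1.030 & 1.016 & 1.047 & 1.025
    \end{array}$$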
    Use an appropriate Wilcoxon test to examine at the \(10 \%\) level of significance whether there is any reason to suppose that, on the whole, readings on the old and new barometers do not agree.
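Since both barometers are read on the same days, the readings in part 2 are paired, which suggests the Wilcoxon signed-rank test on the differences. The sketch below computes the two signed-rank sums; `signed_rank_sums` is a hypothetical helper, and rounding the differences to 3 decimal places is an assumption made here to match the precision of the data and avoid floating-point noise.

```python
def signed_rank_sums(x, y, decimals=3):
    """Return (W+, W-) for paired data, ranking the non-zero |x - y|."""
    d = [round(a - b, decimals) for a, b in zip(x, y)]
    d = [v for v in d if v != 0]            # zero differences are dropped
    ordered = sorted(abs(v) for v in d)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2   # average rank for ties
        i = j
    w_plus = sum(rank_of[abs(v)] for v in d if v > 0)
    w_minus = sum(rank_of[abs(v)] for v in d if v < 0)
    return w_plus, w_minus


old = [0.992, 1.005, 1.001, 1.011, 1.026, 0.980, 1.020, 1.025, 1.042, 1.009]
new = [0.985, 1.003, 1.002, 1.014, 1.022, 0.988, 1.030, 1.016, 1.047, 1.025]

w_plus, w_minus = signed_rank_sums(old, new)
print("W+ =", w_plus, "W- =", w_minus)
```

For a two-sided test the smaller of the two sums is the value usually compared with the tabulated critical value for \(n = 10\).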
OCR S4 2008 June Q4
7 marks Standard +0.3
4 William takes a bus regularly on the same journey, sometimes in the morning and sometimes in the afternoon. He wishes to compare morning and afternoon journey times. He records the journey times on 7 randomly chosen mornings and 8 randomly chosen afternoons. The results, each correct to the nearest minute, are as follows, where M denotes a morning time and A denotes an afternoon time.
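$$\begin{array}{ccccccccccccccc}
\text{M} & \text{A} & \text{A} & \text{M} & \text{M} & \text{M} & \text{M} & \text{M} & \text{M} & \text{A} & \text{A} & \text{A} & \text{A} & \text{A} & \text{A} \\
19 & 20 & 22 & 24 & 25 & 26 & 28 & 30 & 31 & 33 & 35 & 37 & 38 & 39 & 42
\end{array}$$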
William wishes to test for a difference between the average times of morning and afternoon journeys.
  1. Given that \(s _ { M } ^ { 2 } = 16.5\) and \(s _ { A } ^ { 2 } = 64.5\), with the usual notation, explain why a \(t\)-test is not appropriate in this case.
  2. William chooses a non-parametric test at the \(5 \%\) significance level. Carry out the test, stating the rejection region.
OCR S4 2011 June Q5
11 marks Standard +0.3
5 A test was carried out to compare the breaking strengths of two brands of elastic band, \(A\) and \(B\), of the same size. Random samples of 6 were selected from each brand and the breaking strengths were measured. The results, in suitable units and arranged in ascending order for each brand, are as follows.
$$\begin{array}{l|cccccc}
\text{Brand } A & 5.6 & 8.7 & 9.2 & 10.7 & 11.2 & 12.6 \\
\text{Brand } B & 10.1 & 11.6 & 12.0 & 12.2 & 12.9 & 13.5
\end{array}$$
  1. Give one advantage that a non-parametric test might have over a parametric test in this context.
  2. Carry out a suitable Wilcoxon test at the \(5 \%\) significance level of whether there is a difference between the average breaking strengths of the two brands.
  3. An extra elastic band of brand \(B\) was tested and found to have a breaking strength exceeding all of the other 12 bands. Determine whether this information alters the conclusion of your test.
OCR MEI S4 2006 June Q3
24 marks Standard +0.3
3 The human resources department of a large company is investigating two methods, A and B, for training employees to carry out a certain complicated and intricate task.
  1. Two separate random samples of employees who have not previously performed the task are taken. The first sample is of size 10; each of the employees in it is trained by method A. The second sample is of size 12; each of the employees in it is trained by method B. After completing the training, the time for each employee to carry out the task is measured, in controlled conditions. The times are as follows, in minutes.
    $$\begin{array}{l|cccccccccccc}
    \text{Employees trained by method A} & 35.2 & 47.8 & 25.8 & 38.0 & 53.6 & 31.0 & 33.9 & 35.4 & 21.6 & 42.5 & & \\
    \text{Employees trained by method B} & 43.0 & 57.5 & 68.6 & 20.9 & 31.4 & 44.9 & 62.8 & 27.6 & 41.8 & 46.1 & 39.8 & 61.6
    \end{array}$$
    Stating appropriate assumptions concerning the underlying populations, use a \(t\) test at the \(5 \%\) significance level to examine whether either training method is better in respect of leading, on the whole, to a lower time to carry out the task.
  2. A further trial of method B is carried out to see if the performance of experienced and skilled workers can be improved by re-training them. A random sample of 8 such workers is taken. The times in minutes, under controlled conditions, for each worker to carry out the task before and after re-training are as follows.
    $$\begin{array}{l|cccccccc}
    \text{Worker} & W_1 & W_2 & W_3 & W_4 & W_5 & W_6 & W_7 & W_8 \\
    \hline
    \text{Time before} & 32.6 & 28.5 & 22.9 & 27.6 & 34.9 & 28.8 & 34.2 & 31.3 \\
    \text{Time after} & 26.2 & 24.1 & 19.0 & 28.6 & 29.3 & 20.0 & 36.0 & 19.2
    \end{array}$$
    Stating an appropriate assumption, use a \(t\) test at the \(5 \%\) significance level to examine whether the re-training appears, on the whole, to lead to a lower time to carry out the task.
  3. Explain how the test procedure in part (ii) is enhanced by designing it as a paired comparison.
OCR MEI S4 2012 June Q3
24 marks Standard +0.3
3 At an agricultural research station, trials are being made of two fertilisers, A and B, to see whether they differ in their effects on the yield of a crop. Preliminary investigations have established that the underlying variances of the distributions of yields using the two fertilisers may be assumed equal. Scientific analysis of the fertilisers has suggested that fertiliser A may be inferior in that it leads, on the whole, to lower yield. A statistical analysis is being carried out to investigate this. The crop is grown in carefully controlled conditions in 14 experimental plots, 6 with fertiliser A and 8 with fertiliser B. The yields, in kg per plot, are as follows, arranged in ascending order for each fertiliser.
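$$\begin{array}{l|cccccccc}
\text{Fertiliser A} & 9.8 & 10.2 & 10.9 & 11.5 & 12.7 & 13.3 & & \\
\text{Fertiliser B} & 10.8 & 11.9 & 12.0 & 12.2 & 12.9 & 13.5 & 13.6 & 13.7
\end{array}$$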
  1. Carry out a Wilcoxon rank sum test at the \(5 \%\) significance level to examine appropriate hypotheses.
  2. Carry out a \(t\) test at the \(5 \%\) significance level to examine appropriate hypotheses.
  3. Goodness of fit tests based on more extensive data sets from other trials with these fertilisers have failed to reject hypotheses of underlying Normal distributions. Discuss the relative merits of the analyses in parts (i) and (ii).
OCR MEI S4 2015 June Q3
24 marks Standard +0.3
3 At an agricultural research station, trials are being carried out to compare a standard variety of tomato with one that has been genetically modified (GM). The trials are concerned with the mean weight of the tomatoes and also with the aesthetic appearance of the tomatoes.
    1. Tomatoes of the standard and GM varieties are grown under similar conditions. The tomatoes are weighed and the data are summarised as follows.
      $$\begin{array}{l|c|c|c}
      \text{Variety} & \text{Sample size} & \text{Sum of weights } (\mathrm{g}) & \text{Sum of squares of weights } (\mathrm{g}^2) \\
      \hline
      \text{Standard} & 30 & 3218.3 & 349257 \\
      \text{GM} & 26 & 2954.1 & 338691
      \end{array}$$
      Carry out a test, using the Normal distribution, to investigate whether there is evidence, at the 5\% level of significance, that the two varieties of tomato differ in mean weight. State one assumption required for this test to be valid.
    2. The data in part (i) could have been used to carry out a test for the equality of means based on the \(t\) distribution. State two additional assumptions required for this test to be valid. Discuss briefly which test would be preferable in this case.
  1. In order to judge whether, on the whole, GM tomatoes have a better aesthetic appearance than standard tomatoes, a trial is carried out as follows. 10 of each variety are chosen and a consumer panel is asked to arrange the 20 tomatoes in order according to their appearance.
    1. State two important features of the way in which this trial should be designed. Comment briefly on how reliable the evidence from the trial is likely to be.
    2. The order in which the consumer panel arranges the tomatoes is as follows. The tomato with best appearance is listed first. \(G\) and \(S\) denote GM and standard tomatoes respectively. $$\begin{array} { c c c c c c c c c c c c c c c c c c c c } G & G & G & S & G & G & G & S & G & S & S & S & G & G & S & G & S & S & S & S \end{array}$$ Carry out an appropriate test at the \(1 \%\) level of significance.
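The panel's ordering already gives the ranks 1 to 20, so the rank sums for a Wilcoxon rank-sum test can be read straight off the sequence. A minimal sketch, assuming rank 1 is assigned to the tomato listed first (best appearance); the string `order` is simply the sequence above typed without spaces.

```python
order = "GGGSGGGSGSSSGGSGSSSS"   # best appearance first

rank_sum = {"G": 0, "S": 0}
for rank, label in enumerate(order, start=1):
    rank_sum[label] += rank      # add this position's rank to its variety

print(rank_sum)
```

With this ranking convention a small rank sum for \(G\) corresponds to GM tomatoes tending to be placed nearer the top of the ordering.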
OCR MEI S4 2016 June Q3
24 marks Standard +0.3
3 A large department in a university wished to compare the standards of literacy and numeracy of its students. A random sample of 24 students was taken and sub-divided, randomly, into two groups of 12. The students in one group took a literacy assessment (scores denoted by \(x\)); the students in the other group took a numeracy assessment (scores denoted by \(y\)). The two assessments were designed to give the same distributions of scores when taken by random samples from the general population. The scores obtained by the students on the two assessments are shown in the table.
$$\begin{array}{l|cccccccccccc}
x & 23 & 42 & 43 & 46 & 48 & 48 & 50 & 54 & 58 & 59 & 62 & 65 \\
y & 44 & 36 & 63 & 55 & 53 & 58 & 63 & 80 & 61 & 57 & 83 & 54
\end{array}$$
$$\sum x = 598 \quad \sum x ^ { 2 } = 31196 \quad \sum y = 707 \quad \sum y ^ { 2 } = 43543$$
  1. Carry out an appropriate \(t\) test, at the \(5 \%\) level of significance, to compare the standards of literacy and numeracy.
  2. State the distributional assumptions required for the \(t\) test to be valid. Name the test that you would use if the assumptions required for the \(t\) test are thought not to hold. State the hypotheses for this new test. Explain, in general terms, which of the two tests is more powerful, and why.
    A statistician at the university looked at the data and commented that a paired sample design would have been better.
  3. Explain how a paired sample design would be applied in this context, and how the data would be analysed. Explain also why it would be better than the design used.
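For part 1, the summary totals quoted in the question are enough to form the pooled two-sample \(t\) statistic without re-entering the raw scores. This is a sketch under the usual assumptions (independent samples, Normal populations, common variance); `pooled_t_from_sums` is a hypothetical helper name, not something defined by the question.

```python
from math import sqrt


def pooled_t_from_sums(n_x, sum_x, sum_x2, n_y, sum_y, sum_y2):
    """Pooled two-sample t statistic from n, sum and sum of squares."""
    s_xx = sum_x2 - sum_x ** 2 / n_x     # corrected sum of squares for x
    s_yy = sum_y2 - sum_y ** 2 / n_y     # corrected sum of squares for y
    pooled_var = (s_xx + s_yy) / (n_x + n_y - 2)
    se = sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return (sum_x / n_x - sum_y / n_y) / se


# Totals as quoted in the question (x: literacy, y: numeracy).
t_stat = pooled_t_from_sums(12, 598, 31196, 12, 707, 43543)
print(f"t = {t_stat:.3f} on {12 + 12 - 2} degrees of freedom")
```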
OCR S4 2016 June Q2
8 marks Standard +0.3
2 Low density lipoprotein (LDL) cholesterol is known as 'bad' cholesterol.
15 randomly chosen patients, each with an LDL level of 190 mg per decilitre of blood, were given one of two treatments, chosen at random. After twelve weeks their LDL levels, in mg per decilitre, were as follows.
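$$\begin{array}{l|cccccccc}
\text{Treatment } A & 189 & 168 & 176 & 186 & 183 & 187 & 188 & \\
\text{Treatment } B & 177 & 179 & 173 & 180 & 178 & 170 & 175 & 174
\end{array}$$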
Use a Wilcoxon rank sum test, at the \(5 \%\) level of significance, to test whether the LDL levels of patients given treatment \(B\) are lower than the LDL levels of patients given treatment \(A\).
OCR S4 2017 June Q4
12 marks Standard +0.3
4 The heights of eleven randomly selected primary school children are measured. The results, in metres, are
$$\begin{array}{l|cccccc}
\text{Girls} & 1.48 & 1.31 & 1.63 & 1.38 & 1.56 & 1.57 \\
\text{Boys} & 1.44 & 1.35 & 1.32 & 1.28 & 1.27 &
\end{array}$$
  1. Use a Wilcoxon rank-sum test, at the \(1 \%\) significance level, to test whether primary school girls are taller than primary school boys.
  2. It is decided to repeat the test, using larger random samples. The heights of twenty girls and eighteen boys are measured. Find the greatest value of the test statistic \(W\) which will result in the conclusion that there is evidence, at the \(1 \%\) level of significance, that primary school girls are taller than primary school boys.
CAIE Further Paper 4 2020 Specimen Q3
8 marks Standard +0.3
3 Employees at a particular company have been working seven hours each day, from 9 am to 4 pm. To try to reduce absence, the company decides to introduce 'flexi-time' and allow employees to work their seven hours each day at any time between 7 am and 9 pm. For a random sample of 10 employees, the numbers of hours of absence in the year before and the year after the introduction of flexi-time are given in the following table.
Employee\(A\)\(B\)\(C\)\(D\)\(E\)\(F\)\(G\)\(H\)\(I\)\(J\)
Before4235967420578451460
After34321007231261351400
Test, at the \(10\%\) significance level, whether the population mean number of hours of absence has decreased following the introduction of flexi-time, stating any assumption that you make.
OCR S4 2009 June Q2
11 marks Standard +0.8
2 A company wishes to buy a new lathe for making chair legs. Two models of lathe, 'Allegro' and 'Vivace', were trialled. The company asked 12 randomly selected employees to make a particular type of chair leg on each machine. The times, in seconds, for each employee are shown in the table.
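$$\begin{array}{l|cccccccccccc}
\text{Employee} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\
\hline
\text{Time on Allegro} & 162 & 111 & 194 & 159 & 202 & 210 & 183 & 168 & 165 & 150 & 185 & 160 \\
\text{Time on Vivace} & 182 & 130 & 193 & 181 & 192 & 205 & 186 & 184 & 192 & 180 & 178 & 189
\end{array}$$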
The company wishes to test whether there is any difference in average times for the two machines.
  1. State the circumstances under which a non-parametric test should be used.
  2. Use two different non-parametric tests and show that they lead to different conclusions at the 5\% significance level.
  3. State, with a reason, which conclusion is to be preferred.
OCR S4 2015 June Q6
9 marks Moderate -0.8
6 In a two-tail Wilcoxon rank-sum test, the sample sizes are 13 and 15. The sum of the ranks for the sample of size 13 is 135. Carry out the test at the \(5 \%\) level of significance.
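With samples this large, one common approach is the Normal approximation to the rank-sum statistic, \(W \sim \mathrm{N}\left(\tfrac{1}{2}m(m+n+1), \tfrac{1}{12}mn(m+n+1)\right)\), where \(m\) is the size of the sample whose ranks are summed. The sketch below is illustrative only; `rank_sum_z` is a hypothetical helper, and the half-unit continuity correction is an assumption, since conventions differ between specifications.

```python
from math import sqrt


def rank_sum_z(w, m, n):
    """Mean, variance and continuity-corrected z for a rank sum w."""
    mean_w = m * (m + n + 1) / 2
    var_w = m * n * (m + n + 1) / 12
    # Continuity correction: move w half a unit towards the mean.
    if w < mean_w:
        corrected = w + 0.5
    elif w > mean_w:
        corrected = w - 0.5
    else:
        corrected = w
    return mean_w, var_w, (corrected - mean_w) / sqrt(var_w)


mean_w, var_w, z = rank_sum_z(135, m=13, n=15)
print(f"mean = {mean_w}, variance = {var_w}, z = {z:.3f}")
```

For a two-tail test at the \(5 \%\) level, `z` would then be compared with \(\pm 1.96\).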
OCR S4 2018 June Q2
8 marks Standard +0.3
2 The distances from home to work, in km, of 8 men and 5 women were recorded and are given below. The workers were chosen at random.
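$$\begin{array}{l|cccccccc}
\text{Men} & 4 & 7 & 10 & 13 & 16 & 17 & 20 & 21 \\
\text{Women} & 1 & 2 & 14 & 18 & 22 & & &
\end{array}$$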
Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to test whether there is a significant difference between the distances from home to work between men and women.
OCR Further Statistics 2019 June Q8
10 marks Standard +0.3
8 A university course was taught by two different professors. Students could choose whether to attend the lectures given by Professor \(Q\) or the lectures given by Professor \(R\). At the end of the course all the students took the same examination. The examination marks of a random sample of 30 students taught by Professor \(Q\) and a random sample of 24 students taught by Professor \(R\) were ranked. The sum of the ranks of the students taught by Professor \(Q\) was 726. Test at the 5\% significance level whether there is a difference in the ranks of the students taught by the two professors.
OCR Further Statistics 2021 November Q7
12 marks Standard +0.3
7 In a school opinion poll a random sample of 8 pupils were asked to rate school lunches on a scale of 0 to 20. The results were as follows. \(\begin{array} { l l l l l l l l } 0 & 1 & 2 & 3 & 4 & 10 & 11 & 13 \end{array}\) After a new menu was introduced, the test was repeated with a different random sample of 8 pupils. The results were as follows. \(\begin{array} { l l l l l l l l } 7 & 8 & 9 & 14 & 15 & 17 & 19 & 20 \end{array}\)
  1. Carry out an appropriate Wilcoxon test at the \(5 \%\) significance level to test whether pupils' opinions of school lunches have changed.
    A statistics student tells the organisers of the opinion poll that it would have been better to have asked the same 8 pupils both times.
  2. Explain why the statistics student's suggestion would produce a better test.
  3. State which test should be used if the student's suggestion is followed.
  4. You are given that there are 12870 ways in which 8 different integers can be chosen from the integers 1 to 16 inclusive. Estimate the number of ways of selecting 8 different integers between 1 and 16 inclusive that have a sum less than or equal to the critical value used in the test in part (a).
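The figure 12870 in part 4 is \(\binom{16}{8}\), and because each set of 8 ranks is equally likely under the null hypothesis, counting rank sets with a small sum is what underlies the tabulated critical values. The sketch below counts such sets by brute force; `count_rank_sets` is a hypothetical helper, and `max_sum` is left as a parameter because the critical value is not stated in the question text above.

```python
from itertools import combinations
from math import comb


def count_rank_sets(total, size, max_sum):
    """Number of size-element subsets of 1..total with sum <= max_sum."""
    return sum(
        1
        for ranks in combinations(range(1, total + 1), size)
        if sum(ranks) <= max_sum
    )


n_sets = comb(16, 8)                      # all ways to choose 8 ranks from 16
smallest = count_rank_sets(16, 8, 36)     # 36 = 1 + 2 + ... + 8, the minimum sum
print(n_sets, smallest)
```

Dividing `count_rank_sets(16, 8, w)` by `comb(16, 8)` gives the exact lower-tail probability \(P(W \leqslant w)\) under the null hypothesis for any chosen `w`.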
Edexcel S4 2006 January Q5
13 marks Standard +0.3
5. Seven pipes of equal length are selected at random. Each pipe is cut in half. One piece of each pipe is coated with protective paint and the other is left uncoated. All of the pieces of pipe are buried to the same depth in various soils for 6 months. The table gives the percentage area of the pieces of pipe in the various soils that are subject to corrosion.
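$$\begin{array}{l|ccccccc}
\text{Soil} & A & B & C & D & E & F & G \\
\hline
\text{\% Corrosion, coated pipe} & 39 & 40 & 43 & 32 & 42 & 33 & 36 \\
\text{\% Corrosion, uncoated pipe} & 41 & 36 & 61 & 48 & 42 & 48 & 45
\end{array}$$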
  1. Stating your hypotheses clearly and using a \(5 \%\) significance level, carry out a paired \(t\)-test to assess whether or not there is a difference between the mean percentage of corrosion on the coated pipes and the mean percentage of corrosion on the uncoated pipes.
    1. State an assumption that has been made in order to carry out this test.
    2. Comment on the validity of this assumption.
  2. State what difference would be made to the conclusion in part (a) if the test had been to determine whether or not the mean percentage of corrosion on the uncoated pipes was higher than the mean percentage of corrosion on the coated pipes. Justify your answer.
Edexcel S4 2006 January Q7
16 marks Standard +0.3
7. A psychologist gives a test to students from two different schools, \(A\) and \(B\). A group of 9 students is randomly selected from school \(A\) and given instructions on how to do the test.
A group of 7 students is randomly selected from school \(B\) and given the test without the instructions. The table shows the time taken, to the nearest second, to complete the test by the two groups.
$$\begin{array}{l|ccccccccc}
A & 11 & 12 & 12 & 13 & 14 & 15 & 16 & 17 & 17 \\
B & 8 & 10 & 11 & 13 & 13 & 14 & 14 & &
\end{array}$$
Stating your hypotheses clearly,
  1. test at the \(10 \%\) significance level, whether or not the variance of the times taken to complete the test by students from school \(A\) is the same as the variance of the times taken to complete the test by students from school \(B\). (You may assume that times taken for each school are normally distributed.)
  2. test at the \(5 \%\) significance level, whether or not the mean time taken to complete the test by students from school \(A\) is greater than the mean time taken to complete the test by students from school \(B\).
  3. Why does the result to part (a) enable you to carry out the test in part (b)?
  4. Give one factor that has not been taken into account in your analysis.
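Part 1 compares the two variances before part 2 pools them. As a sketch of the arithmetic only, the sample variances and their ratio (larger over smaller, one common convention for a two-tailed \(F\) test) can be computed as follows; the variable names are illustrative, and the comparison with the tabulated \(F\) critical value is not carried out here.

```python
from statistics import variance

a = [11, 12, 12, 13, 14, 15, 16, 17, 17]   # school A, with instructions
b = [8, 10, 11, 13, 13, 14, 14]            # school B, without instructions

var_a = variance(a)    # unbiased sample variance, divisor n - 1
var_b = variance(b)

# Put the larger variance on top so that the ratio is at least 1.
f_ratio = max(var_a, var_b) / min(var_a, var_b)
print(f"s_A^2 = {var_a:.3f}, s_B^2 = {var_b:.3f}, F = {f_ratio:.4f}")
```

The degrees of freedom for the \(F\) distribution follow the order of the ratio: those of the sample with the larger variance come first.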
Edexcel S4 2011 June Q3
8 marks Standard +0.3
3. Manuel is planning to buy a new machine to squeeze oranges in his cafe and he has two models, at the same price, on trial. The manufacturers of machine \(B\) claim that their machine produces more juice from an orange than machine \(A\). To test this claim Manuel takes a random sample of 8 oranges, cuts them in half and puts one half in machine \(A\) and the other half in machine \(B\). The amount of juice, in ml, produced by each machine is given in the table below.
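$$\begin{array}{l|cccccccc}
\text{Orange} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
\text{Machine } A & 60 & 58 & 55 & 53 & 52 & 51 & 54 & 56 \\
\text{Machine } B & 61 & 60 & 58 & 52 & 55 & 50 & 52 & 58
\end{array}$$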
Stating your hypotheses clearly, test, at the \(10 \%\) level of significance, whether or not the mean amount of juice produced by machine \(B\) is more than the mean amount produced by machine \(A\).