2.05a Hypothesis testing language: null, alternative, p-value, significance

CAIE S2 2018 June Q5
8 marks Standard +0.3
5 The time taken for a particular train journey is normally distributed. In the past, the time had mean 2.4 hours and standard deviation 0.3 hours. A new timetable is introduced and on 30 randomly chosen occasions the time for this journey is measured. The mean time for these 30 occasions is found to be 2.3 hours.
  1. Stating any assumption(s), test, at the \(5 \%\) significance level, whether the mean time for this journey has changed.
  2. A similar test at the \(5 \%\) significance level was carried out using the times from another randomly chosen 30 occasions.
    1. State the probability of a Type I error.
    2. State what is meant by a Type II error in this context.
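
The test in this question can be checked numerically with a two-tailed z-test. The sketch below assumes the standard deviation is still 0.3 hours under the new timetable, which is the kind of assumption the question asks for; the function name `two_tailed_z_test` is illustrative and not part of the question.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(xbar, mu0, sigma, n, alpha):
    """Return (z, p_value, reject) for H0: mu = mu0 against H1: mu != mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))      # standardised sample mean
    p_value = 2 * NormalDist().cdf(-abs(z))   # probability in both tails
    return z, p_value, p_value < alpha

z, p, reject = two_tailed_z_test(xbar=2.3, mu0=2.4, sigma=0.3, n=30, alpha=0.05)
print(f"z = {z:.3f}, p-value = {p:.4f}, reject H0: {reject}")
```

For any test at the \(5 \%\) level with a continuous test statistic, the probability of a Type I error equals the significance level, 0.05.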
CAIE S2 2012 June Q3
5 marks Moderate -0.3
3 When the council published a plan for a new road, only \(15 \%\) of local residents approved the plan. The council then published a revised plan and, out of a random sample of 300 local residents, 60 approved the revised plan. Is there evidence, at the \(2.5 \%\) significance level, that the proportion of local residents who approve the revised plan is greater than for the original plan?
CAIE S2 2012 June Q6
11 marks Standard +0.3
6 A survey taken last year showed that the mean number of computers per household in Branley was 1.66. This year a random sample of 50 households in Branley answered a questionnaire with the following results.

| Number of computers | 0 | 1 | 2 | 3 | 4 | \(> 4\) |
|---|---|---|---|---|---|---|
| Number of households | 5 | 12 | 18 | 10 | 5 | 0 |

  1. Calculate unbiased estimates for the population mean and variance of the number of computers per household in Branley this year.
  2. Test at the \(5 \%\) significance level whether the mean number of computers per household has changed since last year.
  3. Explain whether it is possible that a Type I error may have been made in the test in part 2.
  4. State what is meant by a Type II error in the context of the test in part 2, and give the set of values of the test statistic that could lead to a Type II error being made.
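
A minimal sketch of how the unbiased estimates and the test statistic for this question could be computed from the frequency table. It assumes the sample of 50 is large enough for the sample mean to be treated as approximately normal; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Frequency table from the question: number of computers -> number of households
# (the "> 4" column has frequency 0, so it is omitted)
freq = {0: 5, 1: 12, 2: 18, 3: 10, 4: 5}

n = sum(freq.values())
sum_x = sum(x * f for x, f in freq.items())
sum_x2 = sum(x * x * f for x, f in freq.items())

mean = sum_x / n
var = (sum_x2 - sum_x ** 2 / n) / (n - 1)   # unbiased estimate, divisor n - 1

# Two-tailed test of H0: mu = 1.66 against H1: mu != 1.66 at the 5% level
z = (mean - 1.66) / sqrt(var / n)
critical = NormalDist().inv_cdf(0.975)
reject = abs(z) > critical
print(f"mean = {mean:.2f}, variance = {var:.3f}, z = {z:.3f}, reject H0: {reject}")
```

A Type II error can only occur when the null hypothesis is not rejected, that is, when the test statistic lies strictly between `-critical` and `critical`.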
CAIE S2 2021 November Q5
9 marks Moderate -0.5
5
  1. The proportion of people having a particular medical condition is 1 in 100000. A random sample of 2500 people is obtained. The number of people in the sample having the condition is denoted by \(X\).
    1. State, with a justification, a suitable approximating distribution for \(X\), giving the values of any parameters.
    2. Use the approximating distribution to calculate \(\mathrm { P } ( X > 0 )\).
  2. The percentage of people having a different medical condition is thought to be \(30 \%\). A researcher suspects that the true percentage is less than \(30 \%\). In a medical trial a random sample of 28 people was selected and 4 people were found to have this condition. Use a binomial distribution to test the researcher's suspicion at the \(2 \%\) significance level.
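
Both calculations in this question can be sketched with the standard library. The helper `binom_cdf` is a hypothetical name introduced here, and the Poisson approximation is one reasonable choice given that the sample is large and the probability is very small.

```python
from math import comb, exp

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

# Rare condition: X ~ B(2500, 1/100000) is approximated by Po(np)
lam = 2500 / 100000
p_at_least_one = 1 - exp(-lam)

# Second condition: H0: p = 0.3 against H1: p < 0.3, with 4 cases in 28 people
p_value = binom_cdf(4, 28, 0.3)
reject = p_value < 0.02
print(f"lambda = {lam}, P(X > 0) = {p_at_least_one:.4f}")
print(f"P(X <= 4) = {p_value:.4f}, reject H0: {reject}")
```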
CAIE S2 2021 November Q7
10 marks Standard +0.3
7 The masses, in grams, of apples from a certain farm have mean \(\mu\) and standard deviation 5.2. The farmer says that the value of \(\mu\) is 64.6. A quality control inspector claims that the value of \(\mu\) is actually less than 64.6. In order to test his claim he chooses a random sample of 100 apples from the farm.
  1. The mean mass of the 100 apples is found to be 63.5 g. Carry out the test at the \(2.5 \%\) significance level.
  2. Later another test of the same hypotheses at the \(2.5 \%\) significance level, with another random sample of 100 apples from the same farm, is carried out. Given that the value of \(\mu\) is in fact 62.7, calculate the probability of a Type II error.
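
The Type II error calculation follows a standard pattern: find the critical value of the sample mean under the null hypothesis, then find the probability of landing in the acceptance region under the true mean. A sketch under the assumption that the sample mean of 100 apples is approximately normal (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 64.6, 5.2, 100, 0.025
se = sigma / sqrt(n)                      # standard error of the sample mean

# Lower-tailed test of H0: mu = 64.6 against H1: mu < 64.6
z = (63.5 - mu0) / se
z_crit = NormalDist().inv_cdf(alpha)      # about -1.96
reject = z < z_crit

# H0 is rejected when the sample mean falls below this value
xbar_crit = mu0 + z_crit * se

# Type II error: H0 is not rejected although the true mean is 62.7
beta = 1 - NormalDist(mu=62.7, sigma=se).cdf(xbar_crit)
print(f"z = {z:.3f}, reject H0: {reject}")
print(f"critical mean = {xbar_crit:.3f}, P(Type II) = {beta:.4f}")
```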
CAIE S2 2021 November Q4
7 marks Moderate -0.3
4 A certain kind of firework is supposed to last for 30 seconds, on average, after it is lit. An inspector suspects that the fireworks actually last a shorter time than this, on average. He takes a random sample of 100 fireworks of this kind. Each firework in the sample is lit and the time it lasts is noted.
  1. Give a reason why it is necessary to take a sample rather than testing all the fireworks of this kind.
    It is given that the population standard deviation of the times that fireworks of this kind last is 5 seconds.
  2. The mean time lasted by the 100 fireworks in the sample is found to be 29 seconds. Test the inspector's suspicion at the \(1 \%\) significance level.
  3. State with a reason whether the Central Limit theorem was needed in the solution to part 2.
CAIE S2 2021 November Q6
10 marks Standard +0.3
6 A machine is supposed to produce random digits. Bob thinks that the machine is not fair and that the probability of it producing the digit 0 is less than \(\frac { 1 } { 10 }\). In order to test his suspicion he notes the number of times the digit 0 occurs in 30 digits produced by the machine. He carries out a test at the \(10 \%\) significance level.
  1. State suitable null and alternative hypotheses.
  2. Find the rejection region for the test.
  3. State the probability of a Type I error.
    It is now given that the machine actually produces a 0 once in every 40 digits, on average.
  4. Find the probability of a Type II error.
  5. Explain the meaning of a Type II error in this context.
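
For a discrete test statistic the rejection region has to be found by accumulating tail probabilities. The sketch below does this for the lower tail of \(\mathrm{B}(30, 0.1)\); `binom_cdf` is a hypothetical helper, not a library function.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); returns 0 when k < 0."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

n, p0, alpha = 30, 0.1, 0.10

# Largest c with P(X <= c | p = p0) <= alpha; the rejection region is X <= c
c = -1
while binom_cdf(c + 1, n, p0) <= alpha:
    c += 1

type_1 = binom_cdf(c, n, p0)           # P(reject H0 | H0 true)
type_2 = 1 - binom_cdf(c, n, 1 / 40)   # P(do not reject H0 | p = 1/40)
print(f"rejection region: X <= {c}")
print(f"P(Type I) = {type_1:.4f}, P(Type II) = {type_2:.4f}")
```

Because the distribution is discrete, the Type I error probability comes out below the nominal \(10 \%\) level rather than equal to it.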
CAIE S2 2022 November Q7
10 marks Standard +0.3
7 In the past Laxmi's time, in minutes, for her journey to college had mean 32.5 and standard deviation 3.1. After a change in her route, Laxmi wishes to test whether the mean time has decreased. She notes her journey times for a random sample of 50 journeys and she finds that the sample mean is 31.8 minutes. You should assume that the standard deviation is unchanged.
  1. Carry out a hypothesis test, at the \(8 \%\) significance level, of whether Laxmi's mean journey time has decreased.
    Later Laxmi carries out a similar test with the same hypotheses, at the \(8 \%\) significance level, using another random sample of size 50.
  2. Given that the population mean is now 31.5, find the probability of a Type II error.
CAIE S2 2024 November Q7
9 marks Moderate -0.3
7 The heights of one-year-old trees of a certain variety are known to have mean 2.3 m. A scientist believes that, on average, trees of this age and variety in her region are slightly taller than in other places. She plans to carry out a hypothesis test, at the \(2 \%\) significance level, in order to test her belief.
  1. State the probability that she will make a Type I error.
    She takes a random sample of 100 such trees in her region and measures their heights, \(h \mathrm {~m}\). Her results are summarised below. $$n = 100 \quad \sum h = 238 \quad \sum h ^ { 2 } = 580$$
  2. Carry out the test at the \(2 \%\) significance level.
  3. The scientist carries out the test correctly, but another scientist claims that she has made a Type II error. Comment on this claim.
CAIE S2 2020 Specimen Q6
9 marks Standard +0.3
6 At a certain hospital it was found that the probability that a patient did not arrive for an appointment was 0.2. The hospital carries out some publicity in the hope that this probability will be reduced. They wish to test whether the publicity has worked. A random sample of 30 appointments is selected and the number of patients that do not arrive is noted. This figure is used to carry out a test at the \(5 \%\) significance level.
  1. Explain why the test is one-tailed and state suitable null and alternative hypotheses.
  2. Use a binomial distribution to find the critical region, and find the probability of a Type I error.
  3. In fact 3 patients out of the 30 do not arrive. State the conclusion of the test, explaining your answer.
CAIE S2 2004 June Q1
5 marks Moderate -0.3
1 Each multiple choice question in a test has 4 suggested answers, exactly one of which is correct. Rehka knows nothing about the subject of the test, but claims that she has a special method for answering the questions that is better than just guessing. There are 60 questions in the test, and Rehka gets 22 correct.
  1. State null and alternative hypotheses for a test of Rehka's claim.
  2. Using a normal approximation, test at the \(5 \%\) significance level whether Rehka's claim is justified.
CAIE S2 2004 June Q5
8 marks Standard +0.3
5 The lectures in a mathematics department are scheduled to last 54 minutes, and the times of individual lectures may be assumed to have a normal distribution with mean \(\mu\) minutes and standard deviation 3.1 minutes. One of the students commented that, on average, the lectures seemed too short. To investigate this, the times for a random sample of 10 lectures were used to test the null hypothesis \(\mu = 54\) against the alternative hypothesis \(\mu < 54\) at the \(10 \%\) significance level.
  1. Show that the null hypothesis is rejected in favour of the alternative hypothesis if \(\bar { x } < 52.74\), where \(\bar { x }\) minutes is the sample mean.
  2. Find the probability of a Type II error given that the actual mean length of lectures is 51.5 minutes.
CAIE S2 2005 June Q4
7 marks Standard +0.3
4 A study of a large sample of books by a particular author shows that the number of words per sentence can be modelled by a normal distribution with mean 21.2 and standard deviation 7.3. A researcher claims to have discovered a previously unknown book by this author. The mean length of 90 sentences chosen at random in this book is found to be 19.4 words.
  1. Assuming the population standard deviation of sentence lengths in this book is also 7.3, test at the \(5 \%\) level of significance whether the mean sentence length is the same as the author's. State your null and alternative hypotheses.
  2. State in words relating to the context of the test what is meant by a Type I error and state the probability of a Type I error in the test in part 1.
CAIE S2 2006 June Q7
11 marks Standard +0.3
7 The number of cars caught speeding on a certain length of motorway is 7.2 per day, on average. Speed cameras are introduced and the results shown in the following table are those from a random selection of 40 days after this.

| Number of cars caught speeding | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|
| Number of days | 5 | 7 | 8 | 10 | 5 | 2 | 3 |

  1. Calculate unbiased estimates of the population mean and variance of the number of cars per day caught speeding after the speed cameras were introduced.
  2. Taking the null hypothesis \(\mathrm { H } _ { 0 }\) to be \(\mu = 7.2\), test at the \(5 \%\) level whether there is evidence that the introduction of speed cameras has resulted in a reduction in the number of cars caught speeding.
  3. State what is meant by a Type I error in words relating to the context of the test in part 2. Without further calculation, illustrate on a suitable diagram the region representing the probability of this Type I error.
CAIE S2 2007 June Q3
5 marks Moderate -0.3
3 A machine has produced nails over a long period of time, where the length in millimetres was distributed as \(\mathrm { N } ( 22.0,0.19 )\). It is believed that recently the mean length has changed. To test this belief a random sample of 8 nails is taken and the mean length is found to be 21.7 mm. Carry out a hypothesis test at the \(5 \%\) significance level to test whether the population mean has changed, assuming that the variance remains the same.
CAIE S2 2007 June Q4
7 marks Standard +0.3
4 At a certain airport \(20 \%\) of people take longer than an hour to check in. A new computer system is installed, and it is claimed that this will reduce the time to check in. It is decided to accept the claim if, from a random sample of 22 people, the number taking longer than an hour to check in is either 0 or 1.
  1. Calculate the significance level of the test.
  2. State the probability that a Type I error occurs.
  3. Calculate the probability that a Type II error occurs if the probability that a person takes longer than an hour to check in is now 0.09.
CAIE S2 2008 June Q4
7 marks Standard +0.3
4 People who diet can expect to lose an average of 3 kg in a month. In a book, the authors claim that people who follow a new diet will lose an average of more than 3 kg in a month. The weight losses of the 180 people in a random sample who had followed the new diet for a month were noted. The mean was 3.3 kg and the standard deviation was 2.8 kg.
  1. Test the authors' claim at the \(5 \%\) significance level, stating your null and alternative hypotheses.
  2. State what is meant by a Type II error in words relating to the context of the test in part 1.
CAIE S2 2008 June Q5
8 marks Standard +0.3
5 When a guitar is played regularly, a string breaks on average once every 15 months. Broken strings occur at random times and independently of each other.
  1. Show that the mean number of broken strings in a 5-year period is 4.
    A guitar is fitted with a new type of string which, it is claimed, breaks less frequently. The number of broken strings of the new type was noted after a period of 5 years.
  2. The mean number of broken strings of the new type in a 5-year period is denoted by \(\lambda\). Find the rejection region for a test at the \(10 \%\) significance level when the null hypothesis \(\lambda = 4\) is tested against the alternative hypothesis \(\lambda < 4\).
  3. Hence calculate the probability of making a Type I error.
    The number of broken guitar strings of the new type, in a 5-year period, was in fact 1.
  4. State, with a reason, whether there is evidence at the \(10 \%\) significance level that guitar strings of the new type break less frequently.
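
The same tail-accumulation idea applies to a Poisson test statistic. A sketch for this question, where `poisson_cdf` is a hypothetical helper written for illustration:

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam); returns 0 when k < 0."""
    return sum(exp(-lam) * lam**r / factorial(r) for r in range(k + 1))

# One break every 15 months on average, over 5 years (60 months)
lam0 = 5 * 12 / 15
alpha = 0.10

# Largest c with P(X <= c | lambda = lam0) <= alpha; the rejection region is X <= c
c = -1
while poisson_cdf(c + 1, lam0) <= alpha:
    c += 1

type_1 = poisson_cdf(c, lam0)   # P(reject H0 | H0 true)
observed = 1
reject = observed <= c
print(f"mean = {lam0}, rejection region: X <= {c}")
print(f"P(Type I) = {type_1:.4f}, reject H0: {reject}")
```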
CAIE S2 2009 June Q1
5 marks Moderate -0.3
1 In Europe the diameters of women's rings have mean 18.5 mm. Researchers claim that women in Jakarta have smaller fingers than women in Europe. The researchers took a random sample of 20 women in Jakarta and measured the diameters of their rings. The mean diameter was found to be 18.1 mm. Assuming that the diameters of women's rings in Jakarta have a normal distribution with standard deviation 1.1 mm, carry out a hypothesis test at the \(2 \frac { 1 } { 2 } \%\) level to determine whether the researchers' claim is justified.
CAIE S2 2009 June Q4
9 marks Standard +0.3
4 In a certain city it is necessary to pass a driving test in order to be allowed to drive a car. The probability of passing the driving test at the first attempt is 0.36 on average. A particular driving instructor claims that the probability of his pupils passing at the first attempt is higher than 0.36. A random sample of 8 of his pupils showed that 7 passed at the first attempt.
  1. Carry out an appropriate hypothesis test to test the driving instructor's claim, using a significance level of \(5 \%\).
  2. In fact, most of this random sample happened to be careful and sensible drivers. State which type of error in the hypothesis test (Type I or Type II) could have been made in these circumstances and find the probability of this type of error when a sample of size 8 is used for the test.
CAIE S2 2010 June Q2
7 marks Moderate -0.8
2 A random sample of \(n\) people were questioned about their internet use. 87 of them had a high-speed internet connection. A confidence interval for the population proportion having a high-speed internet connection is \(0.1129 < p < 0.1771\).
  1. Write down the mid-point of this confidence interval and hence find the value of \(n\).
  2. This interval is an \(\alpha \%\) confidence interval. Find \(\alpha\).
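
A confidence interval for a proportion is symmetric about the sample proportion, so the mid-point gives \(\hat{p}\) and the half-width gives the z-value. A sketch of that back-calculation, assuming the usual normal-approximation interval \(\hat{p} \pm z \sqrt{\hat{p}(1-\hat{p})/n}\):

```python
from math import sqrt
from statistics import NormalDist

lower, upper, successes = 0.1129, 0.1771, 87

p_hat = (lower + upper) / 2        # mid-point of the interval
n = round(successes / p_hat)       # since p_hat = successes / n

half_width = (upper - lower) / 2
z = half_width / sqrt(p_hat * (1 - p_hat) / n)

# Two-sided confidence level, as a percentage
alpha = 100 * (2 * NormalDist().cdf(z) - 1)
print(f"p_hat = {p_hat:.3f}, n = {n}, z = {z:.3f}, confidence level = {alpha:.1f}%")
```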
CAIE S2 2010 June Q3
7 marks Standard +0.3
3 Metal bolts are produced in large numbers and have lengths which are normally distributed with mean 2.62 cm and standard deviation 0.30 cm.
  1. Find the probability that a random sample of 45 bolts will have a mean length of more than 2.55 cm.
  2. The machine making these bolts is given an annual service. This may change the mean length of bolts produced but does not change the standard deviation. To test whether the mean has changed, a random sample of 30 bolts is taken and their lengths noted. The sample mean length is \(m \mathrm {~cm}\). Find the set of values of \(m\) which result in rejection at the \(10 \%\) significance level of the hypothesis that no change in the mean length has occurred.
CAIE S2 2010 June Q7
10 marks Standard +0.8
7 A hospital patient's white blood cell count has a Poisson distribution. Before undergoing treatment the patient had a mean white blood cell count of 5.2. After the treatment a random measurement of the patient's white blood cell count is made, and is used to test at the \(10 \%\) significance level whether the mean white blood cell count has decreased.
  1. State what is meant by a Type I error in the context of the question, and find the probability that the test results in a Type I error.
  2. Given that the measured value of the white blood cell count after the treatment is 2, carry out the test.
  3. Find the probability of a Type II error if the mean white blood cell count after the treatment is actually 4.1.
CAIE S2 2010 June Q1
5 marks Moderate -0.3
1 At the 2009 election, \(\frac { 1 } { 3 }\) of the voters in Chington voted for the Citizens Party. One year later, a researcher questioned 20 randomly selected voters in Chington. Exactly 3 of these 20 voters said that if there were an election next week they would vote for the Citizens Party. Test at the \(2.5 \%\) significance level whether there is evidence of a decrease in support for the Citizens Party in Chington, since the 2009 election.
CAIE S2 2010 June Q2
5 marks Moderate -0.5
2 Dipak carries out a test, at the \(10 \%\) significance level, using a normal distribution. The null hypothesis is \(\mu = 35\) and the alternative hypothesis is \(\mu \neq 35\).
  1. Is this a one-tail or a two-tail test? State briefly how you can tell.
    Dipak finds that the value of the test statistic is \(z = - 1.750\).
  2. Explain what conclusion he should draw.
  3. This result is significant at the \(\alpha \%\) level. Find the smallest possible value of \(\alpha\), correct to the nearest whole number.
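
The last part of this question is a p-value calculation in disguise: the result is significant at any level above the two-tailed p-value. A sketch, reading "smallest possible value of \(\alpha\), correct to the nearest whole number" as the smallest whole-number percentage at which the result is still significant (an interpretation, not stated in the question):

```python
from math import ceil
from statistics import NormalDist

z = -1.750

# Two-tailed p-value for H0: mu = 35 against H1: mu != 35
p_value = 2 * NormalDist().cdf(-abs(z))

reject_at_10 = p_value < 0.10             # conclusion at the 10% level
smallest_alpha = ceil(100 * p_value)      # smallest whole-number % level that rejects
print(f"p-value = {p_value:.4f}, reject at 10%: {reject_at_10}, alpha = {smallest_alpha}")
```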