2.05a Hypothesis testing language: null, alternative, p-value, significance

282 questions

OCR MEI S2 2007 June Q4
18 marks Standard +0.3
4 The sexes and ages of a random sample of 300 runners taking part in marathons are classified as follows.
$$\begin{array}{l|cc|c}
\text{Observed: age group} \backslash \text{sex} & \text{Male} & \text{Female} & \text{Row totals} \\
\hline
\text{Under 40} & 70 & 54 & 124 \\
40 - 49 & 76 & 36 & 112 \\
\text{50 and over} & 52 & 12 & 64 \\
\hline
\text{Column totals} & 198 & 102 & 300
\end{array}$$
  1. Carry out a test at the \(5 \%\) significance level to examine whether there is any association between age group and sex. State carefully your null and alternative hypotheses. Your working should include a table showing the contributions of each cell to the test statistic.
  2. Does your analysis support the suggestion that women are less likely than men to enter marathons as they get older? Justify your answer. For marathons in general, on average \(3 \%\) of runners are 'Female, 50 and over'. The random variable \(X\) represents the number of 'Female, 50 and over' runners in a random sample of size 300.
  3. Use a suitable approximating distribution to find \(\mathrm { P } ( X \geqslant 12 )\).
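For the last part, one plausible choice of approximating distribution is the Poisson with mean \(300 \times 0.03 = 9\). That choice is an assumption of this sketch rather than something the question states. The helper names below are illustrative only. The sketch evaluates \(\mathrm { P } ( X \geqslant 12 )\) under the Poisson model and compares it with the exact binomial tail.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """Return P(X <= k) for X ~ Poisson(lam) by direct summation."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ B(n, p) by direct summation."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p = 300, 0.03   # sample size and proportion from the question
lam = n * p        # Poisson mean 9 (assumed approximating model)

approx_tail = 1 - poisson_cdf(11, lam)    # P(X >= 12) under Poisson(9)
exact_tail = 1 - binomial_cdf(11, n, p)   # same tail under B(300, 0.03)

print(f"Poisson approximation: {approx_tail:.4f}")
print(f"Exact binomial:        {exact_tail:.4f}")
```

Printing both values side by side gives a rough check on how well the approximation behaves for large \(n\) and small \(p\).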
OCR MEI S2 2008 June Q3
18 marks Moderate -0.3
3 A company has a fleet of identical vans. Company policy is to replace all of the tyres on a van as soon as any one of them is worn out. The random variable \(X\) represents the number of miles driven before the tyres on a van are replaced. \(X\) is Normally distributed with mean 27500 and standard deviation 4000.
  1. Find \(\mathrm { P } ( X > 25000 )\).
  2. 10 vans in the fleet are selected at random. Find the probability that the tyres on exactly 7 of them last for more than 25000 miles.
  3. The tyres of \(99 \%\) of vans last for more than \(k\) miles. Find the value of \(k\). A tyre supplier claims that a different type of tyre will have a greater mean lifetime. A random sample of 15 vans is fitted with these tyres. For each van, the number of miles driven before the tyres are replaced is recorded. A hypothesis test is carried out to investigate the claim. You may assume that these lifetimes are also Normally distributed with standard deviation 4000.
  4. Write down suitable null and alternative hypotheses for the test.
  5. For the 15 vans, it is found that the mean lifetime of the tyres is 28630 miles. Carry out the test at the \(5 \%\) level.
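The test in the final part can be sketched as a one-sample z-test with known standard deviation, taking \(\mathrm { H } _ { 0 } : \mu = 27500\) against \(\mathrm { H } _ { 1 } : \mu > 27500\). The function name `one_sample_z` is hypothetical; the figures come from the question.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean: float, mu0: float, sigma: float, n: int):
    """Return (z, upper-tail p-value) for a z-test with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_upper = 1 - NormalDist().cdf(z)
    return z, p_upper


# Figures from the question: 15 vans, sample mean 28630, sigma 4000.
z, p_value = one_sample_z(28630, 27500, 4000, 15)
reject_h0 = p_value < 0.05   # one-tailed test at the 5% level

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Comparing the p-value with 0.05 is equivalent to comparing \(z\) with the one-tailed critical value 1.645.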
OCR MEI S2 2008 June Q4
18 marks Standard +0.3
4 A student is investigating whether there is any association between the species of shellfish that occur on a rocky shore and where they are located. A random sample of 160 shellfish is selected and the numbers of shellfish in each category are summarised in the table below.
$$\begin{array}{l|ccc}
\text{Species} \backslash \text{Location} & \text{Exposed} & \text{Sheltered} & \text{Pool} \\
\hline
\text{Limpet} & 24 & 32 & 16 \\
\text{Mussel} & 24 & 11 & 3 \\
\text{Other} & 5 & 22 & 23
\end{array}$$
  1. Write down null and alternative hypotheses for a test to examine whether there is any association between species and location. The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.
    $$\begin{array}{l|ccc}
    \text{Contribution: species} \backslash \text{location} & \text{Exposed} & \text{Sheltered} & \text{Pool} \\
    \hline
    \text{Limpet} & 0.0009 & 0.2585 & 0.4450 \\
    \text{Mussel} & 10.3472 & 1.2756 & 4.8773 \\
    \text{Other} & 8.0719 & 0.1402 & 7.4298
    \end{array}$$
    The sum of these contributions is 32.85.
  2. Calculate the expected frequency for mussels in pools. Verify the corresponding contribution 4.8773 to the test statistic.
  3. Carry out the test at the \(5 \%\) level of significance, stating your conclusion clearly.
  4. For each species, comment briefly on how its distribution compares with what would be expected if there were no association.
  5. If 3 of the 160 shellfish are selected at random, one from each of the 3 types of location, find the probability that all 3 of them are limpets.
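The expected frequencies and contributions used in this question can be reproduced with a short sketch. It reads the observed table as Limpet 24, 32, 16; Mussel 24, 11, 3; Other 5, 22, 23, which totals 160. The function name is illustrative, and the critical value 9.488 is the tabulated 5% point of \(\chi ^ { 2 }\) with 4 degrees of freedom.

```python
observed = [
    [24, 32, 16],  # Limpet: exposed, sheltered, pool
    [24, 11, 3],   # Mussel
    [5, 22, 23],   # Other
]


def chi2_contributions(table):
    """Return (expected, contributions) for a two-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected frequency = row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    # Contribution of each cell = (O - E)^2 / E.
    contrib = [
        [(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(table, expected)
    ]
    return expected, contrib


expected, contrib = chi2_contributions(observed)
statistic = sum(sum(row) for row in contrib)
CRITICAL_5PCT_DF4 = 9.488   # from chi-squared tables, (3-1)*(3-1) = 4 d.f.

print(f"Expected mussels in pools: {expected[1][2]:.3f}")
print(f"Contribution: {contrib[1][2]:.4f}, statistic: {statistic:.2f}")
print("Reject H0:", statistic > CRITICAL_5PCT_DF4)
```

The computed mussel-in-pool contribution and the total should agree with the 4.8773 and 32.85 quoted in the question.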
OCR MEI S4 2008 June Q3
24 marks Standard +0.3
3
  1. Explain the meaning of the following terms in the context of hypothesis testing: Type I error, Type II error, operating characteristic. A machine fills salt containers that will be sold in shops. The containers are supposed to contain 750 g of salt. The machine operates in such a way that the amount of salt delivered to each container is a Normally distributed random variable with standard deviation 20 g. The machine should be calibrated in such a way that the mean amount delivered, \(\mu\), is 750 g. Each hour, a random sample of 9 containers is taken from the previous hour's output and the sample mean amount of salt is determined. If this is between 735 g and 765 g, the previous hour's output is accepted. If not, the previous hour's output is rejected and the machine is recalibrated.
  2. Find the probability of rejecting the previous hour's output if the machine is properly calibrated. Comment on your result.
  3. Find the probability of accepting the previous hour's output if \(\mu = 725 \mathrm {~g}\). Comment on your result.
  4. Obtain an expression for the operating characteristic of this testing procedure in terms of the cumulative distribution function \(\Phi ( z )\) of the standard Normal distribution. Evaluate the operating characteristic for the following values (in g) of \(\mu\) : 720, 730, 740, 750, 760, 770, 780.
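Since the sample mean of 9 containers is Normal with standard deviation \(20 / \sqrt { 9 }\), the probability of accepting the output can be written as \(\Phi \left( \frac { 765 - \mu } { 20 / 3 } \right) - \Phi \left( \frac { 735 - \mu } { 20 / 3 } \right)\). The sketch below evaluates that expression. The function and constant names are illustrative, not taken from the question.

```python
from math import sqrt
from statistics import NormalDist

SIGMA, N = 20.0, 9            # population s.d. and sample size from the question
LOWER, UPPER = 735.0, 765.0   # acceptance limits for the sample mean


def operating_characteristic(mu: float) -> float:
    """Return P(accept the hour's output) when the true mean is mu."""
    se = SIGMA / sqrt(N)      # standard error of the sample mean, 20/3
    phi = NormalDist().cdf
    return phi((UPPER - mu) / se) - phi((LOWER - mu) / se)


for mu in (720, 730, 740, 750, 760, 770, 780):
    print(mu, round(operating_characteristic(mu), 4))

# Probability of rejecting when the machine is properly calibrated.
print("P(reject | mu = 750) =", round(1 - operating_characteristic(750), 4))
```

The values are symmetric about 750 because the acceptance limits are symmetric about the target mean.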
OCR S2 2015 June Q4
10 marks Standard +0.3
4 A continuous random variable is normally distributed with mean \(\mu\). A significance test for \(\mu\) is carried out, at the \(5 \%\) significance level, on 90 independent occasions.
  1. Given that the null hypothesis is correct on all 90 occasions, use a suitable approximation to find the probability that on 6 or fewer occasions the test results in a Type I error. Justify your approximation.
  2. Given instead that on all 90 occasions the probability of a Type II error is 0.35, use a suitable approximation to find the probability that on fewer than 29 occasions the test results in a Type II error.
OCR S2 2015 June Q6
12 marks Standard +0.3
6 Records for a doctors' surgery over a long period suggest that the time taken for a consultation, \(T\) minutes, has a mean of 11.0. Following the introduction of new regulations, a doctor believes that the average time has changed. She finds that, with new regulations, the consultation times for a random sample of 120 patients can be summarised as $$n = 120 , \Sigma t = 1411.20 , \Sigma t ^ { 2 } = 18737.712 .$$
  1. Test, at the \(10 \%\) significance level, whether the doctor's belief is correct.
  2. Explain whether, in answering part (i), it was necessary to assume that the consultation times were normally distributed.
OCR S2 2015 June Q7
13 marks Standard +0.3
7 A large railway network suffers points failures at an average rate of 1 every 3 days. Assume that the number of points failures can be modelled by a Poisson distribution. The network employs a new firm of engineers. After the new engineers have become established, it is found that in a randomly chosen period of 15 days there are 2 instances of points failures.
  1. Test, at the \(5 \%\) significance level, whether there is evidence that the mean number of points failures has been reduced.
  2. A new test is carried out over a period of 150 days. Use a suitable approximation to find the greatest number of points failures there could be in 150 days that would lead to a \(5 \%\) significance test concluding that the average number of points failures had been reduced.
OCR S2 2015 June Q8
7 marks Standard +0.8
8 The random variable \(S\) has the distribution \(\mathrm { B } ( 14 , p )\). A significance test is carried out of the null hypothesis \(\mathrm { H } _ { 0 } : p = 0.3\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : p > 0.3\). The critical region for the test is \(S \geqslant 8\).
  1. Find the significance level of the test, correct to 3 significant figures.
  2. It is given that, on each occasion that the test is carried out, the true value of \(p\) is equally likely to be \(0.3, 0.5\) or 0.7, independently of any other test. Four independent tests are carried out. Find the probability that at least one of the tests results in a Type II error.
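The significance level here is the probability of landing in the critical region \(S \geqslant 8\) when \(p = 0.3\). A Type II error at a given true \(p\) is the probability of \(S \leqslant 7\). A minimal sketch of both building blocks follows, with illustrative function names. It does not carry out the full calculation for the four tests in the second part.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(S >= k) for S ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_TRIALS, CRITICAL = 14, 8   # from the question: B(14, p), reject if S >= 8

# Significance level: probability of rejecting H0 when H0 (p = 0.3) is true.
significance_level = binomial_upper_tail(CRITICAL, N_TRIALS, 0.3)


def type_ii_probability(p_true: float) -> float:
    """Return P(S <= 7), i.e. failing to reject H0, when p = p_true."""
    return 1 - binomial_upper_tail(CRITICAL, N_TRIALS, p_true)


print(f"Significance level: {significance_level:.4f}")
for p_true in (0.5, 0.7):
    print(f"P(Type II | p = {p_true}) = {type_ii_probability(p_true):.4f}")
```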
OCR S3 2014 June Q3
7 marks Standard +0.3
3 An athlete finds that her times for running 100 m are normally distributed. Before a period of intensive training, her mean time is 11.8 s. After the period of intensive training, five randomly selected times, in seconds, are as follows. $$\begin{array} { l l l l l } 11.70 & 11.65 & 11.80 & 11.75 & 11.60 \end{array}$$ Carry out a suitable test, at the \(5 \%\) significance level, to investigate whether times after the training are less, on average, than times before the training.
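With a small sample and unknown variance, a one-sample \(t\)-test is one suitable choice. The sketch below computes the statistic for \(\mathrm { H } _ { 0 } : \mu = 11.8\) against \(\mathrm { H } _ { 1 } : \mu < 11.8\). The critical value 2.132 is the tabulated one-tailed 5% point of \(t\) with 4 degrees of freedom, hard-coded here because the Python standard library has no \(t\) distribution.

```python
from math import sqrt
from statistics import mean, stdev

times = [11.70, 11.65, 11.80, 11.75, 11.60]   # data from the question
mu0 = 11.8                                    # mean time before training

# statistics.stdev uses the n - 1 divisor, giving the unbiased variance estimate.
standard_error = stdev(times) / sqrt(len(times))
t_stat = (mean(times) - mu0) / standard_error

T_CRIT_5PCT_DF4 = 2.132          # from t tables, one-tailed, 4 d.f.
reject_h0 = t_stat < -T_CRIT_5PCT_DF4

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```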
OCR S3 2014 June Q8
10 marks Standard +0.3
8 A random sample of 20 plots of land, each of equal area, was used to test whether the addition of phosphorus would increase the yield of corn. 10 plots were treated with phosphorus and 10 plots were untreated. The yields of corn, in litres, on a treated plot and on an untreated plot are denoted by \(X\) and \(Y\) respectively. You are given that $$\sum x = 2112 , \quad \sum y = 2008$$ You are also given that an unbiased estimate for the variance of treated plots is 87.96 and an unbiased estimate for the variance of untreated plots is 31.96, both correct to 4 significant figures.
  1. You may assume that the population variance estimates are sufficiently similar for the assumption of common variance to be made. What other assumption needs to be made for a \(t\)-test to be valid?
  2. Carry out a suitable \(t\)-test at the \(1 \%\) significance level, to test whether the use of phosphorus increases the yield of corn.
OCR S3 2015 June Q2
7 marks Standard +0.3
2 In a poll of people aged 18-21, 46 out of 200 randomly chosen university students agreed with a proposition. 51 out of 300 randomly chosen others who were not university students agreed with it. Test, at the \(5 \%\) significance level, whether the proportion of university students who agree with the proposition differs from the proportion of those who are not university students.
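One common approach to this comparison is a two-sided z-test with a pooled estimate of the proportion under \(\mathrm { H } _ { 0 }\). The sketch below assumes that approach. The function name `two_proportion_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Return (z, two-sided p-value) using the pooled proportion under H0."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# 46 of 200 students and 51 of 300 non-students agreed.
z, p_value = two_proportion_z(46, 200, 51, 300)
reject_h0 = p_value < 0.05

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```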
OCR S3 2015 June Q3
12 marks Standard +0.3
3 A tutor gave an assessment to 6 randomly chosen new eleven-year-old students. After each student had received 30 hours of tuition, they were given a second assessment. The scores are shown in the table.
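$$\begin{array}{l|cccccc}
\text{Student} & A & B & C & D & E & F \\
\hline
\text{1st assessment} & 124 & 121 & 111 & 113 & 118 & 119 \\
\text{2nd assessment} & 127 & 119 & 114 & 110 & 120 & 122
\end{array}$$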
  1. Show that, at the \(5 \%\) significance level, there is insufficient evidence that students' scores are higher, on average, after tuition than before tuition. State a necessary assumption.
  2. Disappointed by this result, the tutor looked again at the first assessment. She discovered that the first assessment was too easy, in fact being a test for ten-year-olds, not eleven-year-olds. She decided to reduce each score for the first assessment by a constant integer \(k\). Find the least value of \(k\) for which there is evidence at the \(5 \%\) significance level that the students' scores have, on average, improved.
OCR MEI S1 2013 June Q5
8 marks Moderate -0.3
5 A researcher is investigating whether people can identify whether a glass of water they are given is bottled water or tap water. She suspects that people do no better than they would by guessing. Twenty people are selected at random; thirteen make a correct identification. She carries out a hypothesis test.
  1. Explain why the null hypothesis should be \(p = 0.5\), where \(p\) represents the probability that a randomly selected person makes a correct identification.
  2. Briefly explain why she uses an alternative hypothesis of \(p > 0.5\).
  3. Complete the test at the \(5 \%\) significance level.
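The final part amounts to finding \(\mathrm { P } ( X \geqslant 13 )\) for \(X \sim \mathrm { B } ( 20 , 0.5 )\) and comparing it with 0.05. A minimal sketch, with an illustrative function name:

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 13 correct identifications out of 20, H0: p = 0.5, H1: p > 0.5.
p_value = binomial_upper_tail(13, 20, 0.5)
significant = p_value < 0.05

print(f"P(X >= 13) = {p_value:.4f}, significant at 5%: {significant}")
```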
OCR MEI S1 2013 June Q6
18 marks Easy -1.2
6 The birth weights in kilograms of 25 female babies are shown below, in ascending order.
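$$\begin{array} { l l l l l l l l l l l l l } 1.39 & 2.50 & 2.68 & 2.76 & 2.82 & 2.82 & 2.84 & 3.03 & 3.06 & 3.16 & 3.16 & 3.24 & 3.32 \\ 3.36 & 3.40 & 3.54 & 3.56 & 3.56 & 3.70 & 3.72 & 3.72 & 3.84 & 4.02 & 4.24 & 4.34 & \end{array}$$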
  1. Find the median and interquartile range of these data.
  2. Draw a box and whisker plot to illustrate the data.
  3. Show that there is exactly one outlier. Discuss whether this outlier should be removed from the data. The cumulative frequency curve below illustrates the birth weights of 200 male babies. [Figure: cumulative frequency curve of the birth weights of 200 male babies]
  4. Find the median and interquartile range of the birth weights of the male babies.
  5. Compare the weights of the female and male babies.
  6. Two of these male babies are chosen at random. Calculate an estimate of the probability that both of these babies weigh more than any of the female babies.
OCR MEI S1 2015 June Q7
17 marks Standard +0.3
7 A drug for treating a particular minor illness cures, on average, \(78 \%\) of patients. Twenty people with this minor illness are selected at random and treated with the drug.
  1. (A) Find the probability that exactly 19 patients are cured.
    (B) Find the probability that at most 18 patients are cured.
    (C) Find the expected number of patients who are cured.
  2. A pharmaceutical company is trialling a new drug to treat this illness. Researchers at the company hope that a higher percentage of patients will be cured when given this new drug. Twenty patients are selected at random, and given the new drug. Of these, 19 are cured. Carry out a hypothesis test at the \(1 \%\) significance level to investigate whether there is any evidence to suggest that the new drug is more effective than the old one.
  3. If the researchers had chosen to carry out the hypothesis test at the \(5 \%\) significance level, what would the result have been? Justify your answer.
OCR S2 2009 January Q4
10 marks Moderate -0.3
4 A television company believes that the proportion of adults who watched a certain programme is 0.14. Out of a random sample of 22 adults, it is found that 2 watched the programme.
  1. Carry out a significance test, at the \(10 \%\) level, to determine, on the basis of this sample, whether the television company is overestimating the proportion of adults who watched the programme.
  2. The sample was selected randomly. State what properties of this method of sampling are needed to justify the use of the distribution used in your test.
OCR S2 2009 January Q6
11 marks Standard +0.3
6 The weight of a plastic box manufactured by a company is \(W\) grams, where \(W \sim \mathrm {~N} ( \mu , 20.25 )\). A significance test of the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 50.0\), against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu \neq 50.0\), is carried out at the \(5 \%\) significance level, based on a sample of size \(n\).
  1. Given that \(n = 81\),
    1. find the critical region for the test, in terms of the sample mean \(\bar { W }\),
    2. find the probability that the test results in a Type II error when \(\mu = 50.2\).
  2. State how the probability of this Type II error would change if \(n\) were greater than 81.
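The critical region and the Type II error probability in this question can be sketched as follows. The sketch assumes the usual two-tailed Normal test on \(\bar { W }\) with known variance 20.25. The names are illustrative, and the value \(n = 400\) is only an arbitrary larger sample size used to show the direction of change.

```python
from math import sqrt
from statistics import NormalDist

MU0, VARIANCE, ALPHA = 50.0, 20.25, 0.05   # from the question
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)


def acceptance_limits(n: int):
    """Return (lower, upper): H0 is rejected if the sample mean falls outside."""
    se = sqrt(VARIANCE / n)
    return MU0 - Z_CRIT * se, MU0 + Z_CRIT * se


def type_ii_probability(mu_true: float, n: int) -> float:
    """Return P(sample mean lies inside the acceptance limits | mu = mu_true)."""
    lower, upper = acceptance_limits(n)
    sample_mean = NormalDist(mu_true, sqrt(VARIANCE / n))
    return sample_mean.cdf(upper) - sample_mean.cdf(lower)


lower, upper = acceptance_limits(81)
print(f"Reject H0 if mean < {lower:.2f} or mean > {upper:.2f}")
print(f"P(Type II | mu = 50.2, n = 81)  = {type_ii_probability(50.2, 81):.4f}")
print(f"P(Type II | mu = 50.2, n = 400) = {type_ii_probability(50.2, 400):.4f}")
```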
OCR S2 2009 January Q7
12 marks Standard +0.3
7 A motorist records the time taken, \(T\) minutes, to drive a particular stretch of road on each of 64 occasions. Her results are summarised by $$\Sigma t = 876.8 , \quad \Sigma t ^ { 2 } = 12657.28$$
  1. Test, at the \(5 \%\) significance level, whether the mean time for the motorist to drive the stretch of road is greater than 13.1 minutes.
  2. Explain whether it is necessary to use the Central Limit Theorem in your test.
OCR S2 2011 January Q4
7 marks Standard +0.3
4 The continuous random variable \(X\) has mean \(\mu\) and standard deviation 45. A significance test is to be carried out of the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 230\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu \neq 230\), at the \(1 \%\) significance level. A random sample of size 50 is obtained, and the sample mean is found to be 213.4.
  1. Carry out the test.
  2. Explain whether it is necessary to use the Central Limit Theorem in your test.
OCR S2 2011 January Q5
7 marks Standard +0.3
5 A temporary job is advertised annually. The number of applicants for the job is a random variable which is known from many years' experience to have a distribution \(\operatorname { Po } ( 12 )\). In 2010 there were 19 applicants for the job. Test, at the \(10 \%\) significance level, whether there is evidence of an increase in the mean number of applicants for the job.
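Under \(\mathrm { H } _ { 0 } : \lambda = 12\) the p-value for this one-tailed test is \(\mathrm { P } ( X \geqslant 19 )\) with \(X \sim \operatorname { Po } ( 12 )\). A minimal sketch of that calculation, with an illustrative function name:

```python
import math


def poisson_upper_tail(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1 - below


# 19 applicants observed, H0: lambda = 12, H1: lambda > 12.
p_value = poisson_upper_tail(19, 12)
reject_h0 = p_value < 0.10   # 10% significance level

print(f"P(X >= 19) = {p_value:.4f}, reject H0: {reject_h0}")
```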
OCR S2 2011 January Q9
11 marks Standard +0.3
9 A pharmaceutical company is developing a new drug to treat a certain disease. The company will continue to develop the drug if the proportion \(p\) of those who have the disease and show a substantial improvement after treatment is greater than 0.7. The company carries out a test, at the \(5 \%\) significance level, on a random sample of 14 patients who suffer from the disease.
  1. Find the critical region for the test.
  2. Given that 12 of the 14 patients in the sample show a substantial improvement, carry out the test.
  3. Find the probability that the test results in a Type II error if in fact \(p = 0.8\).
OCR S2 2009 June Q3
7 marks Moderate -0.3
3 An electronics company is developing a new sound system. The company claims that \(60 \%\) of potential buyers think that the system would be good value for money. In a random sample of 12 potential buyers, 4 thought that it would be good value for money. Test, at the \(5 \%\) significance level, whether the proportion claimed by the company is too high.
OCR S2 2009 June Q8
11 marks Standard +0.3
8 In a large company the time taken for an employee to carry out a certain task is a normally distributed random variable with mean 78.0 s and unknown variance. A new training scheme is introduced and after its introduction the times taken by a random sample of 120 employees are recorded. The mean time for the sample is 76.4 s and an unbiased estimate of the population variance is \(68.9 \mathrm {~s} ^ { 2 }\).
  1. Test, at the \(1 \%\) significance level, whether the mean time taken for the task has changed.
  2. It is required to redesign the test so that the probability of making a Type I error is less than 0.01 when the sample mean is 77.0 s. Calculate an estimate of the smallest sample size needed, and explain why your answer is only an estimate.
OCR MEI S4 2011 June Q3
24 marks Challenging +1.2
3
  1. Explain the meaning of the following terms in the context of hypothesis testing: Type I error, Type II error, operating characteristic, power.
  2. A market research organisation is designing a sample survey to investigate whether expenditure on everyday food items has increased in 2011 compared with 2010. For one of the populations being studied, the random variable \(X\) is used to model weekly expenditure, in \(\pounds\), on these items in 2011, where \(X\) is Normally distributed with mean \(\mu\) and variance \(\sigma ^ { 2 }\). As the corresponding mean value in 2010 was 94, the hypotheses to be examined are $$\begin{aligned} & \mathrm { H } _ { 0 } : \mu = 94 \\ & \mathrm { H } _ { 1 } : \mu > 94 \end{aligned}$$ By comparison with the corresponding 2010 value, \(\sigma ^ { 2 }\) is assumed to be 25.
    The following criteria for the survey are laid down.
    A random sample of size \(n\) is to be taken and the usual Normal test based on \(\bar { X }\) is to be used, with a critical value of \(c\) such that \(\mathrm { H } _ { 0 }\) is rejected if the value of \(\bar { X }\) exceeds \(c\). Find \(c\) and the smallest value of \(n\) that is required.
  3. Sketch the power function of an ideal test for examining the hypotheses in part (ii).
OCR MEI S4 2011 June Q4
24 marks Standard +0.3
4
  1. Provide an example of an experimental situation where there is one factor of primary interest and where a suitable experimental design would be
    1. randomised blocks,
    2. a Latin square. In each case, explain carefully why the design is suitable and why the other design would not be appropriate.
  2. An industrial experiment to compare four treatments for increasing the tensile strength of steel is carried out according to a completely randomised design. For various reasons, it is not possible to use the same number of replicates for each treatment. The increases, in a suitable unit of tensile strength, are as follows.
    Treatment A   Treatment B   Treatment C   Treatment D
    10.1   21.1   9.2   22.6
    21.2   20.3   8.8   17.4
    11.6   16.0   15.2   23.1
    13.6   15.0   19.2
    [The sum of these data items is 256.8 and the sum of their squares is 4471.92.] Construct the usual one-way analysis of variance table. Carry out the appropriate test, using a \(5 \%\) significance level.