Hypothesis test of a normal distribution

562 questions · 28 question types identified

One-sample t-test, variance unknown

Test a hypothesis about the population mean when the population variance is unknown and must be estimated from the sample, using the t-distribution with n-1 degrees of freedom.

72 Standard +0.3
12.8% of questions
Show example »
7 A random sample of 9 observations of a normal variable \(X\) is taken. The results are summarised as follows. $$\Sigma x = 24.6 \quad \Sigma x ^ { 2 } = 68.5$$ Test, at the \(5 \%\) significance level, whether the population mean is greater than 2.5.
View full question →
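As a rough illustration, the test statistic for the example above can be computed directly from the summary sums. This is a sketch only: the function name `one_sample_t` is hypothetical, and the critical value 1.860 (the upper 5% point of the t-distribution with 8 degrees of freedom) is assumed from standard tables rather than given in the question.

```python
from math import sqrt

def one_sample_t(n, sum_x, sum_x2, mu0):
    """Return (t statistic, degrees of freedom) from summary sums."""
    mean = sum_x / n
    # Unbiased variance estimate: (Σx² - (Σx)²/n) / (n - 1)
    s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    t = (mean - mu0) / sqrt(s2 / n)
    return t, n - 1

t_stat, dof = one_sample_t(n=9, sum_x=24.6, sum_x2=68.5, mu0=2.5)
T_CRIT = 1.860  # assumed: tabulated upper 5% point of t(8)
reject_h0 = t_stat > T_CRIT  # one-tailed test of H1: mu > 2.5
print(round(t_stat, 3), dof, reject_h0)
```

Under these assumptions the statistic is about 1.76, which is below 1.860, so the sketch does not reject the null hypothesis at the 5% level.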
Easiest question Moderate -0.8 »
1 Judith, the village postmistress, believes that, since moving the post office counter into the local pharmacy, the mean daily number of customers that she serves has increased from 79. In order to investigate her belief, she counts the number of customers that she serves on 12 randomly selected days, with the following results. $$\begin{array} { l l l l l l l l l l l l } 88 & 81 & 84 & 89 & 90 & 77 & 72 & 80 & 82 & 81 & 75 & 85 \end{array}$$ Stating a necessary distributional assumption, test Judith's belief at the 5\% level of significance.
View full question →
Hardest question Challenging +1.8 »
  1. At the start of each academic year, a large college carries out a diagnostic test on a random sample of new students. Past experience has shown that the standard deviation of the scores on this test is 19.71. The admissions tutor claimed that the new students in 2013 would have more varied scores than usual. The scores for the students taking the test can be assumed to come from a normal distribution. A random sample of 10 new students was taken and the score, \(x\), for each student was recorded. The data are summarised as \(\sum x = 619 \quad \sum x ^ { 2 } = 42397\).
  1. Stating your hypotheses clearly, and using a \(5 \%\) level of significance, test the admission tutor's claim. The admissions tutor decides that in future he will use the same hypotheses but take a larger sample of size 30 and use a significance level of 1\%.
  2. Use the tables to show that, to 3 decimal places, the critical region for \(S ^ { 2 }\) is \(S ^ { 2 } > 664.281\)
  3. Find the probability of a type II error using this test when the true value of the standard deviation is in fact 22.20
View full question →
Two-sample t-test with summary statistics

Questions providing summary statistics (sums, means, variances) for two independent samples where students must calculate test statistics and perform hypothesis tests, typically with large samples or assumed normal distributions.

62 Standard +0.6
11.0% of questions
Show example »
9 Experiments are conducted to test the breaking strength of each of two types of rope, \(P\) and \(Q\). A random sample of 50 ropes of type \(P\) and a random sample of 70 ropes of type \(Q\) are selected. The breaking strengths, \(p\) and \(q\), measured in appropriate units, are summarised as follows. $$\Sigma p = 321.2 \quad \Sigma p ^ { 2 } = 2120.0 \quad \Sigma q = 475.3 \quad \Sigma q ^ { 2 } = 3310.0$$ Test, at the \(10 \%\) significance level, whether the mean breaking strengths of type \(P\) and type \(Q\) ropes are the same.
View full question →
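A minimal sketch of how the rope example above might be worked from its summary statistics. It assumes the samples are large enough for the difference in sample means to be treated as approximately normal, so a z statistic is used; the helper name `sample_mean_var` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_var(n, total, total_sq):
    """Sample mean and unbiased variance estimate from Σx and Σx²."""
    mean = total / n
    var = (total_sq - total ** 2 / n) / (n - 1)
    return mean, var

mean_p, var_p = sample_mean_var(50, 321.2, 2120.0)
mean_q, var_q = sample_mean_var(70, 475.3, 3310.0)

# Large samples: treat the difference of sample means as approximately normal.
se = sqrt(var_p / 50 + var_q / 70)
z = (mean_p - mean_q) / se
p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed test of equal means
reject_h0 = p_value < 0.10
print(round(z, 2), round(p_value, 3), reject_h0)
```

With these figures the statistic is roughly -1.82, which lies beyond the two-tailed 10% critical value of 1.645 in magnitude, so the sketch rejects equality of the means.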
Easiest question Moderate -0.3 »
2 As part of a comparison of two varieties of cucumber, Fanfare and Marketmore, random samples of harvested cucumbers of each variety were selected and their lengths measured, in centimetres. The results are summarised in the table.
$$\begin{array} { l c c c } \text{Cucumber variety} & \text{Sample size} & \text{Sample mean length (cm)} & \text{Sample standard deviation of length (cm)} \\ \text{Fanfare} & 50 & 22.0 & 1.31 \\ \text{Marketmore} & 75 & 21.6 & 0.702 \end{array}$$
  1. Test, at the \(1 \%\) level of significance, the hypothesis that there is no difference between the mean length of harvested Fanfare cucumbers and that of harvested Marketmore cucumbers.
  2. In addition to length, name one other characteristic of cucumbers that could be used for comparative purposes.
View full question →
Hardest question Challenging +1.3 »
The times taken, in hours, by cyclists from two different clubs, \(A\) and \(B\), to complete a 50 km time trial are being compared. The times taken by a cyclist from club \(A\) and by a cyclist from club \(B\) are denoted by \(t _ { A }\) and \(t _ { B }\) respectively. A random sample of 50 cyclists from \(A\) and a random sample of 60 cyclists from \(B\) give the following summarised data. $$\Sigma t _ { A } = 102.0 \quad \Sigma t _ { A } ^ { 2 } = 215.18 \quad \Sigma t _ { B } = 129.0 \quad \Sigma t _ { B } ^ { 2 } = 282.3$$ Using a 5\% significance level, test whether, on average, cyclists from club \(A\) take less time to complete the time trial than cyclists from club \(B\). A test at the \(\alpha \%\) significance level shows that there is evidence that the population mean time for cyclists from club \(B\) exceeds the population mean time for cyclists from club \(A\) by more than 0.05 hours. Find the set of possible values of \(\alpha\).
View full question →
One-sample z-test, variance known

Test a hypothesis about the population mean when the population variance (or standard deviation) is known and given, using the standard normal distribution.

56 Standard +0.2
10.0% of questions
Show example »
1 Roger claims that, on average, his journey time from home to work each day is greater than 45 minutes. The times, \(x\) minutes, of 30 randomly selected journeys result in \(\bar { x } = 45.8\) and \(s ^ { 2 } = 4.8\).
Investigate Roger's claim at the \(1 \%\) level of significance.
View full question →
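The example above could be checked along the following lines. Note that the question supplies a sample variance \(s^2\) rather than a known population variance; this sketch assumes that, with \(n = 30\), the normal distribution is an acceptable approximation. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s2, mu0, alpha = 30, 45.8, 4.8, 45.0, 0.01

z = (x_bar - mu0) / sqrt(s2 / n)
z_crit = NormalDist().inv_cdf(1 - alpha)  # upper 1% point, about 2.326
reject_h0 = z > z_crit  # one-tailed test of H1: mu > 45
print(round(z, 3), round(z_crit, 3), reject_h0)
```

Here the statistic comes out at 2.0, short of the critical value, so on these assumptions there is insufficient evidence for Roger's claim at the 1% level.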
Easiest question Moderate -0.3 »
1
  1. The manager of a company that employs 250 travelling sales representatives wishes to carry out a detailed analysis of the expenses claimed by the representatives. He has an alphabetical (by surname) list of the representatives. He chooses a sample of representatives by selecting the 10th, 20th, 30th and so on. Name the type of sampling the manager is attempting to use. Describe a weakness in his method of using it, and explain how he might overcome this weakness. The representatives each use their own cars to drive to meetings with customers. The total distance, in miles, travelled by a representative in a month is Normally distributed with mean 2018 and standard deviation 96.
  2. Find the probability that, in a randomly chosen month, a randomly chosen representative travels more than 2100 miles.
  3. Find the probability that, in a randomly chosen 3-month period, a randomly chosen representative travels less than 6000 miles. What assumption is needed here? Give a reason why it may not be realistic.
  4. Each month every representative submits a claim for travelling expenses plus commission. Travelling expenses are paid at the rate of 45 pence per mile. The commission is \(10 \%\) of the value of sales in that month. The value, in \(\pounds\), of the monthly sales has the distribution \(\mathrm { N } \left( 21200,1100 ^ { 2 } \right)\). Find the probability that a randomly chosen claim lies between \(\pounds 3000\) and \(\pounds 3300\). William Sealy, a biochemistry student, is doing work experience at a brewery. One of his tasks is to monitor the specific gravity of the brewing mixture during the brewing process. For one particular recipe, an initial specific gravity of 1.040 is required. A random sample of 9 measurements of the specific gravity at the start of the process gave the following results. $$\begin{array} { l l l l l l l l l } 1.046 & 1.048 & 1.039 & 1.055 & 1.038 & 1.054 & 1.038 & 1.051 & 1.038 \end{array}$$
  5. William has to test whether the specific gravity of the mixture meets the requirement. Why might a \(t\) test be used for these data and what assumption must be made?
  6. Carry out the test using a significance level of \(10 \%\).
  7. Find a 95\% confidence interval for the true mean specific gravity of the mixture and explain what is meant by a \(95 \%\) confidence interval.
View full question →
Hardest question Challenging +1.2 »
13 Each weekday Keira drives to work with her son Kaito. She always sets off at 8.00 a.m. She models her journey time, \(x\) minutes, by the distribution \(X \sim \mathrm {~N} ( 15,4 )\). Over a long period of time she notes that her journey takes less than 14 minutes on \(7 \%\) of the journeys, and takes more than 18 minutes on \(31 \%\) of the journeys.
  1. Investigate whether Keira's model is a good fit for the data. Kaito believes that Keira’s value for the variance is correct, but realises that the mean is not correct.
  2. Find, correct to two significant figures, the value of the mean that Keira should use in a refined model which does fit the data. Keira buys a new car. After driving to work in it each day for several weeks, she randomly selects the journey times for \(n\) of these days. Her mean journey time for these \(n\) days is 16 minutes. Using the refined model she conducts a hypothesis test to see if her mean journey time has changed, and finds that the result is significant at the \(5 \%\) level.
  3. Determine the smallest possible value of \(n\).
View full question →
Unknown variance (t-distribution)

Questions where the population variance is unknown and must be estimated from the sample, requiring use of the t-distribution for the confidence interval.

41 Standard +0.3
7.3% of questions
Show example »
7 Customers buying euros (€) at a travel agency must pay for them in pounds (\(\pounds\)). The amounts paid, \(\pounds x\), by a sample of 40 customers were, in ascending order, as follows.
View full question →
Easiest question Easy -1.2 »
1 There are 18 people in Millie's class. To choose a person at random she numbers the people in the class from 1 to 18 and presses the random number button on her calculator to obtain a 3-digit decimal. Millie then multiplies the first digit in this decimal by two and chooses the person corresponding to this new number. Decimals in which the first digit is zero are ignored.
  1. Give a reason why this is not a satisfactory method of choosing a person. Millie obtained a random sample of 5 people of her own age by a satisfactory sampling method and found that their heights in metres were \(1.66,1.68,1.54,1.65\) and 1.57 . Heights are known to be normally distributed with variance \(0.0052 \mathrm {~m} ^ { 2 }\).
  2. Find a \(98 \%\) confidence interval for the mean height of people of Millie's age.
View full question →
Hardest question Challenging +1.3 »
7. A doctor wishes to study the level of blood glucose in males. The level of blood glucose is normally distributed. The doctor measured the blood glucose of 10 randomly selected male students from a school. The results, in mmol/litre, are given below. $$\begin{array} { l l l l l l l l l l } 4.7 & 3.6 & 3.8 & 4.7 & 4.1 & 2.2 & 3.6 & 4.0 & 4.4 & 5.0 \end{array}$$
  1. Calculate a \(95 \%\) confidence interval for the mean.
  2. Calculate a 95\% confidence interval for the variance. A blood glucose reading of more than 7 mmol/litre is counted as high.
  3. Use appropriate confidence limits from parts (a) and (b) to find the highest estimate of the proportion of male students in the school with a high blood glucose level.
View full question →
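For the first part of the blood glucose question above, a t-based interval for the mean might be sketched as follows. The critical value 2.262 (the upper 2.5% point of the t-distribution with 9 degrees of freedom) is assumed from standard tables, since the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

glucose = [4.7, 3.6, 3.8, 4.7, 4.1, 2.2, 3.6, 4.0, 4.4, 5.0]
T_CRIT = 2.262  # assumed: tabulated upper 2.5% point of t(9)

x_bar = mean(glucose)
s = stdev(glucose)  # sample standard deviation, n - 1 divisor
half_width = T_CRIT * s / sqrt(len(glucose))
ci = (x_bar - half_width, x_bar + half_width)
print(tuple(round(v, 2) for v in ci))
```

On these assumptions the 95% interval is roughly 3.44 to 4.58 mmol/litre.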
Type I/II errors and power

Calculate or explain Type I error, Type II error, significance level, power, or operating characteristic of a test.

40 Standard +0.5
7.1% of questions
Show example »
3 The random variable \(X\) has the distribution \(\mathrm { N } \left( \mu , 5 ^ { 2 } \right)\). A hypothesis test is carried out of \(\mathrm { H } _ { 0 } : \mu = 20.0\) against \(\mathrm { H } _ { 1 } : \mu < 20.0\), at the \(1 \%\) level of significance, based on the mean of a sample of size 16. Given that in fact \(\mu = 15.0\), find the probability that the test results in a Type II error.
View full question →
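One way to reason about the example above is to find the critical value for the sample mean first, then evaluate the chance of missing the critical region under the true mean. The sketch below does this with the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, mu0, mu_true, alpha = 5.0, 16, 20.0, 15.0, 0.01
se = sigma / sqrt(n)  # standard error of the sample mean

# Lower-tail test: reject H0 when the sample mean falls below c.
c = NormalDist(mu0, se).inv_cdf(alpha)
# Type II error: failing to reject H0 although mu is really mu_true.
beta = 1 - NormalDist(mu_true, se).cdf(c)
print(round(c, 3), round(beta, 4))
```

The critical value is about 17.09 and the Type II error probability about 0.047.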
Easiest question Easy -1.2 »
2 Jamie is conducting a hypothesis test on a random variable which has a normal distribution with standard deviation 1. The hypotheses are $$\begin{aligned} & \mathrm { H } _ { 0 } : \mu = 5 \\ & \mathrm { H } _ { 1 } : \mu > 5 \end{aligned}$$ He takes a random sample of size 4. The mean of his sample is 6. He uses a 5\% level of significance.
Before Jamie conducted the test, what was the probability that he would make a Type I error? Circle your answer.
[1 mark] \(0.0228 \quad 0.0456 \quad 0.0500 \quad 0.1587\)
View full question →
Hardest question Challenging +1.8 »
8 The quantity, \(X\) milligrams per litre, of silicon dioxide in a certain brand of mineral water is a random variable with distribution \(\mathrm { N } \left( \mu , 5.6 ^ { 2 } \right)\).
  1. A random sample of 80 observations of \(X\) has sample mean 100.7. Test, at the \(1 \%\) significance level, the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 102\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu \neq 102\).
  2. The test is redesigned so as to meet the following conditions.
    • The hypotheses are \(\mathrm { H } _ { 0 } : \mu = 102\) and \(\mathrm { H } _ { 1 } : \mu < 102\).
    • The significance level is \(1 \%\).
    • The probability of making a Type II error when \(\mu = 100\) is to be (approximately) 0.05 .
    The sample size is \(n\), and the critical region is \(\bar { X } < c\), where \(\bar { X }\) denotes the sample mean.
    (a) Show that \(n\) and \(c\) satisfy (approximately) the equation \(102 - c = \frac { 13.0256 } { \sqrt { n } }\).
    (b) Find another equation satisfied by \(n\) and \(c\).
    (c) Hence find the values of \(n\) and \(c\).
View full question →
Paired t-test

Test for a difference in means using paired data (before/after, matched pairs) by analyzing the differences.

25 Standard +0.3
4.4% of questions
Show example »
3 Ten randomly chosen athletes were coached for a 200 m event. For each athlete, the times taken to run 200 m before and after coaching were measured. The sample mean times before and after coaching were 23.43 seconds and 22.84 seconds respectively. For each athlete the difference, \(d\) seconds, in the times before and after coaching was calculated and an unbiased estimate of the population variance of \(d\) was found to be 0.548 . Stating any required assumption, test at the \(5 \%\) significance level whether the population mean time for the 200 m run decreased after coaching.
View full question →
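A short sketch of the paired test in the example above, working with the differences (before minus after). It relies on the mean of the differences equalling the difference of the two sample means, and assumes the critical value 1.833 (the upper 5% point of the t-distribution with 9 degrees of freedom) from standard tables.

```python
from math import sqrt

n = 10
d_bar = 23.43 - 22.84   # mean of the (before - after) differences
s2_d = 0.548            # unbiased variance estimate of the differences
T_CRIT = 1.833          # assumed: tabulated upper 5% point of t(9)

t_stat = d_bar / sqrt(s2_d / n)
reject_h0 = t_stat > T_CRIT  # H1: mean difference > 0, i.e. times decreased
print(round(t_stat, 2), reject_h0)
```

The statistic is about 2.52, above the assumed critical value, so the sketch finds evidence that the mean time decreased.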
Easiest question Standard +0.3 »
1 A manager is investigating the times taken by employees to complete a particular task as a result of the introduction of new technology. He claims that the mean time taken to complete the task is reduced by more than 0.4 minutes. He chooses a random sample of 10 employees. The times taken, in minutes, before and after the introduction of the new technology are recorded in the table.
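$$\begin{array} { l c c c c c c c c c c } \text{Employee} & A & B & C & D & E & F & G & H & I & J \\ \text{Time before new technology} & 10.2 & 9.8 & 12.4 & 11.6 & 10.8 & 11.2 & 14.6 & 10.6 & 12.3 & 11.0 \\ \text{Time after new technology} & 9.6 & 8.5 & 12.4 & 10.9 & 10.2 & 10.6 & 12.8 & 10.8 & 12.5 & 10.6 \end{array}$$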
  1. Test at the 10\% significance level whether the manager's claim is justified.
  2. State an assumption that is necessary for this test to be valid.
View full question →
Hardest question Standard +0.8 »
4 Manet has developed a new training course to help athletes improve their time taken to run 800 m . Manet claims that his course will decrease an athlete's time by more than 2 s on average. For a random sample of 10 athletes the times taken, in seconds, before and after the course are given in the following table.
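$$\begin{array} { l c c c c c c c c c c } \text{Athlete} & A & B & C & D & E & F & G & H & I & J \\ \text{Before} & 150 & 146 & 131 & 135 & 126 & 142 & 130 & 129 & 137 & 134 \\ \text{After} & 145 & 138 & 129 & 135 & 122 & 135 & 132 & 128 & 127 & 137 \end{array}$$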
Use a \(t\)-test, at the \(5 \%\) significance level, to test whether Manet's claim is justified, stating any assumption that you make.
View full question →
Two-tail z-test

Test whether the population mean has changed in either direction (H₁: μ ≠ μ₀), using a two-tail test with critical values on both sides.

24 Moderate -0.2
4.3% of questions
Show example »
5 The acidity \(A\) (measured in pH ) of soil of a particular type has a normal distribution. The pH values of a random sample of 80 soil samples from a certain region can be summarised as $$\Sigma a = 496 , \quad \Sigma a ^ { 2 } = 3126 .$$ Test, at the \(10 \%\) significance level, whether in this region the mean pH of soil is 6.1 .
View full question →
Easiest question Easy -1.2 »
2 Karim has noted the lifespans, in weeks, of a large random sample of certain insects. He carries out a test, at the \(1 \%\) significance level, for the population mean, \(\mu\). Karim's null hypothesis is \(\mu = 6.4\).
  1. Given that Karim's test is two-tail, state the alternative hypothesis.
    Karim finds that the value of the test statistic is \(z = 2.43\).
  2. Explain what conclusion he should draw.
  3. Explain briefly when a one-tail test is appropriate, rather than a two-tail test.
View full question →
Hardest question Standard +0.3 »
5 Last year the mean time for pizza deliveries from Pete's Pizza Pit was 32.4 minutes. This year the time, \(t\) minutes, for pizza deliveries from Pete's Pizza Pit was recorded for a random sample of 50 deliveries. The results were as follows. $$n = 50 \quad \Sigma t = 1700 \quad \Sigma t ^ { 2 } = 59050$$
  1. Find unbiased estimates of the population mean and variance.
  2. Test, at the \(2 \%\) significance level, whether the mean delivery time has changed since last year.
  3. Under what circumstances would it not be necessary to use the Central Limit Theorem in answering (b)?
View full question →
Interpret confidence interval

Use a given confidence interval to comment on a claim, test a hypothesis, or determine if a value is plausible.

24 Moderate -0.0
4.3% of questions
Show example »
2 A six-sided die has faces marked \(1,2,3,4,5,6\). When the die is thrown 300 times it shows a six on 56 throws.
  1. Calculate an approximate \(96 \%\) confidence interval for the probability that the die shows a six on one throw.
  2. Maroulla claims that the die is biased. Use your answer to part (a) to comment on this claim.
View full question →
Easiest question Moderate -0.8 »
3
  1. Give a reason for using a sample rather than the whole population in carrying out a statistical investigation.
  2. Tennis balls of a certain brand are known to have a mean height of bounce of 64.7 cm , when dropped from a height of 100 cm . A change is made in the manufacturing process and it is required to test whether this change has affected the mean height of bounce. 100 new tennis balls are tested and it is found that their mean height of bounce when dropped from a height of 100 cm is 65.7 cm and the unbiased estimate of the population variance is \(15 \mathrm {~cm} ^ { 2 }\).
    (a) Calculate a \(95 \%\) confidence interval for the population mean.
    (b) Use your answer to part (ii) (a) to explain what conclusion can be drawn about whether the change has affected the mean height of bounce.
View full question →
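For the tennis ball question above, the interval and the check against the previous mean might look like this. The sketch assumes the sample of 100 is large enough for the normal distribution to apply; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s2, old_mean = 100, 65.7, 15.0, 64.7
z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
half_width = z * sqrt(s2 / n)
ci = (x_bar - half_width, x_bar + half_width)
# If the old mean lies outside the interval, a change is plausible.
old_mean_plausible = ci[0] <= old_mean <= ci[1]
print(tuple(round(v, 2) for v in ci), old_mean_plausible)
```

The interval is roughly 64.94 to 66.46, which excludes 64.7, suggesting the change has affected the mean height of bounce.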
Hardest question Challenging +1.2 »
The time taken for a randomly chosen student at College \(P\) to complete a particular puzzle has a normal distribution with mean \(\mu\) minutes. The times, \(x\) minutes, are recorded for a random sample of 8 students chosen from the college. The results are summarised as follows. $$\Sigma x = 42.8 \quad \Sigma x ^ { 2 } = 236.0$$ Find a 95\% confidence interval for \(\mu\). A test is carried out on this sample data, at the \(10 \%\) significance level. The test supports the claim that \(\mu > k\). Find the greatest possible value of \(k\). A random sample, of size 12, is taken from the students at College \(Q\). Their times to complete the puzzle give a sample mean of 4.60 minutes and an unbiased variance estimate of 1.962 minutes \({ } ^ { 2 }\). Use a 2 -sample test at the \(10 \%\) significance level to test whether the mean time for students at College \(Q\) to complete the puzzle is less than the mean time for students at College \(P\) to complete the puzzle. You should state any assumptions necessary for the test to be valid.
View full question →
Critical region determination

Find the critical region or rejection region for a hypothesis test in terms of the test statistic or sample mean.

21 Standard +0.6
3.7% of questions
Show example »
7. A machine produces bricks. The lengths, \(x \mathrm {~mm}\), of the bricks are distributed \(\mathrm { N } \left( \mu , 2 ^ { 2 } \right)\). At the start of each week a random sample of \(n\) bricks is taken to check the machine is working correctly.
A test is then carried out at the \(1 \%\) level of significance with $$\mathrm { H } _ { 0 } : \mu = 202 \text { and } \mathrm { H } _ { 1 } : \mu < 202$$
  1. Find, in terms of \(n\), the critical region of the test. The probability of a type II error, when \(\mu = 200\), is less than 0.05
  2. Find the minimum value of \(n\).
View full question →
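The brick example above can be explored numerically. The sketch below computes the critical value for each sample size and searches for the smallest \(n\) whose Type II error probability (when \(\mu = 200\)) falls below 0.05. The function name `type_ii_probability` is hypothetical, and a simple search stands in for the algebraic solution a student would normally give.

```python
from math import sqrt
from statistics import NormalDist

sigma, mu0, mu_true = 2.0, 202.0, 200.0

def type_ii_probability(n, alpha=0.01):
    """P(sample mean misses the critical region) when mu = mu_true."""
    se = sigma / sqrt(n)
    c = NormalDist(mu0, se).inv_cdf(alpha)  # critical region: sample mean < c
    return 1 - NormalDist(mu_true, se).cdf(c)

n = 1
while type_ii_probability(n) >= 0.05:
    n += 1
print(n)
```

Under these assumptions the search stops at \(n = 16\).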
Easiest question Standard +0.3 »
6 Last year, the mean time taken by students at a school to complete a certain test was 25 minutes. Akash believes that the mean time taken by this year's students was less than 25 minutes. In order to test this belief, he takes a large random sample of this year's students and he notes the time taken by each student. He carries out a test, at the \(2.5 \%\) significance level, for the population mean time, \(\mu\) minutes. Akash uses the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 25\).
  1. Give a reason why Akash should use a one-tailed test.
    Akash finds that the value of the test statistic is \(z = - 2.02\).
  2. Explain what conclusion he should draw.
    In a different one-tailed hypothesis test the \(z\)-value was found to be 2.14 .
  3. Given that this value would lead to a rejection of the null hypothesis at the \(\alpha \%\) significance level, find the set of possible values of \(\alpha\).
    The population mean time taken by students at another school to complete a test last year was \(m\) minutes. Sorin carries out a one-tailed test to determine whether the population mean this year is less than \(m\), using a random sample of 100 students. He assumes that the population standard deviation of the times is 3.9 minutes. The sample mean is 24.8 minutes, and this result just leads to the rejection of the null hypothesis at the 5\% significance level.
  4. Find the value of \(m\).
View full question →
Hardest question Challenging +1.8 »
  1. A manufacturer has a machine that produces lollipop sticks. The length of a lollipop stick produced by the machine is normally distributed with unknown mean \(\mu\) and standard deviation 0.2. Farhan believes that the machine is not working properly and the mean length of the lollipop sticks has decreased.
He takes a random sample of size \(n\) to test, at the 1\% level of significance, the hypotheses $$\mathrm { H } _ { 0 } : \mu = 15 \quad \mathrm { H } _ { 1 } : \mu < 15$$
  1. Write down the size of this test. Given that the actual value of \(\mu\) is 14.9
    1. calculate the minimum value of \(n\) such that the probability of a Type II error is less than 0.05
      Show your working clearly.
    2. Farhan uses the same sample size, \(n\), but now carries out the test at a \(5 \%\) level of significance. Without doing any further calculations, state how this would affect the probability of a Type II error.
View full question →
Confidence interval for proportion

Calculate an approximate confidence interval for a population proportion using sample proportion and normal approximation.

21 Moderate -0.4
3.7% of questions
Show example »
1 A coin is thrown 100 times and it shows heads 60 times. Calculate an approximate \(98 \%\) confidence interval for the probability, \(p\), that the coin shows heads on any throw.
View full question →
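A minimal sketch of the normal-approximation interval for the coin example above. The function name `proportion_ci` is illustrative and not taken from the source.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, trials, level):
    """Approximate confidence interval for a proportion."""
    p_hat = successes / trials
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat - half_width, p_hat + half_width

low, high = proportion_ci(60, 100, 0.98)
print(round(low, 3), round(high, 3))
```

For 60 heads in 100 throws this gives an interval of approximately 0.486 to 0.714.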
Easiest question Easy -1.2 »
1 In a survey of 2000 randomly chosen adults, 1602 said that they owned a smartphone. Calculate an approximate \(95 \%\) confidence interval for the proportion of adults in the whole population who own a smartphone.
View full question →
Hardest question Challenging +1.2 »
3 A die is biased so that the probability that it shows a six on any throw is \(p\).
  1. In an experiment, the die shows a six on 22 out of 100 throws. Find an approximate \(97 \%\) confidence interval for \(p\).
  2. The experiment is repeated and another \(97 \%\) confidence interval is found. Find the probability that exactly one of the two confidence intervals includes the true value of \(p\).
View full question →
Find confidence level from interval

Given a confidence interval and sample data, work backwards to find the confidence level (α%) used.

19 Standard +0.6
3.4% of questions
Show example »
  1. A random sample of the daily sales (in £s) of a small company is taken and, using tables of the normal distribution, a 99\% confidence interval for the mean daily sales is found to be
    (123.5, 154.7)
Find a \(95 \%\) confidence interval for the mean daily sales of the company.
(6)
View full question →
Easiest question Moderate -0.8 »
3 Based on a random sample of 700 people living in a certain area, a confidence interval for the proportion, \(p\), of all people living in that area who had travelled abroad was found to be \(0.5672 < p < 0.6528\).
  1. Find the proportion of people in the sample who had travelled abroad.
  2. Find the confidence level of this confidence interval. Give your answer correct to the nearest integer.
View full question →
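Working backwards from the interval in the question above can be sketched as follows: the midpoint gives the sample proportion, and the half-width divided by the standard error gives the z value, from which the confidence level follows. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, low, high = 700, 0.5672, 0.6528
p_hat = (low + high) / 2          # the interval is symmetric about p_hat
half_width = (high - low) / 2
z = half_width / sqrt(p_hat * (1 - p_hat) / n)
level = 2 * NormalDist().cdf(z) - 1
print(round(p_hat, 2), round(100 * level))
```

This gives a sample proportion of 0.61 and a confidence level of 98% to the nearest integer.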
Hardest question Challenging +1.8 »
Petra is studying a particular species of bird. She takes a random sample of 12 birds from nature reserve \(A\) and measures the wing span, \(x \mathrm {~cm}\), for each bird. She then calculates a \(95 \%\) confidence interval for the population mean wing span, \(\mu \mathrm { cm }\), for birds of this species, assuming that wing spans are normally distributed. Later, she is not able to find the summary of the results for the sample, but she knows that the \(95 \%\) confidence interval is \(25.17 \leqslant \mu \leqslant 26.83\). Find the values of \(\sum x\) and \(\sum x ^ { 2 }\) for this sample. Petra also measures the wing spans of a random sample of 7 birds from nature reserve \(B\). Their wing spans, \(y \mathrm {~cm}\), are as follows. $$\begin{array} { l l l l l l l } 23.2 & 22.4 & 27.6 & 25.3 & 28.4 & 26.5 & 23.6 \end{array}$$ She believes that the mean wing span of birds found in nature reserve \(A\) is greater than the mean wing span of birds found in nature reserve \(B\). Assuming that this second sample also comes from a normal distribution, with variance the same as the first distribution, test, at the \(10 \%\) significance level, whether there is evidence to support Petra's belief.
View full question →
One-tail z-test (lower tail)

Test whether the population mean has decreased (H₁: μ < μ₀), using a one-tail test with negative critical value.

16 Standard +0.1
2.8% of questions
Show example »
5 The mean solubility rating of widgets inserted into beer cans is thought to be 84.0, in appropriate units. A random sample of 50 widgets is taken. The solubility ratings, \(x\), are summarised by $$n = 50 , \quad \Sigma x = 4070 , \quad \Sigma x ^ { 2 } = 336100$$ Test, at the \(5 \%\) significance level, whether the mean solubility rating is less than 84.0 .
View full question →
Easiest question Moderate -0.8 »
2 In the past, the mean length of a particular variety of worm has been 10.3 cm , with standard deviation 2.6 cm . Following a change in the climate, it is thought that the mean length of this variety of worm has decreased. The lengths of a random sample of 100 worms of this variety are found and the mean of this sample is found to be 9.8 cm . Assuming that the standard deviation remains at 2.6 cm , carry out a test at the \(2 \%\) significance level of whether the mean length has decreased. \(31.6 \%\) of adults in a certain town ride a bicycle. A random sample of 200 adults from this town is selected.
  1. Use a suitable approximating distribution to find the probability that more than 3 of these adults ride a bicycle.
  2. Justify your approximating distribution.
View full question →
Hardest question Standard +0.3 »
3 Batteries of type \(A\) are known to have a mean life of 150 hours. It is required to test whether a new type of battery, type \(B\), has a shorter mean life than type \(A\) batteries.
  1. Give a reason for using a sample rather than the whole population in carrying out this test.
    A random sample of 120 type \(B\) batteries are tested and it is found that their mean life is 147 hours, and an unbiased estimate of the population variance is 225 hours \(^ { 2 }\).
  2. Test, at the \(2 \%\) significance level, whether type \(B\) batteries have a shorter mean life than type \(A\) batteries.
  3. Calculate a \(94 \%\) confidence interval for the population mean life of type \(B\) batteries.
View full question →
Confidence interval with known population standard deviation

Questions where the population standard deviation (or variance) is explicitly given or stated as known, requiring use of the normal distribution directly.

14 Moderate -0.3
2.5% of questions
Show example »
2. A random sample of 30 apples was taken from a batch. The mean weight of the sample was 124 g with standard deviation 20 g .
  1. Find a \(99 \%\) confidence interval for the mean weight \(\mu\) grams of the population of apples. Write down any assumptions you made in your calculations. Given that the actual value of \(\mu\) is 140 ,
  2. state, with a reason, what you can conclude about the sample of 30 apples.
View full question →
Easiest question Moderate -0.8 »
3 A consumer group, interested in the mean fat content of a particular type of sausage, takes a random sample of 20 sausages and sends them away to be analysed. The percentage of fat in each sausage is as follows. $$\begin{array} { l l l l l l l l l l l l l l l l l l l l } 26 & 27 & 28 & 28 & 28 & 29 & 29 & 30 & 30 & 31 & 32 & 32 & 32 & 33 & 33 & 34 & 34 & 34 & 35 & 35 \end{array}$$ Assume that the percentage of fat is normally distributed with mean \(\mu\), and that the standard deviation is known to be 3 .
  1. Calculate a 98\% confidence interval for the population mean percentage of fat.
  2. The manufacturer claims that the mean percentage of fat in sausages of this type is 30 . Use your answer to part (i) to determine whether the consumer group should accept this claim.
View full question →
Hardest question Standard +0.3 »
  1. Assam produces bags of flour. The stated weight printed on the bags of flour is 3 kg . The weights of the bags of flour are normally distributed with standard deviation 0.015 kg .
Assam weighs a random sample of 9 bags of flour and finds their mean weight is 2.977 kg .
  1. Calculate the \(99 \%\) confidence interval for the mean weight of a bag of flour. Give your limits to 3 decimal places. Assam decides to increase the amount of flour put into the bags.
  2. Explain why the confidence interval has led Assam to take this action. After the increase a random sample of \(n\) bags of flour is taken. The sample mean weight of these \(n\) bags is 2.995 kg . A \(95 \%\) confidence interval for \(\mu\) gave a lower limit of less than 2.991 kg .
  3. Find the maximum value of \(n\).
View full question →
Confidence interval interpretation or related probability

Questions that extend beyond calculation to ask about interpretation of confidence intervals, probability of multiple intervals, or required sample sizes.

12 Moderate -0.0
2.1% of questions
Show example »
1 The result of a fitness trial is a random variable \(X\) which is normally distributed with mean \(\mu\) and standard deviation 2.4. A researcher uses the results from a random sample of 90 trials to calculate a \(98 \%\) confidence interval for \(\mu\). What is the width of this interval?
View full question →
Easiest question Moderate -0.8 »
1 The result of a fitness trial is a random variable \(X\) which is normally distributed with mean \(\mu\) and standard deviation 2.4. A researcher uses the results from a random sample of 90 trials to calculate a \(98 \%\) confidence interval for \(\mu\). What is the width of this interval?
View full question →
Hardest question Standard +0.8 »
3 The time taken in minutes for a certain daily train journey has a normal distribution with standard deviation 5.8. For a random sample of 20 days the journey times were noted and the mean journey time was found to be 81.5 minutes.
  1. Calculate a \(98 \%\) confidence interval for the population mean journey time.
    A student was asked for the meaning of this confidence interval. The student replied as follows.
    'The times for \(98 \%\) of these journeys are likely to be within the confidence interval.'
  2. Explain briefly whether this statement is true or not.
    Two independent 98\% confidence intervals are found.
  3. Given that at least one of these intervals contains the population mean, find the probability that both intervals contain the population mean.
View full question →
Known variance (z-distribution)

Questions where the population standard deviation is given or assumed known, requiring use of the normal (z) distribution for the confidence interval.
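
A minimal sketch of the interval this category relies on, \(\bar{x} \pm z \, \sigma / \sqrt{n}\), assuming the population standard deviation is known. The function name is hypothetical; the inputs are the sugar-packet figures from the example in this category.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean: float, sigma: float, n: int, level: float):
    """Two-sided confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # critical value for the chosen level
    half_width = z * sigma / sqrt(n)           # z times the standard error
    return sample_mean - half_width, sample_mean + half_width


lower, upper = z_confidence_interval(sample_mean=494, sigma=23, n=150, level=0.98)
print(round(lower, 2), round(upper, 2))
```

With a sample as large as 150, the Central Limit Theorem justifies the normal approximation for the sample mean even though the question does not state that the weights are normally distributed.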

12 Moderate -0.0
2.1% of questions
Show example »
1 The weights, in grams, of packets of sugar are distributed with mean \(\mu\) and standard deviation 23. A random sample of 150 packets is taken. The mean weight of this sample is found to be 494 g. Calculate a 98\% confidence interval for \(\mu\).
View full question →
Easiest question Moderate -0.8 »
1 The weights, in grams, of packets of sugar are distributed with mean \(\mu\) and standard deviation 23. A random sample of 150 packets is taken. The mean weight of this sample is found to be 494 g. Calculate a 98\% confidence interval for \(\mu\).
View full question →
Hardest question Challenging +1.2 »
7. Branwen intends to buy a new bike, either a Cannotrek or a Bianchondale. If there is evidence that the difference in the mean times on the two bikes over a 10 km time trial is more than 1.25 minutes, she will buy the faster bike. Otherwise, she will base her decision on other factors. She negotiates a test period to try both bikes. The times, in minutes, taken by Branwen to complete a 10 km time trial on the Cannotrek may be modelled by a normal distribution with mean \(\mu _ { C }\) and standard deviation \(0 \cdot 75\). The times, in minutes, taken by Branwen to complete a 10 km time trial on the Bianchondale may be modelled by a normal distribution with mean \(\mu _ { B }\) and standard deviation \(0 \cdot 6\). During the test period, she completes 6 time trials with a mean time of 19.5 minutes on the Cannotrek, and 5 time trials with a mean time of 17.3 minutes on the Bianchondale. She calculates a \(p \%\) confidence interval for \(\mu _ { C } - \mu _ { B }\).
  1. What would be the largest value of \(p\) that would lead Branwen to base her purchasing decision on the time trials, without considering other factors?
  2. State an assumption you have made in part (a).
View full question →
One-tail z-test (upper tail)

Test whether the population mean has increased (H₁: μ > μ₀), using a one-tail test with positive critical value.
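
The upper-tail test can be sketched as follows, assuming a known population standard deviation and a sample mean that is (at least approximately) normally distributed. The function name is hypothetical; the inputs are the rod-length figures from the example in this category.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean: float, mu0: float, sigma: float, n: int, alpha: float):
    """Test H0: mu = mu0 against H1: mu > mu0 with sigma known.

    Returns the test statistic, the one-tail p-value and whether H0 is rejected.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)            # area in the upper tail beyond z
    critical = NormalDist().inv_cdf(1 - alpha)   # positive critical value
    return z, p_value, z > critical


z, p_value, reject = upper_tail_z_test(sample_mean=250.06, mu0=250, sigma=0.2, n=40, alpha=0.05)
print(round(z, 3), round(p_value, 4), reject)
```

Comparing z with the critical value and comparing the p-value with the significance level are equivalent decisions; either form is normally acceptable in a written answer.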

12 Moderate -0.1
2.1% of questions
Show example »
3 The lengths, in centimetres, of rods produced in a factory have mean \(\mu\) and standard deviation 0.2. The value of \(\mu\) is supposed to be 250, but a manager claims that one machine is producing rods that are too long on average. A random sample of 40 rods from this machine is taken and the sample mean length is found to be 250.06 cm. Test at the \(5 \%\) significance level whether the manager's claim is justified.
View full question →
Easiest question Moderate -0.8 »
3 An architect wishes to investigate whether the buildings in a certain city are higher, on average, than buildings in other cities. He takes a large random sample of buildings from the city and finds the mean height of the buildings in the sample. He calculates the value of the test statistic, \(z\), and finds that \(z = 2.41\).
  1. Explain briefly whether he should use a one-tail test or a two-tail test.
  2. Carry out the test at the \(1 \%\) significance level.
View full question →
Hardest question Standard +0.3 »
3 In the past, the annual amount of wheat produced per farm by a large number of similar sized farms in a certain region had mean 24.0 tonnes and standard deviation 5.2 tonnes. Last summer a new fertiliser was used by all the farms, and it was expected that the mean amount of wheat produced per farm would be greater than 24.0 tonnes. In order to test whether this was true, a scientist recorded the amounts of wheat produced by a random sample of 50 farms last summer. He found that the value of the sample mean was 25.8 tonnes. Stating a necessary assumption, carry out the test at the \(1 \%\) significance level.
View full question →
Sample size determination

Find the required sample size to achieve a confidence interval of specified width or precision.
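
One way to sketch the calculation: the total width of a z-interval is \(2 z \sigma / \sqrt{n}\), so requiring a width below \(w\) gives \(n > (2 z \sigma / w)^{2}\). The function name is hypothetical, the standard deviation is assumed known, and the inputs are the memory-test figures from the example in this category.

```python
from math import floor, sqrt
from statistics import NormalDist


def least_n_for_width(sigma: float, level: float, max_width: float) -> int:
    """Smallest n for which the z-interval is strictly narrower than max_width."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    bound = (2 * z * sigma / max_width) ** 2
    # n must strictly exceed the bound, so take the next integer above it.
    return floor(bound) + 1


n_needed = least_n_for_width(sigma=1.9, level=0.95, max_width=2.0)
z95 = NormalDist().inv_cdf(0.975)
print(n_needed, round(2 * z95 * 1.9 / sqrt(n_needed), 3))
```

Using `floor(bound) + 1` rather than rounding up keeps the inequality strict in the edge case where the bound happens to be a whole number.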

11 Standard +0.3
2.0% of questions
Show example »
1 The result of a memory test is known to be normally distributed with mean \(\mu\) and standard deviation 1.9. It is required to have a \(95 \%\) confidence interval for \(\mu\) with a total width of less than 2.0. Find the least possible number of tests needed to achieve this.
View full question →
Easiest question Standard +0.3 »
4 A certain train journey takes place every day throughout the year. The time taken, in minutes, for the journey is normally distributed with variance 11.2.
  1. The mean time for a random sample of \(n\) of these journeys was found. A \(94 \%\) confidence interval for the population mean time was calculated and was found to have a width of 1.4076 minutes, correct to 4 decimal places. Find the value of \(n\).
  2. A passenger noted the times for 50 randomly chosen journeys in January, February and March. Give a reason why this sample is unsuitable for use in finding a confidence interval for the population mean time.
  3. A researcher took 4 random samples and a \(94 \%\) confidence interval for the population mean was found from each sample. Find the probability that exactly 3 of these confidence intervals contain the true value of the population mean.
View full question →
Hardest question Standard +0.3 »
4 A certain train journey takes place every day throughout the year. The time taken, in minutes, for the journey is normally distributed with variance 11.2.
  1. The mean time for a random sample of \(n\) of these journeys was found. A \(94 \%\) confidence interval for the population mean time was calculated and was found to have a width of 1.4076 minutes, correct to 4 decimal places. Find the value of \(n\).
  2. A passenger noted the times for 50 randomly chosen journeys in January, February and March. Give a reason why this sample is unsuitable for use in finding a confidence interval for the population mean time.
  3. A researcher took 4 random samples and a \(94 \%\) confidence interval for the population mean was found from each sample. Find the probability that exactly 3 of these confidence intervals contain the true value of the population mean.
View full question →
Paired comparison or matched samples

Questions involving the same subjects measured twice or matched pairs (e.g., same person testing two bikes, same plots with/without treatment) requiring a paired t-test approach rather than independent samples.

9 Standard +0.5
1.6% of questions
Show example »
3 A new treatment of cotton thread, designed to increase the breaking strength, was tested on a random sample of 6 pieces of a standard length. The breaking strengths, in grams, were as follows. $$\begin{array} { l l l l l l } 17.3 & 18.4 & 18.6 & 17.2 & 17.5 & 19.3 \end{array}$$ The breaking strengths of a random sample of 5 similar pieces of the thread which had not been treated were as follows. $$\begin{array} { l l l l l } 18.6 & 17.2 & 16.3 & 17.4 & 16.8 \end{array}$$ A test of whether the treatment has been successful is to be carried out.
  1. State what distributional assumptions are needed.
  2. Carry out the test at the \(10 \%\) significance level.
View full question →
Easiest question Standard +0.3 »
6 Ansal is investigating the wingspans of Monarch butterflies in two different regions, \(X\) and \(Y\). He takes a random sample of 8 Monarch butterflies from region \(X\) and records their wingspans, \(x \mathrm {~cm}\). His results are as follows. $$\begin{array} { l l l l l l l l } 8.2 & 7.0 & 7.3 & 8.8 & 7.8 & 8.5 & 9.2 & 7.4 \end{array}$$ Ansal also takes a random sample of 9 Monarch butterflies from region \(Y\) and records their wingspans, \(y \mathrm {~cm}\). His results are summarised as follows. $$\sum y = 71.10 \quad \sum y ^ { 2 } = 567.13$$ Ansal suspects that the mean wingspan of Monarch butterflies from region \(X\) is greater than the mean wingspan of Monarch butterflies from region \(Y\). It is known that the wingspans of Monarch butterflies in regions \(X\) and \(Y\) are normally distributed with equal population variances. Test, at the 10\% significance level, whether Ansal's suspicion is supported by the data.
View full question →
Hardest question Standard +0.8 »
6 A scientist is investigating the masses of a particular type of fish found in lakes \(A\) and \(B\). He chooses a random sample of 10 fish of this type from lake \(A\) and records their masses, \(x \mathrm {~kg}\), as follows. $$\begin{array} { l l l l l l l l l l } 0.9 & 1.8 & 1.8 & 1.9 & 2.1 & 2.4 & 2.6 & 2.2 & 2.5 & 3.0 \end{array}$$ The scientist also chooses a random sample of 12 fish of this type from lake \(B\), but he only has a summary of their masses, \(y \mathrm {~kg}\), as follows. $$\sum y = 24.48 \quad \sum y ^ { 2 } = 53.75$$ Test at the \(10 \%\) significance level whether the mean mass of fish of this type in lake \(A\) is greater than the mean mass of fish of this type in lake \(B\). You should state any assumptions that you need to make for the test to be valid.
View full question →
Test using proportion

Test a hypothesis about a population proportion using sample data and normal approximation.

9 Standard +0.1
1.6% of questions
Show example »
5 Each of a random sample of 200 steel bars taken from a production line was examined and 27 were found to be faulty.
  1. Find an approximate \(90 \%\) confidence interval for the proportion of faulty bars produced.
    A change in the production method was introduced which, it was claimed, would reduce the proportion of faulty bars. After the change, each of a further random sample of 100 bars was examined and 8 were found to be faulty.
  2. Test the claim, at the \(10 \%\) significance level.
View full question →
Easiest question Moderate -0.8 »
11 Zac is planning to write a report on the music preferences of the students at his college. There is a large number of students at the college.
  1. State one reason why Zac might wish to obtain information from a sample of students, rather than from all the students.
  2. Amaya suggests that Zac should use a sample that is stratified by school year. Give one advantage of this method as compared with random sampling, in this context.
    Zac decides to take a random sample of 60 students from his college. He asks each student how many hours per week, on average, they spend listening to music during term. From his results he calculates the following statistics.
    $$\begin{array} { | c | c | c | c | c | } \hline \text { Mean } & \text { Standard deviation } & \text { Median } & \text { Lower quartile } & \text { Upper quartile } \\ \hline 21.0 & 4.20 & 20.5 & 18.0 & 22.9 \\ \hline \end{array}$$
  3. Sundip tells Zac that, during term, she spends on average 30 hours per week listening to music. Discuss briefly whether this value should be considered an outlier.
  4. Layla claims that, during term, each student spends on average 20 hours per week listening to music. Zac believes that the true figure is higher than 20 hours. He uses his results to carry out a hypothesis test at the 5\% significance level. Assume that the time spent listening to music is normally distributed with standard deviation 4.20 hours. Carry out the test.
View full question →
Hardest question Challenging +1.2 »
12 The table shows information for England and Wales, taken from the UK 2011 census.
$$\begin{array} { | c | c | } \hline \text { Total population } & \text { Number of children aged 5-17 } \\ \hline 56075912 & 8473617 \\ \hline \end{array}$$
A random sample of 10000 people in another country was chosen in 2011, and the number, \(m\), of children aged 5-17 was noted.
It was found that there was evidence at the \(2.5 \%\) level that the proportion of children aged 5-17 in the same year was higher than in the UK.
Unfortunately, when the results were recorded the value of \(m\) was omitted. Use an appropriate normal distribution to find an estimate of the smallest possible value of \(m\).
View full question →
Expected number of intervals containing parameter

Calculate how many confidence intervals from multiple samples would be expected to contain the true parameter value.
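
A short sketch of the reasoning, assuming the samples are independent so that the number of intervals containing the true parameter follows a binomial distribution with success probability equal to the confidence level. The function names are hypothetical; the inputs (40 intervals at 95%, and 4 intervals at 94%) come from questions in this listing.

```python
from math import comb


def expected_covering(num_intervals: int, level: float) -> float:
    """Expected number of intervals that contain the true parameter."""
    return num_intervals * level


def prob_exactly_k_cover(num_intervals: int, k: int, level: float) -> float:
    """Binomial probability that exactly k of the intervals contain the parameter."""
    return comb(num_intervals, k) * level ** k * (1 - level) ** (num_intervals - k)


expected = expected_covering(40, 0.95)
exactly_three = prob_exactly_k_cover(4, 3, 0.94)
print(round(expected, 1), round(exactly_three, 4))
```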

8 Moderate -0.1
1.4% of questions
Show example »
1 The diameters, \(x\) millimetres, of a random sample of 200 discs made by a certain machine were recorded. The results are summarised below. $$n = 200 \quad \Sigma x = 2520 \quad \Sigma x ^ { 2 } = 31852$$
  1. Calculate a 95\% confidence interval for the population mean diameter.
  2. Jean chose 40 random samples and used each sample to calculate a 95\% confidence interval for the population mean diameter. How many of these 40 confidence intervals would be expected to include the true value of the population mean diameter?
View full question →
State assumptions for validity

State the assumptions necessary for a test or confidence interval to be valid (normality, independence, random sampling, etc.).

7 Moderate -0.4
1.2% of questions
Show example »
2 In the past the yield of a certain crop, in tonnes per hectare, had mean 0.56 and standard deviation 0.08. Following the introduction of a new fertilizer, the farmer intends to test at the \(2.5 \%\) significance level whether the mean yield has increased. He finds that the mean yield over 10 years is 0.61 tonnes per hectare.
  1. State two assumptions that are necessary for the test.
  2. Carry out the test.
View full question →
Confidence interval from coded data

Calculate confidence interval when data has been coded or transformed (e.g., y = x - 1000), then interpret in original units.

7 Standard +0.8
1.2% of questions
Show example »
1 A basketball club has a large number of players. The heights, \(x \mathrm {~m}\), of a random sample of 10 of these players are measured. A \(90 \%\) confidence interval for the population mean height, \(\mu \mathrm { m }\), of players in this club is calculated. It is assumed that heights are normally distributed. The confidence interval is \(1.78 \leqslant \mu \leqslant 2.02\). Find the values of \(\sum x\) and \(\sum x ^ { 2 }\) for this sample.
View full question →
Pooled variance estimation

Calculate pooled estimate of variance from two independent samples assumed to have equal variance.
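
The pooled estimate combines the two corrected sums of squares and divides by the combined degrees of freedom, \(s_p^{2} = (S_{xx} + S_{yy}) / (n_x + n_y - 2)\). The sketch below uses a hypothetical function name and the summary figures from the example in this category; it searches over a range of candidate sample sizes (an arbitrary choice made only for illustration) for the value that gives a pooled estimate of 3.

```python
def pooled_variance(n_x: int, sum_x: float, sum_x2: float,
                    n_y: int, sum_y: float, sum_y2: float) -> float:
    """Pooled unbiased estimate of a common variance from summary statistics."""
    s_xx = sum_x2 - sum_x ** 2 / n_x   # corrected sum of squares for x
    s_yy = sum_y2 - sum_y ** 2 / n_y   # corrected sum of squares for y
    return (s_xx + s_yy) / (n_x + n_y - 2)


# Candidate values of n for the y-sample; the range 2..50 is an assumption.
matches = [
    n for n in range(2, 51)
    if abs(pooled_variance(5, 5.5, 15.05, n, 8.0, 36.4) - 3) < 1e-9
]
print(matches)
```

In a written solution the same condition leads to a quadratic in \(n\), and only the integer root is admissible.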

6 Standard +0.5
1.1% of questions
Show example »
6 The independent random variables \(X\) and \(Y\) have distributions with the same variance \(\sigma ^ { 2 }\). Random samples of 5 observations of \(X\) and \(n\) observations of \(Y\) are made and the results are summarised by $$\Sigma x = 5.5 , \quad \Sigma x ^ { 2 } = 15.05 , \quad \Sigma y = 8.0 , \quad \Sigma y ^ { 2 } = 36.4$$ Given that the pooled estimate of \(\sigma ^ { 2 }\) is 3 , find the value of \(n\).
View full question →
Confidence interval with estimated standard deviation

Questions where the population standard deviation is unknown and must be estimated from sample data (using unbiased estimate of variance or sample standard deviation).

5 Moderate -0.2
0.9% of questions
Show example »
2 A die is biased. The mean and variance of a random sample of 70 scores on this die are found to be 3.61 and 2.70 respectively. Calculate a \(95 \%\) confidence interval for the population mean score.
View full question →
From summary statistics (Σx, Σx²)

Calculate unbiased estimates when Σx and Σx² are already provided, using formulas μ̂ = Σx/n and σ̂² = [Σx² - (Σx)²/n]/(n-1).
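
The two formulas above translate directly into code. This is a minimal sketch with a hypothetical function name; the inputs \(n = 200\), \(\Sigma x = 2520\), \(\Sigma x^{2} = 31852\) are taken from the disc-diameter question elsewhere in this listing.

```python
def unbiased_estimates(n: int, sum_x: float, sum_x2: float):
    """Unbiased estimates of the population mean and variance from n, Σx and Σx²."""
    mean = sum_x / n
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)  # divide by n - 1, not n
    return mean, variance


mean_hat, var_hat = unbiased_estimates(n=200, sum_x=2520, sum_x2=31852)
print(round(mean_hat, 2), round(var_hat, 4))
```

Dividing by \(n - 1\) rather than \(n\) is what makes the variance estimate unbiased; with large \(n\) the difference is small, but mark schemes usually expect the \(n - 1\) form.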

2 Challenging +1.0
0.4% of questions
Show example »
  1. A random sample \(W _ { 1 } , W _ { 2 } \ldots , W _ { n }\) is taken from a distribution with mean \(\mu\) and variance \(\sigma ^ { 2 }\)
    1. Write down \(\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } \right)\) and show that \(\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } \right) = n \left( \sigma ^ { 2 } + \mu ^ { 2 } \right)\)
    An estimator for \(\mu\) is $$\bar { X } = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i }$$
  2. Show that \(\bar { X }\) is a consistent estimator for \(\mu\).
    An estimator of \(\sigma ^ { 2 }\) is $$U = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } - \left( \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } \right) ^ { 2 }$$
  3. Find the bias of \(U\).
  4. Write down an unbiased estimator of \(\sigma ^ { 2 }\) in the form \(k U\), where \(k\) is in terms of \(n\).
    1. George owns a garage and he records the mileage of cars, \(x\) thousands of miles, between services. The results from a random sample of 10 cars are summarised below.
    $$\sum x = 113.4 \quad \sum x ^ { 2 } = 1414.08$$ The mileage of cars between services is normally distributed and George believes that the standard deviation is 2.4 thousand miles. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not these data support George’s belief.
    2. Every 6 months some engineers are tested to see if their times, in minutes, to assemble a particular component have changed. The times taken to assemble the component are normally distributed. A random sample of 8 engineers was chosen and their times to assemble the component were recorded in January and in July. The data are given in the table below.
    [Table 1 not reproduced]
    Figure 1 shows a graph of the power function for the scientist's test.
  5. On the same axes draw the graph of the power function for the statistician's test. Given that it takes 20 minutes to collect and test a 20 ml sample and 15 minutes to collect and test a 10 ml sample
  6. show that the expected time of the statistician's test is slower than the scientist's test for \(\lambda \mathrm { e } ^ { - \lambda } > \frac { 1 } { 3 }\)
  7. By considering the times when \(\lambda = 1\) and \(\lambda = 2\) together with the power curves in part (e) suggest, giving a reason, which test you would use.
    [Figure 1: graph of the power function for the scientist's test (image not reproduced)]
    1. The carbon content, measured in suitable units, of steel is normally distributed. Two independent random samples of steel were taken from a refining plant at different times and their carbon content recorded. The results are given below.
    $$\begin{array} { l l l l l l l } \text { Sample } A : & 1.5 & 0.9 & 1.3 & 1.2 & & \\ \text { Sample } B : & 0.4 & 0.6 & 0.8 & 0.3 & 0.5 & 0.4 \end{array}$$
  8. Stating your hypotheses clearly, carry out a suitable test, at the \(10 \%\) level of significance, to show that both samples can be assumed to have come from populations with a common variance \(\sigma ^ { 2 }\).
  9. Showing your working clearly, find the \(99 \%\) confidence interval for \(\sigma ^ { 2 }\) based on both samples.
View full question →
From frequency table

Calculate unbiased estimates from grouped or discrete frequency distributions, requiring calculation of Σfx and Σfx² from the table.
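
As a sketch of the tabular calculation, the code below forms \(\Sigma f\), \(\Sigma f y\) and \(\Sigma f y^{2}\) from class midpoints and frequencies, then the estimated mean and the unbiased variance estimate. The function name is hypothetical; the midpoints and frequencies are those of the coffee-packet table in the example for this category, as read from the listing.

```python
from math import sqrt


def grouped_summary(midpoints, frequencies):
    """Return (Σf, Σfy, Σfy²) for a grouped frequency table."""
    total_f = sum(frequencies)
    sum_fy = sum(f * y for y, f in zip(midpoints, frequencies))
    sum_fy2 = sum(f * y * y for y, f in zip(midpoints, frequencies))
    return total_f, sum_fy, sum_fy2


midpoints = [242.5, 246.5, 250.0, 253.5, 257.5]
frequencies = [8, 15, 35, 23, 9]

n, sum_fy, sum_fy2 = grouped_summary(midpoints, frequencies)
mean = sum_fy / n
unbiased_var = (sum_fy2 - sum_fy ** 2 / n) / (n - 1)
print(n, sum_fy, sum_fy2, round(mean, 1), round(sqrt(unbiased_var), 1))
```

Using midpoints treats every observation in a class as sitting at the centre of that class, so the results are estimates rather than exact sample statistics.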

1 Standard +0.3
0.2% of questions
Show example »
  1. Kaff coffee is sold in packets. A seller measures the masses of the contents of a random sample of 90 packets of Kaff coffee from her stock. The results are shown in the table below.
$$\begin{array} { | c | c | c | } \hline \text { Mass } w ( \mathrm { g } ) & \text { Midpoint } y ( \mathrm { g } ) & \text { Frequency } f \\ \hline 240 \leq w < 245 & 242.5 & 8 \\ 245 \leq w < 248 & 246.5 & 15 \\ 248 \leq w < 252 & 250.0 & 35 \\ 252 \leq w < 255 & 253.5 & 23 \\ 255 \leq w < 260 & 257.5 & 9 \\ \hline \end{array}$$
(You may use \(\sum f y ^ { 2 } = 5644171.75\))
A histogram is drawn and the class \(245 \leq w < 248\) is represented by a rectangle of width 1.2 cm and height 10 cm.
  1. Calculate the width and the height of the rectangle representing the class \(255 \leq w < 260\).
  2. Use linear interpolation to estimate the median mass of the contents of a packet of Kaff coffee to 1 decimal place.
  3. Estimate the mean and the standard deviation of the mass of the contents of a packet of Kaff coffee to 1 decimal place.
    The seller claims that the mean mass of the contents of the packets is more than the stated mass. Given that the stated mass of the contents of a packet of Kaff coffee is 250 g and the actual standard deviation of the contents of a packet of Kaff coffee is 4 g,
  4. test, using a 5\% level of significance, whether or not the seller's claim is justified. State your hypotheses clearly.
    (You may assume that the mass of the contents of a packet is normally distributed.)
  5. Using your answers to parts (b) and (c), comment on the assumption that the mass of the contents of a packet is normally distributed.
    (Total 14 marks)
View full question →
From raw data values

Calculate unbiased estimates when given individual data values (not summary statistics), requiring calculation of Σx and Σx² first.

1 Moderate -0.5
0.2% of questions
Show example »
1 The lengths, \(X\) centimetres, of a random sample of 7 leaves from a certain variety of tree are as follows.
3.9
4.8
4.8
4.4
View full question →
Two-sample t-test with raw data

Questions providing complete raw data for one or both samples where students must first calculate summary statistics before performing the hypothesis test, typically with small samples.

0
0.0% of questions
Unclassified

Questions not yet assigned to a type.

25
4.4% of questions
Show 25 unclassified »
8 The tensile strength of rope is measured in kilograms. The standard deviation of the tensile strength of a particular design of 10 mm diameter rope is known to be 285 kilograms. A retail organisation, which buys such rope from two manufacturers, A and B , wishes to compare their ropes for mean tensile strength. The mean tensile strength, \(\bar { x }\), of a random sample of 80 lengths from manufacturer A was 3770 kilograms. The mean tensile strength, \(\bar { y }\), of a random sample of 120 lengths from manufacturer B was 3695 kilograms.
    1. Test, at the \(5 \%\) level of significance, the hypothesis that there is no difference between the mean tensile strength of rope from manufacturer A and that of rope from manufacturer B.
    2. Why was it not necessary to know the distributions of tensile strength in order for your test in part (a)(i) to be valid?
    1. Deduce that, for your test in part (a)(i), the critical values of \(( \bar { x } - \bar { y } )\) are \(\pm 80.63\), correct to two decimal places.
    2. In fact, the mean tensile strength of rope from manufacturer A exceeds that of rope from manufacturer B by 125 kilograms. Determine the probability of a Type II error for a test of the hypothesis in part (a)(i) at the \(5 \%\) level of significance, based upon a random sample of 80 lengths from manufacturer A and a random sample of 120 lengths from manufacturer B. (4 marks)
7 The masses, in grams, of apples from a certain farm have mean \(\mu\) and standard deviation 5.2. The farmer says that the value of \(\mu\) is 64.6. A quality control inspector claims that the value of \(\mu\) is actually less than 64.6. In order to test his claim he chooses a random sample of 100 apples from the farm.
  1. The mean mass of the 100 apples is found to be 63.5 g. Carry out the test at the \(2.5 \%\) significance level.
  2. Later another test of the same hypotheses at the \(2.5 \%\) significance level, with another random sample of 100 apples from the same farm, is carried out. Given that the value of \(\mu\) is in fact 62.7, calculate the probability of a Type II error.
6 In previous years, the marks obtained in a French test by students attending Topnotch College have been modelled satisfactorily by a normal distribution with a mean of 65 and a standard deviation of 9. Teachers in the French department at Topnotch College suspect that this year their students are, on average, underachieving. In order to investigate this suspicion, the teachers selected a random sample of 35 students to take the French test and found that their mean score was 61.5.
  1. Investigate, at the \(5 \%\) level of significance, the teachers' suspicion.
  2. Explain, in the context of this question, the meaning of a Type I error.
8 Bottles of sherry nominally contain 1000 millilitres. After the introduction of a new method of filling the bottles, there is a suspicion that the mean volume of sherry in a bottle has changed. In order to investigate this suspicion, a random sample of 12 bottles of sherry is taken and the volume of sherry in each bottle is measured. The volumes, in millilitres, of sherry in these bottles are found to be
$$\begin{array} { l l l l l l } 996 & 1006 & 1009 & 999 & 1007 & 1003 \\ 998 & 1010 & 997 & 996 & 1008 & 1007 \end{array}$$
Assuming that the volume of sherry in a bottle is normally distributed, investigate, at the \(5 \%\) level of significance, whether the mean volume of sherry in a bottle differs from 1000 millilitres.
7 A market researcher is investigating the length of time that customers spend at an information desk. He plans to choose a sample of 50 customers on a particular day.
  1. He considers choosing the first 50 customers who visit the information desk. Explain why this method is unsuitable.
    The actual lengths of time, in minutes, that customers spend at the information desk may be assumed to have mean \(\mu\) and variance 4.8. The researcher knows that in the past the value of \(\mu\) was 6.0. He wishes to test, at the \(2 \%\) significance level, whether this is still true. He chooses a random sample of 50 customers and notes how long they each spend at the information desk.
  2. State the probability of making a Type I error and explain what is meant by a Type I error in this context.
  3. Given that the mean time spent at the information desk by the 50 customers is 6.8 minutes, carry out the test.
  4. Give a reason why it was necessary to use the Central Limit theorem in your answer to part (c).
2 In the past, the time, in hours, for a particular train journey has had mean 1.40 and standard deviation 0.12. Following the introduction of some new signals, it is required to test whether the mean journey time has decreased.
  1. State what is meant by a Type II error in this context.
  2. The mean time for a random sample of 50 journeys is found to be 1.36 hours. Assuming that the standard deviation of journey times is still 0.12 hours, test at the \(2.5 \%\) significance level whether the population mean journey time has decreased.
  3. State, with a reason, which of the errors, Type I or Type II, might have been made in the test in part (b).
4 The lengths, in millimetres, of rods produced by a machine are normally distributed with mean \(\mu\) and standard deviation 0.9. A random sample of 75 rods produced by the machine has mean length 300.1 mm .
  1. Find a \(99 \%\) confidence interval for \(\mu\), giving your answer correct to 2 decimal places.
    The manufacturer claims that the machine produces rods with mean length 300 mm .
  2. Use the confidence interval found in part (i) to comment on this claim.
8 In order to test the effect of a drug, a researcher monitors the concentration, \(X\), of a certain protein in the blood stream of patients. For patients who are not taking the drug the mean value of \(X\) is 0.185. A random sample of 150 patients taking the drug was selected and the values of \(X\) were found. The results are summarised below. $$n = 150 \quad \Sigma x = 27.0 \quad \Sigma x ^ { 2 } = 5.01$$ The researcher wishes to test at the \(1 \%\) significance level whether the mean concentration of the protein in the blood stream of patients taking the drug is less than 0.185.
  1. Carry out the test.
  2. Given that, in fact, the mean concentration for patients taking the drug is 0.175, find the probability of a Type II error occurring in the test.
7 A mill owner claims that the mean mass of sacks of flour produced at his mill is 51 kg. A quality control officer suspects that the mean mass is actually less than 51 kg. In order to test the owner's claim she finds the mass, \(x \mathrm {~kg}\), of each of a random sample of 150 sacks and her results are summarised as follows. $$n = 150 \quad \Sigma x = 7480 \quad \Sigma x ^ { 2 } = 380000$$
  1. Carry out the test at the \(2.5 \%\) significance level.
    You may now assume that the population standard deviation of the masses of sacks of flour is 6.856 kg . The quality control officer weighs another random sample of 150 sacks and carries out another test at the 2.5\% significance level.
  2. Given that the population mean mass is 49 kg , find the probability of a Type II error.
4 A company has two different machines, \(X\) and \(Y\), each of which fills empty cups with coffee. The manager is investigating the volumes of coffee, \(x\) and \(y\), measured in appropriate units, in the cups filled by machines \(X\) and \(Y\) respectively. She chooses a random sample of 50 cups filled by machine \(X\) and a random sample of 40 cups filled by machine \(Y\). The volumes are summarised as follows. $$\sum x = 15.2 \quad \sum x ^ { 2 } = 5.1 \quad \sum y = 13.4 \quad \sum y ^ { 2 } = 4.8$$ The manager claims that there is no difference between the mean volume of coffee in cups filled by machine \(X\) and the mean volume of coffee in cups filled by machine \(Y\). Test the manager's claim at the \(10 \%\) significance level.
5 A large number of children are competing in a throwing competition. The distances, in metres, thrown by a random sample of 8 children are as follows. \(\begin{array} { l l l l l l l l } 19.8 & 22.1 & 24.4 & 21.5 & 20.8 & 26.3 & 23.7 & 25.0 \end{array}\)
  1. Assuming that distances are normally distributed, test, at the \(5 \%\) significance level, whether the population mean distance thrown is more than 22.0 metres.
  2. Find a 95\% confidence interval for the population mean distance thrown.
1 A random sample of 7 observations of a variable \(X\) are as follows. $$\begin{array} { l l l l l l l } 8.26 & 7.78 & 7.92 & 8.04 & 8.27 & 7.95 & 8.34 \end{array}$$ The population mean of \(X\) is \(\mu\).
  1. Test, at the \(10 \%\) significance level, the null hypothesis \(\mu = 8.22\) against the alternative hypothesis \(\mu < 8.22\).
  2. State an assumption necessary for the test in part (a) to be valid.
4 A scientist is investigating the lengths of the leaves of birch trees in different regions. He takes a random sample of 50 leaves from birch trees in region \(A\) and a random sample of 60 leaves from birch trees in region \(B\). He records their lengths, \(x \mathrm {~cm}\) and \(y \mathrm {~cm}\), respectively. His results are summarised as follows. $$\sum x = 282 \quad \sum x ^ { 2 } = 1596 \quad \sum y = 328 \quad \sum y ^ { 2 } = 1808$$ The population mean lengths of leaves from birch trees in regions \(A\) and \(B\) are \(\mu _ { A } \mathrm {~cm}\) and \(\mu _ { B } \mathrm {~cm}\) respectively. Carry out a test at the \(5 \%\) significance level to test the null hypothesis \(\mu _ { \mathrm { A } } = \mu _ { \mathrm { B } }\) against the alternative hypothesis \(\mu _ { \mathrm { A } } \neq \mu _ { \mathrm { B } }\).
1 A manager is investigating the times taken by employees to complete a particular task as a result of the introduction of new technology. He claims that the mean time taken to complete the task is reduced by more than 0.4 minutes. He chooses a random sample of 10 employees. The times taken, in minutes, before and after the introduction of the new technology are recorded in the table.
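$$\begin{array} { l c c c c c c c c c c } \text{Employee} & A & B & C & D & E & F & G & H & I & J \\ \text{Time before new technology} & 10.2 & 9.8 & 12.4 & 11.6 & 10.8 & 11.2 & 14.6 & 10.6 & 12.3 & 11.0 \\ \text{Time after new technology} & 9.6 & 8.5 & 12.4 & 10.9 & 10.2 & 10.6 & 12.8 & 10.8 & 12.5 & 10.6 \end{array}$$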
  1. Test at the 10\% significance level whether the manager's claim is justified.
  2. State an assumption that is necessary for this test to be valid.
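The paired calculation for the employee times might be sketched as below. It takes the differences as before minus after, tests \(\mathrm{H}_0: \mu_d = 0.4\) against \(\mathrm{H}_1: \mu_d > 0.4\), and assumes the differences are normally distributed. The critical value 1.383 is the tabulated upper 10% point of the \(t\)-distribution with 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

before = [10.2, 9.8, 12.4, 11.6, 10.8, 11.2, 14.6, 10.6, 12.3, 11.0]
after = [9.6, 8.5, 12.4, 10.9, 10.2, 10.6, 12.8, 10.8, 12.5, 10.6]

# Work with the reduction in time for each employee
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
dbar = mean(diffs)
se = stdev(diffs) / sqrt(n)                # stdev uses the n - 1 divisor

claimed_reduction = 0.4
T_9_UPPER_10 = 1.383                       # table value, 9 degrees of freedom
t = (dbar - claimed_reduction) / se
reject_h0 = t > T_9_UPPER_10

print(round(dbar, 2), round(t, 3), reject_h0)
```

The same pattern applies to the other before-and-after tables in this list, with the claimed reduction and critical value changed to match each question.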
5 Raman is researching the heights of male giraffes in a particular region. Raman assumes that the heights of male giraffes in this region are normally distributed. He takes a random sample of 8 male giraffes from the region and measures the height, in metres, of each giraffe. These heights are as follows. $$\begin{array} { c c c c c c c c } 5.2 & 5.8 & 4.9 & 6.1 & 5.5 & 5.9 & 5.4 & 5.6 \end{array}$$
  1. Find a \(90 \%\) confidence interval for the population mean height of male giraffes in this region. [5]
    Raman claims that the population mean height of male giraffes in the region is less than 5.9 metres.
  2. Test at the \(2.5 \%\) significance level whether this sample provides sufficient evidence to support Raman's claim.
1 The lengths of the leaves of a particular type of tree are normally distributed with mean \(\mu \mathrm { cm }\). The lengths, \(x \mathrm {~cm}\), of a random sample of 12 leaves of this type are recorded. The results are summarised as follows. $$\sum x = 91.2 \quad \sum x ^ { 2 } = 695.8$$ Find a 95\% confidence interval for \(\mu\).
2 The children at two large schools, \(P\) and \(Q\), are all given the same puzzle to solve. A random sample of size 10 is taken from the children at school \(P\). Their individual times to complete the puzzle give a sample mean of 9.12 minutes and an unbiased variance estimate of 2.16 minutes\(^{2}\). A random sample of size 12 is taken from the children at school \(Q\). Their individual times, \(x\) minutes, to complete the puzzle are summarised by $$\sum x = 99.6 \quad \sum ( x - \bar { x } ) ^ { 2 } = 21.5$$ where \(\bar { x }\) is the sample mean. Times to complete the puzzle are assumed to be normally distributed with the same population variance. Test at the \(5 \%\) significance level whether the population mean time taken to complete the puzzle by children at school \(P\) is greater than the population mean time taken to complete the puzzle by children at school \(Q\).
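Because the question states a common population variance, a pooled two-sample \(t\)-test is one natural approach; a sketch follows. The critical value 1.725 is the tabulated upper 5% point of the \(t\)-distribution with \(10 + 12 - 2 = 20\) degrees of freedom.

```python
from math import sqrt

# School P: sample size, mean and unbiased variance estimate
n_p, mean_p, var_p = 10, 9.12, 2.16
# School Q: sample size, sum of times and sum of squared deviations
n_q, sum_q, ss_q = 12, 99.6, 21.5
mean_q = sum_q / n_q

# Pooled variance estimate: combine the sums of squared deviations
pooled_var = ((n_p - 1) * var_p + ss_q) / (n_p + n_q - 2)
se = sqrt(pooled_var * (1 / n_p + 1 / n_q))

# H0: mu_P = mu_Q against H1: mu_P > mu_Q at the 5% level
T_20_UPPER_5 = 1.725                       # table value, 20 degrees of freedom
t = (mean_p - mean_q) / se
reject_h0 = t > T_20_UPPER_5

print(round(pooled_var, 3), round(t, 3), reject_h0)
```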
1 The times taken by members of a large cycling club to complete a cross-country circuit have a normal distribution with mean \(\mu\) minutes. The times taken, \(x\) minutes, are recorded for a random sample of 14 members of the club. The results are summarised as follows, where \(\bar { x }\) is the sample mean. $$\bar { x } = 42.8 \quad \sum ( x - \bar { x } ) ^ { 2 } = 941.5$$ Find a 95\% confidence interval for \(\mu\).
6 Jade is a swimming instructor at a sports college. She claims that, as a result of an intensive training course, the mean time taken by students to swim 50 metres has reduced by more than 1 second. She chooses a random sample of 10 students. The times taken, in seconds, before and after the training course are recorded in the table.
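$$\begin{array} { l c c c c c c c c c c } \text{Student} & A & B & C & D & E & F & G & H & I & J \\ \text{Time before course} & 54.2 & 47.4 & 52.1 & 59.0 & 55.3 & 51.0 & 48.9 & 52.2 & 58.4 & 51.4 \\ \text{Time after course} & 50.1 & 46.3 & 52.5 & 58.8 & 51.4 & 48.4 & 49.5 & 48.7 & 58.3 & 51.4 \end{array}$$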
  1. Test, at the 10\% significance level, whether Jade's claim is justified.
  2. State an assumption that is necessary for this test to be valid.
1 Kayla is investigating the lengths of the leaves of a certain type of tree found in two forests \(X\) and \(Y\). She chooses a random sample of 40 leaves of this type from forest \(X\) and records their lengths, \(x \mathrm {~cm}\). She also records the lengths, \(y \mathrm {~cm}\), for a random sample of 60 leaves of this type from forest \(Y\). Her results are summarised as follows. $$\sum x = 242.0 \quad \sum x ^ { 2 } = 1587.0 \quad \sum y = 373.2 \quad \sum y ^ { 2 } = 2532.6$$ Find a \(90 \%\) confidence interval for the difference between the population mean lengths of leaves in forests \(X\) and \(Y\).
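A sketch of the interval for the difference in mean leaf length, assuming the samples of 40 and 60 are large enough for a normal approximation with unpooled variance estimates. `summarise` is a hypothetical helper introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist

def summarise(n, total, total_sq):
    """Return the sample mean and unbiased variance estimate from summary sums."""
    mean = total / n
    var = (total_sq - total**2 / n) / (n - 1)
    return mean, var

mean_x, var_x = summarise(40, 242.0, 1587.0)
mean_y, var_y = summarise(60, 373.2, 2532.6)

se = sqrt(var_x / 40 + var_y / 60)
z = NormalDist().inv_cdf(0.95)             # about 1.645 for a 90% interval
diff = mean_x - mean_y                     # estimate of mu_X - mu_Y
ci = (diff - z * se, diff + z * se)

print(round(diff, 2), [round(v, 3) for v in ci])
```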
4 Members of the Sprints athletics club have been taking part in an intense training scheme, aimed at reducing their times taken to run 400 m. For a random sample of 9 athletes from the club, the times taken, in seconds, before and after the training scheme are given in the following table.
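$$\begin{array} { l c c c c c c c c c } \text{Athlete} & A & B & C & D & E & F & G & H & I \\ \text{Time before} & 48.8 & 48.2 & 50.3 & 49.6 & 49.4 & 48.9 & 47.6 & 50.3 & 48.4 \\ \text{Time after} & 47.9 & 47.8 & 49.6 & 49.1 & 49.6 & 48.9 & 47.7 & 49.1 & 48.1 \end{array}$$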
The organiser of the training scheme claims that on average an athlete's time will be reduced by at least 0.3 seconds. Test at the 10\% significance level whether the organiser's claim is justified, stating any assumption that you make.
1 The times taken for students at a college to run 200 m have a normal distribution with mean \(\mu \mathrm { s }\). The times, \(x\) s, are recorded for a random sample of 10 students from the college. The results are summarised as follows, where \(\bar { x }\) is the sample mean. $$\bar { x } = 25.6 \quad \sum ( x - \bar { x } ) ^ { 2 } = 78.5$$
  1. Find a 90\% confidence interval for \(\mu\).
    A test of the null hypothesis \(\mu = k\) is carried out on this sample, using a \(10 \%\) significance level. The test does not support the alternative hypothesis \(\mu < k\).
  2. Find the greatest possible value of \(k\).
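Both parts of the 200 m question rest on the same standard error, as the sketch below tries to show. For part 2 it uses the fact that \(\mathrm{H}_0: \mu = k\) is not rejected in favour of \(\mu < k\) while \((\bar{x} - k)/\text{se} \ge -t_{\text{crit}}\), which rearranges to \(k \le \bar{x} + t_{\text{crit}} \times \text{se}\). The values 1.833 and 1.383 are the tabulated upper 5% and 10% points of the \(t\)-distribution with 9 degrees of freedom.

```python
from math import sqrt

n, xbar, ss = 10, 25.6, 78.5               # ss is the sum of squared deviations
s2 = ss / (n - 1)                          # unbiased variance estimate
se = sqrt(s2 / n)

# Part 1: 90% confidence interval for mu
T_9_UPPER_5 = 1.833                        # table value, 9 degrees of freedom
ci = (xbar - T_9_UPPER_5 * se, xbar + T_9_UPPER_5 * se)

# Part 2: largest k for which the one-tailed 10% test does not reject H0
T_9_UPPER_10 = 1.383                       # table value, 9 degrees of freedom
k_max = xbar + T_9_UPPER_10 * se

print([round(v, 2) for v in ci], round(k_max, 2))
```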
4 Manet has developed a new training course to help athletes improve their time taken to run 800 m. Manet claims that his course will decrease an athlete's time by more than 2 s on average. For a random sample of 10 athletes the times taken, in seconds, before and after the course are given in the following table.
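$$\begin{array} { l c c c c c c c c c c } \text{Athlete} & A & B & C & D & E & F & G & H & I & J \\ \text{Before} & 150 & 146 & 131 & 135 & 126 & 142 & 130 & 129 & 137 & 134 \\ \text{After} & 145 & 138 & 129 & 135 & 122 & 135 & 132 & 128 & 127 & 137 \end{array}$$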
Use a \(t\)-test, at the \(5 \%\) significance level, to test whether Manet's claim is justified, stating any assumption that you make.
1 Jasmine is researching the heights of pine trees in forests in two regions \(A\) and \(B\). She chooses a random sample of 50 pine trees in region \(A\) and records their heights, \(x \mathrm {~m}\). She also chooses a random sample of 60 pine trees in region \(B\) and records their heights, \(y \mathrm {~m}\). Her results are summarised as follows. $$\sum x = 1625 \quad \sum x ^ { 2 } = 53200 \quad \sum y = 1854 \quad \sum y ^ { 2 } = 57900$$ Find a \(95 \%\) confidence interval for the difference between the population mean heights of pine trees in regions \(A\) and \(B\).
6 A company manufactures copper pipes. The pipes are produced by two different machines, \(A\) and \(B\). An inspector claims that the mean diameter of the pipes produced by machine \(A\) is greater than the mean diameter of the pipes produced by machine \(B\). He takes a random sample of 12 pipes produced by machine \(A\) and measures their diameters, \(x \mathrm {~cm}\). His results are summarised as follows. $$\sum x = 6.24 \quad \sum x ^ { 2 } = 3.26$$ He also takes a random sample of 10 pipes produced by machine \(B\) and measures their diameters in cm. His results are as follows. $$\begin{array} { l l l l l l l l l l } 0.48 & 0.53 & 0.47 & 0.54 & 0.54 & 0.55 & 0.46 & 0.55 & 0.50 & 0.48 \end{array}$$ The diameters of the pipes produced by each machine are assumed to be normally distributed with equal population variances. Test at the \(2.5 \%\) significance level whether the data supports the inspector's claim.