5.05c Hypothesis test: normal distribution for population mean

681 questions

AQA Further Paper 3 Statistics 2024 June Q8
5 marks Moderate -0.3
2 The random variable \(T\) has an exponential distribution with mean 2. Find \(\mathrm{P}(T \leq 1.4)\). Circle your answer.
\(\mathrm{e}^{-2.8}\) \(\mathrm{e}^{-0.7}\) \(1 - \mathrm{e}^{-0.7}\) \(1 - \mathrm{e}^{-2.8}\)
The continuous random variable \(Y\) has cumulative distribution function $$\mathrm{F}(y) = \begin{cases} 0 & y < 2 \\ -\frac{1}{9} y^{2} + \frac{10}{9} y - \frac{16}{9} & 2 \leq y < 5 \\ 1 & y \geq 5 \end{cases}$$ Find the median of \(Y\). Circle your answer.
2 \(\frac{10 - 3\sqrt{2}}{2}\) \(\frac{7}{2}\) \(\frac{10 + 3\sqrt{2}}{2}\)
4 Research has shown that the mean number of volcanic eruptions on Earth each day is 20. Sandra records 162 volcanic eruptions during a period of one week. Sandra claims that there has been an increase in the mean number of volcanic eruptions per week. Test Sandra's claim at the \(5\%\) level of significance.
5 The continuous random variable \(X\) has probability density function $$f(x) = \begin{cases} \frac{1}{6} \mathrm{e}^{\frac{x}{3}} & 0 \leq x \leq \ln 27 \\ 0 & \text{otherwise} \end{cases}$$ Show that the mean of \(X\) is \(\frac{3}{2}(\ln 27 - 2)\).
6 Over time it has been accepted that the mean retirement age for professional baseball players is 29.5 years old. Imran claims that the mean retirement age is no longer 29.5 years old. He takes a random sample of 5 recently retired professional baseball players and records their retirement ages, \(x\). The results are $$\sum x = 152.1 \quad \text{and} \quad \sum (x - \bar{x})^{2} = 7.81$$
  1. State an assumption that you should make about the distribution of the retirement ages to investigate Imran's claim.
  2. Investigate Imran's claim, using the 10\% level of significance.
WJEC Further Unit 5 2022 June Q5
13 marks Standard +0.3
5. A laboratory carrying out screening for a certain blood disorder claims that the average time taken for test results to be returned is 38 hours. A reporter for a national newspaper suspects that the results take longer, on average, to be returned than claimed by the laboratory. The reporter finds the time, \(x\) hours, for 50 randomly selected results, in order to conduct a hypothesis test. The following summary statistics were obtained. $$\sum x = 2163 \quad \sum x ^ { 2 } = 98508$$
  1. Calculate the \(p\)-value for the reporter's hypothesis test, and complete the test using a \(5 \%\) level of significance. Hence write a headline for the reporter to use.
  2. Explain the relevance or otherwise of the Central Limit Theorem to your answer in part (a).
  3. Briefly explain why a random sample is preferable to taking a batch of 50 consecutive results.
  4. On another occasion, the reporter took a different random sample of 10 results.
    1. State, with a reason, what type of hypothesis test the reporter should use on this occasion.
    2. State one assumption required to carry out this test.
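A minimal sketch of the large-sample calculation in part 1, assuming the hypotheses \(H_0: \mu = 38\) against \(H_1: \mu > 38\) and using the sample variance in place of the unknown population variance (justified for \(n = 50\) by the Central Limit Theorem). The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n, sum_x, sum_x2 = 50, 2163, 98508   # summary statistics from the question
mu0 = 38                             # mean claimed by the laboratory (H0)

x_bar = sum_x / n
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)   # unbiased estimate of the variance
z = (x_bar - mu0) / sqrt(s2 / n)         # approximate N(0, 1) test statistic
p_value = 1 - NormalDist().cdf(z)        # one-tailed, since H1 is mu > 38

print(f"x_bar = {x_bar:.2f}, s^2 = {s2:.2f}, z = {z:.3f}, p = {p_value:.5f}")
print("Reject H0 at 5%" if p_value < 0.05 else "Do not reject H0 at 5%")
```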
WJEC Further Unit 5 2022 June Q7
19 marks Challenging +1.2
7. [Figure: cyclic quadrilateral \(ABCD\)] The diagram above shows a cyclic quadrilateral \(ABCD\), where \(\widehat{BAD} = \alpha\), \(\widehat{BCD} = \beta\) and \(\alpha + \beta = 180^{\circ}\). These angles are measured.
The random variables \(X\) and \(Y\) denote the measured values, in degrees, of \(\widehat { B A D }\) and \(\widehat { B C D }\) respectively. You are given that \(X\) and \(Y\) are independently normally distributed with standard deviation \(\sigma\) and means \(\alpha\) and \(\beta\) respectively.
  1. Calculate, correct to two decimal places, the probability that \(X + Y\) will differ from \(180 ^ { \circ }\) by less than \(\sigma\).
  2. Show that \(T _ { 1 } = 45 ^ { \circ } + \frac { 1 } { 4 } ( 3 X - Y )\) is an unbiased estimator for \(\alpha\) and verify that it is a better estimator than \(X\) for \(\alpha\).
  3. Now consider \(T _ { 2 } = \lambda X + ( 1 - \lambda ) \left( 180 ^ { \circ } - Y \right)\).
    1. Show that \(T _ { 2 }\) is an unbiased estimator for \(\alpha\) for all values of \(\lambda\).
    2. Find \(\operatorname { Var } \left( T _ { 2 } \right)\) in terms of \(\lambda\) and \(\sigma\).
    3. Hence determine the value of \(\lambda\) which gives the best unbiased estimator for \(\alpha\).
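A short numerical check of parts 1 and 3, offered as a sketch rather than a model answer. It assumes \(X + Y \sim \mathrm{N}(180, 2\sigma^2)\), which follows from independence, and \(\operatorname{Var}(T_2) = \sigma^2\left(\lambda^2 + (1 - \lambda)^2\right)\); the grid search over \(\lambda\) is only an illustration of where the minimum lies.

```python
from math import sqrt
from statistics import NormalDist

# Part 1: (X + Y - 180) / (sigma * sqrt(2)) ~ N(0, 1), so we need P(|Z| < 1/sqrt(2)).
z = 1 / sqrt(2)
prob = 2 * NormalDist().cdf(z) - 1
print(f"P(|X + Y - 180| < sigma) = {prob:.2f}")

# Part 3: Var(T2) / sigma^2 as a function of lambda.
def var_ratio(lam):
    return lam**2 + (1 - lam)**2

# Coarse grid search on [0, 1]; calculus gives the same minimiser exactly.
best_lam = min((k / 100 for k in range(101)), key=var_ratio)
print(f"lambda minimising Var(T2): {best_lam}, Var(T2)/sigma^2 = {var_ratio(best_lam)}")
```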
Pre-U Pre-U 9795/2 2011 June Q3
10 marks Moderate -0.8
3 The fuel economy of two similar cars produced by manufacturers \(A\) and \(B\) was compared. A random sample of 15 cars was selected from manufacturer \(A\) and a random sample of 10 cars was selected from manufacturer \(B\). All the selected cars were driven over the same distance and the petrol consumption in miles per gallon (mpg) was calculated for each car. The results, \(x_A\) mpg and \(x_B\) mpg for cars from manufacturers \(A\) and \(B\) respectively, are summarised below, where \(\bar{x}\) denotes the sample mean and \(n\) the sample size. $$\begin{array}{lll} \Sigma x_A = 460.5 & \Sigma \left( x_A - \bar{x}_A \right)^2 = 156.88 & n_A = 15 \\ \Sigma x_B = 334 & \Sigma \left( x_B - \bar{x}_B \right)^2 = 123.97 & n_B = 10 \end{array}$$
  1. (a) Assuming that the populations are normally distributed with a common variance, show that the pooled estimate of this common variance is 12.21, correct to 4 significant figures. [2]
    (b) Construct a 95\% confidence interval for \(\mu_B - \mu_A\), the difference in the population means for manufacturers \(A\) and \(B\).
  2. Comment on a claim that the fuel economy for manufacturer \(B\)'s cars is better than that for manufacturer \(A\)'s cars.
  1. A random variable \(X\) has probability density function given by $$\mathrm { f } ( x ) = \begin{cases} \frac { 1 } { \theta } \mathrm { e } ^ { - \frac { x } { \theta } } & x \geqslant 0 \\ 0 & x < 0 \end{cases}$$ where \(\theta\) is a positive constant. Find \(\mathrm { E } \left( X ^ { 2 } \right)\).
  2. A random sample \(X _ { 1 } , X _ { 2 } , \ldots , X _ { n }\) is taken from a population with the distribution in part (i). The estimator \(T\) is defined by \(T = k \sum _ { i = 1 } ^ { n } X _ { i } ^ { 2 }\), where \(k\) is a constant. Find the value of \(k\) such that \(T\) is an unbiased estimator of \(\theta ^ { 2 }\).
  1. The discrete random variable \(X\) has distribution \(\operatorname { Geo } ( p )\). Show that the moment generating function of \(X\) is given by \(\mathrm { M } _ { X } ( t ) = \frac { p \mathrm { e } ^ { t } } { 1 - q \mathrm { e } ^ { t } }\), where \(q = 1 - p\).
  2. Use the moment generating function to find
    1. \(\mathrm { E } ( X )\),
    2. \(\operatorname { Var } ( X )\).
  3. An unbiased six-sided die is thrown repeatedly until a five is obtained, and \(Y\) denotes the number of throws up to and including the throw on which the five is obtained. Find \(\mathrm{P}(|Y - \mu| < \sigma)\), where \(\mu\) and \(\sigma\) are the mean and standard deviation, respectively, of the distribution of \(Y\).
  1. The continuous random variable \(X\) has a uniform distribution over the interval \(0 < x < \frac{1}{2}\pi\). Show that the probability density function of \(Y\), where \(Y = \sin X\), is given by $$\mathrm{f}(y) = \begin{cases} \frac{2}{\pi \sqrt{1 - y^2}} & 0 < y < 1 \\ 0 & \text{otherwise.} \end{cases}$$
  2. Deduce, using the probability density function, the exact values of
    (a) the median value of \(Y\),
    (b) \(\mathrm{E}(Y)\).
Pre-U Pre-U 9795/2 2013 November Q5
Standard +0.3
5 The random variable \(X\) has a binomial distribution with parameters \(n\) and \(p\), where \(p > 0.5\). A random sample of \(4 n\) observations of \(X\) is taken and \(\bar { X }\) denotes the sample mean. It is given that \(\mathrm { E } ( \bar { X } ) = 180\) and \(\operatorname { Var } ( \bar { X } ) = 0.0225\).
  1. Find
    1. the values of \(p\) and \(n\),
    2. \(\mathrm { P } ( \bar { X } < 179.8 )\),
    3. the value of \(a\) for which \(\mathrm { P } ( 180 - a < \bar { X } < 180 + a ) = 0.99\), giving your answer correct to 2 decimal places.
  2. State how you have used the Central Limit Theorem in part (i).
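The algebra behind this question can be checked numerically. The sketch below assumes \(\mathrm{E}(\bar{X}) = np\) and \(\operatorname{Var}(\bar{X}) = \frac{np(1-p)}{4n} = \frac{p(1-p)}{4}\), and then treats \(\bar{X}\) as approximately normal by the Central Limit Theorem; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean_xbar, var_xbar = 180, 0.0225   # given E(X_bar) and Var(X_bar)

# p(1 - p) = 4 * Var(X_bar); take the root with p > 0.5.
pq = 4 * var_xbar
p = (1 + sqrt(1 - 4 * pq)) / 2
n = mean_xbar / p

# X_bar is approximately N(180, 0.0225) for a sample of 4n observations.
xbar_dist = NormalDist(mean_xbar, sqrt(var_xbar))
prob_below = xbar_dist.cdf(179.8)

# P(180 - a < X_bar < 180 + a) = 0.99 puts 0.005 in each tail.
a = NormalDist().inv_cdf(0.995) * sqrt(var_xbar)

print(f"p = {p:.2f}, n = {n:.0f}")
print(f"P(X_bar < 179.8) = {prob_below:.4f}")
print(f"a = {a:.2f}")
```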
Pre-U Pre-U 9795/2 2019 Specimen Q4
3 marks Standard +0.3
4 The independent random variables \(X\) and \(Y\) have normal distributions where \(X \sim \mathrm {~N} \left( \mu , \sigma ^ { 2 } \right)\) and \(Y \sim \mathrm {~N} \left( 3 \mu , 4 \sigma ^ { 2 } \right)\). Two random samples each of size \(n\) are taken, one from each of these normal populations.
  1. Show that \(a \bar { X } + b \bar { Y }\) is an unbiased estimator of \(\mu\) provided that \(a + 3 b = 1\), where \(a\) and \(b\) are constants and \(\bar { X }\) and \(\bar { Y }\) are the respective sample means. In the remainder of the question assume that \(a \bar { X } + b \bar { Y }\) is an unbiased estimator of \(\mu\).
  2. Show that \(\operatorname{Var}(a \bar{X} + b \bar{Y})\) can be written as \(\frac{\sigma^2}{n} \left( 1 - 6b + 13b^2 \right)\).
  3. The value of the constant \(b\) can be varied. Find the value of \(b\) that gives the minimum of \(\operatorname { Var } ( a \bar { X } + b \bar { Y } )\), and hence find the minimum of \(\operatorname { Var } ( a \bar { X } + b \bar { Y } )\) in terms of \(\sigma\) and \(n\).
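As a check on part 3, the quadratic factor \(1 - 6b + 13b^2\) can be minimised exactly with rational arithmetic. This is a sketch that assumes the result of part 2; the helper name `q` is illustrative.

```python
from fractions import Fraction

def q(b):
    """Quadratic factor of Var(a*X_bar + b*Y_bar), in units of sigma^2 / n."""
    return 1 - 6 * b + 13 * b * b

# Vertex of the upward-opening quadratic 13b^2 - 6b + 1 is at b = 6 / (2 * 13).
b_min = Fraction(6, 2 * 13)
q_min = q(b_min)

print(f"b = {b_min}, minimum variance = ({q_min}) * sigma^2 / n")
```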
CAIE FP2 2012 June Q7
8 marks Standard +0.3
A random sample of 8 swimmers from a swimming club were timed over a distance of 100 metres, once in an outdoor pool and once in an indoor pool. Their times, in seconds, are given in the following table.
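
| Swimmer | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| Outdoor time | 66.2 | 62.4 | 60.8 | 65.4 | 68.8 | 64.3 | 65.2 | 67.2 |
| Indoor time | 66.1 | 60.3 | 60.9 | 65.2 | 66.4 | 63.8 | 62.4 | 69.8 |
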
Assuming a normal distribution, test, at the 5% significance level, whether there is a non-zero difference between mean time in the outdoor pool and mean time in the indoor pool. [8]
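A sketch of the paired-sample \(t\)-test for the swimmers' times, assuming the differences (outdoor minus indoor) are normally distributed and testing \(H_0: \mu_d = 0\) against \(H_1: \mu_d \neq 0\). The critical value is the standard table value for 7 degrees of freedom, entered by hand.

```python
from math import sqrt
from statistics import mean, stdev

outdoor = [66.2, 62.4, 60.8, 65.4, 68.8, 64.3, 65.2, 67.2]
indoor = [66.1, 60.3, 60.9, 65.2, 66.4, 63.8, 62.4, 69.8]

# Work with the paired differences, one per swimmer.
d = [o - i for o, i in zip(outdoor, indoor)]
n = len(d)

t_stat = mean(d) / (stdev(d) / sqrt(n))   # stdev uses the n - 1 divisor
t_crit = 2.365                            # two-tailed 5% point of t(7), from tables

print(f"mean difference = {mean(d):.3f}, t = {t_stat:.3f}")
print("Reject H0" if abs(t_stat) > t_crit else "Do not reject H0")
```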
CAIE FP2 2012 June Q9
10 marks Standard +0.3
A random sample of 8 observations of a normal random variable \(X\) gave the following summarised data, where \(\overline{x}\) denotes the sample mean. $$\Sigma x = 42.5 \quad \Sigma(x - \overline{x})^2 = 15.519$$ Test, at the 5% significance level, whether the population mean of \(X\) is greater than 4.5. [7] Calculate a 95% confidence interval for the population mean of \(X\). [3]
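A sketch of the one-sample \(t\)-test and confidence interval from the summarised data, assuming \(H_0: \mu = 4.5\) against \(H_1: \mu > 4.5\). Both critical values are standard table values for 7 degrees of freedom, entered by hand.

```python
from math import sqrt

n, sum_x, sxx = 8, 42.5, 15.519   # sxx is the sum of (x - x_bar)^2
mu0 = 4.5

x_bar = sum_x / n
s2 = sxx / (n - 1)                # unbiased estimate of the population variance
se = sqrt(s2 / n)                 # standard error of the sample mean

t_stat = (x_bar - mu0) / se
t_one_tail = 1.895                # upper 5% point of t(7), from tables
t_two_tail = 2.365                # upper 2.5% point of t(7), from tables

ci = (x_bar - t_two_tail * se, x_bar + t_two_tail * se)

print(f"t = {t_stat:.3f}")
print("Reject H0" if t_stat > t_one_tail else "Do not reject H0")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```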
CAIE FP2 2017 June Q7
7 marks Standard +0.8
A farmer grows a particular type of fruit tree. On average, the mass of fruit produced per tree has been 6.2 kg. He has developed a new kind of soil and claims that the mean mass of fruit produced per tree when growing in this new soil has increased. A random sample of 10 trees grown in the new soil is chosen. The masses, \(x\) kg, of fruit produced are summarised as follows. $$\Sigma x = 72.0 \qquad \Sigma x^2 = 542.0$$ Test at the 5% significance level whether the farmer's claim is justified, assuming a normal distribution. [7]
CAIE FP2 2017 June Q9
10 marks Challenging +1.2
Two fish farmers \(X\) and \(Y\) produce a particular type of fish. Farmer \(X\) chooses a random sample of 8 of his fish and records the masses, \(x\) kg, as follows. $$1.2 \quad 1.4 \quad 0.8 \quad 2.1 \quad 1.8 \quad 2.6 \quad 1.5 \quad 2.0$$ Farmer \(Y\) chooses a random sample of 10 of his fish and summarises the masses, \(y\) kg, as follows. $$\Sigma y = 20.2 \qquad \Sigma y^2 = 44.6$$ You should assume that both distributions are normal with equal variances. Test at the 10% significance level whether the mean mass of fish produced by farmer \(X\) differs from the mean mass of fish produced by farmer \(Y\). [10]
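A sketch of the pooled two-sample \(t\)-test for the fish masses, assuming \(H_0: \mu_X = \mu_Y\) against \(H_1: \mu_X \neq \mu_Y\) and the equal-variance normal model stated in the question. The critical value is the standard table value for 16 degrees of freedom, entered by hand.

```python
from math import sqrt

x = [1.2, 1.4, 0.8, 2.1, 1.8, 2.6, 1.5, 2.0]
n_x = len(x)
n_y, sum_y, sum_y2 = 10, 20.2, 44.6

x_bar = sum(x) / n_x
y_bar = sum_y / n_y

# Corrected sums of squares for each sample.
sxx = sum(v * v for v in x) - sum(x) ** 2 / n_x
syy = sum_y2 - sum_y ** 2 / n_y

# Pooled estimate of the common variance, with n_x + n_y - 2 degrees of freedom.
s2_pooled = (sxx + syy) / (n_x + n_y - 2)

t_stat = (x_bar - y_bar) / sqrt(s2_pooled * (1 / n_x + 1 / n_y))
t_crit = 1.746   # two-tailed 10% point of t(16), from tables

print(f"pooled variance = {s2_pooled:.4f}, t = {t_stat:.3f}")
print("Reject H0" if abs(t_stat) > t_crit else "Do not reject H0")
```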
CAIE FP2 2017 June Q6
5 marks Standard +0.8
The independent variables \(X\) and \(Y\) have distributions with the same variance \(\sigma^2\). Random samples of \(N\) observations of \(X\) and \(2N\) observations of \(Y\) are taken, and the results are summarised by $$\Sigma x = 4, \quad \Sigma x^2 = 10, \quad \Sigma y = 8, \quad \Sigma y^2 = 102.$$ These data give a pooled estimate of \(10\) for \(\sigma^2\). Find \(N\). [5]
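One way to check an answer to this question is to evaluate the pooled estimate for candidate values of \(N\). The sketch below assumes the usual pooled formula with \(N + 2N - 2\) degrees of freedom; the search range of 1 to 100 is an arbitrary choice for illustration.

```python
def pooled_estimate(N):
    """Pooled variance estimate for sample sizes N and 2N from the summary data."""
    sxx = 10 - 4 ** 2 / N           # sum of x^2 minus (sum of x)^2 / N
    syy = 102 - 8 ** 2 / (2 * N)    # sum of y^2 minus (sum of y)^2 / (2N)
    return (sxx + syy) / (N + 2 * N - 2)

# Keep the integer sample sizes that reproduce the stated pooled estimate of 10.
solutions = [N for N in range(1, 101) if abs(pooled_estimate(N) - 10) < 1e-9]
print(solutions)
```

Setting the formula equal to 10 by hand gives the quadratic \(5N^2 - 22N + 8 = 0\), whose only integer root agrees with the search.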
CAIE FP2 2019 June Q8
8 marks Standard +0.3
A large number of runners are attending a summer training camp. A random sample of 6 runners is chosen and their times to run 1500 m at the beginning of the camp and at the end of the camp are recorded. Their times, in minutes, are shown in the following table.

| Runner | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) |
|---|---|---|---|---|---|---|
| Time at beginning of camp | 3.82 | 3.62 | 3.55 | 3.71 | 3.75 | 3.92 |
| Time at end of camp | 3.72 | 3.55 | 3.52 | 3.68 | 3.54 | 3.73 |

The organiser of the training camp claims that a runner's time will improve by more than 0.05 minutes between the beginning and end of the camp. Assuming that differences in time over the two runs are normally distributed, test at the 10% significance level whether the organiser's claim is justified. [8]
CAIE FP2 2009 November Q8
9 marks Challenging +1.2
150 sheep, chosen from a large flock of sheep, were divided into two groups of 75. Over a fixed period, one group had their grazing controlled and the other group grazed freely. The gains in weight, in kg, were recorded for each animal and the table below shows the sample means and the unbiased estimates of the population variances for the two samples.

| | Sample mean | Unbiased estimate of population variance |
|---|---|---|
| Controlled grazing | 19.14 | 20.54 |
| Free grazing | 15.36 | 9.84 |

It is required to test whether the population mean for sheep having their grazing controlled exceeds the population mean for sheep grazing freely by less than 5 kg. State, giving a reason, if it is necessary for the validity of the test to assume that the two population variances are equal. [1] Stating any other assumption, carry out the test at the 5\% significance level. [8]
CAIE FP2 2010 November Q9
10 marks Challenging +1.2
A national athletics coach suspects that, on average, 200-metre runners' indoor times exceed their outdoor times by more than 0.1 seconds. In order to test this, the coach randomly selects eight 200-metre runners and records their indoor and outdoor times. The results, in seconds, are shown in the table.
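
| Runner | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| Indoor time | 21.5 | 21.8 | 20.9 | 21.2 | 21.4 | 21.4 | 21.2 | 21.0 |
| Outdoor time | 21.1 | 21.7 | 20.7 | 20.9 | 21.3 | 21.0 | 21.1 | 20.8 |
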
Stating suitable hypotheses and any necessary assumption that you make, test the coach's suspicion at the 2.5% level of significance. [10]
CAIE FP2 2014 November Q6
5 marks Challenging +1.2
A random sample of 50 observations of a random variable \(X\) and a random sample of 60 observations of a random variable \(Y\) are taken. The results for the sample means, \(\bar{x}\) and \(\bar{y}\), and the unbiased estimates for the population variances, \(s_x^2\) and \(s_y^2\), respectively, are as follows. $$\bar{x} = 25.4 \quad \bar{y} = 23.6 \quad s_x^2 = 23.2 \quad s_y^2 = 27.8$$ A test, at the \(\alpha\%\) significance level, of the null hypothesis that the population means of \(X\) and \(Y\) are equal against the alternative hypothesis that they are not equal is carried out. Given that the null hypothesis is not rejected, find the set of possible values of \(\alpha\). [5]
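A sketch of the calculation behind this question, assuming the samples are large enough for the test statistic to be treated as standard normal. The null hypothesis is not rejected exactly when \(\alpha\%\) is below the two-tailed \(p\)-value, so the code reports that bound.

```python
from math import sqrt
from statistics import NormalDist

n_x, n_y = 50, 60
x_bar, y_bar = 25.4, 23.6
s2_x, s2_y = 23.2, 27.8

# Two-sample z statistic using the unbiased variance estimates.
z = (x_bar - y_bar) / sqrt(s2_x / n_x + s2_y / n_y)

# Two-tailed p-value, expressed as a percentage to compare with alpha.
p_percent = 100 * 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}")
print(f"H0 is not rejected for alpha < {p_percent:.1f} (approximately)")
```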
CAIE FP2 2018 November Q8
9 marks Standard +0.3
The weekly salaries of employees at two large electronics companies, \(A\) and \(B\), are being compared. The weekly salary of an employee from company \(A\) and an employee from company \(B\) are denoted by \(\$x\) and \(\$y\) respectively. A random sample of 50 employees from company \(A\) and a random sample of 40 employees from company \(B\) give the following summarised data. $$\Sigma x = 5120 \quad \Sigma x^2 = 531000 \quad \Sigma y = 3760 \quad \Sigma y^2 = 375135$$
  1. The population mean salaries of employees from companies \(A\) and \(B\) are denoted by \(\$\mu_A\) and \(\$\mu_B\) respectively. Using a 5\% significance level, test the null hypothesis \(\mu_A = \mu_B\) against the alternative hypothesis \(\mu_A \neq \mu_B\). [8]
  2. State, with a reason, whether any assumptions about the distributions of employees' salaries are needed for the test in part (i). [1]
CAIE FP2 2018 November Q9
10 marks Standard +0.3
There are a large number of students at a particular college. The heights, in metres, of a random sample of 8 students are as follows. $$1.75 \quad 1.72 \quad 1.62 \quad 1.70 \quad 1.82 \quad 1.75 \quad 1.68 \quad 1.84$$ You may assume that heights of students are normally distributed.
  1. Test, at the 5\% significance level, whether the population mean height of students at this college is greater than 1.70 metres. [7]
  2. Find a 95\% confidence interval for the population mean height of students at this college. [3]
CAIE FP2 2018 November Q6
6 marks Moderate -0.3
The heights, in metres, of a random sample of 8 trees of a particular type are as follows. 14.2 11.3 10.8 8.4 12.8 11.5 12.1 9.2 Assuming that heights of trees of this type are normally distributed, calculate a 95% confidence interval for the mean height of trees of this type. [6]
CAIE FP2 2019 November Q6
7 marks Challenging +1.2
A random sample of 9 members is taken from the large number of members of a sports club, and their heights are measured. The heights of all the members of the club are assumed to be normally distributed. A 95% confidence interval for the population mean height, \(\mu\) metres, is calculated from the data as \(1.65 \leqslant \mu \leqslant 1.85\).
  1. Find an unbiased estimate for the population variance. [3]
  2. Denoting the height of a member of the club by \(x\) metres, find \(\Sigma x^2\) for this sample of 9 members. [4]
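Working backwards from the interval can be sketched as follows, assuming the interval has the form \(\bar{x} \pm t \, s / \sqrt{n}\) with 8 degrees of freedom. The critical value is the standard table value, entered by hand.

```python
from math import sqrt

n = 9
lower, upper = 1.65, 1.85
t_crit = 2.306                      # upper 2.5% point of t(8), from tables

x_bar = (lower + upper) / 2         # centre of the interval
half_width = (upper - lower) / 2

# half_width = t_crit * s / sqrt(n), so solve for s and square it.
s = half_width * sqrt(n) / t_crit
s2 = s ** 2

# s2 = (sum of x^2 - n * x_bar^2) / (n - 1), rearranged for the sum of squares.
sum_x2 = (n - 1) * s2 + n * x_bar ** 2

print(f"unbiased variance estimate = {s2:.4f}")
print(f"sum of x^2 = {sum_x2:.2f}")
```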
CAIE FP2 2019 November Q8
9 marks Challenging +1.2
A random sample of 8 elephants from region \(A\) is taken and their weights, \(x\) tonnes, are recorded. (1 tonne = 1000 kg.) The results are summarised as follows. $$\Sigma x = 32.4 \quad \Sigma x^2 = 131.82$$ A random sample of 10 elephants from region \(B\) is taken. Their weights give a sample mean of 3.78 tonnes and an unbiased variance estimate of 0.1555 tonnes\(^2\). The distributions of the weights of elephants in regions \(A\) and \(B\) are both assumed to be normal with the same population variance. Test at the 10% significance level whether the mean weight of elephants in region \(A\) is the same as the mean weight of elephants in region \(B\). [9]