5.05a Sample mean distribution: central limit theorem

222 questions

CAIE FP2 2014 June Q9
Easy -3.0
9 The continuous random variable \(X\) has distribution function F given by $$\mathrm { F } ( x ) = \begin{cases} 0 & x < 2 , \\ \frac { 1 } { 8 } x - \frac { 1 } { 4 } & 2 \leqslant x \leqslant 10 , \\ 1 & x > 10 . \end{cases}$$ Find the value of \(k\) for which \(\mathrm { P } ( X \geqslant k ) = 0.6\). The random variable \(Y\) is defined by \(Y = 2 \ln X\). Find the distribution function of \(Y\). Find the probability density function of \(Y\) and sketch its graph.
CAIE S2 2023 March Q4
5 marks Standard +0.3
4 The number of accidents per 3-month period on a certain road has the distribution \(\operatorname { Po } ( \lambda )\). In the past the value of \(\lambda\) has been 5.7. Following some changes to the road, the council carries out a hypothesis test to determine whether the value of \(\lambda\) has decreased. If there are fewer than 3 accidents in a randomly chosen 3-month period, the council will conclude that the value of \(\lambda\) has decreased.
  1. Find the probability of a Type I error.
  2. Find the probability of a Type II error if the mean number of accidents per 3-month period is now actually 0.9.
CAIE S2 2023 March Q6
9 marks Standard +0.3
6 Last year, the mean time taken by students at a school to complete a certain test was 25 minutes. Akash believes that the mean time taken by this year's students was less than 25 minutes. In order to test this belief, he takes a large random sample of this year's students and he notes the time taken by each student. He carries out a test, at the \(2.5 \%\) significance level, for the population mean time, \(\mu\) minutes. Akash uses the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 25\).
  1. Give a reason why Akash should use a one-tailed test.
    Akash finds that the value of the test statistic is \(z = - 2.02\).
  2. Explain what conclusion he should draw.
    In a different one-tailed hypothesis test the \(z\)-value was found to be 2.14.
  3. Given that this value would lead to a rejection of the null hypothesis at the \(\alpha \%\) significance level, find the set of possible values of \(\alpha\).
    The population mean time taken by students at another school to complete a test last year was \(m\) minutes. Sorin carries out a one-tailed test to determine whether the population mean this year is less than \(m\), using a random sample of 100 students. He assumes that the population standard deviation of the times is 3.9 minutes. The sample mean is 24.8 minutes, and this result just leads to the rejection of the null hypothesis at the 5\% significance level.
  4. Find the value of \(m\).
CAIE S2 2020 June Q2
7 marks Moderate -0.3
2 In the past the yield of a certain crop, in tonnes per hectare, had mean 0.56 and standard deviation 0.08. Following the introduction of a new fertilizer, the farmer intends to test at the \(2.5 \%\) significance level whether the mean yield has increased. He finds that the mean yield over 10 years is 0.61 tonnes per hectare.
  1. State two assumptions that are necessary for the test.
  2. Carry out the test.
CAIE S2 2020 June Q4
6 marks Standard +0.3
4 A fair spinner has five sides numbered \(1,2,3,4,5\). The score on one spin is denoted by \(X\).
  1. Show that \(\operatorname { Var } ( X ) = 2\).
    Fiona has another spinner, also with five sides numbered \(1,2,3,4,5\). She suspects that it is biased so that the expected score is less than 3. In order to test her suspicion, she plans to spin her spinner 40 times. If the mean score is less than 2.6 she will conclude that her spinner is biased in this way.
  2. Find the probability of a Type I error.
  3. State what is meant by a Type II error in this context.
CAIE S2 2020 June Q2
6 marks Moderate -0.8
2 A shop obtains apples from a certain farm. It has been found that 5\% of apples from this farm are Grade A. Following a change in growing conditions at the farm, the shop management plan to carry out a hypothesis test to find out whether the proportion of Grade A apples has increased. They select 25 apples at random. If the number of Grade A apples is more than 3 they will conclude that the proportion has increased.
  1. State suitable null and alternative hypotheses for the test.
  2. Find the probability of a Type I error.
    In fact 2 of the 25 apples were Grade A.
  3. Which of the errors, Type I or Type II, is possible? Justify your answer.
CAIE S2 2020 June Q4
12 marks Standard +0.3
4 The score on one spin of a 5-sided spinner is denoted by the random variable \(X\) with probability distribution as shown in the table. $$\begin{array} { c | c c c c c } x & 0 & 1 & 2 & 3 & 4 \\ \hline \mathrm { P } ( X = x ) & 0.1 & 0.2 & 0.4 & 0.2 & 0.1 \end{array}$$
  1. Show that \(\operatorname { Var } ( X ) = 1.2\).
    The spinner is spun 200 times. The score on each spin is noted and the mean, \(\bar { X }\), of the 200 scores is found.
  2. Given that \(\mathrm { P } ( \bar { X } > a ) = 0.1\), find the value of \(a\).
  3. Explain whether it was necessary to use the Central Limit theorem in your answer to part (b).
  4. Johann has another, similar, spinner. He suspects that it is biased so that the mean score is less than 2. He spins his spinner 200 times and finds that the mean of the 200 scores is 1.86. Given that the variance of the score on one spin of this spinner is also 1.2, test Johann's suspicion at the 5\% significance level.
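
The calculations behind this question can be checked numerically. The following is a minimal sketch, not a model answer: it assumes the probability table reads 0.1, 0.2, 0.4, 0.2, 0.1 for scores 0 to 4, and it uses the Central Limit theorem approximation for the sample mean of 200 spins. The variable names are illustrative only.

```python
from fractions import Fraction
from math import sqrt
from statistics import NormalDist

# Probability table from the question: P(X = x) for x = 0..4 (assumed reading).
pmf = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(4, 10),
       3: Fraction(2, 10), 4: Fraction(1, 10)}

# Exact mean and variance: Var(X) = E(X^2) - E(X)^2.
mean = sum(x * p for x, p in pmf.items())
variance = sum(x * x * p for x, p in pmf.items()) - mean ** 2

n = 200
# CLT approximation: the sample mean is roughly N(mean, variance / n).
xbar = NormalDist(mu=float(mean), sigma=sqrt(float(variance) / n))

# Value a with P(sample mean > a) = 0.1, i.e. the 90th percentile.
a = xbar.inv_cdf(0.9)

# One-tailed test of H0: mean = 2 against H1: mean < 2, observed mean 1.86.
p_value = xbar.cdf(1.86)
reject_h0 = p_value < 0.05

print(variance, round(a, 3), round(p_value, 4), reject_h0)
```

Exact fractions are used for the variance so that the "show that" value can be compared without floating-point rounding.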
CAIE S2 2002 June Q5
8 marks Moderate -0.5
5 To test whether a coin is biased or not, it is tossed 10 times. The coin will be considered biased if there are 9 or 10 heads, or 9 or 10 tails.
  1. Show that the probability of making a Type I error in this test is approximately 0.0215.
  2. Find the probability of making a Type II error in this test when the probability of a head is actually 0.7.
CAIE S2 2003 June Q5
8 marks Standard +0.3
5 Over a long period of time it is found that the time spent at cash withdrawal points follows a normal distribution with mean 2.1 minutes and standard deviation 0.9 minutes. A new system is tried out, to speed up the procedure. The null hypothesis is that the mean time spent is the same under the new system as previously. It is decided to reject the null hypothesis and accept that the new system is quicker if the mean withdrawal time from a random sample of 20 cash withdrawals is less than 1.7 minutes. Assume that, for the new system, the standard deviation is still 0.9 minutes, and the time spent still follows a normal distribution.
  1. Calculate the probability of a Type I error.
  2. If the mean withdrawal time under the new system is actually 1.5 minutes, calculate the probability of a Type II error.
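
As a rough illustration of how the two error probabilities relate to the rejection rule, the sketch below evaluates both under the assumptions stated in the question (normal times, standard deviation 0.9, sample size 20, reject when the sample mean is below 1.7). The helper name `prob_reject` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, critical = 0.9, 20, 1.7
se = sigma / sqrt(n)  # standard deviation of the sample mean


def prob_reject(true_mean):
    """P(sample mean < critical value) when the population mean is true_mean."""
    return NormalDist(true_mean, se).cdf(critical)


# Type I error: H0 (mean 2.1) is true but the test rejects it.
type_1 = prob_reject(2.1)

# Type II error: the mean is really 1.5 but the test fails to reject H0.
type_2 = 1 - prob_reject(1.5)

print(round(type_1, 4), round(type_2, 3))
```

Both probabilities come from the same rejection region; only the assumed population mean changes.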
CAIE S2 2020 June Q7
9 marks Moderate -0.8
7 A market researcher is investigating the length of time that customers spend at an information desk. He plans to choose a sample of 50 customers on a particular day.
  1. He considers choosing the first 50 customers who visit the information desk. Explain why this method is unsuitable.
    The actual lengths of time, in minutes, that customers spend at the information desk may be assumed to have mean \(\mu\) and variance 4.8. The researcher knows that in the past the value of \(\mu\) was 6.0. He wishes to test, at the \(2 \%\) significance level, whether this is still true. He chooses a random sample of 50 customers and notes how long they each spend at the information desk.
  2. State the probability of making a Type I error and explain what is meant by a Type I error in this context.
  3. Given that the mean time spent at the information desk by the 50 customers is 6.8 minutes, carry out the test.
  4. Give a reason why it was necessary to use the Central Limit theorem in your answer to part (c).
CAIE S2 2021 June Q4
8 marks Moderate -0.8
4 Wendy's journey to work consists of three parts: walking to the train station, riding on the train and then walking to the office. The times, in minutes, for the three parts of her journey are independent and have the distributions \(\mathrm { N } \left( 15.0,1.1 ^ { 2 } \right) , \mathrm { N } \left( 32.0,3.5 ^ { 2 } \right)\) and \(\mathrm { N } \left( 8.6,1.2 ^ { 2 } \right)\) respectively.
  1. Find the mean and variance of the total time for Wendy's journey.
    If Wendy's journey takes more than 60 minutes, she is late for work.
  2. Find the probability that, on a randomly chosen day, Wendy will be late for work.
  3. Find the probability that the mean of Wendy's journey times over 15 randomly chosen days will be less than 54.5 minutes.
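
A sum of independent normal variables is itself normal, with the means and variances added. The sketch below checks the three parts under that assumption, using the standard library's `NormalDist`, which supports adding independent instances. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Independent parts of the journey, in minutes (mean, standard deviation).
walk = NormalDist(15.0, 1.1)
train = NormalDist(32.0, 3.5)
office = NormalDist(8.6, 1.2)

# Adding independent NormalDist objects adds means and variances.
total = walk + train + office

# Late if the total time exceeds 60 minutes.
p_late = 1 - total.cdf(60)

# Mean of 15 independent journey times: same mean, variance divided by 15.
mean_15 = NormalDist(total.mean, total.stdev / sqrt(15))
p_mean_below = mean_15.cdf(54.5)

print(total.mean, total.variance, round(p_late, 3), round(p_mean_below, 3))
```

No Central Limit theorem approximation is involved here, because the journey times are stated to be normally distributed.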
CAIE S2 2021 June Q4
9 marks Standard +0.3
4 The masses, \(m\) kilograms, of flour in a random sample of 90 sacks of flour are summarised as follows. $$n = 90 \quad \Sigma m = 4509 \quad \Sigma m ^ { 2 } = 225950$$
  1. Find unbiased estimates of the population mean and variance.
  2. Calculate a \(98 \%\) confidence interval for the population mean.
  3. Explain why it was necessary to use the Central Limit theorem in answering part (b).
  4. Find the probability that the confidence interval found in part (b) is wholly above the true value of the population mean.
CAIE S2 2022 June Q6
10 marks Standard +0.3
6 The masses, in kilograms, of large and small sacks of grain have the distributions \(\mathrm { N } ( 53,11 )\) and \(\mathrm { N } ( 14,3 )\) respectively.
  1. Find the probability that the mass of a randomly chosen large sack is greater than four times the mass of a randomly chosen small sack.
  2. A lift can safely carry a maximum mass of 1000 kg. Find the probability that the lift can safely carry 12 randomly chosen large sacks and 25 randomly chosen small sacks.
7 \(X\) is a random variable with distribution \(\operatorname { Po } ( 2.90 )\). A random sample of 100 values of \(X\) is taken. Find the probability that the sample mean is less than 2.88.
CAIE S2 2022 June Q1
3 marks Moderate -0.5
1 The number of characters in emails sent by a particular company is modelled by the distribution \(\mathrm { N } \left( 1250,480 ^ { 2 } \right)\). Find the probability that the mean number of characters in a random sample of 100 emails sent by the company is more than 1300.
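
The key step in this kind of question is that the sample mean of \(n\) values has standard deviation \(\sigma / \sqrt{n}\). A minimal sketch of the calculation, assuming the model \(\mathrm{N}(1250, 480^2)\) from the question (the function name `sample_mean_dist` is hypothetical):

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_dist(mu, sigma, n):
    """Distribution of the mean of n independent N(mu, sigma^2) values."""
    return NormalDist(mu, sigma / sqrt(n))


xbar = sample_mean_dist(mu=1250, sigma=480, n=100)

# P(sample mean > 1300) is the upper tail beyond 1300.
p_above = 1 - xbar.cdf(1300)

print(xbar.stdev, round(p_above, 3))
```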
CAIE S2 2023 June Q5
9 marks Standard +0.3
5 Last year the mean time for pizza deliveries from Pete's Pizza Pit was 32.4 minutes. This year the time, \(t\) minutes, for pizza deliveries from Pete's Pizza Pit was recorded for a random sample of 50 deliveries. The results were as follows. $$n = 50 \quad \Sigma t = 1700 \quad \Sigma t ^ { 2 } = 59050$$
  1. Find unbiased estimates of the population mean and variance.
  2. Test, at the \(2 \%\) significance level, whether the mean delivery time has changed since last year.
  3. Under what circumstances would it not be necessary to use the Central Limit Theorem in answering (b)?
CAIE S2 2024 June Q4
9 marks Moderate -0.3
4
  1. A random sample of 8 boxes of cereal from a certain supplier was taken. Each box was weighed and the masses in grams were as follows. $$\begin{array} { l l l l l l l l } 261 & 249 & 259 & 252 & 255 & 256 & 258 & 254 \end{array}$$ Find unbiased estimates of the population mean and variance.
  2. The supplier claims that the mean mass of boxes of cereal is 253 g. A quality control officer suspects that the mean mass is actually more than 253 g. In order to test this claim, he weighs a random sample of 100 boxes of cereal and finds that the total mass is 25360 g.
    1. Given that the population standard deviation of the masses is 3.5 g, test at the \(5 \%\) significance level whether the population mean mass is more than 253 g.
      An employee says, 'This test is invalid because it uses the normal distribution, but we do not know whether the masses of the boxes are normally distributed.'
    2. Explain briefly whether this statement is true or not.
CAIE S2 2024 June Q2
4 marks Moderate -0.5
2 The widths, \(w \mathrm {~cm}\), of a random sample of 150 leaves of a certain kind were measured. The sample mean of \(w\) was found to be 3.12 cm. Using this sample, an approximate \(95 \%\) confidence interval for the population mean of the widths in centimetres was found to be [3.01, 3.23].
  1. Calculate an estimate of the population standard deviation.
  2. Explain whether it was necessary to use the Central Limit theorem in your answer to part (a).
CAIE S2 2024 June Q7
13 marks Standard +0.3
7 The independent random variables \(X\) and \(Y\) have the distributions \(\operatorname { Po } ( 1.9 )\) and \(\operatorname { Po } ( 2.2 )\) respectively.
  1. Find \(\mathrm { P } ( X + Y < 4 )\).
  2. Find the probability that \(X = 2\) given that \(X + Y < 4\).
  3. A sample of 60 randomly chosen pairs of values of \(X\) and \(Y\) is taken, and the value of \(X + Y\) is calculated for each pair. The sample mean of these 60 values is found. Find the probability that the sample mean of \(X + Y\) is less than 4.0.
CAIE S2 2021 March Q1
7 marks Moderate -0.5
1 A construction company notes the time, \(t\) days, that it takes to build each house of a certain design. The results for a random sample of 60 such houses are summarised as follows. $$\Sigma t = 4820 \quad \Sigma t ^ { 2 } = 392050$$
  1. Calculate a 98\% confidence interval for the population mean time.
  2. Explain why it was necessary to use the Central Limit theorem in part (a).
CAIE S2 2014 June Q7
10 marks Standard +0.3
7 A researcher is investigating the actual lengths of time that patients spend with the doctor at their appointments. He plans to choose a sample of 12 appointments on a particular day.
  1. Which of the following methods is preferable, and why?
    • Choose the first 12 appointments of the day.
    • Choose 12 appointments evenly spaced throughout the day.
    Appointments are scheduled to last 10 minutes. The actual lengths of time, in minutes, that patients spend with the doctor may be assumed to have a normal distribution with mean \(\mu\) and standard deviation 3.4. The researcher suspects that the actual time spent is more than 10 minutes on average. To test this suspicion, he recorded the actual times spent for a random sample of 12 appointments and carried out a hypothesis test at the 1\% significance level.
  2. State the probability of making a Type I error and explain what is meant by a Type I error in this context.
  3. Given that the total length of time spent for the 12 appointments was 147 minutes, carry out the test.
  4. Give a reason why the Central Limit theorem was not needed in part (iii).
CAIE S2 2015 June Q5
7 marks Moderate -0.8
5 The mean breaking strength of cables made at a certain factory is supposed to be 5 tonnes. The quality control department wishes to test whether the mean breaking strength of cables made by a particular machine is actually less than it should be. They take a random sample of 60 cables. For each cable they find the breaking strength by gradually increasing the tension in the cable and noting the tension when the cable breaks.
  1. Give a reason why it is necessary to take a sample rather than testing all the cables produced by the machine.
  2. The mean breaking strength of the 60 cables in the sample is found to be 4.95 tonnes. Given that the population standard deviation of breaking strengths is 0.15 tonnes, test at the \(1 \%\) significance level whether the population mean breaking strength is less than it should be.
  3. Explain whether it was necessary to use the Central Limit theorem in the solution to part (ii).
CAIE S2 2016 June Q1
4 marks Moderate -0.5
1 The length of time, in minutes, taken by people to complete a task has mean 53.0 and standard deviation 6.2. Find the probability that the mean time taken to complete the task by a random sample of 50 people is more than 51 minutes.
CAIE S2 2017 June Q2
6 marks Moderate -0.3
2 Past experience has shown that the heights of a certain variety of plant have mean 64.0 cm and standard deviation 3.8 cm. During a particularly hot summer, it was expected that the heights of plants of this variety would be less than usual. In order to test whether this was the case, a botanist recorded the heights of a random sample of 100 plants and found that the value of the sample mean was 63.3 cm. Stating a necessary assumption, carry out the test at the \(2.5 \%\) significance level.
CAIE S2 2012 June Q5
10 marks Standard +0.3
5 A random variable \(X\) has the distribution \(\operatorname { Po } ( 3.2 )\).
  1. A random value of \(X\) is found.
    1. Find \(\mathrm { P } ( X \geqslant 3 )\).
    2. Find the probability that \(X = 3\) given that \(X \geqslant 3\).
  2. Random samples of 120 values of \(X\) are taken.
    1. Describe fully the distribution of the sample mean.
    2. Find the probability that the mean of a random sample of size 120 is less than 3.3.
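
For a Poisson variable the mean and variance are both \(\lambda\), so by the Central Limit theorem the mean of 120 values is approximately \(\mathrm{N}(3.2, 3.2/120)\). The sketch below evaluates the exact Poisson probabilities and the normal approximation. It assumes no continuity correction is applied to the sample mean; a solution that applies one would give a slightly different value. The helper name `poisson_pmf` is hypothetical.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

lam, n = 3.2, 120


def poisson_pmf(k, rate):
    """Exact P(X = k) for X ~ Po(rate)."""
    return exp(-rate) * rate ** k / factorial(k)


# P(X >= 3) = 1 - P(X <= 2), from the exact Poisson probabilities.
p_ge_3 = 1 - sum(poisson_pmf(k, lam) for k in range(3))

# P(X = 3 | X >= 3): the event X = 3 lies inside X >= 3.
p_3_given_ge_3 = poisson_pmf(3, lam) / p_ge_3

# CLT approximation for the sample mean (no continuity correction assumed).
xbar = NormalDist(lam, sqrt(lam / n))
p_mean_below = xbar.cdf(3.3)

print(round(p_ge_3, 3), round(p_3_given_ge_3, 3), round(p_mean_below, 3))
```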
CAIE S2 2021 November Q1
5 marks Moderate -0.8
1 It is known that the height \(H\), in metres, of trees of a certain kind has the distribution \(\mathrm { N } ( 12.5,10.24 )\). A scientist takes a random sample of 25 trees of this kind and finds the sample mean, \(\bar { H }\), of the heights.
  1. State the distribution of \(\bar { H }\), giving the values of any parameters.
  2. Find \(\mathrm { P } ( 12 < \bar { H } < 13 )\).