5.05a Sample mean distribution: central limit theorem

222 questions

OCR MEI S2 2016 June Q2
16 marks Standard +0.3
2 When a genetic sequence of plant DNA is given a dose of radiation, some of the genes may mutate. The probability that a gene mutates is 0.012. Mutations occur randomly and independently.
  1. Explain the meanings of the terms 'randomly' and 'independently' in this context.
    A short stretch of DNA containing 20 genes is given a dose of radiation.
  2. Find the probability that exactly 1 out of the 20 genes mutates.
    A longer stretch of DNA containing 500 genes is given a dose of radiation.
  3. Explain why a Poisson distribution is an appropriate approximating distribution for the number of genes that mutate.
  4. Use this Poisson distribution to find the probability that there are
    (A) exactly two genes that mutate,
    (B) at least two genes that mutate.
    A third stretch of DNA containing 50000 genes is given a dose of radiation.
  5. Use a suitable approximating distribution to find the probability that there are at least 650 genes that mutate.
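A minimal numerical sketch of the three approximations in this question, assuming the usual models: an exact binomial for 20 genes, a Poisson with mean \(500 \times 0.012\) for 500 genes, and a normal approximation with a continuity correction for 50000 genes. The variable and function names are illustrative only and are not part of the question.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist

p = 0.012  # probability that a single gene mutates (from the question)

# Part 2: exact binomial probability of exactly 1 mutation in 20 genes.
p_exactly_one = comb(20, 1) * p * (1 - p) ** 19

# Parts 3-4: Poisson approximation for 500 genes, with mean n * p.
lam = 500 * p


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable with the given mean."""
    return exp(-mean) * mean ** k / factorial(k)


p_exactly_two = poisson_pmf(2, lam)
p_at_least_two = 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)

# Part 5: normal approximation for 50000 genes (assumed choice), using a
# continuity correction so that "at least 650" becomes "more than 649.5".
n = 50000
mu = n * p
sigma = sqrt(n * p * (1 - p))
p_at_least_650 = 1 - NormalDist(mu, sigma).cdf(649.5)

print(p_exactly_one, p_exactly_two, p_at_least_two, p_at_least_650)
```

The Poisson is reasonable for 500 genes because \(n\) is large and \(p\) is small; for 50000 genes the mean is 600, large enough for a normal approximation.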
OCR S3 2009 January Q6
13 marks Standard +0.3
6 A mathematics examination is taken by 29 boys and 26 girls. Experience has shown that the probability that any boy forgets to bring a calculator to the examination is 0.3, and that any girl forgets is 0.2. Whether or not any student forgets to bring a calculator is independent of all other students. The numbers of boys and girls who forget to bring a calculator are denoted by \(B\) and \(G\) respectively, and \(F = B + G\).
  1. Find \(\mathrm { E } ( F )\) and \(\operatorname { Var } ( F )\).
  2. Using suitable approximations to the distributions of \(B\) and \(G\), which should be justified, find the smallest number of spare calculators that should be available in order to be at least \(99 \%\) certain that all 55 students will have a calculator.
OCR MEI S3 2016 June Q3
18 marks Standard +0.3
3 The random variable \(X\) has the following probability density function: $$\mathrm { f } ( x ) = \begin{cases} k \left( 1 - x ^ { 2 } \right) & - 1 \leqslant x \leqslant 1 \\ 0 & \text { elsewhere } \end{cases}$$ where \(k\) is a positive constant.
  1. Calculate the value of \(k\).
  2. Sketch the probability density function.
  3. Calculate \(\operatorname { Var } ( X )\).
  4. Find a cubic equation satisfied by the upper quartile \(q\), and hence verify that \(q = 0.35\) to 2 decimal places.
  5. A random sample of 40 values of \(X\) is taken. Using a suitable approximating distribution, calculate the probability that the mean of these values is greater than 0.125. Justify your choice of distribution.
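A short sketch for checking the numerical parts of this question. It assumes the hand-derived values \(k = \frac{3}{4}\) (from \(\int_{-1}^{1} (1 - x^2)\,dx = \frac{4}{3}\)) and \(\operatorname{Var}(X) = \frac{1}{5}\) (the mean is 0 by symmetry), and it takes the quartile cubic to be \(q^3 - 3q + 1 = 0\), which follows from setting the cumulative distribution function equal to 0.75. The names used are illustrative.

```python
from math import sqrt
from statistics import NormalDist

k = 3 / 4      # normalising constant, derived by hand (see lead-in)
var_x = 1 / 5  # Var(X) = E(X^2), since E(X) = 0 by symmetry
n = 40


def quartile_cubic(q):
    """Left-hand side of q^3 - 3q + 1 = 0, satisfied by the upper quartile."""
    return q ** 3 - 3 * q + 1


# A sign change between 0.345 and 0.355 confirms q = 0.35 to 2 d.p.
sign_change = quartile_cubic(0.345) * quartile_cubic(0.355) < 0

# Central Limit Theorem: the mean of 40 values is approximately N(0, var_x / n).
sample_mean_dist = NormalDist(0, sqrt(var_x / n))
p_mean_above = 1 - sample_mean_dist.cdf(0.125)

print(sign_change, p_mean_above)
```

The normal approximation is justified by the Central Limit Theorem because the sample size of 40 is reasonably large, even though \(X\) itself is not normally distributed.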
OCR MEI S4 2011 June Q2
24 marks Standard +0.8
2 The random variable \(X\) has the \(\chi _ { n } ^ { 2 }\) distribution. This distribution has moment generating function \(\mathrm { M } ( \theta ) = ( 1 - 2 \theta ) ^ { - \frac { 1 } { 2 } n }\), where \(\theta < \frac { 1 } { 2 }\).
  1. Verify the expression for \(\mathrm { M } ( \theta )\) quoted above for the cases \(n = 2\) and \(n = 4\), given that the probability density functions of \(X\) in these cases are as follows. $$\begin{array} { l l } n = 2 : & \mathrm { f } ( x ) = \frac { 1 } { 2 } \mathrm { e } ^ { - \frac { 1 } { 2 } x } \quad ( x > 0 ) \\ n = 4 : & \mathrm { f } ( x ) = \frac { 1 } { 4 } x \mathrm { e } ^ { - \frac { 1 } { 2 } x } \quad ( x > 0 ) \end{array}$$
  2. For the general case, use \(\mathrm { M } ( \theta )\) to find the mean and variance of \(X\) in terms of \(n\).
  3. \(Y _ { 1 } , Y _ { 2 } , \ldots , Y _ { k }\) are independent random variables, each with the \(\chi _ { 1 } ^ { 2 }\) distribution. Show that \(W = \sum _ { i = 1 } ^ { k } Y _ { i }\) has the \(\chi _ { k } ^ { 2 }\) distribution.
  4. Use the Central Limit Theorem to find an approximation for \(\mathrm { P } ( W < 118.5 )\) for the case \(k = 100\).
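For part 4, a minimal sketch assuming the results of part 2, namely that a \(\chi_n^2\) variable has mean \(n\) and variance \(2n\). With \(k = 100\), \(W\) is a sum of 100 independent \(\chi_1^2\) variables, so the Central Limit Theorem gives \(W \approx \mathrm{N}(100, 200)\). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

k = 100
mean_w = k     # mean of chi-squared with k degrees of freedom
var_w = 2 * k  # variance of chi-squared with k degrees of freedom

# CLT approximation: W is approximately N(mean_w, var_w).
p_w_below = NormalDist(mean_w, sqrt(var_w)).cdf(118.5)

print(p_w_below)
```

No continuity correction is applied here because \(W\) is a continuous random variable.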
OCR Further Statistics 2022 June Q6
7 marks Challenging +1.2
6 The random variable \(X\) was assumed to have a normal distribution with mean \(\mu\). Using a random sample of size 128, a significance test was carried out using the following hypotheses. \(\mathrm { H } _ { 0 } : \mu = 30\) \(\mathrm { H } _ { 1 } : \mu > 30\) It was found that \(\sum x = 3929.6\) and \(\sum x ^ { 2 } = 123483.52\). The conclusion of the test was to reject the null hypothesis.
  1. Determine the range of possible values of the significance level of the test.
  2. It was subsequently found that \(X\) was not normally distributed. Explain whether this invalidates the conclusion of the test.
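A hedged sketch of the calculation behind part 1, assuming the test statistic is built from the sample mean and the unbiased variance estimate, and referred to the standard normal distribution because the sample is large. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 128
sum_x = 3929.6
sum_x2 = 123483.52
mu_0 = 30  # value of the mean under the null hypothesis

sample_mean = sum_x / n
# Unbiased estimate of the population variance (assumed estimator).
s_squared = (sum_x2 - n * sample_mean ** 2) / (n - 1)

z_stat = (sample_mean - mu_0) / sqrt(s_squared / n)
p_value = 1 - NormalDist().cdf(z_stat)  # one-tailed, since H1 is mu > 30

print(sample_mean, s_squared, z_stat, p_value)
```

Since the null hypothesis was rejected, the significance level must have been at least as large as this p-value. For part 2, the sample size of 128 is large, so the Central Limit Theorem still gives an approximately normal distribution for the sample mean.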
OCR Further Statistics 2023 June Q3
6 marks Standard +0.3
3 The discrete random variable \(W\) has the distribution \(\mathrm { U } ( 11 )\). The independent discrete random variable \(V\) has the distribution \(\mathrm { U } ( 5 )\).
  1. It is given that, for constants \(m\) and \(n\), with \(m > 0\), \(\mathrm { E } ( m W + n V ) = 0\) and \(\operatorname { Var } ( m W + n V ) = 1\). Determine the exact values of \(m\) and \(n\).
    The random variable \(T\) is the mean of three independent observations of \(W\).
  2. Explain whether the Central Limit Theorem can be used to say that the distribution of \(T\) is approximately normal.
Edexcel S2 2015 January Q3
11 marks Moderate -0.8
3. Explain what you understand by
  1. a statistic,
  2. a sampling distribution.
    A factory stores screws in packets. A small packet contains 100 screws and a large packet contains 200 screws. The factory keeps small and large packets in the ratio 4:3 respectively.
  3. Find the mean and the variance of the number of screws in the packets stored at the factory.
    A random sample of 3 packets is taken from the factory and \(Y _ { 1 } , Y _ { 2 }\) and \(Y _ { 3 }\) denote the number of screws in each of these packets.
  4. List all the possible samples.
  5. Find the sampling distribution of \(\bar { Y }\).
Edexcel S2 2019 January Q6
12 marks Moderate -0.3
6. (i) (a) State the conditions under which the Poisson distribution may be used as an approximation to the binomial distribution.
A factory produces tyres for bicycles and \(0.25 \%\) of the tyres produced are defective. A company orders 3000 tyres from the factory.
(b) Find, using a Poisson approximation, the probability that there are more than 7 defective tyres in the company's order.
(ii) At the company \(40 \%\) of employees are known to cycle to work. A random sample of 150 employees is taken. The random variable \(C\) represents the number of employees in the sample who cycle to work.
(a) Describe a suitable sampling frame that can be used to take this sample.
(b) Explain what you understand by the sampling distribution of \(C\).
Louis uses a normal approximation to calculate the probability that at most \(\alpha\) employees in the sample cycle to work. He forgets to use a continuity correction and obtains the incorrect probability 0.0668.
Find, showing all stages of your working,
(c) the value of \(\alpha\),
(d) the correct probability.
Edexcel S2 2018 June Q4
6 marks Moderate -0.8
4. The volume of milk, \(M\) litres, in cartons produced by a dairy, has distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\), where \(\mu\) and \(\sigma\) are unknown. A random sample of 12 cartons is taken and the volume of milk in each carton is measured ( \(M _ { 1 } , M _ { 2 } , \ldots , M _ { 12 }\) ). A statistic \(X\) is based on this sample.
  1. Explain what is meant by "a random sample" in this case.
  2. State the population in this case.
  3. Write down the distribution of \(\frac { M _ { 12 } - \mu } { \sigma }\)
  4. Explain what you understand by the sampling distribution of \(X\).
  5. State, giving a reason, which of the following is not a statistic based on this sample.
    (I) \(3 M _ { 1 } + \frac { 2 M _ { 11 } } { 6 }\) (II) \(\sum _ { i = 1 } ^ { 12 } \left( \frac { M _ { i } - \mu } { \sigma } \right) ^ { 2 }\) (III) \(\sum _ { i = 1 } ^ { 12 } \left( 2 M _ { i } - 3 \right)\)
Edexcel S2 2021 June Q6
10 marks Standard +0.8
6. The random variable \(Y \sim \mathrm {~B} ( 225 , p )\).
Using a normal approximation, the probability that \(Y\) is at least 188 is 0.1056 to 4 decimal places.
  1. Show that \(p\) satisfies \(145 p ^ { 2 } - 241 p + 100 = 0\) when the normal probability tables are used.
  2. Hence find the value of \(p\), justifying your answer.
Edexcel S2 2022 June Q2
11 marks Standard +0.3
2. The time, in minutes, spent waiting for a call to a call centre to be answered is modelled by the random variable \(T\) with probability density function
$$f ( t ) = \left\{ \begin{array} { l c } \frac { 1 } { 192 } \left( t ^ { 3 } - 48 t + 128 \right) & 0 \leqslant t \leqslant 4 \\ 0 & \text { otherwise } \end{array} \right.$$
  1. Use algebraic integration to find, in minutes and seconds, the mean waiting time.
  2. Show that \(\mathrm { P } ( 1 < T < 3 ) = \frac { 7 } { 16 }\).
    A supervisor randomly selects 256 calls to the call centre.
  3. Use a suitable approximation to find the probability that more than 125 of these calls take between 1 and 3 minutes to be answered.
Edexcel S2 2022 June Q4
9 marks Standard +0.3
4. Past evidence shows that \(7 \%\) of pears grown by a farmer are unfit for sale.
This season it is believed that the proportion of pears that are unfit for sale has decreased. To test this belief a random sample of \(n\) pears is taken. The random variable \(Y\) represents the number of pears in the sample that are unfit for sale.
  1. Find the smallest value of \(n\) such that \(Y = 0\) lies in the critical region for this test at a \(5 \%\) level of significance.
    In the past, \(8 \%\) of the pears grown by the farmer weigh more than 180 g. This season the farmer believes the proportion of pears weighing more than 180 g has changed. She takes a random sample of 75 pears and finds that 11 of them weigh more than 180 g.
  2. Test, using a suitable approximation, whether there is evidence of a change in the proportion of pears weighing more than 180 g .
    You should use a \(5 \%\) level of significance and state your hypotheses clearly.
Edexcel S2 2023 June Q1
11 marks Moderate -0.3
  1. In a large population \(40 \%\) of adults use online banking.
A random sample of 50 adults is taken.
The random variable \(X\) represents the number of adults in the sample that use online banking.
  1. Find
    1. \(\mathrm { P } ( X = 26 )\)
    2. \(\mathrm { P } ( X \geqslant 26 )\)
    3. the smallest value of \(k\) such that \(\mathrm { P } ( X \leqslant k ) > 0.4\)
    A random sample of 600 adults is taken.
    1. Find, using a normal approximation, the probability that no more than 222 of these 600 adults use online banking.
    2. Explain why a normal approximation is suitable in part (b)(i).
Edexcel S2 2024 June Q3
15 marks Moderate -0.8
3 Jian owns a large group of shops. She decides to visit a random sample of the shops to check if the stocktaking system is being used incorrectly.
  1. Suggest a suitable sampling frame for Jian to use.
  2. Identify the sampling units.
  3. Give one advantage and one disadvantage of taking a sample rather than a census.
    Jian believes that the stocktaking system is being used incorrectly in \(40 \%\) of the shops.
    To investigate her belief, a random sample of 30 of the shops is taken.
  4. Using a 5\% level of significance, find the critical region for a two-tailed test of Jian's belief.
    You should state the probability in each tail, which should each be as close as possible to 2.5\%.
    The total number of shops, in the sample of 30, where the stocktaking system is being used incorrectly is 20.
  5. Using the critical region from part (d), state what this suggests about Jian's belief. Give a reason for your answer.
    Jian introduces a new, simpler, stocktaking system to all the shops.
    She takes a random sample of 150 shops and finds that in 47 of these shops the new stocktaking system is being used incorrectly.
  6. Using a suitable approximation, test, at the \(5 \%\) level of significance, whether or not there is evidence that the proportion of shops where the stocktaking system is being used incorrectly is now less than 0.4.
    You should state your hypotheses and show your working clearly.
Edexcel S2 2018 Specimen Q3
11 marks Moderate -0.3
3. Explain what you understand by
  1. a statistic,
  2. a sampling distribution.
    A factory stores screws in packets. A small packet contains 100 screws and a large packet contains 200 screws. The factory keeps small and large packets in the ratio 4:3 respectively.
  3. Find the mean and the variance of the number of screws in the packets stored at the factory.
    A random sample of 3 packets is taken from the factory and \(Y _ { 1 } , Y _ { 2 }\) and \(Y _ { 3 }\) denote the number of screws in each of these packets.
  4. List all the possible samples.
  5. Find the sampling distribution of \(\bar { Y }\).
Edexcel S2 2010 January Q7
11 marks Moderate -0.5
7. A bag contains a large number of coins. It contains only \(1 p\) and \(2 p\) coins in the ratio \(1 : 3\).
  1. Find the mean \(\mu\) and the variance \(\sigma ^ { 2 }\) of the values of this population of coins.
    A random sample of size 3 is taken from the bag.
  2. List all the possible samples.
  3. Find the sampling distribution of the mean value of the samples.
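The enumeration asked for in this question can be sketched with exact fractions. The sketch assumes that, because the bag holds a large number of coins, the three draws can be treated as independent with \(\mathrm{P}(1p) = \frac{1}{4}\) and \(\mathrm{P}(2p) = \frac{3}{4}\). The variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Population: 1p and 2p coins in the ratio 1 : 3 (from the question).
coin_probs = {1: Fraction(1, 4), 2: Fraction(3, 4)}

# Population mean and variance.
mu = sum(value * prob for value, prob in coin_probs.items())
variance = sum(value ** 2 * prob for value, prob in coin_probs.items()) - mu ** 2

# Enumerate every ordered sample of size 3 and accumulate the probability
# attached to each possible value of the sample mean.
sampling_dist = defaultdict(Fraction)
for sample in product(coin_probs, repeat=3):
    prob = Fraction(1)
    for value in sample:
        prob *= coin_probs[value]
    sampling_dist[Fraction(sum(sample), 3)] += prob

print("mean:", mu, "variance:", variance)
for mean_value in sorted(sampling_dist):
    print(mean_value, sampling_dist[mean_value])
```

Using `Fraction` keeps every probability exact, which matches the form expected in a written answer. The same approach would apply to the screw-packet questions by changing the values and probabilities in the dictionary.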
Edexcel S2 2009 June Q5
10 marks Standard +0.3
5. An administrator makes errors in her typing randomly at a rate of 3 errors every 1000 words.
  1. In a document of 2000 words, find the probability that the administrator makes 4 or more errors.
    The administrator is given an 8000 word report to type and she is told that the report will only be accepted if there are 20 or fewer errors.
  2. Use a suitable approximation to calculate the probability that the report is accepted.
Edexcel S2 2011 June Q5
13 marks Standard +0.3
5. Defects occur at random in planks of wood with a constant rate of 0.5 per 10 cm length. Jim buys a plank of length 100 cm.
  1. Find the probability that Jim's plank contains at most 3 defects.
    Shivani buys 6 planks each of length 100 cm.
  2. Find the probability that fewer than 2 of Shivani's planks contain at most 3 defects.
  3. Using a suitable approximation, estimate the probability that the total number of defects on Shivani's 6 planks is less than 18.
Edexcel S2 2011 June Q6
14 marks Standard +0.3
6. A shopkeeper knows, from past records, that \(15 \%\) of customers buy an item from the display next to the till. After a refurbishment of the shop, he takes a random sample of 30 customers and finds that only 1 customer has bought an item from the display next to the till.
  1. Stating your hypotheses clearly, and using a \(5 \%\) level of significance, test whether or not there has been a change in the proportion of customers buying an item from the display next to the till.
    During the refurbishment a new sandwich display was installed. Before the refurbishment \(20 \%\) of customers bought sandwiches. The shopkeeper claims that the proportion of customers buying sandwiches has now increased. He selects a random sample of 120 customers and finds that 31 of them have bought sandwiches.
  2. Using a suitable approximation and stating your hypotheses clearly, test the shopkeeper's claim. Use a \(10 \%\) level of significance.
Edexcel S2 2014 June Q2
7 marks Moderate -0.8
2. A bag contains a large number of counters. Each counter has a single digit number on it and the mean of all the numbers in the bag is the unknown parameter \(\mu\). The number 2 is on \(40 \%\) of the counters and the number 5 is on \(25 \%\) of the counters. All the remaining counters have numbers greater than 5 on them. A random sample of 10 counters is taken from the bag.
  1. State whether or not each of the following is a statistic
    1. \(S =\) the sum of the numbers on the counters in the sample,
    2. \(D =\) the difference between the highest number in the sample and \(\mu\),
    3. \(F =\) the number of counters in the sample with a number 5 on them.
    The random variable \(T\) represents the number of counters in a random sample of 10 with the number 2 on them.
  2. Specify the sampling distribution of \(T\).
    The counters are selected one by one.
  3. Find the probability that the third counter selected is the first counter with the number 2 on it.
Edexcel S3 2022 January Q2
8 marks Standard +0.3
2. Secondary schools in a region conduct ability testing at the start of Year 7 and the start of Year 8. Each year a regional education officer randomly selects 240 Year 7 students and 240 Year 8 students from across the region. The results for last year are summarised in the table below.
$$\begin{array} { l | c | c } & \text { Mean score } & \text { Variance of scores } \\ \hline \text { Year 7 } & 101 & 38 \\ \text { Year 8 } & 103 & 42 \end{array}$$
The regional education officer claims that there is no difference between the mean scores of these two year groups.
  1. Test the regional education officer's claim at the \(1 \%\) significance level. You should state your hypotheses, test statistic and critical value clearly.
  2. Explain the significance of the Central Limit Theorem in part (a).
Edexcel S3 2023 January Q6
10 marks Moderate -0.3
6 A garden centre sells bags of stones and large bags of gravel.
The weight, \(X\) kilograms, of stones in a bag can be modelled by a normal distribution with unknown mean \(\mu\) and known standard deviation 0.4. The stones in each of a random sample of 36 bags from a large batch are weighed. The total weight of stones in these 36 bags is found to be 806.4 kg.
  1. Find a 98\% confidence interval for the mean weight of stones in the batch.
  2. Explain why the use of the Central Limit Theorem is not required to answer part (a).
    The manufacturer of these bags of stones claims that bags in this batch have a mean weight of 22.5 kg.
  3. Using your answer to part (a), comment on the claim made by the manufacturer.
    The weight, \(Y\) kilograms, of gravel in a large bag can be modelled by a normal distribution with mean 850 kg and standard deviation 5 kg. A builder purchases 10 large bags of gravel.
  4. Find the probability that the mean weight of gravel in the 10 large bags is less than 848 kg.
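A minimal sketch of the two calculations in this question, assuming the standard result that the mean of \(n\) independent \(\mathrm{N}(\mu, \sigma^2)\) observations is distributed exactly as \(\mathrm{N}(\mu, \sigma^2 / n)\). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Part 1: 98% confidence interval for the mean weight of stones.
n_bags = 36
sigma_stones = 0.4
sample_mean = 806.4 / n_bags

z_98 = NormalDist().inv_cdf(0.99)  # two-sided 98% leaves 1% in each tail
half_width = z_98 * sigma_stones / sqrt(n_bags)
ci_lower = sample_mean - half_width
ci_upper = sample_mean + half_width

# Part 3: does the claimed mean of 22.5 kg lie inside the interval?
claim_inside_ci = ci_lower < 22.5 < ci_upper

# Part 4: mean weight of 10 large bags of gravel, each N(850, 5^2).
gravel_mean_dist = NormalDist(850, 5 / sqrt(10))
p_mean_below_848 = gravel_mean_dist.cdf(848)

print((ci_lower, ci_upper), claim_inside_ci, p_mean_below_848)
```

Because the weights are modelled as normal, the sample mean is normal for any sample size, which is why the Central Limit Theorem is not needed in part (a).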
Edexcel S3 2024 January Q6
15 marks Standard +0.3
6. A random sample of 8 three-month-old golden retriever dogs is taken.
The heights of the golden retrievers are recorded.
Using this sample, a 95\% confidence interval for the mean height, in cm, of three-month-old golden retrievers is found to be \(( 45.72,53.88 )\).
  1. Find a 99\% confidence interval for the mean height. You may assume that the heights are normally distributed with known population standard deviation.
    Some summary statistics for the weights, \(x \mathrm {~kg}\), of this sample are given below. $$\sum x = 91.2 \quad \sum x ^ { 2 } = 1145.16 \quad n = 8$$
  2. Calculate unbiased estimates of the mean and the variance of the weights of three-month-old golden retrievers.
    A further random sample of 24 three-month-old golden retrievers is taken. The unbiased estimates of the mean and the variance of the weights, in kg, from this sample are found to be 10.8 and 17.64 respectively.
  3. Estimate the standard error of the mean weight for the combined sample of 32 three-month-old golden retrievers.
Edexcel S3 2016 June Q4
8 marks Standard +0.3
4. A random sample of 60 children and a random sample of 50 adults were taken and each person was given the same task to complete. The table below summarises the times taken, \(t\) seconds, to complete the task.
$$\begin{array} { l | c | c | c } & \text { Mean, } \overline { t } & \text { Standard deviation, } s & n \\ \hline \text { Children } & 61.2 & 5.9 & 60 \\ \text { Adults } & 59.1 & 5.2 & 50 \end{array}$$
  1. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not there is evidence that the mean time taken to complete the task by children is greater than the mean time taken by adults.
  2. Explain the relevance of the Central Limit Theorem to your calculation in part (a).
  3. State an assumption you have made to carry out the test in part (a).
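A hedged sketch of the two-sample test in part (a). It reads the summary table as children: mean 61.2, standard deviation 5.9, \(n = 60\), and adults: mean 59.1, standard deviation 5.2, \(n = 50\). It also assumes that the quoted standard deviations can be used directly as estimates of the population standard deviations, which is a modelling assumption and not something stated in the question. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics as read from the table (see lead-in for assumptions).
mean_children, sd_children, n_children = 61.2, 5.9, 60
mean_adults, sd_adults, n_adults = 59.1, 5.2, 50

# H0: the two population means are equal; H1: the children's mean is greater.
standard_error = sqrt(sd_children ** 2 / n_children + sd_adults ** 2 / n_adults)
z_stat = (mean_children - mean_adults) / standard_error

critical_value = NormalDist().inv_cdf(0.95)  # one-tailed test at the 5% level
reject_h0 = z_stat > critical_value

print(z_stat, critical_value, reject_h0)
```

The Central Limit Theorem is what allows both sample means to be treated as approximately normal here, since both samples are large and the distribution of the times is not given.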
Edexcel S3 2016 June Q8
7 marks Challenging +1.2
8. A six-sided die is labelled with the numbers \(1,2,3,4,5\) and 6. A group of 50 students want to test whether or not the die is fair for the number six.
The 50 students each roll the die 30 times and record the number of sixes they each obtain.
Given that \(\bar { X }\) denotes the mean number of sixes obtained by the 50 students, and using $$\mathrm { H } _ { 0 } : p = \frac { 1 } { 6 } \text { and } \mathrm { H } _ { 1 } : p \neq \frac { 1 } { 6 }$$ where \(p\) is the probability of rolling a 6 ,
  1. use the Central Limit Theorem to find an approximate distribution for \(\bar { X }\), if \(\mathrm { H } _ { 0 }\) is true.
  2. Hence find, in terms of \(\bar { X }\), the critical region for this test. Use a \(5 \%\) level of significance.
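A minimal sketch of both parts, assuming each student's count of sixes is \(\mathrm{B}(30, \frac{1}{6})\) under \(\mathrm{H}_0\) and that the 50 counts are independent, so the Central Limit Theorem applies to their mean. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_rolls = 30      # rolls per student
n_students = 50   # number of students
p0 = 1 / 6        # probability of a six under H0

# Mean and variance of one student's number of sixes, B(30, 1/6).
mean_x = n_rolls * p0
var_x = n_rolls * p0 * (1 - p0)

# CLT: the mean of 50 such counts is approximately N(mean_x, var_x / 50).
standard_error = sqrt(var_x / n_students)

# Two-tailed critical region at the 5% level (2.5% in each tail).
z_crit = NormalDist().inv_cdf(0.975)
lower_bound = mean_x - z_crit * standard_error
upper_bound = mean_x + z_crit * standard_error

print(mean_x, var_x / n_students, lower_bound, upper_bound)
```

Under these assumptions the critical region is \(\bar{X}\) below `lower_bound` or above `upper_bound`.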